Chapter 6. Databases

6.0. Introduction

Storing data in a database is not an uncommon task for developers—in this day and age, it’s practically a given. As with nearly every language under the sun, there is a bevy of drivers and clients to interact with databases from Clojure. What sets Clojure apart, however, is its ability to compose.

As we’ve said before in this book: in Clojure, data is king. You’ll find many of the database client libraries do a little legwork to connect you to the datastore, then promptly get out of your way. Such libraries don’t do so out of laziness (at least, we hope), but rather out of the principle of separation of concerns: I’ll handle connecting to the database; you handle the domain (your data). In fact, the best APIs are built out of data, providing only one or two functions and letting you manipulate queries and data to be inserted directly as Clojure data structures.

In this chapter, we’ll visit a wide range of databases and techniques, including the SQLs, full-text search, Mongo, Redis, and Datomic.

Datomic is one of the more interesting recent developments in the database landscape. Invented and maintained by Rich Hickey (who you will probably recognize as the same person who wrote Clojure itself), it is a scalable, transactional, value-oriented, time-aware database built around the same principles and philosophies as Clojure. If you like Clojure, you should definitely give Datomic a try, both as your application’s datastore and also as a learning tool to further explore functional, data-oriented programming.

6.1. Connecting to an SQL Database

Problem

You want to connect your program to an SQL database.

Solution

Use the clojure.java.jdbc library for JDBC-based access to SQL databases.

To follow along with this recipe, you’ll need a running SQL database and an existing table to connect to. We suggest PostgreSQL.[18]

After you have PostgreSQL running (presumably on localhost:5432), run the following command to create a database for this recipe:

# On Mac:
$ /Applications/Postgres93.app/Contents/MacOS/bin/createdb cookbook_experiments

# Everyone else:
$ createdb cookbook_experiments

Before starting, add [org.clojure/java.jdbc "0.3.0"] and [java-jdbc/dsl "0.1.0"] to your project’s dependencies. You’ll also need a JDBC driver for the RDBMS of your choice. If you’re following along with this sample, use [org.postgresql/postgresql "9.2-1003-jdbc4"]. To start a REPL using lein-try, enter the following Leiningen command:

$ lein try org.clojure/java.jdbc "0.3.0" \
           java-jdbc/dsl "0.1.0" \
           org.postgresql/postgresql "9.2-1003-jdbc4"

To interact with a database using clojure.java.jdbc, all you need is a connection specification. This specification takes the form of a plain Clojure map with values indicating the database driver type, location, and authentication credentials:

(def db-spec {:classname "org.postgresql.Driver"
              :subprotocol "postgresql"
              :subname "//localhost:5432/cookbook_experiments"
              ;; Not needed for a non-secure local database...
              ;; :user "bilbo"
              ;; :password "secret"
              })

Create a relation in the specified database by generating a DDL statement with the java-jdbc.ddl/create-table function (passing a table name and any number of column specifications) and executing it with clojure.java.jdbc/db-do-commands:

(require '[clojure.java.jdbc :as jdbc]
         '[java-jdbc.ddl :as ddl])

(jdbc/db-do-commands db-spec false
  (ddl/create-table
     :tags
     [:id :serial "PRIMARY KEY"]
     [:name :varchar "NOT NULL"]))
;; -> (0)

Many other functions that query and manipulate a database, such as clojure.java.jdbc/insert!, take a database specification directly as their first argument:

(require '[java-jdbc.sql :as sql])

(jdbc/insert! db-spec :tags
                      {:name "Clojure"}
                      {:name "Java"})
;; -> ({:name "Clojure", :id 1} {:name "Java", :id 2})

(jdbc/query db-spec (sql/select * :tags (sql/where {:name "Clojure"})))
;; -> ({:name "Clojure", :id 1})

Discussion

The clojure.java.jdbc library provides functions that wrap the basic capabilities of the Java JDBC specification. The additional java-jdbc.sql and java-jdbc.ddl namespaces from the java-jdbc/dsl project implement small DSLs to generate basic SQL DML and DDL statements.

Because it relies upon Java JDBC, the clojure.java.jdbc library is usable with many of the most popular SQL databases, including Apache Derby, HSQLDB, Microsoft SQL Server, MySQL, PostgreSQL, and SQLite.

The parameters necessary to set up and access a data source are called the database specification (often abbreviated “db-spec”) and are provided in a simple Clojure map. The specification usually includes such parameters as the driver class name, the subprotocol for a particular RDBMS type, the hostname, the port number, the database name, and the username and password.

The clojure.java.jdbc library also permits several other forms of data source specification, including Java URIs, already-open connections, JNDI connections, and plain strings. For example, the connection may be given as a plain JDBC URL string, or a complete URI string may be provided under the :connection-uri key:

;; As a spec string
(def db-spec
  "jdbc:postgresql://bilbo:secret@localhost:5432/cookbook_experiments")

;; As a connection URI map...
;; with a username and password...
(def db-spec
  {:connection-uri (str "jdbc:postgresql://localhost:5432/cookbook_experiments?"
                       "user=bilbo&password=secret")})

;; or without
(def db-spec
  {:connection-uri "jdbc:postgresql://localhost:5432/cookbook_experiments"})

Database records are represented as Clojure maps, with the table’s column names used as keys. Retrieval of a set of database records produces a sequence of maps that can then be processed with all the normal Clojure functions:

(jdbc/query db-spec (sql/select * :tags))
;; -> ({:name "Clojure", :id 1}
;;     {:name "Java", :id 2})

(filter #(not (.endsWith (:name %) "ure"))
        (jdbc/query db-spec (sql/select * :tags)))
;; -> ({:name "Java", :id 2})

There are other Clojure libraries for accessing relational databases, such as Korma, and each provides its own abstraction and DSL for manipulating SQL data and expressions. The clojure.java.jdbc library, however, covers a large portion of everyday database access needs.

See Also

6.2. Connecting to an SQL Database with a Connection Pool

Problem

You would like to connect to an SQL database efficiently using a connection pool.

Solution

Use the BoneCP connection and statement pooling library to wrap your JDBC-based drivers, creating a pooled data source. The pooled data source is then usable by the clojure.java.jdbc library, as described in Recipe 6.1, “Connecting to an SQL Database”.

To follow along with this recipe, you’ll need a running SQL database and an existing table to connect to. We suggest PostgreSQL.[19]

After you have PostgreSQL running (presumably on localhost:5432), run the following command to create a database for this recipe:

# On Mac:
$ /Applications/Postgres93.app/Contents/MacOS/bin/createdb cookbook_experiments

# Everyone else:
$ createdb cookbook_experiments

Before starting, add the BoneCP dependency ([com.jolbox/bonecp "0.8.0.RELEASE"]), as well as the appropriate JDBC libraries for your RDBMS, to your project’s dependencies. You’ll also need a valid SLF4J logger. Alternatively, you can follow along in a REPL using lein-try:

$ lein try com.jolbox/bonecp "0.8.0.RELEASE" \
           org.clojure/java.jdbc "0.3.0" \
           java-jdbc/dsl "0.1.0" \
           org.postgresql/postgresql "9.2-1003-jdbc4" \
           org.slf4j/slf4j-nop  # Just do not log anything

First, create a database specification containing the parameters for accessing the database. This includes keys for the initial and maximum pool sizes, as well as the number of partitions:

(def db-spec {:classname "org.postgresql.Driver"
              :subprotocol "postgresql"
              :subname "//localhost:5432/cookbook_experiments"
              :init-pool-size 4
              :max-pool-size 20
              :partitions 2})

To create a pooled BoneCPDataSource object, define a function (for convenience) that uses the parameters in the database specification map:

(import 'com.jolbox.bonecp.BoneCPDataSource)

(defn pooled-datasource [db-spec]
  (let [{:keys [classname subprotocol subname user password
                init-pool-size max-pool-size idle-time partitions]} db-spec
        min-connections (inc (int (/ init-pool-size partitions)))
        max-connections (inc (int (/ max-pool-size partitions)))
        cpds (doto (BoneCPDataSource.)
                   (.setDriverClass classname)
                   (.setJdbcUrl (str "jdbc:" subprotocol ":" subname))
                   (.setUsername user)
                   (.setPassword password)
                   (.setMinConnectionsPerPartition min-connections)
                   (.setMaxConnectionsPerPartition max-connections)
                   (.setPartitionCount partitions)
                   (.setStatisticsEnabled true)
                   (.setIdleMaxAgeInMinutes (or idle-time 60)))]
       {:datasource cpds}))

Use the convenience function to define a pooled data source for connecting to your database:

(def pooled-db-spec (pooled-datasource db-spec))

pooled-db-spec
;; -> {:datasource #<BoneCPDataSource ...>}

Pass the database specification as the first argument to any clojure.java.jdbc functions that query or manipulate your database:

(require '[clojure.java.jdbc :as jdbc]
         '[java-jdbc.ddl :as ddl]
         '[java-jdbc.sql :as sql])

(jdbc/db-do-commands pooled-db-spec false
  (ddl/create-table
    :blog_posts
    [:id :serial "PRIMARY KEY"]
    [:title "varchar(255)" "NOT NULL"]
    [:body :text]))
;; -> (0)

(jdbc/insert! pooled-db-spec
              :blog_posts
              {:title "My first post!" :body "This is going to be good!"})
;; -> ({:body "This is going to be good!", :title "My first post!", :id 1})

(jdbc/query pooled-db-spec
            (sql/select * :blog_posts (sql/where {:title "My first post!"})))
;; -> ({:body "This is going to be good!", :title "My first post!", :id 1})

Discussion

As shown in the solution, the clojure.java.jdbc library can create database connections from JDBC data sources, which allows connections to be easily pooled by the BoneCP or other pooling libraries.

The BoneCP library wraps existing JDBC classes to allow the creation of efficient data sources. It can adapt traditional unpooled drivers and data sources by augmenting them with transparent pooling of Connection and PreparedStatement instances.

While the library offers several ways to create data sources, most users will find the examples provided here to be the easiest.

BoneCP offers several dozen configuration parameters that control the operation of the data source and its connections. Luckily, most of these configuration parameters have built-in defaults. Parameters may be specified to control such facets as the min, max, and initial pool size; the number of idle connections; the age of connections; transaction handling; the use of PreparedStatement pooling; and if, when, and how pooled connections are tested.
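
For example, a few of these setters could be added to the doto form in the pooled-datasource function shown earlier. This is only a sketch; check the setter names and defaults against the BoneCP documentation for the version you are using:

;; A sketch only: additional BoneCP settings layered onto the data source.
(doto (BoneCPDataSource.)
  ;; ... driver class, JDBC URL, and credentials as before ...
  (.setIdleConnectionTestPeriodInMinutes 30) ; test idle connections periodically...
  (.setConnectionTestStatement "SELECT 1")   ; ...using a cheap statement
  (.setStatementsCacheSize 50))              ; pool PreparedStatements per connection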

Pooled data resources (threads and database connections) may be released by calling the close method on the BoneCPDataSource instance. Attempting to reuse the pooled data source after it is closed will result in an SQL exception:

(.close (:datasource pooled-db-spec))
;; -> nil

See Also

6.3. Manipulating an SQL Database

Problem

You want your Clojure program to manipulate tables and records in an SQL database.

Solution

Use the clojure.java.jdbc library for JDBC-based access to SQL databases.

To follow along with this recipe, you’ll need a running SQL database and an existing table to connect to. We suggest PostgreSQL.[20]

After you have PostgreSQL running (presumably on localhost:5432), run the following command to create a database for this recipe:

# On Mac:
$ /Applications/Postgres93.app/Contents/MacOS/bin/createdb cookbook_experiments

# Everyone else:
$ createdb cookbook_experiments

Before starting, add [org.clojure/java.jdbc "0.3.0"] and [java-jdbc/dsl "0.1.0"] to your project’s dependencies. You’ll also need a JDBC driver for the RDBMS of your choice. If you’re following along with this sample, use [org.postgresql/postgresql "9.2-1003-jdbc4"]. To start a REPL using lein-try, enter the following Leiningen command:

$ lein try org.clojure/java.jdbc "0.3.0" \
           java-jdbc/dsl "0.1.0" \
           org.postgresql/postgresql "9.2-1003-jdbc4"

Then, define how the database should be accessed:

(def db-spec {:classname "org.postgresql.Driver"
              :subprotocol "postgresql"
              :subname "//localhost:5432/cookbook_experiments"})

To create a new table, use the java-jdbc.ddl/create-table function to generate the necessary DDL statement, and then pass the statement to the jdbc/db-do-commands function to execute it:

(require '[clojure.java.jdbc :as jdbc]
         '[java-jdbc.ddl :as ddl])

(jdbc/db-do-commands db-spec
  (ddl/create-table :fruit
    [:name "varchar(16)" "PRIMARY KEY"]
    [:appearance "varchar(32)"]
    [:cost :int "NOT NULL"]
    [:unit "varchar(16)"]
    [:grade :real]))
;; -> (0)

Insert complete records into a table using the clojure.java.jdbc/insert! function, invoking it with a vector of the column values for each row. Be sure to provide the column values in the order in which the columns were declared in the table:

(jdbc/insert! db-spec :fruit
  nil ; column names omitted
  ["Red Delicious" "dark red" 20 "bushel" 8.2]
  ["Plantain" "mild spotting" 48 "stalk" 7.4]
  ["Kiwifruit" "fresh"  35 "crate" 9.1]
  ["Plum" "ripe" 12 "carton" 8.4])
;; -> (1 1 1 1)

To query the database, generate the SQL for the query with the java-jdbc.sql/select function, then invoke clojure.java.jdbc/query with the result:

(require '[java-jdbc.sql :as sql])

(jdbc/query db-spec
  (sql/select * :fruit (sql/where {:appearance "ripe"})))
;; -> ({:grade 8.4, :unit "carton", :cost 12, :appearance "ripe", :name "Plum"})

If you no longer need a particular table, invoke clojure.java.jdbc/db-do-commands with the DDL statement generated by java-jdbc.ddl/drop-table:

(jdbc/db-do-commands db-spec
  (ddl/create-table :delete_me
    [:name "varchar(16)" "PRIMARY KEY"]))

(jdbc/db-do-commands db-spec (ddl/drop-table :delete_me))
;; -> (0)

Discussion

The clojure.java.jdbc library provides functions that wrap the basic capabilities of the Java JDBC specification. The java-jdbc/dsl project’s java-jdbc.sql and java-jdbc.ddl namespaces implement small DSLs to generate basic SQL DML and DDL statements.

Note

java-jdbc/dsl used to be a part of clojure.java.jdbc, but was removed to keep the API of the core library as small as possible.

The java-jdbc.ddl/create-table function generates the DDL needed to create a table. The arguments are a table name and a vector for each column specification. At the time of this writing, table-level specifications are not yet supported.
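
Because these helpers only generate SQL, you can inspect the DDL that create-table produces before executing it with db-do-commands. The string shown below is approximately what the 0.1.0 DSL emits; the exact spacing may differ:

(ddl/create-table :fruit
  [:name "varchar(16)" "PRIMARY KEY"]
  [:cost :int "NOT NULL"])
;; -> "CREATE TABLE fruit (name varchar(16) PRIMARY KEY, cost int NOT NULL)"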

Inserting and updating records

Records may be inserted into a table in a variety of ways. In addition to the vector method illustrated, the clojure.java.jdbc/insert! function can accept one or more maps with column names as keys:

(jdbc/insert! db-spec :fruit
  {:name "Banana" :appearance "spotting" :cost 35}
  {:name "Tomato" :appearance "rotten" :cost 10 :grade 1.4}
  {:name "Peach" :appearance "fresh" :cost 37 :unit "pallet"})
;; -> ({:grade nil, :unit nil, :cost 35, :appearance "spotting", :name "Banana"}
;;     {:grade 1.4, :unit nil, :cost 10, :appearance "rotten", :name "Tomato"}
;;     {:grade nil, :unit "pallet", :cost 37, :appearance "fresh",
;;      :name "Peach"})

If you want to insert rows but only specify some columns’ values, you can invoke clojure.java.jdbc/insert! with a vector of column names followed by one or more vectors containing values for those columns:

(jdbc/insert! db-spec :fruit
  [:name :cost]
  ["Mango" 84]
  ["Kumquat" 77])
;; -> (1 1)

To update existing records, invoke clojure.java.jdbc/update! with a map of column names to new values. The optional java-jdbc.sql/where clause controls which rows will be updated:

(jdbc/update! db-spec :fruit
  {:grade 7.0 :appearance "spotting" :cost 75}
  (sql/where {:name "Mango"}))
;; -> (1)

Transactions

Database transactions are available to ensure that multiple operations are performed atomically (i.e., all or none). The clojure.java.jdbc/with-db-transaction macro creates a transaction-aware connection from the database specification. Use the transaction-aware connection for the duration of the transaction:

;; Insert two new fruits atomically
(jdbc/with-db-transaction [trans-conn db-spec]
  (jdbc/insert! trans-conn :fruit {:name "Fig" :cost 12})
  (jdbc/insert! trans-conn :fruit {:name "Date" :cost 14}))
;; -> ({:grade nil, :unit nil, :cost 14, :appearance nil, :name "Date"})

If an exception is thrown, the transaction is rolled back:

;; Query how many items the table has now
(defn fruit-count
  "Query how many items are in the fruit table."
  [db-spec]
  (let [result (jdbc/query db-spec (sql/select "count(*)" :fruit))]
    (:count (first result))))

(fruit-count db-spec)
;; -> 11

(jdbc/with-db-transaction [trans-conn db-spec]
  (jdbc/insert! trans-conn :fruit
    [:name :cost]
    ["Grape" 86]
    ["Pear" 86])
  ;; At this point the insert! call is complete, but the transaction
  ;; is not. An exception will cause the transaction to roll back,
  ;; leaving the database unchanged.
  (throw (Exception. "sql-test-exception")))
;; -> Exception sql-test-exception ...

;; The table still has the same number of items
(fruit-count db-spec)
;; -> 11

Transactions can be explicitly set to roll back with the clojure.java.jdbc/db-set-rollback-only! function. This setting can be unset with the clojure.java.jdbc/db-unset-rollback-only! function and tested with the clojure.java.jdbc/db-is-rollback-only function:

(fruit-count db-spec)
;; -> 11

(jdbc/with-db-transaction [trans-conn db-spec]
  (jdbc/db-set-rollback-only! trans-conn)
  (jdbc/insert! trans-conn :fruit {:name "Pear" :cost 69}))
;; -> ({:grade nil, :unit nil, :cost 69, :appearance nil, :name "Pear"})

;; The table still has the same number of items
(fruit-count db-spec)
;; -> 11
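
For completeness, here is a brief sketch of checking and clearing the rollback flag inside a transaction:

(jdbc/with-db-transaction [trans-conn db-spec]
  (jdbc/db-set-rollback-only! trans-conn)
  (println (jdbc/db-is-rollback-only trans-conn)) ; prints true
  (jdbc/db-unset-rollback-only! trans-conn)
  (jdbc/db-is-rollback-only trans-conn))
;; -> false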

Reading and processing records

Database records are returned from queries as Clojure maps, with the table’s column names used as keys. Retrieval of a set of database records produces a sequence of maps that can then be processed with all the normal Clojure functions. Here, we query all the records in the fruit table, gathering the name and grade of any low-quality fruit:

(->> (jdbc/query db-spec (sql/select "name, grade" :fruit))
     ;; Filter all fruits by fruits with grade < 3.0
     (filter (fn [{:keys [grade]}] (and grade (< grade 3.0))))
     (map (juxt :name :grade)))
;; -> (["Tomato" 1.4])

The preceding example uses the SQL DSL provided by the java-jdbc.sql namespace. The DSL implements a simple abstraction over the generation of SQL statements. At present, it provides some basic mechanisms for selects, joins, where clauses, and order-by clauses:

(defn fresh-fruit []
  (jdbc/query db-spec
    (sql/select [:f.name] {:fruit :f}
      (sql/where {:f.appearance "fresh"})
      (sql/order-by :f.name))))

(fresh-fruit)
;; -> ({:name "Kiwifruit"} {:name "Peach"})

The use of the SQL DSL is entirely optional. For more direct control, a vector containing an SQL query string and arguments can be passed to the query function. The following function also finds low-quality fruit but does it by passing a quality threshold value directly to the SQL statement:

(defn find-low-quality [acceptable]
  (jdbc/query db-spec
              ["select name, grade from fruit where grade < ?" acceptable]))

(find-low-quality 3.0)
;; -> ({:grade 1.4, :name "Tomato"})

The jdbc/query function has several optional keyword parameters that control how it constructs the returned result set. The :result-set-fn parameter specifies a function that is applied to the entire result set (a lazy sequence) before it is returned. The default argument is the doall function:

(defn hi-lo [rs] [(first rs) (last rs)])

;; Find the highest- and lowest-cost fruits
(jdbc/query db-spec
            ["select * from fruit order by cost desc"]
            :result-set-fn hi-lo)
;; -> [{:grade nil, :unit nil, :cost 77, :appearance nil, :name "Kumquat"}
;;     {:grade 1.4, :unit nil, :cost 10, :appearance "rotten", :name "Tomato"}]

The :row-fn parameter specifies a function that is applied to each result row as the result is constructed. The default argument is the identity function:

(defn add-tax [row] (assoc row :tax (* 0.08 (row :cost))))

(jdbc/query db-spec
             ["select name,cost from fruit where cost = 12"]
             :row-fn add-tax)
;; -> ({:tax 0.96, :cost 12, :name "Plum"} {:tax 0.96, :cost 12, :name "Fig"})

The Boolean :as-arrays? parameter indicates whether to return the results as a sequence of vectors (the first of which holds the column names) instead of maps. The default argument value is false:

(jdbc/query db-spec
            ["select name,cost,grade from fruit where appearance = 'spotting'"]
            :as-arrays? true)
;; -> ([:name :cost :grade] ["Banana" 35 nil] ["Mango" 75 7.0])

Finally, the :identifiers parameter takes a function that is applied to each column name in the result set. The default argument is the clojure.string/lower-case function, which lowercases the table’s column names before they are converted to keywords. If your application needs to perform some different conversion of column names, provide an alternate function using this keyword parameter.
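
For example, to convert the underscores that often appear in SQL column names into more idiomatic dashes, pass a function that does the replacement. A small sketch follows; the fruit table's column names happen not to contain underscores, so the keys come back unchanged here:

(require '[clojure.string :as str])

(jdbc/query db-spec
            ["select name, cost from fruit where cost = 12"]
            :identifiers #(-> % str/lower-case (str/replace "_" "-")))
;; -> ({:cost 12, :name "Plum"} {:cost 12, :name "Fig"})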

The clojure.java.jdbc library is a good choice for quick and easy access to most popular relational databases. Its use of Clojure’s vectors and maps to represent records blends well with Clojure’s emphasis on data-oriented programming. Novice users of SQL can conveniently utilize the provided DSLs while expert users can more directly construct and execute complex SQL statements.

See Also

6.4. Simplifying SQL with Korma

Problem

You want to work with data stored in a relational database without writing SQL by hand.

Solution

Use Korma as a DSL for generating SQL queries and traversing relationships.

Before starting, add [korma "0.3.0-RC6"] and [org.postgresql/postgresql "9.2-1002-jdbc4"] to your project’s dependencies or start a REPL using lein-try:

$ lein try korma org.postgresql/postgresql

To follow along with this recipe, you’ll need a running SQL database and an existing table to connect to. We suggest PostgreSQL.[21]

After you have PostgreSQL running (presumably on localhost:5432), run the following command to create a database for this recipe:

# On Mac:
$ /Applications/Postgres.app/Contents/MacOS/bin/createdb learn_korma

# Everyone else:
$ createdb learn_korma

To connect to the learn_korma database, use defdb with the postgres helper. Because Korma is a rather large DSL, it is acceptable to :refer :all its contents into model namespaces:

(require '[korma.db :refer :all]
         '[korma.core :refer :all])

(defdb db
  (postgres {:db "learn_korma"}))

To interact with a table in your database, define and create what Korma calls entities. Here you’ll define an entity for blog posts:

(defentity posts
  (pk :id)
  (table :posts) ; Table name
  (entity-fields :title :content)) ; Default fields to SELECT

Normally you’d use a proper migration library for your schema, but for the sake of simplicity, we’ll create a table manually. Use the exec-raw function to execute raw SQL statements against the database. You should only do this where strictly necessary:

(def create-posts (str "CREATE TABLE posts "
                       "(id serial, title text, content text,"
                       "created_on timestamp default current_timestamp);"))

(exec-raw create-posts)

Now that the posts table exists, you can add records to the database by invoking insert against posts, passing the new record to values. Each record is represented by a map. The names of the keys in the map must match the names of the columns in the database:

(insert posts
        (values {:title "First post" :content "blah blah blah"}))

To retrieve values from the database, query using select. Successful queries will return a sequence of maps, each containing keys representing the column names:

(select posts (limit 1))
;; -> [{:created_on #inst "2013-11-01T19:21:10.652920000-00:00",
;;      :content "blah blah blah",
;;      :title "First post",
;;      :id 1}]

To correct or change existing records, use the update macro. Invoke update against posts, providing a set-fields declaration to specify what should change and a where declaration narrowing what records to make those changes to:

(update posts
        (set-fields {:title "Best Post"})
        (where {:title "First post"}))
;; -> {:title "Best Post", :id 1 ...}

The delete macro works similarly to update, but doesn’t take a set-fields declaration:

(delete posts
        (where {:title "Best Post"}))

(select posts)
;; -> []

Discussion

Korma provides a simple and intuitive way to construct SQL queries from Clojure. The advantage of using Korma is that the queries are written as regular code instead of SQL strings. You can easily compose queries and abstract common operations.

Korma exposes these abilities through its entity system. Entities are an abstraction over traditional SQL tables that mask the complexity of SQL’s crufty and complicated DDL (data definition language). Via the defentity macro, you have access to all of the power of traditional SQL, packaged in a readable, Clojure-based DSL.

When defining entities with defentity, you can pass in a number of options. Some common options include table to specify a table name, pk to specify the default ID field (primary key), entity-fields to specify the default fields for SELECT statements, or even db to specify which database the entity belongs in.
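
A sketch combining several of these options for a hypothetical comments table (korma.core's database helper is what ties an entity to a specific connection, such as the db defined earlier):

(defentity comments
  (pk :id)
  (table :comments)
  (entity-fields :post_id :body)
  (database db)) ; explicitly use the `db` connection defined above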

Entities also simplify defining relations between tables. Entity declaration statements such as has-one, has-many, belongs-to, and many-to-many define relationships to other entities. Consider adding an author to each of our blog posts:

;; Create authors, assuming posts has an author_id
(defentity authors
  ;; By default, foreign-key will be :authors_id, but that is a little
  ;; awkward
  (has-many posts {:fk :author_id}))

;; Redefine posts such that it assumes it has an author_id
(defentity posts
  (belongs-to authors {:fk :author_id}))

;; Create the authors table
(exec-raw "CREATE TABLE authors (id serial, name text);")

;; Add the author_id column to posts
(exec-raw "ALTER TABLE posts ADD COLUMN author_id int;")

(def ryan (insert authors (values {:name "Ryan"})))
ryan
;; -> {:name "Ryan", :id 1}

(insert posts (values [{:title "My first post!", :author_id (:id ryan)}
                       {:title "My second post.", :author_id (:id ryan)}]))
(select posts
        (where {:author_id (:id ryan)}))
;; -> [{:author_id 1,
;;      ...
;;      :title "My first post!",
;;      :id 4}
;;     {:author_id 1,
;;      ...
;;      :title "My second post.",
;;      :id 5}]

Stemming from its entity system, Korma provides DSL versions of common SQL statements such as select, update, insert, and delete. One of the most interesting query types is select, which provides support for nearly every SELECT statement option, including simplified table joins (via its relation helpers). Some notable helpers include aggregate, join, order, group, and having. Chances are, if it is an SQL statement feature, Korma has a helper for it.
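
As a sketch, here are a couple of those helpers run against the posts table (results will depend on the data you have inserted; the :cnt alias is arbitrary):

;; The five most recently created posts
(select posts
        (order :created_on :DESC)
        (limit 5))

;; A count of all posts
(select posts
        (aggregate (count :id) :cnt))
;; -> e.g. [{:cnt 2}]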

Korma’s DSL isn’t only convenient; it’s also composable. Using select* instead of select returns a query as a value, instead of an evaluated result. You can pipeline query values through regular select helpers to build up or store partial queries. Finally, invoke select on a query value to execute it and receive its result:

(defn authors-posts
  "Retrieve all posts for a person with a given name"
  [name]
  (-> (select* posts)
      (with authors)
      (where {:authors.name name})))

;; Find the title of all posts by the author named "Ryan"
(-> (authors-posts "Ryan")
    (where (like :title "%second%"))
    (fields :title)
    select)
;; -> [{:title "My second post."}]

Another convenience Korma provides is default connections. You may have noticed in the examples that we never referred to the db we defined. When only a single connection is defined, it will be used by default and you don’t have to pass it explicitly. If you like, you can define multiple connections and wrap a series of statements in a with-db call:

(with-db db
  (select (authors-posts "Ryan")))

See Also

6.5. Performing Full-Text Search with Lucene

Problem

You want to support flexible full-text search over an unstructured or semistructured dataset using Lucene. For example, you want to return all people in the United States that have “Clojure” anywhere in their job descriptions.

Solution

Use Clucy, a Clojure wrapper for Lucene. Clucy provides the tools to build and query indexes from within a Clojure process.

To follow along with this recipe, create a new project (lein new text-search), add [clucy "0.4.0"] to its dependencies, and start a REPL using lein repl.[22]

The following code creates and queries a simple in-memory index:

(require '[clucy.core :as clucy])

(def index (clucy/memory-index))
;; -> #'user/index


(clucy/add index
   {:name "Alice" :description "Clojure expert"
    :location "North Carolina, United States"}
   {:name "Bob" :description "Clojure novice"
    :location "Berlin, Germany"}
   {:name "Eve" :description "Eavesdropper"
    :location "Maryland, United States"})
;; -> nil

(clucy/search index "description:clojure AND location:\"united states\"" 10)
;; -> ({:name "Alice",
;;      :location "North Carolina, United States",
;;      :description "Clojure expert"})

Discussion

Lucene is a Java library for information retrieval. To use Lucene, you generate documents and index them for later retrieval. Documents consist of fields and terms. In this example, the documents are quite small, but Lucene is capable of efficiently indexing large numbers of very large documents as well.

Clucy wraps Lucene in a convenient manner for use in Clojure and is capable of generating Lucene documents directly from simple Clojure maps, where keys map to fields and values map to textual data to be indexed.

clucy.core/search takes an index, a query string, and the number of results to return as parameters. Lucene is able to efficiently query in part because it is not necessary to return all matching documents, just the top n best matches.

Note

Clucy does not work as well out of the box with nested values in your maps. Be sure to flatten out values into simple strings for proper indexing and retrieval.
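
For example, a nested collection can be joined into a single string before the map is handed to clucy/add (a sketch using hypothetical data):

(require '[clojure.string :as str])

(clucy/add index
   (update-in {:name "Carol"
               :description "Search enthusiast"
               :skills ["Clojure" "Lucene" "Datomic"]}
              [:skills]
              #(str/join " " %)))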

This example uses a memory-index, which stores the index in system memory. In most real applications, you’ll want to persist the index to disk, which allows it to grow larger than the available memory and allows you to restart your process without re-indexing. Clucy lets you construct a Lucene disk index via the disk-index function:

(def index (clucy.core/disk-index "/tmp/index"))

As part of the process for generating documents, Lucene calls an analyzer on your strings to generate tokens for indexing. The default StandardAnalyzer is sufficient for most purposes and can be customized with a list of “stop words” to be ignored during token generation:

(import 'org.apache.lucene.analysis.standard.StandardAnalyzer)
;; -> org.apache.lucene.analysis.standard.StandardAnalyzer

(import 'org.apache.lucene.analysis.util.CharArraySet)
;; -> org.apache.lucene.analysis.util.CharArraySet

(def stop-words
  (doto (CharArraySet. clucy.core/*version* 3 true)
    (.add "do")
    (.add "not")
    (.add "index")))

(binding [clucy.core/*analyzer* (StandardAnalyzer.
                                 clucy.core/*version*
                                 stop-words)]
  ;; Invoke index add and search forms here, within the binding
  )

However, in other situations you may need to use a different analyzer or write your own. For example, the EnglishAnalyzer uses Porter stemming and other techniques better suited to taking into account pluralization or possessives:

(import 'org.apache.lucene.analysis.en.EnglishAnalyzer)
;; -> org.apache.lucene.analysis.en.EnglishAnalyzer

(binding [clucy.core/*analyzer* (EnglishAnalyzer. clucy.core/*version*)]
  ;; Invoke index add and search forms here, within the binding
  )

The basic search query syntax is field:term. By default, multiple clauses will perform an OR search, so an explicit AND is required if both clauses must be true.

If no field is specified, there is an implicit field _content that indexes all map values. Documents returned are ordered by Lucene’s default relevance algorithm, which takes into account term frequency, distance, and document length:

(clucy.core/search index "clojure united states" 10)
;; -> ({:name "Alice",
;;      :location "North Carolina, United States",
;;      :description "Clojure expert"}
;;     {:name "Eve",
;;      :location "Maryland, United States",
;;      :description "Eavesdropper"}
;;     {:name "Bob",
;;      :location "Berlin, Germany",
;;      :description "Clojure novice"})

6.6. Indexing Data with ElasticSearch

Problem

You want to index data using the ElasticSearch indexing and search engine.

Solution

Use Elastisch, a minimalistic Clojure wrapper around the ElasticSearch Java APIs.

In order to successfully work through the examples in this recipe, you should have ElasticSearch installed and running on your local system. You can find details on how to install it on the ElasticSearch website.

ElasticSearch supports multiple transports (e.g., HTTP, native Netty-based transport, and Memcached). Elastisch supports HTTP and native transports. This recipe will use an HTTP transport client for the examples and explain how to switch to the native transport in the discussion section.

To follow along with this recipe, add [clojurewerkz/elastisch "1.2.0"] to your project’s dependencies, or start a REPL using lein-try:

$ lein try clojurewerkz/elastisch

Before you can index and search with Elastisch, it is necessary to tell Elastisch what ElasticSearch node to use. To use the HTTP transport, you use the clojurewerkz.elastisch.rest/connect! function that takes an endpoint as its sole argument:

(require '[clojurewerkz.elastisch.rest :as esr])

(esr/connect! "http://127.0.0.1:9200")

Indexing

Before data can be searched over, it needs to be indexed. Indexing is the process of scanning the text and building a list of search terms and data structures called a search index. Search indexes allow search engines such as ElasticSearch to efficiently retrieve relevant documents for a query.

The process of indexing involves a few steps:

  1. Create an index.
  2. [Optional] Define mappings (how documents should be indexed).
  3. Submit documents for indexing via HTTP or other APIs.

To create an index, use the clojurewerkz.elastisch.rest.index/create function:

(require '[clojurewerkz.elastisch.rest.index :as esi])

(esr/connect! "http://127.0.0.1:9200")

;; Create an index with the given settings and no custom mapping types
(esi/create "test1")

;; Create an index with custom settings
(esi/create "test2" :settings {"number_of_shards" 1})

A full explanation of the available indexing settings is outside the scope of this recipe. Please refer to the Elastisch documentation on indexing for full details.

Creating mappings

Mappings define the fields in a document and what the indexing characteristics are for each field. Mapping types are specified when an index is created using the :mappings option:

(esr/connect! "http://127.0.0.1:9200")

;; Mapping types map structure is the same as in the ElasticSearch API reference
(def mapping-types {"person"
                    {:properties {:username  {:type "string" :store "yes"}
                                 :first-name {:type "string" :store "yes"}
                                 :last-name  {:type "string"}
                                 :age        {:type "integer"}
                                 :title      {:type "string"
                                              :analyzer "snowball"}
                                 :planet     {:type "string"}
                                 :biography  {:type "string"
                                              :analyzer "snowball"
                                              :term_vector
                                              "with_positions_offsets"}}}})

(esi/create "test3" :mappings mapping-types)

Indexing documents

To add a document to an index, use the clojurewerkz.elastisch.rest.document/create function. This will cause a document ID to be generated automatically:

(require '[clojurewerkz.elastisch.rest.document :as esd])

(esr/connect! "http://127.0.0.1:9200")

(def mapping-types {"person"
                    {:properties {:username  {:type "string" :store "yes"}
                                 :first-name {:type "string" :store "yes"}
                                 :last-name  {:type "string"}
                                 :age        {:type "integer"}
                                 :title      {:type "string" :analyzer "snowball"}
                                 :planet     {:type "string"}
                                 :biography  {:type "string"
                                              :analyzer "snowball"
                                              :term_vector
                                              "with_positions_offsets"}}}})

(esi/create "test4" :mappings mapping-types)

(def doc {:username "happyjoe"
          :first-name "Joe"
          :last-name "Smith"
          :age 30
          :title "The Boss"
          :planet "Earth"
          :biography "N/A"})


(esd/create "test4" "person" doc)
;; => {:ok true, :_index "test4", :_type "person",
;;     :_id "2vr8sP-LTRWhSKOxyWOi_Q", :_version 1}

clojurewerkz.elastisch.rest.document/put will add a document to the index but expects a document ID to be provided:

(esd/put "test4" "person" "happyjoe" doc)

Discussion

Whenever a document is added to the ElasticSearch index, it is first analyzed.

Analysis is a process of several stages:

  • Tokenization (breaking field values into tokens)
  • Filtering or modifying tokens
  • Combining tokens with field names to produce terms

How exactly a document was analyzed defines what search queries will match (find) it. ElasticSearch is based on Apache Lucene and offers several analyzers developers can use to achieve the kind of search quality and performance they need. For example, different languages require different analyzers: English, Mandarin Chinese, Arabic, and Russian cannot be analyzed the same way.

It is possible to skip performing analysis for fields and to specify whether field values are stored in the index or not. Fields that are not stored can still be searched over, but will not be included in search results.

ElasticSearch allows users to define exactly how different kinds of documents are indexed, analyzed, and stored.

ElasticSearch has excellent support for multitenancy: an ElasticSearch cluster can have a virtually unlimited number of indexes and mapping types. For example, you can use a separate index per user account or organization in a SaaS (Software as a Service) product.
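
As a sketch, per-organization indexes can be created with the same clojurewerkz.elastisch.rest.index/create function used earlier (the organization IDs here are hypothetical):

;; One index per (hypothetical) organization
(doseq [org-id ["acme" "globex"]]
  (esi/create (str "org_" org-id "_documents")
              :settings {"number_of_shards" 1}))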

There are two ways to index a document with ElasticSearch: you can submit a document for indexing without an ID or update a document with a provided ID, in which case if the document already exists, it will be updated (a new version will be created).

While it is fine and common to use automatically created indexes early in development, manually creating indexes lets you configure a lot about how ElasticSearch will index your data and, in turn, what kinds of queries it will be possible to execute against it.

How your data is indexed is primarily controlled by mappings. They define which fields in documents are indexed, if/how they are analyzed, and if they are stored. Each index in ElasticSearch may have one or more mapping types. Mapping types can be thought of as tables in a database (although this analogy does not always stand). Mapping types are the heart of indexing in ElasticSearch and provide access to a lot of ElasticSearch functionality.

For example, a blogging application may have types such as article, comment, and person. Each has distinct mapping settings that define the set of fields that documents of that type have, how they are supposed to be indexed (and, in turn, what kinds of queries will be possible over them), what language each field is in, and so on. Getting mapping types right for your application is the key to a good search experience. It also takes time and experimentation.

Mapping types define document fields and their core types (e.g., string, integer, or date/time). Settings are provided to ElasticSearch as a JSON document, and this is how they are documented on the ElasticSearch site.

With Elastisch, mapping settings are specified as Clojure maps with the same structure (schema). A very minimalistic example:

{"tweet" {:properties {:username  {:type "string" :index "not_analyzed"}}}}

Here is a brief and very incomplete list of things that you can define via mapping settings:

  • Document fields, their types, and whether they are analyzed
  • Document time to live (TTL)
  • Whether a document type is indexed
  • Special fields ("_all", default field, etc.)
  • Document-level boosting
  • Timestamp field

When an index is created using the clojurewerkz.elastisch.rest.index/create function, mapping settings are passed with the :mappings option, as seen previously.

When it is necessary to update mapping for an index, you can use the clojurewerkz.elastisch.rest.index/update-mapping function:

(esi/update-mapping "myapp_development" "person"
                    :mapping {:properties
                              {:first-name {:type "string" :store "no"}}})

In a mapping configuration, settings are passed as maps where keys are names (strings or keywords) and values are maps of the actual settings. In this example, the only setting is :properties, which defines a single field—a string that is not analyzed:

{"tweet" {:properties {:username  {:type "string" :index "not_analyzed"}}}}

There is much more to the indexing and mapping options, but that’s outside the scope of a single recipe. See the Elastisch indexing documentation for an exhaustive list of the capabilities provided.
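
Finally, recall that Elastisch also supports ElasticSearch's native transport. Switching to it means requiring the clojurewerkz.elastisch.native namespaces instead of the REST ones. The following is only a sketch, assuming a local node listening on the default native transport port (9300) and the default cluster name; consult the Elastisch documentation for the exact connection arguments in your version:

;; A sketch only; verify against the Elastisch docs for your version.
(require '[clojurewerkz.elastisch.native :as es])

(es/connect! [["127.0.0.1" 9300]]
             {"cluster.name" "elasticsearch"})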

See Also

6.7. Working with Cassandra

Problem

You want to work with data stored in Cassandra.

Solution

Use the Cassaforte library to connect to a Cassandra cluster and work with the records in the database.

In order to successfully work through the examples in this recipe, you should have Cassandra installed. You can find details on how to install Cassandra on the GettingStarted page of the wiki.

To follow along with this recipe, add [clojurewerkz/cassaforte "1.1.0"] to your project’s dependencies, or start a REPL using lein-try:

$ lein try clojurewerkz/cassaforte

In order to connect to your Cassandra cluster and create and use your first keyspace, you will need the clojurewerkz.cassaforte.client, clojurewerkz.cassaforte.cql, and clojurewerkz.cassaforte.query namespaces.

clojurewerkz.cassaforte.client is responsible for connection—the other two provide an easy interface to execute queries:

(require '[clojurewerkz.cassaforte.client :as client]
         '[clojurewerkz.cassaforte.cql :as cql]
         '[clojurewerkz.cassaforte.query :as q])

;; Connect to 2 nodes in a cluster
(client/connect! ["localhost" "another.node.local"])

;; Create a keyspace named `cassaforte_keyspace`, using
;; the Simple Replication Strategy and a replication factor of 2
(cql/create-keyspace "cassaforte_keyspace"
                     (q/with {:replication
                              {:class "SimpleStrategy"
                               :replication_factor 2}}))

;; Switch to the keyspace
(cql/use-keyspace "cassaforte_keyspace")

Now, you can create tables and start inserting data into them. For that, invoke the create-table and insert functions of the clojurewerkz.cassaforte.cql namespace:

(cql/create-table "users"
                  (q/column-definitions {:name :varchar
                                         :city :varchar
                                         :age  :int
                                         :primary-key [:name]}))

Now, insert several users into the table:

(cql/insert "users" {:name "Alex" :city "Munich" :age (int 26)})
(cql/insert "users" {:name "Robert" :city "Brussels" :age (int 30)})

You can access these records using a select query. For example, if you want to retrieve all the users from the table or use limit in your query, you can run:

;; Will retrieve all users
(cql/select "users")

;; Will retrieve top 10 users
(cql/select "users" (q/limit 10))

Alternatively, if you want to retrieve information about a single person by a given name, you can add a where clause to it:

(cql/select "users" (q/where :name "Alex"))

Discussion

Cassandra is an open source implementation of many of the ideas in Amazon’s landmark Dynamo Paper. It’s a key/value datastore, and it’s not aware of any relationships between tables and data points. Cassandra is a distributed datastore and is designed to be highly available. For that, it replicates data within the cluster. The data is stored redundantly on multiple nodes. If one node fails, data is still available for retrieval from a different node or multiple nodes.

Cassandra starts making sense when your data is rather big. Because it was built for distribution, you can scale your reads and writes, and fine-tune and manage your database’s consistency and availability. Cassandra handles network partitions well, so even if several of your nodes are unavailable for some time, you will still be able to read and write data until the network partition heals. If your dataset is rather small, you don’t expect it to grow significantly anytime soon, and you need to run many ad hoc queries against the dataset, then Cassandra may not make sense.

Consistency and availability are tunable values. You can get better availability by sacrificing data consistency: due to network partitions, not all the nodes will hold the latest snapshot of data at all times, but you’ll be still able to respond to writes and receive reads. If you choose to have strong consistency, conversely, the latency will increase, since more nodes should respond successfully for reads and writes. Eventual consistency guarantees that, if no conflicting writes are made for the data point, eventually all nodes will hold the latest value.

Like most datastores, Cassandra has concepts of separate databases (keyspaces in Cassandra terminology). Every keyspace holds tables (sometimes called column families). Tables hold rows, and rows consist of columns. Each column has a key (column name), value, write timestamp, and time to live.

Cassandra uses two different communication protocols: an older binary protocol called Thrift, and CQL (Cassandra Query Language). All query operators in Cassaforte generate CQL code under the hood. Here are a couple of examples of how these operations translate to CQL internally:

(cql/select "users" (q/where :name "Alex"))
;; SELECT * FROM users WHERE name='Alex';

(cql/insert "users" {:name "Alex" :city "Munich" :age (int 26)})
;; INSERT INTO users (name, city, age) VALUES ('Alex', 'Munich', 26);

There’s much more to Cassandra than just creating tables and inserting values. If you want to update records in your database, you can call the update function:

(cql/update "users"
             {:city "Berlin"}
             (q/where :name "Alex"))

Deleting records from the database is just as easy:

;; Will delete just one user
(cql/delete :users (q/where :name "Alex"))

;; Will delete all users whose names match within IN clause
(cql/delete :users (q/where :name [:in ["Alex" "Robert"]]))

If you’d like to execute some arbitrary CQL statements, outside of Cassaforte’s macro-based DSL, you can pass a string to the client/execute function:

(client/execute
  "INSERT INTO users (name, city, age) VALUES ('Alex', 'Munich', 19);")

For each issued write, you can specify an optional time to live to expire the data after a certain period of time. This is useful for caching and for data that you only want to hold for a certain period of time (like user sessions). For example, if you want the record to live for just 60 seconds, you can run:

(cql/insert "users" {:name "Alex" :city "Munich" :age (int 26)}
                    (q/using :ttl 60))

Another concept that people like about Cassandra is distributed counters. Counter columns provide an efficient way to count or sum anything you need. This is achieved by using atomic increment/decrement operations on values. In order to create a table with a counter from Cassaforte, you can use the :counter column type:

(cql/create-table :scores
                  (q/column-definitions {:username :varchar
                                         :score    :counter
                                         :primary-key [:username]}))

You can increment and decrement counters by using the increment-by and decrement-by queries:

(cql/update :scores
            {:score (q/increment-by 50)}
            (q/where :username "Alex"))

(cql/update :scores
            {:score (q/decrement-by 5)}
            (q/where :username "Robert"))

6.8. Working with MongoDB

Problem

You want to work with data stored in MongoDB.

Solution

Use Monger to connect to MongoDB and search or manipulate the data. Monger is a Clojure wrapper around the Java MongoDB driver.

Before using Mongo from your Clojure code, you must have a running instance of MongoDB to connect to. See MongoDB’s installation guide for instructions on how to install MongoDB on your local system.

When you’re ready to write a Clojure MongoDB client, start a REPL using lein-try:

$ lein try com.novemberain/monger

To connect to MongoDB, use the monger.core/connect! function. This will store your connection in the *mongodb-connection* dynamic var. If you want to get a connection to use without storing it in a dynamic var, you can use monger.core/connect with the same options:

(require '[monger.core :as mongo])

;; Connect to localhost
(mongo/connect! {:host "127.0.0.1" :port 27017})

;; Disconnect when you are done
(mongo/disconnect!)

Once you are connected, you can insert and query documents easily:

(require '[monger.core :as mongo]
         '[monger.collection :as coll])
(import '[org.bson.types ObjectId])

;; Set the database in the *mongodb-database* var
(mongo/use-db! "mongo-time")

;; Insert one document
(coll/insert "users" {:name "Jeremiah Forthright" :state "TX"})

;; Insert a batch of documents
(coll/insert-batch "users" [{:name "Pete Killibrew" :state "KY"}
                            {:name "Wendy Perkins" :state "OK"}
                            {:name "Steel Whitaker" :state "OK"}
                            {:name "Sarah LaRue" :state "WY"}])

;; Find all documents and return a com.mongodb.DBCursor
(coll/find "users")

;; Find all documents matching a query and return a DBCursor
(coll/find "users" {:state "OK"})

;; Find documents and return them as Clojure maps
(coll/find-maps "users" {:state "OK"})
;; -> ({:_id #<ObjectId 520...>, :state "OK", :name "Wendy Perkins"}
;;     {:_id #<ObjectId 520...>, :state "OK", :name "Steel Whitaker"})

;; Find one document and return a com.mongodb.DBObject
(coll/find-one "users" {:name "Pete Killibrew"})

;; Find one document and return it as a Clojure map
(coll/find-one-as-map "users" {:name "Sarah LaRue"})
;; -> {:_id #<ObjectId 520...>, :state "WY", :name "Sarah LaRue"}

Discussion

MongoDB, especially with Monger, can be a natural choice for storing Clojure data. It stores data as BSON (binary JSON), which maps well to Clojure’s own vectors and maps.

There are several ways to connect to Mongo, depending on how much you need to customize your connection and whether you have a map of options or a URI:

;; Connect to localhost, port 27017 by default
(mongo/connect!)

;; Connect to another machine
(mongo/connect! {:host "192.168.1.100" :port 27017})

;; Connect using more complex options
(let [options (mongo/mongo-options :auto-connect-retry true
                                   :connect-timeout 15
                                   :socket-timeout 15)
      server (mongo/server-address "192.168.1.100" 27017)]
  (mongo/connect! server options))

;; Connect via a URI
(mongo/connect-via-uri! (System/getenv "MONGOHQ_URL"))

When inserting data, giving each document an _id is optional. One will be created for you if you do not have one in your document. It often makes sense to add it yourself, however, if you need to reference the document afterward:

(require '[monger.collection :as coll])
(import '[org.bson.types ObjectId])

(let [id (ObjectId.)
      user {:name "Lola Morales"}]
  (coll/insert "users" (assoc user :_id id))
  ;; Later, look up your user by id
  (coll/find-map-by-id "users" id))
;; -> {:_id #<ObjectId 521...>, :name "Lola Morales"}

In its idiomatic usage, Monger is set up to work with one connection and one database, as monger.core/connect! and monger.core/use-db! set dynamic vars to hold their information.

It is easy to work around this, though. You can use binding to set these explicitly around code. In addition, you can use the monger.multi.collection namespace instead of monger.collection. All functions in the monger.multi.collection namespace take a database as their first argument:

(require '[monger.core :as mongo]
         '[monger.multi.collection :as multi])

(mongo/connect!)

;; use-db! takes a string for the database, as it is a convenience function,
;; but for monger.multi.collection and other functions, we need to use
;; get-db to get the database
(let [stats-server (mongo/connect "stats.example.org")
      app-db (mongo/get-db "mongo-time")
      geo-db (mongo/get-db "geography")]

  ;; Record data in our stats server
  (binding [mongo/*mongodb-connection* stats-server]
    (multi/insert (mongo/get-db "stats") "access"
                  {:ip "127.0.0.1" :time (java.util.Date.)}))

  ;; Find users in our application DB
  (multi/find-maps app-db "users" {:state "WY"})

  ;; Insert a square in our geography DB
  (multi/insert geo-db "shapes"
                {:name "square" :sides 4
                 :parallel true :equal true}))

The basic find functions in monger.collection will work for simple queries, but you will soon find yourself needing to make more complex queries, which is where monger.query comes in. This is a domain-specific language for MongoDB queries:

(require '[monger.query :as q])

;; Find users, skipping the first two and getting the next three.
(q/with-collection "users"
  (q/find {})
  (q/skip 2)
  (q/limit 3))

;; Get all the users from Oklahoma, sorted by name.
;; You must use array-map with sort so you can keep keys in order.
(q/with-collection "users"
  (q/find {:state "OK"})
  (q/sort (array-map :name 1)))

;; Get all users not from Oklahoma or with names that start with "S".
(q/with-collection "users"
  (q/find {"$or" [{:state {"$ne" "OK"}}
                  {:name #"^S"}]}))

See Also

6.9. Working with Redis

Problem

You want to work with data in Redis.

Solution

Use Carmine to connect to and interact with Redis.

Note

To use this recipe, you should first install Redis and have it running locally. You can find details on how to install Redis at the official Redis download page. If you are on Windows, you will want to look at the Microsoft Open Tech GitHub Redis project.

To follow along with this recipe, add [com.taoensso/carmine "2.2.0"] to your project’s dependencies, or start a REPL using lein-try:

$ lein try com.taoensso/carmine

To use Carmine, you must first define a connection spec:

(def server-connection {:pool {:max-active 8}
                        :spec {:host     "localhost"
                               :port     6379
                               ;;:password ""
                               :timeout  4000}})

Carmine supports all of the Redis commands, and the names (for the most part) match the Redis documentation. Use the wcar function and the connection specification server-connection to send all the Redis commands you already know and love:

(require '[taoensso.carmine :as car :refer (wcar)])

(wcar server-connection (car/set "Nick" "Nack"))
;; -> "OK"
(wcar server-connection (car/get "Nick"))
;; -> "Nack"
(wcar server-connection (car/hset "founder" "name" "Tim"))
;; -> 0
(wcar server-connection (car/hset "founder" "age" 59))
;; -> 0
(wcar server-connection (car/hgetall "founder"))
;; -> ["name" "Tim" "age" "59"]

Passing in multiple commands will pipeline them and return the results together as a vector:

(wcar server-connection (car/set "paddywhacks" 0)
                        (car/incr "paddywhacks")
                        (car/get "paddywhacks"))
;; -> ["OK" 1 "1"]

Discussion

Redis describes itself as a data structure server. Because its data structures closely mirror Clojure's core data structures, the two make a natural pairing for a wide range of problems. Redis's speed and key/value storage make it especially useful for caching and memoization applications (more on that later).

You can remove some boilerplate by wrapping the call to wcar in a macro that passes the connection specification for you:

(defmacro wcar* [& body] `(car/wcar server-connection ~@body))

(wcar* (car/set "Nick" "Nack"))
;; -> "OK"
(wcar* (car/get "Nick"))
;; -> "Nack"

Serialization is handled automatically and for most cases just works. Simply pass in the data you want to store, and Carmine will automatically serialize/deserialize it for you:

(wcar* (car/set "some-key" {:event "An Event", :timestamp (new java.util.Date)})
       (car/get "some-key"))
;; -> [OK {:event An Event, :timestamp #inst "2013-08-18T21:31:33.993-00:00"}]

This works great as long as you stick to core Clojure data types. However, if you need to support storing custom data types, you will need to deal with the underlying serialization library, called Nippy. For more information, see the Nippy GitHub project.

Redis is great to use as a memoization storage backend. Obviously, there are some serious trade-offs to consider when weighing against an in-memory solution, such as the core.cache library. But for the right situation, it can be an incredible boost. Consider, for example, memoizing a function that hits an external web service to fetch the current weather. With minimal effort, multiple servers can share the latest data and even have stale data automatically expire and refresh. The following is an example for just such a situation:

(defn redis-memoize
  "Convert a function to one that is memoized using Redis as storage."
  [key-prefix ttl-seconds connection-spec f]
  (fn [& args]
    (let [key-name [key-prefix args]]
      (if-let [found-result (wcar connection-spec (car/get key-name))]
        found-result
        (let [new-result (apply f args)]
          (wcar connection-spec (car/set key-name new-result)
                                (car/expire key-name ttl-seconds))
          new-result)))))

This makes a couple of assumptions worth noting. First, it assumes that the arguments for the function being memoized are supported by Nippy (see the earlier serialization example). Second, it assumes that the memoized data should be expired after a specified number of seconds. To use redis-memoize, simply pass in a function. The following is a highly contrived example that uses the server-connection defined previously:

(defn square [x]
  (printf "Ran square for: %s\n" x)
  (* x x))

(def redis-squared
  (redis-memoize "squared" 10 server-connection square))

(redis-squared 99)
;; -> Ran square for: 99
;; -> 9801
(redis-squared 99)
;; -> 9801

In addition to the features showcased earlier, Carmine includes (among other things) a message queue, distributed locks, a Ring session store, and even an implementation of DynamoDB (which is in alpha at the time of writing). These features are outside the scope of this recipe, but they’re well documented and straightforward to use. Consult the Carmine GitHub project for more information.

See Also

6.10. Connecting to a Datomic Database

Problem

You need to connect to a Datomic database.

Solution

Before starting, add [com.datomic/datomic-free "0.8.4218"] to your project’s dependencies or start a REPL using lein-try:

$ lein try com.datomic/datomic-free

To create and connect to an in-memory database, use datomic.api/create-database and datomic.api/connect:

(require '[datomic.api :as d])

(def uri "datomic:mem://sample-database")

(d/create-database uri)
;; -> true

(def conn (d/connect uri))

conn
;; -> #<LocalConnection datomic.peer.LocalConnection@49384d99>

Once you have a connection, you can use it to get a database value with datomic.api/db. This value is used to query a database:

(def db (d/db (d/connect uri)))

db
;; -> datomic.db.Db@7b7fea26

You can also use the connection to transact data using datomic.api/transact:

;; Transact the schema for your Next Big Thing
(def my-great-schema []) ; This vector intentionally left blank
(d/transact (d/connect uri) my-great-schema)

Discussion

You’ll notice in the solution that we not only connected to a database, but we created it too. This pattern is common when using in-memory databases, as no in-memory databases exist in a fresh JVM. It is not strictly necessary to call create-database if the database already exists, but it is safe to do so—create-database is idempotent and will return false if one already exists. When connecting to a database that isn’t in memory, it is necessary for the relevant transactor and storage service to be running.
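
For example, repeating the call from the solution against the same URI is harmless and simply reports that the database already exists:

(d/create-database uri)
;; -> false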

The return value of d/connect is used when querying a database value or when transacting data. It is also used when reading the transaction log, when consuming the transaction report queue, or when performing administrative tasks such as requesting an indexing job, garbage collecting storage, and disposing of resources associated with the connection.

Connections are thread-safe and are cached by URI internally, so there is no need to pool connections yourself. There is no performance overhead for creating many connections to the same URI.

Storage services

Datomic transactor processes have a limit on the number of concurrently connected peer processes. Datomic Free has a limit of two peers per transactor. For nondistributed applications, this may well be sufficient. If you’re building a larger service, then you may need a Datomic Pro license for more peers.

There are several options for storage services that back Datomic. Three are built-in, and the rest use external services. Datomic Free includes access to the in-memory and :free storage backends. Datomic Pro and Pro Starter Edition include access to all services.

Built-in storage options

The built-in storage options are:

  • In local memory: "datomic:mem://[db-name]"
  • Free, for use with Datomic Free, subject to a two-peer limit: "datomic:free://host[:port]/[db-name]"
  • Dev, for use with Datomic Pro, subject to the licensed peer limit: "datomic:dev://host[:port]/[db-name]"

Free and Dev can also be configured to use alternate ports for storage: "datomic:free://host[:port]/[db-name]?h2-port=[port]&h2-web-port=[port]".

By default, these ports will be one and two more than the transactor port, respectively.

External storage service options

Several external storage options also exist. These include:

  • DynamoDB: "datomic:ddb://[aws-region]/[dynamodb-table]/[db-name]?aws_access_key_id=[XXX]&aws_secret_key=[YYY]"
  • Riak: "datomic:riak://host[:port]/bucket/dbname[?interface=http|protobuf]" (default is protobuf)
  • Couchbase: "datomic:couchbase://host/bucket/dbname[?password=xxx]"
  • Infinispan: "datomic:inf://[cluster-member-host:port]/[db-name]"
  • SQL: "datomic:sql://[db-name][?jdbc-url]"

For SQL storage services, the map format can be used instead of the string format. This is useful when specifying objects that can’t be embedded in URI strings, like DataSources. The format for the SQL map is:

{:protocol :sql                         ;; keyword or string
 :db-name "[db-name]"                   ;; keyword or string
 :data-source aDataSourceObject
  ;; OR
 :factory aCallableReturningConnection}
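
As a rough sketch of the map format (assuming a PostgreSQL-backed storage service with the PostgreSQL JDBC driver on the classpath; the host, database name, and credentials here are purely illustrative), a peer could connect through a DataSource like this:

(import 'org.postgresql.ds.PGSimpleDataSource)

;; A DataSource pointing at the SQL database backing Datomic storage.
;; All of these settings are illustrative assumptions.
(def storage-data-source
  (doto (PGSimpleDataSource.)
    (.setServerName "localhost")
    (.setDatabaseName "datomic")
    (.setUser "datomic")
    (.setPassword "datomic")))

(d/connect {:protocol :sql
            :db-name "sample-database"
            :data-source storage-data-source})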

6.11. Defining a Schema for a Datomic Database

Problem

You need to define how your data will be modeled in Datomic. For example, you need to model users and their user groups, relating the two in some way.

Solution

Datomic schemas are defined in terms of attributes. It’s probably easiest to jump straight to an example.

To follow along with this recipe, complete the steps in the solution in Recipe 6.10, “Connecting to a Datomic Database”. After doing this, you should have an in-memory database and connection, conn, to work with.

Consider the attributes a user might have:

  • One email address, which must be unique to the database
  • One name, which we index for fast search
  • Any number of roles (guest, author, and editor)

To define this schema, create a vector with attribute maps for email, name, and role, as well as insertions of the three static roles:

(def user-schema
  [{:db/doc "User email address"
    :db/ident :user/email
    :db/valueType :db.type/string
    :db/cardinality :db.cardinality/one
    :db/unique :db.unique/identity
    :db/id #db/id[:db.part/db]
    :db.install/_attribute :db.part/db}

   {:db/doc "User name"
    :db/ident :user/name
    :db/valueType :db.type/string
    :db/cardinality :db.cardinality/one
    :db/index true
    :db/id #db/id[:db.part/db]
    :db.install/_attribute :db.part/db}

   {:db/doc "User roles"
    :db/ident :user/roles
    :db/valueType :db.type/ref
    :db/cardinality :db.cardinality/many
    :db/id #db/id[:db.part/db]
    :db.install/_attribute :db.part/db}

   [:db/add #db/id[:db.part/user] :db/ident :user.roles/guest]
   [:db/add #db/id[:db.part/user] :db/ident :user.roles/author]
   [:db/add #db/id[:db.part/user] :db/ident :user.roles/editor]])

We define a group as having:

  • One UUID, which must be unique to the database
  • One name, which we index for fast search
  • Any number of related users

Define the group as follows:

(def group-schema
  [{:db/doc "Group UUID"
    :db/ident :group/uuid
    :db/valueType :db.type/uuid
    :db/cardinality :db.cardinality/one
    :db/unique :db.unique/value
    :db/id #db/id[:db.part/db]
    :db.install/_attribute :db.part/db}

   {:db/doc "Group name"
    :db/ident :group/name
    :db/valueType :db.type/string
    :db/cardinality :db.cardinality/one
    :db/index true
    :db/id #db/id[:db.part/db]
    :db.install/_attribute :db.part/db}

   {:db/doc "Group users"
    :db/ident :group/users
    :db/valueType :db.type/ref
    :db/cardinality :db.cardinality/many
    :db/id #db/id[:db.part/db]
    :db.install/_attribute :db.part/db}])

Finally, transact both schema definitions into a database via a connection:

(require '[datomic.api :as d])

@(d/transact (d/connect "datomic:mem://sample-database")
             (concat user-schema group-schema))
;; -> {:db-before datomic.db.Db@25b48c7b,
;;     :db-after datomic.db.Db@5d81650c,
;;     :tx-data [#Datum{:e ... :a ... :v ... :tx  :added true}, ...],
;;     :tempids {-... ..., ...}}

Discussion

A Datomic schema is represented as Clojure data and is added to the database in a transaction, just like any other data we would store. The :db.install/_attribute :db.part/db key/value pair is used by the transactor to make the schema available to the rest of the system.

The schema is placed in the :db.part/db database partition, a partition reserved for schemas. All user data is placed in user partition(s)—either the default of :db.part/user or a custom partition. Partitions are useful for optimizing how indexes sort data, which is useful for optimizing a query. Schema entities require that at least :db/ident, :db/valueType, and :db/cardinality values are present.
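
Installing a custom partition, mentioned above, looks much like installing an attribute. Here is a minimal sketch (the :communities partition name is purely illustrative), reusing the conn from Recipe 6.10:

@(d/transact conn
             [{:db/id #db/id[:db.part/db]
               :db/ident :communities
               :db.install/_partition :db.part/db}])

;; New entities can then be allocated in that partition:
(d/tempid :communities)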

Aside from the schema, Datomic does not enforce how attributes are combined for any given entity. Datomic only requires that a schema be defined up front, enforcing type and uniqueness constraints at runtime.

Use namespaces in schema :db/ident values to help classify entities (such as user in :user/email). Datomic doesn’t do anything specific with namespaces, so using them is optional. There are several options for :db/valueType, listed in Table 6-1.

Table 6-1. :db/valueType options

  • :db.type/keyword
  • :db.type/string
  • :db.type/long
  • :db.type/boolean
  • :db.type/bigint
  • :db.type/float
  • :db.type/double
  • :db.type/bigdec
  • :db.type/instant
  • :db.type/ref
  • :db.type/uuid
  • :db.type/uri
  • :db.type/bytes

See the Datomic schema documentation for an exhaustive listing of their semantics.

Attributes with :db/valueType :db.type/ref can only have other entities as their value(s). You use this type to model relationships between entities. Datomic does not enforce which entities are related to on a given :db/valueType :db.type/ref attribute. Any other entity can be related to—this means that entities can relate to themselves!

You also use :db/valueType :db.type/ref and lone :db/ident values to model enumerations, such as the user roles that you defined. These enumerations are not actually schemas; they are normal entities with a single attribute, :db/ident. An entity’s :db/ident value serves as a shorthand for that entity; you may use this value in lieu of the entity’s :db/id value in transactions and queries.

Attributes with :db/valueType :db.type/ref and :db/unique values are implicitly indexed as though you had added :db/index true to their definitions.

It is also possible to use Lucene full-text indexing on string attributes, using :db/fulltext true and the system-defined fulltext function in Datalog.
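
For example, a hypothetical :user/bio attribute (not part of this recipe's schema) could be declared with :db/fulltext true and then searched from Datalog with the fulltext function; this is only a sketch:

{:db/doc "User biography"
 :db/ident :user/bio
 :db/valueType :db.type/string
 :db/cardinality :db.cardinality/one
 :db/fulltext true
 :db/id #db/id[:db.part/db]
 :db.install/_attribute :db.part/db}

;; Once installed and populated, search it in a query:
(d/q '[:find ?e ?bio
       :in $ ?search
       :where [(fulltext $ :user/bio ?search) [[?e ?bio]]]]
     (d/db conn)
     "clojure")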

There are two options for specifying a uniqueness constraint at :db/unique:

:db.unique/value
Disallows attempts to insert a duplicate value for a different entity ID.
:db.unique/identity
Designates that the attribute value is unique to each entity and enables “upserts”; any attempts to insert a duplicate value for a temporary entity ID will cause all attributes associated with that temporary ID to be merged with the entity already in the database.
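
To see the :db.unique/identity upsert behavior in action, transact the same :user/email value twice with different temporary IDs; both land on the same entity. This is a sketch using the conn from Recipe 6.10 and a purely hypothetical address:

@(d/transact conn [{:db/id (d/tempid :db.part/user)
                    :user/email "[email protected]" ; hypothetical address
                    :user/name "Wilma"}])

;; Re-using the email "upserts" onto the existing entity rather than
;; creating a second one, so the name is simply updated in place.
@(d/transact conn [{:db/id (d/tempid :db.part/user)
                    :user/email "[email protected]" ; hypothetical address
                    :user/name "Wilma Flintstone"}])

(d/q '[:find ?name
       :where
       [?e :user/email "[email protected]"]
       [?e :user/name ?name]]
     (d/db conn))
;; -> #{["Wilma Flintstone"]}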

In the case where you are modeling entities with subentities that only exist in the context of those entities, such as order items on an order or variants for a product, you can use :db/isComponent to simplify working with such subentities. It can only be used on attributes of type :db.type/ref.
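
A hypothetical order/line-item attribute (again, not part of this recipe's schema) would be declared like so:

{:db/doc "Order line items"
 :db/ident :order/items
 :db/valueType :db.type/ref
 :db/isComponent true
 :db/cardinality :db.cardinality/many
 :db/id #db/id[:db.part/db]
 :db.install/_attribute :db.part/db}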

When you use the :db.fn/retractEntity function in a transaction, any entities on the value side of such attributes for the retracted entity will also be retracted. Also, when you use d/touch to realize all the lazy keys in an entity map, component entities will be realized too. Both the retraction and realization behaviors are recursive.

By default, Datomic stores all past values of attributes. If you do not wish to keep past values for a particular attribute, use :db/noHistory true to have Datomic discard previous values. Using this attribute is much like using a traditional update-in-place database.

See Also

6.12. Writing Data to Datomic

Problem

You need to add data to your Datomic database.

Solution

Use a Datomic connection to transact data.

To follow along with this recipe, complete the steps in the solutions to Recipe 6.10, “Connecting to a Datomic Database”, and Recipe 6.11, “Defining a Schema for a Datomic Database”.

After doing this, you will have a connection, conn, and a schema installed against which you can insert data:

(require '[datomic.api :as d :refer [q db]])

(def tx-data [{:db/id (d/tempid :db.part/user)
               :user/email "[email protected]"
               :user/name "Martin Fowler"
               :user/roles [:user.roles/author :user.roles/editor]}])

@(d/transact conn tx-data)

(q '[:find ?name
     :where [?e :user/name ?name]]
   (db conn))
;; -> #{["Martin Fowler"]}

Discussion

This map-based syntax for representing the data expands to a series of :db/add statements. This transaction is identical to the previous one:

(def new-id (d/tempid :db.part/user))
new-id
;; -> #db/id[:db.part/user -1000013]

(def tx-data2 [[:db/add new-id :user/email "[email protected]"]
               [:db/add new-id :user/name "Ryan Neufeld"]
               [:db/add new-id :user/roles [:user.roles/author
                                            :user.roles/editor]]])

(def tx-result @(d/transact conn tx-data2)) ;; Keep this for later...

(q '[:find ?name
     :where [?e :user/name ?name]]
   (db conn))
;; -> #{["Ryan Neufeld"] ["Martin Fowler"]}

Of course, you can use statements like these yourself, or you can use the map syntax shown in the solution; you can even mix the two. To transact multiple entities at once, include them all in the same transaction vector (e.g., (d/transact conn [person1-map person2-map])).

One difference you’ll note between the map and the expanded form is the lack of a :db/add statement for the :db/id key. In the expanded form, this value comes immediately after the action (:db/add) and must be identical between all statements to correlate attributes to a single entity. When specifying an entity as a map, you provide a single ID, which the transactor transparently affixes to each attribute.

What is an appropriate ID? Any new entities are assigned temporary, negative ID values, which can be used to model relationships within the transaction. Upon successfully completing a transaction, all the temporary IDs are assigned in-storage positive ID values. When working with code, the correct approach is to use the datomic.api/tempid function to obtain a temporary ID. The datomic.api/tempid function takes a partition keyword and an optional ID number as its arguments; for most purposes, :db.part/user will suffice.
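
As a sketch of modeling a relationship within a single transaction (assuming the group schema from Recipe 6.11 is installed; the names here are illustrative), explicit temporary IDs can be shared between entities:

(def betty-id (d/tempid :db.part/user -1))
(def group-id (d/tempid :db.part/user -2))

@(d/transact conn
             [{:db/id betty-id
               :user/name "Betty Rubble"}
              {:db/id group-id
               :group/uuid (java.util.UUID/randomUUID)
               :group/name "Bedrock Bowling League"
               :group/users [betty-id]}]) ; refers to the new user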

When working with nonexecutable data, you’ll need to use the data-literal form for temporary IDs. The literal #db/id [:db.part/user] is equivalent to (d/tempid :db.part/user). This form is most useful when you store transaction data in an .edn file, which is most often the case with schema definitions. Again, you should use d/tempid in your code—the #db/id literal will evaluate once at compile time, which means that any code that expects the ID value to change from one execution to the next will fail, because it’ll only ever have one value.

Consider our example file, user-bootstrap.edn: 

[{:db/id #db/id [:db.part/user]
  :user/email "[email protected]"
  :user/name "Martin Fowler"
  :user/roles [:user.roles/author :user.roles/editor]}]
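
One hedged sketch of loading such a file at runtime is to read it with datomic.Util/readAll, which understands the #db/id literal, and then transact the result (this assumes user-bootstrap.edn is readable from the working directory):

(require '[clojure.java.io :as io])
(import 'datomic.Util)

(def user-bootstrap-tx
  ;; readAll returns a list of top-level forms; our file holds one vector.
  (first (Util/readAll (io/reader "user-bootstrap.edn"))))

@(d/transact conn user-bootstrap-tx)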

d/transact returns a future that completes once the transaction has been committed. If you prefer to transact asynchronously, you can use d/transact-async instead, which returns its future immediately; in either case, as with all futures, dereferencing it will block until the transaction completes. Dereferencing the future returns a map with four keys:

:db-before
The value of the database just before the transaction was committed
:db-after
The value of the database just after the transaction was committed
:tx-data
A vector of all the datoms that were transacted
:tempids
A mapping of the temporary IDs to the in-storage IDs, one per temporary ID in the transaction

You can use the :db-after database to query the database directly after the transaction:

(def db-after-tx (:db-after tx-result))

(q '[:find ?name :in $ ?email :where
     [?entity :user/email ?email]
     [?entity :user/name ?name]]
   db-after-tx
   "[email protected]")
;; -> #{["Martin Fowler"]}

You can use the :tempids map to find the in-storage IDs for any new entities you care about, much like you would when retrieving the last insert ID in SQL databases. Invoke datomic.api/resolve-tempid with the :db-after value, the :tempids value, and the original temporary ID to retrieve the realized ID:

(d/resolve-tempid db-after-tx (:tempids tx-result) new-id)
;; -> 17592186045421

6.13. Removing Data from a Datomic Database

Problem

You need to remove data from your Datomic database.

Solution

To remove a value for an attribute, you should use the :db/retract operation in transactions.

To follow along with this recipe, complete the steps in the solutions to Recipe 6.10, “Connecting to a Datomic Database”, and Recipe 6.11, “Defining a Schema for a Datomic Database”. After doing this, you will have a connection, conn, and a schema installed against which you can insert data.

To start things off, add a user, Barney Rubble, and verify that he has an email address:

(def new-id (d/tempid :db.part/user))

(def tx-result @(d/transact conn
                            [{:db/id new-id
                              :user/name "Barney Rubble"
                              :user/email "[email protected]"}]))

(def after-tx-db (:db-after tx-result))

(def barney-id (d/resolve-tempid after-tx-db
                                 (:tempids tx-result)
                                 new-id))

barney-id
;; -> 17592186045429

(d/q '[:find ?email :in $ ?entity-id :where
       [?entity-id :user/email ?email]]
     after-tx-db
     barney-id)
;; -> #{["[email protected]"]}

To retract Barney’s email, transact a transaction with the :db/retract operation:

(def retract-tx-result @(d/transact conn [[:db/retract barney-id
                                           :user/email "[email protected]"]]))

(def after-retract-db (:db-after retract-tx-result))

(d/q '[:find ?email :in $ ?entity-id :where
       [?entity-id :user/email ?email]]
     after-retract-db
     barney-id)
;; -> #{}

To retract entire entities, use the :db.fn/retractEntity built-in transactor function:

(def retract-entity-tx-result
  @(d/transact conn [[:db.fn/retractEntity barney-id]]))

(def after-retract-entity-db (:db-after retract-entity-tx-result))

(d/q '[:find ?entity-id :in $ ?name :where
       [?entity-id :user/name ?name]]
     after-retract-entity-db
     "Barney Rubble")
;; -> #{}

Discussion

When using :db/retract, you provide the value to retract so that in the case of cardinality-many attributes, it’s clear which value to retract from the set of values for that attribute. Regardless of the cardinality, if you provide a value that isn’t in storage, nothing will be retracted. This means that you have to know what value you want to retract; you can’t simply retract everything for an attribute by only providing the entity ID and the attribute.

If you retract values for an attribute that does not use :db/noHistory, you will be able to query past database values to find past values for the attribute.
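
For example, here is a hedged sketch of such a query, using d/history to obtain a database value spanning all of time and binding the datom's added flag (this assumes the retraction from the solution has already been transacted):

(d/q '[:find ?email ?added
       :in $ ?entity-id
       :where [?entity-id :user/email ?email _ ?added]]
     (d/history (d/db conn))
     barney-id)
;; -> both the original assertion (true) and the later retraction (false)
;;    of Barney's email address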

If you retract values for an attribute that uses :db/noHistory, that data will be permanently deleted.

When using :db.fn/retractEntity, all attribute values for all the attributes on that entity will be retracted, as will all :db/ref attributes that have the entity as a value. Any component entities of the entity being retracted will themselves be recursively retracted.

You’ll find that the actual entity ID itself is not retracted, but that it will have no attributes associated with it. This is because once an entity is created, it cannot be retracted. Removing all the attributes and references to the entity has the same effect as if it had been permanently removed, though!

If you need to permanently remove data due to legal concerns or because the data in question falls outside of your domain-specified retention period, use excision to remove the data permanently.

See Also

6.14. Trying Datomic Transactions Without Committing Them

Problem

You want to test a transaction prior to committing it using Datalog or the entity API.

Solution

Build your transaction as usual, but instead of calling d/transact or d/transact-async, use d/with to produce an in-memory database that includes the changes your transaction provides.

To follow along with this recipe, complete the steps in the solutions to Recipe 6.10, “Connecting to a Datomic Database”, and Recipe 6.11, “Defining a Schema for a Datomic Database”. After doing this, you will have a connection, conn, and a schema installed against which you can insert data.

First, add some data to the database about Fred Flintstone. As of about 4000 BCE, Fred didn’t have an email, but we at least know his name:

(require '[datomic.api :as d])

(def new-id (d/tempid :db.part/user))


(def tx-result @(d/transact conn
                            [{:db/id new-id
                              :user/name "Fred Flintstone"}]))

Fast-forward to today: Fred is thawed, after having been frozen in ice for 6,000 years, and he gets his first email address. Prepare a transaction to add an email to the Fred entity:

;; Grab Fred's ID from the original transaction
(def fred-id (d/resolve-tempid (:db-after tx-result)
                               (:tempids tx-result)
                               new-id))

fred-id
;; -> 17592186045421

(def add-freds-email-tx [[:db/add fred-id
                          :user/email "[email protected]"]])

Now, prepare an in-memory database with this new transaction applied. First, get the current database value to use as a basis, then create an in-memory database. Finally, grab the :db-after value so that you can test that the email was properly added:

(defn db-with
  "Return a new database with tx applied"
  [db tx]
  (-> (d/with db tx)
      :db-after))

(def db-after (db-with (d/db conn) add-freds-email-tx))

Compare the value of Fred’s email in the current database with that of Fred’s email in the in-memory database:

(defn users-email
  "Retrieve a user's email given the user's name."
  [db name]
  (-> (d/q '[:find ?email
             :in $ ?name
             :where
             [?entity :user/name ?name]
             [?entity :user/email ?email]]
           db
           name)
      ffirst))

(users-email db-after "Fred Flintstone")
;; -> "[email protected]"

(users-email (d/db conn) "Fred Flintstone")
;; -> nil

As you can see, the current database remains unaffected by this transaction, but the database at db-after now displays the new value.

Discussion

Databases produced by d/with can be used with any of the other API functions that accept a database, including d/with itself. This means that you can layer multiple transactions on top of one another without first having to commit them to the transactor!
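
For example, a second speculative transaction can be layered on top of db-after with the same helper (a sketch; the role value assumes the schema from Recipe 6.11):

(def db-after-2
  (db-with db-after
           [[:db/add fred-id :user/roles :user.roles/author]]))

;; Both uncommitted changes are visible in the layered value...
(d/q '[:find ?role
       :in $ ?entity-id
       :where
       [?entity-id :user/roles ?r]
       [?r :db/ident ?role]]
     db-after-2
     fred-id)
;; -> #{[:user.roles/author]}

;; ...while the database in storage still has neither change.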

One of the things that makes Datomic so powerful is its ability to treat a database as a value. For this reason, the helper functions we’ve written take a database as an argument, not a connection. Now it is not only possible to query the current database, but other values of the database as well.

See Also

6.15. Traversing Datomic Indexes

Problem

You want to execute simple Datomic queries with high performance.

Solution

Use the datomic.api/datoms function to directly access the core Datomic indexes in your database.

To follow along with this recipe, complete the steps in the solutions to Recipe 6.10, “Connecting to a Datomic Database”, and Recipe 6.11, “Defining a Schema for a Datomic Database”. After doing this, you will have a connection, conn, and a schema installed against which you can insert data.

For example, to quickly find the entities that have the provided attribute and value set, invoke datomic.api/datoms, specifying the :avet index (attribute, value, entity, transaction) and the desired attribute and value:

(require '[datomic.api :as d])

(d/transact conn [{:db/id (d/tempid :db.part/user)
                   :user/name "Barney Rubble"
                   :user/email "[email protected]"}])

(defn entities-with-attr-val
  "Return entities with a given attribute and value."
  [db attr val]
  (->> (d/datoms db :avet attr val)
       (map :e)
       (map (partial d/entity db))))


(def barney (first (entities-with-attr-val (d/db conn)
                                           :user/email
                                           "[email protected]")))

(:user/email barney)
;; -> "[email protected]"

Warning

This will only work for attributes where :db/index is true or :db/unique is not nil.

To quickly determine all of the attributes an entity has, use the :eavt-ordered index:

(defn entities-attrs
  "Return attrs of an entity"
  [db entity]
  (->> (d/datoms db :eavt (:db/id entity))
       (map :a)
       (map (partial d/entity db))
       (map :db/ident)))

(entities-attrs (d/db conn) barney)
;; -> (:user/email :user/name)

To quickly find entities that refer, via :db.type/ref, to a provided entity, use the :vaet-ordered index:

;; Add a person that refers to a :user.roles/author role
(d/transact conn [{:db/id (d/tempid :db.part/user)
                   :user/name "Ryan Neufeld"
                   :user/email "[email protected]"
                   :user/roles [:user.roles/author :user.roles/editor]}])

(defn referring-to
  "Find all entities referring to an entity as a certain attribute."
  [db entity]
  (->> (d/datoms db :vaet (:db/id entity))
       (map :e)
       (map (partial d/entity db))))

(def author-entity (d/entity (d/db conn) :user.roles/author))

;; The names of all users with a :user.roles/author role
(map :user/name (referring-to (d/db conn) author-entity))
;; -> ("Ryan Neufeld")

Discussion

For simple lookup queries, like “find by attribute” or “find by value”, nothing beats Datomic’s raw indexes in terms of performance. The datomic.api/datoms interface provides access to all of Datomic’s indexes and conveniently lets you dive in any number of levels, “biting off” only the data you need.

As with most Datomic functions, datoms takes a db as its first argument. You’ll note that in our examples, and elsewhere in the book, we too accept a database as a value, and not a connection—this idiom allows API users to perform varying numbers of operations on the same database value. You should always try to do this yourself.

The second argument to datoms indicates the particular index you want to access. Each value is a permutation of the letters e (entity), a (attribute), v (value), and t (transaction). The order of the letters in an index indicates how it is indexed. For example, :eavt should be traversed by entity, then attribute, and so on and so forth. The four indexes and what they include are as follows:

:eavt
An entity-first index that includes all datoms. This index provides a view over your database very much like a traditional relational database.
:aevt
An attribute-then-entity index that includes all datoms. This index provides columnar access to your database, much like a data warehouse.
:avet
An attribute-value index that only includes attributes where :db/index is true. Incredibly useful as a lookup index (e.g., “I need the entity with an email of [email protected]”).
:vaet
A value-first index that only includes :db.type/ref values. This is a very interesting index that can be used to treat your data a bit like a graph database.

After specifying an index ordering, you can optionally provide any number of components to pre-traverse the index. This serves to reduce the number of elements returned. For example, specifying just an attribute component for AVET traversal will return any entity with that attribute. Specifying an attribute and a value component, on the other hand, will return only entities with that specific attribute and value pair.
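
For example, a sketch of traversing :avet with only an attribute component returns every datom for that attribute (here :user/email, which is in the AVET index because it is unique):

(->> (d/datoms (d/db conn) :avet :user/email)
     (map :v))
;; -> a seq of every stored email address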

What is returned by datoms is a stream of Datom objects. Each datom responds to :e, :a, :v, :tx, and :added as keyword functions.



[18] Mac users: visit http://postgresapp.com/ to download an easy-to-install DMG. Everyone else: you’ll find a guide for your operating system on the PostgreSQL wiki.

[22] We would normally suggest using lein-try, but the plug-in is currently incompatible with Clucy.
