Storing data in a database is not an uncommon task for developers—in this day and age, it’s practically a given. As with nearly every language under the sun, there is a bevy of drivers and clients to interact with databases from Clojure. What sets Clojure apart, however, is its ability to compose.
As we’ve said before in this book: in Clojure, data is king. You’ll find many of the database client libraries do a little legwork to connect you to the datastore, then promptly get out of your way. Such libraries don’t do so out of laziness (at least, we hope), but rather out of the principle of separation of concerns: I’ll handle connecting to the database; you handle the domain (your data). In fact, the best APIs are built out of data, providing only one or two functions and letting you manipulate queries and data to be inserted directly as Clojure data structures.
In this chapter, we’ll visit a wide number of databases and techniques, including the SQLs, full-text search, Mongo, Redis, and Datomic.
Datomic is one of the more interesting recent developments in the database landscape. Invented and maintained by Rich Hickey (who you will probably recognize as the same person who wrote Clojure itself), it is a scalable, transactional, value-oriented, time-aware database built around the same principles and philosophies as Clojure. If you like Clojure, you should definitely give Datomic a try, both as your application’s datastore and also as a learning tool to further explore functional, data-oriented programming.
Use the clojure.java.jdbc library for JDBC-based access to SQL databases.
To follow along with this recipe, you’ll need a running SQL database and an existing table to connect to. We suggest PostgreSQL.[18]
After you have PostgreSQL running (presumably on localhost:5432), run the following command to create a database for this recipe:
# On Mac:
$ /Applications/Postgres93.app/Contents/MacOS/bin/createdb cookbook_experiments
# Everyone else:
$ createdb cookbook_experiments
Before starting, add [org.clojure/java.jdbc "0.3.0"] and [java-jdbc/dsl "0.1.0"] to your project’s dependencies. You’ll also need a JDBC driver for the RDBMS of your choice. If you’re following along with this sample, use [org.postgresql/postgresql "9.2-1003-jdbc4"]. To start a REPL using lein-try, enter the following Leiningen command:

$ lein try org.clojure/java.jdbc "0.3.0" \
           java-jdbc/dsl "0.1.0" \
           org.postgresql/postgresql "9.2-1003-jdbc4"
To interact with a database using clojure.java.jdbc, all you need is a connection specification. This specification takes the form of a plain Clojure map with values indicating the database driver type, location, and authentication credentials:

(def db-spec {:classname "org.postgresql.Driver"
              :subprotocol "postgresql"
              :subname "//localhost:5432/cookbook_experiments"
              ;; Not needed for a non-secure local database...
              ;; :user "bilbo"
              ;; :password "secret"
              })
Create a relation in the specified database by passing the specification to clojure.java.jdbc/db-do-commands, along with a DDL statement generated by java-jdbc.ddl/create-table from a table name and any number of column specifications:
(require '[clojure.java.jdbc :as jdbc]
         '[java-jdbc.ddl :as ddl])

(jdbc/db-do-commands db-spec false
  (ddl/create-table :tags
    [:id :serial "PRIMARY KEY"]
    [:name :varchar "NOT NULL"]))
;; -> (0)
Many other functions that query and manipulate a database, such as clojure.java.jdbc/insert!, take a database specification directly as their first argument:

(require '[java-jdbc.sql :as sql])

(jdbc/insert! db-spec :tags
              {:name "Clojure"}
              {:name "Java"})
;; -> ({:name "Clojure", :id 1} {:name "Java", :id 2})

(jdbc/query db-spec
            (sql/select * :tags (sql/where {:name "Clojure"})))
;; -> ({:name "Clojure", :id 1})
The clojure.java.jdbc library provides functions that wrap the basic capabilities of the Java JDBC specification. The additional java-jdbc.sql and java-jdbc.ddl namespaces from the java-jdbc/dsl project implement small DSLs to generate basic SQL DML and DDL statements.

Because it relies upon Java JDBC, the clojure.java.jdbc library is usable with many of the most popular SQL databases, including Apache Derby, HSQLDB, Microsoft SQL Server, MySQL, PostgreSQL, and SQLite.
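Only the driver coordinates and connection details change from one database to the next. As a rough sketch (not from the original text; the exact :classname values depend on which JDBC driver artifacts you add to your project), specs for SQLite and MySQL might look like this:

;; A hypothetical SQLite spec (assumes the org.xerial/sqlite-jdbc driver)
(def sqlite-spec
  {:classname   "org.sqlite.JDBC"
   :subprotocol "sqlite"
   :subname     "cookbook_experiments.db"})

;; A hypothetical MySQL spec (assumes the mysql/mysql-connector-java driver)
(def mysql-spec
  {:classname   "com.mysql.jdbc.Driver"
   :subprotocol "mysql"
   :subname     "//localhost:3306/cookbook_experiments"
   :user        "bilbo"
   :password    "secret"})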
The parameters necessary to set up and access a data source are called the database specification (often abbreviated “db-spec”) and are provided in a simple Clojure map. The specification usually includes such parameters as the driver class name, the subprotocol for a particular RDBMS type, the hostname, the port number, the database name, and the username and password.
The clojure.java.jdbc library also permits several other forms of data source specification, including Java URIs, already-open connections, JNDI connections, and plain strings. For example, a complete URI string may be provided under the :connection-uri key:

;; As a spec string
(def db-spec
  "jdbc:postgresql://bilbo:secret@localhost:5432/cookbook_experiments")

;; As a connection URI map...
;; with a username and password...
(def db-spec
  {:connection-uri (str "jdbc:postgresql://localhost:5432/cookbook_experiments?"
                        "user=bilbo&password=secret")})

;; or without
(def db-spec
  {:connection-uri "jdbc:postgresql://localhost:5432/cookbook_experiments"})
Database records are represented as Clojure maps, with the table’s column names used as keys. Retrieval of a set of database records produces a sequence of maps that can then be processed with all the normal Clojure functions:
(jdbc/query db-spec (sql/select * :tags))
;; -> ({:name "Clojure", :id 1}
;;     {:name "Java", :id 2})

(filter #(not (.endsWith (:name %) "ure"))
        (jdbc/query db-spec (sql/select * :tags)))
;; -> ({:name "Java", :id 2})
There are other Clojure libraries for accessing relational databases, such as Korma, and each provides a different abstraction and DSL for manipulating SQL data and expressions. The clojure.java.jdbc library, however, covers a large portion of everyday database access needs.
c3p0 and clojure.java.jdbc.
Using clojure.java.jdbc to interact with an SQL database.
The clojure.java.jdbc GitHub repository, for more detailed information on the library.
The java-jdbc/dsl GitHub repository, for more information on the SQL query generation capabilities it provides.
Alternatively, investigate the HoneySQL, SQLingvo, or Korma libraries for SQL query generation. Korma is covered in Recipe 6.4, “Simplifying SQL with Korma”.
Use the BoneCP connection and statement pooling library to wrap your JDBC-based drivers, creating a pooled data source. The pooled data source is then usable by the clojure.java.jdbc library, as described in Recipe 6.1, “Connecting to an SQL Database”.
To follow along with this recipe, you’ll need a running SQL database and an existing table to connect to. We suggest PostgreSQL.[19]
After you have PostgreSQL running (presumably on localhost:5432), run the following command to create a database for this recipe:
# On Mac:
$ /Applications/Postgres93.app/Contents/MacOS/bin/createdb cookbook_experiments
# Everyone else:
$ createdb cookbook_experiments
Before starting, add the BoneCP dependency ([com.jolbox/bonecp "0.8.0.RELEASE"]), as well as the appropriate JDBC libraries for your RDBMS, to your project’s dependencies. You’ll also need a valid SLF4J logger. Alternatively, you can follow along in a REPL using lein-try:

$ lein try com.jolbox/bonecp "0.8.0.RELEASE" \
           org.clojure/java.jdbc "0.3.0" \
           java-jdbc/dsl "0.1.0" \
           org.postgresql/postgresql "9.2-1003-jdbc4" \
           org.slf4j/slf4j-nop  # Just do not log anything
First, create a database specification containing the parameters for accessing the database. This includes keys for the initial and maximum pool sizes, as well as the number of partitions:
(def db-spec {:classname "org.postgresql.Driver"
              :subprotocol "postgresql"
              :subname "//localhost:5432/cookbook_experiments"
              :init-pool-size 4
              :max-pool-size 20
              :partitions 2})

To create a pooled BoneCPDataSource object, define a function (for convenience) that uses the parameters in the database specification map:
(import 'com.jolbox.bonecp.BoneCPDataSource)

(defn pooled-datasource [db-spec]
  (let [{:keys [classname subprotocol subname user password
                init-pool-size max-pool-size idle-time partitions]} db-spec
        min-connections (inc (int (/ init-pool-size partitions)))
        max-connections (inc (int (/ max-pool-size partitions)))
        cpds (doto (BoneCPDataSource.)
               (.setDriverClass classname)
               (.setJdbcUrl (str "jdbc:" subprotocol ":" subname))
               (.setUsername user)
               (.setPassword password)
               (.setMinConnectionsPerPartition min-connections)
               (.setMaxConnectionsPerPartition max-connections)
               (.setPartitionCount partitions)
               (.setStatisticsEnabled true)
               (.setIdleMaxAgeInMinutes (or idle-time 60)))]
    {:datasource cpds}))
Use the convenience function to define a pooled data source for connecting to your database:
(def pooled-db-spec (pooled-datasource db-spec))

pooled-db-spec
;; -> {:datasource #<BoneCPDataSource ...>}

Pass the database specification as the first argument to any clojure.java.jdbc functions that query or manipulate your database:
(require '[clojure.java.jdbc :as jdbc]
         '[java-jdbc.ddl :as ddl]
         '[java-jdbc.sql :as sql])

(jdbc/db-do-commands pooled-db-spec false
  (ddl/create-table :blog_posts
    [:id :serial "PRIMARY KEY"]
    [:title "varchar(255)" "NOT NULL"]
    [:body :text]))
;; -> (0)

(jdbc/insert! pooled-db-spec :blog_posts
              {:title "My first post!"
               :body "This is going to be good!"})
;; -> ({:body "This is going to be good!", :title "My first post!", :id 1})

(jdbc/query pooled-db-spec
            (sql/select * :blog_posts (sql/where {:title "My first post!"})))
;; -> ({:body "This is going to be good!", :title "My first post!", :id 1})
As shown in the solution, the clojure.java.jdbc library can create database connections from JDBC data sources, which allows connections to be easily pooled by BoneCP or other pooling libraries.

The BoneCP library wraps existing JDBC classes to allow the creation of efficient data sources. It can adapt traditional unpooled drivers and data sources by augmenting them with transparent pooling of Connection and PreparedStatement instances.
While the library offers several ways to create data sources, most users will find the examples provided here to be the easiest.
BoneCP offers several dozen configuration parameters that control the operation of the data source and its connections. Luckily, most of these configuration parameters have built-in defaults. Parameters may be specified to control such facets as the min, max, and initial pool size; the number of idle connections; the age of connections; transaction handling; the use of PreparedStatement pooling; and if, when, and how pooled connections are tested.

Pooled data resources (threads and database connections) may be released by calling the close method on the library’s BoneCPDataSource class. Attempting to reuse the pooled data source after it is closed will result in an SQL exception:

(.close (:datasource pooled-db-spec))
;; -> nil
clojure.java.jdbc.
Using clojure.java.jdbc to interact with an SQL database.
The clojure.java.jdbc GitHub repository, for more detailed information on the library.
Use the clojure.java.jdbc library for JDBC-based access to SQL databases.
To follow along with this recipe, you’ll need a running SQL database and an existing table to connect to. We suggest PostgreSQL.[20]
After you have PostgreSQL running (presumably on localhost:5432), run the following command to create a database for this recipe:
# On Mac:
$ /Applications/Postgres93.app/Contents/MacOS/bin/createdb cookbook_experiments
# Everyone else:
$ createdb cookbook_experiments
Before starting, add [org.clojure/java.jdbc "0.3.0"] and [java-jdbc/dsl "0.1.0"] to your project’s dependencies. You’ll also need a JDBC driver for the RDBMS of your choice. If you’re following along with this sample, use [org.postgresql/postgresql "9.2-1003-jdbc4"]. To start a REPL using lein-try, enter the following Leiningen command:

$ lein try org.clojure/java.jdbc "0.3.0" \
           java-jdbc/dsl "0.1.0" \
           org.postgresql/postgresql "9.2-1003-jdbc4"
Then, define how the database should be accessed:
(def db-spec {:classname "org.postgresql.Driver"
              :subprotocol "postgresql"
              :subname "//localhost:5432/cookbook_experiments"})

To create a new table, use the java-jdbc.ddl/create-table function to generate the necessary DDL statement, and then pass the statement to the jdbc/db-do-commands function to execute it:

(require '[clojure.java.jdbc :as jdbc]
         '[java-jdbc.ddl :as ddl])

(jdbc/db-do-commands db-spec
  (ddl/create-table :fruit
    [:name "varchar(16)" "PRIMARY KEY"]
    [:appearance "varchar(32)"]
    [:cost :int "NOT NULL"]
    [:unit "varchar(16)"]
    [:grade :real]))
;; -> (0)
Insert complete records into a table using the clojure.java.jdbc/insert! function, invoking it with a vector of the column values for each row. Be sure to provide the column values in the order in which the columns were declared in the table:

(jdbc/insert! db-spec :fruit
              nil ; column names omitted
              ["Red Delicious" "dark red" 20 "bushel" 8.2]
              ["Plantain" "mild spotting" 48 "stalk" 7.4]
              ["Kiwifruit" "fresh" 35 "crate" 9.1]
              ["Plum" "ripe" 12 "carton" 8.4])
;; -> (1 1 1 1)

To query the database, generate the SQL for the query with the java-jdbc.sql/select function, then invoke clojure.java.jdbc/query with the result:

(require '[java-jdbc.sql :as sql])

(jdbc/query db-spec
            (sql/select * :fruit (sql/where {:appearance "ripe"})))
;; -> ({:grade 8.4, :unit "carton", :cost 12, :appearance "ripe", :name "Plum"})
If you no longer need a particular table, invoke clojure.java.jdbc/db-do-commands with the appropriate DDL statement generated by java-jdbc.ddl/drop-table:
(jdbc/db-do-commands db-spec
  (ddl/create-table :delete_me
    [:name "varchar(16)" "PRIMARY KEY"]))

(jdbc/db-do-commands db-spec
  (ddl/drop-table :delete_me))
;; -> (0)

The clojure.java.jdbc library provides functions that wrap the basic capabilities of the Java JDBC specification. The java-jdbc/dsl project’s java-jdbc.sql and java-jdbc.ddl namespaces implement small DSLs to generate basic SQL DML and DDL statements.

java-jdbc/dsl used to be a part of clojure.java.jdbc, but was removed to keep the API of the core library as small as possible.
The java-jdbc.ddl/create-table function generates the DDL needed to create a table. The arguments are a table name and a vector for each column specification. At the time of this writing, table-level specifications are not yet supported.
Records may be inserted into a table in a variety of ways. In addition to the vector method illustrated, the clojure.java.jdbc/insert! function can accept one or more maps with column names as keys:

(jdbc/insert! db-spec :fruit
              {:name "Banana" :appearance "spotting" :cost 35}
              {:name "Tomato" :appearance "rotten" :cost 10 :grade 1.4}
              {:name "Peach" :appearance "fresh" :cost 37 :unit "pallet"})
;; -> ({:grade nil, :unit nil, :cost 35, :appearance "spotting", :name "Banana"}
;;     {:grade 1.4, :unit nil, :cost 10, :appearance "rotten", :name "Tomato"}
;;     {:grade nil, :unit "pallet", :cost 37, :appearance "fresh", :name "Peach"})

If you want to insert rows but only specify some columns’ values, you can invoke clojure.java.jdbc/insert! with a vector of column names followed by one or more vectors containing values for those columns:

(jdbc/insert! db-spec :fruit
              [:name :cost]
              ["Mango" 84]
              ["Kumquat" 77])
;; -> (1 1)
To update existing records, invoke clojure.java.jdbc/update! with a map of column names to new values. The optional java-jdbc.sql/where clause controls which rows will be updated:

(jdbc/update! db-spec :fruit
              {:grade 7.0 :appearance "spotting" :cost 75}
              (sql/where {:name "Mango"}))
;; -> (1)

Database transactions are available to ensure that multiple operations are performed atomically (i.e., all or none). The clojure.java.jdbc/with-db-transaction macro creates a transaction-aware connection from the database specification. Use the transaction-aware connection for the duration of the transaction:

;; Insert two new fruits atomically
(jdbc/with-db-transaction [trans-conn db-spec]
  (jdbc/insert! trans-conn :fruit {:name "Fig" :cost 12})
  (jdbc/insert! trans-conn :fruit {:name "Date" :cost 14}))
;; -> ({:grade nil, :unit nil, :cost 14, :appearance nil, :name "Date"})
If an exception is thrown, the transaction is rolled back:
;; Query how many items the table has now
(defn fruit-count
  "Query how many items are in the fruit table."
  [db-spec]
  (let [result (jdbc/query db-spec (sql/select "count(*)" :fruit))]
    (:count (first result))))

(fruit-count db-spec)
;; -> 11

(jdbc/with-db-transaction [trans-conn db-spec]
  (jdbc/insert! trans-conn :fruit
                [:name :cost]
                ["Grape" 86]
                ["Pear" 86])
  ;; At this point the insert! call is complete, but the transaction
  ;; is not. An exception will cause the transaction to roll back,
  ;; leaving the database unchanged.
  (throw (Exception. "sql-test-exception")))
;; -> Exception sql-test-exception ...

;; The table still has the same number of items
(fruit-count db-spec)
;; -> 11
Transactions can be explicitly set to roll back with the clojure.java.jdbc/db-set-rollback-only! function. This setting can be unset with the clojure.java.jdbc/db-unset-rollback-only! function and tested with the clojure.java.jdbc/is-rollback-only function:

(fruit-count db-spec)
;; -> 11

(jdbc/with-db-transaction [trans-conn db-spec]
  (jdbc/db-set-rollback-only! trans-conn)
  (jdbc/insert! trans-conn :fruit {:name "Pear" :cost 69}))
;; -> ({:grade nil, :unit nil, :cost 69, :appearance nil, :name "Pear"})

;; The table still has the same number of items
(fruit-count db-spec)
;; -> 11
Database records are returned from queries as Clojure maps, with the table’s column names used as keys. Retrieval of a set of database records produces a sequence of maps that can then be processed with all the normal Clojure functions. Here, we query all the records in the fruit table, gathering the name and grade of any low-quality fruit:
(->> (jdbc/query db-spec (sql/select "name, grade" :fruit))
     ;; Filter all fruits by fruits with grade < 3.0
     (filter (fn [{:keys [grade]}] (and grade (< grade 3.0))))
     (map (juxt :name :grade)))
;; -> (["Tomato" 1.4])

The preceding example uses the SQL DSL provided by the java-jdbc.sql namespace. The DSL implements a simple abstraction over the generation of SQL statements. At present, it provides some basic mechanisms for selects, joins, where clauses, and order-by clauses:
(defn fresh-fruit []
  (jdbc/query db-spec
              (sql/select [:f.name] {:fruit :f}
                          (sql/where {:f.appearance "fresh"})
                          (sql/order-by :f.name))))

(fresh-fruit)
;; -> ({:name "Kiwifruit"} {:name "Peach"})

The use of the SQL DSL is entirely optional. For more direct control, a vector containing an SQL query string and arguments can be passed to the query function. The following function also finds low-quality fruit, but does so by passing a quality threshold value directly to the SQL statement:
(defn find-low-quality [acceptable]
  (jdbc/query db-spec
              ["select name, grade from fruit where grade < ?" acceptable]))

(find-low-quality 3.0)
;; -> ({:grade 1.4, :name "Tomato"})

The jdbc/query function has several optional keyword parameters that control how it constructs the returned result set. The :result-set-fn parameter specifies a function that is applied to the entire result set (a lazy sequence) before it is returned. The default argument is the doall function:
(defn hi-lo [rs] [(first rs) (last rs)])

;; Find the highest- and lowest-cost fruits
(jdbc/query db-spec
            ["select * from fruit order by cost desc"]
            :result-set-fn hi-lo)
;; -> [{:grade nil, :unit nil, :cost 77, :appearance nil, :name "Kumquat"}
;;     {:grade 1.4, :unit nil, :cost 10, :appearance "rotten", :name "Tomato"}]

The :row-fn parameter specifies a function that is applied to each result row as the result is constructed. The default argument is the identity function:
(defn add-tax [row] (assoc row :tax (* 0.08 (row :cost))))

(jdbc/query db-spec
            ["select name,cost from fruit where cost = 12"]
            :row-fn add-tax)
;; -> ({:tax 0.96, :cost 12, :name "Plum"} {:tax 0.96, :cost 12, :name "Fig"})

The Boolean :as-arrays? parameter indicates whether to return the results as a set of vectors. The default argument value is false:
(jdbc/query db-spec
            ["select name,cost,grade from fruit where appearance = 'spotting'"]
            :as-arrays? true)
;; -> ([:name :cost :grade] ["Banana" 35 nil] ["Mango" 75 7.0])

Finally, the :identifiers parameter takes a function that is applied to each column name in the result set. The default argument is the clojure.string/lower-case function, which lowercases the table’s column names before they are converted to keywords. If your application needs to perform some different conversion of column names, provide an alternate function using this keyword parameter.
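For example, here is a small sketch (reusing the fruit table from above) that upcases column names instead of lowercasing them, yielding keys such as :NAME and :COST:

(require '[clojure.string :as str])

(jdbc/query db-spec
            ["select name, cost from fruit where cost = 12"]
            :identifiers str/upper-case)
;; -> e.g. ({:NAME "Plum", :COST 12} {:NAME "Fig", :COST 12})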
The clojure.java.jdbc library is a good choice for quick and easy access to most popular relational databases. Its use of Clojure’s vectors and maps to represent records blends well with Clojure’s emphasis on data-oriented programming. Novice users of SQL can conveniently utilize the provided DSLs, while expert users can more directly construct and execute complex SQL statements.

clojure.java.jdbc.
The clojure.java.jdbc GitHub repository, for more detailed information on the library.
The java-jdbc/dsl GitHub repository, for more information on the SQL query generation capabilities it provides.
Alternatively, investigate the HoneySQL, SQLingvo, or Korma libraries for SQL query generation. Korma is covered in Recipe 6.4, “Simplifying SQL with Korma”.
Use Korma as a DSL for generating SQL queries and traversing relationships.
Before starting, add [korma "0.3.0-RC6"] and [org.postgresql/postgresql "9.2-1002-jdbc4"] to your project’s dependencies or start a REPL using lein-try:

$ lein try korma org.postgresql/postgresql
To follow along with this recipe, you’ll need a running SQL database and an existing table to connect to. We suggest PostgreSQL.[21]
After you have PostgreSQL running (presumably on localhost:5432), run the following command to create a database for this recipe:
# On Mac:
$ /Applications/Postgres.app/Contents/MacOS/bin/createdb learn_korma
# Everyone else:
$ createdb learn_korma
To connect to the learn_korma database, use defdb with the postgres helper. Because Korma is a rather large DSL, it is acceptable to :refer :all its contents into model namespaces. (The entity and query macros used below live in korma.core, so require it as well:)

(require '[korma.db :refer :all]
         '[korma.core :refer :all])

(defdb db (postgres {:db "learn_korma"}))
To interact with a table in your database, define and create what Korma calls entities. Here you’ll define an entity for blog posts:
(defentity posts
  (pk :id)
  (table :posts)                    ; Table name
  (entity-fields :title :content)) ; Default fields to SELECT

Normally you’d use a proper migration library for your schema, but for the sake of simplicity, we’ll create a table manually. Use the exec-raw function to execute raw SQL statements against the database. You should only do this where strictly necessary:

(def create-posts (str "CREATE TABLE posts "
                       "(id serial, title text, content text,"
                       "created_on timestamp default current_timestamp);"))

(exec-raw create-posts)
Now that the posts table exists, you can invoke insert against posts with a values clause to add records to the database. Each record is represented by a map. The names of the keys in the map must match the names of the columns in the database:

(insert posts
        (values {:title "First post" :content "blah blah blah"}))
To retrieve values from the database, query using select. Successful queries will return a sequence of maps, each containing keys representing the column names:

(select posts (limit 1))
;; -> [{:created_on #inst "2013-11-01T19:21:10.652920000-00:00",
;;      :content "blah blah blah",
;;      :title "First post",
;;      :id 1}]

To correct or change existing records, use the update macro. Invoke update against posts, providing a set-fields declaration to specify what should change and a where declaration narrowing which records to change:

(update posts
        (set-fields {:title "Best Post"})
        (where {:title "First post"}))
;; -> {:title "Best Post", :id 1 ...}

The delete macro works similarly to update, but doesn’t take a set-fields declaration:

(delete posts
        (where {:title "Best Post"}))

(select posts)
;; -> []
Korma provides a simple and intuitive way to construct SQL queries from Clojure. The advantage of using Korma is that the queries are written as regular code instead of SQL strings. You can easily compose queries and abstract common operations.
Korma exposes these abilities through its entity system. Entities are an abstraction over traditional SQL tables that mask the complexity of SQL’s crufty and complicated DDL (data definition language). Via the defentity macro, you have access to all of the power of traditional SQL, packaged in a readable, Clojure-based DSL.

When defining entities with defentity, you can pass in a number of options. Some common options include table to specify a table name, pk to specify the default ID field (primary key), entity-fields to specify the default fields for SELECT statements, or even db to specify which database the entity belongs in.
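As a sketch of several of these options together (the stats-db connection and the comments entity below are invented for illustration; the database helper binds an entity to a specific connection):

;; A hypothetical entity bound to a second, explicitly named connection
(defdb stats-db (postgres {:db "learn_korma_stats"}))

(defentity comments
  (pk :id)
  (table :comments)
  (database stats-db)              ; Use stats-db instead of the default db
  (entity-fields :author :body))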
Entities also simplify defining relations between tables. Entity declaration statements such as has-one, has-many, belongs-to, and many-to-many define relationships to other entities. Consider adding an author to each of our blog posts:
;; Create authors, assuming posts has an author_id
(defentity authors
  ;; By default, the foreign key would be :authors_id, but that is a little
  ;; awkward
  (has-many posts {:fk :author_id}))

;; Redefine posts such that it assumes it has an author_id
(defentity posts
  (belongs-to authors {:fk :author_id}))

;; Create the authors table
(exec-raw "CREATE TABLE authors (id serial, name text);")

;; Add the author_id column to posts
(exec-raw "ALTER TABLE posts ADD COLUMN author_id int;")

(def ryan (insert authors (values {:name "Ryan"})))

ryan
;; -> {:name "Ryan", :id 1}
(insert posts
        (values [{:title "My first post!"  :author_id (:id ryan)}
                 {:title "My second post." :author_id (:id ryan)}]))

(select posts
        (where {:author_id (:id ryan)}))
;; -> [{:author_id 1,
;; ...
;; :title "My first post!",
;; :id 4}
;; {:author_id 1,
;; ...
;; :title "My second post.",
;; :id 5}]
Stemming from its entity system, Korma provides DSL versions of common SQL statements such as select, update, insert, and delete. One of the most interesting query types is select, which supports nearly every SELECT statement option, including simplified table joins (via its relation helpers). Some notable helpers include aggregate, join, order, group, and having. Chances are, if it is an SQL statement feature, Korma has a helper for it. A couple of these helpers are sketched below.
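A rough sketch of two of these helpers (reusing the posts entity defined earlier; the exact results depend on the data in your tables):

;; Posts sorted by title
(select posts
        (order :title :ASC))

;; Count posts per author_id
(select posts
        (aggregate (count :id) :post-count :author_id))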
Korma’s DSL isn’t only convenient, it’s also composable. Using select* instead of select returns a query as a value instead of an evaluated result. You can pipeline query values through regular select helpers to build up or store partial queries. Finally, invoke select on a query value to execute it and receive its result:

(defn authors-posts
  "Retrieve all posts for a person with a given name"
  [name]
  (-> (select* posts)
      (with authors)
      (where {:authors.name name})))

;; Find the title of all posts by the author named "Ryan"
(-> (authors-posts "Ryan")
    (where (like :title "%second%"))
    (fields :title)
    select)
;; -> [{:title "My second post."}]
Another convenience Korma provides is default connections. You may have noticed in the examples that we never referred to the db we defined. When only a single connection is defined, it will be used by default and you don’t have to pass it explicitly. If you like, you can define multiple connections and wrap series of statements in a with-db call:

(with-db db
  (select (authors-posts "Ryan")))
You want to support flexible full-text search over an unstructured or semistructured dataset using Lucene. For example, you want to return all people in the United States that have “Clojure” anywhere in their job descriptions.
Use Clucy, a Clojure wrapper for Lucene. Clucy provides the tools to build and query indexes from within a Clojure process.
To follow along with this recipe, create a new project (lein new text-search), add [clucy "0.4.0"] to its dependencies, and start a REPL using lein repl.[22]
The following code creates and queries a simple in-memory index:
(require '[clucy.core :as clucy])

(def index (clucy/memory-index))
;; -> #'user/index

(clucy/add index
           {:name "Alice"
            :description "Clojure expert"
            :location "North Carolina, United States"}
           {:name "Bob"
            :description "Clojure novice"
            :location "Berlin, Germany"}
           {:name "Eve"
            :description "Eavesdropper"
            :location "Maryland, United States"})
;; -> nil
(clucy/search index "description:clojure AND location:\"united states\"" 10)
;; -> ({:name "Alice",
;;      :location "North Carolina, United States",
;;      :description "Clojure expert"})
Lucene is a Java library for information retrieval. To use Lucene, you generate documents and index them for later retrieval. Documents consist of fields and terms. In this example, the documents are quite small, but Lucene is capable of efficiently indexing large numbers of very large documents as well.
Clucy wraps Lucene in a convenient manner for use in Clojure and is capable of generating Lucene documents directly from simple Clojure maps, where keys map to fields and values map to textual data to be indexed.
clucy.core/search takes an index, a query string, and the number of results to return as parameters. Lucene is able to query efficiently in part because it is not necessary to return all matching documents, just the top n best matches.
Clucy does not work as well out of the box with nested values in your maps. Be sure to flatten out values into simple strings for proper indexing and retrieval.
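For example, a minimal sketch (the :skills vector here is hypothetical) of joining a nested value into a single string before adding it to the index:

(require '[clojure.string :as str])

;; A record with a nested collection
(def person {:name "Carol" :skills ["Clojure" "Lucene" "Datomic"]})

;; Flatten the nested value into one string so Lucene indexes its terms
(clucy/add index
           (update-in person [:skills] #(str/join " " %)))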
This example uses a memory-index, which stores the index in system memory. In most real applications, you’ll want to persist the index to disk, which allows it to grow larger than the available memory and allows you to restart your process without reindexing. Clucy lets you construct a Lucene disk index via the disk-index function:

(def index (clucy.core/disk-index "/tmp/index"))
As part of the process for generating documents, Lucene calls an analyzer on your strings to generate tokens for indexing. The default StandardAnalyzer is sufficient for most purposes and can be customized with a list of “stop words” to be ignored during token generation:

(import 'org.apache.lucene.analysis.standard.StandardAnalyzer)
;; -> org.apache.lucene.analysis.standard.StandardAnalyzer

(import 'org.apache.lucene.analysis.util.CharArraySet)
;; -> org.apache.lucene.analysis.util.CharArraySet

(def stop-words
  (doto (CharArraySet. clucy.core/*version* 3 true)
    (.add "do")
    (.add "not")
    (.add "index")))

(binding [clucy.core/*analyzer* (StandardAnalyzer. clucy.core/*version* stop-words)]
  ;; Invoke index add and search forms here, within the binding
  )
However, in other situations you may need to use a different analyzer or write your own. For example, the EnglishAnalyzer uses Porter stemming and other techniques better suited to taking pluralization and possessives into account:

(import 'org.apache.lucene.analysis.en.EnglishAnalyzer)
;; -> org.apache.lucene.analysis.en.EnglishAnalyzer

(binding [clucy.core/*analyzer* (EnglishAnalyzer. clucy.core/*version*)]
  ;; Invoke index add and search forms here, within the binding
  )

The basic search query syntax is field:term. By default, multiple clauses perform an OR search, so an explicit AND is required if both clauses must be true.
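For instance, a small sketch reusing the index built above (result ordering depends on Lucene’s scoring): because clauses default to OR, the following query matches both Bob and Eve, even though no single document contains both locations:

(clucy.core/search index "location:germany location:maryland" 10)
;; -> Bob's and Eve's records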
If no field is specified, there is an implicit field _content that indexes all map values. Documents returned are ordered by Lucene’s default relevance algorithm, which takes into account term frequency, distance, and document length:

(clucy.core/search index "clojure united states" 10)
;; -> ({:name "Alice",
;;      :location "North Carolina, United States",
;;      :description "Clojure expert"}
;;     {:name "Eve",
;;      :location "Maryland, United States",
;;      :description "Eavesdropper"}
;;     {:name "Bob",
;;      :location "Berlin, Germany",
;;      :description "Clojure novice"})
You want to index data using the ElasticSearch indexing and search engine.
Use Elastisch, a minimalistic Clojure wrapper around the ElasticSearch Java APIs.
In order to successfully work through the examples in this recipe, you should have ElasticSearch installed and running on your local system. You can find details on how to install it on the ElasticSearch website.
ElasticSearch supports multiple transports (e.g., HTTP, native Netty-based transport, and Memcached). Elastisch supports HTTP and native transports. This recipe will use an HTTP transport client for the examples and explain how to switch to the native transport in the discussion section.
To follow along with this recipe, add [clojurewerkz/elastisch "1.2.0"] to your project’s dependencies, or start a REPL using lein-try:

$ lein try clojurewerkz/elastisch

Before you can index and search with Elastisch, it is necessary to tell Elastisch what ElasticSearch node to use. To use the HTTP transport, use the clojurewerkz.elastisch.rest/connect! function, which takes an endpoint as its sole argument:

(require '[clojurewerkz.elastisch.rest :as esr])

(esr/connect! "http://127.0.0.1:9200")
Before data can be searched over, it needs to be indexed. Indexing is the process of scanning the text and building a list of search terms and data structures called a search index. Search indexes allow search engines such as ElasticSearch to efficiently retrieve relevant documents for a query.
The process of indexing involves a few steps: creating an index, optionally defining mappings for it, and submitting documents for indexing.
To create an index, use the clojurewerkz.elastisch.rest.index/create function:

(require '[clojurewerkz.elastisch.rest.index :as esi])

(esr/connect! "http://127.0.0.1:9200")

;; Create an index with the given settings and no custom mapping types
(esi/create "test1")

;; Create an index with custom settings
(esi/create "test2" :settings {"number_of_shards" 1})
A full explanation of the available indexing settings is outside the scope of this recipe. Please refer to the Elastisch documentation on indexing for full details.
Mappings define the fields in a document and what the indexing characteristics are for each field. Mapping types are specified when an index is created using the :mappings option:

(esr/connect! "http://127.0.0.1:9200")

;; Mapping types map structure is the same as in the ElasticSearch API reference
(def mapping-types
  {"person"
   {:properties {:username   {:type "string" :store "yes"}
                 :first-name {:type "string" :store "yes"}
                 :last-name  {:type "string"}
                 :age        {:type "integer"}
                 :title      {:type "string" :analyzer "snowball"}
                 :planet     {:type "string"}
                 :biography  {:type "string"
                              :analyzer "snowball"
                              :term_vector "with_positions_offsets"}}}})

(esi/create "test3" :mappings mapping-types)

To add a document to an index, use the clojurewerkz.elastisch.rest.document/create function. This will cause a document ID to be generated automatically:
(require '[clojurewerkz.elastisch.rest.document :as esd])

(esr/connect! "http://127.0.0.1:9200")

(def mapping-types
  {"person"
   {:properties {:username   {:type "string" :store "yes"}
                 :first-name {:type "string" :store "yes"}
                 :last-name  {:type "string"}
                 :age        {:type "integer"}
                 :title      {:type "string" :analyzer "snowball"}
                 :planet     {:type "string"}
                 :biography  {:type "string"
                              :analyzer "snowball"
                              :term_vector "with_positions_offsets"}}}})

(esi/create "test4" :mappings mapping-types)

(def doc {:username "happyjoe"
          :first-name "Joe"
          :last-name "Smith"
          :age 30
          :title "The Boss"
          :planet "Earth"
          :biography "N/A"})

(esd/create "test4" "person" doc)
;; => {:ok true, :_index people, :_type person,
;;     :_id "2vr8sP-LTRWhSKOxyWOi_Q", :_version 1}
clojurewerkz.elastisch.rest.document/put will add a document to the index but expects a document ID to be provided:

(esd/put "test4" "person" "happyjoe" doc)
Whenever a document is added to the ElasticSearch index, it is first analyzed. Analysis is a multistage process (tokenization, filtering of the resulting tokens, and so on).
How exactly a document was analyzed defines what search queries will match (find) it. ElasticSearch is based on Apache Lucene and offers several analyzers developers can use to achieve the kind of search quality and performance they need. For example, different languages require different analyzers: English, Mandarin Chinese, Arabic, and Russian cannot be analyzed the same way.
It is possible to skip performing analysis for fields and specify whether field values are stored in the index or not. Fields that are not stored still can be searched over but will not be included into search results.
ElasticSearch allows users to define exactly how different kinds of documents are indexed, analyzed, and stored.
ElasticSearch has excellent support for multitenancy: an ElasticSearch cluster can have a virtually unlimited number of indexes and mapping types. For example, you can use a separate index per user account or organization in a SaaS (Software as a Service) product.
There are two ways to index a document with ElasticSearch: you can submit a document for indexing without an ID or update a document with a provided ID, in which case if the document already exists, it will be updated (a new version will be created).
While it is fine and common to use automatically created indexes early in development, manually creating indexes lets you configure a lot about how ElasticSearch will index your data and, in turn, what kinds of queries it will be possible to execute against it.
How your data is indexed is primarily controlled by mappings. They define which fields in documents are indexed, if/how they are analyzed, and if they are stored. Each index in ElasticSearch may have one or more mapping types. Mapping types can be thought of as tables in a database (although this analogy does not always stand). Mapping types are the heart of indexing in ElasticSearch and provide access to a lot of ElasticSearch functionality.
For example, a blogging application may have types such as article, comment, and person. Each has distinct mapping settings that define a set of fields documents of the type have, how they are supposed to be indexed (and, in turn, what kinds of queries will be possible over them), what language each field is in, and so on. Getting mapping types right for your application is the key to a good search experience. It also takes time and experimentation.
Mapping types define document fields and their core types (e.g., string, integer, or date/time). Settings are provided to ElasticSearch as a JSON document, and this is how they are documented on the ElasticSearch site.
With Elastisch, mapping settings are specified as Clojure maps with the same structure (schema). A very minimalistic example:
{"tweet" {:properties {:username {:type "string" :index "not_analyzed"}}}}
Among other things, mapping settings let you define which fields documents have, whether and how those fields are analyzed and stored, and the behavior of special fields ("_all", the default field, etc.).
When an index is created using the clojurewerkz.elastisch.rest.index/create function, mapping settings are passed with the :mappings option, as seen previously.

When it is necessary to update the mapping for an index, you can use the clojurewerkz.elastisch.rest.index/update-mapping function:

(esi/update-mapping "myapp_development" "person"
                    :mapping {:properties
                              {:first-name {:type "string" :store "no"}}})

In a mapping configuration, settings are passed as maps where keys are names (strings or keywords) and values are maps of the actual settings. In this example, the only setting is :properties, which defines a single field, a string that is not analyzed:

{"tweet" {:properties {:username {:type "string" :index "not_analyzed"}}}}
There is much more to the indexing and mapping options, but that’s outside the scope of a single recipe. See the Elastisch indexing documentation for an exhaustive list of the capabilities provided.
Use the Cassaforte library to connect to a Cassandra cluster and work with the records in the database.
In order to successfully work through the examples in this recipe, you should have Cassandra installed. You can find details on how to install Cassandra on the GettingStarted page of the wiki.
To follow along with this recipe, add [clojurewerkz/cassaforte "1.1.0"] to your project’s dependencies, or start a REPL using lein-try:

$ lein try clojurewerkz/cassaforte

In order to connect to your Cassandra cluster and create and use your first keyspace, you will need the clojurewerkz.cassaforte.client, clojurewerkz.cassaforte.cql, and clojurewerkz.cassaforte.query namespaces. clojurewerkz.cassaforte.client is responsible for the connection; the other two provide an easy interface to execute queries:

(require '[clojurewerkz.cassaforte.client :as client]
         '[clojurewerkz.cassaforte.cql    :as cql]
         '[clojurewerkz.cassaforte.query  :as q])

;; Connect to 2 nodes in a cluster
(client/connect! ["localhost" "another.node.local"])

;; Create a keyspace named `cassaforte_keyspace`, using
;; the Simple Replication Strategy and a replication factor of 2
(cql/create-keyspace "cassaforte_keyspace"
                     (q/with {:replication
                              {:class "SimpleStrategy"
                               :replication_factor 2}}))

;; Switch to the keyspace
(cql/use-keyspace "cassaforte_keyspace")
Now, you can create tables and start inserting data into them. For that, invoke the create-table and insert functions of the clojurewerkz.cassaforte.cql namespace:

(cql/create-table "users"
                  (q/column-definitions {:name :varchar
                                         :city :varchar
                                         :age  :int
                                         :primary-key [:name]}))
Now, insert several users into the table:
(cql/insert "users" {:name "Alex" :city "Munich" :age (int 26)})
(cql/insert "users" {:name "Robert" :city "Brussels" :age (int 30)})

You can access these records using a select query. For example, if you want to retrieve all the users from the table or use limit in your query, you can run:

;; Will retrieve all users
(cql/select "users")

;; Will retrieve top 10 users
(cql/select "users" (q/limit 10))

Alternatively, if you want to retrieve information about a single person by a given name, you can add a where clause to the query:

(cql/select "users" (q/where :name "Alex"))
Cassandra is an open source implementation of many of the ideas in Amazon’s landmark Dynamo Paper. It’s a key/value datastore, and it’s not aware of any relationships between tables and data points. Cassandra is a distributed datastore and is designed to be highly available. For that, it replicates data within the cluster. The data is stored redundantly on multiple nodes. If one node fails, data is still available for retrieval from a different node or multiple nodes.
Cassandra starts making sense when your data is rather big. Because it was built for distribution, you can scale your reads and writes, and fine-tune and manage your database’s consistency and availability. Cassandra handles network partitions well, so even if several of your nodes are unavailable for some time, you will still be able to read and write data until the network partition heals. If your dataset is rather small, you don’t expect it to grow significantly anytime soon, and you need to run many ad hoc queries against the dataset, then Cassandra may not make sense.
Consistency and availability are tunable values. You can get better availability by sacrificing data consistency: due to network partitions, not all the nodes will hold the latest snapshot of data at all times, but you’ll be still able to respond to writes and receive reads. If you choose to have strong consistency, conversely, the latency will increase, since more nodes should respond successfully for reads and writes. Eventual consistency guarantees that, if no conflicting writes are made for the data point, eventually all nodes will hold the latest value.
Like most datastores, Cassandra has concepts of separate databases (keyspaces in Cassandra terminology). Every keyspace holds tables (sometimes called column families). Tables hold rows, and rows consist of columns. Each column has a key (column name), value, write timestamp, and time to live.
Cassandra uses two different communication protocols: an older binary protocol called Thrift, and CQL (Cassandra Query Language). All query operators in Cassaforte generate CQL code under the hood. Here are a couple of examples of how these operations translate to CQL internally:
(cql/select "users" (q/where :name "Alex"))
;; SELECT * FROM users WHERE name='Alex';

(cql/insert "users" {:name "Alex" :city "Munich" :age (int 26)})
;; INSERT INTO users (name, city, age) VALUES ('Alex', 'Munich', 26);

There’s much more to Cassandra than just creating tables and inserting values. If you want to update records in your database, you can call the update function:

(cql/update "users"
            {:city "Berlin"}
            (q/where :name "Alex"))
Deleting records from the database is just as easy:
;; Will delete just one user
(cql/delete :users (q/where :name "Alex"))

;; Will delete all users whose names match within IN clause
(cql/delete :users (q/where :name [:in ["Alex" "Robert"]]))

If you’d like to execute arbitrary CQL statements outside of Cassaforte’s macro-based DSL, you can pass a string to the client/execute function:

(client/execute
  "INSERT INTO users (name, city, age) VALUES ('Alex', 'Munich', 19);")
For each issued write, you can specify an optional time to live to expire the data after a certain period of time. This is useful for caching and for data that you only want to hold for a certain period of time (like user sessions). For example, if you want the record to live for just 60 seconds, you can run:
(cql/insert "users" {:name "Alex" :city "Munich" :age (int 26)}
            (q/using :ttl 60))

Another concept that people like about Cassandra is distributed counters. Counter columns provide an efficient way to count or sum anything you need. This is achieved by using atomic increment/decrement operations on values. In order to create a table with a counter from Cassaforte, you can use the :counter column type:

(cql/create-table :scores
                  (q/column-definitions {:username :varchar
                                         :score :counter
                                         :primary-key [:username]}))

You can increment and decrement counters by using the increment-by and decrement-by queries:
(cql/update :scores
            {:score (q/increment-by 50)}
            (q/where :username "Alex"))

(cql/update :scores
            {:score (q/decrement-by 5)}
            (q/where :username "Robert"))
Use Monger to connect to MongoDB and search or manipulate the data. Monger is a Clojure wrapper around the Java MongoDB driver.
Before using Mongo from your Clojure code, you must have a running instance of MongoDB to connect to. See MongoDB’s installation guide for instructions on how to install MongoDB on your local system.
When you’re ready to write a Clojure MongoDB client, start a REPL using lein-try:

$ lein try com.novemberain/monger

To connect to MongoDB, use the monger.core/connect! function. This will store your connection in the *mongodb-connection* dynamic var. If you want to get a connection to use without storing it in a dynamic var, you can use monger.core/connect with the same options:

(require '[monger.core :as mongo])

;; Connect to localhost
(mongo/connect! {:host "127.0.0.1" :port 27017})

;; Disconnect when you are done
(mongo/disconnect!)
Once you are connected, you can insert and query documents easily:
(require '[monger.core :as mongo]
         '[monger.collection :as coll])
(import '[org.bson.types ObjectId])

;; Set the database in the *mongodb-database* var
(mongo/use-db! "mongo-time")

;; Insert one document
(coll/insert "users" {:name "Jeremiah Forthright" :state "TX"})

;; Insert a batch of documents
(coll/insert-batch "users" [{:name "Pete Killibrew" :state "KY"}
                            {:name "Wendy Perkins" :state "OK"}
                            {:name "Steel Whitaker" :state "OK"}
                            {:name "Sarah LaRue" :state "WY"}])

;; Find all documents and return a com.mongodb.DBCursor
(coll/find "users")

;; Find all documents matching a query and return a DBCursor
(coll/find "users" {:state "OK"})

;; Find documents and return them as Clojure maps
(coll/find-maps "users" {:state "OK"})
;; -> ({:_id #<ObjectId 520...>, :state "OK", :name "Wendy Perkins"}
;;     {:_id #<ObjectId 520...>, :state "OK", :name "Steel Whitaker"})

;; Find one document and return a com.mongodb.DBObject
(coll/find-one "users" {:name "Pete Killibrew"})

;; Find one document and return it as a Clojure map
(coll/find-one-as-map "users" {:name "Sarah LaRue"})
;; -> {:_id #<ObjectId 520...>, :state "WY", :name "Sarah LaRue"}
MongoDB, especially with Monger, can be a natural choice for storing Clojure data. It stores data as BSON (binary JSON), which maps well to Clojure’s own vectors and maps.
There are several ways to connect to Mongo, depending on how much you need to customize your connection and whether you have a map of options or a URI:
;; Connect to localhost, port 27017 by default
(mongo/connect!)

;; Connect to another machine
(mongo/connect! {:host "192.168.1.100" :port 27017})

;; Connect using more complex options
(let [options (mongo/mongo-options :auto-connect-retry true
                                   :connect-timeout 15
                                   :socket-timeout 15)
      server  (mongo/server-address "192.168.1.100" 27017)]
  (mongo/connect! server options))

;; Connect via a URI
(mongo/connect-via-uri! (System/getenv "MONGOHQ_URL"))
When inserting data, giving each document an _id
is optional. One will be created for you if you do not have one in your document. It often makes sense to add it yourself, however, if you need to reference the document afterward:
(require '[monger.collection :as coll])
(import '[org.bson.types ObjectId])

(let [id   (ObjectId.)
      user {:name "Lola Morales"}]
  (coll/insert "users" (assoc user :_id id))

  ;; Later, look up your user by id
  (coll/find-map-by-id "users" id))
;; -> {:_id #<ObjectId 521...>, :name "Lola Morales"}

In its idiomatic usage, Monger is set up to work with one connection and one database, as monger.core/connect! and monger.core/use-db! set dynamic vars to hold their information.

It is easy to work around this, though. You can use binding to set these vars explicitly around code. In addition, you can use the monger.multi.collection namespace instead of monger.collection. All functions in the monger.multi.collection namespace take a database as their first argument:
(require '[monger.core :as mongo]
         '[monger.multi.collection :as multi])

(mongo/connect!)

;; use-db! takes a string for the database, as it is a convenience function,
;; but for monger.multi.collection and other functions, we need to use
;; get-db to get the database
(let [stats-server (mongo/connect "stats.example.org")
      app-db       (mongo/get-db "mongo-time")
      geo-db       (mongo/get-db "geography")]

  ;; Record data in our stats server
  (binding [mongo/*mongodb-connection* stats-server]
    (multi/insert (mongo/get-db "stats") "access"
                  {:ip "127.0.0.1" :time (java.util.Date.)}))

  ;; Find users in our application DB
  (multi/find-maps app-db "users" {:state "WY"})

  ;; Insert a square in our geography DB
  (multi/insert geo-db "shapes"
                {:name "square" :sides 4
                 :parallel true :equal true}))
The basic find functions in monger.collection will work for simple queries, but you will soon find yourself needing to make more complex queries, which is where monger.query comes in. It provides a domain-specific language for MongoDB queries:

(require '[monger.query :as q])

;; Find users, skipping the first two and getting the next three.
(q/with-collection "users"
  (q/find {})
  (q/skip 2)
  (q/limit 3))

;; Get all the users from Oklahoma, sorted by name.
;; You must use array-map with sort so you can keep keys in order.
(q/with-collection "users"
  (q/find {:state "OK"})
  (q/sort (array-map :name 1)))

;; Get all users not from Oklahoma or with names that start with "S".
(q/with-collection "users"
  (q/find {"$or" [{:state {"$ne" "OK"}}
                  {:name #"^S"}]}))
You want to work with data in Redis.
Use Carmine to connect to and interact with Redis.
To use this recipe, you should first install Redis and have it running locally. You can find details on how to install Redis at the official Redis download page. If you are on Windows, you will want to look at the Microsoft Open Tech GitHub Redis project.
To follow along with this recipe, add [com.taoensso/carmine "2.2.0"] to your project’s dependencies, or start a REPL using lein-try:

$ lein try com.taoensso/carmine
To use Carmine, you must first define a connection spec:
(def server-connection {:pool {:max-active 8}
                        :spec {:host     "localhost"
                               :port     6379
                               ;; :password ""
                               :timeout  4000}})
Carmine supports all of the Redis commands, and the names (for the most part) match the Redis documentation.
Use the wcar function and the server-connection connection specification to send all the Redis commands you already know and love:

(require '[taoensso.carmine :as car :refer (wcar)])

(wcar server-connection (car/set "Nick" "Nack"))
;; -> "OK"

(wcar server-connection (car/get "Nick"))
;; -> "Nack"

(wcar server-connection (car/hset "founder" "name" "Tim"))
;; -> 0

(wcar server-connection (car/hset "founder" "age" 59))
;; -> 0

(wcar server-connection (car/hgetall "founder"))
;; -> [name Tim age 59]
Passing in multiple commands will pipeline them and return the results together as a vector:
(wcar server-connection
      (car/set "paddywhacks" 0)
      (car/incr "paddywhacks")
      (car/get "paddywhacks"))
;; -> ["OK" 1 "1"]
Redis describes itself as a data structure server. With data structures similar to the core data structures in Clojure, they make a natural pairing for a wide range of problems. Redis’s speed and key/value storage make it especially useful for caching and memoization applications (more on that later).
You can remove some boilerplate by wrapping the call to wcar in a macro that passes the connection specification for you:

(defmacro wcar* [& body]
  `(car/wcar server-connection ~@body))

(wcar* (car/set "Nick" "Nack"))
;; -> "OK"

(wcar* (car/get "Nick"))
;; -> "Nack"
Serialization is handled automatically and for most cases just works. Simply pass in the data you want to store, and Carmine will automatically serialize/deserialize it for you:
(wcar* (car/set "some-key" {:event "An Event" :timestamp (new java.util.Date)})
       (car/get "some-key"))
;; -> [OK {:event An Event, :timestamp #inst "2013-08-18T21:31:33.993-00:00"}]
This works great as long as you stick to core Clojure data types. However, if you need to support storing custom data types, you will need to deal with the underlying serialization library, called Nippy. For more information, see the Nippy GitHub project.
Redis is great to use as a memoization storage backend. Obviously, there are some serious trade-offs to consider when weighing it against an in-memory solution such as the core.cache library. But for the right situation, it can be an incredible boost. Consider, for example, memoizing a function that hits an external web service to fetch the current weather. With minimal effort, multiple servers can share the latest data and even have stale data automatically expire and refresh. The following is an example for just such a situation:

(defn redis-memoize
  "Convert a function to one that is memoized using Redis as storage."
  [key-prefix ttl-seconds connection-spec f]
  (fn [& args]
    (let [key-name [key-prefix args]]
      (if-let [found-result (wcar connection-spec (car/get key-name))]
        found-result
        (let [new-result (apply f args)]
          (wcar connection-spec
                (car/set key-name new-result)
                (car/expire key-name ttl-seconds))
          new-result)))))
This makes a couple of assumptions worth noting. First, it assumes that the arguments of the function being memoized are supported by Nippy (see the earlier serialization example). Second, it assumes that the memoized data should be expired after a specified number of seconds. To use redis-memoize, simply pass in a function. The following is a highly contrived example that uses the server-connection defined previously:

(defn square [x]
  (printf "Ran square for: %s " x)
  (* x x))

(def redis-squared
  (redis-memoize "squared" 10 server-connection square))

(redis-squared 99)
;; -> Ran square for: 99
;; -> 9801

(redis-squared 99)
;; -> 9801
In addition to the features showcased earlier, Carmine includes (among other things) a message queue, distributed locks, a Ring session store, and even an implementation of DynamoDB (which is in alpha at the time of writing). These features are outside the scope of this recipe, but they’re well documented and straightforward to use. Consult the Carmine GitHub project for more information.
Before starting, add [com.datomic/datomic-free "0.8.4218"]
to your project’s
dependencies or start a REPL using lein-try
:
$ lein try com.datomic/datomic-free
To create and connect to an in-memory database, use
datomic.api/create-database
and datomic.api/connect
:
(require '[datomic.api :as d])

(def uri "datomic:mem://sample-database")

(d/create-database uri)
;; -> true

(def conn (d/connect uri))

conn
;; -> #<LocalConnection datomic.peer.LocalConnection@49384d99>
Once you have a connection, you can use it to get a database value
with datomic.api/db
. This value is used to query a database:
(def db (d/db (d/connect uri)))

db
;; -> datomic.db.Db@7b7fea26
You can also use the connection to transact data using
datomic.api/transact
:
;; Transact the schema for your Next Big Thing
(def my-great-schema []) ; This vector intentionally left blank

(d/transact (d/connect uri) my-great-schema)
You’ll notice in the solution that we not only connected to a database,
but we created it too. This pattern is common when using in-memory
databases, as no in-memory databases exist in a fresh JVM. It is not
strictly necessary to call create-database
if the database already
exists, but it is safe to do so—create-database
is idempotent and
will return false
if one already exists. When connecting to a database that isn’t in memory, it is necessary
for the relevant transactor and storage service to be running.
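For example, calling create-database a second time for the same URI simply reports that the database is already there, rather than raising an error:

(d/create-database uri)
;; -> false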
The return value of d/connect
is used when querying a database value
or when transacting data. It is also used when reading the transaction
log, when consuming the transaction report queue, or when performing
administrative tasks such as requesting an indexing job, garbage
collecting storage, and disposing of resources associated with the
connection.
Connections are thread-safe and are cached by URI internally, so there is no need to pool connections yourself. There is no performance overhead for creating many connections to the same URI.
Datomic transactor processes have a limit on the number of concurrently connected peer processes. Datomic Free has a limit of two peers per transactor. For nondistributed applications, this may well be sufficient. If you’re building a larger service, then you may need a Datomic Pro license for more peers.
There are several options for storage services that back Datomic. Three are built-in, and the rest use external services. Datomic Free
includes access to the in-memory and :free
storage backends. Datomic
Pro and Pro Starter Edition include access to all services.
The built-in storage options are:
"datomic:mem://[db-name]"
"datomic:free://host[:port]/[db-name]"
"datomic:dev://host[:port]/[db-name]"
Free and Dev can also be configured to use alternate ports for
storage: "datomic:free://host[:port]/[db-name]?h2-port=[port]&h2-web-port=[port]"
.
By default, these ports will be one and two more than the transactor port, respectively.
Several external storage options also exist. These include:
"datomic:ddb://[aws-region]/[dynamodb-table]/[db-name]?aws_access_key_id=[XXX]&aws_secret_key=[YYY]"
"datomic:riak://host[:port]/bucket/dbname[?interface=http|protobuf]"
(default is protobuf)
"datomic:couchbase://host/bucket/dbname[?password=xxx]"
"datomic:inf://[cluster-member-host:port]/[db-name]"
"datomic:sql://[db-name][?jdbc-url]"
For SQL storage services, the map format can be used instead of the
string format. This is useful when specifying objects that can’t be
embedded in URI strings, like DataSource objects. The format for the SQL map
is:
{:protocol :sql                         ;; keyword or string
 :db-name "[db-name]"                   ;; keyword or string
 :data-source aDataSourceObject
 ;; OR
 :factory aCallableReturningConnection}
You need to define how your data will be modeled in Datomic. For example, you need to model users and their user groups, relating the two in some way.
Datomic schemas are defined in terms of attributes. It’s probably easiest to jump straight to an example.
To follow along with this recipe, complete the steps in the solution in Recipe 6.10, “Connecting to a Datomic Database”. After doing this, you
should have an in-memory database and connection, conn
, to work with.
Consider the attributes a user might have: an email address (unique to each user), a name, and any number of roles, here guest, author, and editor.
To define this schema, create a vector with attribute maps for email, name, and roles, as well as insertions of the three static roles:
(def user-schema
  [{:db/doc "User email address"
    :db/ident :user/email
    :db/valueType :db.type/string
    :db/cardinality :db.cardinality/one
    :db/unique :db.unique/identity
    :db/id #db/id [:db.part/db]
    :db.install/_attribute :db.part/db}

   {:db/doc "User name"
    :db/ident :user/name
    :db/valueType :db.type/string
    :db/cardinality :db.cardinality/one
    :db/index true
    :db/id #db/id [:db.part/db]
    :db.install/_attribute :db.part/db}

   {:db/doc "User roles"
    :db/ident :user/roles
    :db/valueType :db.type/ref
    :db/cardinality :db.cardinality/many
    :db/id #db/id [:db.part/db]
    :db.install/_attribute :db.part/db}

   [:db/add #db/id [:db.part/user] :db/ident :user.roles/guest]
   [:db/add #db/id [:db.part/user] :db/ident :user.roles/author]
   [:db/add #db/id [:db.part/user] :db/ident :user.roles/editor]])
We define a group as having: a UUID (unique to each group), a name, and any number of member users.
Define the group as follows:
(def group-schema
  [{:db/doc "Group UUID"
    :db/ident :group/uuid
    :db/valueType :db.type/uuid
    :db/cardinality :db.cardinality/one
    :db/unique :db.unique/value
    :db/id #db/id [:db.part/db]
    :db.install/_attribute :db.part/db}

   {:db/doc "Group name"
    :db/ident :group/name
    :db/valueType :db.type/string
    :db/cardinality :db.cardinality/one
    :db/index true
    :db/id #db/id [:db.part/db]
    :db.install/_attribute :db.part/db}

   {:db/doc "Group users"
    :db/ident :group/users
    :db/valueType :db.type/ref
    :db/cardinality :db.cardinality/many
    :db/id #db/id [:db.part/db]
    :db.install/_attribute :db.part/db}])
Finally, transact
both schema definitions into a database via a
connection:
(require '[datomic.api :as d])

@(d/transact (d/connect "datomic:mem://sample-database")
             (concat user-schema group-schema))
;; -> {:db-before datomic.db.Db@25b48c7b,
;;     :db-after datomic.db.Db@5d81650c,
;;     :tx-data [#Datum{:e ... :a ... :v ... :tx ... :added true}, ...],
;;     :tempids {-... ..., ...}}
A Datomic schema is represented as Clojure data and is added to the
database in a transaction, just like any other data we would store.
The :db.install/_attribute :db.part/db
key/value pair is used by the
transactor to make the schema available to the rest of the system.
The schema is placed in the :db.part/db
database partition, a partition
reserved for schemas. All user data is placed in user partition(s)—either the default of :db.part/user
or a custom partition. Partitions influence how indexes sort data, which
can be used to optimize queries. Schema entities require that at least :db/ident
, :db/valueType
, and
:db/cardinality
values are present.
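Installing a custom partition follows the same transact-the-schema-as-data pattern. As a sketch, a hypothetical :db.part/accounts partition could be installed like so:

@(d/transact (d/connect "datomic:mem://sample-database")
             [{:db/id #db/id [:db.part/db]
               :db/ident :db.part/accounts
               :db.install/_partition :db.part/db}])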
Aside from the schema, Datomic does not enforce how attributes are combined for any given entity. Datomic only requires that a schema be defined up front, enforcing type and uniqueness constraints at runtime.
Use namespaces in schema :db/ident
values to help classify entities
(such as user
in :user/email
). Datomic doesn’t do anything
specific with namespaces, so using them is optional. There are several options for :db/valueType
, listed in Table 6-1.
Table 6-1. Datomic value types:
:db.type/keyword    :db.type/string     :db.type/boolean
:db.type/long       :db.type/bigint     :db.type/float
:db.type/double     :db.type/bigdec     :db.type/ref
:db.type/instant    :db.type/uuid       :db.type/uri
:db.type/bytes
See the Datomic schema documentation for an exhaustive listing of their semantics.
Attributes with :db/valueType :db.type/ref
can only have other
entities as their value(s). You use this type to model relationships
between entities. Datomic does not enforce which entities are related
to on a given :db/valueType :db.type/ref
attribute. Any other entity
can be related to—this means that entities can relate to themselves!
You also use :db/valueType :db.type/ref
and lone :db/ident
values
to model enumerations, such as the user roles that you defined. These
enumerations are not actually schemas; they are normal entities with a
single attribute, :db/ident
. An entity’s :db/ident
value serves as
a shorthand for that entity; you may use this value in lieu of the
entity’s :db/id
value in transactions and queries.
Attributes with :db/valueType :db.type/ref
and :db/unique
values
are implicitly indexed as though you had added :db/index true
to
their definitions.
It is also possible to use Lucene full-text indexing on string
attributes, using :db/fulltext true
and the system-defined
fulltext
function in Datalog.
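For instance, had :user/name been declared with :db/fulltext true, you could search it with the fulltext expression. The following is only a sketch, since the schema above does not enable full-text indexing:

(d/q '[:find ?entity ?name
       :in $ ?search
       :where [(fulltext $ :user/name ?search) [[?entity ?name]]]]
     (d/db conn)
     "Fowler")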
There are two options for specifying a uniqueness constraint at
:db/unique
:
:db.unique/value, which causes a transaction to fail if it would assert a duplicate value for that attribute on a different entity
:db.unique/identity, which instead treats a duplicate value as an upsert, merging the new datoms onto the existing entity
In the case where you are modeling entities with subentities that
only exist in the context of those entities, such as order items on an
order or variants for a product, you can use :db/isComponent
to
simplify working with such subentities. It can only be used on
attributes of type :db.type/ref
.
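For example, a hypothetical order could treat its line items as components by declaring the attribute with :db/isComponent true, following the same attribute-map conventions used in the schemas above:

{:db/doc "Order line items"
 :db/ident :order/items
 :db/valueType :db.type/ref
 :db/isComponent true
 :db/cardinality :db.cardinality/many
 :db/id #db/id [:db.part/db]
 :db.install/_attribute :db.part/db}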
When you use the :db.fn/retractEntity
function in a transaction, any
entities on the value side of such attributes for the retracted entity
will also be retracted. Also, when you use d/touch
to realize all
the lazy keys in an entity map, component entities will be
realized too. Both the retraction and realization behaviors are
recursive.
By default, Datomic stores all past values of attributes. If you do
not wish to keep past values for a particular attribute, use
:db/noHistory true
to have Datomic discard previous values. Using
this attribute is much like using a traditional update-in-place
database.
Use a Datomic connection to transact data.
To follow along with this recipe, complete the steps in the solutions to Recipe 6.10, “Connecting to a Datomic Database”, and Recipe 6.11, “Defining a Schema for a Datomic Database”.
After doing this, you will have a
connection, conn
, and a schema installed against which you can
insert data:
(require '[datomic.api :as d :refer [q db]])

(def tx-data [{:db/id (d/tempid :db.part/user)
               :user/email "[email protected]"
               :user/name "Martin Fowler"
               :user/roles [:user.roles/author :user.roles/editor]}])

@(d/transact conn tx-data)
(q '[:find ?name :where [?e :user/name ?name]] (db conn))
;; -> #{["Martin Fowler"]}
This map-based syntax for representing the data expands to a series of
:db/add
statements. This transaction is identical to the previous
one:
(def new-id (d/tempid :db.part/user))

new-id
;; -> #db/id[:db.part/user -1000013]
(def tx-data2
  [[:db/add new-id :user/email "[email protected]"]
   [:db/add new-id :user/name "Ryan Neufeld"]
   [:db/add new-id :user/roles [:user.roles/author :user.roles/editor]]])
(def tx-result @(d/transact conn tx-data2)) ;; Keep this for later...
(q '[:find ?name :where [?e :user/name ?name]] (db conn))
;; -> #{["Ryan Neufeld"] ["Martin Fowler"]}
Of course, you can use statements like these yourself, or you can use the map syntax shown in the solution. You can also mix the two. This is
how you would transact multiple entries (e.g., (d/transact conn
[person1-map person2-map])
).
One difference you’ll note between the map and the expanded form is
the lack of a :db/add
statement for the :db/id
key. In the
expanded form, this value comes immediately after the action
(:db/add
) and must be identical between all statements to
correlate attributes to a single entity. When specifying an entity as
a map, you provide a single ID, which the transactor transparently
affixes to each attribute.
What is an appropriate ID? Any new entities are assigned
temporary, negative ID values, which can be used to model
relationships within the transaction. Upon successfully completing a
transaction, all the temporary IDs are assigned in-storage positive ID
values. When working with code, the correct approach is to use the
datomic.api/tempid
function to obtain a temporary ID. The
datomic.api/tempid
function takes a partition keyword and an optional ID number as its
arguments; for most purposes, :db.part/user
will suffice.
When working with nonexecutable data, you’ll need to use the
data-literal form for temporary IDs. The literal #db/id
[:db.part/user]
is equivalent to (d/tempid :db.part/user)
. This
form is most useful when you store transaction data in an .edn
file, which is most often the case with schema definitions. Again, you
should use d/tempid
in your code—the #db/id
literal will
evaluate once at compile time, which means that any code that
expects the ID value to change from one execution to the next will
fail, because it’ll only ever have one value.
Consider our example file, user-bootstrap.edn:
[{:db/id #db/id [:db.part/user]
  :user/email "[email protected]"
  :user/name "Martin Fowler"
  :user/roles [:user.roles/author :user.roles/editor]}]
When a transaction completes, you’ll receive a completed future. If you
prefer to transact asynchronously, you can use d/transact-async
instead, which will return its future immediately. In this case, as
with all futures, when you dereference it, it will block until the
transaction completes. Either way, dereferencing the future returns a
map, with four keys:
:db-before, the database value before the transaction was applied
:db-after, the database value with the transaction applied
:tx-data, the collection of datoms produced by the transaction
:tempids, a mapping from the temporary IDs used in the transaction to their final in-storage IDs
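As a quick sketch of the asynchronous variant (reusing the tx-data from above, which safely upserts thanks to the unique email attribute), only the moment you block changes:

(def async-result (d/transact-async conn tx-data))
;; ...do other work while the transaction is in flight...
@async-result ; dereferencing blocks until the transaction completes
;; -> a map with the same four keys described above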
You can use the :db-after
database to query the database directly
after the transaction:
(def db-after-tx (:db-after tx-result))

(q '[:find ?name
     :in $ ?email
     :where
     [?entity :user/email ?email]
     [?entity :user/name ?name]]
   db-after-tx
   "[email protected]")
;; -> #{["Martin Fowler"]}
You can use the :tempids
map to find the in-storage IDs for any new
entities you care about, much like you would when retrieving the last
insert ID in SQL databases. Invoke datomic.api/resolve-tempid
with
the :db-after
value, the :tempids
value, and the original temporary ID
to retrieve the realized ID:
(d/resolve-tempid db-after-tx (:tempids tx-result) new-id)
;; -> 17592186045421
To remove a value for an attribute, you should use the :db/retract
operation in transactions.
To follow along with this recipe, complete the steps in
the solutions to Recipe 6.10, “Connecting to a Datomic Database”, and
Recipe 6.11, “Defining a Schema for a Datomic Database”. After doing this, you will have a
connection, conn
, and a schema installed against which you can
insert data.
To start things off, add a user, Barney Rubble, and verify that he has an email address:
(def new-id (d/tempid :db.part/user))

(def tx-result
  @(d/transact conn [{:db/id new-id
                      :user/name "Barney Rubble"
                      :user/email "[email protected]"}]))
(def after-tx-db (:db-after tx-result))

(def barney-id
  (d/resolve-tempid after-tx-db (:tempids tx-result) new-id))

barney-id
;; -> 17592186045429
(d/q '[:find ?email
       :in $ ?entity-id
       :where [?entity-id :user/email ?email]]
     after-tx-db
     barney-id)
;; -> #{["[email protected]"]}
To retract Barney’s email address, submit a transaction using the
:db/retract
operation:
(def retract-tx-result
  @(d/transact conn [[:db/retract barney-id
                      :user/email "[email protected]"]]))
(def after-retract-db (:db-after retract-tx-result))

(d/q '[:find ?email
       :in $ ?entity-id
       :where [?entity-id :user/email ?email]]
     after-retract-db
     barney-id)
;; -> #{}
To retract entire entities, use the :db.fn/retractEntity
built-in transactor function:
(def retract-entity-tx-result
  @(d/transact conn [[:db.fn/retractEntity barney-id]]))

(def after-retract-entity-db (:db-after retract-entity-tx-result))

(d/q '[:find ?entity-id
       :in $ ?name
       :where [?entity-id :user/name ?name]]
     after-retract-entity-db
     "Barney Rubble")
;; -> #{}
When using :db/retract
, you provide the value to retract so that in
the case of cardinality-many attributes, it’s clear which value to
retract from the set of values for that attribute. Regardless of the
cardinality, if you provide a value that isn’t in storage, nothing
will be retracted. This means that you have to know what value you
want to retract; you can’t simply retract everything for an attribute
by only providing the entity ID and the attribute.
If you retract values for an attribute that does not use
:db/noHistory
, you will be able to query past database values to
find past values for the attribute.
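For example, a sketch using the Barney data from this recipe: querying the history view of the database (obtained with d/history) surfaces both the original assertion and the retraction of the email value:

(d/q '[:find ?email ?added
       :in $ ?entity-id
       :where [?entity-id :user/email ?email _ ?added]]
     (d/history (d/db conn))
     barney-id)
;; -> #{["[email protected]" true] ["[email protected]" false]}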
If you retract values for an attribute that uses :db/noHistory
, that
data will be permanently deleted.
When using :db.fn/retractEntity
, all attribute values for all the
attributes on that entity will be retracted, as will all :db/ref
attributes that have the entity as a value. Any component entities of
the entity being retracted will themselves be recursively
retracted.
You’ll find that the actual entity ID itself is not retracted, but that it will have no attributes associated with it. This is because once an entity is created, it cannot be retracted. Removing all the attributes and references to the entity has the same effect as if it had been permanently removed, though!
If you need to permanently remove data due to legal concerns or because the data in question falls outside of your domain-specified retention period, use excision to remove the data permanently.
Build your transaction as usual, but instead of calling d/transact
or d/transact-async
, use d/with
to produce an in-memory database
that includes the changes your transaction provides.
To follow along with this recipe, complete the steps in the solutions to Recipe 6.10, “Connecting to a Datomic Database”, and
Recipe 6.11, “Defining a Schema for a Datomic Database”. After doing this, you will have a
connection, conn
, and a schema installed against which you can
insert data.
First, add some data to the database about Fred Flintstone. As of about 4000 BCE, Fred didn’t have an email, but we at least know his name:
(require '[datomic.api :as d])

(def new-id (d/tempid :db.part/user))

(def tx-result
  @(d/transact conn [{:db/id new-id
                      :user/name "Fred Flintstone"}]))
Fast-forward to today: Fred is thawed, after having been frozen in ice for 6,000 years, and he gets his first email address. Prepare a transaction to add an email to the Fred entity:
;; Grab Fred's ID from the original transaction
(def fred-id
  (d/resolve-tempid (:db-after tx-result) (:tempids tx-result) new-id))

fred-id
;; -> 17592186045421

(def add-freds-email-tx
  [[:db/add fred-id :user/email "[email protected]"]])
Now, prepare an in-memory database with this new transaction applied.
First, get the current database value to use as a basis, then create
an in-memory database. Finally, grab the :db-after
value so that
you can test that the email was properly added:
(defn db-with
  "Return a new database with tx applied"
  [db tx]
  (-> (d/with db tx)
      :db-after))

(def db-after
  (db-with (d/db conn) add-freds-email-tx))
Compare the value of Fred’s email in the current database with that of Fred’s email in the in-memory database:
(defn users-email
  "Retrieve a user's email given the user's name."
  [db name]
  (-> (d/q '[:find ?email
             :in $ ?name
             :where
             [?entity :user/name ?name]
             [?entity :user/email ?email]]
           db
           name)
      ffirst))

(users-email db-after "Fred Flintstone")
;; -> "[email protected]"

(users-email (d/db conn) "Fred Flintstone")
;; -> nil
As you can see, the current database remains unaffected by this
transaction, but the database at db-after
now displays the new
value.
Databases produced by d/with
can be used with any of the other API
functions that accept a database, including d/with
itself. This
means that you can layer multiple transactions on top of one another
without first having to commit them to the transactor!
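As a sketch of such layering (the revised name here is purely hypothetical), apply a second transaction on top of db-after without ever involving the transactor:

(def db-after-2
  (db-with db-after
           [[:db/add fred-id :user/name "Fred F. Flintstone"]]))

(users-email db-after-2 "Fred F. Flintstone")
;; -> "[email protected]"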
One of the things that makes Datomic so powerful is its ability to treat a database as a value. For this reason, the helper functions we’ve written take a database as an argument, not a connection. Now it is not only possible to query the current database, but other values of the database as well.
Use the datomic.api/datoms
function to directly access the core
Datomic indexes in your database.
To follow along with this recipe, complete the steps in
the solutions to Recipe 6.10, “Connecting to a Datomic Database”, and
Recipe 6.11, “Defining a Schema for a Datomic Database”. After doing this, you will have a
connection, conn
, and a schema installed against which you can
insert data.
For example, to quickly find the entities that have the provided attribute and
value set, invoke datomic.api/datoms
, specifying the :avet
index
(attribute, value, entity, transaction) and the desired attribute and
value:
(require '[datomic.api :as d])

(d/transact conn [{:db/id (d/tempid :db.part/user)
                   :user/name "Barney Rubble"
                   :user/email "[email protected]"}])

(defn entities-with-attr-val
  "Return entities with a given attribute and value."
  [db attr val]
  (->> (d/datoms db :avet attr val)
       (map :e)
       (map (partial d/entity db))))

(def barney
  (first (entities-with-attr-val (d/db conn)
                                 :user/email
                                 "[email protected]")))

(:user/email barney)
;; -> "[email protected]"
This will only work for attributes where :db/index
is true
or
:db/unique
is not nil
.
To quickly determine all of the attributes an entity has, use the
:eavt
-ordered index:
(defn entities-attrs
  "Return attrs of an entity"
  [db entity]
  (->> (d/datoms db :eavt (:db/id entity))
       (map :a)
       (map (partial d/entity db))
       (map :db/ident)))

(entities-attrs (d/db conn) barney)
;; -> (:user/email :user/name)
To quickly find entities that refer, via :db.type/ref
, to a provided
entity, use the :vaet
-ordered index:
;; Add a person that refers to a :user.roles/author role
(d/transact conn [{:db/id (d/tempid :db.part/user)
                   :user/name "Ryan Neufeld"
                   :user/email "[email protected]"
                   :user/roles [:user.roles/author :user.roles/editor]}])

(defn referring-to
  "Find all entities referring to an entity as a certain attribute."
  [db entity]
  (->> (d/datoms db :vaet (:db/id entity))
       (map :e)
       (map (partial d/entity db))))
(def author-entity
  (d/entity (d/db conn) :user.roles/author))

;; The names of all users with a :user.roles/author role
(map :user/name (referring-to (d/db conn) author-entity))
;; -> ("Ryan Neufeld")
For simple lookup queries, like “find by attribute” or “find by
value”, nothing beats Datomic’s raw indexes in terms of performance.
The datomic.api/datoms
interface provides access to all of Datomic’s
indexes and conveniently lets you dive in any number of levels,
“biting off” only the data you need.
As with most Datomic functions, datoms
takes a db
as its first
argument. You’ll note that in our examples, and elsewhere in the book, we
too accept a database as a value, and not a connection—this idiom
allows API users to perform varying numbers of operations on the same
database value. You should always try to do this yourself.
The second argument to datoms
indicates the particular index you want to access.
Each value is a permutation of the letters e
(entity), a
(attribute), v
(value),
and t
(transaction). The order of the letters in an index indicates how it
is indexed. For example, :eavt
should be traversed by entity, then
attribute, and so on and so forth. The four indexes and what they
include are as follows:
:eavt, which contains all datoms, sorted first by entity. This is the index to use when you want everything about a given entity.
:aevt, which also contains all datoms, sorted first by attribute.
:avet, which contains only datoms whose attributes are :db/unique or have :db/index true. Incredibly useful as a lookup index
(e.g., "I need the entity with an email of [email protected]").
:vaet, which contains only datoms with :db.type/ref values. This is a very interesting index that can be used to treat your data
a bit like a graph database.
After specifying an index ordering, you can optionally provide any number of components to pre-traverse the index. This serves to reduce the number of elements returned. For example, specifying just an attribute component for AVET traversal will return any entity with that attribute. Specifying an attribute and a value component, on the other hand, will return only entities with that specific attribute and value pair.
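For example, here is a sketch of attribute-only traversal: passing just :user/email returns every datom bearing that attribute, whatever its value (result ordering may vary):

(->> (d/datoms (d/db conn) :avet :user/email)
     (map :v))
;; -> ("[email protected]" "[email protected]")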
What is returned by datoms
is a stream of Datum
objects. Each datum responds to :a
, :e
, :tx
, :v
, and :added
as functions.
[18] Mac users: visit http://postgresapp.com/ to download an easy-to-install DMG. Everyone else: you’ll find a guide for your operating system on the PostgreSQL wiki.
[22] We would normally suggest using
lein-try
, but the plug-in is currently incompatible with Clucy.