Chapter 4. Data in Clojure

How to Represent and Manipulate Data

Clojure is a dynamically typed language, which means that you never need to explicitly define the data type of symbols, functions, or arguments in your programs. However, all values still have a type. Strings are strings, numbers are numbers, lists are lists, etc. If you try to perform an unsupported operation on a type, it will cause an error at runtime. It is the programmer's responsibility to write code in such a way that this does not happen. This should be very natural to those with a dynamic language background, while it will no doubt take some getting used to for those who have only used static languages in the past.

Clojure types are at once very simple and fairly complicated. Clojure itself has only a handful of different types, and because Clojure is not object-oriented, it does not natively support the creation of new user-defined types. Generally, this keeps things very simple. However, Clojure does run on the Java Virtual Machine, so internally every Clojure type is also represented by a Java class or interface. If you are interfacing with a Java library, you might have to pay attention to Java classes and types. Fortunately, that is typically the only time you need to worry about Java types in Clojure.

Table 4-1. Clojure's Built-in Types

Type	Literal Representation	Example	Underlying Java Class/Interface
Number	The number itself	`16`	`java.lang.Number`
String	Enclose in double quotes	`"Hello!"`	`java.lang.String`
Boolean	`true` or `false`	`true`	`java.lang.Boolean`
Character	Prefix with a backslash	`\a`	`java.lang.Character`
Keyword	Prefix with a colon	`:key`	`clojure.lang.Keyword`
List	Parentheses	`'(1 2 3)`	`java.util.List`
Vector	Square brackets	`[1 2 3]`	`java.util.List`
Map	Curly braces	`{:key val :key val}`	`java.util.Map`
Set	Curly braces prefixed by pound sign	`#{1 2 3}`	`java.util.Set`

Nil

The reserved symbol nil has a special meaning within a Clojure program: it means "nothing" or "no value." nil always evaluates to false when used in boolean expressions and is equal to nothing but itself. It may be used in place of any data type, including primitives. However, passing nil to most functions or operations will cause an error, since it is not a true value of any type. If it is at all possible that a value might be nil, you should always account for that possibility as a special case in your code to avoid performing an operation on it and seeing a java.lang.NullPointerException error.

nil is identical to null in Java.
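
The behavior described above can be sketched at the REPL as follows. This is only an illustration; the safe-length helper is a hypothetical name introduced here, not part of Clojure.

```clojure
;; nil is logically false, but it is not the same value as false.
(if nil "truthy" "falsey")   ; -> "falsey"
(nil? nil)                   ; -> true
(= nil false)                ; -> false

;; Hypothetical helper: check for nil before calling a Java method,
;; because (.length nil) would throw a NullPointerException.
(defn safe-length [s]
  (if (nil? s)
    0
    (.length s)))

(safe-length "hello")        ; -> 5
(safe-length nil)            ; -> 0
```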

Primitive Types

Clojure provides a number of primitive types representing basic programming language constructs such as numbers, strings, and Boolean values.

Numbers

Clojure has very good support for numbers and numerical operations. Numeric literals can be represented in a variety of ways:

As integers or floating-point decimals in standard notation, just type the number. For example, 42 or 3.14159.
Clojure also supports entering literals directly as ratios using the / symbol. For example, 5/8 or 3/4. Ratios entered as literals will automatically be reduced. If you enter 4/2, it will be stored simply as 2.
You can enter integer literals in any base by writing the base, then the letter r, then the value. For example, 2r10 is 2 in binary, 16rFF is 255 in hexadecimal, and 36r0Z is 35 in base 36. All bases between 2 and 36 are supported.
Clojure also supports traditional Java hexadecimal and octal notation. Prefix a number with 0x to signal a hexadecimal representation: for example, 0xFF is also 255. Numbers which begin with a leading zero are assumed to be in octal notation.
There are actually two ways of representing a decimal number in any computer: as a floating point and as an exact decimal value. Clojure, like Java, defaults to floating point representation, but does support exact values as well, internally using Java's java.math.BigDecimal class. To specify that a literal value be internally represented in exact form, append an M to the number. For example, 1.25M. Unlike floating points, these numbers will not be rounded in operations. This makes them most appropriate for representing currencies.
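
As a rough illustration of the literal forms listed above, each comparison in the first group below should evaluate to true, and the last two lines contrast floating-point and exact decimal arithmetic.

```clojure
;; Each comparison below evaluates to true.
(= 2r10 2)            ; binary literal
(= 16rFF 0xFF 255)    ; two ways of writing hexadecimal FF
(= 36r0Z 35)          ; base-36 literal
(= 010 8)             ; a leading zero means octal
(= 4/2 2)             ; ratio literals are reduced when read
(= 6/8 3/4)

;; Floating-point arithmetic accumulates rounding error;
;; exact decimal (M-suffixed) arithmetic does not.
(+ 0.1 0.2)           ; -> 0.30000000000000004
(+ 0.1M 0.2M)         ; -> 0.3M
```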

Warning

Because Clojure uses Java's convention that integer literals with a leading zero are parsed as numbers in base-8 (octal) notation, it will result in an error if you try to enter a literal such as 09 since it is not valid octal. Leading zeros, although mathematically insignificant, are important to indicate the way numbers are parsed.

In operations that involve different types of numbers, Clojure automatically converts the result to the most precise type involved. For example, when multiplying an integer and a floating-point number, the result will be a floating-point number. Dividing integers returns a ratio (or an integer when the division is exact), unless one of the terms is a floating-point number, in which case the result is converted to floating point.
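
A few REPL evaluations may make these conversion rules more concrete. This is a small sketch, not an exhaustive description of the numeric tower.

```clojure
(* 2 3.5)        ; -> 7.0   integer times floating point gives floating point
(/ 7 2)          ; -> 7/2   integer division gives an exact ratio
(/ 8 2)          ; -> 4     exact integer division gives an integer
(/ 7 2.0)        ; -> 3.5   a floating-point term makes the result floating point
(+ 1/2 0.25)     ; -> 0.75  the same applies to ratios mixed with floating point
(class (/ 7 2))  ; -> clojure.lang.Ratio
```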

There is no maximum size for numbers. Clojure uses different internal representations for numbers as they get bigger and has no problem handling numbers of any size. However, be aware that in high-performance applications, you may notice a slowdown when operating on numbers larger than can be stored in the Java long datatype, i.e., numbers larger than 9,223,372,036,854,775,807. This requires a different internal representation that is not as efficient for high-speed mathematical operations, even though it is more than sufficient for most tasks. Note that in Clojure 1.3 and later, the standard arithmetic functions throw an ArithmeticException on long overflow instead of promoting automatically; use the promoting variants (+', -', *', inc', dec') or a BigInt literal with an N suffix when values may exceed this range.

Common Numeric Functions

These functions are provided for mathematical operations on numbers.

Note

For simplicity, Clojure in its API makes no real distinction between functions and what would usually be thought of as operators in other languages. But don't worry: when the expressions are evaluated and compiled, they are replaced with optimized Java bytecode using primitive operators whenever possible. There isn't any speed lost by treating math operators as functions for simplicity.

Addition (+)

The addition function (+) takes any number of numeric arguments and returns their sum.

(+ 2 2)
-> 4
(+ 1 2 3)
-> 6

Subtraction (-)

The subtraction function (-) takes any number of numeric arguments. When given a single argument, it returns its negation. When given multiple arguments, it returns the result of subtracting all subsequent arguments from the first.

(- 5)
-> -5
(- 5 1)
-> 4
(- 5 2 1)
-> 2

Multiplication (*)

The multiplication function (*) takes any number of numeric arguments and returns their product.

(* 5 5)
-> 25
(* 5 5 2)
-> 50

Division (/)

The division function (/) takes any number of numeric arguments. The first argument is the numerator, and any additional arguments are denominators. If no denominators are supplied, the function returns 1/numerator; otherwise, it returns the numerator divided by all of the denominators.

(/ 10)
-> 1/10
(/ 1.0 10)
-> 0.1
(/ 10 2)
-> 5
(/ 10 2 2)
-> 5/2

inc

The increment function (inc) takes a single numeric argument and returns its value + 1.

(inc 5)
-> 6

dec

The decrement function (dec) takes a single numeric argument and returns its value - 1.

(dec 5)
-> 4

quot

The quotient function (quot) takes two numeric arguments and returns the integer quotient obtained by dividing the first by the second.

(quot 5 2)
-> 2

rem

The remainder function (rem) takes two numeric arguments and returns the remainder obtained by dividing the first by the second. The result takes the sign of the first argument, which is where rem differs from a true modulus.

(rem 5 2)
-> 1
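
The relationship between quot and rem is easiest to see with a negative operand. The sketch below also calls mod, a related core function not otherwise covered in this section, purely for contrast.

```clojure
(quot 7 2)    ; -> 3
(rem 7 2)     ; -> 1
(quot -7 2)   ; -> -3   quot truncates toward zero
(rem -7 2)    ; -> -1   rem takes the sign of the first argument
(mod -7 2)    ; -> 1    mod takes the sign of the second argument

;; quot and rem always recombine into the original number.
(= -7 (+ (* 2 (quot -7 2)) (rem -7 2)))   ; -> true
```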

min

The minimum function (min) takes any number of numeric arguments and returns the smallest.

(min 5 10 2)
-> 2

max

The maximum function (max) takes any number of numeric arguments and returns the largest.

(max 5 10 2)
-> 10

Equals Function (==)

The equals function (==) takes any number of numeric arguments and returns true if they are equal, else false.

(== 5 5.0)
-> true

Less-Than Function (<)

The less-than function (<) takes any number of numeric arguments and returns true if they are in ascending order, else false.

(< 5 10)
-> true
(< 5 10 9)
-> false

Less-Than-or-Equals Function (<=)

The less-than-or-equals function (<=) takes any number of numeric arguments and returns true if they are in ascending order or sequentially equal, else false.

(<= 5 5 10)
-> true

Greater-Than Function (>)

The greater-than function (>) takes any number of numeric arguments and returns true if they are in descending order, else false.

(> 10 5)
-> true

Greater-Than-or-Equals Function (>=)

The greater-than-or-equals function (>=) takes any number of numeric arguments and returns true if they are in descending order or sequentially equal, else false.

(>= 10 5 5)
-> true

zero?

The zero test function (zero?) takes a single numeric argument and returns true if it is zero, else false.

(zero? 0.0)
-> true

pos?

The positive test function (pos?) takes a single numeric argument and returns true if it is > 0, else false.

(pos? 5)
-> true

neg?

The negative test function (neg?) takes a single numeric argument and returns true if it is < 0, else false.

(neg? -5)
-> true

number?

The number test function (number?) takes a single argument and returns true if it is a number, else false.

(number? 5)
-> true
(number? "hello")
-> false

Strings

Clojure strings are identical to Java strings, and are instances of the same java.lang.String class. They are entered as literals by enclosing them in double quotes. If you need a double-quote character within the string, you can escape it using the backslash character, \. For example, the following is a valid string:

"Most programmers write a \"Hello World\" program when they learn a new language"

To enter a backslash character in a String, simply use two backslashes.

Common String Functions

Clojure provides some very limited string functions for convenience. For more advanced string operations, you can either use the Java string API directly (see the chapter on Java Interoperability), or the wide variety of string utility functions defined in the str-utils namespace of the clojure.contrib user library. In Clojure 1.2 and later, the built-in clojure.string namespace covers much of the same ground.

str

The string concatenation function (str) takes any number of arguments. It converts them to strings if they are not already and returns the string created by concatenating them. If passed no arguments or nil, it returns the empty string, "".

(str "I have " 5 " books.")
-> "I have 5 books."

subs

The substring function (subs) takes two or three arguments, the first always being a string, the second an integer offset, and the third (optional) another integer offset. It returns the substring from the first offset (inclusive) to the second (exclusive) or to the end of the string if a second offset is not supplied.

(subs "Hello World" 6)
-> "World"
(subs "Hello World" 0 5)
-> "Hello"

string?

The string test function (string?) takes a single argument and returns true if it is a string, else false.

(string? "test")
-> true
(string? 5)
-> false

print & println

The string printing functions (print & println) take any number of arguments, convert them to strings if they are not already, and print them to the standard system output. println appends a newline character to the end. Both return nil.
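
A brief sketch of both functions follows. It uses with-out-str, a core macro that captures printed output as a string, only to make the printed text visible as a value.

```clojure
(println "I have" 5 "books.")
;; prints: I have 5 books.   (arguments are separated by spaces)
;; -> nil

;; print leaves the output as is; println adds a line terminator.
(with-out-str (print "a" "b"))     ; -> "a b"
(with-out-str (println "a" "b"))   ; "a b" followed by a line terminator
```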

Regular Expression Functions

Clojure includes several functions for dealing with regular expressions, which wrap the Java regex implementation.

re-pattern

This function (re-pattern) takes a single string argument and returns a regular expression pattern (an instance of java.util.regex.Pattern). The pattern can then be used for subsequent regular expression matches.

(re-pattern "[a-zA-Z]*")
-> #"[a-zA-Z]*"

There is also a reader macro that allows you to enter a regex pattern as a literal: just use the # symbol before a string. The resulting value is a pattern, just as if you used the re-pattern function. For example, the following form is identical to the preceding example:

#"[a-zA-Z]*"
-> #"[a-zA-Z]*"

re-matches

re-matches takes two arguments: a regular expression pattern and a string. It returns the match if the entire string matches the pattern, or nil if it does not. For example:

(re-matches #"[a-zA-Z]*" "test")
-> "test"
(re-matches #"[a-zA-Z]*" "test123")
-> nil

re-matcher

re-matcher takes two arguments: a regular expression pattern and a string. It returns a stateful "matcher" object, which can be supplied to most other regex functions instead of a pattern directly. Matchers are instances of java.util.regex.Matcher.

(def my-matcher (re-matcher #"[a-zA-Z]*" "test"))
-> #'user/my-matcher

re-find

re-find takes either a pattern and a string or a single matcher. Each call returns the next regex match for the matcher, if any.

(re-find my-matcher)
-> "test"
(re-find my-matcher)
-> ""
(re-find my-matcher)
-> nil

re-groups

re-groups takes a single matcher, and returns the groups from the most recent find/match. If there are no nested groups, it returns a string of the entire match. If there are nested groups, it returns a vector of groups, with the first element being the entire (non-nested) match.
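
The difference between grouped and ungrouped patterns can be sketched as follows. The patterns, input strings, and var names are hypothetical and chosen only for illustration.

```clojure
;; A pattern with two capturing groups.
(def phone-matcher (re-matcher #"(\d{3})-(\d{4})" "call 555-1234 now"))

(re-find phone-matcher)
;; -> ["555-1234" "555" "1234"]
(re-groups phone-matcher)    ; groups from the most recent find
;; -> ["555-1234" "555" "1234"]

;; Without capturing groups, the whole match comes back as a plain string.
(def word-matcher (re-matcher #"[a-z]+" "hello world"))

(re-find word-matcher)
;; -> "hello"
(re-groups word-matcher)
;; -> "hello"
```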

re-seq

re-seq takes a pattern and a string. It returns a lazy sequence (see Chapter 5) of successive matches of the pattern on the string, using an internal matcher.

(re-seq #"[a-z]" "test")
-> ("t" "e" "s" "t")

Boolean

Boolean values in Clojure are very simple. They use the reserved symbols true and false for literal values and are instances of java.lang.Boolean.

When evaluating other data types within a boolean expression, all data types (including empty strings, empty collections, and numeric zero) evaluate as true. The only thing besides actual boolean false values that evaluates as false is the non-value nil.
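
This rule differs from many other dynamic languages, so a quick check may help. The truthy? helper below is a hypothetical function written for this sketch, not part of the core library.

```clojure
;; Hypothetical helper that reports how a value behaves in a boolean context.
(defn truthy? [x]
  (if x true false))

(truthy? 0)       ; -> true
(truthy? "")      ; -> true
(truthy? [])      ; -> true
(truthy? false)   ; -> false
(truthy? nil)     ; -> false
```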

Common Boolean Functions

Clojure provides some Boolean functions for convenience.

not

The not function (not) takes a single argument. It returns true if the argument is logically false and false if it is logically true.

(not (== 5 5))
-> false

and

The and macro takes any number of arguments. It returns the first logically false argument it encounters or, if every argument is logically true, the value of the last one. It is efficient in that it stops at the first logically false argument without bothering to evaluate the others.

(and (== 5 5) (< 1 2))
-> true

or

The or macro takes any number of arguments. It returns the first logically true argument it encounters or, if none is logically true, the value of the last one. It is efficient in that it returns as soon as it encounters a logically true argument, without bothering to evaluate the others.

(or (== 5 5) (== 5 4))
-> true
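
Because and and or hand back one of their arguments rather than a strict boolean, they are often used with non-boolean values. The following sketch illustrates this; the greeting function is a hypothetical example.

```clojure
(and 1 "two" :three)   ; -> :three  every argument is truthy, so the last is returned
(and 1 nil :three)     ; -> nil     evaluation stops at the first falsey argument
(or nil false 42)      ; -> 42      the first truthy argument is returned
(or nil false)         ; -> false   nothing is truthy, so the last is returned

;; Short-circuiting: the println calls are never evaluated.
(or true (println "not printed"))     ; -> true
(and false (println "not printed"))   ; -> false

;; A common idiom: supplying a default for a value that may be nil.
(defn greeting [user-name]
  (str "Hello, " (or user-name "guest")))

(greeting "Ann")   ; -> "Hello, Ann"
(greeting nil)     ; -> "Hello, guest"
```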

Characters

Characters are used to represent a single Unicode character. To enter a character literal, prefix it with a backslash; for example, \i is the character "i". Any Unicode character can be entered by using a backslash, plus a 'u' character and the four-digit hexadecimal code of the Unicode character. For example, \u00A3 is the £ symbol. Clojure also supports the following special values to make it easy to enter whitespace characters as literals: \newline, \space, and \tab.

char

The character coercion function (char) takes a single integer argument and returns the corresponding ASCII / Unicode character.

(char 97)
-> \a

Keywords

Keywords are a special primitive data type unique to Clojure. Their primary purpose is to provide very efficient storage and equality tests. For this reason, their ideal usage is as the keys in a map data structure or other simple "tagging" functionality. As literals, they begin with a colon, for example, :keyword. Beyond the initial colon, they follow all the same naming rules as Symbols (see Chapter 2).

Optionally, keywords can be namespaced. The keyword :user/foo, for example, refers to a keyword called foo in the user namespace. Namespaced keywords can be referenced either by their fully qualified name or prefixed with two colons to look up a keyword in the current namespace (e.g., ::foo is the same as :user/foo if the current namespace is user).

keyword

The keyword function (keyword) takes a single string argument, and returns a keyword of the same name. If two arguments are used, it returns a namespaced keyword.

(keyword "hello")
-> :hello
(keyword "foo" "bar")
-> :foo/bar

keyword?

The keyword test function takes a single argument and returns true if it is a keyword, else false.

(keyword? :hello)
-> true

namespace

.......

Collections

Clojure's collections data types are designed to efficiently fulfill nearly any need for aggregate data structures. They are optimized for efficiency and compatibility with the rest of Clojure and Java and adhere strictly to Clojure's philosophy of immutability. If any one of them is inadequate to represent a data structure, they can be combined in nearly any combination.

They all share the following properties:

They are immutable. Once created, they can never be changed, and are therefore safe to access from any thread at any time. Operations which could be considered to "change" them actually return an entirely new immutable object with the changes in place.
They are persistent. As far as possible, they share data structure with previous versions of themselves to conserve memory and processing time. For this reason, they are actually surprisingly fast and efficient, in some cases much more so than their mutable counterparts in other programming languages.
They support proper equality semantics. This means that given two collections of the same type which contain the same items, they will always be evaluated as equal regardless of their instantiation or implementation details. Therefore, two collections, even if they were created at different times and different places, can still be compared meaningfully.
They are easy to use from within Clojure. Each of them has a convenient literal representation and rich set of supporting functions that make working with them straightforward and hassle-free.
They support interaction with Java. Each of them implements the appropriate read-only portion of the standard Java Collections Framework. This means that, in most cases, they can be passed as-is to Java objects and methods that require collection objects. Lists implement java.util.List, Maps implement java.util.Map, and Sets implement java.util.Set. Note, however, that they will throw an UnsupportedOperationException if you invoke methods which might modify them, since they remain immutable. This is in accordance with the documentation of the java.util.Collection interface for collections which do not support "destructive" modifications.
They all support the powerful Sequence abstraction for easy manipulation via functional paradigms. This capability is discussed in detail in Chapter 5.
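
Several of these properties can be observed directly at the REPL. The following is a small sketch using a vector; the var names original and extended are hypothetical.

```clojure
(def original [1 2 3])
(def extended (conj original 4))

original   ; -> [1 2 3]     the original vector is untouched
extended   ; -> [1 2 3 4]

;; Equality depends on contents, not on how a collection was built.
(= [1 2 3] (vector 1 2 3) (vec '(1 2 3)))   ; -> true
(= {:a 1 :b 2} (hash-map :b 2 :a 1))        ; -> true

;; Read-only java.util methods work as expected.
(.size ^java.util.List original)         ; -> 3
(.contains ^java.util.List original 2)   ; -> true

;; Mutating methods throw instead of changing the collection.
(try
  (.add ^java.util.List original 4)
  (catch UnsupportedOperationException e
    :unsupported))                        ; -> :unsupported
```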

Lists

Linked lists are important for Clojure, if only for the fact that a Clojure program itself is many nested lists. At its most basic level, a list is just a collection of items in a predefined order.

Lists can be entered in literal form by using parentheses, and this is why Clojure code itself uses so many of them. For example, take a standard function call.

(println "Hello World!")

This is simultaneously executable code and a definition of a list. First, the Clojure reader parses it as a list, and then evaluates the list by invoking its first item (in this case println) as a function, and passing the rest of the parameters ("Hello World!") as arguments.

To use a list literal as a data structure rather than having it be evaluated as code, just prefix it with a single quote character. This signals Clojure to parse it as a data structure, but not evaluate it as a Clojure form. For example, to define a literal list of the numbers 1 through 5 and bind it to a symbol, you could do something like this:

(def nums '(1 2 3 4 5))

Note

The single quote character is actually shorthand for another form, called quote. '(1 2 3) and (quote (1 2 3)) are just alternate ways of typing the same thing. quote (or the single quote character) can be used anywhere to prevent the Clojure parser from immediately interpreting a form. It is actually useful for a lot more than just declaring list literals, and becomes indispensable when you really start getting into metaprogramming. See Chapter 12 for a more detailed discussion of using quote in macros to do complex metaprogramming.

Lists are implemented as singly-linked lists and have the same performance advantages and disadvantages. Reading the first item in the list and adding an item to the head of a list are both constant-time operations, whereas accessing the Nth item of a list requires N operations. In most situations, vectors are a better choice than lists for this reason, although lists can still be useful in particular circumstances, especially when constructing Clojure code on the fly.

list

The list function (list) takes any number of arguments and constructs a list using them as values.

(list 1 2 3)
-> (1 2 3)

peek

The peek function (peek) operating on a list takes a single list as an argument and returns the first value in the list.

(peek '(1 2 3))
-> 1

pop

The pop function (pop) operating on a list takes a single list as an argument and returns a new list with the first item removed.

(pop '(1 2 3))
-> (2 3)

list?

The list test function (list?) returns true if its argument is a list, else false.

(list? '(1 2 3))
-> true

Vectors

Vectors are similar to lists in that they store an ordered sequence of items. However, they differ in one important way: they support efficient, nearly constant-time access by item index. In this way, they are more like arrays than linked lists. In general, they should be preferred to lists for most applications as they have no disadvantages compared to lists and are much faster.

Vectors are represented as literals in Clojure programs by using square brackets. For example, a vector of the numbers one through five could be defined and bound to a symbol with the following code:

(def nums [1 2 3 4 5])

Vectors are functions of their indexes. This is not only a mathematical description—they are actually implemented as functions, and you can call them like a function to retrieve values. This is the easiest way to get the value at a given index: call the vector like a function, and pass the index you want to retrieve. Indexes start at 0, so to get the first item in the vector defined previously, you could do something like the following:

user=> (nums 0)
1

Attempting to access an index outside the bounds of the vector will cause an error, specifically, a java.lang.IndexOutOfBoundsException.

vector

The vector creation function (vector) takes any number of arguments and constructs a new vector containing them as values.

(vector 1 2 3)
-> [1 2 3]

vec

The vector conversion function (vec) takes a single argument, which may be any Clojure or Java collection, and constructs a new vector containing the same items as the argument.

(vec '(1 2 3))
-> [1 2 3]

get

The get function (get) applied to a vector takes two arguments. The first is a vector, the second an integer index. It returns the value at the specified index or nil if there is no value at that index.

(get ["first" "second" "third"] 1)
-> "second"
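
The practical difference between get and calling the vector as a function shows up with an out-of-range index. The sketch below also uses nth, a related core function, for comparison.

```clojure
(def nums [1 2 3 4 5])

(nums 0)              ; -> 1
(get nums 0)          ; -> 1
(get nums 10)         ; -> nil     get does not throw for a missing index
(get nums 10 :none)   ; -> :none   an optional third argument is the default
(nth nums 10 :none)   ; -> :none   nth accepts a default as well

;; Calling the vector directly with a bad index throws.
(try
  (nums 10)
  (catch IndexOutOfBoundsException e
    :out-of-bounds))    ; -> :out-of-bounds
```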

peek

The peek function (peek) operating on a vector takes a single vector as an argument and returns the last value in the vector. This differs from peek operating on lists because of the implementation difference between lists and vectors: peek always accesses the value at the most efficient location.

(peek [1 2 3])
-> 3

vector?

The vector test function (vector?) takes a single argument and returns true if it is a vector, else false.

(vector? [1 2 3])
-> true

conj

The conjoin function (conj) takes a collection (such as a vector) as its first argument and any number of additional arguments. It returns a new vector formed by appending all additional arguments to the end of the original vector. It also works for maps and sets.

(conj [1 2 3] 4 5)
-> [1 2 3 4 5]
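
Where conj adds the new item depends on the collection type, since each type adds at its most efficient position. A short comparison:

```clojure
(conj [1 2 3] 4)     ; -> [1 2 3 4]   vectors grow at the end
(conj '(1 2 3) 4)    ; -> (4 1 2 3)   lists grow at the head
(conj #{1 2 3} 4)    ; a set of 1, 2, 3, and 4 (printed order may vary)
(conj #{1 2 3} 3)    ; still a set of 1, 2, and 3; sets ignore duplicates
```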

assoc

The vector association function (assoc) takes three arguments: the first a vector, the second an integer index, and the third a value. It returns a new vector in which the value at the specified index is replaced by the provided value. An index equal to the size of the vector appends the value; an error is caused if the index is greater than the size of the vector.

(assoc [1 2 3] 1 "new value")
-> [1 "new value" 3]
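
The boundary cases are worth seeing side by side. In this sketch, letters is a hypothetical var used only for illustration.

```clojure
(def letters ["a" "b" "c"])

(assoc letters 1 "B")   ; -> ["a" "B" "c"]       replaces the value at index 1
(assoc letters 3 "d")   ; -> ["a" "b" "c" "d"]   an index equal to the size appends
letters                 ; -> ["a" "b" "c"]       the original is unchanged

(try
  (assoc letters 5 "f")   ; index beyond the end of the vector
  (catch IndexOutOfBoundsException e
    :out-of-bounds))      ; -> :out-of-bounds
```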

pop

The pop function (pop) operating on a vector takes a single vector as an argument and returns a new vector with the last item removed. This differs from pop operating on lists because of the implementation difference between lists and vectors: pop always removes the value at the most efficient location.

(pop [1 2 3])
-> [1 2]

subvec

The sub-vector function (subvec) takes two or three arguments. The first is a vector, the second and third (if present) are indexes. It returns a new vector containing only the items of the original vector from the first index (inclusive) to the second index (exclusive), or from the first index to the end of the vector if no second index is provided.

(subvec [1 2 3 4 5] 2)
-> [3 4 5]
(subvec [1 2 3 4 5] 2 4)
-> [3 4]

Maps

Maps are probably the most useful and versatile of Clojure's built-in collections. At heart, maps are very simple. They store a set of key-value pairs. Both keys and values can be any possible type of object, from primitives to other maps. However, keywords are particularly well suited to be map keys, and that is how they are used in most map applications.

Maps in literal form are represented by curly braces enclosing an even number of forms. The forms are interpreted as key/value pairs. For example:

(def my-map {:a 1 :b 2 :c 3})

This defines a map with three keys, the keywords :a, :b, and :c. The key :a is bound to 1, :b is bound to 2, and :c to 3. Because the comma character is equivalent to whitespace in Clojure, it is often used to clarify key-value groupings without any change to the actual meaning of the map definition. The line below is exactly equivalent to the preceding one:

(def my-map {:a 1, :b 2, :c 3})

Although keywords make excellent keys for maps, there is no rule specifying that you have to use them: any value, even another collection, can be used as a key. Keywords, strings, and numbers are all commonly used as map keys.

Similarly to vectors, maps are functions of their keys (although they don't throw an exception if a key isn't found). To retrieve the value associated with a particular key, use the map as a function and pass the key as its parameter. For example, to retrieve the value associated with :b in the example above, just do the following:

user=> (my-map :b)
2
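
A few more lookups may round out the picture, including a missing key. The last line relies on the fact that keywords can themselves be called as functions of a map, which the text above does not cover.

```clojure
(def my-map {:a 1, :b 2, :c 3})

(my-map :b)     ; -> 2
(my-map :z)     ; -> nil   a missing key does not throw
(my-map :z 0)   ; -> 0     an optional second argument is the default
(:b my-map)     ; -> 2     a keyword looks itself up in the map
```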

There are three different possible implementations of normal maps: array maps, hash maps, and sorted maps. They respectively use arrays, hashtables, and binary trees as their underlying implementations. Array maps are best for very small maps, and the comparative value of hash maps and sorted maps depends on the exact performance characteristics required.

By default, maps defined as literals are instantiated as array maps if they are very short and hash maps if they are larger. To explicitly create a map of a given type, use the hash-map or sorted-map functions:

user=> (hash-map :a 1, :b 2, :c 3)
{:a 1, :c 3, :b 2}
user=> (sorted-map :a 1, :b 2, :c 3)
{:a 1, :b 2, :c 3}

Note that the hash map does not preserve any particular key order, while the sorted map orders its entries by key. By default, sorted-map uses the natural comparison order of the keys: numeric or alphabetical, whichever is applicable.

Struct Maps

When using maps, it is frequently the case that it is necessary to generate quantities of maps which use the same set of keys. Because a normal map necessarily allocates memory for its keys as well as its values, this can lead to wasted memory when creating large numbers of similar maps.

Creating large numbers of maps is often a very useful thing to do, however, so Clojure provides Struct maps. Struct maps allow you to predefine a specific key structure, and then use it to instantiate multiple maps which conserve memory by sharing their key and lookup information. They are semantically identical to normal maps: the only difference is performance.

To define a structure, use defstruct: it takes a name and a number of keys. For example:

(defstruct person :first-name :last-name)

This defines a structure named person, with the keys :first-name and :last-name. Use the struct-map function to create instances of person:

(def person1 (struct-map person :first-name "Luke" :last-name "VanderHart"))
(def person2 (struct-map person :first-name "John" :last-name "Smith"))

person1 and person2 are now two maps which efficiently share the same key information. But they are still maps in all ways: you retrieve their values in the same way and can even associate them with additional keys. Of course, additional keys don't get the same performance benefits as keys defined in the struct. The only limitation on struct maps as compared with normal maps is that you can't disassociate a struct map from one of its base keys defined in the structure. Doing so will cause an error.
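
These points can be sketched as follows. The :nickname key and its value are made up for the example, and the exact exception class thrown on removal may vary between Clojure versions, so the sketch catches the general Exception type.

```clojure
(defstruct person :first-name :last-name)
(def person2 (struct-map person :first-name "John" :last-name "Smith"))

;; Values are retrieved exactly as with a normal map.
(person2 :first-name)    ; -> "John"
(get person2 :last-name) ; -> "Smith"

;; Additional keys can be associated, producing a new struct map.
(def person2-extended (assoc person2 :nickname "Johnny"))
(person2-extended :nickname)   ; -> "Johnny"

;; Removing a base key is not allowed.
(try
  (dissoc person2 :first-name)
  (catch Exception e
    :cannot-remove-base-key))  ; -> :cannot-remove-base-key
```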

Struct maps also allow you to create extremely efficient functions to access key values. Normal map key lookup is by no means slow, but by using struct accessors you can shortcut the normal key lookup process for even greater speed, appropriate for even the most performance-intensive areas of your application.

To create a high-performance accessor to a struct map, use the accessor function, which takes a struct definition and a key, and returns a first class function that takes a struct-map and returns a value.

(def get-first-name (accessor person :first-name))

You can then use the newly defined get-first-name function to efficiently retrieve :first-name from a struct map. The following two statements are exactly equivalent, but the version using the accessor is faster.

(get-first-name person1)
(person1 :first-name)

In general, you shouldn't worry about using struct-maps except for performance reasons. Normal maps are fast enough for most applications and struct maps add a fair amount of complexity with no benefit except for performance. You should know about them since they will help some programs be much more efficient, but typically it is best to use normal maps first and refactor your program to use struct-maps only as an optimization.

Maps As Objects

Obviously, maps are useful in a variety of scenarios. Linking keys to values is a common task in programming. However, the usefulness of maps goes far beyond what are traditionally thought of as data structures.

The most important example is that maps can do 90 percent of what objects do in an object-oriented program. What real difference is there between named properties of an object and a key/value pair in a map? As languages like JavaScript (where objects are implemented as maps) demonstrate, very little.

Good Clojure programs make heavy use of this idea of maps-as-objects. Although Clojure eschews the object-oriented mindset in general, decades of research into object-oriented design do reveal some good principles of data encapsulation and organization. By utilizing Clojure's maps in this way, it becomes possible to reap many of the benefits and lessons learned from object-oriented data structuring while avoiding its pitfalls. In the context of a Clojure program, using maps is far better, because they can be operated on in a common way without needing to define handlers for each different class of object.
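
One way this idea might look in practice is sketched below. The shapes, their keys, and the area function are all hypothetical names chosen for illustration; this is one possible arrangement, not a prescribed pattern.

```clojure
;; Hypothetical "objects" represented as plain maps.
(def circle {:shape :circle, :radius 2})
(def square {:shape :square, :side 3})

;; One ordinary function handles every kind of shape by inspecting the data.
(defn area [item]
  (case (:shape item)
    :circle (* Math/PI (:radius item) (:radius item))
    :square (* (:side item) (:side item))))

(area square)   ; -> 9

;; Generic map functions work on any such value without extra handlers.
(assoc circle :color :red)     ; a new map with :color added
(map :shape [circle square])   ; -> (:circle :square)
```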

assoc

The map association function (assoc) takes as its arguments a map and a number of sequential key-value pairs. It returns a new map with the provided values associated with their respective keys, replacing any existing values with those keys.

(assoc {:a 1 :b 2} :c 3)
-> {:c 3, :a 1, :b 2}
(assoc {:a 1 :b 2} :c 3 :d 4)
-> {:d 4, :c 3, :a 1, :b 2}
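
The examples above only add new keys. The following sketch, using the made-up names original and updated, shows the replacement behavior and that the input map is left unchanged.

```clojure
(def original {:a 1 :b 2})

;; :a already exists, so its value is replaced; :c is added.
(def updated (assoc original :a 10 :c 3))

(:a updated)     ; -> 10
(:c updated)     ; -> 3
(:a original)    ; -> 1, the original map is not modified
(count original) ; -> 2
```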

dissoc

The map disassociation function (dissoc) takes as its arguments a map and a number of keys. It returns a new map formed by removing the provided keys from the supplied map.

(dissoc {:a 1 :b 2 :c 3} :c)
-> {:a 1, :b 2}
(dissoc {:a 1 :b 2 :c 3 :d 4} :a :c)
-> {:b 2, :d 4}
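
Two edge cases are worth a quick sketch: removing a key that is not present, and removing every key. The name m is only illustrative.

```clojure
(def m {:a 1 :b 2})

;; Removing a key that is absent simply returns an equal map.
(dissoc m :z)    ; -> {:a 1, :b 2}

;; Removing every key yields an empty map.
(dissoc m :a :b) ; -> {}

;; As with assoc, the original map is unaffected.
m                ; -> {:a 1, :b 2}
```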

conj

The conj function (conj) works with maps much as it does with vectors, except that each item added must be a key-value pair rather than a single value.

(conj {:a 1 :b 2 :c 3} {:d 4})
-> {:d 4, :a 1, :b 2, :c 3}

A vector pair as an item also works, as shown in the following code:

(conj {:a 1 :b 2 :c 3} [:d 4])
-> {:d 4, :a 1, :b 2, :c 3}
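
As a small additional sketch, conj also accepts several pairs at once, and a map argument that shares a key with the target replaces the existing value.

```clojure
;; Several vector pairs can be added in a single call.
(conj {:a 1} [:b 2] [:c 3])
;; contains :a 1, :b 2, and :c 3

;; When a map is supplied, all of its entries are added;
;; the value for the shared key :b is replaced.
(conj {:a 1 :b 2} {:b 20 :c 30})
;; contains :a 1, :b 20, and :c 30
```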

merge

The map merge function (merge) takes any number of arguments, each of which is a map. It returns a new map formed by combining all the keys and values of its arguments. If a key is present in more than one map, the final value will be that of the last map provided containing that key.

(merge {:a 1 :b 2} {:c 3 :d 4})
-> {:d 4, :c 3, :a 1, :b 2}
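
The example above has no overlapping keys. The sketch below shows the "last map wins" rule using two hypothetical configuration maps, defaults and overrides.

```clojure
;; Hypothetical configuration maps, used only for illustration.
(def defaults  {:host "localhost" :port 8080})
(def overrides {:port 9090})

;; overrides comes last, so its :port value is used.
(:port (merge defaults overrides)) ; -> 9090

;; Reversing the argument order reverses the outcome.
(:port (merge overrides defaults)) ; -> 8080

;; Keys found in only one map are carried over unchanged.
(:host (merge defaults overrides)) ; -> "localhost"
```

Layering defaults under user-supplied options in this way is a common use of merge.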

merge-with

The map merge-with function (merge-with) takes a first-class function as its first argument and any number of additional arguments, each of which is a map. It returns a new map formed by combining all the keys and values of the map arguments. If a key is present in more than one map, the value in the result map is the result of calling the supplied function with the values of the conflicting key as parameters.

(merge-with + {:a 1 :b 2} {:b 2 :c 4})
-> {:c 4, :a 1, :b 4}
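
When a key appears in more than two maps, the function is applied repeatedly from left to right. The following sketch uses made-up fruit counts and tag vectors to show this with two different combining functions.

```clojure
;; :apples appears three times: (+ (+ 1 2) 3) = 6.
(merge-with + {:apples 1} {:apples 2 :pears 1} {:apples 3})
;; :apples is 6, :pears is 1

;; Any two-argument function works; into concatenates the vectors.
(merge-with into {:tags [:a]} {:tags [:b :c]})
;; :tags is [:a :b :c]
```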

get

The map get function (get) takes a map and a key as its first and second arguments, and an optional third argument specifying the value to return if the key is not found. It returns the value of the specified key in the map, or nil if the key is not found and no third argument was given.

(get {:a 1 :b 2 :c 3} :a)
-> 1
(get {:a 1 :b 2 :c 3} :d 0)
-> 0
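
One subtlety deserves a short sketch: a key whose value is nil looks the same as a missing key unless a default is supplied. The map m and the :missing marker below are illustrative choices.

```clojure
(def m {:a nil})

(get m :a)          ; -> nil, key present with a nil value
(get m :b)          ; -> nil, key absent

;; A default value distinguishes the two cases.
(get m :a :missing) ; -> nil, the stored value is returned
(get m :b :missing) ; -> :missing
```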

contains?

The map contains function (contains?) takes a map and a key as arguments. It returns true if the provided key is present in the map, otherwise false. In addition to maps, it also works on sets, where it tests membership, and on vectors, where it tests whether an index (not a value) is present.

(contains? {:a 1 :b 2 :c 3} :a)
-> true
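
Because contains? always tests keys, its behavior on vectors is a frequent source of confusion. A brief sketch of each collection type:

```clojure
;; Maps: tests for the key, even when its value is nil.
(contains? {:a nil} :a)   ; -> true
(contains? {:a 1} :z)     ; -> false

;; Sets: tests membership.
(contains? #{:a :b} :a)   ; -> true

;; Vectors: tests whether the index exists, not the value.
(contains? [:a :b :c] 0)  ; -> true
(contains? [:a :b :c] 3)  ; -> false
(contains? [:a :b :c] :a) ; -> false
```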

map?

The map test function (map?) takes a single argument and returns true if it is a map, otherwise false.

(map? {:a 1 :b 2 :c 3})
-> true

keys

The map keys function (keys) takes a single argument, a map. It returns a sequence of all the keys present in the map.

(keys {:a 1 :b 2 :c 3})
-> (:a :b :c)

vals

The map vals function (vals) takes a single argument, a map. It returns a sequence of all the values in the map.

(vals {:a 1 :b 2 :c 3})
-> (1 2 3)
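
For a given map, keys and vals return their results in corresponding order, so the two sequences can be paired back up. The sketch below uses zipmap from clojure.core, which is not covered in the text above, to show this; the name m is illustrative.

```clojure
(def m {:a 1 :b 2 :c 3})

;; Pairing the keys with the vals rebuilds an equal map.
(= m (zipmap (keys m) (vals m))) ; -> true

;; The results are ordinary sequences, usable with any sequence function.
(apply + (vals m))               ; -> 6

;; For an empty map, both functions return nil.
(keys {})                        ; -> nil
```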

Sets

Sets in Clojure are closely related to the mathematical concept: they are collections of unique values and support efficient membership tests as well as common set operations such as union, intersection, and difference.

The literal syntax for a set is the pound sign followed by the members of the set enclosed in curly braces. For example, the following code defines a set of three keywords:

(def languages #{:java :lisp :c++})

Like maps, sets support any kind of object as members. For example, here is a similar set using strings:

(def languages-names #{"Java" "Lisp" "C++"})

The implementation of sets is very similar to maps. They can be created in both hashtable and binary tree implementations, using the hash-set and sorted-set functions:

(def set1 (hash-set :a :b :c))
(def set2 (sorted-set :a :b :c))
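
The sketch below illustrates the uniqueness guarantee and the ordering provided by sorted-set. It also uses the set function from clojure.core, which is not mentioned above, to build a set from an existing collection.

```clojure
;; Duplicate arguments are collapsed into a single member.
(count (hash-set :a :b :a)) ; -> 2

;; A sorted set yields its members in sorted order.
(seq (sorted-set 3 1 2))    ; -> (1 2 3)

;; set converts another collection into a set, dropping duplicates.
(= #{1 2 3} (set [1 2 2 3 3 3])) ; -> true

;; Adding a member that is already present has no effect.
(= #{:a :b} (conj #{:a :b} :a))  ; -> true
```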

Also like maps, sets are functions of their members. Calling a set as a function and passing it a value will return the value if the set contains the value and nil if it doesn't.

(set1 :a) ;returns :a
(set1 :z) ;returns nil
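
Because a set behaves as a function, it can be passed directly as a predicate to sequence functions. A minimal sketch, using a hypothetical vowels set:

```clojure
(def vowels #{\a \e \i \o \u})

;; Keep only the characters that are members of the set.
(filter vowels "clojure") ; -> (\o \u \e)

;; Drop the characters that are members of the set.
(remove vowels "clojure") ; -> (\c \l \j \r)
```

This idiom relies on the set returning the member itself, which is truthy, so it is not suitable for sets that contain nil or false.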

Common Set Functions

Note that the relational set functions are not part of the default clojure.core namespace, but rather the clojure.set namespace. You will need either to reference the namespace explicitly or to bring it into your own namespace using the :use clause in your ns form. See Chapter 2.
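
As one possible setup, the sketch below loads clojure.set with a :require clause and a short alias rather than :use. The namespace name example.sets and the alias set are arbitrary choices for this illustration.

```clojure
;; example.sets is a hypothetical namespace name.
(ns example.sets
  (:require [clojure.set :as set]))

;; The functions are now reachable through the alias.
(set/union #{1 2} #{2 3})
;; a set containing 1, 2, and 3
```

Using an alias keeps it clear at each call site which functions come from clojure.set.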

clojure.set/union

The set union function takes any number of arguments, each a set. It returns a new set containing the union of the members of the argument sets.

(clojure.set/union #{:a :b} #{:c :d})
-> #{:a, :c, :b, :d}

clojure.set/intersection

The set intersection function takes any number of arguments, each a set. It returns a new set containing the intersection of the members of the argument sets or the empty set if there is no intersection.

(clojure.set/intersection #{:a :b :c :d} #{:c :d :f :g})
-> #{:c, :d}

clojure.set/difference

The set difference function takes any number of arguments, each a set. It returns a new set containing the members of the first set without the members of the remaining sets.

(clojure.set/difference #{:a :b :c :d} #{:c :d})
-> #{:a, :b}
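
A few further cases, sketched here with clojure.set loaded under the assumed alias set: difference with more than two arguments, the effect of argument order, and an empty intersection.

```clojure
(require '[clojure.set :as set])

;; Members of every later set are removed from the first set.
(set/difference #{:a :b :c :d} #{:a} #{:d}) ; -> #{:b :c} (print order may vary)

;; difference is not symmetric: argument order matters.
(set/difference #{:c :d} #{:a :b :c :d})    ; -> #{}

;; Sets with no members in common have an empty intersection.
(set/intersection #{1 2} #{3 4})            ; -> #{}

;; union ignores duplicates across its arguments.
(count (set/union #{1 2} #{2 3} #{3 4}))    ; -> 4
```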

Summary

Clojure provides a very complete and capable set of data types which, in combination, should be able to meet just about any programming need. Its primitive types provide the basic building blocks of any program, including very rich, worry-free numeric and string support.

The true strength of Clojure's data system, however, lies in its collections library. Collections are not just convenient things to use; they are integral to Clojure's philosophy on data and immutability. They strictly adhere to the principles of immutability, meaning they cannot be changed, and persistence, meaning they share their structure for maximum efficiency. Relying on Clojure's built-in data structures and being familiar with the functions available for them will go a long way towards making your code efficient, readable, and idiomatic.
