Validating Data

Any time you receive a value from a user or an external source, that data may contain values that break your expectations. Clojure specs let you describe those expectations precisely, check whether the data is valid, and determine how it conforms to the specification. When data is invalid, you can also find out which parts are invalid and why.

Predicates are the simplest kind of spec—they check whether a predicate matches a single value. Other specs compose predicates (and other specs) to create more complicated specifications. Some of the tools we’ll discuss include range specs, logical connectors like and and or, and collection specs. These specs combine to cover any data structure you need to describe.

Predicates

Predicate functions that take a single value and return a logical true or false value are valid specs. Clojure provides dozens of predicates (many of these functions end in ?); you can use any of those or ones in your own project. Some example predicates in clojure.core are functions like boolean?, string?, and keyword?. These predicates check for a single underlying type.

Other predicates combine several types together, such as rational?, which returns true for a value that is an integer, a decimal (a BigDecimal, not a floating-point double), or a ratio. Many predicates verify a property of the value itself, like pos?, zero?, empty?, any?, and some?.

Consider a simple predicate spec for a company name:

 (s/def :my.app/company-name string?)

We’ve registered the specification for a company name in our application, so let’s use it to validate incoming data:

 (s/valid? :my.app/company-name "Acme Moving")
 -> true
 
 (s/valid? :my.app/company-name 100)
 -> false

Now that we’ve created the spec and registered it with a name (:my.app/company-name), other parts of our program can use it as well. As we code, we build and record the semantics of our domain.
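
Predicates from your own project work the same way. The following is a minimal sketch, assuming spec is loaded under the usual s alias; the names short-name? and :my.app/short-name are hypothetical and introduced only for illustration.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.string :as str])

;; Hypothetical predicate: a non-blank string of at most 20 characters.
(defn short-name? [x]
  (and (string? x)
       (not (str/blank? x))
       (<= (count x) 20)))

;; Any one-argument predicate can be registered as a spec.
(s/def :my.app/short-name short-name?)

(s/valid? :my.app/short-name "Acme Moving") ;; => true
(s/valid? :my.app/short-name "")            ;; => false
(s/valid? :my.app/short-name 100)           ;; => false
```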

Next, we’ll consider the common case of writing a spec that matches a set of enumerated values.

Enumerated Values

If we’re in the business of selling marbles, we might stock three colors—red, green, and blue. When defining a spec for the color, we want to match any of those three values (declared as a keyword). Clojure sets are a perfect match for this—they allow us to store a set of non-duplicate items with a fast check for containment. Even better, sets implement the function interface and are valid specs as well:

 (s/def :marble/color #{:red :green :blue})
 
 (s/valid? :marble/color :red)
 -> true
 
 (s/valid? :marble/color :pink)
 -> false
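
To see why a set can act as a spec, it may help to call one directly. This sketch assumes the s alias for clojure.spec.alpha; the var name colors is hypothetical.

```clojure
(require '[clojure.spec.alpha :as s])

(def colors #{:red :green :blue})

;; Calling a set looks up its argument and returns the member, or nil.
(colors :red)   ;; => :red
(colors :pink)  ;; => nil

;; The same set can be used as a spec without registering it first.
(s/valid? colors :green)  ;; => true
(s/valid? colors "green") ;; => false
```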

Consider writing a spec to match a bowling roll, where 0–10 pins could be knocked down. We can write such a spec like this:

 (s/def :bowling/roll #{0 1 2 3 4 5 6 7 8 9 10})
 
 (s/valid? :bowling/roll 5)
 -> true

This works fine, but it seems silly to need to say all those numbers in order. We should make our computer take care of that instead, and range specs provide this functionality.

Range Specs

Clojure spec provides several range specs for just the purpose of validating a range of values, like a bowling roll. Let’s rewrite our spec using the provided s/int-in spec. We provide this spec with beginning (inclusive) and end (exclusive) integer values.

 (s/def :bowling/ranged-roll (s/int-in 0 11))
 
 (s/valid? :bowling/ranged-roll 10)
 -> true

In addition to s/int-in, s/double-in and s/inst-in are for ranges of doubles and time instants. All of these define value ranges with lower and upper bounds. The details vary slightly depending on the data type, so check each function’s docstring with (doc s/double-in) or (doc s/inst-in). Now let’s see how specs handle the special case of nil values.
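
Before moving on, here is a small sketch comparing the three range specs side by side. The spec names ::probability and ::eighties are hypothetical, and the bounds are arbitrary examples.

```clojure
(require '[clojure.spec.alpha :as s])

;; s/int-in: start is inclusive, end is exclusive.
(s/valid? (s/int-in 0 11) 0)   ;; => true
(s/valid? (s/int-in 0 11) 11)  ;; => false

;; s/double-in takes keyword options; :min and :max are both inclusive.
(s/def ::probability (s/double-in :min 0.0 :max 1.0))
(s/valid? ::probability 0.25)  ;; => true
(s/valid? ::probability 1.0)   ;; => true
(s/valid? ::probability 1.5)   ;; => false

;; s/inst-in: start is inclusive, end is exclusive, like s/int-in.
(s/def ::eighties (s/inst-in #inst "1980" #inst "1990"))
(s/valid? ::eighties #inst "1981-02-12") ;; => true
(s/valid? ::eighties #inst "1990")       ;; => false
```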

Handling nil

Most predicates in Clojure will return false for the nil value; however, there are many cases where you’ll want to take an existing spec and extend it to also include the nil value. You can use the special s/nilable operation to extend an existing spec.

For example we could define our company name field as accepting either strings or nil:

 (s/def :my.app/company-name-2 (s/nilable string?))
 
 (s/valid? :my.app/company-name-2 nil)
 -> true

The s/nilable operation provides optimal performance and is preferred over equivalent specs that you might construct with s/or or other operations.

One case you might encounter is spec’ing the set of values true, false, or nil. It’s tempting to use an explicit set for this: #{true, false, nil}. However, spec checks a set by invoking it as a function, and a set returns the matching value, which in this case may be false or nil. Spec interprets a false or nil result as the predicate rejecting the value rather than accepting it. Instead, use s/nilable to add nil as a valid value to the boolean? predicate:

 (s/def ::nilable-boolean (s/nilable boolean?))

This will give you the correct behavior and the best performance.
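
The difference is easy to observe at the REPL. This sketch assumes only the s alias for clojure.spec.alpha and reuses the ::nilable-boolean name from above.

```clojure
(require '[clojure.spec.alpha :as s])

;; Invoking the set returns the matched member, which may itself be falsey.
(#{true false nil} false) ;; => false

(s/valid? #{true false nil} true)   ;; => true
(s/valid? #{true false nil} false)  ;; => false (member found, but falsey)
(s/valid? #{true false nil} nil)    ;; => false

(s/def ::nilable-boolean (s/nilable boolean?))
(s/valid? ::nilable-boolean false)  ;; => true
(s/valid? ::nilable-boolean nil)    ;; => true
(s/valid? ::nilable-boolean "yes")  ;; => false
```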

Now that we know the basics of working with predicates and sets, we can consider writing more interesting compound specs that combine specs using logical operations.

Logical Specs

Logical specs create composite specs from other specs using s/and or s/or. For example, to create a spec for an odd integer, combine the predicates int? and odd?:

 (s/def ::odd-int (s/and int? odd?))
 (s/valid? ::odd-int 5)
 -> true
 (s/valid? ::odd-int 10)
 -> false
 (s/valid? ::odd-int 5.2)
 -> false

In this example, we’re combining predicates, but s/and can take any kind of spec.
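
As a sketch of s/and with a non-predicate spec, the following combines a range spec with a predicate. The name ::even-roll is hypothetical. Note that s/and checks its specs in order and stops at the first failure, so even? is never called on a value that the range spec has already rejected.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical spec: an even number of pins between 0 and 10.
(s/def ::even-roll (s/and (s/int-in 0 11) even?))

(s/valid? ::even-roll 4)   ;; => true
(s/valid? ::even-roll 7)   ;; => false (in range, but odd)
(s/valid? ::even-roll 12)  ;; => false (even, but out of range)
(s/valid? ::even-roll 4.0) ;; => false (not an integer; even? is never reached)
```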

Similarly, we use s/or to combine multiple alternatives. For instance, to add 42 as a valid value in our last spec:

 (s/def ::odd-or-42 (s/or :odd ::odd-int :42 #{42}))

With s/or, we see that things are a bit different—each option in the s/or contains a keyword tag used to report how a value matches (or doesn’t match) the spec.

If we want to know how a value matched a spec, we can use s/conform, which returns the value annotated with information about optional choices or the components of the value. We call this a conformed value.

For all of the specs we’ve seen until this point, there were no options or components, and so the conformed value was the same as the original value. The s/or contains a choice, and the conformed value tags the choice taken with the key (either :42 or :odd):

 (s/conform ::odd-or-42 42)
 -> [:42 42]
 (s/conform ::odd-or-42 19)
 -> [:odd 19]

The conformed value for an s/or is a map entry, and the key and val functions extract the tag and value, respectively. The s/conform operation parses a value and describes the parse structure using the tags.
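
A short sketch of taking a conformed value apart, and of what s/conform returns when no branch matches. The specs are restated so the snippet stands alone.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::odd-int (s/and int? odd?))
(s/def ::odd-or-42 (s/or :odd ::odd-int :42 #{42}))

;; key and val pull the tag and the value out of the conformed entry.
(let [conformed (s/conform ::odd-or-42 19)]
  [(key conformed) (val conformed)])
;; => [:odd 19]

;; A value that matches no branch conforms to the invalid marker.
(s/conform ::odd-or-42 0)               ;; => :clojure.spec.alpha/invalid
(s/invalid? (s/conform ::odd-or-42 0))  ;; => true
```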

Conversely, the s/explain function describes all of the ways an invalid value didn’t match its spec:

 (s/explain ::odd-or-42 0)
 | val: 0 fails spec: :user/odd-int at: [:odd] predicate: odd?
 | val: 0 fails spec: :user/odd-or-42 at: [:42] predicate: #{42}

Here, spec has found and reported two problems with the value. The first is that the value is not odd; it passed the int? check inside ::odd-int, so the report includes only the failing predicate. The second is that the value is not a member of the set #{42}. In each problem line, we see the failing value, the spec being checked, the tag path to the failing spec, and the failing predicate.

The explain messages print to the console, but we can alternately retrieve this info as a string with s/explain-str or as data with s/explain-data.
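
A minimal sketch of working with the data form, assuming the same two specs as above. The var name problems is hypothetical; the :clojure.spec.alpha/problems key is written out in full so the snippet does not depend on the s alias at read time.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::odd-int (s/and int? odd?))
(s/def ::odd-or-42 (s/or :odd ::odd-int :42 #{42}))

;; s/explain-data returns a map; the problems live under a qualified key.
(def problems
  (:clojure.spec.alpha/problems (s/explain-data ::odd-or-42 0)))

(count problems)            ;; => 2
(set (map :path problems))  ;; => #{[:odd] [:42]}
(set (map :val problems))   ;; => #{0}

;; A valid value has nothing to explain.
(s/explain-data ::odd-or-42 42) ;; => nil

;; s/explain-str returns the report as a string instead of printing it.
(string? (s/explain-str ::odd-or-42 0)) ;; => true
```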

Now that we’ve started to compose specs, you may be looking at your own data and thinking about how to write specs for it. Most Clojure data is not made of individual values but of collections, and there are a number of ways to write collection specs.

Collection Specs

The two most common collection specs you’ll use are s/coll-of and s/map-of. The s/coll-of spec describes lists, vectors, sets, and seqs. You provide a spec that members of the collection must satisfy, and spec checks all members.

 (s/def ::names (s/coll-of string?))
 (s/valid? ::names ["Alex" "Stu"])
 -> true
 (s/valid? ::names #{"Alex" "Stu"})
 -> true
 (s/valid? ::names '("Alex" "Stu"))
 -> true

The s/coll-of spec also comes with many additional options supplied as keyword arguments at the end of the spec.

  • :kind - a predicate checked at the beginning of the spec. Common examples are vector? and set?.

  • :into - one of these literal collections: [], (), {}, or #{}. Conformed values collect into the specified collection.

  • :count - an exact count for the collection.

  • :min-count - a minimum count for the collection.

  • :max-count - a maximum count for the collection.

  • :distinct - true if the elements of the collection must be unique.

As you can see, these options allow for specifying many common collection shapes and constraints. For example, we can choose to match just int sets with at least two values with a spec like this:

 (s/def ::my-set (s/coll-of int? :kind set? :min-count 2))
 (s/valid? ::my-set #{10 20})
 -> true
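
The other options combine in the same way. This sketch uses a hypothetical ::triple spec to exercise :kind, :count, and :distinct together, and an unregistered spec to show the effect of :into on the conformed value.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical spec: a vector of exactly three distinct ints.
(s/def ::triple (s/coll-of int? :kind vector? :count 3 :distinct true))

(s/valid? ::triple [1 2 3])   ;; => true
(s/valid? ::triple [1 2 2])   ;; => false (duplicate element)
(s/valid? ::triple [1 2])     ;; => false (wrong count)
(s/valid? ::triple '(1 2 3))  ;; => false (not a vector)

;; :into controls the collection type of the conformed value.
(s/conform (s/coll-of int? :into #{}) [1 2 2 3]) ;; => #{1 2 3}
```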

Similar to s/coll-of, s/map-of specs a lookup map where the keys and values each follow a spec, such as a mapping from player names to scores:

 (s/def ::scores (s/map-of string? int?))
 (s/valid? ::scores {"Stu" 100, "Alex" 200})
 -> true

All of the s/coll-of options also apply, although you won’t typically need to use :into or :kind because they default to map-specific settings.

If you recall, the s/conform function tells how a value was parsed according to a spec. s/map-of conforms to a map and always conforms values. Keys are not conformed by default, but you can change that using the :conform-keys flag.
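
The effect of :conform-keys only shows up when the key spec changes a value during conforming, as s/or does. The following sketch assumes a hypothetical ::player spec that accepts either a name or a numeric id.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical spec: players are identified by a name or a numeric id.
(s/def ::player (s/or :name string? :id int?))

;; Keys are validated but returned unchanged by default.
(s/conform (s/map-of ::player int?) {"Stu" 100})
;; => {"Stu" 100}

;; With :conform-keys true, keys are replaced by their conformed form.
(s/conform (s/map-of ::player int? :conform-keys true) {"Stu" 100})
;; => {[:name "Stu"] 100}

;; Values are always conformed.
(s/conform (s/map-of string? ::player) {"Stu" 7})
;; => {"Stu" [:id 7]}
```

Conforming keys changes the shape of the map, which is one reason it is opt-in.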

In cases of large collections or large maps, you might not want to validate or conform all values. For these cases, you can use the sampling specs s/every and s/every-kv instead.

Collection Sampling

The sampling collection specs are s/every and s/every-kv for collections and maps, respectively. They are similar in operation to s/coll-of and s/map-of, except that they check at most s/*coll-check-limit* elements (101 by default).

Because these specs validate only a limited subset of values and conform no elements, large collections and maps have much better validation performance.
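
The trade-off is that a sampled check can miss a bad element. This sketch builds a hypothetical lazy sequence, mostly-ints, whose only invalid element lies beyond the default check limit, and compares the two kinds of spec on it.

```clojure
(require '[clojure.spec.alpha :as s])

s/*coll-check-limit* ;; => 101

;; A lazy seq of 200 ints followed by one bad element at the very end.
(def mostly-ints (concat (range 200) [:oops]))

;; s/coll-of checks every element and finds the bad one.
(s/valid? (s/coll-of int?) mostly-ints) ;; => false

;; s/every stops after s/*coll-check-limit* elements of this seq,
;; so the bad element at the end goes unnoticed.
(s/valid? (s/every int?) mostly-ints)   ;; => true
```

Sampling specs are therefore best suited to cases where a spot check is acceptable, not where every element must be guaranteed valid.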

Tuples

Tuples are vectors with a known structure where each fixed element has its own spec. For example, a vector of x and y coordinates can represent a point:

 (s/def ::point (s/tuple float? float?))
 (s/conform ::point [1.3 2.7])
 -> [1.3 2.7]

Tuples do not name or tag the returned fields. Later in this chapter, we’ll see another approach to handling sequential collections with well-known internal structure (using s/cat).
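
For now, a quick sketch of what s/tuple rejects: the value must be a vector of exactly the right length, and each position must satisfy its own spec.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::point (s/tuple float? float?))

(s/valid? ::point [1.3 2.7])      ;; => true
(s/valid? ::point [1.3])          ;; => false (too few elements)
(s/valid? ::point [1.3 2.7 3.1])  ;; => false (too many elements)
(s/valid? ::point [1 2])          ;; => false (ints do not satisfy float?)
(s/valid? ::point '(1.3 2.7))     ;; => false (a list, not a vector)
```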

It’s common to use maps to represent information with well-known fields. Clojure spec provides a number of tools for handling these kinds of maps, which we’ll look at next.

Information Maps

It’s common in Clojure to represent domain objects as maps with well-known fields; for example, we might be managing data related to bands and music albums. We might represent a particular release like this:

 {::music/id #uuid "40e30dc1-55ac-33e1-85d3-1f1508140bfc"
  ::music/artist "Rush"
  ::music/title "Moving Pictures"
  ::music/date #inst "1981-02-12"}

We start with describing specs for the attributes:

 (s/def ::music/id uuid?)
 (s/def ::music/artist string?)
 (s/def ::music/title string?)
 (s/def ::music/date inst?)

Many validation libraries define the structure of a map of attributes by including definitions of the attributes. This approach introduces important long-term problems. Attributes have independent semantics and may be reused across different structures where they should have the identical meaning without needing to be redefined.

By contrast, Clojure spec requires attributes to be defined independently and then defines map specs as open collections of attributes with no need for restatement of attribute definitions. This approach easily supports subsets of maps, evolution of maps over time, and the transfer of maps through subsystems where not all attributes need to be understood by the intermediary.

To specify a map of attributes, we use s/keys, which describes both required and optional attributes:

 (s/def ::music/release
   (s/keys :req [::music/id]
           :opt [::music/artist
                 ::music/title
                 ::music/date]))

Here we define a ::music/release map to require only a ::music/id attribute, and all other attributes are optional.

An important additional feature of s/keys is that it will validate the values of all registered keys, even if they’re not listed as required or optional. For example, if new optional attributes for ::music/release are added in the future, the s/keys spec will automatically validate those as well. This approach encourages the uniform validation of registered attributes and the growth of systems over time.
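
The sketch below illustrates this behavior. As an assumption made to keep the snippet self-contained, it writes the keywords out as plain :music/... rather than the aliased ::music/... form, and it deliberately leaves :music/title out of the s/keys spec. The names release-id and :other/rating are hypothetical.

```clojure
(require '[clojure.spec.alpha :as s])

;; Plain qualified keywords stand in for the aliased ::music/... names.
(s/def :music/id uuid?)
(s/def :music/artist string?)
(s/def :music/title string?)

;; Note that :music/title is registered but not listed in the spec below.
(s/def :music/release
  (s/keys :req [:music/id]
          :opt [:music/artist]))

(def release-id (java.util.UUID/randomUUID))

;; Only the required key is needed.
(s/valid? :music/release {:music/id release-id})                    ;; => true
;; The required key is missing.
(s/valid? :music/release {:music/artist "Rush"})                    ;; => false
;; An unlisted but registered key is still checked against its spec.
(s/valid? :music/release {:music/id release-id :music/title 2112})  ;; => false
;; A key with no registered spec is ignored.
(s/valid? :music/release {:music/id release-id :other/rating 5})    ;; => true
```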

In the case where you have existing maps that don’t use qualified keys, the variant options :req-un and :opt-un take a spec identifier but find the value to check by using the unqualified name of the spec. For example, if our map instead looked like this:

 {:id #uuid "40e30dc1-55ac-33e1-85d3-1f1508140bfc"
  :artist "Rush"
  :title "Moving Pictures"
  :date #inst "1981-02-12"}

we could use our existing attribute specs to create an unqualified version of the spec:

 (s/def ::music/release-unqualified
   (s/keys :req-un [::music/id]
           :opt-un [::music/artist
                    ::music/title
                    ::music/date]))

This version of the spec will match the spec for ::music/id with the unqualified attribute :id and so on for the other attributes as well. Specs for records also use the same approach with unqualified attributes.
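
A minimal sketch of both uses follows. It again assumes plain :music/... keywords in place of the aliased names, and the record type Release and the var release-id are hypothetical.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def :music/id uuid?)
(s/def :music/artist string?)

(s/def :music/release-unqualified
  (s/keys :req-un [:music/id]
          :opt-un [:music/artist]))

(def release-id
  (java.util.UUID/fromString "40e30dc1-55ac-33e1-85d3-1f1508140bfc"))

(s/valid? :music/release-unqualified {:id release-id :artist "Rush"}) ;; => true
(s/valid? :music/release-unqualified {:artist "Rush"})                ;; => false (no :id)
(s/valid? :music/release-unqualified {:id release-id :artist 2112})   ;; => false

;; Records are maps with unqualified keys, so the same spec applies.
(defrecord Release [id artist])
(s/valid? :music/release-unqualified (->Release release-id "Rush"))   ;; => true
```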

You’ve now seen an overview of the basic tools for specifying the structure of our data. In the next section, we take things to the next level by specifying the inputs and outputs of our functions using specs. When we do that, we’ll need some way to specify the syntax for function calls, and that will bring in one final set of specs we’ve not yet examined—regex ops.
