In Chapter 3, we covered the JSON data types and discussed why data types are important and useful. Knowing ahead of time what something is and what it is for (remember the kid with the hammer) makes a world of difference.
In most scenarios with data interchange formats, the data is being created to send across the Internet or a network to another party. That party usually has a desired format for the document that they are expecting, including structure and data types. They will usually provide documentation that explains the format and provides examples.
Even when the most detailed, beautiful documentation is provided, it is not difficult to create errors in your data. To be clear, these aren’t syntax errors we are talking about here. These are errors of misunderstanding, like “I sent an apple, and you were expecting an orange.” In this book, I will refer to this type of validation as conformity validation so that it may be distinguished from syntax validation.
In this scenario, the process usually plays out in the following steps:
This scenario has existed with data interchange since before JSON existed. Fortunately, the people of the technology industry are problem solvers, and the concept of the schema was born.
In the real world, we often use contracts between two parties where the outcome is important. When I sign a contract that says I will complete a project for someone, the details are outlined in that contract. I agree that I will deliver the spaceship by August 31st, and the final product will have a fully functional spaceship with life support, lasers, and three engines.
Imagine now that we live in a world of wizards and magic. When the company I’m doing the project for handed me the contract, they added a bit of magic. At any time, I can tap my wand on the contract and it will tell me whether I’ve met my end of the bargain. I’d never have to walk into the meeting to proclaim “I’m done!” and be met with the embarrassing response of “What about the third engine you promised to put on the spaceship. Where is it?” At any time I can verify that I am really done with the project and walk into the meeting with confidence.
A data interchange schema is much like that imagined world of wizards and magic. Before we send our data, we can at any time validate it for conformity with the schema and find out whether our data is acceptable. When we are interchanging data with a schema, the process is much different than our scenario without the schema:
Additionally, the JSON schema can be used on the other end of the transaction by the party that is accepting the data. A JSON schema can be a first line of defense in accepting data, to verify that the data conforms. It can answer three questions before the data is processed: are the data types of the values correct, does the data include the required values, and are the values in the required format?
While JSON is fairly mature, JSON Schema is still under development. As of April 2015, JSON Schema is in draft 4. This doesn’t mean that you shouldn’t use JSON Schema—it just means it’s still evolving to better serve the world.
A JSON Schema is written with JSON, so reading or writing one is only a few steps away. In our very first name-value pair of our JSON, we must declare it as a schema document (Example 4-1).
{ "$schema": "http://json-schema.org/draft-04/schema#" }
The second name-value pair in our JSON Schema Document will be the title (see Example 4-2).
{ "$schema": "http://json-schema.org/draft-04/schema#", "title": "Cat" }
In the third name-value pair of our JSON Schema Document, we will define the properties that we want to be included in the JSON. The "properties" value is essentially a skeleton of the name-value pairs of the JSON we want. Instead of a literal value, we have an object that defines the data type, and optionally the description (Example 4-3).
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Cat",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "number", "description": "Your cat's age in years." },
    "declawed": { "type": "boolean" }
  }
}
We can then validate that our JSON conforms to the JSON Schema (Example 4-4).
{ "name": "Fluffy", "age": 2, "declawed": false }
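To make the idea of a type check concrete, here is a minimal sketch in Python of what a validator does with the "type" entries in Example 4-3. It is not a real JSON Schema validator: the names `JSON_TYPES` and `type_errors` are my own inventions for illustration, and the sketch only understands the three types used in the cat schema. In a real project you would reach for an established JSON Schema library instead.

```python
import json

# Hypothetical mapping from JSON Schema type names to Python types.
# Only the three types used in the "Cat" schema are covered.
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
}


def type_errors(schema, document):
    """Return one message per value whose type does not match the schema."""
    errors = []
    for name, rules in schema.get("properties", {}).items():
        expected = rules.get("type")
        python_type = JSON_TYPES.get(expected)
        if name not in document or python_type is None:
            continue  # absent values and unknown types are skipped here
        value = document[name]
        matches = isinstance(value, python_type)
        # Python treats True/False as integers, so rule them out as numbers.
        if expected == "number" and isinstance(value, bool):
            matches = False
        if not matches:
            errors.append(f"{name}: expected {expected}")
    return errors


# The schema from Example 4-3 and the JSON from Example 4-4.
cat_schema = json.loads("""
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Cat",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "number", "description": "Your cat's age in years." },
    "declawed": { "type": "boolean" }
  }
}
""")
fluffy = json.loads('{ "name": "Fluffy", "age": 2, "declawed": false }')

print(type_errors(cat_schema, fluffy))  # []
```

An empty list means every value that is present has the data type the schema asks for.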
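Earlier I stated that a JSON Schema can answer three questions: are the data types of the values correct, does this include the required data, and are the values in the format I require?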
With the very simple cat example, the first question was answered. We were able to validate that the JSON for the cat “Fluffy” has the correct data types for the values of name, age, and declawed. Let’s answer the second question: does this include the required data?
When we ask for data, there are often properties (or fields) that we must have values for, and others that are optional. For example, when I create a new account on a shopping website, I need to complete a shipping address form. That address form requires my name, street, city, state, and zip code. Optionally, I can include a company name, apartment number, and a second line for a street address. If I leave out one of the required fields, I cannot move forward with the account creation.
To achieve this required logic in the JSON schema, we add a fourth name-value pair after "$schema", "title", and "properties". This name-value pair has the name "required" and a value of the array data type. The array includes the fields we require.
In Example 4-5, we first add another field for "description". Next, we add a fourth name-value pair, "required", with an array of required values for its value. "name", "age", and "declawed" are required, so we add them to this list. We leave out "description" because it’s not required.
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Cat",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "number", "description": "Your cat's age in years." },
    "declawed": { "type": "boolean" },
    "description": { "type": "string" }
  },
  "required": [ "name", "age", "declawed" ]
}
With the addition of "required" to our JSON schema, the JSON in Example 4-6 is valid. This JSON conforms to our JSON schema for "Cat" with the required fields of "name", "age", and "declawed". We are including the optional name-value pair, "description".
{ "name": "Fluffy", "age": 2, "declawed": false, "description" : "Fluffy loves to sleep all day." }
We may also leave out the "description" field, as it’s not included in the list of required fields. The JSON in Example 4-7 conforms to our JSON Schema for "Cat" with the required fields of "name", "age", and "declawed".
{ "name": "Fluffy", "age": 2, "declawed": false }
It is important to note that if you do not include the "required" name-value pair in your JSON schema with the array of required names, then nothing is required. A JSON object with no name-value pairs inside it would be considered valid. Without the array of "required", the JSON in Example 4-8 is considered valid for the "Cat" JSON Schema.
{}
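The effect of the "required" array can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `missing_required` is a hypothetical helper, not part of any JSON Schema library, and the schemas are trimmed down to the parts that matter for this one check.

```python
def missing_required(schema, document):
    """Return the names listed under "required" that the document lacks."""
    return [name for name in schema.get("required", []) if name not in document]


# Only the parts of the "Cat" schema that matter for this check.
cat_schema = {"title": "Cat", "required": ["name", "age", "declawed"]}
cat_schema_without_required = {"title": "Cat"}

fluffy = {"name": "Fluffy", "age": 2, "declawed": False}

print(missing_required(cat_schema, fluffy))               # []
print(missing_required(cat_schema, {}))                   # ['name', 'age', 'declawed']
print(missing_required(cat_schema_without_required, {}))  # []
```

The last line shows the behavior described above: when the schema has no "required" array, even an empty object has nothing missing.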
The third and final question we can answer with our JSON schema is: are the values in the format I require? We answered the question about the data types of our values, but we often need a specific format for the type. For example, I require a username, but the username should not exceed 20 characters. Additionally, I might ask you to think of a number between 10 and 100. We can express these specific requirements in our JSON schema.
In the cat JSON, we have requirements such as name being a string and age being a number. However, we do not want someone giving us data with a really long cat name, a really short cat name, or a negative number for the cat’s age. In our JSON schema, we can define a minimum length and a maximum length for a string, and a minimum for a number.
In Example 4-9, validation has been added to ensure that the cat’s name is a minimum of 3 characters and a maximum of 20 characters. Additionally, we ensure that the age of the cat submitted is not a negative number.
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Cat",
  "properties": {
    "name": { "type": "string", "minLength": 3, "maxLength": 20 },
    "age": { "type": "number", "description": "Your cat's age in years.", "minimum": 0 },
    "declawed": { "type": "boolean" },
    "description": { "type": "string" }
  },
  "required": [ "name", "age", "declawed" ]
}
The JSON in Example 4-10 is not valid with the "Cat" JSON Schema because the name value exceeds the "maxLength", and the age value is below the "minimum".
{
  "name": "Fluffy the greatest cat in the whole wide world",
  "age": -2,
  "declawed": false,
  "description": "Fluffy loves to sleep all day."
}
The JSON in Example 4-11 is valid with the cat JSON Schema and conforms to the requirements for the values.
{ "name": "Fluffy", "age": 2, "declawed": false, "description" : "Fluffy loves to sleep all day." }
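Here is a rough Python sketch of how the "minLength", "maxLength", and "minimum" rules from Example 4-9 might be applied to the two cats above. The function name `format_errors` and the wording of its messages are hypothetical, and the sketch ignores every other keyword in JSON Schema.

```python
def format_errors(schema, document):
    """Return one message per value that breaks a length or minimum rule."""
    errors = []
    for name, rules in schema.get("properties", {}).items():
        if name not in document:
            continue
        value = document[name]
        if isinstance(value, str):
            if "minLength" in rules and len(value) < rules["minLength"]:
                errors.append(f"{name}: shorter than minLength {rules['minLength']}")
            if "maxLength" in rules and len(value) > rules["maxLength"]:
                errors.append(f"{name}: longer than maxLength {rules['maxLength']}")
        # Python treats True/False as integers, so rule them out as numbers.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and "minimum" in rules and value < rules["minimum"]:
            errors.append(f"{name}: less than minimum {rules['minimum']}")
    return errors


# The "properties" of the schema in Example 4-9, written as a Python dict.
cat_schema = {
    "properties": {
        "name": {"type": "string", "minLength": 3, "maxLength": 20},
        "age": {"type": "number", "minimum": 0},
        "declawed": {"type": "boolean"},
        "description": {"type": "string"},
    }
}

# The JSON from Example 4-10 and Example 4-11.
boastful_fluffy = {
    "name": "Fluffy the greatest cat in the whole wide world",
    "age": -2,
    "declawed": False,
    "description": "Fluffy loves to sleep all day.",
}
fluffy = {
    "name": "Fluffy",
    "age": 2,
    "declawed": False,
    "description": "Fluffy loves to sleep all day.",
}

print(format_errors(cat_schema, boastful_fluffy))
# ['name: longer than maxLength 20', 'age: less than minimum 0']
print(format_errors(cat_schema, fluffy))  # []
```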
If we return to the comparison of a schema to a contract, you can see that the details of our contract can be very specific. The examples provided in this chapter are introductory and just the tip of the iceberg. JSON Schema even supports regular expressions (character patterns, such as an email address format) and enum (a list of possible values). If you wish to become a master of JSON Schema, visit the following pages, where you can find links to the specifications:
There is a long and growing list of JSON Schema libraries and projects for specific programming languages and frameworks. A quick Google search of “JSON Schema Validation [insert programming language name here]” should get you what you need if you’d like to integrate JSON Schema validation into a project. Additionally, there are a few online validators, which are programming language agnostic and great for experimenting with JSON Schema:
If I go to the JSON Schema Lint website, I will be presented with two text areas: one for the JSON schema, and another for the JSON document to be validated. If I paste in the schema from Example 4-9 and the JSON from Example 4-10, I will see the following errors:
If I go to the JSON Schema Validator website, I am also presented with the same two text areas. Once again, if I paste in the schema from Example 4-9 and the JSON from Example 4-10, I will see errors. Additionally, the line numbers of the JSON will display a red x, showing us where the errors are in the JSON.
The JSON Schema Validator not only points us to the line numbers where the error takes place, but also gives us the paths to the schema requirements that are causing the validation to fail. The two validators may have described the errors a bit differently, but both found the same errors.
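To give a feel for what "paths to the schema requirements" means, the sketch below pairs each failure with the place in the data and the place in the schema that caused it. This is my own simplified illustration in Python, not the output format of either website: `validate_with_paths` is a hypothetical function, the path strings are an assumed notation, and type checks are left out to keep it short.

```python
import json


def validate_with_paths(schema, document):
    """Return (data_path, schema_path) pairs, one per failed requirement."""
    failures = []
    for name in schema.get("required", []):
        if name not in document:
            failures.append(("/", "/required"))
    for name, rules in schema.get("properties", {}).items():
        if name not in document:
            continue
        value = document[name]
        data_path = "/" + name
        schema_path = "/properties/" + name
        is_string = isinstance(value, str)
        # Python treats True/False as integers, so rule them out as numbers.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_string and "minLength" in rules and len(value) < rules["minLength"]:
            failures.append((data_path, schema_path + "/minLength"))
        if is_string and "maxLength" in rules and len(value) > rules["maxLength"]:
            failures.append((data_path, schema_path + "/maxLength"))
        if is_number and "minimum" in rules and value < rules["minimum"]:
            failures.append((data_path, schema_path + "/minimum"))
    return failures


# The schema from Example 4-9 and the JSON from Example 4-10,
# parsed from text the same way they would be pasted into a validator.
cat_schema = json.loads("""
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Cat",
  "properties": {
    "name": { "type": "string", "minLength": 3, "maxLength": 20 },
    "age": { "type": "number", "description": "Your cat's age in years.", "minimum": 0 },
    "declawed": { "type": "boolean" },
    "description": { "type": "string" }
  },
  "required": [ "name", "age", "declawed" ]
}
""")
boastful_fluffy = json.loads("""
{
  "name": "Fluffy the greatest cat in the whole wide world",
  "age": -2,
  "declawed": false,
  "description": "Fluffy loves to sleep all day."
}
""")

for data_path, schema_path in validate_with_paths(cat_schema, boastful_fluffy):
    print(data_path, "fails", schema_path)
# /name fails /properties/name/maxLength
# /age fails /properties/age/minimum
```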
This chapter covered the following key term:
We also discussed these key concepts: