CHAPTER 4

Introducing JSON

The JavaScript Object Notation data format, or JSON for short, is derived from the literals of the JavaScript programming language. This makes JSON a subset of the JavaScript language. As a subset, JSON does not possess any features that the JavaScript language itself does not already possess. Although JSON is a subset of a programming language, it is not itself a programming language but, rather, a data interchange format.

JSON is known as the data interchange standard, which subtextually implies that it can be used as the data format wherever the exchange of data occurs. A data exchange can occur between both browser and server and even server to server, for that matter. Of course, these are not the only possible means to exchange JSON, and to leave it at those two would be rather limiting.

History

JSON is attributed to Douglas Crockford. While Crockford admits that he was not the first to have realized the data format,1 he did provide it with a name and a formalized grammar within RFC 4627. The RFC 4627 formalization, written in 2006, introduced the world to the registered Internet media type application/json and the file extension .json, and defined JSON’s composition. In December 2009, JSON became a built-in aspect of the ECMAScript standard with ECMA-262, 5th edition, and in October 2013, it was recognized as an ECMA standard in its own right, ECMA-404.

The Internet Engineering Task Force (IETF), which published RFC 4627, has since published a revised JSON standard, RFC 7159, which strives to clean up the original specification. A notable difference among these specifications lies in what qualifies as a valid JSON text: RFC 4627 states that a valid JSON text must encompass any JSON values within an initial object or array, whereas both the ECMA standard and RFC 7159 allow a valid JSON text to appear in the form of any recognized JSON value. You will learn more about the valid JSON values when we explore the structure of JSON.
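To make the question of what counts as a complete JSON text concrete, here is a minimal sketch that assumes a runtime providing the built-in JSON.parse (parsing is covered properly in Chapter 6). The helper isStructuralText is a hypothetical name introduced only for illustration; it models a stricter consumer that insists on an object or array at the top level.

```javascript
// JSON.parse accepts any JSON value as a complete JSON text.
const parsedString = JSON.parse('"abc"'); // a lone string
const parsedNumber = JSON.parse('8');     // a lone number
const parsedNull   = JSON.parse('null');  // the null literal

// Hypothetical helper: true only when the top-level value is an object or an array.
function isStructuralText(text) {
  const value = JSON.parse(text);
  return value !== null && typeof value === 'object';
}

console.log(isStructuralText('[1,2]'));        // true
console.log(isStructuralText('{"abc":"123"}')); // true
console.log(isStructuralText('"abc"'));        // false
```

A consumer that needs to accept texts from either kind of producer can parse first and apply a check of this sort afterward.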

It is important to remember, as we get further into the structure of JSON, that as a subset of JavaScript, it remains subject to the same set of governing rules defined by the ECMA-262 standardization. You can feel free to read about the latest specification at the following URL: www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf. At the time of writing, the current edition of the ECMA-262 standard is 5.1; however, 6 is just around the corner.

Note  While edition 5.1 is today’s current standard, at the time of JSON’s formalization, the ECMA-262 standard was only in edition 3.

Crockford documented JSON’s grammar on http://json.org in 2001, and soon word began to spread that there was an alternative to the XML data format. With the widespread adoption of Ajax (Asynchronous JavaScript and XML), JSON’s popularity began to soar, as people began to note how easy it was to implement and how well it rivaled XML. You would think that Ajax would have enforced the adoption of XML, as the x within the acronym strictly refers to XML. However, being modeled after SGML, a document format, XML possesses qualities that make it very verbose, which is not ideal for data transmission. One of the reasons JSON has become the de facto data format of the Web, as you will shortly see in the upcoming section, is its grammatical simplicity, which allows JSON to be highly interoperable.

JSON Grammar

JSON, in a nutshell, is a textual representation defined by a small set of governing rules in which data is structured. The JSON specification states that data can be structured in either of the two following compositions:

  1. A collection of name/value pairs
  2. An ordered list of values

Composite Structures

As the origins of JSON stem from the ECMAScript standardization, the implementations of the two structures are represented in the forms of the object and array. Crockford outlines the two structural representations of JSON through a series of syntax diagrams. As I am sure you will agree, these diagrams resemble train tracks from a bird’s-eye view and thus are also referred to as railroad diagrams. Figure 4-1 illustrates the grammatical representation for a collection of string/value pairs.


Figure 4-1. Syntax diagram of a string/value pair collection

As the diagram outlines, a collection begins with the use of the opening brace ({), and ends with the use of the closing brace (}). The content of the collection can be composed of any of the following possible three designated paths:

  • The top path illustrates that the collection can remain devoid of any string/value pairs.
  • The middle path illustrates that our collection can be that of a single string/value pair.
  • The bottom path illustrates that after a single string/value pair is supplied, the collection needn’t end but, rather, allow for any number of string/value pairs, before reaching the end. Each string/value pair possessed by the collection must be delimited or separated from one another by way of a comma (,).

Note  String/value is equivalent to key/value pairs, with the exception that said keys must be provided as strings.

An example of each railroad path for a collection of string/value can be viewed within Listing 4-1. The structural characters that identify a valid JSON collection of name/value pairs have been provided emphasis.

Listing 4-1. Examples of Valid Representations of a Collection of Key/Value Pairs, per JSON Grammar

//Empty Collection Set
{};
//Single string/value pair
{"abc":"123"};
//Multiple string/value pairs
{"captainsLog":"starDate 9522.6","message":"I've never trusted Klingons, and I never will."};
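As a quick check of the three paths, and assuming a runtime that provides the built-in JSON.parse, each collection can be handed to the parser as text. The trailing semicolons shown in the listing are not part of the JSON text itself, so they are omitted in this sketch.

```javascript
// Top path: a collection with no string/value pairs.
const emptyCollection = JSON.parse('{}');

// Middle path: a single string/value pair.
const singlePair = JSON.parse('{"abc":"123"}');

// Bottom path: several pairs, delimited by commas.
const multiplePairs = JSON.parse(
  '{"captainsLog":"starDate 9522.6","message":"I\'ve never trusted Klingons, and I never will."}'
);

console.log(Object.keys(emptyCollection).length); // 0
console.log(singlePair.abc);                      // 123
console.log(Object.keys(multiplePairs).length);   // 2
```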

Figure 4-2 illustrates the grammatical representation for that of an ordered list of values. Here we can witness that an ordered list begins with the use of the open bracket ([) and ends with the use of the close bracket (]).


Figure 4-2. Syntax diagram of an ordered list

The values that can be held within each index are outlined by the following three “railroad” paths:

  • The top path illustrates that our list can remain devoid of any value(s).
  • The middle path illustrates that our ordered list can possess a singular value.
  • The bottom path illustrates that the length of our list can possess any number of values, which must be delimited, that is, separated, with the use of a comma (,).

An example of each railroad path for the ordered list can be viewed within Listing 4-2. The structural tokens that identify a valid JSON ordered list have been emphasized.

Listing 4-2. Examples of Valid Representations of an Ordered List, per JSON Grammar

//Empty Ordered List
[];
//Ordered List of a single value
["abc"];
//Ordered List of multiple values
["0",1,2,3,4,100];
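The same kind of check can be sketched for ordered lists, again assuming the built-in JSON.parse. It also shows two properties worth noticing: the order of the values is preserved, and a list may mix value types, so "0" remains a string while 1 remains a number.

```javascript
const emptyList   = JSON.parse('[]');
const singleValue = JSON.parse('["abc"]');
const manyValues  = JSON.parse('["0",1,2,3,4,100]');

console.log(emptyList.length);     // 0
console.log(singleValue[0]);       // abc
console.log(typeof manyValues[0]); // string
console.log(typeof manyValues[1]); // number
console.log(manyValues[5]);        // 100
```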

You may have found yourself wondering how it came to be that the characters [, ], {, and } represent an array and an object, as illustrated in Listing 4-1 and Listing 4-2. The answer is quite simple. These come directly from the JavaScript language itself. These characters represent the Object and Array quite literally.

As was stated in Chapter 2, both an object and an array can be created in one of two distinct fashions. The first invokes the creation of either, through the use of the constructor function defined by the built-in data type we wish to create. This style of object invocation can be seen in Listing 4-3.

Listing 4-3. Using the new Keyword to Instantiate an object and array

var objectInstantiation = new Object(); //invoking the constructor returns a new Object
var arrayInstantiation = new Array();   //invoking the constructor returns a new Array

The alternative manner, which we can use to create either object or array, is by literally defining the composition of either, as demonstrated in Listing 4-4.

Listing 4-4. Creation of an object and an array via Literal Notation

var objectInstantiation = {}; //creation of an empty object
var arrayInstantiation = []; //creation of an empty array

Listing 4-4 demonstrates how to create both an array and an object, explicitly using JavaScript’s literal notation. However, both instances are absent of any values. While it is perfectly acceptable for an array or object to exist without content, it will be more likely that we will be working with ones that possess values.

Because object literals can be used to design the composition of objects within source code, they can also be provisioned with properties as they are authored. Listing 4-5 should begin to resemble the syntax diagrams we just reviewed.

Listing 4-5. Designing an object and array via Literal Notation with the Provision of Properties

var objectInstantiation = {name:"ben",age:36};
var arrayInstantiation = ["ben",36];

Note  While Listing 4-4 and Listing 4-5 illustrate the creation of objects through the use of literals, JSON uses literals to capture the composition of data.

The JSON data format expresses both objects and arrays in the form of their literal. In fact, JSON uses literals to capture all of the values it supports. The Date object is a notable omission, as it lacks a literal form.
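A small sketch of the Date case, assuming the built-in JSON.stringify and JSON.parse: because a Date has no literal form, it is written out as a string, and reading the text back yields a string rather than a Date. The property name created is chosen purely for illustration.

```javascript
// The Unix epoch is used only so that the output is predictable.
const stamp = new Date(0);

const text = JSON.stringify({ created: stamp });
console.log(text); // {"created":"1970-01-01T00:00:00.000Z"}

const roundTrip = JSON.parse(text);
console.log(typeof roundTrip.created);          // string
console.log(roundTrip.created instanceof Date); // false
```

Any program that exchanges dates through JSON therefore has to agree on a string (or number) convention and convert on both ends.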

What you may not have noticed, due to its subtlety, is that JavaScript object literals do not require their key identifiers to be explicitly defined as strings. Take, for example, the literal declaration of {name:"ben", age:36}; from Listing 4-5. It could have equally been declared as {"name":"ben", age:36};. Both declarations will create the same object, allowing our program to reference the same name property equally. Consider the code within Listing 4-6.

Listing 4-6. Object Keys Can Be Defined Explicitly or Implicitly As Strings

var objectInstantiationA = {name:"ben",age:36};
var objectInstantiationB = {"name":"ben",age:36};
console.log( objectInstantiationA.name );  // "ben"
console.log( objectInstantiationB.name );  // "ben"

The reason the preceding example works is that, behind the scenes, JavaScript turns every key identifier into a string. That said, it is imperative that the key of every string/value pair be wrapped in double quotes to be considered valid JSON. This is due to the many reserved keywords in JSON’s superset and the fact that the ECMA 3.0 grammar does not allow reserved words (such as true and false) to be used as a key identifier or to the right of the period in a member expression.2 Listing 4-7 demonstrates the first JSON text used to interchange data.3

Listing 4-7. The Very First JSON Message Used by Douglas Crockford

var firstJSON =  {to:"session",do:"test","message":"Hello World"}; //Syntax Error in ECMA 3

However, this JSON text produced an error instantly, due to the use of the reserved keyword do as the property name of a string/value pair. Rather than outlining all words that would then cause such syntax errors, Crockford found it simpler to formalize that all property names must be explicitly expressed as strings.

Note  If you were to run the exact preceding code expecting to arrive at a syntax error, you’ll likely be confused as to why none is thrown. ECMAScript, 5th edition, allows reserved words to be used as property names in object literals and with dot notation. However, the JSON spec continues to account for legacy.
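The quoting rule can be observed with a short sketch, assuming the built-in JSON.parse. The helper parsesAsJSON is a hypothetical name introduced for illustration; it simply reports whether a text is accepted by the parser.

```javascript
// Hypothetical helper: reports whether a text parses as JSON.
function parsesAsJSON(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (error) {
    return false; // JSON.parse throws a SyntaxError for malformed text
  }
}

// Every key wrapped in double quotes: valid, even for the reserved word "do".
console.log(parsesAsJSON('{"to":"session","do":"test","message":"Hello World"}')); // true

// Bare keys are fine in a JavaScript object literal but are not valid JSON.
console.log(parsesAsJSON('{to:"session",do:"test","message":"Hello World"}'));     // false

// Single quotes are not a substitute for double quotes.
console.log(parsesAsJSON("{'name':'ben'}"));                                       // false
```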

JSON Values

As mentioned earlier, JSON is a subset of JavaScript and does not add anything that the JavaScript language does not possess. So, naturally, the values that can be utilized within our JSON structures are represented by types, as outlined within the 3rd edition of the ECMA standard. JSON makes use of four primitive types and two structured types.

The next figure in succession, Figure 4-3, defines the possible values that can be substituted where the term value appears in Figures 4-1 and 4-2. A JSON value can only be a string, number, object, array, true, false, or null. The latter three must remain lowercased, lest you invoke a parsing error. While Figure 4-3 does not clearly demonstrate it, all JSON values can be preceded and succeeded by whitespace, which greatly assists in the readability of the language.


Figure 4-3. Syntax diagram illustrating the possible values in JSON
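Both points about the three literal names, that they must be lowercase and that surrounding whitespace is permitted, can be sketched with the built-in JSON.parse (assumed to be available in the runtime).

```javascript
// Whitespace before and after a value is permitted.
console.log(JSON.parse(' true '));     // true
console.log(JSON.parse('\n null \t')); // null

// The literal names are case-sensitive, so "True" is rejected.
let caught = null;
try {
  JSON.parse('True');
} catch (error) {
  caught = error;
}
console.log(caught instanceof SyntaxError); // true
```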

String literals in the JavaScript language can possess any number of Unicode characters enclosed within either single or double quotes. However, it is important to note, as outlined in Figure 4-4, that a JSON string must always begin and end with double quotes. While Crockford does not justify this, it is for reasons of interoperability. The C programming grammar states that single quotes identify a single character, such as 'a' or 'z'. A double quote, on the other hand, represents a string literal. While Figure 4-4 appears verbose, there are only four possible paths.


Figure 4-4. Syntax diagram of the JSON string value

  • The topmost path illustrates that our string literal can be absent of any Unicode characters.
  • The middle path illustrates that our string can possess any Unicode characters (represented in literal form), except for the following: the quotation mark ("), the reverse solidus (\), and the control characters.
  • The last several paths illustrate that we can insert into our string control characters with the use of a reverse solidus (\) character preceding it. Additionally, the bottommost rung specifies that any character can be defined in its Unicode representation, as \u followed by four hexadecimal digits. To indicate that the u character is used to identify a Unicode value, it, too, must be escaped with the preceding reverse solidus.
  • The second topmost path represents our loop, which allows the addition of any of the outlined characters.

Listing 4-8 demonstrates a variety of valid string values.

Listing 4-8. Examples of Valid String Values As Defined by the JSON Grammar

//absent of unicode
"";
//random unicode characters
"⊠"; or "Ω ⊠";
//use of escaped character to display double quotes;
" \" \" ";
//use of \u denotes a unicode value
"\u22A0"; // outputs ⊠
//a series of valid unicode as defined by the grammar
"\u22A0 ⊠ \" ";

A reverse solidus, better known as a backslash (\), is used to demarcate characters as having an alternate meaning. Without the use of the \, the lexer might interpret as a token what is intended to be used as a string, or vice versa. Escaping characters offers us the ability to inform the lexer to handle a character in a manner that is different from its “normal” behavior. Table 4-1 illustrates the use of the escaped literals for the prohibited characters.

Table 4-1. Escaped Literals

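Escaped Literal   Represents
\"                quotation mark (U+0022)
\\                reverse solidus (U+005C)
\/                solidus (U+002F)
\b                backspace (U+0008)
\f                form feed (U+000C)
\n                line feed (U+000A)
\r                carriage return (U+000D)
\t                horizontal tab (U+0009)
\uXXXX            the character identified by four hexadecimal digits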
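The escaped literals can be exercised with a brief sketch, assuming the built-in JSON.parse and ES2015 template literals. String.raw is used so that each backslash reaches the parser exactly as typed, rather than being consumed first by JavaScript's own string escaping.

```javascript
// A Unicode escape is decoded into the single character it identifies.
const squared = JSON.parse(String.raw`"\u22A0"`);
console.log(squared === '\u22A0'); // true
console.log(squared.length);       // 1

// An escaped quotation mark does not end the string.
const quoted = JSON.parse(String.raw`"say \"hi\""`);
console.log(quoted); // say "hi"

// The escaped literal \t becomes a tab character once parsed.
const tabbed = JSON.parse(String.raw`"a\tb"`);
console.log(tabbed === 'a\tb'); // true

// A raw (unescaped) control character inside a JSON string is rejected.
let rawTabRejected = false;
try {
  JSON.parse('"a\tb"'); // here JavaScript has already turned \t into a real tab
} catch (error) {
  rawTabRejected = error instanceof SyntaxError;
}
console.log(rawTabRejected); // true
```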

The last value to discuss is that of the number. A number in JSON is the arrangement of base10 literals, in combination with mathematical notation to define a real number literal. Figure 4-5 addresses the syntactical grammar of the JSON number in great detail; however, it’s rather simple when we view it step-by-step.


Figure 4-5. Syntax diagram of a JSON number

The first thing to note is that the number’s grammar does not begin or end with any particular symbolic representation, as our earlier object, array, and string examples did.

As illustrated in Figure 4-5, a JSON number must adhere to the following rules:

  1. The number literal will be implicitly positive, unless explicitly indicated as a negative value.
  2. Numbers cannot possess superfluous leading 0’s.
  3. Can be in the form of a whole number
    3.1.  Made up of a single base10 numeric literal (0-9)
    3.2.  Made up of any number of base10 numeric literals (0-9)
  4. Can be in the form of a fraction, which must follow a whole-number part
    4.1.  Made up of a single base10 numeric literal at the tenths placement
    4.2.  Made up of any number of base10 numeric literals beyond the decimal point
  5. Can possess the exponential demarcation literal
    5.1.  E notation can be expressed in the form of an uppercase “E” or lowercase “e”
    5.2.  Immediately followed by an optionally signed sequence of one or more base10 numeric literals (0-9)

Listing 4-9 reveals both valid and invalid numerical values as defined by the JSON grammar.

Listing 4-9. Valid and Invalid Numerical Values

-0.01   //valid use of 0's
 00.1   //superfluous 0 produces a SyntaxError
 1/3    //an expression, not a number literal; produces a SyntaxError
 0.3333333333333333 //decimal form
 1.2e-1 //scientific notation
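The number rules can be tested directly. The following sketch assumes the built-in JSON.parse; isJSONNumber is a hypothetical helper name, and the candidate texts are chosen only to exercise each rule.

```javascript
// Hypothetical helper: true when the text is accepted as a JSON number.
function isJSONNumber(text) {
  try {
    return typeof JSON.parse(text) === 'number';
  } catch (error) {
    return false; // malformed numbers raise a SyntaxError
  }
}

const candidates = ['-0.01', '0.1', '1.2e-1', '1E3', '00.1', '.5', '+1', '1/3'];

candidates.forEach(function (text) {
  console.log(text, isJSONNumber(text));
});
// The first four are accepted. The last four are rejected: a superfluous
// leading 0, a fraction with no whole-number part, an explicit plus sign,
// and an arithmetic expression.
```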

Any of the values discussed in this chapter can be used in any combination when contained within a composite structure. Listing 4-10 illustrates how they can be mixed and matched. What is necessary is that the JSON grammar covered is followed. The examples in Listing 4-10 demonstrate proper adherence to the JSON grammar to portray data.

Listing 4-10. Examples of JSON Text Containing a Variety of Valid JSON Values

// JSON text of an array with primitives
[
    null,  true, 8
]
// JSON text of an object with two members
{
    "first": "Ben",
    "last": "Smith"
}
// JSON text of an array with nested composites
[
    {  "abc": "123" },
    [ "0", 1,  2, 3, 4, 100 ]
]
//JSON text of an object with nested composites
{
    "object": {
        "array": [true]
    }
}
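Once parsed, nested composites are reached with ordinary property and index access. The sketch below assumes the built-in JSON.parse and also demonstrates one easy mistake to make when writing composites by hand: the grammar requires a value after every comma, so a trailing comma is rejected.

```javascript
const nested = JSON.parse('[ { "abc": "123" }, [ "0", 1, 2, 3, 4, 100 ] ]');
console.log(nested[0].abc); // 123
console.log(nested[1][5]);  // 100

const deep = JSON.parse('{ "object": { "array": [true] } }');
console.log(deep.object.array[0]); // true

// A comma must always be followed by another member.
let trailingCommaRejected = false;
try {
  JSON.parse('{ "first": "Ben", "last": "Smith", }');
} catch (error) {
  trailingCommaRejected = error instanceof SyntaxError;
}
console.log(trailingCommaRejected); // true
```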

JSON Tokens

While the Object and Array are conventions used in JavaScript, the language itself, like many programming languages, borrowed from the C language in one form or another. While not every language explicitly implements Arrays and Objects akin to JavaScript, they do often possess the means to model collections of key/value pairs and ordered lists. These may take on the form of hash maps, dictionaries, hash tables, vectors, collections, and lists. Furthermore, most languages will be capable of working with text, which is precisely what JSON is based on.

At the end of the day, JSON is nothing more than a sequence of Unicode characters. However, the JSON grammar standardizes which Unicode characters or “tokens” define valid JSON, in addition to demarcating the values contained within.

Therefore, when regarding the interchange of JSON and the many languages that do not natively possess Objects and Arrays, the tokens that make up the JSON text are all that is required to interpret if any collections or ordered lists exist and apply all values in a manner required of that language. This is accomplished with six structural characters, as listed in Table 4-2.

Table 4-2. Six Structural Character Tokens

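Token   Unicode   Role
[       U+005B    begin array
{       U+007B    begin object
]       U+005D    end array
}       U+007D    end object
:       U+003A    name separator
,       U+002C    value separator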
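To illustrate how the structural characters alone outline a text's shape, here is a minimal sketch of a scanner. The function structuralTokens is hypothetical and deliberately simplified: it assumes the input is already well-formed JSON, and it only distinguishes structural characters from characters that sit inside a string value.

```javascript
// Hypothetical helper: lists the structural characters of a JSON text,
// skipping over anything that appears inside a string.
function structuralTokens(text) {
  const structural = '[]{}:,';
  const found = [];
  let inString = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === '\\') {
        i++;              // skip the character that follows a reverse solidus
      } else if (ch === '"') {
        inString = false; // closing quotation mark
      }
    } else if (ch === '"') {
      inString = true;    // opening quotation mark
    } else if (structural.indexOf(ch) !== -1) {
      found.push(ch);
    }
  }
  return found.join(' ');
}

console.log(structuralTokens('{"a":[1,"x,y:{z}"]}')); // { : [ , ] }
```

Notice that the comma, colon, and braces inside the string "x,y:{z}" are not reported; within a string they are merely characters, not tokens.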

One point to note is that JSON will ignore all insignificant whitespace before or after the preceding six structural tokens. Table 4-3 illustrates the four whitespace character tokens.

Table 4-3. Four Whitespace Character Tokens

Token             Unicode
horizontal tab    U+0009
line feed         U+000A
carriage return   U+000D
space             U+0020
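The effect of insignificant whitespace can be sketched as follows, assuming the built-in JSON.parse and JSON.stringify. The two texts differ only in the whitespace placed around the structural tokens, so they describe the same data. Whitespace inside a string, by contrast, is part of the value.

```javascript
const compact = '{"name":"ben","tags":["a","b"]}';
const spaced  = ' {\n\t"name" : "ben" ,\r\n\t"tags" : [ "a" , "b" ]\n} ';

// Parsing the spaced text and writing it back out yields the compact text.
console.log(JSON.stringify(JSON.parse(spaced)) === compact); // true

// Whitespace within the quotation marks is significant and is preserved.
const padded = JSON.parse('{"name":" ben "}');
console.log(padded.name.length); // 5
```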

Because JSON is nothing more than text, you may find it rather difficult to determine whether your JSON is properly formatted or not. Furthermore, if the syntax is inaccurate to the grammar specified, then you will find that your malformed JSON causes code to come to a halt. This would be due to the syntax error that would be uncovered at the time of trying to parse said JSON. You will learn about parsing in Chapter 6.
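Until parsing is covered properly, the following sketch shows one way to keep malformed JSON from halting a program. It assumes the built-in JSON.parse; tryParse is a hypothetical helper name, and the shape of the object it returns is merely one possible design.

```javascript
// Hypothetical helper: wraps JSON.parse so that a syntax error is
// reported as data rather than stopping the program.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (error) {
    return { ok: false, reason: error.name };
  }
}

const good = tryParse('{"abc":"123"}');
console.log(good.ok);        // true
console.log(good.value.abc); // 123

const bad = tryParse('{"abc":"123"'); // the closing brace is missing
console.log(bad.ok);         // false
console.log(bad.reason);     // SyntaxError
```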

For this reason, any attempt to devise JSON by hand should be performed with the aid of an editor. The following list of JSON editors understand the JSON grammar and are able to offer some much needed and immediate validation.

The first editor, http://jsoneditoronline.org/, adheres to the ECMA-262 standardization and, therefore, allows your JSON text to represent a singular primitive value. The latter, by contrast, follows the stricter rule of the original RFC 4627, thus requiring a JSON text to represent a structural value, i.e., an array or object literal. It should be made known that the two editors mentioned previously are not the only two in existence. There are many online and offline editors, each with its own nuances. I favor the two mentioned, for their convenience.

Summary

In this chapter, I covered the history of JSON and the specifications of the JSON data format that define the grammar of a valid JSON text. You learned that JSON is a highly interoperable format for data interchange. This is achieved via the standardization of a simple grammar that can be translated into any language simply by understanding the grammar.

As was demonstrated in this chapter, we can use the JSON grammar in conjunction with predetermined data to create JSON. Because we are simply working with text, it will be helpful to rely on an editor that understands JSON’s grammar, for validation purposes. However, JSON can be written with a basic text editor and saved as a JSON document, using the file extension .json. Furthermore, as a subset of JavaScript, JSON can even be hard-coded within a JavaScript file directly. Both methods are ideal for devising configuration files for an application.

The next chapter will reveal how we can use the JavaScript language to produce JSON at runtime.

Key Points from This Chapter

  • The array represents an ordered list of values, whereas the object represents a collection of key/value pairs.
  • Unordered collections of key/value pairs are contained within the following opening ({) and closing (}) brace tokens.
  • Ordered lists are encapsulated within opening ([) and closing (]) square bracket tokens.
  • The key of a member must be contained in double quotes.
  • The key of a member and its possessed value must be separated by the colon (:) token.
  • Multiple values within an object or array must be separated by the comma (,) token.
  • Boolean values are represented using lowercase true/false literals.
  • Number values are represented as base10 literals, which JavaScript stores in double-precision floating-point format.
  • Number values can be specified with scientific notation.
  • Control characters must be escaped via the reverse solidus (\) token.
  • Null values are represented as the literal: null.

_________________

1. http://yuiblog.com/yuitheater/crockford-json.m4v.

2. Allen Wirfs-Brock, “ES 3.1 ‘true’ as absolute or relative?” https://mail.mozilla.org/pipermail/es-discuss/2009-April/009119.html, April 9, 2009.

3. http://yuiblog.com/assets/crockford-json.zip.
