© Charles Bell 2018
Charles BellIntroducing the MySQL 8 Document Storehttps://doi.org/10.1007/978-1-4842-2725-1_3

3. JSON Documents

Charles Bell1 
(1)
Warsaw, Virginia, USA
 

Now that we have the MySQL server installed, we can begin to learn more about what the document store is and how we can begin to work with it. The core concept is JavaScript Object Notation (JSON) documents. What we discover is that MySQL has two ways to work with JSON documents: a pure NoSQL document store mechanism complete with a full developer application programming interface and a very cool integration of JSON with relational databases.

The origins of the MySQL document store lie in several technologies that are leveraged together to form the document store. In particular, Oracle has combined a key, value mechanism with a new data type, a new programming library, and a new access mechanism to create what is now the document store. As we learned in Chapter 1, not only does this allow us to use MySQL with a NoSQL interface, it also allows us to build hybrid solutions that leverage the stability and structure of relational data while adding the flexibility of JSON documents.

In this chapter, we learn how MySQL supports JSON documents including how to add, find, update, and remove data (commonly referred to as create, read, update, and delete, respectfully). We begin with more information about the concepts and technologies you will encounter throughout this book. We then move on to learning more about the JSON data type and the JSON functions in the MySQL server. Although this chapter focuses on using JSON with relational data, a firm foundation on how to use JSON is required to master the MySQL document store NoSQL interface—the X Developer API (X DevAPI).

Let’s begin with a review of the concepts and technologies we will encounter when working with the document store and JSON in MySQL.

Concepts and Technologies: Jargon Explained

As we learned in Chapter 1, there are several new concepts and technologies and associated jargon to navigate to learn how to use the document store in MySQL. We encountered some of these terms in Chapter 1, but we explore them in a little more detail here in the context of MySQL. That is, we see how these concepts and technologies explain what comprises the JSON data type and document store interface. Let’s begin with most basic concept that JSON uses: key, value mechanisms.

Origins: Key, Value Mechanisms

As with most things in this world, nothing is truly new in the sense that it is completely original without some form of existence that came before and is typically built from existing technologies applied in novel ways. Key, value mechanisms are a prime example of a base technology. I use the term, mechanism, because the use of the key allows you to access the value.

When we say key, value we mean there exists some tag (normally a string) that forms the key and each key is associated with a value. For example, "name":"George" is an example where key (name) has a value (George). Although the values in a key, value store are normally short strings, values can be complex: numeric; alphanumeric; lists; or even nested key, value sets.

Key, value mechanisms are best known for being easy to use programmatically while still retaining readability. That is, with diligent use of whitespace, a complex nested key, value data structure can be read by humans. The following shows one example formatted in a manner how developers would format code. As you can see, it is very easy to see what this set of key, values are storing: name, address, and phone numbers.

{ "name": {
    "first":"George",
    "last":"Folger"
  },
  "phones": [
    {
      "work":"555-1212"
    },
    {
      "cell":"555-2121"
    }
  ],
  "address": {
    "street":"123 Main Street",
    "city":"melborne",
    "state":"California",
    "zip":"90125"
  }
}

Recall from Chapter 1, we saw some examples of these constructs. Now we know how and why they are constructed.

One example of a key, value mechanism (or storage) is Extensible Markup Language (XML) , which has been around for some time. The following is a simple example of XML using the data above. It is the result of a SQL SELECT query with the output (rows) shown in XML format.1 Note how XML uses tags like HTML (because it is derived from HTML) along with the key, value storage of the data. Here, the keys are <row>, <field> and the values the contents between the start and end tag symbols (<field> </field>).

<?xml version="1.0"?>
<resultset statement="select * from thermostat_model limit 1;" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <row>
    <field name="model_id">acme123</field>
    <field name="brand">Lennox</field>
  </row>
</resultset>

There are systems designed around key, value mechanisms (called key, value or relational stores) such as the Semantic Web.2 In short, the Semantic Web is an attempt to leverage associations of data to describe things, events, and so forth. Sometimes the terms relation store or triple store are used to describe the types of storage systems employed. There are several forms of key, value mechanisms used in the Sematic Web including Resource Description Framework (RDF), Web Ontology Language (OWL), and XML.

There are other examples of key, value mechanisms but the one most pertinent to the document store is JSON.

JSON

I gave a brief description of JSON in Chapter 1. Recall that JSON is a human and machine readable data exchange format. It is also platform independent, which means that there are no concepts of the format that prohibit it from being used in almost any programming language. In addition, JSON is a widely popular format that is used on the Internet.

JSON allows you to describe data in any way you want to without forcing any structure. In fact, you can format (lay out) your data any way you want to. The only real restriction is the proper use of the descriptors (curly braces, square brackets, quotes, commas, etc.) that must be aligned and in some cases paired correctly. When supported in programming languages, developers can easily read the data by accessing it via the keys. Better still, developers don’t need to know what the keys are (but it helps!) because they can use the language support mechanisms to get the keys and iterate over them. In this way, like XML, the data is self-describing.

Now let’s look at another key component of the document store—the NoSQL interface starting with the programming library.

Application Programming Interface

An application programming interface (API) , sometimes simply called a library or programming library, is a set of classes and methods that support operations for one or more capability. These capabilities, through the classes and methods, allow a programmer to use the classes and methods to perform various tasks.

For example, when we use any application with a graphical user interface on our phone, tablet, or computer, the application was built using one of several APIs. The graphical user interface itself was built using one or more APIs that encapsulated a set of classes and methods for drawing windows, creating buttons, and so forth—all of the things that the graphical user interface was engineered to provide for developers.

In the case of the MySQL document store, we use the X DevAPI to access the server through a set of classes and methods that provide connectivity to the server, abstractions of concepts (such as collections, tables, SQL operations), and more. As we learned earlier, the X DevAPI is also built on several other technologies including the X Protocol enabled through the X Plugin. These technologies are combined for a NoSQL interface to the MySQL server.

NoSQL Interface

There are several sometimes conflicting definitions (if not examples) of NoSQL. For the purpose of this book and MySQL in general, a NoSQL interface is an API that does not require the use of SQL statements to access data. The API itself provides the connection to the server as well as classes and methods for creating, retrieving, updating, and deleting data.

For example, if you want to fetch all the data that meets a specific criterion, you must first create a connection to the server, request access to the object containing the data, and then fetch the data. Each of these steps requires creating object instances and calling the methods for those object instances to manipulate the API.

In contrast, the normal mechanism used to interact with MySQL is through a SQL interface in which you must form all your interactions with objects and data with strictly formatted SQL commands. You issue the command and read the results. If you want to write an application that uses the SQL interface, say for getting data, you must use commands to search for the data then convert results into internal programming structures making the data seem like an auxiliary component rather than an integral part of the solution.

NoSQL interfaces break this mold by allowing you to use APIs to work with the data. More specific, you use programming interfaces rather than command-based interfaces.

It is at this point that you’re wondering about how MySQL handles the hybrid option of using JSON documents with relational data. In basic terms, MySQL has been designed to permit storing and retrieving JSON documents in the relational data (via the SQL interface). That is, the server has been modified to handle the JSON document. There is also a set of functions that allows you to do all manner of things with the JSON data making it easy to use JSON via the SQL interface.

However, you also can use JSON documents via the NoSQL X DevAPI either through an SQL command or as a pure document store using the special classes and methods of the X DevAPI. We will learn more about the X DevAPI in Chapter 5.

Document Store

A document store (also known as a document-oriented database) is a storage and retrieval system for managing semistructured data (hence documents). Modern document store systems support a key, value construct such as those found in XML and JSON. Document store systems are therefore sometimes considered a subclass of key, value storage systems.

Document store systems also are commonly accessed by a NoSQL interface implemented as a programming interface (API) that permits developers to incorporate the storage and retrieval of documents in their programs without need of a third-party access mechanism (the API implements the access mechanism). Indeed, the metadata that describes the data is embedded with the data itself. Roughly, this means the keys and the layout (arrangement or nesting) of the keys form the metadata and the metadata becomes opaque to the storage mechanism. More specific, how the data is arranged (how the document is formed or describes the data) is not reflected in or managed by the storage mechanism. Access to the semistructured data requires accessing the mechanism designed to process the document itself using the NoSQL interface.

These two qualities: semistructured data and NoSQL interfaces are what separate document stores from relational data. Relational data requires structure that is not flexible forcing all data to conform to a specific structure. Data is also grouped together with the same structure and there is often little allowance for data that can vary in content. Thus, we don’t normally see document store accessible via traditional relational data mechanism. That is, until now.

One thing that is interesting about working with the document store is you don’t need to be an expert on JavaScript or Python to learn how to work with the document store. Indeed, most of what you will do doesn’t require mastery of any programming language.3 That is, there are plenty of examples of how to do things so you need not learn all that there is to know about the language to get started. In fact, you can pick up what you need very quickly and then learn more about the language as your needs mature.

Now, let’s dive into what JSON documents are and how we can use them with MySQL.

Introducing JSON Documents

In MySQL 5.7.8 and beyond, we can use the JSON data type to store a JSON document in a column in a table. Recall from Chapter 1 that although it is possible to embed JSON in a TEXT or BLOB field, there are several very good reasons not to but the most compelling reason is because you would have to add the parsing of the data to your program thereby making it more complex and potentially error prone. The JSON data type overcomes this problem in two big ways.
  • Validation: The JSON data type provides document validation. That is, only valid JSON can be stored in a JSON column.

  • Efficient access: When a JSON document is stored in a table, the storage engine packs the data into a special optimized binary format allowing the server fast access to the data elements rather than parsing the data each time it is accessed.

This opens a whole new avenue for storing unstructured data in a structured form (relational data). However, Oracle didn’t stop with simply adding a JSON data type to MySQL. Oracle also added a sophisticated programming interface as well as the concept of storing documents as collections in the database. We’ll see more about these aspects later in the book. For now, let’s see how to use JSON with relational data.

JSON Format Rules

JSON data is formed using strings bracketed with certain symbols. Although we have been discussing key, value mechanisms as they relate to JSON, there are two types of JSON attributes: arrays formed by a comma separated list and objects formed from a set of key, value pairs. You also can nest JSON attributes. For example, an array can contain objects and values in object keys can contain arrays or other objects. The combination of JSON arrays and objects is called a JSON document.

A JSON array contains a list of values separated by commas and enclosed within square brackets ([ ]). For example, the following are valid JSON arrays.

["red", "green", "yellow", "blue"]
[1,2,3,4,5,6]
[true, false, false]

Note that we started and ended the array with square brackets and used a comma to separate the values. Although I did not use whitespace, you can use whitespace and, depending on your programming language, you may be able to also use newlines, tabs, and carriage returns. For example, the following is still a valid JSON array.

["red", 27, "yellow", 4.75, "blue", false]

A JSON object is a set of key, value pairs where each key, value pair is enclosed within open and close curly braces ({ }) and separated by commas. For example, the following are valid JSON objects. Note that the key address has a JSON object as its value.

{"address": {
   "street": "123 First Street",
   "city": "Oxnard",
   "state": "CA",
   "zip": "90122"
}}
{"address": {
    "street":"4 Main Street",
    "city":"Melborne",
    "state":"California",
    "zip":"90125"
}}
{"address": {
    "street":"173 Caroline Ave",
    "city":"Montrose",
    "state":"Georgia",
    "zip":"31505"
}}

JSON arrays are typically used to contain lists of related (well, sometimes) things, and JSON objects are used to describe complex data. JSON arrays and objects can contain scalar values such as strings or numbers, the null literal (just like in relational data), or Boolean literals true and false. Keys must always be strings and are commonly enclosed in quotes. Finally, JSON values can also contain time information (date, time, or datetime). For example, the following shows a JSON array with time values.

["03:22:19.012000", "2016-02-03", "2016-02-03 03:22:19.012000"]

The following section describes how we can use JSON in MySQL. In this case, we are referring to relational data but the formatting of JSON documents is the same in the document store.

Using JSON in MySQL

When used in MySQL, JSON documents are written as strings. MySQL parses any string used in a JSON data type validating the document. If the document is not valid—it’s not a properly formed JSON document—the server will produce an error. You can use JSON documents in any SQL statement where it is appropriate. For example, you can use it in INSERT and UPDATE statements as well as in clauses like the WHERE clause.

Properly formatting JSON documents can be a bit of a challenge for some, especially those not used to formatting data structures in programming or scripting languages. The things to remember most is to balance your quotes, use commas correctly, and balance all curly braces and square brackets. Easy, right? There’s just one thing that can stymie some people: quotes!

When you specify keys and values as strings, you must use the double quote character ("), not the single quote ('). Because MySQL expects JSON documents as strings, you can use the single quote around the entire JSON document, but not within the document itself. Fortunately, MySQL provides a host of special functions that you can use with JSON documents, one of which is the JSON_VALID() function that permits you to check a JSON document for validity. It returns a 1 if the document is valid and a 0 if it is not. The following shows the results of an attempt to validate a JSON document with single quotes for the keys and values versus a properly formatted JSON document with double quotes.

Tip

If you want to use the MySQL Shell for SQL commands, be sure to start in SQL mode (--sql) or you can switch to SQL mode with sql command once the shell is started.

MySQL  localhost:33060+ ssl  JS > sql
Switching to SQL mode... Commands end with ;
MySQL  localhost:33060+ ssl  SQL > SELECT JSON_VALID("{'address': {'street': '123 First Street','city': 'Oxnard','state': 'CA','zip': '90122'}}");
+-------------------------------------------------------------------------+
| JSON_VALID("{'address': {'street': '123 First Street','city': 'Oxnard','state': 'CA','zip': '90122'}}") |
+-------------------------------------------------------------------------+
|                                                                       0 |
+-------------------------------------------------------------------------+
1 row in set (0.00 sec)
MySQL  localhost:33060+ ssl  SQL > SELECT JSON_VALID('{"address": {"street": "123 First Street","city": "Oxnard","state": "CA","zip": "90122"}}');
+-------------------------------------------------------------------------+
| JSON_VALID('{"address": {"street": "123 First Street","city": "Oxnard","state": "CA","zip": "90122"}}') |
+-------------------------------------------------------------------------+
|                                                                       1 |
+-------------------------------------------------------------------------+
1 row in set (0.00 sec)

Note that the string with the double quotes inside is valid but not the one with single quotes. This is what most people stumble over first when working with JSON.

Let’s look at how to use the JSON document in SQL statements. Suppose we wanted to store the addresses listed previously in a table. For this example, we keep it simple and insert the data in a very simple table. Listing 3-1 shows a transcript of the exercise starting with creating a test table then inserting the first two addresses.

Tip

You can use the G command that is appended to an SQL command to display the result in a vertical format to make it easier to read.

MySQL  localhost:33060+ ssl  Py > sql
Switching to SQL mode... Commands end with ;
 MySQL  localhost:33060+ ssl  SQL > CREATE DATABASE `test`;
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > USE `test`;
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > CREATE TABLE `test`.`addresses` (`id` int(11) NOT NULL AUTO_INCREMENT, `address` json DEFAULT NULL, PRIMARY KEY (`id`)) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > INSERT INTO `test`.`addresses` VALUES (NULL, '{"address": {"street": "123 First Street","city": "Oxnard","state": "CA","zip": "90122"}}');
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > INSERT INTO `test`.`addresses` VALUES (NULL, '{"address": {"street":"4 Main Street","city":"Melborne","state":"California","zip":"90125"}}');
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT * FROM `test`.`addresses` G
*************************** 1. row ***************************
     id: 1
address: {"address": {"zip": "90122", "city": "Oxnard", "state": "CA", "street": "123 First Street"}}
*************************** 2. row ***************************
     id: 2
address: {"address": {"zip": "90125", "city": "Melborne", "state": "California", "street": "4 Main Street"}}
2 rows in set (0.00 sec)
Listing 3-1

Using JSON with SQL Statements

Note that in the CREATE statement we used the data type JSON. This signals MySQL to allocate special storage mechanisms in the storage engine for handling JSON. Contrary to some reports, the JSON data type is not simply direct storage of a string. On the contrary, it is organized internally to optimize retrieval of the elements. Thus, it is very important that the JSON be formatted correctly. You can have multiple JSON columns in a table. However, the sum of the JSON documents in a table row is limited to the value of the variable max_allowed_packet.

Note

JSON columns cannot have a default value like other columns (data types) in a table.

Now, let’s see what happens if we use an invalid JSON document (string) in the SQL statement. The following shows an attempt to insert the last address from the previous example only without the correct quotes around the keys. Note the error thrown.

MySQL  localhost:33060+ ssl  SQL > INSERT INTO test.addresses VALUES (NULL, '{"address": {street:"173 Caroline Ave",city:"Monstrose",state:"Georgia",zip:31505}}');
ERROR: 3140: Invalid JSON text: "Missing a name for object member." at position 13 in value for column 'addresses.address'.

You can expect to see errors like this and others for any JSON document that isn’t formatted correctly. If you want to test your JSON first, use the JSON_VALID() function. However, there are two other functions that may also be helpful when building JSON documents: JSON_ARRAY() and JSON_OBJECT().

The JSON_ARRAY() function takes a list of values and returns a valid formatted JSON array. The following shows an example. Note that it returned a correctly formatted JSON array complete with correct quotes (double instead of single) and the square brackets.

MySQL  localhost:33060+ ssl  SQL > SELECT JSON_ARRAY(1, true, 'test', 2.4);
+----------------------------------+
| JSON_ARRAY(1, true, 'test', 2.4) |
+----------------------------------+
| [1, true, "test", 2.4]           |
+----------------------------------+
1 row in set (0.00 sec)

The JSON_OBJECT() function takes a list of key, value pairs and returns a valid JSON object. The following shows an example. Note that here I used single quotes in calling the function. This is just one example in which it is confusing which quotes to use. In this case, the parameters for the function are not JSON documents; they’re normal SQL strings, which can use single or double quotes.

MySQL  localhost:33060+ ssl  SQL > SELECT JSON_OBJECT("street","4 Main Street","city","Melborne",'state','California','zip',90125);
+-------------------------------------------------------------------------+
| JSON_OBJECT("street","4 Main Street","city","Melborne",'state','California','zip',90125) |
+-------------------------------------------------------------------------+
| {"zip": 90125, "city": "Melborne", "state": "California", "street": "4 Main Street"}     |
+-------------------------------------------------------------------------+
1 row in set (0.00 sec)

Note once again that the automatic conversion of the quotes in the function result. This can be helpful if you need to build JSON on the fly (dynamically).

There is one other useful function for constructing JSON documents: the JSON_TYPE() function. This function takes a JSON document and parses it into a JSON value. It returns the value's JSON type if it is valid or throws an error if it is not valid. The following shows use of this function with the above statements.

MySQL  localhost:33060+ ssl  SQL > SELECT JSON_TYPE('[1, true, "test", 2.4]');
+-------------------------------------+
| JSON_TYPE('[1, true, "test", 2.4]') |
+-------------------------------------+
| ARRAY                               |
+-------------------------------------+
1 row in set (0.00 sec)
MySQL  localhost:33060+ ssl  SQL > SELECT JSON_TYPE('{"zip": 90125, "city": "Melborne", "state": "California", "street": "4 Main Street"}') G
*************************** 1. row ***************************
JSON_TYPE('{"zip": 90125, "city": "Melborne", "state": "California", "street": "4 Main Street"}'): OBJECT
1 row in set (0.00 sec)

There are more functions that MySQL provides to work with the JSON data type. We will see more about these in a later section.

This section described only the basics for using JSON with MySQL in SQL statements. In fact, the formatting of the JSON document also applies to the document store. However, there is one item we haven’t talked about yet—how to access the elements in a JSON document.

To access an element—via its key—we use special notation called path expressions. The following shows a simple example. Note the WHERE clause. This shows a path expression in which I check to see if the address column includes the JSON key ‘city’ referenced with the special notation address->'$.address.city'. We see more details about path expressions in the "Path Expressions" section.

MySQL  localhost:33060+ ssl  SQL > SELECT id, address->'$.address.city' FROM test.addresses WHERE address->'$.address.zip' = '90125';
+----+---------------------------+
| id | address->'$.address.city' |
+----+---------------------------+
|  2 | "Melborne"                |
+----+---------------------------+
1 row in set (0.00 sec)

Path Expressions

If you consider that a JSON document can be a complex set of semistructured data and that at some point you will need to access certain elements in the document, you also may be wondering how to go about getting what you want from the JSON document. Fortunately, there is a mechanism to do this and it is called a path expression. More specific, it is shortcut notation that you can use in your SQL commands (or in the X DevAPI) to get an element without additional programming or scripting.

As you will see, it is a very specific syntax that, although not very expressive (it doesn’t read well in English), the notation can get you what you need without a lot of extra typing. Path expressions are initiated with the dollar sign symbol ($) enclosed in a string. But this notation must have a context. When using path expressions in SQL statements, you must use the JSON_EXTRACT() function, which allows you to use a path expression to extract data from a JSON document. This is because, unlike the X DevAPI classes and methods, path expressions are not directly supported in all SQL statements (but are for some, as we will see). For example, if you wanted the third item in an array, you would use the function as follows.

MySQL  localhost:33060+ ssl  SQL > SELECT JSON_EXTRACT('[1,2,3,4,5,6]', '$[2]');
+---------------------------------------+
| JSON_EXTRACT('[1,2,3,4,5,6]', '$[2]') |
+---------------------------------------+
| 3                                     |
+---------------------------------------+
1 row in set (0.00 sec)

Note that this accesses data in a JSON array. Here we use an array subscript with square brackets around the index (elements start at 0) as you would for an array in many programming languages.

Tip

The use of path expressions in the SQL interface is limited to either one of the JSON functions or used only in specific clauses that have been modified to accept path expressions such as SELECT column lists or WHERE, HAVING, ORDER BY, or GROUP BY clauses.

Now suppose you wanted to access an element by key. You can do that too. In this case, we use the dollar sign followed by a period then the key name. The following shows how to retrieve the last name for a JSON object containing the name and address of an individual.

MySQL  localhost:33060+ ssl  SQL > SELECT JSON_EXTRACT('{"name": {"first":"Billy-bob","last":"Throckmutton"},"address": {"street":"4 Main Street","city":"Melborne","state":"California","zip":"90125"}}', '$.name.first') AS Name;
+-------------+
| Name        |
+-------------+
| "Billy-bob" |
+-------------+
1 row in set (0.00 sec)

Note that I had to use two levels of access. That is, I wanted the value for the key named first from the object named name. Hence, I used '$.name.first'. This demonstrates how to use path expressions to drill down into the JSON document. This also is why we call this a path expression because the way we form the expression gives us the “path” to the element.

Now that we’ve seen a few examples, let’s review the entire syntax for path expressions; both for use in SQL and the NoSQL interfaces. Unless otherwise stated, the syntax aspects apply to both interfaces.

Once again, a path expression starts with the dollar sign and can optionally be followed by several forms of syntax called selectors that allow us to request a part of the document. These selectors include the following:
  • A period followed by the name of a key name references the value for that key. The key name must be specified within double quotation marks if the name without quotes is not valid (it requires quotes to be a valid identifier such as a key name with a space).

  • Use square brackets with an integer index ([n]) to select an element in an array. Indexes start at 0.

  • Paths can contain the wildcards * or ** as follows.
    • .[*] evaluates to the values of all members in a JSON object.

    • [*] evaluates to the values of all elements in a JSON array.

    • A sequence such as prefix**suffix evaluates to all paths that begin with the named prefix and end with the named suffix.

  • Paths can be nested using a period as the separator. In this case, the path after the period is evaluated within the context of the parent path context. For example, $.name.first limits the search for a key named first to the name JSON object.

If a path expression is evaluated as false or fails to locate a data item, the server will return null. For example, the following returns null because there are only 6 items in the array. Can you see why? Remember, counting starts at 0. This is a common mistake for those new to using path expressions (or arrays in programming languages).

MySQL  localhost:33060+ ssl  SQL > SELECT JSON_EXTRACT('[1,2,3,4,5,6]', '$[6]');
+---------------------------------------+
| JSON_EXTRACT('[1,2,3,4,5,6]', '$[6]') |
+---------------------------------------+
| NULL                                  |
+---------------------------------------+
1 row in set (0.00 sec)

But wait, there’s one more nifty option for path expressions. We can use a shortcut! That is, the dash and greater than symbol (->) can be used in place of the JSON_EXTRACT() function when accessing data in SQL statements by column. How cool is that? The use of the -> operation is sometimes called an inline path expression . For example, we could have written the example above to find the third item in a JSON array from a table as follows.

 MySQL  localhost:33060+ ssl  SQL > USE test;
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > CREATE TABLE ex1 (id int AUTO_INCREMENT PRIMARY KEY, recorded_data JSON);
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > INSERT INTO test.ex1 VALUES (NULL, JSON_ARRAY(1,2,3,4,5,6));
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > INSERT INTO test.ex1 VALUES (NULL, JSON_ARRAY(7,8,9));
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT * FROM test.ex1 WHERE recorded_data->'$[2]' = 3;
+----+--------------------+
| id | recorded_data      |
+----+--------------------+
|  1 | [1, 2, 3, 4, 5, 6] |
+----+--------------------+
1 row in set (0.00 sec)

Note that I simply used the column name, recorded_data, and appended the -> to the end then listed the path expression. Brilliant!

There is one other form of this shortcut. If the result of the -> operation (JSON_EXTRACT) evaluates to a quoted string, we can use the ->> symbol (called the inline path operator) to retrieve the value without quotes. This is helpful when dealing with values that are numbers. The following shows two examples. One example is with the -> operation and the same with the ->> operation.

 MySQL  localhost:33060+ ssl  SQL > INSERT INTO test.ex1 VALUES (NULL, '{"name":"will","age":"43"}');
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > INSERT INTO test.ex1 VALUES (NULL, '{"name":"joseph","age":"11"}');
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT * FROM test.ex1 WHERE recorded_data->>'$.age' = 43;
+----+-------------------------------+
| id | recorded_data                 |
+----+-------------------------------+
|  3 | {"age": "43", "name": "will"} |
+----+-------------------------------+
1 row in set (0.00 sec)
MySQL  localhost:33060+ ssl  SQL > SELECT * FROM test.ex1 WHERE recorded_data->'$.age' = 43;
Empty set (0.00 sec)

Note that the recorded_data values (age and name) were stored as a string. But what if the data were stored as an integer? Observe.

 MySQL  localhost:33060+ ssl  SQL > INSERT INTO test.ex1 VALUES (NULL, '{"name":"amy","age":22}');
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT * FROM test.ex1 WHERE recorded_data->'$.age' = 22;
+----+----------------------------+
| id | recorded_data              |
+----+----------------------------+
|  5 | {"age": 22, "name": "amy"} |
+----+----------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT * FROM test.ex1 WHERE recorded_data->>'$.age' = 22;
+----+----------------------------+
| id | recorded_data              |
+----+----------------------------+
|  5 | {"age": 22, "name": "amy"} |
+----+----------------------------+
1 row in set (0.00 sec)

Aha! So, the ->> operation is most useful when values must be unquoted. If they were already unquoted (such as an integer), the ->> operation returns the same as the -> operation.

Now, let’s see a few more examples of path expressions. Listing 3-2 shows several examples without explanation. Take a few minutes to look through these and examine the data it is operating on so you can see how each works. With a little imagination, you can drill down to a single data element!

 MySQL  localhost:33060+ ssl  SQL > INSERT INTO test.ex1 VALUES (NULL, '{"name": {"last": "Throckmutton", "first": "Billy-bob"}, "address": {"zip": "90125", "city": "Melborne", "state": "California", "street": "4 Main Street"}}');
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT recorded_data FROM test.ex1 WHERE recorded_data->'$.name' IS NOT NULL G
*************************** 1. row ***************************
recorded_data: {"age": "43", "name": "will"}
*************************** 2. row ***************************
recorded_data: {"age": "11", "name": "joseph"}
*************************** 3. row ***************************
recorded_data: {"age": 22, "name": "amy"}
*************************** 4. row ***************************
recorded_data: {"name": {"last": "Throckmutton", "first": "Billy-bob"}, "address": {"zip": "90125", "city": "Melborne", "state": "California", "street": "4 Main Street"}}
4 rows in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT recorded_data->'$.name' FROM test.ex1 WHERE recorded_data->'$.name' IS NOT NULL;
+------------------------------------------------+
| recorded_data->'$.name'                        |
+------------------------------------------------+
| "will"                                         |
| "joseph"                                       |
| "amy"                                          |
| {"last": "Throckmutton", "first": "Billy-bob"} |
+------------------------------------------------+
4 rows in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT recorded_data->'$.name.first' as first, recorded_data->'$.name.last' as last FROM test.ex1 WHERE recorded_data->'$.name.first' IS NOT NULL;
+-------------+----------------+
| first       | last           |
+-------------+----------------+
| "Billy-bob" | "Throckmutton" |
+-------------+----------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > INSERT INTO test.ex1 VALUES (NULL, '{"phones": [{"work": "555-1212"}, {"cell": "555-2121"}]}');
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT recorded_data->>'$.phones' FROM test.ex1 WHERE recorded_data->>'$.phones' IS NOT NULL;
+----------------------------------------------+
| recorded_data->>'$.phones'                   |
+----------------------------------------------+
| [{"work": "555-1212"}, {"cell": "555-2121"}] |
+----------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT recorded_data->'$.phones[1]' FROM test.ex1 WHERE recorded_data->>'$.phones' IS NOT NULL;
+------------------------------+
| recorded_data->'$.phones[1]' |
+------------------------------+
| {"cell": "555-2121"}         |
+------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT recorded_data->'$.phones[1].cell' FROM test.ex1 WHERE recorded_data->>'$.phones' IS NOT NULL;
+-----------------------------------+
| recorded_data->'$.phones[1].cell' |
+-----------------------------------+
| "555-2121"                        |
+-----------------------------------+
1 row in set (0.00 sec)
Listing 3-2

Examples of Path Expressions

Note that I use the path expression in the WHERE clause checking to see if the result is not NULL. This is a good trick on selecting rows in a table that have the elements you’re looking for in the document. That is, you want only the rows that contain a specific data element (via the path expression).

However, the use of the shortcuts (inline path expressions) is not a direct replacement for the JSON_EXTRACT() function. The following summarizes the limitations.
  • Data source: When used in a SQL statement, the inline path expression uses the field (column) specified only. The function can use any JSON typed value.

  • Path expression string: An inline path expression must use a plain string; the function can use any string typed value.

  • Number of expressions: An inline path expression can use only one path expression against a single field (column). The function can use multiple path expressions against a JSON document.

Tip

For more information about path expressions, see the section, “The JSON Data Type” in the online MySQL reference manual.

Now let’s look at the various JSON functions that we can use to work with JSON documents.

JSON Functions

There are several functions for working with JSON in MySQL. I describe many of the functions available in this section. Although we won’t explore the nuance of every function, we will see the more commonly used functions for working with JSON documents. Let’s begin with an overview in the form of a list of the available functions. Table 3-1 lists the JSON functions available in MySQL 8.0.11.
Table 3-1

JSON Functions in MySQL

Function

Description and Use

JSON_ARRAY()

Evaluates a list of values and returns a JSON array containing those values

JSON_ARRAYAGG()

Aggregates a result set as a single JSON array whose elements consist of the rows

JSON_ARRAY_APPEND()

Appends values to the end of the indicated arrays within a JSON document and returns the result

JSON_ARRAY_INSERT()

Updates a JSON document, inserting an array within the document and returning the modified document

JSON_CONTAINS()

Returns 0 or 1 to indicate whether a specific value is contained in a target JSON document, or, if a path argument is given, at a specific path within the target document

JSON_CONTAINS_PATH()

Returns 0 or 1 to indicate whether a JSON document contains data at a given path or paths

JSON_DEPTH()

Returns the maximum depth of a JSON document

JSON_EXTRACT()

Returns data from a JSON document, selected from the parts of the document matched by the path arguments

JSON_INSERT()

Inserts data into a JSON document and returns the result

JSON_KEYS()

Returns the keys from the top-level value of a JSON object as a JSON array, or, if a path argument is given, the top-level keys from the selected path

JSON_LENGTH()

Returns the length of JSON document, or, if a path argument is given, the length of the value within the document identified by the path

JSON_MERGE()

Merges two or more JSON documents and returns the merged result

JSON_MERGE_PATCH()

Merges two or more JSON documents replacing values where keys are duplicated

JSON_MERGE_PRESERVE()

Merges two or more JSON documents saving values where keys are duplicated

JSON_OBJECT()

Evaluates a list of key/value pairs and returns a JSON object containing those pairs

JSON_OBJECTAGG()

Takes two column names or expressions as arguments, the first of these being used as a key and the second as a value, and returns a JSON object that contains key/value pairs

JSON_PRETTY()

Prints a nicer looking layout of the JSON document

JSON_QUOTE()

Quotes a string as a JSON value by wrapping it with double quote characters and escaping interior quote and other characters, then returned the result as a utf8mb4 string

JSON_REMOVE()

Removes data from a JSON document and returns the result

JSON_REPLACE()

Replaces existing values in a JSON document and returns the result

JSON_SEARCH()

Returns the path to the given string within a JSON document

JSON_SET()

Inserts or updates data in a JSON document and returns the result

JSON_STORAGE_FREE()

Displays amount of space remaining in a JSON column following a partial update

JSON_STORAGE_SIZE()

Displays the storage used by a JSON value

JSON_TABLE()

Extracts data from a JSON document and returns it as a relational table

JSON_TYPE()

Returns a utf8mb4 string indicating the type of a JSON value

JSON_UNQUOTE()

Removes quotes from the JSON value and returns the result as a utf8mb4 string

JSON_VALID()

Returns 0 or 1 to indicate whether a value is a valid JSON document

Note

The JSON_MERGE() function was deprecated in version 8.0.3 (and also in 5.7.22).

Mastery of these functions is not essential to working with the document store, but can help greatly when developing hybrid solutions in which you use JSON in SQL statements.

These functions can be grouped into categories based on how they are used. We will see functions useful for adding data, those for retrieving (searching) data, and more. The following show how to use the functions using brief examples.

Most functions take a JSON document as the first parameter and a path expression and value as the second and third parameters. Path expressions must be valid for the document and must not contain the wildcards * or **. The functions also return the result so you can use them in SQL statements.

Creating JSON Data

There are several useful functions for creating JSON data. We have already seen two important functions; JSON_ARRAY() that builds a JSON array type and JSON_OBJECT() that builds a JSON object type. This section discusses some of the other functions that you can use to help create JSON documents including functions for aggregating, appending, and inserting data in JSON arrays.

The JSON_ARRAYAGG() function is used to create an array of JSON documents from several rows. It can be helpful when you want to summarize data or combine data from several rows. The function takes a column name and combines the JSON data from the rows into a new array. Listing 3-3 shows examples of using the function. This example takes the rows in the table and combines them to form a new array of JSON objects.

 MySQL  localhost:33060+ ssl  SQL > CREATE TABLE test.favorites (id int AUTO_INCREMENT PRIMARY KEY, preferences JSON);
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > INSERT INTO test.favorites VALUES (NULL, '{"color": "red"}');
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > INSERT INTO test.favorites VALUES (NULL, '{"color": "blue"}');
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > INSERT INTO test.favorites VALUES (NULL, '{"color": "purple"}');
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT * FROM test.favorites;
+----+---------------------+
| id | preferences         |
+----+---------------------+
|  1 | {"color": "red"}    |
|  2 | {"color": "blue"}   |
|  3 | {"color": "purple"} |
+----+---------------------+
3 rows in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_ARRAYAGG(preferences) FROM test.favorites;
+------------------------------------------------------------+
| JSON_ARRAYAGG(preferences)                                 |
+------------------------------------------------------------+
| [{"color": "red"}, {"color": "blue"}, {"color": "purple"}] |
+------------------------------------------------------------+
1 row in set (0.00 sec)
Listing 3-3

Using the JSON_ARRAYARG Function

The JSON_ARRAY_APPEND() is an interesting function that allows you to append data to a JSON array either at the end or immediately after a given path expression. The function takes as parameters a JSON array, a path expression, and the value (including a JSON document) to be inserted. Listing 3-4 shows several examples.

 MySQL  localhost:33060+ ssl  SQL > SET @base = '["apple","pear",{"grape":"red"},"strawberry"]';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_ARRAY_APPEND(@base, '$', "banana");
+-------------------------------------------------------------+
| JSON_ARRAY_APPEND(@base, '$', "banana")                     |
+-------------------------------------------------------------+
| ["apple", "pear", {"grape": "red"}, "strawberry", "banana"] |
+-------------------------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_ARRAY_APPEND(@base, '$[2].grape', "green");
+--------------------------------------------------------------+
| JSON_ARRAY_APPEND(@base, '$[2].grape', "green")              |
+--------------------------------------------------------------+
| ["apple", "pear", {"grape": ["red", "green"]}, "strawberry"] |
+--------------------------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SET @base = '{"grape":"red"}';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_ARRAY_APPEND(@base, '$', '{"grape":"red"}');
+--------------------------------------------------+
| JSON_ARRAY_APPEND(@base, '$', '{"grape":"red"}') |
+--------------------------------------------------+
| [{"grape": "red"}, "{"grape":"red"}"]        |
+--------------------------------------------------+
1 row in set (0.00 sec)
Listing 3-4

Using the JSON_ARRAY_APPEND Function

Note that the first example simply adds a new value to the end of the array. The second example changes the value of the key in the JSON object in the third index to an array and adds a new value. This is an interesting by-product of this function. We see this again in the third example where we change a basic JSON object to a JSON array of JSON objects.

The JSON_ARRAY_INSERT() function is similar except it inserts the value before the path expression. The function takes as parameters a JSON array, a path expression, and the value (including a JSON document) to be inserted. When including multiple path expression and value pairs, the effect is cumulative where the function evaluates the first path expression and value applying the next pair to the result, and so on. Listing 3-5 shows some examples using the new function, which are similar to the previous examples. Note that the positions of the data inserted is before the path expression.

 MySQL  localhost:33060+ ssl  SQL > SET @base = '["apple","pear",{"grape":["red","green"]},"strawberry"]';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_ARRAY_INSERT(@base, '$[0]', "banana");
+------------------------------------------------------------------------+
| JSON_ARRAY_INSERT(@base, '$[0]', "banana")                             |
+------------------------------------------------------------------------+
| ["banana", "apple", "pear", {"grape": ["red", "green"]}, "strawberry"] |
+------------------------------------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_ARRAY_INSERT(@base, '$[2].grape[0]', "white");
+-----------------------------------------------------------------------+
| JSON_ARRAY_INSERT(@base, '$[2].grape[0]', "white")                    |
+-----------------------------------------------------------------------+
| ["apple", "pear", {"grape": ["white", "red", "green"]}, "strawberry"] |
+-----------------------------------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SET @base = '[{"grape":"red"}]';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_ARRAY_INSERT(@base, '$[0]', '{"grape":"red"}');
+-----------------------------------------------------+
| JSON_ARRAY_INSERT(@base, '$[0]', '{"grape":"red"}') |
+-----------------------------------------------------+
| ["{"grape":"red"}", {"grape": "red"}]           |
+-----------------------------------------------------+
1 row in set (0.00 sec)
Listing 3-5

Using the JSON_ARRAY_INSERT Function

The JSON_INSERT() function is designed to take a JSON document and inserts one or more values at a specified path expression. That is, you can pass pairs of path expression and value at one time. But there is a catch. The path expression in this case must not evaluate to an element in the document. As with the last function, when including multiple path expressions, the effect is cumulative where the function evaluates the first path expression applying the next path expression to the result, and so on. Listing 3-6 shows an example. Note that the third path expression and value is not inserted because the path expression, $[0], evaluates to the first element, apple.

 MySQL  localhost:33060+ ssl  SQL > SET @base = '["apple","pear",{"grape":["red","green"]},"strawberry"]';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_INSERT(@base, '$[9]', "banana", '$[2].grape[3]', "white", '$[0]', "orange");
+-------------------------------------------------------------------------+
| JSON_INSERT(@base, '$[9]', "banana", '$[2].grape[3]', "white", '$[0]', "orange") |
+-------------------------------------------------------------------------+
| ["apple", "pear", {"grape": ["red", "green", "white"]}, "strawberry", "banana"]  |
+-------------------------------------------------------------------------+
1 row in set (0.00 sec)
Listing 3-6

Using the JSON_INSERT Function

The JSON_MERGE_PATCH() and JSON_MERGE_PRESERVE() functions are designed to take two or more JSON documents and combine them. The JSON_MERGE_PATH() function replaces values for duplicate keys whereas the JSON_MERGE_PRESERVE() preserves the values for duplicate keys. As with the previous function, you can include as many JSON documents as you want. Note how I used this function to build the example JSON document from the earlier examples. Listing 3-7 shows an example using the methods.

 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_MERGE_PATCH('["apple","pear"]', '{"grape":["red","green"]}', '["strawberry"]');
+-------------------------------------------------------------------------+
| JSON_MERGE_PATCH('["apple","pear"]', '{"grape":["red","green"]}', '["strawberry"]') |
+-------------------------------------------------------------------------+
| ["strawberry"]                                                          |
+-------------------------------------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_MERGE_PRESERVE('{"grape":["red","green"]}', '{"grape":["white"]}');
+-------------------------------------------------------------------------+
| JSON_MERGE_PRESERVE('{"grape":["red","green"]}', '{"grape":["white"]}') |
+-------------------------------------------------------------------------+
| {"grape": ["red", "green", "white"]}                                    |
+-------------------------------------------------------------------------+
1 row in set (0.00 sec)
Listing 3-7

Using the JSON_MERGE_ PATCH and JSON_MERGE_PRESERVE Functions

If any JSON function is passed an invalid parameter, invalid JSON document, or the path expression does not find an element, some functions return null whereas others may return the original JSON document. Listing 3-8 shows an example. In this case, there is no element at position 8 because the array only has 4 elements.

 MySQL  localhost:33060+ ssl  SQL > SET @base = '["apple","pear",{"grape":"red"},"strawberry"]';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_ARRAY_APPEND(@base, '$[7]', "flesh");
+---------------------------------------------------+
| JSON_ARRAY_APPEND(@base, '$[7]', "flesh")         |
+---------------------------------------------------+
| ["apple", "pear", {"grape": "red"}, "strawberry"] |
+---------------------------------------------------+
1 row in set (0.00 sec)
Listing 3-8

Using the JSON_ARRAY_APPEND Function

Now let’s see functions that we can use to modify JSON data.

Modifying JSON Data

There are several useful functions for modifying JSON data. This section discusses functions that you can use to help modify JSON documents by removing, replacing, and updating elements in the JSON document.

The JSON_REMOVE() function is used to remove elements that match a path expression. You must provide the JSON document to operate on and one or more path expressions and the result will be the JSON document with the elements removed. When including multiple path expressions, the effect is cumulative where the function evaluates the first path expression applying the next path expression to the result, and so on. Listing 3-9 shows an example. Note that I had to imagine what the intermediate results would be—that is, I used $[0] three times because the function removed the first element twice leaving the JSON object as the first element.

 MySQL  localhost:33060+ ssl  SQL > SET @base = '["apple","pear",{"grape":["red","white"]},"strawberry"]';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_REMOVE(@base, '$[0]', '$[0]', '$[0].grape[1]');
+-----------------------------------------------------+
| JSON_REMOVE(@base, '$[0]', '$[0]', '$[0].grape[1]') |
+-----------------------------------------------------+
| [{"grape": ["red"]}, "strawberry"]                  |
+-----------------------------------------------------+
1 row in set (0.00 sec)
Listing 3-9

Using the JSON_REMOVE Function (Single)

This may take a little getting used to but you can use the function multiple times or nested as shown in the examples in Listing 3-10.

 MySQL  localhost:33060+ ssl  SQL > SET @base = '["apple","pear",{"grape":["red","white"]},"strawberry"]';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SET @base = JSON_REMOVE(@base, '$[0]');
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SET @base = JSON_REMOVE(@base, '$[0]');
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_REMOVE(@base, '$[0].grape[1]');
+-------------------------------------+
| JSON_REMOVE(@base, '$[0].grape[1]') |
+-------------------------------------+
| [{"grape": ["red"]}, "strawberry"]  |
+-------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SET @base = '["apple","pear",{"grape":["red","white"]},"strawberry"]';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_REMOVE(JSON_REMOVE(JSON_REMOVE(@base, '$[0]'), '$[0]'), '$[0].grape[1]');
+-------------------------------------------------------------------------+
| JSON_REMOVE(JSON_REMOVE(JSON_REMOVE(@base, '$[0]'), '$[0]'), '$[0].grape[1]') |
+-------------------------------------------------------------------------+
| [{"grape": ["red"]}, "strawberry"]                                      |
+-------------------------------------------------------------------------+
1 row in set (0.00 sec)
Listing 3-10

Using the JSON_REMOVE Function (Nested)

The JSON_REPLACE() function takes a JSON document and pairs of path expression and value replacing the element that matches the path expression with the new value. Once again, the results are cumulative and work in order left to right. There is a catch with this function too. It ignores any new values or path expressions that evaluate to new values. Listing 3-11 shows an example. Note that the third pair was not removed because there is no tenth element.

 MySQL  localhost:33060+ ssl  SQL > SET @base = '["apple","pear",{"grape":["red","white"]},"strawberry"]';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_REPLACE(@base, '$[0]', "orange", '$[2].grape[0]', "green", '$[9]', "waffles");
+-------------------------------------------------------------------------+
| JSON_REPLACE(@base, '$[0]', "orange", '$[2].grape[0]', "green", '$[9]', "waffles") |
+-------------------------------------------------------------------------+
| ["orange", "pear", {"grape": ["green", "white"]}, "strawberry"]         |
+-------------------------------------------------------------------------+
1 row in set (0.00 sec)
Listing 3-11

Using the JSON_REPLACE Function

The JSON_SET() function is designed to modify JSON document elements. As with the other functions, you pass a JSON document as the first parameter and then one or more pairs of path expression and value to replace. However, this function also inserts any elements that are not in the document (the path expression is not found). Listing 3-12 shows an example. Note that the last element did not exist so it adds it to the documents.

 MySQL  localhost:33060+ ssl  SQL > SET @base = '["apple","pear",{"grape":["red","white"]},"strawberry"]';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_SET(@base, '$[0]', "orange", '$[2].grape[1]', "green", '$[9]', "123");
+-------------------------------------------------------------------------+
| JSON_SET(@base, '$[0]', "orange", '$[2].grape[1]', "green", '$[9]', "123") |
+-------------------------------------------------------------------------+
| ["orange", "pear", {"grape": ["red", "green"]}, "strawberry", "123"]    |
+-------------------------------------------------------------------------+
1 row in set (0.00 sec)
Listing 3-12

Using the JSON_SET Function

Ignore or Not Ignore, Which Does What?

One issue with the JSON functions is that some will operate on values that exist, others ignore values that exist, some add values that do not already exist, and so forth. It can become confusing if you aren’t familiar with all the functions. The following summarizes the differences for those functions that can be the most confusing.
  • JSON_INSERT(): adds new values but does not replace existing values

  • JSON_REMOVE(): removes elements that exist in the document and ignores those that do not exist

  • JSON_REPLACE(): replaces existing values and ignores new values

  • JSON_SET(): replaces values for paths that exist and adds values for paths that do not exist

If you want to use these functions, be sure to check them with sample data until you understand the conditions.

Now let’s look at the JSON functions you can use to find elements in the document.

Searching JSON Data

Another important operation for working with SQL and JSON data is searching for data in the JSON document. We discovered previously in the chapter how to reference data in the document with the special notation (path expressions), and we learned there are JSON functions that we can use to search for the data. In fact, we saw these two concepts used together in the previous section. In this section, we review the JSON data searching mechanism because you are likely to use these functions more than any other, especially in your queries.

There are four JSON functions that allow you to search JSON documents. As with the previous functions, these operate on a JSON document with one or more parameters. I call them searching functions not because they allow you to search a database or table for JSON data, but rather they allow you to find things in JSON documents. The functions include those for checking to see if a value or element exists in the document, whether a path expression is valid (something can be found using it), and retrieving information from the document.

The JSON_CONTAINS() function has two options: you can use it to return whether a value exists anywhere in the document or if a value exists using a path expression (the path expression is an optional parameter). The function returns a 0 or 1 where a 0 means the value was not found. An error occurs if either document argument is not a valid JSON document, the path argument is not a valid path expression, or contains a * or ** wildcard. There is another catch. The value you pass in must be a valid JSON string or document. Listing 3-13 shows several examples of using the function to search a JSON document.

 MySQL  localhost:33060+ ssl  SQL > SET @base = '{"grapes":["red","white","green"],"berries":["strawberry","raspberry","boysenberry","blackberrry"]}';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS(@base,'["red","white","green"]');
+------------------------------------------------+
| JSON_CONTAINS(@base,'["red","white","green"]') |
+------------------------------------------------+
|                                              0 |
+------------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS(@base,'{"grapes":["red","white","green"]}');
+-----------------------------------------------------------+
| JSON_CONTAINS(@base,'{"grapes":["red","white","green"]}') |
+-----------------------------------------------------------+
|                                                         1 |
+-----------------------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS(@base,'["red","white","green"]','$.grapes');
+-----------------------------------------------------------+
| JSON_CONTAINS(@base,'["red","white","green"]','$.grapes') |
+-----------------------------------------------------------+
|                                                         1 |
+-----------------------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS(@base,'"blackberry"','$.berries');
+-------------------------------------------------+
| JSON_CONTAINS(@base,'"blackberry"','$.berries') |
+-------------------------------------------------+
|                                               0 |
+-------------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS(@base,'blackberry','$.berries');
ERROR: 3141: Invalid JSON text in argument 2 to function json_contains: "Invalid value." at position 0.
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS(@base,'"red"','$.grapes');
+-----------------------------------------+
| JSON_CONTAINS(@base,'"red"','$.grapes') |
+-----------------------------------------+
|                                       1 |
+-----------------------------------------+
1 row in set (0.00 sec)
Listing 3-13

Using the JSON_CONTAINS Function

As you can see, this is a very useful function but it requires a bit of care to use properly. That is, you must make sure the value is a valid string. In all examples save one, I am searching the JSON document for either a JSON document (that makes searching for nested data easier), or a single value using a path expression. Remember, the function searches for values, not keys.

Note the second to last example: this returns an error because the value is not a valid JSON string. You must use double quotes around it to correct it as shown in the following example.

The JSON_CONTAINS_PATH() function uses a parameter strategy that is a little different. The function searches a JSON document to see if a path expression exists but it also allows you to find the first occurrence or all occurrences. It also can take multiple paths and evaluate them either as an “or” or “and” condition depending on what value you pass as the second parameter as in the following:
  • If you pass one, the function will return 1 if at least one path expression is found (OR).

  • If you pass all, the function will return 1 only if all path expressions are found (AND).

The function returns 0 or 1 to indicate whether a JSON document contains data at a given path or paths. Note that it can return null if any of the path expressions or the document is null. An error occurs if the JSON document, or any path expression is not valid, or the second parameter is not one or all. Listing 3-14 shows several examples of using the function.

 MySQL  localhost:33060+ ssl  SQL > SET @base = '{"grapes":["red","white","green"],"berries":["strawberry","raspberry","boysenberry","blackberrry"],"numbers":["1","2","3","4","5"]}';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS_PATH(@base,'one','$');
+-------------------------------------+
| JSON_CONTAINS_PATH(@base,'one','$') |
+-------------------------------------+
|                                   1 |
+-------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS_PATH(@base,'all','$');
+-------------------------------------+
| JSON_CONTAINS_PATH(@base,'all','$') |
+-------------------------------------+
|                                   1 |
+-------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS_PATH(@base,'all','$.grapes','$.berries');
+--------------------------------------------------------+
| JSON_CONTAINS_PATH(@base,'all','$.grapes','$.berries') |
+--------------------------------------------------------+
|                                                      1 |
+--------------------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS_PATH(@base,'all','$.grapes','$.berries','$.numbers');
+--------------------------------------------------------------------+
| JSON_CONTAINS_PATH(@base,'all','$.grapes','$.berries','$.numbers') |
+--------------------------------------------------------------------+
|                                                                  1 |
+--------------------------------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS_PATH(@base,'all','$.grapes','$.berries','$.num');
+----------------------------------------------------------------+
| JSON_CONTAINS_PATH(@base,'all','$.grapes','$.berries','$.num') |
+----------------------------------------------------------------+
|                                                              0 |
+----------------------------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS_PATH(@base,'one','$.grapes','$.berries','$.num');
+----------------------------------------------------------------+
| JSON_CONTAINS_PATH(@base,'one','$.grapes','$.berries','$.num') |
+----------------------------------------------------------------+
|                                                              1 |
+----------------------------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS_PATH(@base,'one','$.grapes');
+--------------------------------------------+
| JSON_CONTAINS_PATH(@base,'one','$.grapes') |
+--------------------------------------------+
|                                          1 |
+--------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS_PATH(@base,'all','$.grape');
+-------------------------------------------+
| JSON_CONTAINS_PATH(@base,'all','$.grape') |
+-------------------------------------------+
|                                         0 |
+-------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS_PATH(@base,'one','$.berries');
+---------------------------------------------+
| JSON_CONTAINS_PATH(@base,'one','$.berries') |
+---------------------------------------------+
|                                           1 |
+---------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_CONTAINS_PATH(@base,'all','$.berries');
+---------------------------------------------+
| JSON_CONTAINS_PATH(@base,'all','$.berries') |
+---------------------------------------------+
|                                           1 |
+---------------------------------------------+
1 row in set (0.00 sec)
Listing 3-14

Using the JSON_CONTAINS_PATH Function

Take some time to look through these examples so you can see how they work. Note that in the first two commands I used a path expression of a single dollar sign. This is simply the path expression to the entire document so naturally, it exists. Note also the differences in the use of one or all for the last two examples.

The JSON_EXTRACT() function is one of the most used functions. It allows you to extract a value, JSON array, JSON object, and so forth from a JSON document using one or more path expressions. We have already seen a couple of examples. Recall the function returns the portion of the JSON document that matches the path expression. Listing 3-15 shows a few more examples using complex path expressions.

 MySQL  localhost:33060+ ssl  SQL > SET@base = '{"grapes":["red","white","green"],"berries":["strawberry","raspberry","boysenberry","blackberrry"],"numbers":["1","2","3","4","5"]}';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_EXTRACT(@base,'$');
+-------------------------------------------------------------------------+
| JSON_EXTRACT(@base,'$')                                                            
+-------------------------------------------------------------------------+
| {"grapes": ["red", "white", "green"], "berries": ["strawberry", "raspberry", "boysenberry", "blackberry"], "numbers": ["1", "2", "3", "4", "5"]} |
+-------------------------------------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_EXTRACT(@base,'$.grapes');
+--------------------------------+
| JSON_EXTRACT(@base,'$.grapes') |
+--------------------------------+
| ["red", "white", "green"]      |
+--------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_EXTRACT(@base,'$.grapes[*]');
+-----------------------------------+
| JSON_EXTRACT(@base,'$.grapes[*]') |
+-----------------------------------+
| ["red", "white", "green"]         |
+-----------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_EXTRACT(@base,'$.grapes[1]');
+-----------------------------------+
| JSON_EXTRACT(@base,'$.grapes[1]') |
+-----------------------------------+
| "white"                           |
+-----------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_EXTRACT(@base,'$.grapes[4]');
+-----------------------------------+
| JSON_EXTRACT(@base,'$.grapes[4]') |
+-----------------------------------+
| NULL                              |
+-----------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_EXTRACT(@base,'$.berries');
+-----------------------------------------------------------+
| JSON_EXTRACT(@base,'$.berries')                           |
+-----------------------------------------------------------+
| ["strawberry", "raspberry", "boysenberry", "blackberry"] |
+-----------------------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_EXTRACT(@base,'$.berries[2]');
+------------------------------------+
| JSON_EXTRACT(@base,'$.berries[2]') |
+------------------------------------+
| "boysenberry"                      |
+------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_EXTRACT(@base,'$.berries[2]','$.berries[3]');
+---------------------------------------------------+
| JSON_EXTRACT(@base,'$.berries[2]','$.berries[3]') |
+---------------------------------------------------+
| ["boysenberry", "blackberry"]                    |
+---------------------------------------------------+
1 row in set (0.00 sec)
Listing 3-15

Using the JSON_EXTRACT Function

Note what happens when we use the single dollar sign. The function returns the entire document. Also, note what happens when we use a path expression, although its syntax is valid it does not evaluate to an element in the document (see the fifth command).

Note the last example where we pass in two path expressions. Then notice how it returns a JSON array whereas the example before it with only one path expression returns a JSON string value. This is one of the trickier aspects of the function. So long as you remember it returns a valid JSON string, array, or object, you will be able to use the function without issue.

The JSON_SEARCH() function is interesting because it is the opposite of the JSON_EXTRACT() function. More specific, it takes one or more values and returns path expressions to the values if they are found in the document. This makes it easier to validate your path expressions or to build path expressions on the fly.

As with the JSON_CONTAINS_PATH() function, the JSON_SEARCH() function also allows you to find the first occurrence or all occurrences returning the path expressions depending on what value you pass as the second parameter as in the following:
  • If you pass one, the function will return the first match.

  • If you pass all, the function will return all matches.

But there is a trick here too. The function takes a third parameter that forms a special search string that works as the LIKE operator in SQL statements. That is, search string argument can use the % and _ characters the same way as the LIKE operator. Note that to use a % or _ as a literal, you must precede it with the (escape) character.

The function returns 0 or 1 to indicate whether a JSON document contains the values. Note that it can return null if any of the path expressions or the document is null. An error occurs if the JSON document, or any path expression is not valid, or the second parameter is not one or all. Listing 3-16 shows several examples of using the function.

 MySQL  localhost:33060+ ssl  SQL > SET @base = '{"grapes":["red","white","green"],"berries":["strawberry","raspberry","boysenberry","blackberrry"],"numbers":["1","2","3","4","5"]}';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_SEARCH(@base,'all','red');
+--------------------------------+
| JSON_SEARCH(@base,'all','red') |
+--------------------------------+
| "$.grapes[0]"                  |
+--------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_SEARCH(@base,'all','gr____');
+-----------------------------------+
| JSON_SEARCH(@base,'all','gr____') |
+-----------------------------------+
| NULL                              |
+-----------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_SEARCH(@base,'one','%berry');
+-----------------------------------+
| JSON_SEARCH(@base,'one','%berry') |
+-----------------------------------+
| "$.berries[0]"                    |
+-----------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_SEARCH(@base,'all','%berry');
+--------------------------------------------------+
| JSON_SEARCH(@base,'all','%berry')                |
+--------------------------------------------------+
| ["$.berries[0]", "$.berries[1]", "$.berries[2]"] |
+--------------------------------------------------+
1 row in set (0.00 sec)
Listing 3-16

Using the JSON_SEARCH Function

Now let’s look at the last group of JSON functions; those that are utilitarian in nature allowing you to get information about the JSON document and perform simple operations to help work with JSON documents.

Utility Functions

Last, there are several functions that can return information about the JSON document, help add or remove quotes, and even find the keys in a document. We have already seen several of the utility JSON_TYPE() and JSON_VALID() functions. The following are additional utility functions you may find useful when working with JSON documents.

The JSON_DEPTH() function returns the maximum depth of a JSON document. If the document is an empty array, object, or a scalar value; the function returns a depth of 1. An array containing only elements of depth 1 or nonempty objects containing only member values of depth 1 returns a depth of 2. Listing 3-17 shows several examples.

 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_DEPTH('8');
+-----------------+
| JSON_DEPTH('8') |
+-----------------+
|               1 |
+-----------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_DEPTH('[]');
+------------------+
| JSON_DEPTH('[]') |
+------------------+
|                1 |
+------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_DEPTH('{}');
+------------------+
| JSON_DEPTH('{}') |
+------------------+
|                1 |
+------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_DEPTH('[12,3,4,5,6]');
+----------------------------+
| JSON_DEPTH('[12,3,4,5,6]') |
+----------------------------+
|                          2 |
+----------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_DEPTH('[[], {}]');
+------------------------+
| JSON_DEPTH('[[], {}]') |
+------------------------+
|                      2 |
+------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SET @base = '{"grapes":["red","white","green"],"berries":["strawberry","raspberry","boysenberry","blackberrry"],"numbers":["1","2","3","4","5"]}';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_DEPTH(@base);
+-------------------+
| JSON_DEPTH(@base) |
+-------------------+
|                 3 |
+-------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_DEPTH(JSON_EXTRACT(@base, '$.grapes'));
+---------------------------------------------+
| JSON_DEPTH(JSON_EXTRACT(@base, '$.grapes')) |
+---------------------------------------------+
|                                           2 |
+---------------------------------------------+
1 row in set (0.00 sec)
Listing 3-17

Using the JSON_DEPTH Function

The JSON_KEYS() function is used to return a list of keys from the top-level value of a JSON object as a JSON array. The function also allows you to pass a path expression, which results in a list of the top-level keys from the selected path expression value. An error occurs if the json_doc argument is not a valid JSON document or the path argument is not a valid path expression or contains a * or ** wildcard. The resulting array is empty if the selected object is empty.

There is one limitation. If the top-level value has nested JSON objects, the array returned does not include keys from those nested objects. Listing 3-18 shows several examples of using this function.

 MySQL  localhost:33060+ ssl  SQL > SET @base = '{"grapes":["red","white","green"],"berries":["strawberry","raspberry","boysenberry","blackberrry"],"numbers":["1","2","3","4","5"]}';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_KEYS(@base);
+----------------------------------+
| JSON_KEYS(@base)                 |
+----------------------------------+
| ["grapes", "berries", "numbers"] |
+----------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_KEYS(@base,'$');
+----------------------------------+
| JSON_KEYS(@base,'$')             |
+----------------------------------+
| ["grapes", "berries", "numbers"] |
+----------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_KEYS('{"z":123,"x":{"albedo":50}}');
+------------------------------------------+
| JSON_KEYS('{"z":123,"x":{"albedo":50}}') |
+------------------------------------------+
| ["x", "z"]                               |
+------------------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_KEYS('{"z":123,"x":{"albedo":50}}', '$.x');
+-------------------------------------------------+
| JSON_KEYS('{"z":123,"x":{"albedo":50}}', '$.x') |
+-------------------------------------------------+
| ["albedo"]                                      |
+-------------------------------------------------+
1 row in set (0.00 sec)
Listing 3-18

Using the JSON_KEYS Function

The JSON_LENGTH() function returns the length of the JSON document passed. It also allows you to pass in a path expression and if provided, will return the length of the value that matches the path expression. An error occurs if the json_doc argument is not a valid JSON document or the path argument is not a valid path expression or contains a * or ** wildcard. However, the value returned has several constraints as in the following:
  • A scalar has length 1.

  • An array has a length equal to the number of array elements.

  • An object has a length equal to the number of object members.

However, there is one surprising limitation: the length returned does not count the length of nested arrays or objects. Thus, you must use this function carefully using the path expression for nested documents.

Listing 3-19 shows several examples of using the function.

 MySQL  localhost:33060+ ssl  SQL > SET @base = '{"grapes":["red","white","green"],"berries":["strawberry","raspberry","boysenberry","blackberrry"],"numbers":["1","2","3","4","5"]}';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_LENGTH(@base,'$');
+------------------------+
| JSON_LENGTH(@base,'$') |
+------------------------+
|                      3 |
+------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_LENGTH(@base,'$.grapes');
+-------------------------------+
| JSON_LENGTH(@base,'$.grapes') |
+-------------------------------+
|                             3 |
+-------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_LENGTH(@base,'$.grapes[1]');
+----------------------------------+
| JSON_LENGTH(@base,'$.grapes[1]') |
+----------------------------------+
|                                1 |
+----------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_LENGTH(@base,'$.grapes[4]');
+----------------------------------+
| JSON_LENGTH(@base,'$.grapes[4]') |
+----------------------------------+
|                             NULL |
+----------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_LENGTH(@base,'$.berries');
+--------------------------------+
| JSON_LENGTH(@base,'$.berries') |
+--------------------------------+
|                              4 |
+--------------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_LENGTH(@base,'$.numbers');
+--------------------------------+
| JSON_LENGTH(@base,'$.numbers') |
+--------------------------------+
|                              5 |
+--------------------------------+
1 row in set (0.00 sec)
Listing 3-19

Using the JSON_LENGTH Function

Note the fourth command returns null because the path expression, although valid syntax, does not evaluate to a value or nested JSON array or object.

The JSON_QUOTE() function is a handy function to use that will help you add quotes where they are appropriate. That is, the function quotes a string as a JSON string by wrapping it with double quote characters and escaping interior quote and other characters and returns the result. Note that this function does not operate on a JSON document, rather, only a string.

You can use this function to produce a valid JSON string literal for inclusion within a JSON document. Listing 3-20 shows a few short examples of using the function to quote JSON strings.

 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_QUOTE("test");
+--------------------+
| JSON_QUOTE("test") |
+--------------------+
| "test"             |
+--------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_QUOTE('[true]');
+----------------------+
| JSON_QUOTE('[true]') |
+----------------------+
| "[true]"             |
+----------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_QUOTE('90125');
+---------------------+
| JSON_QUOTE('90125') |
+---------------------+
| "90125"             |
+---------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl SQL > SELECT JSON_QUOTE('["red","white","green"]');
+---------------------------------------+
| JSON_QUOTE('["red","white","green"]') |
+---------------------------------------+
| "["red","white","green"]"       |
+---------------------------------------+
1 row in set (0.00 sec)
Listing 3-20

Using the JSON_QUOTE Function

Note that in the last example the function adds the escape character () because the string passed contains quotes. Why is this happening? Remember, this function takes a string, not a JSON array as the parameter.

The JSON_UNQUOTE() function is the opposite of the JSON_QUOTE() function. The JSON_UNQUOTE() function removes quotes JSON value and returns the result as a utf8mb4 string. The function is designed to recognize and not alter markup sequences as in the following:
  • ": A double quote (") character

  • : A backspace character

  • f: A formfeed character

  • : A newline (linefeed) character

  • : A carriage return character

  • : A tab character

  • \: A backslash () character

Listing 3-21 shows examples of using the function.

 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_UNQUOTE("test 123");
+--------------------------+
| JSON_UNQUOTE("test 123") |
+--------------------------+
| test 123                 |
+--------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_UNQUOTE('"true"');
+------------------------+
| JSON_UNQUOTE('"true"') |
+------------------------+
| true                   |
+------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_UNQUOTE('"true"');
+--------------------------+
| JSON_UNQUOTE('"true"') |
+--------------------------+
| true                     |
+--------------------------+
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_UNQUOTE('9 0 125');
+-----------------------------+
| JSON_UNQUOTE('9 0 125') |
+-----------------------------+
| 9 0 125                    |
+-----------------------------+
1 row in set (0.00 sec)
Listing 3-21

Using the JSON_UNQUOTE Function

The JSON_PRETTY() function formats a JSON document for easier viewing. You can use this to produce an output to send to users or to make the JSON look a bit nicer in the shell. Listing 3-22 shows an example without the function and the same with the function. Note how much easier it is to read when using JSON_PRETTY().

 MySQL  localhost:33060+ ssl  SQL > SET @base = '{"name": {"last": "Throckmutton", "first": "Billy-bob"}, "address": {"zip": "90125", "city": "Melborne", "state": "California", "street": "4 Main Street"}}';
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT @base G
*************************** 1. row ***************************
@base: {"name": {"last": "Throckmutton", "first": "Billy-bob"}, "address": {"zip": "90125", "city": "Melborne", "state": "California", "street": "4 Main Street"}}
1 row in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT JSON_PRETTY(@base) G
*************************** 1. row ***************************
JSON_PRETTY(@base): {
  "name": {
    "last": "Throckmutton",
    "first": "Billy-bob"
  },
  "address": {
    "zip": "90125",
    "city": "Melborne",
    "state": "California",
    "street": "4 Main Street"
  }
}
1 row in set (0.00 sec)
Listing 3-22

Using the JSON_PRETTY Function

There are also functions for checking size; JSON_STORAGE_FREE() and JSON_STORAGE_SIZE() . The first is used after a partial update and the second is used to get the size of the binary representation of the JSON document. See the online MySQL reference manual for more details on these functions as they are new and not commonly used except for very special circumstances in which size is a concern.

Finally, there is a new function released in version 8.0.4 intriguingly named JSON_TABLE(). This function takes a JSON document and returns a tabular data list. In basic terms, rather than returning output as JSON, this function returns rows as a result set. Thus, you can use this function where you need more traditional rows to work within your applications.

The function has some peculiar syntax. It takes as parameters a JSON document (array), and an expression path and column definition. The last two are not separated by a comma (strangely). This arrangement makes the function a bit harder to use but once you see a working example it is easier to understand. So, let’s do that. Listing 3-23 demonstrates how to use the function.

MySQL  localhost:33060+  SQL > set @phones = '[{"name":"Bill Smith","phone":"8013321033"},{"name":"Folley Finn","phone":"9991112222"},{"name":"Carrie Tonnesth","phone":"6498881212"}]';
Query OK, 0 rows affected (0.00 sec)
MySQL  localhost:33060+  SQL > SELECT * FROM JSON_TABLE(@phones, "$[*]" COLUMNS(name char(20) PATH '$.name', phone char(16) PATH '$.phone')) as phone_list;
 +-----------------+------------+
| name            | phone      |
+-----------------+------------+
| Bill Smith      | 8013321033 |
| Folley Finn     | 9991112222 |
| Carrie Tonnesth | 6498881212 |
+-----------------+------------+
3 rows in set (0.00 sec)
Listing 3-23

Using the JSON_TABLE Function

Note that we are using a JSON array of names and phone numbers to keep it simple. The function is used as if it were a table so we add it to the FROM clause on a SELECT statement. The parameters are the JSON document, then the path and column definition. The expression path used is simply retrieving the entire element from the array. You can use a variety of path expression here if you wanted to select only part of the document to operate on. Next is the column definition and this should look familiar to you—it’s like column definitions for tables. The difference is we append a path expression on the end with the keyword PATH. This simply locates the value in the JSON document.

As you can imagine, you can form complex definitions drilling down to precisely the elements you want. The demand and use cases for this function will likely grow given that it is a recent addition, but if you need to turn a JSON document into a result set, this function can achieve those results albeit with some creativity and path expressions.

For more information about the JSON_TABLE() function, see the section entitled “JSON Table Functions” in the online MySQL reference manual.

Tip

For more information about JSON functions, see the online MySQL reference manual. The JSON functions are listed with the other functions based on use. I recommend searching the document for the function you want to learn more about or use the index entitled, “Function Index,” which lists all the functions in alphabetical order.

Now that we know more about JSON, the "Combining SQL and JSON—Indexing JSON DATA" section presents some advanced topics for working with JSON in SQL statements.

Combining SQL and JSON - Indexing JSON Data

One of the definitions of NoSQL is “not only SQL” and that moniker applies to MySQL when you consider that you can use JSON documents with your relational data. As we have seen in the examples describing the JSON functions, you can add JSON columns to your tables and store JSON data in the fields.

However, instead of storing the JSON document as a string, MySQL stores the JSON document using a special internal structure that permits MySQL to access, find, and extract the JSON document elements quickly from the row data. Note that this does not mean that MySQL can index the JSON data. In fact, JSON data columns cannot be indexed. At least, not directly. In this section, we will see how to index JSON data to help optimize searching on data elements for rows that contain JSON documents.

What About Converting Text To Json?

If you have a database in which you have stored semistructured data in a TEXT or BLOB field, you may want to consider converting the data to JSON documents. The JSON functions we’ve seen in this chapter are your key to successfully converting the data such as JSON_ARRAY(), JSON_OBJECT(), and JSON_VALID(). I will discuss more about this topic in Chapter 9, including suggestions and examples on how to convert existing data. You may also want to check out various blogs on converting data to JSON—just google phrases similar to, “convert to JSON.” Although most blogs are Java-based, you can use them to get ideas for how to convert your own data.

Some may think the restriction prohibiting indexing of JSON columns an oversight, but it isn’t. Consider the fact that JSON documents are semistructured data that is not required to conform to any specific layout. That is, one row could contain a JSON document that not only has different keys but also may arrange the document in a different order.

Although this isn’t necessarily a show stopper for indexing and despite the special, internal mechanism used to access data in the document, indexing JSON documents directly would be cumbersome and likely to perform poorly. However, all is not lost. MySQL 5.7 introduced a new feature called generated columns (sometimes called virtual columns) .

Generated columns are dynamically resolved columns that are defined by the CREATE or ALTER TABLE statements. There are two types of virtual columns: those that are generated on demand (called virtual generated columns), which do not use any additional storage; and those generated columns that can be stored in the rows. Virtual generated columns use the VIRTUAL option and stored generated columns use the STORED option in the CREATE or ALTER TABLE statement.

So how does this work? We create the generated column to extract data from the JSON document then use that column to create an index. Thus, the index can be used to find rows more quickly. That is if you want to perform grouping, ordering, or want to search for a subset of rows that predicate on the JSON data, you can create and index for the optimizer to use to retrieve the data more quickly.

Let’s see an example. The following shows a table I created to store information in a JSON column.

CREATE TABLE `test`.`thermostats` (
  `model_number` char(20) NOT NULL,
  `manufacturer` char(30) DEFAULT NULL,
  `capabilities` json DEFAULT NULL,
  PRIMARY KEY (`model_number`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `test`.`thermostats` VALUES ('AB-90125-C1', 'Jasper', '{"rpm": 1500, "color": "beige", "modes": ["ac"], "voltage": 110, "capability": "auto fan"}');
INSERT INTO `test`.`thermostats` VALUES ('ODX-123','Genie','{"rpm": 3000, "color": "white", "modes": ["ac", "furnace"], "voltage": 220, "capability": "fan"}');

Note that this table has a single JSON field and a single character field for the model number that is also the primary key. Suppose the rows contain JSON data such as the following in the capabilities column.

 MySQL  localhost:33060+ ssl  SQL > SELECT * FROM `test`.`thermostats` LIMIT 2 G
*************************** 1. row ***************************
model_number: AB-90125-C1
manufacturer: Jasper
capabilities: {"rpm": 1500, "color": "beige", "modes": ["ac"], "voltage": 110, "capability": "auto fan"}
*************************** 2. row ***************************
model_number: ODX-123
manufacturer: Genie
capabilities: {"rpm": 3000, "color": "white", "modes": ["ac", "furnace"], "voltage": 220, "capability": "fan"}
2 rows in set (0.00 sec)

Now suppose we wanted to execute queries to select rows by one or more of the data elements in the JSON document. For example, suppose we wanted to run queries that locate rows that have fans that operate at 110 volts. If the table contains hundreds of thousands or even tens of millions of rows and there is not index, the optimizer must read all the rows (a table scan). However, if there is an index on the data, the optimizer merely needs to generate the virtual generated column, which is potentially more efficient.

To mitigate the potential performance issue, we can add a virtual generated column on the table using the voltage element. The following shows the ALTER TABLE statements we can use to add the virtual generated column.

ALTER TABLE `test`.`thermostats` ADD COLUMN voltage INT GENERATED ALWAYS AS (capabilities->'$.voltage') VIRTUAL;
ALTER TABLE `test`.`thermostats` ADD INDEX volts (voltage);

Note

If you leave off the option, the generated column generated is a virtual generated column.

You also can recreate the table if you want, but that will require reloading the data. However, I show the new CREATE TABLE statement below so you can see how to create a virtual generated column on the table at the time that it is created.

CREATE TABLE `test`.`thermostats` (
  `model_number` char(20) NOT NULL,
  `manufacturer` char(30) DEFAULT NULL,
  `capabilities` json DEFAULT NULL,
  `voltage` int(11) GENERATED ALWAYS AS (json_extract(`capabilities`,'$.voltage')) VIRTUAL,
  PRIMARY KEY (`model_number`),
  KEY `volts` (`voltage`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Note that I used the shortcut -> in the ALTER TABLE statement but the CREATE TABLE statement has the JSON_EXTRACT() function instead.

If you’re curious if adding the virtual generated column and index makes a difference, Listing 3-24 shows how the optimizer would run the query before adding the column and after adding the column.

 MySQL  localhost:33060+ ssl  SQL > DROP TABLE IF EXISTS `test`.`thermostats`;
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > CREATE TABLE `test`.`thermostats` (`model_number` char(20) NOT NULL,`manufacturer` char(30) DEFAULT NULL,`capabilities` json DEFAULT NULL,PRIMARY KEY (`model_number`)) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > INSERT INTO `test`.`thermostats` VALUES ('ODX-123','Genie','{"rpm": 3000, "color": "white", "modes": ["ac", "furnace"], "voltage": 220, "capability": "fan"}');
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > INSERT INTO `test`.`thermostats` VALUES ('AB-90125-C1', 'Jasper', '{"rpm": 1500, "color": "beige", "modes": ["ac"], "voltage": 110, "capability": "auto fan"}');
Query OK, 1 row affected (0.00 sec)
# Query without virtual generated column .
 MySQL  localhost:33060+ ssl  SQL > EXPLAIN SELECT * FROM thermostats WHERE capabilities->'$.voltage' = 110 G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: thermostats
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 23302
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)
Note (code 1003): /* select#1 */ select `test`.`thermostats`.`model_number` AS `model_number`,`test`.`thermostats`.`manufacturer` AS `manufacturer`,`test`.`thermostats`.`capabilities` AS `capabilities` from `test`.`thermostats` where (json_extract(`test`.`thermostats`.`capabilities`,'$.voltage') = 110)
 MySQL  localhost:33060+ ssl  SQL > ALTER TABLE `test`.`thermostats` ADD COLUMN color char(20) GENERATED ALWAYS AS (capabilities->'$.color') VIRTUAL;
Query OK, 0 rows affected (0.00 sec)
# Query with virtual generated column.
 MySQL  localhost:33060+ ssl  SQL > DROP TABLE `test`.`thermostats`;
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > CREATE TABLE `thermostats` (`model_number` char(20) NOT NULL, `manufacturer` char(30) DEFAULT NULL, `capabilities` json DEFAULT NULL, `voltage` int(11) GENERATED ALWAYS AS (json_extract(`capabilities`,'$.voltage')) VIRTUAL, PRIMARY KEY (`model_number`), KEY `volts` (`voltage`)) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > EXPLAIN SELECT * FROM thermostats WHERE capabilities->'$.voltage' = 110 G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: thermostats
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1102
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)
Note (code 1003): /* select#1 */ select `test`.`thermostats`.`model_number` AS `model_number`,`test`.`thermostats`.`manufacturer` AS `manufacturer`,`test`.`thermostats`.`capabilities` AS `capabilities`,`test`.`thermostats`.`color` AS `color` from `test`.`thermostats` where (json_extract(`test`.`thermostats`.`capabilities`,'$.voltage') = 110)
Listing 3-24

Optimizer EXPLAIN Results for Query

Note that the first EXPLAIN shows no use of an index (no key, key_len) whereas the second does show the use of an index. The rows result shows how many rows (estimated) will be read to make the comparison. It is clear that adding a generated column and an index can help us optimize our queries of JSON data in relational tables. Cool.

However, there is one thing the example did not cover. If the JSON data element is a string, you must use the JSON_UNQUOTE() function to remove the quotes from the string. Let’s suppose we wanted to add a generated column for the color data element. If we add the column and index with the ALTER TABLE statements without removing the quotes, we will get some unusual results as shown in Listing 3-25.

 MySQL  localhost:33060+ ssl  SQL > DROP TABLE IF EXISTS `test`.`thermostats`;
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > CREATE TABLE `test`.`thermostats` (`model_number` char(20) NOT NULL,`manufacturer` char(30) DEFAULT NULL,`capabilities` json DEFAULT NULL,PRIMARY KEY (`model_number`)) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > INSERT INTO `test`.`thermostats` VALUES ('ODX-123','Genie','{"rpm": 3000, "color": "white", "modes": ["ac", "furnace"], "voltage": 220, "capability": "fan"}');
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > INSERT INTO `test`.`thermostats` VALUES ('AB-90125-C1', 'Jasper', '{"rpm": 1500, "color": "beige", "modes": ["ac"], "voltage": 110, "capability": "auto fan"}');
Query OK, 1 row affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > ALTER TABLE `test`.`thermostats` ADD COLUMN color char(20) GENERATED ALWAYS AS (capabilities->'$.color') VIRTUAL;
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT model_number, color FROM thermostats WHERE color = "beige";                          
Empty set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT model_number, color FROM thermostats LIMIT 2;                                                
+--------------+---------+
| model_number | color   |
+--------------+---------+
| AB-90125-C1  | "beige" |
| ODX-123      | "white" |
+--------------+---------+
2 rows in set (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > ALTER TABLE thermostats DROP COLUMN color;
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > ALTER TABLE thermostats ADD COLUMN color char(20) GENERATED ALWAYS AS (JSON_UNQUOTE(capabilities->'$.color')) VIRTUAL;
Query OK, 0 rows affected (0.00 sec)
 MySQL  localhost:33060+ ssl  SQL > SELECT model_number, color FROM thermostats WHERE color = 'beige' LIMIT 1;   
+--------------+-------+
| model_number | color |
+--------------+-------+
| AB-90125-C1  | beige |
+--------------+-------+
1 row in set (0.00 sec)
Listing 3-25

Removing Quotes for Generated Columns on JSON Strings

Note that in the first SELECT statement, there is nothing returned. This is because the virtual generated column used the JSON string with the quotes. This is often a source of confusion when mixing SQL and JSON data. Note that in the second SELECT statement, we see there should have been several rows returned. Note also that after I dropped the column and added it again with the JSON_UNQUOTE() function, the SELECT returns the correct data.

We normally use a virtual generated column so that we don’t store anything extra in the row. This is partly because we may not use the index on the JSON data very often and may not need it maintained, but more important because there are restrictions on how you can use/define a stored generated column. The following summarize the restrictions.
  • The table must have a primary key defined.

  • You must use either a FULLTEXT or RTREE index (instead of the default BTREE).

However, if you have a lot of rows or are using the index on the JSON data frequently or have more than one index on the JSON data, you may want to consider using the stored generated column because virtual generated columns can be computationally taxing when accessing complex or deeply nested data frequently.

Tip

For more information about virtual columns, see the section, “CREATE TABLE and Generated Columns” or “ALTER TABLE and Generated Columns” in the online MySQL reference manual ( https://dev.mysql.com/doc/refman/8.0/en/ ).

Summary

The addition of the JSON data type to MySQL has ushered a paradigm shift for how we use MySQL. For the first time, we can store semistructured data inside our relational data (tables). Not only does this give us far more flexibility that we ever had before, it also means we can leverage modern programming techniques to access the data in our applications without major efforts and complexity. JSON is a well-known format and used widely in many applications.

Understanding the JSON data type is key to understanding the document store. This is because the JSON data type, while designed to work with relational data, forms the pattern for how we store data in the document store—in JSON documents! We will see more about the document store in later chapters.

In this chapter, we explored the JSON data type in more detail. We saw examples of how to work with the JSON data in relational tables via the numerous built-in JSON functions provided in MySQL. The JSON data type is key to allowing users to develop hybrid solutions that span the gulf of SQL and NoSQL applications.

In Chapter 4, I explore the MySQL Shell in more detail including an introduction on how to use the MySQL Shell to develop your applications.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.15.206.25