Chapter 4

Using JSON with SQL

IN THIS CHAPTER

Bullet Translating JSON-formatted data to SQL-formatted data, and vice versa

Bullet The SQL/JSON data model

Bullet SQL/JSON functions

Just as XML is a data interchange format for transmitting data from one platform to a different incompatible platform, JSON is another such format that performs that function. JSON is specifically designed to transfer unstructured or semi-structured data from one NoSQL database to a different, incompatible NoSQL database. As large and ever-increasing amounts of data are being stored in the semi-structured databases addressed by NoSQL technology, the need to share that data with databases adhering to the highly structured relational technology that is the domain of SQL has become clear. The SQL:2016 specification officially brings JSON into the SQL world.

Using JSON with SQL

Organizations have been storing important data in relational databases for close to 40 years, in many cases representing the most critical assets those organizations have. Nowadays, huge amounts of unstructured and semi-structured data are being generated by sensors on the Internet of Things, as well as data coursing over the Internet from billions of users around the world. It is evident that there is value in being able to combine this data with the more structured data stored in relational databases. For this reason, SQL has been expanded to enable relational databases to ingest data in JSON format, and conversely to convert relational data into JSON format for consumption by non-relational data stores.

JSON supports several data types, as does SQL. However, the JSON types don’t necessarily match up with the corresponding SQL types. As a result, SQL accepts only the two types that JSON and SQL have in common, character strings and binary strings. This means that any JSON data that you want to ingest into a relational database must first be converted into either character strings or binary strings.

In the reverse direction, data in relational tables can be transformed into JSON objects and arrays, using built-in functions.

To query JSON data stored in SQL tables, a new language named SQL/JSON path language is embedded in SQL operators. You can use these operators to query JSON data stored in relational database tables.

The SQL/JSON Data Model

The SQL/JSON data model is the solution to the problem that SQL and JSON store data in different ways. The SQL/JSON data model sits between the SQL world and the JSON world and can talk to each in its native language.

JSON data comes in a variety of forms, including JSON arrays, JSON objects, JSON members, JSON literal null values, JSON literal true values, JSON literal false values, JSON numbers, and JSON strings. SQL, on the other hand, has no counterpart for JSON arrays, JSON objects, or JSON members. Furthermore, JSON nulls, numbers, and strings are not exactly the same as SQL nulls, numbers, and strings. For its part, JSON has no counterpart for SQL datetime data. This “impedance mismatch” is addressed by a set of SQL/JSON items.

SQL/JSON items

JSON data, stored in the form of character or binary strings, can be parsed into SQL/JSON items. An SQL/JSON item can be an

  • SQL/JSON scalar
  • SQL/JSON null
  • SQL/JSON array
  • SQL/JSON object

SQL/JSON scalar

An SQL/JSON scalar is a non-null value of any of the following SQL types:

  • Character string, using Unicode characters
  • Numeric
  • Boolean
  • Datetime

SQL/JSON null

The SQL/JSON null differs from the SQL null in that it has a value distinct from any value of any SQL type, including the null type.

SQL/JSON array

An SQL/JSON array is an ordered list or zero or more SQL/JSON items. These items are called SQL/JSON elements of the SQL/JSON array. Array elements are separated by commas and enclosed in square brackets. For example:

[ 49.95, 67.60, "Swedish cabbage", false, "assertion"]

SQL/JSON object

A SQL/JSON object is an unordered collection of zero or more SQL/JSON members, where a member is a pair whose first value is a character string from the Unicode character set and whose second value is an SQL/JSON item. The first value of an SQL/JSON member is called the key, and the second value is called the bound value. Members are often called key/value pairs. They may also be called name/value pairs. SQL/JSON objects can be serialized by separating the members with commas and enclosing the entire object in curly braces. For example:

{ "name" : "Santa", "vehicle" : "sleigh", "home" : "North Pole" }

SQL/JSON sequences

An SQL/JSON sequence is an ordered list of zero or more SQL/JSON items. Such a sequence can be viewed as a container of zero or more SQL/JSON items.

Parsing JSON

The parsing operation is the importing of data in some external data format into the SQL/JSON data model. In most cases the import will be in the form of a Unicode character string, although other implementation-dependent formats are possible.

Serializing JSON

Serializing JSON is the opposite of parsing JSON. It is the exporting of values from the SQL/JSON data model back into the format of the external storage device. One restriction on this operation is that SQL/JSON datetimes cannot be serialized, and another is that SQL/JSON sequences of length greater than one also may not be serialized.

SQL/JSON Functions

Built-in functions perform the operations on JSON data. There are two types of SQL/JSON functions, query functions and constructor functions. Query functions evaluate SQL/JSON path language expressions against JSON values, producing corresponding values of SQL/JSON types, which are then converted to SQL types. I discuss SQL/JSON path language briefly later in this chapter.

Constructor functions use values of SQL types to produce JSON values, either JSON objects or JSON arrays. These JSON objects are represented as either SQL character strings or binary strings.

Query functions

There are several query functions, all of which share a common syntax. They all require a path expression, the JSON value to be input to that path expression for querying and processing, and possibly optional parameter values passed to the path expression.

The syntax is

<JSON API common syntax> ::=

<JSON context item> <comma>

<JSON path specification>

[ AS <JSON table path name> ]

[ <JSON passing clause> ]

<JSON context item> ::= <JSON value expression>

<JSON path specification> ::=

<character string literal>

<JSON passing clause> ::=

PASSING <JSON argument>

[ { <comma> <JSON argument> } ]

<JSON argument> ::=

<JSON value expression> AS <identifier>

The value expression contained in the preceding <JSON context item> is of the string type.

JSON value expression

As noted in the previous Backus-Naur Form (BNF) syntax definition, a JSON context item is just a JSON value expression. A JSON value expression is defined as

<JSON value expression> ::=

<value expression> [ <JSON input clause> ]

<JSON input clause> ::= FORMAT <JSON representation>

<JSON representation> ::=

JSON [ ENCODING { UTF8 | UTF16 | UTF32 } ]

| Implementation-defined JSON representation option>

You see from this that a JSON value expression is a value expression with an optional input clause. The input clause specifies the format of the JSON representation, which specifies the encoding as either UTF8, UTF16, UTF32, or an implementation-defined representation.

Path expression

Following the JSON context item and a comma is the JSON path specification, which must be a character string literal. The table path name and passing clause are optional parts of the JSON path specification.

Passing clause

The passing clause, as the name implies, passes parameters to the SQL/JSON path expression.

JSON output clause

When JSON data returns to an application because of the operation of a function, the application author can specify the data type, format, and encoding of the JSON text created by the function. The syntax for the output clause is

<JSON output clause> ::=

RETURNING ,data type>

[ FORMAT <JSON representation> ]

<JSON representation> ::=

JSON [ ENCODING { UTF8 | UTF16 | UTF32 } ]

| Implementation-defined JSON representation option>

If FORMAT is not specified, then JSON format is assumed.

The SQL/JSON query functions are JSON_EXISTS, JSON_VALUE, JSON_QUERY, and JSON_TABLE. These functions evaluate path language expressions against JSON values. The results returned are values of SQL/JSON types, which are then converted to SQL types. Path language is described later in this chapter.

JSON_EXISTS

JSON_EXISTS determines whether a JSON value satisfies a search condition in the path specification. The syntax is

<JSON exists predicate> ::=

JSON_EXISTS <left paren>

<JSON API common syntax>

[ <JSON exists error behavior> ON ERROR ]

<right paren>

<JSON exists error behavior> ::=

TRUE | FALSE | UNKNOWN | ERROR

If the optional ON ERROR clause is not included, the default assumption is FALSE ON ERROR. JSON_EXISTS evaluates the SQL/JSON path expression, returning a TRUE result if the path expression finds one or more SQL/JSON items.

Sample data that can be used to learn how to use the query functions, including JSON_EXISTS, can be found on pages 24 and 25 of Section 6 of the SQL Technical Report ISO/IEC TR 19075-6:2017(E), which can be downloaded from

https://standards.iso.org/ittf/PubliclyAvailableStandards/c067367_ISO_IEC_TR_19075-6_2017.zip

The data consists of two columns, K and J of a table T. K is the primary key of the table, and J is the data. The table consists of key-value pairs or arrays of key-value pairs. JASON_EXISTS tests for the existence of a given character string literal in the JSON path expression. For example

SELECT T.K

FROM T

WHERE JSON_EXISTS (T.J, 'lax $.where') ;

The primary keys of the rows that contain the word 'where' are returned as the result set of the SELECT query. The keyword 'lax' refers to the error handling that is more forgiving than 'strict' error handling. It has no effect on the result of the query. $ is the accessor that accesses the word 'where' in the current JSON object.

JSON_VALUE

The JSON_VALUE function extracts a SQL scalar value from a JSON value. The syntax is

<JSON value function> ::=

JSON VALUE <left paren>

<JSON API common syntax>

[ <JSON returning clause> ]

[ <JSON value empty behavior> ON EMPTY ]

[ <JSON value error behavior ON ERROR ]

<right paren>

<JSON returning clause> ::= RETURNING <data type>

<JSON value empty behavior> ::=

ERROR

| NULL

| DEFAULT <value expression>

<JSON value error behavior> ::=

ERROR

| NULL

| DEFAULT <value expression>

As you can probably surmise, <JSON value empty behavior> tells what to return if the result of the SQL/JSON path expression is empty.

  • NULL ON EMPTY means the result of JSON_VALUE is empty.
  • ERROR ON EMPTY means an exception is raised.
  • DEFAULT <value expression> ON EMPTY means that the value expression is evaluated and cast to the target type.

<JSON value error behavior> is similar. It specifies what to do if there is an unhandled error.

In the preceding JSON_EXISTS example, all the rows where the keyword 'where' was present in the J value column were returned. With JSON_VALUE, the value associated with a target keyword is returned. Using the same data set as the JSON_EXISTS example, where the keyword 'who' is paired with a person’s name, the following SQL code will return the names of people from all the rows where the keyword 'who' is present.

SELECT T.K,

JSON_VALUE (T.J, 'lax $.who') AS Who

FROM T ;

The result set will contain a column named K, containing the primary keys of the rows being returned, and a column named Who, containing the names that were paired with the 'who' keyword in the source data.

By default, JSON_VALUE returns an implementation-defined character string data type. The user can specify other types with a RETURNING clause.

JSON_QUERY

Although JSON_VALUE does a fine job of extracting a scalar from an SQL/JSON value, it is unable to extract an SQL/JSON array or an SQL/JSON object from an SQL/JSON value. JSON_QUERY performs those functions. The syntax for JSON_QUERY is

<JSON query> ::=

JSON_QUERY <left paren>

<JSON API common syntax>

[ <JSON output clause> ]

[ <JSON query wrapper behavior> ]

[ <JSON query quotes behavior> QUOTES

[ ON SCALAR STRING ] ]

[ <JSON query empty behavior> ON EMPTY ]

[ <JSON query error behavior> ON ERROR ]

<right paren>

The ON EMPTY and ON ERROR clauses are similar to the ones in JSON_VALUE, and are handled the same way. The difference is that the user can specify behavior when either the empty case or the error case arises.

  • If <JSON output clause> is not specified, RETURNING JSON FORMAT is the default.
  • If <JSON query empty behavior> is not specified, then NULL ON EMPTY is the default.
  • If <JSON query error behavior> is not specified, then NULL ON ERROR is the default.
  • If <JSON query wrapper behavior> is not specified, then WITHOUT ARRAY is the default.
  • If <JSON query wrapper behavior> specifies WITH and if neither CONDITIONAL nor UNCONDITIONAL is specified, then UNCONDITIONAL is the default.
  • If the value of the <JSON context item> simply contained in the <JSON API common syntax> is the null value, then the result of <JSON query> is the null value.

Using the same sample data used for the JSON_EXISTS sample query and the JSON_VALUE sample query, you can add array data to the result set, along with the results obtained with the JSON_VALUE clauses.

SELECT T.K,

JSON_VALUE (T.J, 'lax $.who') AS Who,

JSON_VALUE (T.J, 'lax $.where' NULL ON EMPTY)

AS Nali,

JSON_QUERY (T.J, 'lax $.friends') AS Friends

FROM T

WHERE JSON_EXISTS (T.J, 'lax $.friends')

The WHERE JSON_EXISTS clause eliminates any rows that do not have a key-value pair for friends. If WITH ARRAY WRAPPER is specified, then array elements returned are enclosed in a pair of square brackets.

JSON TABLE

The JSON_TABLE function is complicated, much more so than the other functions previously covered. It takes JSON data as an input and generates a relational output table from that data. The syntax definition for the simplest variant of the JSON_TABLE function takes up a full page. Nested paths and plan clauses add even more complexity to what is already pretty hard to comprehend. There is not enough room here to cover JSON_TABLE to the depth that it deserves. Instead, I refer you to page 35 and following of the SQL Technical Report ISO/IEC TR 19075-6 2017(E), which you can download from

https://standards.iso.org/ittf/PubliclyAvailableStandards/c067367_ISO_IEC_TR_19075-6_2017.zip

Constructor functions

The SQL/JSON constructor functions construct JSON objects, arrays, and aggregates, based on information stored in relational tables. This is essentially the reverse of the operations performed by the SQL/JSON query functions.

JSON_OBJECT

The JSON_OBJECT function constructs JSON objects from explicit name/value pairs. The syntax is

<JSON object constructor> ::=

JSON_OBJECT <left paren>

[ <JSON name and value> [ { <comma>

<JSON name and value> } … ]

[ <JSON constructor null clause> ]

[ <JSON key uniqueness constraint> ] ]

[ <JSON output clause> ]

<right paren>

<JSON name and value> ::=

[KEY] <JSON name> VALUE <JSON value expression>

| <JSON name> <colon> <JSON value expression>

<JSON name> ::= <character value expression>

<JSON constructor null clause> ::=

NULL ON NULL

| ABSENT ON NULL

<JSON key uniqueness constraint> ::=

WITH UNIQUE [ KEYS ]

| WITHOUT UNIQUE [ KEYS ]

Some rules go along with this syntax:

  • <JSON name> may not be NULL.
  • <JSON value expression> may not be NULL.
  • The <JSON constructor null clause>, if NULL ON NULL, produces a SQL/JSON null. If ABSENT ON NULL, it omits the key-value pair from the resulting SQL/JSON object.
  • If no JSON constructor null clause is present, the default is NULL ON NULL.

JSON_OBJECTAGG

An application developer may want to construct a JSON object from an aggregation of the data in a relational table. If, for example, a table contains two columns, one with JSON names and the other with JSON values, the JSON_OBJECTAGG function can act on that data to create a JSON object. The syntax to perform this operation is

<JSON object aggregate constructor> ::=

JSON_OBJECTAGG <left paren>

<JSON name and value>

[ <JSON constructor null clause> ]

[ <JSON key uniqueness constraint> ]

[ <JSON output clause> ]

<right paren>

If <json constructor null clause> is absent, NULL ON NULL is the default.

JSON_ARRAY

To create a JSON array, based on a list of data items in a relational database table, use the JSON_ARRAY function. The syntax is

<JSON array constructor> ::=

<JSON array constructor by enumeration>

| <JSON array constructor by query>

<JSON array constructor by enumeration ::=

JSON_ARRAY <left paren>

[ <JSON value expression> [ { <comma>

<JSON value expression> }… ]

<JSON constructor null clause> ] ]

<JSON output clause>

<right paren>

<JSON array constructor by query> ::=

JSON_ARRAY <left paren>

<query expression>

[ <JSON input clause> ]

[ <JSON constructor null clause> ]

[ <JSON output clause> ]

<right paren>

JSON_ARRAY has two variants, one that produces its result from an input list of SQL values, and the other that produces its results from a query expression invoked from within the function. If the optional JSON constructor null clause is absent, the default is ABSENT ON NULL, which is the opposite of the default behavior for JSON_OBJECT.

JSON_ARRAYAGG

You see in the section on JSON_OBJECTAGG how you could construct a JSON object based on an aggregation of relational data. In the same way, you can construct a JSON array based on an aggregation of relational data. To do so, you use the JSON_ARRAYAGG constructor function. The syntax is

<JSON array aggregate constructor> ::=

JSON_ARRAYAGG <left paren>

<JSON value expression>

[ <JSON array aggregate order by clause> ]

[ <JSON constructor null clause> ]

[ <JSON output clause> ]

<right paren>

<JSON array aggregate order by clause> ::=

ORDER BY <sort specification list>

If there is no JSON constructor null clause, the default is ABSENT ON NULL. The JSON array ORDER BY clause enables the developer to order output array elements according to one or more sort specifications, similar to the way an ORDER BY clause operates in an ordinary SQL statement.

IS JSON predicate

The IS JSON predicate tests whether a string purported to be JSON data is indeed valid JSON data. The syntax of the IS JSON predicate is

<JSON predicate> ::=

<string value expression> [ <JSON input clause> ]

IS [NOT] JSON

[ <JSON predicate type constraint> ]

[ <JSON key uniqueness constraint> ]

<JSON predicate type constraint> ::=

VALUE

| ARRAY

| OBJECT

| SCALAR

If the optional JSON input clause is not specified, then FORMAT JSON is the default. If the JSON key uniqueness constraint is not specified, then WITHOUT UNIQUE KEYS is the default.

JSON nulls and SQL nulls

JSON nulls are not the same as SQL nulls. In SQL, a zero-length string ("") is distinct from an SQL null value, which represents the absence of a definite value. In JSON, null is an actual value, and is represented by a JSON literal ("null"). JSON nulls must be distinguishable from SQL nulls. The developer must decide whether SQL null values are included in the JSON object or array being constructed, or whether they should be omitted from the object or array.

SQL/JSON Path Language

SQL/JSON path language is a query language used by the SQL/JSON query functions. It accepts a context item, a path specification, and a PASSING clause as inputs, potentially along with ON ERROR and other clauses, to execute the JSON_EXISTS, JSON_VALUE, JSON_QUERY, and JSON_TABLE functions. A path engine executes these functions and returns results to the function, which in turn passes the results on to the user.

In path language, the dollar sign ($) is the current context element, and the period (.) is an object member. Square brackets enclose array elements. From this you see that

  • $.name denotes the value of the name attribute of the current JSON object.
  • $.phones[last] denotes the last element of the array stored in the phones attribute of the current JSON object.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.16.29.209