Chapter 1

Using XML with SQL

IN THIS CHAPTER

Bullet Using XML to bridge communication gaps

Bullet Becoming familiar with XML and XML document parts

Bullet Working with SQL’s XML data type

Bullet Mapping SQL to XML

Bullet Operating on XML data with SQL functions

XML stands for Extensible Markup Language, a general-purpose markup language that, like HTML, is a subset of SGML (Standard Generalized Markup Language). XML’s primary purpose is to serve as a means of sharing information between information systems that could have very different architectures. SQL provides the worldwide standard method for storing data in a highly structured fashion, which enables users to maintain data stores of a wide range of sizes and efficiently extract from those data stores the information they want. XML has risen from a de facto standard to an official standard vehicle for transporting data between incompatible systems, particularly over the Internet. Bringing these two powerful methods together greatly increases the value of both. Now SQL can handle data that doesn’t fit nicely into the strict relational paradigm that was originally defined by Dr. E.F. Codd. (Dr. Codd’s role in the development of SQL is spelled out in Book 1, Chapter 1.) Likewise, XML can efficiently take data from SQL databases or send data to them. The result is information that’s more readily available and easier to share.

Remember XML has come to be a popular means of sharing data over the Internet, particularly over the World Wide Web. Several derivatives of XML designed to carry specific kinds of data are in use. A few examples are RSS, XHTML, MathML, Scalable Vector Graphics, and MusicML.

Introducing XML

The XML language marks up text documents with start and end tags. The tags are in some way descriptive of the meaning of the text that they enclose. Key features are the character data itself, containers called elements, and the attributes of those elements. The data is structured as a tree, with a root element playing host to branch elements, which can in turn give rise to additional branches.

The fundamental unit of XML is a Unicode character. The ability to use both 8-bit and 16-bit versions of Unicode is required by the international XML specification. When characters are combined, they form an XML document. The document consists of one or more entities, each of which holds a portion of the document’s characters.

The XML specification doesn’t specify the names of the elements, the allowable hierarchy, or the meanings of the elements and attributes, as languages like HTML do. XML is much more flexible, leaving the specification of those items to a customizable schema. The XML specification concentrates on specifying what syntax is legal; it’s the schema that supplements the syntax rules with a set of constraints. Such a constraint can restrict element and attribute names, as well as the structure of the containment hierarchy. An element named book, for example, could be restricted to contain no more than ten elements named chapter. A different schema could allow up to 20 chapters in a book.

Following are some of XML’s salient characteristics:

  • It’s readable by both humans and machines.
  • It supports Unicode, so even ideographic languages such as Chinese can be represented.
  • It can represent a variety of common data structures, including records, lists, and trees.
  • It’s self-documenting, meaning that you can tell what it is doing just by looking at it.
  • Its elements have a simple structure and, thus, are easily parsed.
  • It adheres to an international standard.
  • It’s platform-independent.

Knowing the Parts of an XML Document

An XML document contains several parts. I describe them briefly in a moment, but first, check out a value assignment prototype:

<name attribute="value">content</name>

Here’s how you’d use that prototype in describing the components of a popular game, expressed in XML:

<?xml version="1.0" encoding="UTF-8"?>

<game name="chess">

<title>Chess game</title>

<gameboard quantity="1">board</gameboard>

<whitepiece quantity="1">king</whitepiece>

<whitepiece quantity="1">queen</whitepiece>

<whitepiece quantity="2">rook</whitepiece>

<whitepiece quantity="2">bishop</whitepiece>

<whitepiece quantity="2">knight</whitepiece>

<whitepiece quantity="8">pawn</whitepiece>

<blackpiece quantity="1">king</blackpiece>

<blackpiece quantity="1">queen</blackpiece>

<blackpiece quantity="2">rook</blackpiece>

<blackpiece quantity="2">bishop</blackpiece>

<blackpiece quantity="2">knight</blackpiece>

<blackpiece quantity="8">pawn</blackpiece>

<instructions>

<action>Place pieces on their start squares.</action>

<action>Play chess, white moving first.</action>

<action>Play until someone wins or a draw is declared.</action>

<action>Shake hands.</action>

</instructions>

</game>

XML declaration

The first line of an XML document usually is its declaration. The declaration is optional but informative: It states the version of XML that’s being used and may also contain information about character encoding and external objects that the document depends on. An XML declaration looks something like this:

<?xml version "1.0" encoding="UTF-8"?>

UTF indicates that a version of Unicode is being used that employs 1 to 4 bytes to hold a character. For alphabetic languages such as English, 1 byte (UTF-8) is fine. Chinese, for example, would use UTF-16, which uses a minimum of 2 bytes per character.

Elements

After the XML declaration, the rest of an XML document consists of elements. Elements may contain attributes and content, and they may be nested. An element starts with a start tag consisting of a name enclosed in angle brackets and ends with an end tag consisting of the same name preceded by a slash, also enclosed in angle brackets. The element’s content is anything that appears between the tags. Here’s an example of an element:

<action>Place pieces on their start squares.</action>

Nested elements

Elements can be nested inside other elements, as in this example:

<instructions>

<action>Place pieces on their start squares.</action>

<action>Play chess, white moving first.</action>

<action>Play until someone wins or a draw is declared.</action>

<action>Shake hands.</action>

</instructions>

The instructions element contains the four action elements.

The document element

Every XML document must have one (and only one) top-level element serving as the root of the tree structure. This element is called the document element. The XML description of a chess game given in the section titled “Knowing the Parts of an XML Document” is an example of an XML document element.

Empty elements

An element may exist but have no content. In such a case, it consists of nothing but its start and end tags. A couple of alternative syntax possibilities for an empty element are available. Here are the three possible syntaxes, all of which are equivalent:

<nothing></nothing>

<nothing />

<nothing/>

You are probably saying to yourself, “That seems pretty worthless! Why would anybody want to do that?” Being able to set up empty elements is actually a very useful feature. Granted, you probably wouldn’t write an XML element that looked like this:

<nothing></nothing>

But you might write one that looks like this:

<book name = "GAN"

<title> = Great American Novel</title>

<chapter></chapter>

<chapter></chapter>

<chapter></chapter>

<chapter></chapter>

<chapter></chapter>

<chapter></chapter>

</book>

You are at the beginning of writing your first book. You don’t have any content yet, but you know you want it to have six chapters. You can start out with the chapter elements empty and fill them one by one as you write. It’s often a good idea to plan the skeleton of a project before starting right away with the detailed content.

Attributes

An element may or may not have attributes. Attributes are pairs of names and values included in the start tag — after the element name — which give you some information about the element. Attribute values must be enclosed in quotes, either single or double, and each attribute name should appear only once in an element.

<blackpiece quantity="2">rook</blackpiece>

In this example, the blackpiece element has one attribute: quantity, which has a value of 2. The quotes show that 2 is a character that represents the number 2 rather than being the number itself.

Entity references

As I discuss later in this chapter, in the section titled “Mapping identifiers to XML,” XML is considerably more restrictive than SQL in terms of the characters it recognizes. Whereas SQL recognizes a large number of special characters, XML pretty much recognizes only the uppercase and lowercase letters, the integers, and a few punctuation marks. To include special characters in an XML document, you can use entity references. An entity reference is a placeholder that represents an entity — typically, an unusual character. An entity reference consists of an ampersand (&), the reference, and an ending semicolon (;). XML has five predeclared entity references:

  • &amp; (&)
  • &lt; (<)
  • &gt; (>)
  • &apos; ()
  • &quot; ()

Here’s an example of a predeclared XML entity that uses the entity reference for the ampersand:

<company>Smith &amp; Sons, Plumbing, Inc.</company>

When viewed in a web browser, this code displays the following:

Smith & Sons, Plumbing, Inc.

In addition to the five predeclared entity references, you can create additional ones by using the document’s Document Type Definition (DTD) or XML schema. Here’s an example of the declaration of an entity that hasn’t been predeclared:

<?xml version="1.0" encoding=UTF-8"?>

<!DOCTYPE rtm [

<!ENTITY reg "&#xAE;">

<!ENTITY registered-TM "ACME® Fireworks, Inc.">

]>

<rtm>

&registered-TM;

</rtm>

AE is the hexadecimal code for the registered trademark symbol. When displayed in a browser, the rtm document appears as follows:

<rtm> ACME® Fireworks, Inc. </rtm>

Numeric character references

Another way of representing a nonstandard character is with a numeric character reference. This method just uses the decimal or hexadecimal code for a character. For decimal, the code is preceded by a # sign. For hexadecimal, for example, the code is preceded by #x. Here’s a decimal example:

<trademark>ACME&#174; Fireworks, Inc.</trademark>

Here’s a hexadecimal example:

<trademark>ACME&#xAE; Fireworks, Inc.</trademark>

Using XML Schema

XML Schema is one of several XML schema languages that are more powerful and flexible than the DTD used in “Entity references,” earlier in this chapter. A schema is a set of rules that a valid XML document must conform to. XML Schema sets up such a set of rules but goes beyond that basic task to the extent of validating information that adheres to specific data types. It’s particularly well suited to validating document processing software.

An XML schema definition (XSD) is an instance of XML Schema and usually has a file extension of .xsd. Here’s an example of a simple XSD describing a member of the Oregon Lunar Society:

<xs:schema

xmlns:xs=http://www.w3.org/2001/XMLSchema>

<xs:element name="members" type="Members"/>

<xs:complexType name="Members">

<xs:sequence>

<xs:element name="firstname" type="xs:string"/>

<xs:element name="lastname" type="xs:string"/>

<xs:element name="officeheld" type="xs:string"/>

</xs:sequence>

</ComplexType>

</xs:schema>

An XML document that conforms to this schema might look like the following:

<members

xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance

xsi:noNameSpaceSchemaLocation="members.xsd">

<firstname>Bryce</firstname>

<lastname>Thoreau</lastname>

<officeheld>Archivist</officeheld>

</members>

An XSD is written in XML Schema and applies constraints on the elements and attributes that may appear in the XML documents to which it applies. It also defines the relationships among the elements and attributes, as well as the types of data that they may contain. XSDs are used to validate XML documents. With constraints, if data appears in an XML document that the XSD considers to be invalid, an error flag is raised.

Relating SQL to XML

XML, like HTML, is a markup language, which means that it’s not a full-function language such as C++ or Java. It’s not even a data sublanguage such as SQL. Unlike those languages, however, it’s cognizant of the content of the data it transports. Whereas HTML deals only with formatting the text and graphics in a document, XML gives structure to the document’s content. XML itself doesn’t deal with formatting. To do that, you have to augment XML with a style sheet. As it does with HTML, a style sheet applies formatting to an XML document.

SQL and XML provide two ways of structuring data so that you can save it and retrieve selected information from it:

  • SQL is an excellent tool for dealing with numeric and text data that can be categorized by data type and that has a well-defined size. SQL was created as a standard way to maintain and operate on data kept in relational databases.
  • XML is better at dealing with free-form data that can’t be easily categorized. The driving motivations for the creation of XML were to provide a universal standard for transferring data between dissimilar computers and for displaying it on the World Wide Web.

The strengths and goals of SQL and XML are complementary. Each reigns supreme in its own domain and forms alliances with the other to give users the information they want, when they want it, and where they want it.

Using the XML Data Type

SQL:2003 introduced a new data type to SQL: the XML type. This means that conforming implementations can store and operate on XML-formatted data directly without first converting it to XML from one of the other SQL data types.

Although it’s intrinsic to any implementation that supports it, the XML data type (including its subtypes) acts like a user-defined type (UDT). The XML type brings SQL and XML into close contact because it enables applications to perform SQL operations on XML content and XML operations on SQL content. You can include a column of the XML type with columns of any of the other predefined types covered in Book 1, Chapter 6 in a join operation in the WHERE clause of a query. In true relational database fashion, your database management system (DBMS) determines the optimal way to execute the query and then goes out and does it.

When to use the XML type

Whether you should store data in XML format depends on what you plan to do with that data. Here are some instances in which it makes sense to store data in XML format:

  • When you want to store an entire block of data and retrieve the whole block later.
  • When you want to be able to query the whole XML document. Some implementations have expanded the scope of the EXTRACT operator to enable extracting desired content from an XML document.
  • When you need strong typing of data inside SQL statements — meaning you want to severely restrict operations that mix data of different types. Using the XML type guarantees that data values are valid XML values and not just arbitrary text strings.
  • When you want to ensure compatibility with future, as-yet unspecified, storage systems that may not support existing types such as CLOB. (See Book 1, Chapter 6 for more information on CLOB.)
  • To take advantage of future optimizations that support only the XML type.

Here’s an example of how you might use the XML type:

CREATE TABLE CLIENT (

ClientName CHAR (30) NOT NULL,

Address1 CHAR (30),

Address2 CHAR (30),

City CHAR (25),

State CHAR (2),

PostalCode CHAR (10),

Phone CHAR (13),

Fax CHAR (13),

ContactPerson CHAR (30),

Comments XML(SEQUENCE) ) ;

This syntax stores an XML document in the Comments column of the CLIENT table, although not all implementations may support it yet. The document might look something like the following:

<Comments>

<Comment>

<CommentNo>1</CommentNo>

<MessageText>Is VetLab equipped to analyze penguin blood?</MessageText>

<ResponseRequested>Yes</ResponseRequested>

</Comment>

<Comment>

<CommentNo>2</CommentNo>

<MessageText>Thanks for the fast turnaround on the leopard seal sputum sample.</MessageText>

<ResponseRequested>No</ResponseRequested>

</Comment>

</Comments>

When not to use the XML type

On many occasions, it doesn’t make sense to use the XML type. Most data in relational databases today is better off in its current format than it is in XML format. Here are a couple of examples of when not to use the XML type:

  • When the data breaks down naturally into a relational structure with tables, rows, and columns.
  • When you need to do a comparison or sort on a data element, use the data element as a parameter to any scalar, built-in function other than ISNULL and COALESCE.

Mapping SQL to XML

Before you can exchange data between SQL databases and XML documents, the various elements of an SQL database must be translatable (mapped) into equivalent elements of an XML document, and vice versa. This translation needs to happen for several kinds of elements, as described in the following sections.

Mapping character sets to XML

In SQL, the character sets supported are implementation-dependent. This means that IBM’s DB2 may support character sets that aren’t supported by Microsoft’s SQL Server, and SQL Server may support character sets that aren’t supported by Oracle. Although the most common character sets are almost universally supported, using a less-common character set may make it difficult to migrate your database and application from one relational database management system (RDBMS) platform to another.

XML has no compatibility issue with character sets. It supports only one: Unicode. This is a good thing from the point of view of exchanging data between any given SQL implementation and XML. All the RDBMS vendors have to define a mapping between strings of each of their character sets and Unicode, as well as a reverse mapping from Unicode to each of their character sets. Luckily, XML doesn’t also support multiple character sets. If it did, vendors would have a many-to-many problem, requiring many more mappings and reverse mappings.

Mapping identifiers to XML

What XML calls an identifier corresponds to what SQL calls a primary key. XML is much stricter than SQL in the characters it allows in identifiers. Characters that are legal in SQL but illegal in XML must be mapped to something legal before they can become part of an XML document. SQL supports delimited identifiers, which means that all sorts of odd characters (such as %, $, and &) are legal as long as they’re enclosed within double quotes. Such characters aren’t legal in XML, however. Furthermore, XML names that begin with the characters XML in any combination of cases are reserved and, thus, can’t be used with impunity. SQL identifiers that begin with those letters have to be changed.

An agreed-on mapping bridges the identifier gap between SQL and XML. In moving from SQL to XML, all SQL identifiers are converted to Unicode. From there, any SQL identifiers that are also legal XML names are left unchanged. SQL identifier characters that aren’t legal XML names are replaced by a hexadecimal code that takes the form _xNNNN_ or _xNNNNNNNN_, where N represents an uppercase hexadecimal digit. The underscore _ is represented by _x005F_, for example, and the colon is represented by _x003A_. These representations are the codes for the Unicode characters for the underscore and colon. The case in which an SQL identifier starts with the characters x, m, and l is handled by prefixing all such instances with a code in the form _xFFFF_.

Conversion from XML to SQL is much easier. All you need to do is scan the characters of an XML name for a sequence of _xNNNN_ or _xNNNNNNNN_. Whenever you find such a sequence, replace it with the character that the Unicode corresponds to. When you come across _x003A_, for example, replace it with :. If an XML name begins with the characters _xFFFF_, ignore them.

By following these simple rules, you can map an SQL identifier to an XML name and then back to an SQL identifier again. This happy situation, however, doesn’t hold for a mapping from XML name to SQL identifier and back to XML name.

Mapping data types to XML

The SQL standard specifies that an SQL data type be mapped to the closest possible XML schema data type. The designation closest possible means that all values allowed by the SQL type are allowed by the XML schema type, and the fewest possible values not allowed by the SQL type are allowed by the XML schema type. XML facets, such as maxInclusive and minInclusive, can restrict the values allowed by the XML schema type to the values allowed by the corresponding SQL type. (A facet is a single defining aspect of a value space.) If the SQL data type restricts values of the INTEGER type to the range –2157483648<value<2157483647, in XML the maxInclusive value can be set to 2157483647, and the minInclusive value can be set to –2157483648. Here’s an example of such a mapping:

<xsd:simpleType>

<xsd:restriction base="xsd:integer">

<xsd:maxInclusive value="2157483647"/>

<xsd:minInclusive value="-2157483648"/>

<xsd:annotation>

<sqlxml:sqltype name="INTEGER"/>

</xsd:annotation>

</xsd:restriction>

</xsd:simpleType>

The annotation section retains information from the SQL type definition that isn’t used by XML but may be of value later if this document is mapped back to SQL.

Mapping nonpredefined data types to XML

In the SQL standard, the nonpredefined data types include DOMAIN, DISTINCT UDT, ROW, ARRAY, and MULTISET. You can map each of these data types to XML-formatted data by using appropriate XML code. The next few sections show examples of how to map these types.

DOMAIN

To map an SQL domain to XML, first you must have a domain. For this example, create one by using a CREATE DOMAIN statement, as follows:

CREATE DOMAIN WestCoast AS CHAR (2)

CHECK (State IN ('CA', 'OR', 'WA', 'AK')) ;

Now create a table that uses that domain, as follows:

CREATE TABLE WestRegion (

ClientName CHAR (20) NOT NULL,

State WestCoast NOT NULL

) ;

Here’s the XML schema to map the domain to XML:

<xsd:simpleType>

Name="DOMAIN.Sales.WestCoast">

 

<xsd:annotation>

<xsd:appinfo>

<sqlxml:sqltype kind="DOMAIN"

schemaName="Sales"

typeName="WestCoast"

mappedType="CHAR_2"

final="true" />

</xsd:appinfo>

</xsd:annotation>

 

<xsd:restriction base="CHAR_2"/>

 

</xsd:simpleType>

From the appinfo element we see that a domain of type WestCoast exists in the Sales schema, and that the data in the domain is of the CHAR type with a maximum length of 2 characters.

When this mapping is applied, it results in an XML document that contains something like the following:

<WestRegion>

<row>

<ClientName>Nootka Enterprises</ClientName>

<State>AK</State>

</row>

<row>

<ClientName>Surfin' USA</ClientName>

<State>CA</State>

</row>

<row>

<ClientName>Cornelius Semiconductor</ClientName>

<State>OR</State>

</row>

<row>

<ClientName>Orca Inc.</ClientName>

<State>WA</State>

</row>

</WestRegion>

DISTINCT UDT

With a distinct UDT, you can do much the same things that you can do with a domain, but with stronger typing. Start by creating a distinct UDT with the help of a CREATE TYPE statement, like this one:

CREATE TYPE WestCoast AS CHAR (2) FINAL ;

The XML Schema to map this type to XML is as follows:

<xsd:simpleType>

Name="UDT.Sales.WestCoast">

 

<xsd:annotation>

<xsd:appinfo>

<sqlxml:sqltype kind="DISTINCT"

schemaName="Sales"

typeName="WestCoast"

mappedType="CHAR_2"

final="true"/>

<xsd:appinfo>

</xsd:annotation>

<xsd:restriction base="CHAR_2"/>

 

</xsd:simpleType>

This code creates an element that’s the same as the one created for the preceding domain.

ROW

The ROW type enables you to cram a whole row’s worth of information into a single field of a table row. You can create a ROW type in SQL as part of the table definition in the following manner:

CREATE TABLE CONTACTINFO (

Name CHAR (30)

Phone ROW (Home CHAR (13), Work CHAR (13))

) ;

Now you can map this type to XML with the following schema:

<xsd:complexType Name="ROW.1">

 

<xsd:annotation>

<xsd:appinfo>

<sqlxml:sqltype kind="ROW">

<sqlxml:field name="Home"

mappedType="CHAR_13"/>

<sqlxml:field name="Work"

mappedType="CHAR_13"/>

</sqlxml:sqltype>

<xsd:appinfo>

</xsd:annotation>

 

<xsd:sequence>

<xsd:element Name="Home" nillable="true"

Type="CHAR_13"/>

<xsd:element Name="Work" nillable="true"

Type="CHAR_13"/>

</xsd:sequence>

 

</xsd:complexType>

This mapping could generate the following XML for a column:

<Phone>

<Home>(888)555-1111</Home>

<Work>(888)555-1212</Work>

</Phone>

ARRAY

You can put more than one element in a single field by using an ARRAY type rather than the ROW type. In the CONTACTINFO table, for example, declare Phone as an array and then generate the XML Schema that maps the array to XML, as follows:

CREATE TABLE CONTACTINFO (

Name CHAR (30),

Phone CHAR (13) ARRAY [4]

) ;

Now you can map this type to XML with the following schema:

<xsd:complexType Name="ARRAY_4.CHAR_13">

 

<xsd:annotation>

<xsd:appinfo>

<sqlxml:sqltype kind="ARRAY"

maxElements="4"

mappedElementType="CHAR_13"/>

</xsd:appinfo>

</xsd:annotation>

 

<xsd:sequence>

<xsd:element Name="element"

minOccurs="0" maxOccurs="4"

nillable="true" type="CHAR_13"/>

</xsd:sequence>

 

</xsd:complexType>

This code would generate something like this:

<Phone>

<element>(888)555-1111</element>

<element>xsi:nil="true"/>

<element>(888)555-3434</element>

</Phone>

The element in the array containing xsi:nil="true" reflects the fact that the second phone number in the source table contains a null value.

MULTISET

The phone numbers in the preceding example could just as well be stored in a multiset as in an array. To map a multiset, use something akin to the following:

CREATE TABLE CONTACTINFO (

Name CHAR (30),

Phone CHAR (13) MULTISET

) ;

Now you can map this MULTISET type to XML with the following schema:

<xsd:complexType Name="MULTISET.CHAR_13">

 

<xsd:annotation>

<xsd:appinfo>

<sqlxml:sqltype kind="MULTISET"

mappedElementType="CHAR_13"/>

</xsd:appinfo>

</xsd:annotation>

 

<xsd:sequence>

<xsd:element Name="element"

minOccurs="0" maxOccurs="unbounded"

nillable="true" type="CHAR_13"/>

</xsd:sequence>

 

</xsd:complexType>

This code would generate something like the following:

<Phone>

<element>(888)555-1111</element>

<element>xsi:nil="true"/>

<element>(888)555-3434</element>

</Phone>

Mapping tables to XML

You can map a table to an XML document. Similarly, you can map all the tables in a schema or all the tables in a catalog. Privileges are maintained by the mapping. A person who has the SELECT privilege on only some table columns is allowed to map only those columns to the XML document. The mapping actually produces two documents: one containing the data in the table and the other containing the XML schema that describes the first document. Here’s an example of the mapping of an SQL table to an XML data-containing document:

<CUSTOMER>

<row>

<FirstName>Abe</FirstName>

<LastName>Abelson</LastName>

<City>Springfield</City>

<AreaCode>714</AreaCode>

<Telephone>555-1111</Telephone>

</row>

<row>

<FirstName>Bill</FirstName>

<LastName>Bailey</LastName>

<City>Decatur</City>

<AreaCode>714</AreaCode>

<Telephone>555-2222</Telephone>

</row>

.

.

.

</CUSTOMER>

The root element of the XML document has been given the name of the table. Each table row is contained within a <row> element, and each <row> element contains a sequence of column elements, each named after the corresponding column in the source table. Each column element contains a data value.

Handling null values

Because SQL data may include null values, you must decide how to represent them in an XML document. You can represent a null value as nil or absent. If you choose the nil option, the attribute xsi:nil="true" marks the column elements that represent null values. Null values might be represented in the following way:

<row>

<FirstName>Bill</FirstName>

<LastName>Bailey</LastName>

<City xsi:nil="true"/>

<AreaCode>714</AreaCode>

<Telephone>555-2222</Telephone>

</row>

If you choose the absent option, you could implement it as follows:

<row>

<FirstName>Bill</FirstName>

<LastName>Bailey</LastName>

<AreaCode>714</AreaCode>

<Telephone>555-2222</Telephone>

</row>

In this case, the row containing the null value is simply absent. No reference to it appears.

Creating an XML schema for an SQL table

You may remember from the section titled “Mapping tables to XML” that when mapping from SQL to XML, the first document generated is the one that contains the data, and the second document contains the schema information. As an example, consider the schema for the CUSTOMER document shown in that section.

<xsd:schema>

<xsd:simpleType name="CHAR_15">

<xsd:restriction base="xsd:string">

<xsd:length value = "15"/>

</xsd:restriction>

</xsd:simpleType>

 

<xsd:simpleType name="CHAR_25">

<xsd:restriction base="xsd:string">

<xsd:length value = "25"/>

</xsd:restriction>

</xsd:simpleType>

 

<xsd:simpleType name="CHAR_3">

<xsd:restriction base="xsd:string">

<xsd:length value = "3"/>

</xsd:restriction>

</xsd:simpleType>

 

<xsd:simpleType name="CHAR_8">

<xsd:restriction base="xsd:string">

<xsd:length value = "8"/>

</xsd:restriction>

</xsd:simpleType>

 

<xsd:sequence>

<xsd:element name="FirstName" type="CHAR_15"/>

<xsd:element name="LastName" type="CHAR_25"/>

<xsd:element

name="City" type="CHAR_25" nillable="true"/>

<xsd:element

name="AreaCode" type="CHAR_3" nillable="true"/>

<xsd:element

name="Telephone" type="CHAR_8" nillable="true"/>

</xsd:sequence>

 

</xsd:schema>

This schema is appropriate if the nil approach to handling nulls is used. The absent approach requires a slightly different element definition, as in this example:

<xsd:element

name="City" type="CHAR_25" minOccurs="0"/>

This element specifies that the minimum number of occurrences of City is 0. In other words, the City field need not hold a value.

Operating on XML Data with SQL Functions

The SQL standard defines several operators, functions, and pseudofunctions that, when applied to an SQL database, produce an XML result or that, when applied to XML data, produce a result in standard SQL form. The functions include XMLELEMENT, XMLFOREST, XMLCONCAT, and XMLAGG. In the following sections, I provide brief descriptions of these functions, as well as several others that are frequently used when publishing to the web. Some of the functions rely heavily on XQuery, a new standard query language designed specifically for querying XML data. I say more about XQuery in Book 6, Chapter 3.

XMLELEMENT

The XMLELEMENT operator translates a relational value into an XML element. You can use the operator in a SELECT statement to pull data in XML format from an SQL database and publish it on the web. Here’s an example:

SELECT c.LastName

XMLELEMENT ( NAME "City", c.City ) AS "Result"

FROM CUSTOMER c

WHERE LastName="Abelson" ;

Here’s the result returned:

LastName

Result

Abelson

<City>Springfield</City>

XMLFOREST

The XMLFOREST operator produces a list, or forest, of XML elements from a list of relational values. Each of the operator’s arguments produces a new element. Here’s an example of this operator:

SELECT c.LastName

XMLFOREST (c.City,

c.AreaCode,

c.Telephone ) AS "Result"

FROM CUSTOMER c

WHERE LastName="Abelson" OR LastName="Bailey" ;

This code produces the following output:

LastName

Result

Abelson

<City>Springfield</City>

<AreaCode>714</AreaCode>

<Telephone>555-1111</Telephone>

Bailey

<City>Decatur</City>

<AreaCode>714</AreaCode>

<Telephone>555-2222</Telephone>

XMLCONCAT

XMLCONCAT provides an alternative way to produce a forest of elements. It does so by concatenating its XML arguments, as in this example:

SELECT c.LastName,

XMLCONCAT(

XMLELEMENT ( NAME "first", c.FirstName,

XMLELEMENT ( NAME "last", c.LastName)

) AS "Result"

FROM CUSTOMER c ;

This code produces the following result:

LastName

Result

Abelson

<first>Abe</first>

<last>Abelson</last>

Bailey

<first>Bill</first>

<last>Bailey</last>

XMLAGG

XMLAGG, the aggregate function, takes XML documents or fragments of XML documents as input and produces a single XML document as output in GROUP BY queries. The aggregation contains a forest of elements. To illustrate the concept, take a look at the following query:

SELECT XMLELEMENT

( NAME "City",

XMLATTRIBUTES ( c.City AS "name" ) ,

XMLAGG (XMLELEMENT ( NAME "last" c.LastName )

)

) AS "CityList"

FROM CUSTOMER c

GROUP BY City ;

When run against the CUSTOMER table, this query produces the following:

CityList

<City name="Decatur">

<last>Bailey</last>

</City>

<City name="Philo">

<last>Stetson</last>

<last>Stetson</last>

<last>Wood</last>

</City

<City name="Springfield">

<last>Abelson</last>

</City>

XMLCOMMENT

The XMLCOMMENT function enables an application to create an XML comment. Its syntax is as follows:

XMLCOMMENT ( 'comment content'

[RETURNING

{ CONTENT | SEQUENCE } ] )

 

The example

XMLCOMMENT ('Back up database at 2 am every night.')

would create an XML comment that looks like this:

<!--Back up database at 2 am every night. -->

XMLPARSE

The XMLPARSE function performs a nonvalidating parse of a string to produce an XML value. You might use it like this:

XMLPARSE (DOCUMENT ' GREAT JOB! '

PRESERVE WHITESPACE )

The preceding code would produce an XML value that is either XML(UNTYPED DOCUMENT) or XML(ANY DOCUMENT). Which of the two subtypes would be chosen is implementation-defined.

XMLPI

The XMLPI function allows applications to create XML processing instructions. The syntax for this function is as follows:

XMLPI NAME target

[ , string-expression ]

[RETURNING

{ CONTENT | SEQUENCE } ] )

target identifies the target of the processing instruction (PI). string-expression is the content of the PI. This function creates an XML comment of the following form:

<? target string-expression ?>

XMLQUERY

The XMLQUERY function evaluates an XQuery expression (for more about XQuery, see Book 6, Chapter 3) and returns the result to the SQL application. The syntax of XMLQUERY is

XMLQUERY ( XQuery-expression

[ PASSING { By REF | BY VALUE }

argument-list ]

RETURNING { CONTENT | SEQUENCE }

{ BY REF | BY VALUE } )

Here’s an example of the use of XMLQUERY:

SELECT max_average,

XMLQUERY (

'for $batting_average in

/player/batting_average

where /player/lastname = $var1

return $batting_average'

PASSING BY VALUE

'Mantle' AS var1,

RETURNING SEQUENCE BY VALUE )

FROM offensive_stats

This statement returns the batting average for New York Yankees star Mickey Mantle stored in the offensive_stats XML document.

XMLCAST

The XMLCAST function is similar to an ordinary SQL CAST function but has some additional restrictions. XMLCAST enables an application to cast a value from an XML type to another XML type or to an SQL type. Similarly, you can use it to cast a value from an SQL type to an XML type. The restrictions are as follows:

  • At least one of the types involved — either the source type or the destination type — must be an XML type.
  • Neither of the types involved may be an SQL collection type, row type, structured type, or reference type.
  • Only the values of one of the XML types or the SQL null type may be cast to XML(UNTYPED DOCUMENT) or to XML(ANY DOCUMENT).

Here’s an example:

XMLCAST ( CLIENT.ClientName AS XML(UNTYPED CONTENT)

The XMLCAST function is transformed into an ordinary SQL CAST. The only reason to use a separate keyword is to enforce the preceding restrictions.

Working with XML Predicates

Predicates return a value of TRUE or FALSE. Some new predicates have been added that specifically relate to XML.

DOCUMENT

The purpose of the DOCUMENT predicate is to determine whether an XML value is an XML document. It tests whether an XML value is an instance of either XML(ANY DOCUMENT) or XML(UNTYPED DOCUMENT). The syntax is as follows:

XML-value IS [NOT]

[ANY | UNTYPED] DOCUMENT

If the expression evaluates to true, the predicate returns a TRUE value; otherwise, it returns a FALSE value unless the XML value is a null value, in which case it returns an UNKNOWN value. If you don’t specify either ANY or UNTYPED, the default assumption is ANY.

CONTENT

You would use the CONTENT predicate to determine whether an XML value is an instance of XML(ANY CONTENT) or XML(UNTYPED CONTENT). The syntax is

XML-value IS [NOT]

[ANY | UNTYPED] CONTENT

As is the case with the DOCUMENT predicate, if you don’t specify either ANY or UNTYPED, ANY is the default.

XMLEXISTS

As the name implies, you can use this predicate to determine whether a value exists. Here’s the syntax:

XMLEXISTS ( XQuery-expression

[ argument-list ])

The XQuery expression is evaluated, using the values provided in the argument list. If the value queried by the XQuery expression is the SQL NULL value, the predicate’s result is UNKNOWN. If the evaluation returns an empty XQuery sequence, the predicate’s result is FALSE; otherwise, the result is TRUE. You can use this predicate to determine whether an XML document contains some particular content before using a portion of that content in an expression.

VALID

The VALID predicate is used to evaluate an XML value to see whether it’s valid in the context of a registered XML schema. The syntax of the VALID predicate is more complex than is the case for most predicates, as you see in this example:

xml-value IS [NOT] VALID

[XML valid identity constraint option]

[XML valid according-to clause]

This predicate checks whether the XML value is one of the five XML types: XML(SEQUENCE), XML(ANY CONTENT), XML(UNTYPED CONTENT), XML(ANY DOCUMENT), or XML(UNTYPED DOCUMENT). Additionally, it may check whether the validity of the XML value depends on identity constraints and whether the value is valid with respect to a particular XML schema (the validity target).

There are four possibilities for the identify-constraint-option component of the syntax:

  • WITHOUT IDENTITY CONSTRAINTS
  • WITH IDENTITY CONSTRAINTS GLOBAL
  • WITH IDENTITY CONSTRAINTS LOCAL
  • DOCUMENT

If the identify-constraint-option syntax component isn’t specified, WITHOUT IDENTITY CONSTRAINTS is assumed. If DOCUMENT is specified, it acts like a combination of the DOCUMENT predicate and the VALID predicate WITH IDENTITY CONSTRAINTS GLOBAL.

WITH IDENTITY CONSTRAINTS GLOBAL means that the value is checked not only against the XML schema, but also against the XML rules for ID/IDREF relationships. ID and IDREF are XML attribute types that identify elements of a document.

WITH IDENTITY CONSTRAINTS LOCAL means that the value is checked against the XML schema but not against the XML rules for ID/IDREF or the XML schema rules for identify constraints.

The XML valid according-to clause identifies the schema that the value will be validated against.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.145.42.94