Extensible Markup Language (XML) has become the standard mechanism for sharing data between applications. This chapter will explore how XML documents may be stored in an Oracle database, how the data within an XML document can be extracted and stored in relational tables, and how an XML document can be constructed from data in relational tables.
XML is a close cousin to HTML, but while HTML is primarily concerned with formatting and displaying data, XML is concerned with the data itself. Unlike HTML, with XML you may create your own tags, which is why XML is perfectly suited for describing data. The easiest way to understand XML is through an example, so here is an XML document that represents a customer purchase order:
<?xml version="1.0"?> <purchase_order> <customer_name>Alpha Technologies</customer_name> <po_number>11257</po_number> <po_date>2004-01-20</po_date> <po_items> <item> <part_number>AI5-4557</part_number> <quantity>20</quantity> </item> <item> <part_number>EI-T5-001</part_number> <quantity>12</quantity> </item> </po_items> </purchase_order>
The first line defines that the text that follows is an XML document
that adheres to Version 1.0 of the XML specification. The second line
is the root node
of the document
and has the tag <purchase_order>
. A valid
XML document must contain exactly one root node. Other nodes of the
document can appear more than once as necessary to describe the data,
as illustrated by the multiple <item>
tags
under the <po_items>
tag. This document
describes that Alpha Technologies issued P.O. #11257 on January 20,
2004, in which 20 units of part number AI5-4557 and 12 units of
EI-T5-001 were requested. Since this document will be the basis for
every example in this chapter, it would be worthwhile to become
comfortable with it before forging ahead.
Because this book is about SQL, the focus of this chapter is how to utilize SQL to interact with XML documents. Thus, the XML examples used in this chapter are rather simple so as not to needlessly complicate things. If you are interested in delving deeper into XML, here are a few excellent resources:
The World Wide Web Consortium (W3C) site, useful for history of the XML specification, as well as the specification itself
Portal for everything XML, including tutorials, FAQs, white papers, news, etc.
An excellent reference guide for XML and related technologies
Oracle first began adding support for XML in the Oracle8i Database release. This support, which consisted largely of XML parser toolkits for Java, PL/SQL, and C/C++, allowed users to manipulate XML data but did not offer any native support for XML within the database kernel. The Oracle 9i Database releases raised the bar significantly by adding a new data type called XMLType for storing XML documents in the database and by creating a mechanism for organizing, accessing, and versioning XML documents called XML Repository. Oracle branded this set of technologies as Oracle XML DB. Additionally, XML data can be loaded and unloaded from a database using Oracle’s import/export utilities, read from external files using SQL*Loader, and published via Advanced Queuing, making it clear that Oracle has made the integration of XML technologies a high priority over the past few releases.
The XMLType data type, first introduced in Oracle9i Database, allows an XML document to be stored in a table. XMLType is actually an object type, so you have your choice of creating a column of type XMLType within a a table or creating an object table (i.e., CREATE TABLE purchase_order OF xmltype). Since we may want to store additional data about a purchase order along with the XML document itself, it might be best to create a table that contains a unique identifier, several a ttribute columns, and the XML document:
CREATE TABLE purchase_order (po_id NUMBER(5) NOT NULL, customer_po_nbr VARCHAR2(20), customer_inception_date DATE, order_nbr NUMBER(5), purchase_order_doc XMLTYPE, CONSTRAINT purchase_order_pk PRIMARY KEY (po_id) );
By default, the purchase_order_doc
column will be
stored as a
CLOB
(Character Large Object). Later in the chapter, you will see how the
XML document can be stored as a set of objects by defining an XML
Schema for your XML documents, but we’ll keep it
simple now and move on to the more complicated case later in this
chapter.
The XMLType object type includes constructors that accept many different data types, including VARCHAR2, CLOB, BFILE, and REF CURSOR. For example, here is the definition for the constructor used in the remainder of the chapter:
FINAL CONSTRUCTOR FUNCTION XMLTYPE RETURNS SELF AS RESULT Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- XMLDATA BINARY FILE LOB IN CSID NUMBER IN SCHEMA VARCHAR2 IN DEFAULT VALIDATED NUMBER IN DEFAULT WELLFORMED NUMBER IN DEFAULT
As you can see, the first two parameters are required, and the next
three parameters are optional. The following PL/SQL block uses this
constructor without the optional parameters to instantiate an XMLType
object and insert it into the purchase_order
table:
/* create directory to point to where XML docs are stored */ CREATE DIRECTORY xml_data AS 'c:\alan\OReilly\2nd_Edition'; DECLARE bfl BFILE; BEGIN /* attach XML document purch_ord.xml to bfile locator */ bfl := BFILENAME('XML_DATA', 'purch_ord.xml'), /* add to purchase_order table */ INSERT INTO purchase_order (po_id, purchase_order_doc) VALUES (1000, XMLTYPE(bfl, nls_charset_id('WE8MSWIN1252'))); COMMIT; END;
The purchase_order
table now contains a record
with the contents of the purch_ord.xml file
stored as a CLOB. At this point, the file is assumed to contain a
valid XML document, but the contents have not been checked for
validity (more on this later). If you would like to see the contents
of the document, you can simply select the XMLType column:
SELECT po.purchase_order_doc
FROM purchase_order po;
PURCHASE_ORDER_DOC --------------------------------------------------------------------- <?xml version="1.0"?> <purchase_order> <customer_name>Alpha Technologies</customer_name> <po_number>11257</po_number> <po_date>2004-01-20</po_date> <po_items> <item> <part_number>AI5-4557</part_number> <quantity>20</quantity> </item> <item> <part_number>EI-T5-001</part_number> <quantity>12</quantity> </item> </po_items> </purchase_order>
Now that the XML document has been stored in the
purchase_order
table, what should you do with it?
Unless your intent is to simply store the document for safe-keeping,
you will probably want to at least inspect the data inside the
document, if not extract that data for storage in relational tables.
The XMLType object contains numerous methods to help with this
effort.
Before you can begin inspecting XML documents, you will need to have a method for identifying different parts of a document. Oracle has adopted the use of XPath expressions for this purpose. XPath is a W3C recommendation used for walking a tree of nodes. Before describing how to build an XPath expression, it might be helpful to view the purchase order document as a tree, as shown in Figure 16-1.
The root node has four children, one of which
(po_items
) has two children of its own. Each of
these child nodes has two children nodes as well. XPath specifies a
notation for describing a specific node or nodes in the tree, as
shown in Table 16-1.
Element |
Meaning |
/ |
Used to separate nodes or to denote the root node if first character of expression. |
// |
Used to denote all children of a given node. |
* |
Wildcard character. |
[] |
Used to identify a specific child if a node has more than one child
(i.e., |
Using this notation, the customer_name
node in the
purchase_order
tree would be represented as
/purchase_order/customer_name
, and the second item
node would be represented as
/purchase_order/po_items/item[2]
. To find all
item
nodes in the purchase order document, you
could specify /purchase_order//item
or
/purchase_order/*/item
. Along with finding nodes
based on tag names, you may also search for nodes based on values,
such as /purchase_order[po_number=11257]/po_items
,
which returns the po_items
for the purchase order
having a po_number
of 11257. Now that you have a
way of describing nodes in the XML document, you can begin extracting
data.
The
extract( )
member function takes an XMLType
instance and an XPath expression and returns an XMLType instance. The
XMLType instance returned by the function represents some fragment of
the original as resolved by the XPath expression, and it does not
need to be a valid XML document. If the XPath expression does not
resolve to a node of the XML document, a NULL is returned.
Here’s a simple example:
SELECT extract(po.purchase_order_doc,
'/purchase_order/customer_name') xml_fragment
FROM purchase_order po
WHERE po.po_id = 1000;
XML_FRAGMENT --------------------------------------------------- <customer_name>Alpha Technologies</customer_name>
The XML fragment being returned is a perfectly valid, albeit brief, XML document consisting of just a root node. Here’s another example that returns all of the purchase order items:
SELECT extract(po.purchase_order_doc,
'/purchase_order//item') xml_fragment
FROM purchase_order po
WHERE po.po_id = 1000;
XML_FRAGMENT --------------------------------------------------- <item> <part_number>AI5-4557</part_number> <quantity>20</quantity> </item> <item> <part_number>EI-T5-001</part_number> <quantity>12</quantity> </item>
The XML fragment returned from this query is not a valid XML
document, since it contains two root nodes. If you are interested in
retrieving only the first item, you can specify this using
[1]
in your XPath expression, as demonstrated by
the following:
SELECT extract(po.purchase_order_doc,
'/purchase_order/po_items/item[1]') xml_fragment
FROM purchase_order po
WHERE po.po_id = 1000;
XML_FRAGMENT -------------------------------------------------- <item> <part_number>AI5-4557</part_number> <quantity>20</quantity> </item>
The extractValue(
)
member function is similar to
the extract( )
member function, except that it
returns a string (VARCHAR2) instead of an instance of XMLType. For
example, if you wanted to extract the customer name without the
enclosing tags, you could do the following:
SELECT extractValue(po.purchase_order_doc,
'/purchase_order/customer_name') cust_name
FROM purchase_order po
WHERE po.po_id = 1000;
CUST_NAME -------------------------------------------------- Alpha Technologies
If you know that the value being returned is a number or date, you
can wrap the call to extractValue( )
with
to_number
or to_date
as needed.
Keep in mind that you cannot substitute extractValue(
)
for extract( )
in every situation; to
use extractValue( )
, the node resolved from the
XPath expression must be a single child node. For example, the
following statement generates an error because the node resolved by
the XPath expression has child nodes beneath it:
SELECT extractValue(po.purchase_order_doc, '/purchase_order/po_items') cust_name FROM purchase_order po WHERE po.po_id = 1000; * ERROR at line 3: ORA-19025: EXTRACTVALUE returns value of only one node
If you would like to determine whether a specific node exists in your
document, you can use the
existsNode( )
member function. existsNode( )
takes an XMLType
instance and an XPath expression and returns 1 if one or more nodes
are found and 0 otherwise. This method is most often used in the
WHERE clause of a query, although it is also useful in the SELECT
clause (generally within a CASE expression) or the FROM clause
(generally within an inline view). The following example uses
existsNode( )
in the WHERE clause to ensure that
the specified node exists:
SELECT extract(po.purchase_order_doc,
'/purchase_order/customer_name') xml_fragment
FROM purchase_order po
WHERE po.po_id = 1000
AND 1 = existsNode(po.purchase_order_doc,
'/purchase_order/customer_name'),
XML_FRAGMENT ---------------------------------------------------- <customer_name>Alpha Technologies</customer_name>
The next example uses existsNode( )
in the SELECT
clause to determine how many line items are in the purchase order:
SELECT CASE
WHEN 1 = existsNode(po.purchase_order_doc,
'/purchase_order/po_items/item[6]') THEN '>5'
WHEN 1 = existsNode(po.purchase_order_doc,
'/purchase_order/po_items/item[5]') THEN '5'
WHEN 1 = existsNode(po.purchase_order_doc,
'/purchase_order/po_items/item[4]') THEN '4'
WHEN 1 = existsNode(po.purchase_order_doc,
'/purchase_order/po_items/item[3]') THEN '3'
WHEN 1 = existsNode(po.purchase_order_doc,
'/purchase_order/po_items/item[2]') THEN '2'
WHEN 1 = existsNode(po.purchase_order_doc,
'/purchase_order/po_items/item[1]') THEN '1'
END num
FROM purchase_order po
WHERE po.po_id = 1000;
NUM -- 2
Keep in mind that existsNode( )
will return 1 if a
node exists, regardless of whether or not the node has a value.
Let’s say you received the following purchase order:
<?xml version="1.0"?>
<purchase_order>
<customer_name>Alpha Technologies</customer_name>
<po_number></po_number>
<po_date>2004-01-20</po_date>
<po_items>
<item>
<part_number>AI5-4557</part_number>
<quantity>20</quantity>
</item>
<item>
<part_number>EI-T5-001</part_number>
<quantity>12</quantity>
</item>
</po_items>
</purchase_order>
If you search for the existence of the
<po_number>
node, the existsNode(
)
function will return 1, even though no value has been
supplied for this node. You will need to use the
extractValue( )
function to determine whether a
valid PO number has been provided.
Now that you know how to
navigate through and extract fragments from
your XML documents, you can build DML statements to move data from
your XML documents to relational tables. The
purchase_order
table created earlier contains two
columns that must be populated from data in the purchase order
document. The next statement shows how this data can be populated via
a single UPDATE statement:
UPDATE purchase_order po SET po.customer_po_nbr = extractvalue(po.purchase_order_doc, '/purchase_order/po_number'), po.customer_inception_date = to_date(extractvalue(po.purchase_order_doc, '/purchase_order/po_date'),'YYYY-MM-DD'), po.order_nbr = 7101 WHERE po.po_id = 1000;
For this set of examples, we’ve arbitrarily chosen
to use 7101 as the primary key value for the
cust_order
table, for the purchase order
we’re working with. Normally, this value would be
generated via a sequence.
Now that the purchase_order
table has been
completed, the next step is to generate data for the
cust_order
and line_item
tables
so that the customer’s order can begin processing.
Here is the INSERT statement for the cust_order
table:
INSERT INTO cust_order (order_nbr, cust_nbr, sales_emp_id, order_dt, expected_ship_dt, status) SELECT 7101, (SELECT c.cust_nbr FROM customer c WHERE c.name = ext.cust_name), 0, SYSDATE, TRUNC(SYSDATE + 7), 'NEW' FROM (SELECT extractValue(po.purchase_order_doc, '/purchase_order/customer_name') cust_name FROM purchase_order po WHERE po.po_id = 1000) ext; 1 row created.
The previous statement is fairly straightforward, in that it extracts
the value of the customer_name
node from the
purchase order and uses the value to look up the appropriate
cust_nbr
value in the customer
table. The statement for inserting the two line items, however, is a
bit more complex because there are multiple item nodes, each with two
child nodes underneath them: part_number
and
quantity
. Before you can interact with the data in
a SQL statement, you need to transform it from a single XML fragment
containing two item
nodes to a table of
item
nodes containing two rows. Fortunately,
Oracle has included a built-in function called xmlSequence(
)
specifically for this task. First, here are the two line
items returned as a single XMLType instance using the
extract( )
function:
SELECT extract(po.purchase_order_doc,
'/purchase_order//item')
FROM purchase_order po
WHERE po.po_id = 1000;
EXTRACT(PO.PURCHASE_ORDER_DOC,'/PURCHASE_ORDER//ITEM') ---------------------------------------------------------- <item> <part_number>AI5-4557</part_number> <quantity>20</quantity> </item> <item> <part_number>EI-T5-001</part_number> <quantity>12</quantity> </item> 1 row selected.
The next example uses xmlSequence( )
to generate a
varray of two items from the XML fragment returned by the
extract( )
method. You can then query the output
of the xmlSequence( )
function by wrapping it in a
TABLE expression and placing it in the FROM clause of a containing
query:
SELECT itm.*
FROM TABLE(SELECT xmlSequence(extract(po.purchase_order_doc,
'/purchase_order//item'))
FROM purchase_order po
WHERE po.po_id = 1000) itm;
COLUMN_VALUE ----------------------------------------------------------- <item> <part_number>AI5-4557</part_number> <quantity>20</quantity> </item> <item> <part_number>EI-T5-001</part_number> <quantity>12</quantity> </item> 2 rows selected.
The result set now consists of two rows, each containing a single
XMLType instance. You can then use this query to insert data into the
line_item
table by using the
extractValue( )
method to extract the text from
the part_number
and quantity
nodes:
INSERT INTO line_item (order_nbr, part_nbr, qty) SELECT 7101, extractValue(itm.column_value, '/item/part_number'), extractvalue(itm.column_value, '/item/quantity') FROM TABLE(SELECT xmlSequence(extract(po.purchase_order_doc, '/purchase_order//item')) FROM purchase_order po WHERE po.po_id = 1000) itm; 2 rows created.
All of the previous examples have one thing in common: the user understands how the XML document is organized, but Oracle just stores the document as a CLOB without any understanding of what the document contains. You have the option, however, of providing Oracle with a roadmap of the documents that will be stored in an XMLtype column by defining an XML Schema. Once a schema has been defined, the Oracle server can check the validity of the XML documents being stored, as well as add additional functionality to many of the member functions of the XMLType object type.
Defining a schema for your documents allows you to specify, among other things, what elements are required and what elements are optional, in what order your document elements must appear, and min/max and default values for attributes. Here’s a simple schema for the purchase order documents:
<?xml version="1.0" encoding="UTF-8"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" version="1.0" elementFormDefault="unqualified"> <xs:element name="purchase_order"> <xs:complexType> <xs:sequence> <xs:element name="customer_name" type="xs:string"/> <xs:element name="po_number" type="xs:string"/> <xs:element name="po_date" type="xs:date"/> <xs:element name="po_items"> <xs:complexType> <xs:sequence> <xs:element name="item" maxOccurs="9999"> <xs:complexType> <xs:sequence> <xs:element name="part_number" type="xs:string"/> <xs:element name="quantity" type="xs:integer"/> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>
Without going into great detail on how to construct XML Schema definitions, here’s a brief description of the above schema:
The first element in the document is
purchase_order
.
The purchase_order
element contains an ordered set
of elements: customer_name
,
po_number
, po_date
, and
po_items
.
The po_items
element can contain up to 9,999
item
elements.
The item
element contains an ordered set of two
elements: part_number
and
quantity
.
This schema just brushes the surface of what can be done with XML Schema, but it is sufficient to illustrate how a schema is used by Oracle. If you would like to explore the full power of XML Schema, you might consider picking up XML Schema by Eric van der Vlist (O’Reilly).
Now that you have defined
a schema for your purchase order documents, you need to register it
with Oracle before you can assign it to XMLType columns. To register
the schema, you will need to call the
dbms_xmlschema.registerSchema(
)
built-in procedure, which
requires an identifying URL and the schema definition. The schema
definition, which is stored in the
purch_ord.xsd
file in the XML_DATA directory created
earlier, will be loaded into a binary file variable and passed in as
the second parameter to registerSchema
( ):
DECLARE bfl BFILE; BEGIN /* attach XSD document to bfile locator */ bfl := BFILENAME('XML_DATA', 'purch_ord.xsd'), /* register schema */ dbms_xmlschema.registerSchema( 'http://localhost:8080/home/xml/schemas/purch_ord.xsd', bfl); END;
The registerSchema( )
procedure reads the schema
definition and creates whatever database objects it deems appropriate
for storing the data defined in the schema definition. Although it is
beyond the scope of this book, you may annotate your schema
definition to tell Oracle how to store the data and what to call the
database objects. Because we didn’t annotate the
purch_ord.xsd file, Oracle created one table
(purchase_order165_tab
), three object types
(purchase_order161_t
,
po_items162_t
, and item163_t
),
and a collection type (item164_coll
) to store the
purchase order data. Keep in mind that these are Oracle-generated
names, so your results will vary.
Now that the Oracle server is aware
of
your
schema, you can assign it to a column. The following DDL statement
creates the purchase_order2
table and, at the end
of the statement, specifies that the schema associated with the URL
http://localhost:8080/home/xml/schemas/purch_ord.xsd
is to be applied to the purchase_order_doc
column:
CREATE TABLE purchase_order2 (po_id NUMBER(5) NOT NULL, customer_po_nbr VARCHAR2(20), customer_inception_date DATE, order_nbr NUMBER(5), purchase_order_doc XMLTYPE, CONSTRAINT purchase_order2_pk PRIMARY KEY (po_id) ) XMLTYPE COLUMN purchase_order_doc XMLSCHEMA "http://localhost:8080/home/xml/schemas/purch_ord.xsd" ELEMENT "purchase_order";
Whenever a document is added to the
purchase_order2
table, the data from the XML
document will be extracted and stored in the table that was generated
when you registered your schema using the registerSchema(
)
procedure. Thus, the actual XML document will not be
stored in the database, so keep this in mind if your business rules
require that the document be stored intact. If you would like to
assign a schema to your XMLType column, but you want to store the XML
documents intact, you can specify that the documents be stored in a
CLOB. The next example demonstrates how this is done by specifying
the STORE AS CLOB phrase before naming the schema URL:
CREATE TABLE purchase_order3
(po_id NUMBER(5) NOT NULL,
customer_po_nbr VARCHAR2(20),
customer_inception_date DATE,
order_nbr NUMBER(5),
purchase_order_doc XMLTYPE,
CONSTRAINT purchase_order3_pk PRIMARY KEY (po_id)
)
XMLTYPE COLUMN purchase_order_doc
STORE AS CLOB
XMLSCHEMA "http://localhost:8080/home/xml/schemas/purch_ord.xsd"
ELEMENT "purchase_order";
To insert data into the schema-based table
purchase_order2
, you will use the same mechanism
as you did for the non-schema-based purchase_order
table. However, you will first need to alter the root node of your
document to include the schema URL:
<purchase_order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://localhost:8080/home/xml/schemas/purch_ord.xsd">
With this change in place, you can insert your document as before:
DECLARE bfl BFILE; BEGIN /* attach XML document to bfile locator */ bfl := BFILENAME('XML_DATA', 'purch_ord.xml'), /* add to purchase_order2 table */ INSERT INTO purchase_order2 (po_id, purchase_order_doc) VALUES (2000, XMLTYPE(bfl, nls_charset_id('WE8MSWIN1252'))); COMMIT; END;
If the document matches the schema definition, then the data will be extracted from the document and stored. At this point, the document has been partially validated against the schema, meaning that Oracle has checked that all mandatory elements are present and that no undefined elements are present. If you need to fully validate your documents against the schema definition, you will need to call one of several XMLType member functions demonstrated in the next section.
To illustrate what happens if the document does not match the schema
definition, assume we change the root node from
<purchase_order>
to
<customer_invoice>
and try to insert the
document:
DECLARE
bfl BFILE;
BEGIN
/* attach XML document to bfile locator */
bfl := BFILENAME('XML_DATA', 'purch_ord.xml'),
/* add to purchase_order2 table */
INSERT INTO purchase_order2 (po_id, purchase_order_doc)
VALUES (2001, XMLTYPE(bfl, nls_charset_id('WE8MSWIN1252')));
COMMIT;
END;
/
DECLARE * ERROR at line 1: ORA-31043: Element 'customer_invoice' not globally defined in schema 'http://localhost:8080/home/xml/schemas/purch_ord.xsd'
Upon changing the root node back to
<purchase_order>
but adding a new child node
called <comments>
, we would see the
following error:
DECLARE
bfl BFILE;
BEGIN
/* attach XML document to bfile locator */
bfl := BFILENAME('XML_DATA', 'purch_ord.xml'),
/* add to purchase_order2 table */
INSERT INTO purchase_order2 (po_id, purchase_order_doc)
VALUES (2001, XMLTYPE(bfl, nls_charset_id('WE8MSWIN1252')));
COMMIT;
END;
/
DECLARE * ERROR at line 1: ORA-30937: No schema definition for 'comments' (namespace '##local') in parent 'purchase_order'
Depending on the needs of your application, you can make your schema fairly simple, like the purchase order schema used here, or you can make your schema much more restrictive. If you find yourself writing code to check the validity of an XML document, you might be better off creating a robust schema definition and letting Oracle do the work for you.
If you are checking your XML documents against a schema, then you will be able to make use of several member functions of the XMLType object type. If you want to check to see if your documents are based on a schema, and get the schema’s URL, you could do the following:
SELECT CASE WHEN 1 = po.purchase_order_doc.isSchemaBased( )
THEN po.purchase_order_doc.getSchemaURL( )
ELSE 'No Schema Defined'
END schema_name
FROM purchase_order2 po
WHERE po.po_id = 2000;
SCHEMA_NAME ------------------------------------------------------- http://localhost:8080/home/xml/schemas/purch_ord.xsd
This query uses two of XMLType’s member functions:
the isSchemaBased( )
function returns 1 if the
XMLType column has been assigned a schema, and the
getSchemaURL( )
function returns the
schema’s URL.
If you want to check to see if a particular XMLType instance has been
fully validated, you can use the isSchemaValidated(
)
member function:
SELECT CASE WHEN 1 = po.purchase_order_doc.isSchemaValidated( )
THEN 'VALIDATED'
ELSE 'NOT VALIDATED' END status
FROM purchase_order2 po
WHERE po.po_id = 2000;
STATUS ------------- NOT VALIDATED
Since the document has not been fully validated, you have your choice
of calling the member function isSchemaValid( )
to
check for validity without changing the document’s
status, or calling the member procedure schemaValidate(
)
to check for validity and change the status.
Here’s an example of calling the
isSchemaValid( )
function:
SELECT CASE WHEN 1 = po.purchase_order_doc.isSchemaValid( )
THEN 'VALID'
ELSE 'NOT VALID' END validity
FROM purchase_order2 po
WHERE po.po_id = 2000;
VALIDITY --------- VALID
Finally, here’s an example of calling the member
procedure schemaValidate( )
:
DECLARE doc XMLTYPE; BEGIN SELECT po.purchase_order_doc INTO doc FROM purchase_order2 po WHERE po.po_id = 2000; doc.schemaValidate( ); END;
Remember, isSchemaValid( )
returns the status
without changing it, whereas schemaValidate( )
potentially changes the status without returning it.
If you want to modify the contents of an XML document, you will need to replace the XMLType instance stored in the table with another instance. If you would like to replace the entire document, you can simply generate a new XMLType instance and replace the existing one:
UPDATE purchase_order po SET po.purchase_order_doc = XMLTYPE(BFILENAME('XML_DATA', 'other_purch_ord.xml'), nls_charset_id('WE8MSWIN1252')) WHERE po.po_id = 2000;
However, if your intent is to modify the existing document, you can
do so using the member function updateXML( )
,
which uses a find/replace mechanism to alter the document content and
returns an XMLType instance. To specify the content to be replaced,
you need to specify an XPath expression. For example, the following
UPDATE statement replaces the value of the
customer_name
node with “Wallace
Industries”:
SELECT extract(po.purchase_order_doc,
'/purchase_order/customer_name') xml_fragment
FROM purchase_order po
WHERE po_id = 1000;
XML_FRAGMENT ------------------------------------------------- <customer_name>Alpha Technologies</customer_name>UPDATE purchase_order po
SET po.purchase_order_doc =
updateXML(po.purchase_order_doc,
'/purchase_order/customer_name/text( )', 'Wallace Industries')
WHERE po.po_id = 1000;
SELECT extract(po.purchase_order_doc,
'/purchase_order/customer_name') xml_fragment
FROM purchase_order po
WHERE po_id = 1000;
XML_FRAGMENT --------------------------------------------------- <customer_name>Wallace Industries</customer_name>
The XPath expression used to update the customer name in this example
includes a call to the text( )
function, which
causes the value of the text node to be returned instead of the
entire node. Thus, an XPATH expression of
/purchase_order/customer_name
resolves to the XML
fragment <customer_name>Alpha
Technologies</customer_name>
, whereas the XPATH
expression /purchase_order/customer_name/text( )
resolves to the string “Alpha
Technologies”. You can accomplish the same text
substitution without the use of the text( )
function, but your third parameter to the updateXML(
)
function would need to be an XML fragment rather than a
string:
UPDATE purchase_order po SET po.purchase_order_doc = updateXML(po.purchase_order_doc, '/purchase_order/customer_name', XMLTYPE('<customer_name>Wallace Industries</customer_name>')) WHERE po.po_id = 1000;
If you need to make multiple changes to your document, you can specify multiple find/replace pairs, as in the following example:
UPDATE purchase_order po SET po.purchase_order_doc = updateXML(po.purchase_order_doc, '/purchase_order/customer_name/text( )', 'Wallace Industries', '/purchase_order/po_number/text( )', '11359') WHERE po.po_id = 1000;
Along with replacing individual text attributes, you can also use
updateXML( )
to replace an item in a collection.
To do so, you can use one of the XMLType constructors to generate an
XML fragment and then substitute the fragment into the document. For
example, the following example replaces the entire second line item
of the purchase order:
SELECT extract(po.purchase_order_doc,
'/purchase_order//item') xml_fragment
FROM purchase_order po
WHERE po.po_id = 1000;
XML_FRAGMENT ----------------------------------------- <item> <part_number>AI5-4557</part_number> <quantity>20</quantity> </item> <item> <part_number>EI-T5-001</part_number> <quantity>12</quantity> </item>UPDATE purchase_order po
SET po.purchase_order_doc =
updateXML(po.purchase_order_doc,
'/purchase_order/po_items/item[2]',
XMLTYPE('<item>
<part_number>TZ50828</part_number>
<quantity>12</quantity>
</item>'))
WHERE po.po_id = 1000;
SELECT extract(po.purchase_order_doc,
'/purchase_order//item') xml_fragment
FROM purchase_order po
WHERE po.po_id = 1000;
XML_FRAGMENT ----------------------------------------- <item> <part_number>AI5-4557</part_number> <quantity>20</quantity> </item> <item> <part_number>TZ50828</part_number> <quantity>12</quantity> </item>
The first sections
of this chapter focused on how to
store, inspect, validate, and modify an XML document. While this is
all well and good if someone has provided you with an XML document,
what should you do if you need to construct an XML document from data
in your database? For this purpose, Oracle has included a set of
built-in SQL functions that conform to the emerging SQL/XML standard.
After describing the functions individually, we will demonstrate how
these functions can be used to generate a purchase order document
from the data stored in the customer
,
cust_order
, and line_item
tables.
The XMLElement( )
function is tasked with generating
XML elements. It takes as arguments an element name, a value, and an
optional set of attributes. The following example generates an XML
fragment consisting of data from the supplier table:
SELECT XMLElement("supplier_id", s.supplier_id) ||
XMLElement("name", s.name) xml_fragment
FROM supplier s;
XML_FRAGMENT ------------------------------------------------------------- <supplier_id>1</supplier_id><name>Acme Industries</name> <supplier_id>2</supplier_id><name>Tilton Enterprises</name> <supplier_id>3</supplier_id><name>Eastern Importers</name>
Calls to XMLElement( )
can be nested to facilitate
parent/child elements. The next example builds on the previous
example by wrapping the two supplier elements in a parent element
called <supplier>
:
SELECT XMLElement("supplier",
XMLElement("supplier_id", s.supplier_id) || XMLElement("name", s.name)
) xml_fragment
FROM supplier s;
XML_FRAGMENT -------------------------------------------------------------------------- <supplier> <supplier_id>1</supplier_id><name>Acme Industries</name> </supplier> <supplier> <supplier_id>2</supplier_id><name>Tilton Enterprises</name> </supplier> <supplier> <supplier_id>3</supplier_id><name>Eastern Importers</name> </supplier>
Please note that the three rows of output from the previous example would normally print as three lines; we have taken the liberty of adding line feeds and tabs for readability and will do so for the rest of the examples.
The XMLAgg( )
function groups together sets of
elements so that they can all be children to the same parent. This
function is useful for working with collections or for wrapping a set
of elements under a parent element. The next example builds on the
previous example by wrapping the supplier elements in a root element
called <supplier_list>
:
SELECT XMLElement("supplier_list",
XMLAgg(XMLElement("supplier",
XMLElement("supplier_id", s.supplier_id) || XMLElement("name", s.name)
))) xml_document
FROM supplier s;
XML_DOCUMENT -------------------------------------------------------------------------- <supplier_list> <supplier> <supplier_id>1</supplier_id><name>Acme Industries</name> </supplier> <supplier> <supplier_id>2</supplier_id><name>Tilton Enterprises</name> </supplier> <supplier> <supplier_id>3</supplier_id><name>Eastern Importers</name> </supplier> </supplier_list>
To spice things up a bit, the next example adds the
part
table to the FROM clause and adds a list of
parts supplied by each supplier:
SELECT XMLElement("supplier_list",
XMLAgg(XMLElement("supplier",
XMLElement("supplier_id", s.supplier_id) || XMLElement("name", s.name)
|| XMLElement("part_list", XMLAgg(XMLElement("part",
XMLElement("part_nbr", p.part_nbr) || XMLElement("name", p.name))))
))) xml_document
FROM supplier s INNER JOIN part p
ON s.supplier_id = p.supplier_id
GROUP BY s.supplier_id, s.name;
XML_DOCUMENT ---------------------------------------------------------------- <supplier_list> <supplier> <supplier_id>1</supplier_id><name>Acme Industries</name> <part_list> <part><part_nbr>AI5-4557</part_nbr><name>Acme Part AI5-4557</name></part> </part_list> </supplier> <supplier> <supplier_id>2</supplier_id><name>Tilton Enterprises</name> <part_list> <part><part_nbr>TZ50828</part_nbr><name>Tilton Part TZ50828</name><part> </part_list> </supplier> <supplier> <supplier_id>3</supplier_id><name>Eastern Importers</name> <part_list> <part><part_nbr>EI-T5-001</part_nbr><name>Eastern Part EI-T5-001</name></part> </part_list> </supplier> </supplier_list>
The previous examples in this section have used concatenation
(||
) to append sibling elements together. The
XMLForest( )
function takes a list of values,
generates elements for each one, and concatenates them together for
you:
SELECT XMLElement("supplier",
XMLForest(s.supplier_id, s.name)
) xml_fragment
FROM supplier s;
XML_FRAGMENT ------------------------------------------------------------------------ <supplier> <SUPPLIER_ID>1</SUPPLIER_ID><NAME>Acme Industries</NAME> </supplier> <supplier> <SUPPLIER_ID>2</SUPPLIER_ID><NAME>Tilton Enterprises</NAME> </supplier> <supplier> <SUPPLIER_ID>3</SUPPLIER_ID><NAME>Eastern Importers</NAME> </supplier>
If you want to specify your own element names, you can optionally use the AS clause, as demonstrated by the following:
SELECT XMLElement("supplier",
XMLForest(s.supplier_id AS "sup_id", s.name AS "sup_name")
) xml_fragment
FROM supplier s;
XML_FRAGMENT -------------------------------------------------------------------------- <supplier> <sup_id>1</sup_id><sup_name>Acme Industries</sup_name> </supplier> <supplier> <sup_id>2</sup_id><sup_name>Tilton Enterprises</sup_name> </supplier> <supplier> <sup_id>3</sup_id><sup_name>Eastern Importers</sup_name> </supplier>
Earlier in the chapter, you saw how data from the purchase order
document could be extracted and stored in various tables. Using the
three built-in SQL functions described above, we will demonstrate how
to recreate the XML document from the
purchase_order
, cust_order
,
line_item
, and customer
tables.
We’ll start by generating the root element and basic
purchase order data from the purchase_order
,
cust_order
, and customer
tables:
SELECT XMLElement("purchase_order",
XMLForest(c.name AS "customer_name",
po.customer_po_nbr AS "po_number",
TO_CHAR(po.customer_inception_date, 'YYYY-MM-DD')
AS "po_date")
) purchase_order
FROM purchase_order po INNER JOIN cust_order co
ON po.order_nbr = co.order_nbr
INNER JOIN customer c ON co.cust_nbr = c.cust_nbr
WHERE po.po_id = 1000;
PURCHASE_ORDER ------------------------------------------------------------------- <purchase_order> <customer_name>Alpha Technologies</customer_name> <po_number>11257</po_number> <po_date>2004-01-20</po_date> </purchase_order>
Next, we will aggregate the line item data from the
line_item
table:
SELECT XMLElement("purchase_order",
XMLForest(c.name AS "customer_name",
po.customer_po_nbr AS "po_number",
TO_CHAR(po.customer_inception_date, 'YYYY-MM-DD') AS "po_date",
XMLAgg(XMLElement("item",
XMLForest(li.part_nbr AS "part_number", li.qty AS "quantity")))
AS "po_items")
) purchase_order
FROM purchase_order po INNER JOIN cust_order co
ON po.order_nbr = co.order_nbr
INNER JOIN customer c ON co.cust_nbr = c.cust_nbr
INNER JOIN line_item li ON co.order_nbr = li.order_nbr
WHERE po.po_id = 1000
GROUP BY po.customer_po_nbr, po.customer_inception_date, c.name;
PURCHASE_ORDER -------------------------------------------------------------------- <purchase_order> <customer_name>Alpha Technologies</customer_name> <po_number>11257</po_number> <po_date>2004-01-20</po_date> <po_items> <item> <part_number>AI5-4557</part_number> <quantity>20</quantity> </item> <item> <part_number>EI-T5-001</part_number> <quantity>12</quantity> </item> </po_items> </purchase_order>
Hopefully, this looks familiar. Although the output is satisfactory, we find the number of joins and the GROUP BY clause to be a bit excessive, so the next version of the query uses subqueries to retrieve the customer name and line items:
SELECT XMLElement("purchase_order",
XMLForest((SELECT c.name
FROM customer c INNER JOIN cust_order co ON c.cust_nbr = co.cust_nbr
WHERE co.order_nbr = po.order_nbr) AS "customer_name",
po.customer_po_nbr AS "po_number",
TO_CHAR(po.customer_inception_date, 'YYYY-MM-DD') AS "po_date",
(SELECT XMLAgg(XMLElement("item",
XMLForest(li.part_nbr AS "part_number", li.qty AS "quantity")))
FROM line_item li
WHERE li.order_nbr = po.order_nbr) AS "po_items")
) purchase_order
FROM purchase_order po
WHERE po.po_id = 1000;
PURCHASE_ORDER ----------------------------------------------------------------------- <purchase_order> <customer_name>Alpha Technologies</customer_name> <po_number>11257</po_number> <po_date>2004-01-20</po_date> <po_items> <item> <part_number>AI5-4557</part_number> <quantity>20</quantity> </item> <item> <part_number>EI-T5-001</part_number> <quantity>12</quantity> </item> </po_items> </purchase_order>
This chapter has covered quite a bit of ground as far as how SQL and XML interact, including the storage, validation, modification, and generation of XML documents, yet it has barely scratched the surface as far as XML is concerned. There is much more to XML than has been demonstrated here, and far more to XML Schema than was utilized in the schema definition for the purchase order documents. Additionally, there are a large number of related technologies, such as Document Object Model (DOM) and Extensible Stylesheet Language Transformations (XSLT), that have not been covered in this chapter, but which are available in Oracle XML DB. It is impossible to cover all of this territory in a single chapter, so we urge you to utilize the resources mentioned earlier in the chapter if you are new to XML.
52.15.245.1