A Disconnected XML Data Reader

By design, a data reader object works while connected, and so do any XML readers you might build on top of it. However, the .NET Framework provides a class that can expose a disconnected set of rows (a DataSet object) as XML. The DataSet object is designed as a disconnected object with no relationship to any living instance of a DBMS. The XmlDataDocument class takes a DataSet object and exposes it as an XML DOM object, that is, as an instance of the XmlDocument class we analyzed in Chapter 5. In a nutshell, the XmlDataDocument class provides a client-side XML DOM representation of a disconnected set of rows. Let’s see how.

The XmlDataDocument Class

The XmlDataDocument class inherits from XmlDocument, and although it is defined in the System.Data assembly, it belongs to the System.Xml namespace. Used together, the XmlDataDocument class and the DataSet class provide access to the same data through two otherwise alternative approaches: relational and hierarchical. When a DataSet object and an XmlDataDocument object are synchronized, they work on the same set of data and detect each other’s changes in real time.

The XmlDataDocument class has a DataSet property that is bound to the related DataSet object. The class does not duplicate the DataSet contents but simply holds a reference to the object. When the DataSet property is set, the XmlDataDocument registers a listener module for each DataSet event that indicates a change in the data. By hooking the events, the XmlDataDocument class can stay in sync with the DataSet contents.

Event hooking also works the other way around. In Chapter 5, we saw that whenever an application changes the contents of the XML DOM, a NodeChanged event fires. The XmlDataDocument class registers an event handler for NodeChanged and passes the changes down to the referenced DataSet object.
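The two-way synchronization described above can be sketched as follows. This is a minimal illustration, assuming a .NET Framework project that references System.Data.dll and System.Xml.dll; the SyncDemo class, the Company/Employee layout, and the sample names are hypothetical and not part of the original text. Note that the DataSet object's EnforceConstraints property must be set to false before the data is edited through the DOM.

```csharp
using System;
using System.Data;
using System.Xml;

public static class SyncDemo
{
    // Hypothetical sample data: one Employee row in a DataSet named Company.
    public static DataSet BuildDataSet()
    {
        DataSet data = new DataSet("Company");
        DataTable table = data.Tables.Add("Employee");
        table.Columns.Add("ID", typeof(int));
        table.Columns.Add("LastName", typeof(string));
        table.Rows.Add(1, "Davolio");
        return data;
    }

    // Change a row through the DataSet, then read it back through the DOM.
    public static string EditRowReadXml(string newName)
    {
        DataSet data = BuildDataSet();
        XmlDataDocument dataDoc = new XmlDataDocument(data);
        data.Tables["Employee"].Rows[0]["LastName"] = newName;
        XmlNode node = dataDoc.SelectSingleNode("/Company/Employee/LastName");
        return node.InnerText;
    }

    // Change a text node through the DOM, then read it back through the DataSet.
    public static string EditXmlReadRow(string newName)
    {
        DataSet data = BuildDataSet();
        XmlDataDocument dataDoc = new XmlDataDocument(data);
        XmlNode node = dataDoc.SelectSingleNode("/Company/Employee/LastName");
        data.EnforceConstraints = false;  // required before editing via the DOM
        node.FirstChild.Value = newName;
        data.EnforceConstraints = true;
        return (string)data.Tables["Employee"].Rows[0]["LastName"];
    }

    public static void Main()
    {
        Console.WriteLine(EditRowReadXml("Fuller"));
        Console.WriteLine(EditXmlReadRow("Leverling"));
    }
}
```

Each helper builds its own DataSet object because a DataSet object can be associated with only one XmlDataDocument object at a time.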

Synchronizing with a DataSet Object

You can synchronize a DataSet object with an XmlDataDocument object in various ways. For example, you can start by populating a DataSet object with schema and data and then pass it on to a new XmlDataDocument object, as shown here:

DataSet data = new DataSet();
// Populate the DataSet with schema and data
XmlDataDocument dataDoc = new XmlDataDocument(data);

In this case, the XML DOM object is created from the relational data. Alternatively, you can set up the DataSet object with schema only, associate it with the XmlDataDocument class, and then populate the XML DOM object with XML data, as shown in the following code. In this way, the DataSet object is filled with hierarchical data.

DataSet data = new DataSet();
// Populate the DataSet only with schema information
XmlDataDocument dataDoc = new XmlDataDocument(data);
dataDoc.Load(xmlfile);

Note that an exception is thrown if you attempt to load an XmlDataDocument object synchronized with a DataSet object that contains data.

You can take a third route. You can instantiate and load an XmlDataDocument object and then extract the corresponding DataSet object from it, as shown here:

XmlDataDocument dataDoc = new XmlDataDocument();
DataSet data = dataDoc.DataSet;
// Add schema information to the DataSet
dataDoc.Load(xmlfile);

In this case, no DataSet object is explicitly passed in by the user. The default constructor creates an empty DataSet object anyway, and that object is filled when the XmlDataDocument object is loaded. A client application can get a reference to the internal DataSet object by using the DataSet property.

An important issue to consider is that the DataSet object can’t be filled if no schema information has been set. You can manually create tables and columns in the DataSet object or read the information from an XML stream using the ReadXmlSchema method. (More on this topic in Chapter 9.)
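As a hedged illustration of the schema-first approach, the following sketch creates the tables and columns manually, associates the DataSet object with an XmlDataDocument object, and only then loads the XML data. The SchemaFirstDemo class, the sample document, and the element names are assumptions made for the example; the XML is loaded from a string through a StringReader instead of a file.

```csharp
using System;
using System.Data;
using System.IO;
using System.Xml;

public static class SchemaFirstDemo
{
    // Hypothetical sample document whose elements match the schema below.
    public const string SampleXml =
        "<Company>" +
        "<Employee><ID>1</ID><LastName>Davolio</LastName></Employee>" +
        "<Employee><ID>2</ID><LastName>Fuller</LastName></Employee>" +
        "</Company>";

    public static DataSet LoadThroughDocument(string xml)
    {
        // Schema only: the DataSet and table names mirror the XML element names.
        DataSet data = new DataSet("Company");
        DataTable table = data.Tables.Add("Employee");
        table.Columns.Add("ID", typeof(int));
        table.Columns.Add("LastName", typeof(string));

        // The DataSet holds no rows yet, so loading the document is allowed.
        XmlDataDocument dataDoc = new XmlDataDocument(data);
        dataDoc.Load(new StringReader(xml));
        return data;
    }

    public static void Main()
    {
        DataSet data = LoadThroughDocument(SampleXml);
        foreach (DataRow row in data.Tables["Employee"].Rows)
            Console.WriteLine("{0}: {1}", row["ID"], row["LastName"]);
    }
}
```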

XML Data Fidelity

To fill a DataSet object with XML data, you can use one of two methods. The first method is to use the DataSet object’s ReadXml method (see Chapter 9). The second method is to load the data as XML into an instance of the XmlDataDocument class, and then use the XmlDataDocument.DataSet property to get the filled DataSet object. The two approaches differ significantly in terms of data fidelity.

When ReadXml is used and the data is written back as XML, all extra XML information such as white space, processing instructions, and CDATA sections is irreversibly lost. This happens because the DataSet relational format simply does not know how to handle information that is meaningful only to the hierarchical model.

When the DataSet object is filled using an XML document loaded into XmlDataDocument, the DataSet object still contains a simplified and adapted representation of the hierarchical contents but the original XML document is preserved intact.
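The difference in fidelity can be observed with a small experiment. The sketch below is only an illustration under stated assumptions: the FidelityDemo class, the sample document, and the XML comment it contains are hypothetical. The same XML is round-tripped once through ReadXml and GetXml, and once through an XmlDataDocument object, and each result is checked for the comment.

```csharp
using System;
using System.Data;
using System.IO;
using System.Xml;

public static class FidelityDemo
{
    // Hypothetical sample document containing a comment.
    public const string SampleXml =
        "<Company><!-- staff list -->" +
        "<Employee><ID>1</ID><LastName>Davolio</LastName></Employee>" +
        "</Company>";

    static DataSet CreateSchema()
    {
        DataSet data = new DataSet("Company");
        DataTable table = data.Tables.Add("Employee");
        table.Columns.Add("ID", typeof(int));
        table.Columns.Add("LastName", typeof(string));
        return data;
    }

    // Route 1: ReadXml keeps only what fits the relational model.
    public static string RoundTripViaReadXml(string xml)
    {
        DataSet data = CreateSchema();
        data.ReadXml(new StringReader(xml));
        return data.GetXml();
    }

    // Route 2: the XmlDataDocument keeps the original DOM alongside the DataSet.
    public static string RoundTripViaDocument(string xml)
    {
        XmlDataDocument dataDoc = new XmlDataDocument(CreateSchema());
        dataDoc.Load(new StringReader(xml));
        return dataDoc.OuterXml;
    }

    public static void Main()
    {
        Console.WriteLine(RoundTripViaReadXml(SampleXml).Contains("staff list"));
        Console.WriteLine(RoundTripViaDocument(SampleXml).Contains("staff list"));
    }
}
```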

Nested Data Relations

If the DataSet object to be synchronized with an XmlDataDocument object contains one or more relations (instances of the DataRelation class), you should set the Nested property of each DataRelation object to true. In this way, the child rows of the relation will be nested within the element of their parent row when written as XML data or synchronized with an XmlDataDocument object. By default, the Nested property of the DataRelation object is false.
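The effect of the Nested property is easiest to see in the XML the DataSet object writes out. The following sketch uses a hypothetical Customer/Order pair (the NestedDemo class and all table and column names are assumptions) and inspects the output of GetXml through a plain XmlDocument object, counting the Order elements that appear inside a Customer element.

```csharp
using System;
using System.Data;
using System.Xml;

public static class NestedDemo
{
    // Builds a hypothetical Customer/Order DataSet and returns its XML.
    public static string BuildXml(bool nested)
    {
        DataSet data = new DataSet("Company");

        DataTable customers = data.Tables.Add("Customer");
        customers.Columns.Add("ID", typeof(int));
        customers.Columns.Add("Name", typeof(string));

        DataTable orders = data.Tables.Add("Order");
        orders.Columns.Add("OrderID", typeof(int));
        orders.Columns.Add("CustomerID", typeof(int));

        DataRelation relation = data.Relations.Add(
            "CustomerOrders", customers.Columns["ID"], orders.Columns["CustomerID"]);
        relation.Nested = nested;  // false by default

        customers.Rows.Add(1, "Contoso");
        orders.Rows.Add(100, 1);
        return data.GetXml();
    }

    // Counts Order elements that are children of a Customer element.
    public static int CountNestedOrders(bool nested)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(BuildXml(nested));
        return doc.SelectNodes("/Company/Customer/Order").Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountNestedOrders(true));
        Console.WriteLine(CountNestedOrders(false));
    }
}
```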

Reading Data as XML

Representing a DataSet object with an instance of the XmlDataDocument class allows you to use XPath expressions to select data. In general, using XPath queries to select XML data makes the most sense if you have XML DOM data disconnected and stored in memory, that is, if you use XmlDataDocument. In doing so, you actually work on an XML DOM object and don’t in any way tax the database. Pay attention when using this technique in Microsoft ASP.NET applications. In this case, the client lives on the Web server, and you end up occupying the Web server’s memory, with a potential impact on overall performance and scalability.
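A minimal sketch of this technique follows, assuming a hypothetical Company/Employee DataSet; the XPathDemo class and its sample rows are not from the original text. The XPath query runs against the in-memory DOM, and the GetRowFromElement method maps each selected element back to its DataRow object.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Xml;

public static class XPathDemo
{
    // Hypothetical sample data.
    public static DataSet BuildDataSet()
    {
        DataSet data = new DataSet("Company");
        DataTable table = data.Tables.Add("Employee");
        table.Columns.Add("ID", typeof(int));
        table.Columns.Add("LastName", typeof(string));
        table.Rows.Add(1, "Davolio");
        table.Rows.Add(2, "Fuller");
        return data;
    }

    // Selects employees by last name with XPath and returns their IDs.
    // Assumes lastName contains no single quote. Call this once per DataSet,
    // because a DataSet can be tied to only one XmlDataDocument.
    public static List<int> FindIds(DataSet data, string lastName)
    {
        XmlDataDocument dataDoc = new XmlDataDocument(data);
        XmlNodeList nodes = dataDoc.SelectNodes(
            "/Company/Employee[LastName='" + lastName + "']");

        List<int> ids = new List<int>();
        foreach (XmlNode node in nodes)
        {
            // Map the selected element back to the row it represents.
            DataRow row = dataDoc.GetRowFromElement((XmlElement)node);
            ids.Add((int)row["ID"]);
        }
        return ids;
    }

    public static void Main()
    {
        foreach (int id in FindIds(BuildDataSet(), "Fuller"))
            Console.WriteLine(id);
    }
}
```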

Using XPath to query XML representations of data relationally stored in SQL Server (for example, annotated schemas) seems to be a rather twisted and ineffective way to execute queries. The query engine of SQL Server outperforms the XPath query engine, and on top of running slower queries, you still have to pay the price of transforming relational data into XML.

Reading database contents as XML makes sense only if you need to represent that information in an intermediate format for further transformations and processing. Currently, the best approach is still relying on FOR XML in EXPLICIT mode if you need complex schemas. SQL Server 2000 supports XDR schemas, and to use XSD, you should resort to SQLXML 3.0. Unfortunately, SQLXML 3.0 relies on the OLE DB provider for data access and is not recommended for .NET Framework applications. If you find the FOR XML EXPLICIT syntax too quirky, look ahead to the discussion of .NET Framework XML serialization in Chapter 11.

