Chapter 8. Working with XML

The eXtensible Markup Language (XML) has been adopted across almost every industry in recent years. Because it lets applications exchange data in a standardized format through web services and XML documents, and because SQL Server 2005 and most of Microsoft’s new and upcoming applications have adopted it, XML is almost unavoidable when creating applications in the .NET Framework. Whether you’re using XML in your application’s configuration file, consuming or exposing web services, working with XML in SQL Server, or working with datasets, knowing how to work with XML programmatically is an essential skill for any C# developer. This chapter does not cover the basics of XML itself or of related standards such as XPath, XSLT, and XSD. Instead, you will see how to work with XML, XPath, XSLT, and XSD using C# 2.0.

Reading and Writing XML Documents

One of the most basic tasks that you can perform with XML is manipulating the contents of an XML document. This includes traversing the list of nodes in the document, setting and querying attribute values, and manipulating the tree itself by creating and inserting new nodes.

This section shows you how to read XML documents using the Document Object Model (DOM), which is modeled by the XmlDocument class in the System.Xml namespace. The DOM is recursive: every node exposes the same core properties and methods as every other node in the document. The XML node is the most basic unit of abstraction within a DOM-modeled document. Before we get into the code sample, Tables 8.1 and 8.2 provide a brief overview of many of the properties and methods of the XmlNode class.

Table 8.1 Commonly Used XmlNode Properties

Image

Table 8.2 Commonly Used XmlNode Methods

Image
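
As a rough illustration of the kind of members XmlNode exposes (not necessarily the ones the tables list), the following sketch reads a few properties and calls a few methods on the nodes of a small in-memory document. The XML content and the NodeMembersSketch class name are assumptions made up for this example.

```csharp
using System;
using System.Xml;

public static class NodeMembersSketch
{
    // Loads a small, hypothetical document so the example is self-contained.
    public static XmlDocument LoadSample()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<items>" +
            "<item id=\"1\">Iron Sword</item>" +
            "<item id=\"2\">Oak Shield</item>" +
            "</items>");
        return doc;
    }

    public static void Main()
    {
        XmlDocument doc = LoadSample();
        XmlNode root = doc.DocumentElement;
        XmlNode first = root.FirstChild;

        // Properties: every node answers the same questions about itself.
        Console.WriteLine(root.Name);
        Console.WriteLine(root.HasChildNodes);
        Console.WriteLine(first.NodeType);
        Console.WriteLine(first.InnerText);
        Console.WriteLine(first.ParentNode.Name);
        Console.WriteLine(first.NextSibling.Attributes["id"].Value);

        // Methods: copy a node, attach the copy, then detach the original.
        XmlNode copy = first.CloneNode(true);
        root.AppendChild(copy);
        root.RemoveChild(first);
        Console.WriteLine(root.ChildNodes.Count);
    }
}
```

Because the same members are available on every node, code written against XmlNode works equally well on the root element, on a nested element, or on the document itself.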

The XmlDocument class, which represents the entire document, is itself an XmlNode. If you look at the documentation for the XmlDocument class, you’ll see that it inherits from XmlNode. This fits the DOM pattern: the document is simply a node that can have child nodes. Table 8.3 lists some of the additional methods that XmlDocument provides beyond those of the standard node class.

Table 8.3 XmlDocument Class Methods

Image
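
To make the relationship between the document and its nodes concrete, here is a minimal sketch that builds a document from scratch. It relies only on XmlDocument.CreateElement, XmlElement.SetAttribute, and XmlNode.AppendChild; the element and attribute names are hypothetical and are not taken from items.xml.

```csharp
using System;
using System.Xml;

public static class BuildDocumentSketch
{
    // Builds a tiny document in memory; all names and values are made up.
    public static XmlDocument Build()
    {
        XmlDocument doc = new XmlDocument();

        // The document is itself a node, so the root element is appended to it directly.
        XmlElement items = doc.CreateElement("items");
        doc.AppendChild(items);

        // New nodes are created by the owning document and then attached to a parent.
        XmlElement item = doc.CreateElement("item");
        item.SetAttribute("id", "1");
        item.SetAttribute("name", "Iron Sword");
        items.AppendChild(item);

        return doc;
    }

    public static void Main()
    {
        XmlDocument doc = Build();
        // OuterXml serializes the node and everything beneath it.
        Console.WriteLine(doc.OuterXml);
    }
}
```

Note that a node created by one document cannot simply be appended to another; the creating document remains its owner, which is why the factory methods live on XmlDocument rather than on XmlNode.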

To show off the code for manipulating an XML document, we need a document. For the rest of this chapter, we will be working with a document called items.xml, which contains a list of items from a fictitious role-playing game and the various aspects of those items. The contents of this file are shown in Listing 8.1.

Listing 8.1 items.xml

Image

Although XML allows you to freely mix freeform text with markup, for the purposes of the code in this chapter, the examples use XML documents as pure data storage.

The code in Listing 8.2 shows the use of the XmlNode class and its associated properties and methods for traversing an XML document and displaying its contents to the console.

Listing 8.2 XML Document Display Sample

Image

Image
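
The recursive nature of the DOM means that a single method can walk an entire document. The following is a minimal sketch of that idea, not a reproduction of Listing 8.2. The inline XML is a hypothetical stand-in for items.xml (the real file's layout may differ), and the TraversalSketch class and its Describe method are names invented for this example.

```csharp
using System;
using System.Text;
using System.Xml;

public static class TraversalSketch
{
    // Hypothetical stand-in for items.xml; the real file's layout may differ.
    private const string SampleXml =
        "<items>" +
        "<item id=\"1\" name=\"Iron Sword\"><attribute name=\"damage\" value=\"5\" /></item>" +
        "<item id=\"2\" name=\"Oak Shield\"><attribute name=\"armor\" value=\"3\" /></item>" +
        "</items>";

    // Appends one line per element, indented two spaces per level of depth,
    // and then recurses into the node's children.
    public static void Describe(XmlNode node, int depth, StringBuilder output)
    {
        if (node.NodeType == XmlNodeType.Element)
        {
            output.Append(new string(' ', depth * 2));
            output.Append(node.Name);
            foreach (XmlAttribute attr in node.Attributes)
            {
                output.Append(" " + attr.Name + "=" + attr.Value);
            }
            output.AppendLine();
        }

        foreach (XmlNode child in node.ChildNodes)
        {
            Describe(child, depth + 1, output);
        }
    }

    public static string DescribeSample()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(SampleXml);

        StringBuilder output = new StringBuilder();
        Describe(doc.DocumentElement, 0, output);
        return output.ToString();
    }

    public static void Main()
    {
        Console.Write(DescribeSample());
    }
}
```

To read from a file instead of a string, you would call doc.Load with a path in place of doc.LoadXml; the traversal code itself does not change.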

Querying XML with XPath

XPath is a language used for querying information contained within XML documents. An explanation of the XPath language itself is outside the scope of this chapter. If you’re looking for a tutorial on XPath, you might try http://www.w3schools.com/xpath. Several other extremely good tutorials are also found at this site.

The basic premise behind XPath is that an XPath expression is essentially a description of the result set. More specifically, anything in the source document that satisfies the XPath expression will be returned when the expression is used to select nodes. Hierarchy levels within an XML document are represented in an XPath expression using the forward slash (/). If the double slash (//) is used, it indicates that position within the document tree is irrelevant to whether or not a node satisfies the expression. This is often referred to as a “deep” search.

During filtering and selecting, attributes are specified using the @ prefix, and predicates (conditions that must be satisfied) on a node are specified within square brackets ([]).

The simplest thing you can do with XPath is to select a list of nodes without filtering the results. To do that, you simply specify the nodes you want, as shown in the following XPath statement:

/items/item

This will select all item nodes that have the items node as a parent. If you want to select just the items node, you can use the XPath expression /items. To select all item nodes without regard to their location within the hierarchy, you can use the expression //item.

Using the square bracket ([]) notation, you can also select nodes based on their position within the current context. So, to select the second item beneath the items parent, you would use the following expression:

/items/item[2]

Note that XPath positions are one-based when using the square bracket notation, so /items/item[1] selects the first item.

Before getting into the more complex XPath statements, let’s take the simple expressions and execute them in some .NET code, as shown in Listing 8.3.

Listing 8.3 Simple XPath Expressions

Image
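
As a minimal sketch of how the simple expressions behave in .NET code, the following example runs them through the SelectNodes and SelectSingleNode methods of XmlNode. The inline XML is a hypothetical stand-in for items.xml, and the SimpleXPathSketch class and its helper methods are names invented for this example.

```csharp
using System;
using System.Xml;

public static class SimpleXPathSketch
{
    // Hypothetical stand-in for items.xml; the real file's layout may differ.
    private const string SampleXml =
        "<items>" +
        "<item id=\"1\" name=\"Iron Sword\" />" +
        "<item id=\"2\" name=\"Oak Shield\" />" +
        "<item id=\"3\" name=\"Healing Potion\" />" +
        "</items>";

    public static XmlDocument LoadSample()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        return doc;
    }

    // Returns how many nodes the expression selects.
    public static int CountMatches(string xpath)
    {
        return LoadSample().SelectNodes(xpath).Count;
    }

    // Returns the name attribute of the first node the expression selects.
    public static string FirstName(string xpath)
    {
        XmlNode node = LoadSample().SelectSingleNode(xpath);
        return node.Attributes["name"].Value;
    }

    public static void Main()
    {
        // Absolute path: item elements directly beneath the root items element.
        Console.WriteLine(CountMatches("/items/item"));

        // Deep search: item elements anywhere in the document.
        Console.WriteLine(CountMatches("//item"));

        // Positional predicate: XPath counts from 1, so [2] is the second item.
        Console.WriteLine(FirstName("/items/item[2]"));
    }
}
```

SelectNodes returns an XmlNodeList (empty when nothing matches), whereas SelectSingleNode returns the first matching node or null, so real code should check for null before dereferencing the result.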

And now take a look at some code that uses some more advanced XPath expressions:

Image
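
The @ prefix and square-bracket predicates described earlier can be combined in several ways. The sketch below tries a few such expressions against a hypothetical stand-in for items.xml; the document layout, the PredicateXPathSketch class, and the SelectNames helper are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class PredicateXPathSketch
{
    // Hypothetical stand-in for items.xml; the real file's layout may differ.
    private const string SampleXml =
        "<items>" +
        "<item id=\"1\" name=\"Iron Sword\"><attribute name=\"damage\" value=\"5\" /></item>" +
        "<item id=\"2\" name=\"Oak Shield\"><attribute name=\"armor\" value=\"3\" /></item>" +
        "<item id=\"3\" name=\"Healing Potion\" />" +
        "</items>";

    // Returns the name attribute of every item element the expression selects.
    public static List<string> SelectNames(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(SampleXml);

        List<string> names = new List<string>();
        foreach (XmlNode node in doc.SelectNodes(xpath))
        {
            names.Add(node.Attributes["name"].Value);
        }
        return names;
    }

    public static void Main()
    {
        string[] queries = {
            "/items/item[@id='2']",              // attribute equals a value
            "//item[@id > 1]",                   // numeric comparison on an attribute
            "//item[attribute/@name='damage']",  // condition on a child element
            "//item[not(attribute)]"             // items with no attribute child
        };

        foreach (string query in queries)
        {
            string joined = string.Join(", ", SelectNames(query).ToArray());
            Console.WriteLine(query + " -> " + joined);
        }
    }
}
```

A predicate filters the nodes selected by the step it follows, so //item[attribute/@name='damage'] still returns item elements, not the attribute children that were tested.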

Transforming Documents with XSLT

An XSL Transformation (XSLT) essentially combines the XPath language for searching XML nodes and returning node lists with a set of functions designed specifically for converting a source XML document into a destination XML document. This destination document can be another XML document that simply contains the data in a different format, or it can be in the form of XHTML, an XML-compliant HTML document.

Converting XML into XHTML is probably the most common use for XSLT, though it is frequently used for converting document formats and facilitating data exchange between disparate systems.

To transform an XML document, you need an XSLT document. Listing 8.4 shows you an XSLT document for transforming the items.xml document shown earlier in the chapter. If you want more information on XSLT, there are several books available, as well as many online references, such as http://www.w3schools.com/xsl.

Listing 8.4 itemsTransform.xslt

Image

Image

The actual transformation is accomplished using the XslCompiledTransform class, as shown in the following code snippet:

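// Requires: using System.Xml.Xsl;
XslCompiledTransform xct = new XslCompiledTransform();
// Load and compile the stylesheet.
xct.Load(@"......itemsTransform.xslt");
// Transform the source document and write the result to items.html.
xct.Transform(@"........items.xml", "items.html");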

The preceding code results in an HTML page that looks like the one shown in Figure 8.1.

Figure 8.1 Results of an XSL transformation of the items.xml document.

Image
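
If you want to experiment without creating files on disk, the same class can work entirely in memory. The following is a minimal sketch under that assumption: both the source XML and the stylesheet are hypothetical inline strings (not the contents of items.xml or itemsTransform.xslt), and the TransformSketch class name is invented for this example.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class TransformSketch
{
    // Hypothetical source document.
    private const string SampleXml =
        "<items>" +
        "<item id=\"1\" name=\"Iron Sword\" />" +
        "<item id=\"2\" name=\"Oak Shield\" />" +
        "</items>";

    // Hypothetical stylesheet that renders each item name as an HTML list entry.
    private const string Stylesheet =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">" +
        "<xsl:output method=\"html\" indent=\"no\" />" +
        "<xsl:template match=\"/items\">" +
        "<ul><xsl:for-each select=\"item\">" +
        "<li><xsl:value-of select=\"@name\" /></li>" +
        "</xsl:for-each></ul>" +
        "</xsl:template>" +
        "</xsl:stylesheet>";

    public static string Transform()
    {
        XslCompiledTransform xct = new XslCompiledTransform();

        // Load compiles the stylesheet; here it is read from a string.
        using (XmlReader styleReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xct.Load(styleReader);
        }

        StringWriter result = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(SampleXml)))
        {
            // A null argument list means no stylesheet parameters are supplied.
            xct.Transform(input, null, result);
        }
        return result.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Transform());
    }
}
```

Because Load compiles the stylesheet, it is usually worth loading an XslCompiledTransform once and reusing it for many Transform calls rather than reloading it for each document.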

Validating Documents with XSD

XML Schema Definition (XSD) is an XML dialect that describes the format, data types, and constraints of the information contained in an XML file. Countless online references on the XSD specification itself are available, as well as many publications.

This section shows you how to validate an XML document based on an existing XSD. To show this, we will create an XSD that defines the format of the items.xml document. Without knowing all that much about XSD, you can still easily create schemas by inferring them from instance documents such as items.xml with a command-line utility that ships with the .NET Framework SDK: XSD.EXE.

You can execute XSD.EXE against your instance document, and then modify it using the designer inside Visual Studio to change data types and relationships. The visual designer included with Visual Studio 2005 is an excellent tool for designing XML schemas.

Listing 8.5 contains the results of executing XSD.EXE as well as a modification to indicate that the ID of an item is an integer.

Listing 8.5 items.xsd

Image

Image

Now take a look at the code that validates an instance document against a schema:

Image

The preceding code adds an XmlSchema instance to the document itself. When the Validate method is called, the document is checked against that schema, and each time a validation error occurs, the OnValidate method is invoked. To test this code, go back to the original items.xml file and change one of the item IDs from a number to something with letters in it. When you run the code, you’ll see the following error message:

Image
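
The same validation pattern can be sketched in a self-contained form. In the example below, the schema and the instance documents are hypothetical inline strings rather than the contents of items.xsd and items.xml, and the ValidationSketch class is a name invented for illustration. It uses XmlDocument.Schemas and XmlDocument.Validate with an anonymous method as the validation event handler.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class ValidationSketch
{
    // Hypothetical schema: each item must carry an integer id attribute.
    private const string Schema =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">" +
        "<xs:element name=\"items\"><xs:complexType><xs:sequence>" +
        "<xs:element name=\"item\" maxOccurs=\"unbounded\"><xs:complexType>" +
        "<xs:attribute name=\"id\" type=\"xs:int\" use=\"required\" />" +
        "<xs:attribute name=\"name\" type=\"xs:string\" />" +
        "</xs:complexType></xs:element>" +
        "</xs:sequence></xs:complexType></xs:element>" +
        "</xs:schema>";

    // Validates the XML text against the schema and returns any messages reported.
    public static List<string> Validate(string xml)
    {
        List<string> errors = new List<string>();

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        using (XmlReader schemaReader = XmlReader.Create(new StringReader(Schema)))
        {
            // A null target namespace means "use whatever the schema itself declares".
            doc.Schemas.Add(null, schemaReader);
        }

        // The handler is called once per problem; without one, Validate throws instead.
        doc.Validate(delegate(object sender, ValidationEventArgs e)
        {
            errors.Add(e.Severity + ": " + e.Message);
        });

        return errors;
    }

    public static void Main()
    {
        string good = "<items><item id=\"1\" name=\"Iron Sword\" /></items>";
        string bad = "<items><item id=\"abc\" name=\"Iron Sword\" /></items>";

        Console.WriteLine("Valid document problems: " + Validate(good).Count);
        foreach (string message in Validate(bad))
        {
            Console.WriteLine(message);
        }
    }
}
```

Collecting the messages in a list, as this sketch does, lets you report every problem in the document at once instead of stopping at the first one.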

Summary

This chapter has shown you how you can make use of your existing XML skills in C#. You saw how to manipulate the nodes of an XML document using the DOM class XmlDocument, and you saw how to query XML documents using XPath statements. Finally, you saw how to transform XML documents using XSL transformations and how to validate them against an XSD schema.

This chapter didn’t go into much detail on the individual standards such as XSL and XPath. If you need to know more about those, many references are readily available.

Now that you have completed this chapter, you should feel comfortable working with XML in any of its forms whether you’re working on an ASP.NET application or a Windows Forms application.
