Chapter 14. Working with XML

XML, or eXtensible Markup Language, provides an industry-standard method for encoding information so that it is easily understandable by different software applications. An XML document contains both data and a description of that data, which enables software applications to interpret and process the data.

XML specifications are defined and maintained by the World Wide Web Consortium (W3C). The latest version is XML 1.1 (Second Edition). However, XML 1.0 (currently in its fifth edition) is the most popular version, and is supported by all XML parsers. W3C states that:

You are encouraged to create or generate XML 1.0 documents if you do not need the new features in XML 1.1; XML Parsers are expected to understand both XML 1.0 and XML 1.1.[16]

This chapter introduces XML 1.0 only, and in fact focuses on just the most commonly used XML features. I’ll introduce you to the XmlDocument and XmlElement classes first, and you’ll learn how to create and manipulate XML documents.

Of course, once you have a large document, you’ll want to be able to find specific information in it, and I’ll show you two different ways to do that, using XPath and XPathNavigator. XML also forms a key component of the Service Oriented Architecture (SOA), which allows you to access remote objects across applications and platforms. The .NET Framework allows you to serialize your objects as XML, and deserialize them at their destination. I’ll cover those methods at the end of the chapter.

XML Basics (A Quick Review)

XML is a markup language, not unlike HTML, except that it is extensible: the user of XML can (and does!) create new elements and attributes.

Elements

In XML, a document is composed of a hierarchy of elements. An element is defined by a pair of tags, called the start and end tags. In the following example, FirstName is an element:

<FirstName>Orlando</FirstName>

A start tag is composed of the element name surrounded by a pair of angle brackets:

<FirstName>

An end tag is similar to the start tag, except that the element name is preceded by a forward slash:

</FirstName>

The content between the start and end tags is the element’s content, which may be plain text or a set of child elements. The FirstName element’s content is simply a string. On the other hand, the Customer element has three child elements:

  <Customer>
    <FirstName>Orlando</FirstName>
    <LastName>Gee</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>

The top-level element in an XML document is called its root element. Every document has exactly one root element.

An element can have zero or more child elements, and (except for the root element) every element has exactly one parent element. Elements with the same parent element are called sibling elements.

In this example, Customers (plural) is the root. The children of the root element, Customers, are the three Customer (singular) elements:

<Customers>
  <Customer>
    ...
  </Customer>
  <Customer>
    ...
  </Customer>
  <Customer>
    ...
  </Customer>
</Customers>

Each Customer has one parent (Customers) and three children (FirstName, LastName, and EmailAddress). Each of these, in turn, has one parent (Customer) and zero children.
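
If you want to check these parent and child relationships in code, here is a minimal sketch that uses the XmlDocument class covered later in this chapter. The HierarchyDemo class, its Describe and LoadSample methods, and the example.com address are my own inventions for illustration, and the sample holds a single Customer to keep it short:

```csharp
using System;
using System.Xml;

public static class HierarchyDemo
{
    // Hypothetical helper: reports a node's parent element and
    // how many child elements it has.
    public static string Describe(XmlNode node)
    {
        // The root element's parent is the document itself,
        // not an element, so we report "(none)" for it.
        string parent = (node.ParentNode is XmlElement)
            ? node.ParentNode.Name
            : "(none)";

        int childElements = 0;
        foreach (XmlNode child in node.ChildNodes)
        {
            // Text inside an element is also a child node,
            // so count only the element children.
            if (child is XmlElement)
            {
                childElements++;
            }
        }

        return string.Format("{0}: parent={1}, child elements={2}",
                             node.Name, parent, childElements);
    }

    // Builds a one-customer sample; the email address is a placeholder.
    public static XmlDocument LoadSample()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<Customers><Customer>" +
            "<FirstName>Orlando</FirstName>" +
            "<LastName>Gee</LastName>" +
            "<EmailAddress>orlando@example.com</EmailAddress>" +
            "</Customer></Customers>");
        return doc;
    }
}
```

Calling Describe on the document’s DocumentElement, on its first child, and on that child’s first child walks down the three levels described above: root, Customer, and FirstName.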

XHTML

XHTML is a reformulation of HTML as XML, so it follows XML’s stricter well-formedness rules. The two most important (and most often overlooked) rules follow:

  • No elements may overlap, though they may nest. Thus, you may write:

    <element1>
       <element2>
          <...>
       </element2>
    </element1>

    You may not write:

    <element1>
       <element2>
          <...>
       </element1>
    </element2>

    because in the latter case, element2 overlaps element1 rather than being neatly nested within it.

  • Every element must be closed, which means that for each opened element, you must have a closing tag (or the element tag must be self-closing). Thus, for those of you who cut your teeth on forgiving browsers, it is time to stop writing:

     <br>

    and replace it with:

    <br />
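
One way to see both rules enforced is to hand a fragment to an XML parser. The following is only a sketch: the WellFormedCheck class and its IsWellFormed method are names I made up, and it relies on XmlDocument.LoadXml throwing an XmlException when the markup is not well formed.

```csharp
using System;
using System.Xml;

public static class WellFormedCheck
{
    // Hypothetical helper: returns true if the text parses as XML,
    // false if the parser rejects it.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            new XmlDocument().LoadXml(xml);
            return true;
        }
        catch (XmlException)
        {
            // Overlapping or unclosed elements end up here.
            return false;
        }
    }
}
```

With this helper, "<p><b>text</b></p>" and "<p>line<br /></p>" are accepted, while the overlapping "<p><b>text</p></b>" and the unclosed "<p>line<br></p>" are rejected.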

X Stands for eXtensible

The key point of XML is to provide an extensible markup language. An incredibly short pop-history lesson: HTML was derived from the Standard Generalized Markup Language (SGML). HTML has many wonderful attributes (pardon), but if you want to add a new element to HTML, you have two choices: apply to the W3C and wait awhile, or strike out on your own and be “nonstandard.”

There was a strong need for the ability for two organizations to get together and specify tags that they could use for data exchange. Hey! Presto! XML was born as a more general-purpose markup language that allows users to define their own tags. This last point is the critical distinction of XML.

Creating XML Documents

Because XML documents are structured text documents, you can create them using a text editor and process them using string manipulation functions. To paraphrase David Platt, you can also have an appendectomy through your mouth, but it takes longer and hurts more.

To make the job easier, .NET implements a collection of classes and utilities that provide XML functionality, including the streaming XML APIs (which support XmlReader and XmlWriter), and another set of XML APIs that use the XML Document Object Model (DOM).
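
The rest of this chapter uses the DOM classes, so for contrast here is a rough sketch of the streaming style, writing one customer with XmlWriter. The StreamingDemo class and WriteCustomer method are hypothetical names, not part of the chapter’s examples:

```csharp
using System;
using System.IO;
using System.Xml;

public static class StreamingDemo
{
    // Hypothetical helper: writes one Customer element straight to a
    // string, without building a document tree in memory first.
    public static string WriteCustomer(string firstName, string lastName)
    {
        StringWriter output = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(output))
        {
            writer.WriteStartElement("Customer");
            writer.WriteElementString("FirstName", firstName);
            writer.WriteElementString("LastName", lastName);
            writer.WriteEndElement();   // closes Customer
        }   // disposing the writer flushes it into the StringWriter
        return output.ToString();
    }
}
```

The writer emits each tag as you call it, so nothing can be revisited once written. That is the trade-off against the DOM, which keeps the whole document in memory so that you can navigate and modify it.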

In Chapter 13, we used a list of customers in our examples. We will use the same customer list in this chapter, starting with Example 14-1, in which we’ll write the list of customers to an XML document.

Example 14-1. Creating an XML document
using System;
using System.Collections.Generic;
using System.Xml;

namespace Programming_CSharp
{
    // Simple customer class
    public class Customer
    {
        public string FirstName     { get; set; }
        public string LastName      { get; set; }
        public string EmailAddress  { get; set; }

        // Overrides the Object.ToString(  ) to provide a
        // string representation of the object properties.
        public override string ToString(  )
        {
            return string.Format("{0} {1}\nEmail:   {2}",
                        FirstName, LastName, EmailAddress);
        }
    }

    // Main program
    public class Tester
    {
        static void Main(  )
        {
            List<Customer> customers = CreateCustomerList(  );

            XmlDocument customerXml = new XmlDocument(  );
            XmlElement rootElem = customerXml.CreateElement("Customers");
            customerXml.AppendChild(rootElem);
            foreach (Customer customer in customers)
            {
                // Create new element representing the customer object.
                XmlElement customerElem = customerXml.CreateElement("Customer");

                // Add element representing the FirstName property
                // to the customer element.
                XmlElement firstNameElem = customerXml.CreateElement("FirstName");
                firstNameElem.InnerText  = customer.FirstName;
                customerElem.AppendChild(firstNameElem);

                // Add element representing the LastName property
                // to the customer element.
                XmlElement lastNameElem = customerXml.CreateElement("LastName");
                lastNameElem.InnerText = customer.LastName;
                customerElem.AppendChild(lastNameElem);

                // Add element representing the EmailAddress property
                // to the customer element.
                XmlElement emailAddress =
                    customerXml.CreateElement("EmailAddress");
                emailAddress.InnerText = customer.EmailAddress;
                customerElem.AppendChild(emailAddress);

                // Finally add the customer element to the XML document
                rootElem.AppendChild(customerElem);
            }

            Console.WriteLine(customerXml.OuterXml);
            Console.Read(  );
        }

        // Create a customer list with sample data
        private static List<Customer> CreateCustomerList(  )
        {
            List<Customer> customers = new List<Customer>
                {
                    new Customer { FirstName = "Orlando",
                                   LastName = "Gee",
                                   EmailAddress = "[email protected]"},
                    new Customer { FirstName = "Keith",
                                   LastName = "Harris",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Donna",
                                   LastName = "Carreras",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Janet",
                                   LastName = "Gates",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Lucy",
                                   LastName = "Harrington",
                                   EmailAddress = "[email protected]" }
                };
            return customers;
        }
    }
}

Tip

I’ve formatted the output here to make it easier to read; the actual output is one continuous string:

Output:
<Customers>
  <Customer>
    <FirstName>Orlando</FirstName>
    <LastName>Gee</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer>
    <FirstName>Keith</FirstName>
    <LastName>Harris</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer>
    <FirstName>Donna</FirstName>
    <LastName>Carreras</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer>
    <FirstName>Janet</FirstName>
    <LastName>Gates</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer>
    <FirstName>Lucy</FirstName>
    <LastName>Harrington</LastName>
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
</Customers>

Tip

We could rewrite this example with less code using LINQ to XML, which I cover in Chapter 15.

In .NET, the System.Xml namespace contains the core classes for creating and processing XML documents. It is convenient to add a using directive for it to any code file that uses classes from this namespace.

The Customer class and the CreateCustomerList function in the main Tester class are identical to those used in Chapter 13, so I will not go over them again here.

The main attraction in this example is the XML creation in the main function. First, a new XML document object is created:

XmlDocument customerXml = new XmlDocument(  );

Next, you create the root element:

XmlElement rootElem = customerXml.CreateElement("Customers");
customerXml.AppendChild(rootElem);

Creating XML elements and other objects in the XML DOM is slightly different from conventional object instantiation. The idiom is to call the CreateElement method of the XML document object to create a new element in the document, and then call the parent node’s AppendChild method to attach the element to its parent. After these two operations, the customerXml document will contain an empty element:

<Customers></Customers>

or:

<Customers />

In the XML DOM, the root element is also called the document element. You can access it through the DocumentElement property of the document object:

XmlElement rootElem = customerXml.DocumentElement;
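
The following sketch (the DocumentElementDemo class and its method names are mine, purely for illustration) shows two consequences: DocumentElement is null until a root is attached, and because a document has exactly one root, trying to append a second one throws an InvalidOperationException:

```csharp
using System;
using System.Xml;

public static class DocumentElementDemo
{
    // Returns true if DocumentElement is null before the root is
    // attached and is the very same object as the root afterward.
    public static bool RootMatchesDocumentElement()
    {
        XmlDocument customerXml = new XmlDocument();
        bool emptyBefore = customerXml.DocumentElement == null;

        XmlElement rootElem = customerXml.CreateElement("Customers");
        customerXml.AppendChild(rootElem);

        return emptyBefore &&
               ReferenceEquals(customerXml.DocumentElement, rootElem);
    }

    // Returns true if the document refuses a second root element.
    public static bool SecondRootIsRejected()
    {
        XmlDocument doc = new XmlDocument();
        doc.AppendChild(doc.CreateElement("Customers"));
        try
        {
            // "Orders" is just a made-up second root for this test.
            doc.AppendChild(doc.CreateElement("Orders"));
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```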

XML Elements

With the root element in hand, you can add each customer as a child node:

foreach (Customer customer in customers)
{
    // Create new element representing the customer object.
    XmlElement customerElem = customerXml.CreateElement("Customer");

In this example, you make each property of the customer object a child element of the customer element:

    // Add element representing the FirstName property to the customer element.
    XmlElement firstNameElem = customerXml.CreateElement("FirstName");
    firstNameElem.InnerText  = customer.FirstName;
    customerElem.AppendChild(firstNameElem);

This adds the FirstName child element and assigns the customer’s first name to its InnerText property. The result will look like this:

<FirstName>Orlando</FirstName>

The other two properties, LastName and EmailAddress, are added to the customer element in exactly the same way. Here’s an example of the complete customer element:

<Customer>
  <FirstName>Orlando</FirstName>
  <LastName>Gee</LastName>
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

Finally, the newly created customer element is added to the XML document as a child of the root element:

    // Finally add the customer element to the XML document
    rootElem.AppendChild(customerElem);
}

Once all customer elements are created, this example prints the XML document:

Console.WriteLine(customerXml.OuterXml);

When you run the code, the result is just a long string containing the whole XML document and its elements. You can import it into an XML editor and format it into a more human-readable form, as in the example output shown earlier. Visual Studio includes an XML editor, so you can just paste the string into an XML file, and open it in Visual Studio. You can then use the “Format the whole document” command on the XML Editor toolbar to format the string, as shown in Figure 14-1.

Figure 14-1. Formatting the XML document in Visual Studio
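
If you would rather have the program produce indented output itself, one option is to save the document through an XmlWriter whose settings have Indent turned on. This is only a sketch of that approach; the PrettyPrinter class and ToIndentedString method are hypothetical names:

```csharp
using System;
using System.IO;
using System.Xml;

public static class PrettyPrinter
{
    // Hypothetical helper: returns the document as indented text
    // instead of the single long string that OuterXml produces.
    public static string ToIndentedString(XmlDocument document)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;   // put child elements on their own lines

        StringWriter output = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            document.Save(writer);
        }
        return output.ToString();
    }
}
```

In Example 14-1 you could then write Console.WriteLine(PrettyPrinter.ToIndentedString(customerXml)) in place of printing OuterXml. Expect the result to begin with an XML declaration, which OuterXml did not include because the document was built without one.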

XML Attributes

An XML element may have a set of attributes, which store additional information about the element. An attribute is a key/value pair contained in the start tag of an XML element:

<Customer FirstName="Orlando" LastName="Gee"></Customer>

The next example demonstrates how you can mix the use of child elements and attributes. This example creates customer elements with the customer’s name stored in attributes and the email address stored as a child element:

<Customer FirstName="Orlando" LastName="Gee">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

The only difference between this and Example 14-1 is that you store the FirstName and LastName properties as attributes to the customer elements here:

// Add an attribute representing the FirstName property
// to the customer element.
XmlAttribute firstNameAttr = customerXml.CreateAttribute("FirstName");
firstNameAttr.Value = customer.FirstName;
customerElem.Attributes.Append(firstNameAttr);

Similar to creating an element, you call the document object’s CreateAttribute method to create an XmlAttribute object in the document. Assigning the value to an attribute is a little more intuitive than assigning the element text because an attribute has no child nodes; therefore, you can simply assign a value to its Value property. For attributes, the Value property is identical to the InnerText property.

Tip

You will also need to append the attribute to an element’s Attributes property, which represents a collection of all attributes of the element. Unlike adding child elements, you cannot call the AppendChild function of elements to add attributes.
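
For comparison, XmlElement also offers a SetAttribute method that creates the attribute and sets its value in one call. The sketch below uses both forms side by side; the AttributeDemo class and BuildCustomer method are names I invented for it:

```csharp
using System;
using System.Xml;

public static class AttributeDemo
{
    // Hypothetical helper: builds a Customer element with two attributes
    // and returns the resulting XML as a string.
    public static string BuildCustomer(string firstName, string lastName)
    {
        XmlDocument doc = new XmlDocument();
        XmlElement customerElem = doc.CreateElement("Customer");
        doc.AppendChild(customerElem);

        // Long form, as described in the text: create, assign, append.
        XmlAttribute firstNameAttr = doc.CreateAttribute("FirstName");
        firstNameAttr.Value = firstName;
        customerElem.Attributes.Append(firstNameAttr);

        // Short form: SetAttribute does all three steps at once.
        customerElem.SetAttribute("LastName", lastName);

        return doc.OuterXml;
    }
}
```

The long form is worth knowing because it hands you the XmlAttribute object itself; the short form is handy when all you need is a name and a value.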

Example 14-2 shows the sample code and output.

Example 14-2. Creating an XML document containing elements and attributes
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

namespace Programming_CSharp
{
    // Simple customer class
    public class Customer
    {
        // Same as in Example 14-1
    }

    // Main program
    public class Tester
    {
        static void Main(  )
        {
            List<Customer> customers = CreateCustomerList(  );

            XmlDocument customerXml = new XmlDocument(  );
            XmlElement rootElem = customerXml.CreateElement("Customers");
            customerXml.AppendChild(rootElem);
            foreach (Customer customer in customers)
            {
                // Create new element representing the customer object.
                XmlElement customerElem = customerXml.CreateElement("Customer");

                // Add an attribute representing the FirstName property
                // to the customer element.
                XmlAttribute firstNameAttr =
                    customerXml.CreateAttribute("FirstName");
                firstNameAttr.Value = customer.FirstName;
                customerElem.Attributes.Append(firstNameAttr);

                // Add an attribute representing the LastName property
                // to the customer element.
                XmlAttribute lastNameAttr =
                    customerXml.CreateAttribute("LastName");
                lastNameAttr.Value = customer.LastName;
                customerElem.Attributes.Append(lastNameAttr);

                // Add element representing the EmailAddress property
                // to the customer element.
                XmlElement emailAddress =
                    customerXml.CreateElement("EmailAddress");
                emailAddress.InnerText = customer.EmailAddress;
                customerElem.AppendChild(emailAddress);

                // Finally add the customer element to the XML document
                rootElem.AppendChild(customerElem);
            }

            Console.WriteLine(customerXml.OuterXml);
            Console.Read(  );
        }

        // Create a customer list with sample data
        private static List<Customer> CreateCustomerList(  )
        {
            // Same as in Example 14-1
        }
    }
}

Output:
<Customers>
  <Customer FirstName="Orlando" LastName="Gee">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Keith" LastName="Harris">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Donna" LastName="Carreras">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Janet" LastName="Gates">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Lucy" LastName="Harrington">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
</Customers>

Being able to create XML documents to store data to be processed or exchanged is great, but it would not be of much use if you could not find information in them easily. The System.Xml.XPath namespace contains classes and utilities that provide XPath (search) support to C# programmers.

Searching in XML with XPath

In its simplest form, XPath may look similar to directory file paths. Here’s an example using the XML document containing a customer list. This document is shown in Example 14-2 and is reproduced here for convenience:

<Customers>
  <Customer FirstName="Orlando" LastName="Gee">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Keith" LastName="Harris">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Donna" LastName="Carreras">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Janet" LastName="Gates">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
  <Customer FirstName="Lucy" LastName="Harrington">
    <EmailAddress>[email protected]</EmailAddress>
  </Customer>
</Customers>

Example 14-3 lists the code for the example.

Example 14-3. Searching an XML document using XPath
using System;
using System.Collections.Generic;
using System.Xml;

namespace Programming_CSharp
{
    public class Customer
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string EmailAddress { get; set; }

        // Overrides the Object.ToString(  ) to provide a
        // string representation of the object properties.
        public override string ToString(  )
        {
            return string.Format("{0} {1}\nEmail:   {2}",
                        FirstName, LastName, EmailAddress);
        }
    }

    public class Tester
    {
        private static XmlDocument CreateCustomerListXml(  )
        {
            List<Customer> customers = CreateCustomerList(  );
            XmlDocument customerXml = new XmlDocument(  );
            XmlElement rootElem = customerXml.CreateElement("Customers");
            customerXml.AppendChild(rootElem);
            foreach (Customer customer in customers)
            {
                XmlElement customerElem = customerXml.CreateElement("Customer");

                XmlAttribute firstNameAttr =
                    customerXml.CreateAttribute("FirstName");
                firstNameAttr.Value = customer.FirstName;
                customerElem.Attributes.Append(firstNameAttr);

                XmlAttribute lastNameAttr =
                    customerXml.CreateAttribute("LastName");
                lastNameAttr.Value = customer.LastName;
                customerElem.Attributes.Append(lastNameAttr);

                XmlElement emailAddress =
                    customerXml.CreateElement("EmailAddress");
                emailAddress.InnerText = customer.EmailAddress;
                customerElem.AppendChild(emailAddress);

                rootElem.AppendChild(customerElem);
            }

            return customerXml;
        }

        private static List<Customer> CreateCustomerList(  )
        {
            List<Customer> customers = new List<Customer>
                {
                    new Customer {FirstName = "Douglas",
                                  LastName = "Adams",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Richard",
                                  LastName = "Dawkins",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Kenji",
                                  LastName = "Yoshino",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Ian",
                                  LastName = "McEwan",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Neal",
                                  LastName = "Stephenson",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Randy",
                                  LastName = "Shilts",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Michelangelo",
                                  LastName = "Signorile ",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Larry",
                                  LastName = "Kramer",
                                  EmailAddress = "[email protected]"},
                    new Customer {FirstName = "Jennifer",
                                  LastName = "Baumgardner",
                                  EmailAddress = "[email protected]"}
            };
            return customers;
        }

        static void Main(  )
        {
            XmlDocument customerXml = CreateCustomerListXml(  );

            Console.WriteLine("Search for single node...");
            string xPath = "/Customers/Customer[@FirstName='Douglas']";
            XmlNode oneCustomer = customerXml.SelectSingleNode(xPath);

            Console.WriteLine("\nSelectSingleNode(\"{0}\")...", xPath);
            if (oneCustomer != null)
            {
                Console.WriteLine(oneCustomer.OuterXml);
            }
            else
            {
                Console.WriteLine("Not found");
            }

            Console.WriteLine("\nSearch for a single element... ");
            xPath = "/Customers/Customer[@FirstName='Douglas']";
            XmlElement customerElem = customerXml.SelectSingleNode(xPath)
                                      as XmlElement;

            Console.WriteLine("\nSelectSingleNode(\"{0}\")...", xPath);
            if (customerElem != null)
            {
                Console.WriteLine(customerElem.OuterXml);
                Console.WriteLine("customerElem.HasAttributes = {0}",
                                 customerElem.HasAttributes);
            }
            else
            {
                Console.WriteLine("Not found");
            }


            Console.WriteLine("\nSearch using descendant axis... ");
            xPath = "descendant::Customer[@FirstName='Douglas']";
            oneCustomer = customerXml.SelectSingleNode(xPath);
            Console.WriteLine("\nSelectSingleNode(\"{0}\")...", xPath);
            if (oneCustomer != null)
            {
                Console.WriteLine(oneCustomer.OuterXml);
            }
            else
            {
                Console.WriteLine("Not found");
            }

            xPath = "descendant::Customer[attribute::FirstName='Douglas']";
            oneCustomer = customerXml.SelectSingleNode(xPath);
            Console.WriteLine("\nSelectSingleNode(\"{0}\")...", xPath);
            if (oneCustomer != null)
            {
                Console.WriteLine(oneCustomer.OuterXml);
            }
            else
            {
                Console.WriteLine("Not found");
            }

            Console.WriteLine("\nSearch using node values... ");
            xPath = "descendant::EmailAddress[text(  )='[email protected]']";
            XmlNode oneEmail = customerXml.SelectSingleNode(xPath);
            Console.WriteLine("\nSelectSingleNode(\"{0}\")...", xPath);
            if (oneEmail != null)
            {
                Console.WriteLine(oneEmail.OuterXml);
            }
            else
            {
                Console.WriteLine("Not found");
            }

            xPath = "descendant::Customer[EmailAddress ='[email protected]']";
            oneCustomer = customerXml.SelectSingleNode(xPath);
            Console.WriteLine("\nSelectSingleNode(\"{0}\")...", xPath);
            if (oneCustomer != null)
            {
                Console.WriteLine(oneCustomer.OuterXml);
            }
            else
            {
                Console.WriteLine("Not found");
            }

            Console.WriteLine("\nSearch using XPath Functions... ");
            xPath = "descendant::Customer[contains(EmailAddress, 'foo.com')]";
            XmlNodeList customers = customerXml.SelectNodes(xPath);
            Console.WriteLine("\nSelectNodes(\"{0}\")...", xPath);
            if (customers != null)
            {
                foreach (XmlNode customer in customers)
                    Console.WriteLine(customer.OuterXml);
            }
            else
            {
                Console.WriteLine("Not found");
            }

            xPath = "descendant::Customer[starts-with(@LastName, 'A') " +
                    "and contains(EmailAddress, 'foo.com')]";
            customers = customerXml.SelectNodes(xPath);
            Console.WriteLine("\nSelectNodes(\"{0}\")...", xPath);
            if (customers != null)
            {
                foreach (XmlNode customer in customers)
                    Console.WriteLine(customer.OuterXml);
            }
            else
            {
                Console.WriteLine("Not found");
            }   // end else
        }       // end main
    }           // end class
}               // end namespace


Output:
Search for single node...

SelectSingleNode("/Customers/Customer[@FirstName='Douglas']")...
<Customer FirstName="Douglas" LastName="Adams">
<EmailAddress>[email protected]</EmailAddress></Customer>

Search for a single element...

SelectSingleNode("/Customers/Customer[@FirstName='Douglas']")...
<Customer FirstName="Douglas" LastName="Adams">
<EmailAddress>[email protected]</EmailAddress></Customer>
customerElem.HasAttributes = True

Search using descendant axis...

SelectSingleNode("descendant::Customer[@FirstName='Douglas']")...
<Customer FirstName="Douglas" LastName="Adams">
<EmailAddress>[email protected]</EmailAddress></Customer>

SelectSingleNode("descendant::Customer[attribute::FirstName='Douglas']")...
<Customer FirstName="Douglas" LastName="Adams">
<EmailAddress>[email protected]</EmailAddress></Customer>

Search using node values...

SelectSingleNode("descendant::EmailAddress[text(  )='[email protected]']")...
<EmailAddress>[email protected]</EmailAddress>

SelectSingleNode("descendant::EmailAddress[text(  )='[email protected]']")...
<EmailAddress>[email protected]</EmailAddress>

Search using XPath Functions...

SelectNodes("descendant::Customer[contains(EmailAddress, 'foo.com')]")...
<Customer FirstName="Douglas" LastName="Adams">
<EmailAddress>[email protected]</EmailAddress></Customer>

<Customer FirstName="Richard" LastName="Dawkins">
<EmailAddress>[email protected]</EmailAddress></Customer>

<Customer FirstName="Kenji" LastName="Yoshino">
<EmailAddress>[email protected]</EmailAddress></Customer>

<Customer FirstName="Ian" LastName="McEwan">
<EmailAddress>[email protected]</EmailAddress></Customer>

<Customer FirstName="Neal" LastName="Stephenson">
<EmailAddress>[email protected]</EmailAddress></Customer>

<Customer FirstName="Randy" LastName="Shilts">
<EmailAddress>[email protected]</EmailAddress></Customer>

<Customer FirstName="Michelangelo" LastName="Signorile ">
<EmailAddress>[email protected]</EmailAddress></Customer>

<Customer FirstName="Larry" LastName="Kramer">
<EmailAddress>[email protected]</EmailAddress></Customer>

<Customer FirstName="Jennifer" LastName="Baumgardner">
<EmailAddress>[email protected]</EmailAddress></Customer>

SelectNodes("descendant::Customer[starts-with(@LastName, 'A')
and contains(EmailAddress, 'foo.com')]")...
<Customer FirstName="Douglas" LastName="Adams">
<EmailAddress>[email protected]</EmailAddress></Customer>

This example refactors Example 14-2 by extracting the creation of the sample customer list XML document into the CreateCustomerListXml( ) method. You can now simply call this method from Main( ) to create the XML document.

Tip

There are a couple of things to notice about this code. The first is that although most of the code in this book has what I would consider excessive commenting, I took the liberty of stripping this one listing down to the level of commenting that I use in my own code: that is, “next to none.” I believe in commenting only when the code can’t speak for itself, and when it can’t I take that as a failure, typically a failure of variable or method naming, often a failure of structure. That’s not to say I never comment; just that I do so a lot less than other folks (except when I’m writing books!).

The second thing to note is that I’ve placed a lot more output statements whose entire purpose is to help you understand what you are seeing in the output; this is the kind of commenting that I think actually is helpful, and was the only kind of debugging available before the days of IDEs and breakpoints. It is good to get back to our roots.

Finally, note that for this example, I changed the names in the listing to some of my favorite writers. I did this as a tribute to them, and I hope that you will note their names and run out and buy everything they’ve written.

Searching for a Single Node

The first search is to find a customer whose first name is “Douglas”:

string xPath = "/Customers/Customer[@FirstName='Douglas']";
XmlNode oneCustomer = customerXml.SelectSingleNode(xPath);
Console.WriteLine("\nSelectSingleNode(\"{0}\")...", xPath);
if (oneCustomer != null)
{
    Console.WriteLine(oneCustomer.OuterXml);
}
else
{
    Console.WriteLine("Not found");
}

In general, you will have some idea of the structure of the XML documents you are going to process; otherwise, it will be difficult to find the information you want. Here we know the node we are looking for sits just one level below the root element. This makes it quite easy to construct the XPath using the absolute path:

/Customers/Customer[@FirstName='Douglas']

The beginning forward slash / indicates that the search should start from the top of the document. You then specify the top-level element, which is always the root element if you start from the top of the document, as in this case. Next, the target element, Customer, is specified. If the target element is a few more levels down, you can just specify the full path including all those levels, much like you do with filesystems.

Once the target element is reached, you specify the search conditions, or predicates, which are always enclosed in a pair of square brackets. In this case, you want to search for the value of the FirstName attribute, which is represented in XPath as @FirstName, where the @ prefix denotes that it is an attribute instead of an element. The value is then given to complete the condition expression.

There are many ways to execute an XPath query in .NET. Here, you start with the SelectSingleNode method from the XmlDocument class. I cover other execution methods later in this example and in the next example:

XmlNode oneCustomer = customerXml.SelectSingleNode(xPath);

The SelectSingleNode method searches for nodes starting from the context node, which is the node from which the call is initiated. In this case, the context node is the XmlDocument itself, customerXml. If this method finds a node that satisfies the search condition, it returns an instance of XmlNode. In the XML DOM, XmlNode is the base class representing any node in the XML document hierarchy. Specialized node classes such as XmlElement and XmlAttribute are all derived from this class. Even the XmlDocument itself is derived from XmlNode, because it just happens to be the top node.
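To make the idea of a context node concrete, here is a minimal sketch that is not part of the example listing. It first queries from the document and then runs a relative query from the element it found. The ContextNodeDemo class, the tiny inline document, and the example.com address are assumptions made purely for illustration.

```csharp
using System;
using System.Xml;

public static class ContextNodeDemo
{
    // Returns the OuterXml of the EmailAddress child of the first Customer,
    // found by running a relative XPath from the Customer element itself.
    public static string FindEmailFromCustomer()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<Customers>" +
            "<Customer FirstName='Douglas' LastName='Adams'>" +
            "<EmailAddress>douglas@example.com</EmailAddress>" +
            "</Customer>" +
            "</Customers>");

        // Context node: the document, so an absolute path is natural here.
        XmlNode customer = doc.SelectSingleNode("/Customers/Customer");
        if (customer == null)
            return null;

        // Context node: the Customer element. The relative path
        // "EmailAddress" is resolved against that element.
        XmlNode email = customer.SelectSingleNode("EmailAddress");
        return email != null ? email.OuterXml : null;
    }

    public static void Main()
    {
        Console.WriteLine(FindEmailFromCustomer() ?? "Not found");
    }
}
```

Because every XmlNode exposes SelectSingleNode, you can narrow a search step by step instead of writing one long absolute path.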

If the method fails to find any node, it returns null. Therefore, you should always test the result against null before attempting to use it:

if (oneCustomer != null)
    Console.WriteLine(oneCustomer.OuterXml);
else
    Console.WriteLine("Not found");

In this example, the method is successful, and the resulting element is displayed. Because XmlNode is a base class, you can access common properties such as Name, Value, InnerXml, OuterXml, and ParentNode, and methods such as AppendChild. If you need to access more specialized properties such as XmlAttribute.Specified, or methods such as XmlElement.RemoveAttribute, you should cast the result to the appropriate specialized type. In such cases, you can combine the testing and casting of search results to save yourself a little bit of typing using the C# as operator:

xPath = "/Customers/Customer[@FirstName='Douglas']";
XmlElement customerElem = customerXml.SelectSingleNode(xPath) as XmlElement;
Console.WriteLine("\nSelectSingleNode(\"{0}\")...", xPath);
if (customerElem != null)
{
    Console.WriteLine(customerElem.OuterXml);
    Console.WriteLine("customerElem.HasAttributes = {0}",
       customerElem.HasAttributes);
}
else
    Console.WriteLine("Not found");

Because the result here is cast to an instance of XmlElement, you can check its HasAttributes property, which is not available through XmlNode.

Searching Using Axes

In practice, you don’t always know the absolute path at design time. In such cases, you will need to use one of the XPath axes (pronounced as the plural of axis), which specify the relationship between the context node and the search target nodes.

Because you call the SelectSingleNode method through the XML document, the target nodes are descendants of the document. You should therefore use the descendant axis, which covers the immediate children, their children, their children’s children, and so on:

xPath = "descendant::Customer[@FirstName='Douglas']";
oneCustomer = customerXml.SelectSingleNode(xPath);
Console.WriteLine("\nSelectSingleNode(\"{0}\")...", xPath);
if (oneCustomer != null)
    Console.WriteLine(oneCustomer.OuterXml);
else
    Console.WriteLine("Not found");

The descendant axis in this XPath expression means that the SelectSingleNode method will search for nodes at any level of the document, not just at one specific level. The result is the same in this case. You can also use the shorthand notation //, which is short for /descendant-or-self::node()/ and in practice behaves like a search through all descendants. For instance, in the preceding example, you can also use:

xPath = "//Customer[@FirstName='Douglas']";

In addition to the descendant axis explained earlier, other types of axes are defined in XPath. You can find more details in the XPath references at http://www.w3.org/TR/xpath#axes.
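As a quick taste of two of those other axes, the following sketch starts from an EmailAddress element and navigates with parent and following-sibling. It is only an illustration: the AxesDemo class, the two-customer document, and the example.com addresses are made up for this sketch.

```csharp
using System;
using System.Xml;

public static class AxesDemo
{
    private static XmlDocument BuildDocument()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<Customers>" +
            "<Customer FirstName='Douglas'>" +
            "<EmailAddress>a@example.com</EmailAddress></Customer>" +
            "<Customer FirstName='Richard'>" +
            "<EmailAddress>b@example.com</EmailAddress></Customer>" +
            "</Customers>");
        return doc;
    }

    // Evaluates a relative XPath from Douglas's EmailAddress element and
    // returns the FirstName attribute of the element it finds (or null).
    public static string FirstNameVia(string relativeXPath)
    {
        XmlDocument doc = BuildDocument();
        XmlNode email = doc.SelectSingleNode(
            "/Customers/Customer[@FirstName='Douglas']/EmailAddress");
        XmlElement found = email.SelectSingleNode(relativeXPath) as XmlElement;
        return found != null ? found.GetAttribute("FirstName") : null;
    }

    public static void Main()
    {
        // parent:: selects the element that contains the context node.
        Console.WriteLine(FirstNameVia("parent::Customer"));

        // following-sibling:: selects later nodes that share the same parent.
        Console.WriteLine(
            FirstNameVia("parent::Customer/following-sibling::Customer"));
    }
}
```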

Predicates

The condition expression in an XPath expression is called a predicate. When an XPath search is performed, the predicate is evaluated against each candidate node. Here, the @ prefix indicates that the evaluation will be against an attribute. This is actually an abbreviated form of the attribute axis. For instance, the following XPath expression is semantically identical to the one used earlier, and produces the same search result:

xPath = "descendant::Customer[attribute::FirstName='Douglas']";
oneCustomer = customerXml.SelectSingleNode(xPath);
Console.WriteLine("\nSelectSingleNode(\"{0}\")...", xPath);
if (oneCustomer != null)
    Console.WriteLine(oneCustomer.OuterXml);
else
    Console.WriteLine("Not found");

If no axis is specified, XPath defaults to the child axis, so a bare name such as EmailAddress refers to a child element of the node being tested. Therefore, the following code snippet finds the customer who has a specific email address:

xPath = "descendant::Customer[EmailAddress ='[email protected]']";
oneCustomer = customerXml.SelectSingleNode(xPath);
Console.WriteLine("\nSelectSingleNode(\"{0}\")...", xPath);
if (oneCustomer != null)
{
    Console.WriteLine(oneCustomer.OuterXml);
}
else
{
    Console.WriteLine("Not found");
}

What if you want to find a node with specific text—for instance, instead of finding the customer element containing a given email address, we want to find the email address element itself? Unfortunately, because XPath and the XML DOM are separate standards, they don’t always provide the same features in the same manner. For instance, InnerText or InnerXml defined in the XML DOM cannot be used in XPath predicates. Instead, the text of an element is returned with the XPath text( ) function:

xPath = "descendant::EmailAddress[text(  )='[email protected]']";
XmlNode oneEmail = customerXml.SelectSingleNode(xPath);
Console.WriteLine("\nSelectSingleNode(\"{0}\")...", xPath);
if (oneEmail != null)
    Console.WriteLine(oneEmail.OuterXml);
else
    Console.WriteLine("Not found");

XPath provides a comprehensive list of functions, including string, numeric, and Boolean functions, which you can use to build your queries. So, be sure to read the documentation to understand what they can do for you.
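Here is a small sketch with one function from each of those three families inside a predicate. The FunctionDemo class and the Orders attribute are hypothetical and exist only so that there is something numeric to test; they are not part of the chapter's customer data.

```csharp
using System;
using System.Xml;

public static class FunctionDemo
{
    // Counts the Customer elements selected by the given XPath.
    public static int CountMatches(string xPath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<Customers>" +
            "<Customer FirstName='Douglas' Orders='3' />" +
            "<Customer FirstName='Ian' Orders='12' />" +
            "<Customer FirstName='Neal' Orders='7' />" +
            "</Customers>");
        return doc.SelectNodes(xPath).Count;
    }

    public static void Main()
    {
        // String function: first names longer than three characters.
        Console.WriteLine(
            CountMatches("//Customer[string-length(@FirstName) > 3]"));

        // Numeric function: convert the attribute text to a number first.
        Console.WriteLine(
            CountMatches("//Customer[number(@Orders) > 10]"));

        // Boolean function: negate another test.
        Console.WriteLine(
            CountMatches("//Customer[not(starts-with(@FirstName, 'D'))]"));
    }
}
```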

So far, all the queries return a single node, but often, the search result contains a collection of nodes. Therefore, instead of using the SelectSingleNode method, you could use the SelectNodes method:

xPath = "descendant::Customer[contains(EmailAddress, 'foo.com')]";
XmlNodeList customers = customerXml.SelectNodes(xPath);
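Console.WriteLine("\nSelectNodes(\"{0}\")...", xPath);
// For a document, SelectNodes returns an empty list rather than null
// when nothing matches, so test Count.
if (customers.Count > 0)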
{
    foreach (XmlNode customer in customers)
        Console.WriteLine(customer.OuterXml);
}
else
    Console.WriteLine("Not found");

This query finds all customers whose email address contains foo.com, that is, those in the same domain. As you would expect, the method returns a collection of XmlNode objects, held in an instance of XmlNodeList. You can iterate over the result to see all the nodes returned.
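One detail is worth checking for yourself: as far as I know, when nothing matches, SelectNodes on a document hands back an empty XmlNodeList rather than null, so Count is the dependable test. The sketch below assumes a made-up two-customer document and a hypothetical CountByDomain helper.

```csharp
using System;
using System.Xml;

public static class SelectNodesDemo
{
    // Counts customers whose EmailAddress contains the given text.
    // The text is pasted into the XPath for brevity; do not do this
    // with untrusted input.
    public static int CountByDomain(string domain)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<Customers>" +
            "<Customer><EmailAddress>a@example.com</EmailAddress></Customer>" +
            "<Customer><EmailAddress>b@example.org</EmailAddress></Customer>" +
            "</Customers>");

        string xPath =
            "descendant::Customer[contains(EmailAddress, '" + domain + "')]";
        XmlNodeList matches = doc.SelectNodes(xPath);

        // An unmatched query yields a list with Count == 0.
        return matches.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountByDomain("example.com"));
        Console.WriteLine(CountByDomain("example.net"));
    }
}
```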

XPath Functions

The next code block shows a more complex predicate that finds customers whose last name starts with A and whose email address is in the foo.com domain:

xPath = "descendant::Customer[starts-with(@LastName, 'A') " +
        "and contains(EmailAddress, 'foo.com')]";
customers = customerXml.SelectNodes(xPath);
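Console.WriteLine("\nSelectNodes(\"{0}\")...", xPath);
// For a document, SelectNodes returns an empty list rather than null
// when nothing matches, so test Count.
if (customers.Count > 0)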
{
    foreach (XmlNode customer in customers)
        Console.WriteLine(customer.OuterXml);
}
else
    Console.WriteLine("Not found");

The predicate here is composed of evaluation against attributes and child elements. The first part checks whether the LastName attribute value starts with the letter A using the XPath starts-with(string1, string2) function, which checks whether string1 starts with string2. The two parts of the predicate are joined using the XPath and operator.

Many functions are defined in XPath; you can obtain a complete list of XPath functions from http://www.w3.org/TR/xpath#corelib.

Searching Using XPathNavigator

Another way to query XML documents using XPath is to use the .NET XPathNavigator class, which is defined in the System.Xml.XPath namespace. This namespace contains a set of classes that provide optimized operations for searching and iterating XML data using XPath.

To demonstrate the use of these classes, we will use the same set of customer data as in the previous examples, as shown in Example 14-4.

Example 14-4. Searching an XML document using XPathNavigator
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.XPath;

namespace Programming_CSharp
{
    public class Customer
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string EmailAddress { get; set; }

        // Overrides the Object.ToString(  ) to provide a
        // string representation of the object properties.
        public override string ToString(  )
        {
            return string.Format("{0} {1}\nEmail:   {2}",
                        FirstName, LastName, EmailAddress);
        }
    }
    // Main program
    public class Tester
    {
        static void Main(  )
        {
            XmlDocument customerXml = CreateCustomerXml(  );
            XPathNavigator nav = customerXml.CreateNavigator(  );

            string xPath = "descendant::Customer[@FirstName='Douglas']";
            XPathNavigator navNode = nav.SelectSingleNode(xPath);
            Console.WriteLine("\nSelectSingleNode(\"{0}\")...", xPath);
            if (navNode != null)
            {
                Console.WriteLine(navNode.OuterXml);

                XmlElement elem = navNode.UnderlyingObject as XmlElement;
                if (elem != null)
                    Console.WriteLine(elem.OuterXml);
                else
                    Console.WriteLine("Found the wrong node!");
            }
            else
                Console.WriteLine("Customer not found");

            xPath = "descendant::Customer[starts-with(@LastName, 'A') " +
                    "and contains(EmailAddress, 'foo.com')]";
            Console.WriteLine("\nSelect(\"{0}\")...", xPath);
            XPathNodeIterator iter = nav.Select(xPath);
            if (iter.Count > 0)
            {
                while (iter.MoveNext(  ))
                    Console.WriteLine(iter.Current.OuterXml);
            }
            else
                Console.WriteLine("Customer not found");

            Console.WriteLine("\nNow sort by FirstName...");
            XPathExpression expr = nav.Compile(xPath);
            expr.AddSort("@FirstName", Comparer<String>.Default);
            iter = nav.Select(expr);
            while (iter.MoveNext(  ))
                Console.WriteLine(iter.Current.OuterXml);

            XPathExpression expr2 = nav.Compile(xPath);
            Console.WriteLine("\nAnd again...");
            expr2.AddSort("@FirstName", XmlSortOrder.Ascending,
                XmlCaseOrder.None, string.Empty, XmlDataType.Text);
            iter = nav.Select(expr2);
            while (iter.MoveNext(  ))
                Console.WriteLine(iter.Current.OuterXml);
        }

        // Create an XML document containing a customer list.
        private static XmlDocument CreateCustomerXml(  )
        {

            List<Customer> customers = CreateCustomerList(  );
            XmlDocument customerXml = new XmlDocument(  );
            XmlElement rootElem = customerXml.CreateElement("Customers");
            customerXml.AppendChild(rootElem);
            foreach (Customer customer in customers)
            {
                XmlElement customerElem = customerXml.CreateElement("Customer");

                XmlAttribute firstNameAttr =
                    customerXml.CreateAttribute("FirstName");
                firstNameAttr.Value = customer.FirstName;
                customerElem.Attributes.Append(firstNameAttr);

                XmlAttribute lastNameAttr =
                    customerXml.CreateAttribute("LastName");
                lastNameAttr.Value = customer.LastName;
                customerElem.Attributes.Append(lastNameAttr);

                XmlElement emailAddress =
                    customerXml.CreateElement("EmailAddress");
                emailAddress.InnerText = customer.EmailAddress;
                customerElem.AppendChild(emailAddress);

                rootElem.AppendChild(customerElem);
            }

            return customerXml;
        }
        private static List<Customer> CreateCustomerList(  )
        {
            List<Customer> customers = new List<Customer>
                {
                    new Customer { FirstName = "Douglas",
                                   LastName = "Adams",
                                   EmailAddress = "[email protected]"},
                    new Customer { FirstName = "Richard",
                                   LastName = "Adawkins",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Kenji",
                                   LastName = "Ayoshino",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Ian",
                                   LastName = "AmcEwan",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Neal",
                                   LastName = "Astephenson",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Randy",
                                   LastName = "Ashilts",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Michelangelo",
                                   LastName = "Asignorile ",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Larry",
                                   LastName = "Akramer",
                                   EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Jennifer",
                                   LastName = "Abaumgardner",
                                   EmailAddress = "[email protected]" }

                };
            return customers;
        }
    }
}

Output:

<Customer FirstName="Kenji" LastName="Ayoshino">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Ian" LastName="AmcEwan">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Neal" LastName="Astephenson">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Randy" LastName="Ashilts">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Michelangelo" LastName="Asignorile ">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Larry" LastName="Akramer">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Jennifer" LastName="Abaumgardner">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

Now sort by FirstName...
<Customer FirstName="Douglas" LastName="Adams">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Ian" LastName="AmcEwan">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Jennifer" LastName="Abaumgardner">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Kenji" LastName="Ayoshino">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Larry" LastName="Akramer">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Michelangelo" LastName="Asignorile ">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Neal" LastName="Astephenson">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Randy" LastName="Ashilts">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Richard" LastName="Adawkins">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

And again...
<Customer FirstName="Douglas" LastName="Adams">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Ian" LastName="AmcEwan">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Jennifer" LastName="Abaumgardner">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Kenji" LastName="Ayoshino">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Larry" LastName="Akramer">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Michelangelo" LastName="Asignorile ">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Neal" LastName="Astephenson">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Randy" LastName="Ashilts">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>
<Customer FirstName="Richard" LastName="Adawkins">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

Tip

We had to take some horrible liberties with the last names of some wonderful writers to get this example to work. For that, I apologize.

This example added the using System.Xml.XPath directive to include the required classes. The customer XML document is created in the same way as in previous examples:

XmlDocument customerXml = CreateCustomerXml(  );
XPathNavigator nav = customerXml.CreateNavigator(  );

Here, it also creates an instance of the XPathNavigator class. You don’t construct a navigator directly; you obtain one by calling the CreateNavigator method of the target XmlDocument instance. Instead of calling the methods of the XML document, you now use the navigator object to execute queries:

string xPath = "descendant::Customer[@FirstName='Douglas']";
XPathNavigator navNode = nav.SelectSingleNode(xPath);

The SelectSingleNode( ) method also returns a single node. However, it returns another XPathNavigator object from which you can query further.

You can access many of the node properties, such as InnerXml, from the navigator object. However, if you need to access properties or methods of the specific node type, you should retrieve the underlying node using the UnderlyingObject property of XPathNavigator:

Console.WriteLine("\nSelectSingleNode(\"{0}\")...", xPath);
if (navNode != null)
{
    Console.WriteLine(navNode.OuterXml);

    XmlElement elem = navNode.UnderlyingObject as XmlElement;
    if (elem != null)
        Console.WriteLine(elem.OuterXml);
    else
        Console.WriteLine("Found the wrong node!");
}
else
    Console.WriteLine("Customer not found");
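Often you can stay with the navigator and never touch the underlying node. The following sketch reads an attribute and runs a further query from the navigator returned by SelectSingleNode. The NavigatorDemo class, the inline document, and the example.com address are assumptions for illustration only.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

public static class NavigatorDemo
{
    // Builds a "FirstName <email>" description using navigator calls only.
    public static string DescribeFirstCustomer()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<Customers>" +
            "<Customer FirstName='Douglas' LastName='Adams'>" +
            "<EmailAddress>douglas@example.com</EmailAddress>" +
            "</Customer>" +
            "</Customers>");

        XPathNavigator nav = doc.CreateNavigator();
        XPathNavigator customer = nav.SelectSingleNode("descendant::Customer");
        if (customer == null)
            return null;

        // Attributes are looked up by local name and namespace URI.
        string firstName = customer.GetAttribute("FirstName", string.Empty);

        // The returned navigator is itself a starting point for more queries.
        XPathNavigator email = customer.SelectSingleNode("EmailAddress");
        string address = email != null ? email.Value : "(none)";

        return firstName + " <" + address + ">";
    }

    public static void Main()
    {
        Console.WriteLine(DescribeFirstCustomer());
    }
}
```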

Using XPathNodeIterator

For queries that may return more than one node, you should call the Select method of the XPathNavigator class:

xPath = "descendant::Customer[starts-with(@LastName, 'A') " +
        "and contains(EmailAddress, 'foo.com')]";
Console.WriteLine("\nSelect(\"{0}\")...", xPath);
XPathNodeIterator iter = nav.Select(xPath);
if (iter.Count > 0)
{
    while (iter.MoveNext(  ))
        Console.WriteLine(iter.Current.OuterXml);
}
else
{
    Console.WriteLine("Customer not found");
}

The Select method returns an XPathNodeIterator instance, which allows you to iterate through the results. One important feature of this approach is that the query is not executed on this line:

XPathNodeIterator iter = nav.Select(xPath);

The query is executed only when you go through the result by calling the iterator’s MoveNext( ) method. This reduces the initial hit, especially when the document is large. This is one of the performance advantages you gain by using XPathNavigator instead of searching through the XmlDocument directly.

This delayed query execution means that it’s not always a good idea to access the iterator’s Count property because this causes the query to be executed. Therefore, the code in this example is not very efficient, especially if the document or the result is large. However, it is useful when checking whether the query returns anything.
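If all you want to know is whether anything matched, one alternative is to let the first MoveNext call answer that question and skip Count altogether. This is a sketch of that pattern, not the approach used in the listing; the IteratorDemo class and its two-customer document are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public static class IteratorDemo
{
    // Returns the matching first names joined by ", ", or a
    // "not found" message, without ever reading iter.Count.
    public static string JoinFirstNames(string xPath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<Customers>" +
            "<Customer FirstName='Douglas' />" +
            "<Customer FirstName='Ian' />" +
            "</Customers>");

        XPathNodeIterator iter = doc.CreateNavigator().Select(xPath);

        // The first MoveNext call tells you whether anything matched.
        if (!iter.MoveNext())
            return "Customer not found";

        List<string> names = new List<string>();
        do
        {
            names.Add(iter.Current.GetAttribute("FirstName", string.Empty));
        } while (iter.MoveNext());

        return string.Join(", ", names.ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(JoinFirstNames("descendant::Customer"));
        Console.WriteLine(
            JoinFirstNames("descendant::Customer[@FirstName='Zaphod']"));
    }
}
```

The do/while loop is needed because the first MoveNext has already positioned the iterator on the first result.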

Using XPathExpression

Although the SelectNodes and SelectSingleNode methods of XmlDocument, and the Select and SelectSingleNode methods of XPathNavigator, accept an XPath expression as plain text, they compile it into a form that the XML query engine can execute before the query is actually run. If you call any of these methods with the same XPath expression again, the expression is compiled again.

If you anticipate that you may run the same query many times, it would be beneficial to compile the XPath expression yourself and use the compiled form whenever needed. In .NET, you can do this by calling the XPathNavigator’s Compile method. The result is an XPathExpression object that can be cached for later use:

XPathExpression expr = nav.Compile(xPath);
iter = nav.Select(expr);

An additional benefit of creating a compiled expression is that you can use it to sort the query results. You can add a sort condition to a compiled expression using its AddSort method:

expr.AddSort("@FirstName", Comparer<String>.Default);

The first argument is the sort key, and the second is an instance of a comparer class that implements IComparer. The .NET Framework provides a predefined generic Comparer<T> class using the singleton pattern. Therefore, if the sort key is a string, as in this example, you can use default string comparison by passing in the singleton Comparer<String>.Default instance to the AddSort method. You can also indicate a case-insensitive comparison using the System.Collections.CaseInsensitiveComparer class.
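To see what difference the comparer makes, here is a sketch that sorts the same three names twice. It uses CaseInsensitiveComparer.DefaultInvariant for the case-insensitive run and, as a deliberately strict contrast that the listing does not use, StringComparer.Ordinal. The ComparerSortDemo class and the mixed-case names are made up for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public static class ComparerSortDemo
{
    // Sorts the customers by FirstName with the supplied comparer and
    // returns the names in the resulting order, joined by ", ".
    public static string SortedNames(IComparer comparer)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<Customers>" +
            "<Customer FirstName='Neal' />" +
            "<Customer FirstName='douglas' />" +
            "<Customer FirstName='Ian' />" +
            "</Customers>");

        XPathNavigator nav = doc.CreateNavigator();
        XPathExpression expr = nav.Compile("descendant::Customer");
        expr.AddSort("@FirstName", comparer);

        List<string> names = new List<string>();
        XPathNodeIterator iter = nav.Select(expr);
        while (iter.MoveNext())
            names.Add(iter.Current.GetAttribute("FirstName", string.Empty));

        return string.Join(", ", names.ToArray());
    }

    public static void Main()
    {
        // Ignores case, so "douglas" sorts ahead of "Ian".
        Console.WriteLine(SortedNames(CaseInsensitiveComparer.DefaultInvariant));

        // Compares character codes, so uppercase letters come first.
        Console.WriteLine(SortedNames(StringComparer.Ordinal));
    }
}
```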

The AddSort method is overloaded, with the second version taking more arguments to specify detailed sorting requirements and to perform either a numeric or a text comparison:

expr2.AddSort(sortKey, sortOrder, caseOrder, language, dataType);

You can decide whether to sort in ascending or descending order, whether lowercase or uppercase should come first, which language to use for comparison, and whether it should be a numeric or a text comparison:

expr2.AddSort("@FirstName", XmlSortOrder.Ascending,
              XmlCaseOrder.None, string.Empty, XmlDataType.Text);

After adding the sort condition in this example, you can see from the preceding result that the returned nodes are now ordered by their FirstName attribute.
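The data type argument matters as soon as the sort key holds numbers. In this sketch the same values are sorted once as text and once as numbers. The Orders attribute and the DataTypeSortDemo class are hypothetical; the chapter's customer data has no numeric attribute.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public static class DataTypeSortDemo
{
    // Sorts customers by the Orders attribute, treating it as the given
    // data type, and returns the Orders values in the resulting order.
    public static string SortedOrders(XmlDataType dataType)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<Customers>" +
            "<Customer Orders='3' />" +
            "<Customer Orders='12' />" +
            "<Customer Orders='7' />" +
            "</Customers>");

        XPathNavigator nav = doc.CreateNavigator();
        XPathExpression expr = nav.Compile("descendant::Customer");
        expr.AddSort("@Orders", XmlSortOrder.Ascending,
                     XmlCaseOrder.None, string.Empty, dataType);

        List<string> values = new List<string>();
        XPathNodeIterator iter = nav.Select(expr);
        while (iter.MoveNext())
            values.Add(iter.Current.GetAttribute("Orders", string.Empty));

        return string.Join(", ", values.ToArray());
    }

    public static void Main()
    {
        // Text comparison goes character by character: "12" precedes "3".
        Console.WriteLine(SortedOrders(XmlDataType.Text));

        // Numeric comparison orders by value.
        Console.WriteLine(SortedOrders(XmlDataType.Number));
    }
}
```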

XML Serialization

So far, you have constructed the XML representing the Customer objects by hand. As XML is becoming popular, especially with the acceptance of web services as a central component of the SOA, it is increasingly common to serialize objects into XML, transmit them across process and application boundaries, and deserialize them back into conventional objects.

Tip

For more information about SOA, see Programming WCF Services by Juval Löwy (O’Reilly).

It is therefore natural for the .NET Framework to provide a built-in XML serialization mechanism to reduce the coding effort for application developers; service frameworks such as Windows Communication Foundation (WCF) can make use of it. The System.Xml.Serialization namespace defines the classes and utilities that implement methods required for serializing and deserializing objects. Example 14-5 illustrates this.

Example 14-5. Simple XML serialization and deserialization
using System;
using System.IO;
using System.Xml.Serialization;

namespace Programming_CSharp
{
    // Simple customer class
    public class Customer
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string EmailAddress { get; set; }

        // Overrides the Object.ToString(  ) to provide a
        // string representation of the object properties.
        public override string ToString(  )
        {
            return string.Format("{0} {1}\nEmail:   {2}",
                        FirstName, LastName, EmailAddress);
        }
    }

    // Main program
    public class Tester
    {
        static void Main(  )
        {
            Customer c1 = new Customer
                          {
                              FirstName = "Orlando",
                              LastName = "Gee",
                              EmailAddress = "[email protected]"
                          };

            XmlSerializer serializer = new XmlSerializer(typeof(Customer));
            StringWriter writer = new StringWriter(  );

            serializer.Serialize(writer, c1);
            string xml = writer.ToString(  );
            Console.WriteLine("Customer in XML:\n{0}\n", xml);

            Customer c2 = serializer.Deserialize(new StringReader(xml))
                          as Customer;
            Console.WriteLine("Customer in Object:\n{0}", c2.ToString(  ));

            Console.ReadKey(  );
        }
    }
}

Output:
Customer in XML:
<?xml version="1.0" encoding="utf-16"?>
<Customer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <FirstName>Orlando</FirstName>
  <LastName>Gee</LastName>
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

Customer in Object:
Orlando Gee
Email:   [email protected]

To serialize an object using .NET XML serialization, you need to create an XmlSerializer object:

XmlSerializer serializer = new XmlSerializer(typeof(Customer));

You must pass in the type of the object to be serialized to the XmlSerializer constructor. If you don’t know the object type at design time, you can discover it by calling its GetType( ) method:

XmlSerializer serializer = new XmlSerializer(c1.GetType(  ));

You also need to decide where the serialized XML document should be stored. In this example, you simply send it to a StringWriter:

StringWriter writer = new StringWriter(  );

serializer.Serialize(writer, c1);
string xml = writer.ToString(  );
Console.WriteLine("Customer in XML:\n{0}\n", xml);

The resulting XML string is then displayed on the console:

<?xml version="1.0" encoding="utf-16"?>
<Customer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <FirstName>Orlando</FirstName>
  <LastName>Gee</LastName>
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

The first line is an XML declaration. This is to let the consumers (human users and software applications) of this document know that this is an XML file, the official version to which this file conforms, and the encoding format used. This is optional in XML, but it is generated by .NET XML Serialization.
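The utf-16 in that declaration comes from the StringWriter, because .NET strings are UTF-16 internally. If you need a different encoding, one option is to hand the serializer a writer that uses it. The following sketch writes UTF-8 into a MemoryStream; the Utf8SerializationDemo class is hypothetical, and the two-property Customer is a cut-down stand-in for the one in the example.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;

public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class Utf8SerializationDemo
{
    // Serializes the customer through a UTF-8 StreamWriter and returns
    // the resulting text so that the declaration can be inspected.
    public static string SerializeAsUtf8(Customer customer)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Customer));
        using (MemoryStream stream = new MemoryStream())
        {
            // Passing false suppresses the byte order mark.
            Encoding utf8 = new UTF8Encoding(false);
            using (StreamWriter writer = new StreamWriter(stream, utf8))
            {
                serializer.Serialize(writer, customer);
            }
            // ToArray still works after the writer has closed the stream.
            return utf8.GetString(stream.ToArray());
        }
    }

    public static void Main()
    {
        Customer c = new Customer { FirstName = "Orlando", LastName = "Gee" };
        Console.WriteLine(SerializeAsUtf8(c));
    }
}
```

Swapping the MemoryStream for a FileStream would send the same bytes to a file instead.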

The root element is the Customer element, with each property represented as a child element. The xmlns:xsi and xmlns:xsd attributes declare the namespace prefixes for XML Schema instances and XML Schema definitions; nothing in this document actually uses them. They are optional, so I will not explain them further. If you are interested, please read the XML specification or other documentation, such as the MSDN Library, for more details.
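If you would rather not see those two namespace declarations at all, a common technique is to pass the serializer an XmlSerializerNamespaces object containing a single empty entry. This is a sketch of that technique rather than something the example relies on; the NamespaceDemo class is hypothetical and the Customer class is trimmed for brevity.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class NamespaceDemo
{
    // Serializes the customer without the default xsi/xsd declarations.
    public static string SerializeWithoutNamespaces(Customer customer)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Customer));

        // One empty prefix/namespace pair replaces the default set.
        XmlSerializerNamespaces namespaces = new XmlSerializerNamespaces();
        namespaces.Add(string.Empty, string.Empty);

        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, customer, namespaces);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Customer c = new Customer { FirstName = "Orlando", LastName = "Gee" };
        Console.WriteLine(SerializeWithoutNamespaces(c));
    }
}
```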

Aside from those optional parts, this XML representation of the Customer object is identical to the one created in Example 14-1. However, instead of writing tens of lines of code, you need only three lines using .NET XML Serialization classes.

Furthermore, it is just as easy to reconstruct an object from its XML form:

Customer c2 = serializer.Deserialize(new StringReader(xml))
              as Customer;
Console.WriteLine("Customer in Object:\n{0}", c2.ToString(  ));

All you need to do is call the XmlSerializer.Deserialize method. It has several overloaded versions, one of which takes a TextReader instance as an input parameter. Because StringReader is derived from TextReader, you just pass an instance of StringReader to read from the XML string. The Deserialize method returns an object, so it is necessary to cast it to the correct type.
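If you deserialize in many places, you might wrap the cast in a small generic helper, as in the sketch below. The XmlHelper class and its FromXml method are my own invention for illustration, not part of the .NET Framework, and the Customer class is trimmed to two properties.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class XmlHelper
{
    // Deserializes the XML text into an instance of T.
    // A direct cast is used, so a type mismatch would throw
    // instead of silently producing null as the "as" operator does.
    public static T FromXml<T>(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));
        using (StringReader reader = new StringReader(xml))
        {
            return (T)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        string xml =
            "<Customer>" +
            "<FirstName>Orlando</FirstName>" +
            "<LastName>Gee</LastName>" +
            "</Customer>";

        Customer c = FromXml<Customer>(xml);
        Console.WriteLine("{0} {1}", c.FirstName, c.LastName);
    }
}
```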

Customizing XML Serialization Using Attributes

By default, all public read/write properties are serialized as child elements. You can customize your classes by specifying the type of XML node you want for each of your public properties, as shown in Example 14-6.

Example 14-6. Customizing XML serialization with attributes
using System;
using System.IO;
using System.Xml.Serialization;

namespace Programming_CSharp
{
    // Simple customer class
    public class Customer
    {
        [XmlAttribute(  )]
        public string FirstName { get; set; }

        [XmlIgnore(  )]
        public string LastName { get; set; }

        public string EmailAddress { get; set; }

        // Overrides the Object.ToString(  ) to provide a
        // string representation of the object properties.
        public override string ToString(  )
        {
            return string.Format("{0} {1}\nEmail:   {2}",
                        FirstName, LastName, EmailAddress);
        }
    }

    // Main program
    public class Tester
    {
        static void Main(  )
        {
            Customer c1 = new Customer
                          {
                              FirstName = "Orlando",
                              LastName = "Gee",
                              EmailAddress = "[email protected]"
                          };

            //XmlSerializer serializer = new XmlSerializer(c1.GetType(  ));
            XmlSerializer serializer = new XmlSerializer(typeof(Customer));
            StringWriter writer = new StringWriter(  );

            serializer.Serialize(writer, c1);
            string xml = writer.ToString(  );
            Console.WriteLine("Customer in XML:\n{0}\n", xml);

            Customer c2 = serializer.Deserialize(new StringReader(xml)) as
                          Customer;
            Console.WriteLine("Customer in Object:\n{0}", c2.ToString(  ));

            Console.ReadKey(  );
        }
    }
}

Output:
Customer in XML:
<?xml version="1.0" encoding="utf-16"?>
<Customer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema"
          FirstName="Orlando">
  <EmailAddress>[email protected]</EmailAddress>
</Customer>

Customer in Object:
Orlando
Email:   [email protected]

The only changes in this example are a couple of XML serialization attributes added to the Customer class. The first change specifies that you want to serialize the FirstName property as an attribute of the Customer element, by adding the XmlAttributeAttribute to the property:

[XmlAttribute(  )]
public string FirstName { get; set; }

The other change tells XML serialization that you do not want the LastName property to be serialized at all. You do this by adding the XmlIgnoreAttribute to the property:

[XmlIgnore(  )]
public string LastName { get; set; }

As you can see from the sample output, the Customer object is serialized exactly as we asked.

However, you have probably noticed that when the object is deserialized, its LastName property is lost. Because it is not serialized, the XmlSerializer is unable to assign it any value. Therefore, its value is left as the default, which for a string is null (string.Format displays it as an empty string).

The goal is to exclude from serialization only those properties you don’t need or can compute or can retrieve in other ways.
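The same attributes can also rename nodes, and XmlRoot and XmlElement do the equivalent job for the root and for child elements. The sketch below combines them and confirms that the ignored property comes back as null. The names Client, first, and Email, the AttributeDemo class, and the example.com address are arbitrary choices made for illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("Client")]                  // rename the root element
public class Customer
{
    [XmlAttribute("first")]          // attribute with a custom name
    public string FirstName { get; set; }

    [XmlIgnore]                      // never written, never read back
    public string LastName { get; set; }

    [XmlElement("Email")]            // child element with a custom name
    public string EmailAddress { get; set; }
}

public static class AttributeDemo
{
    public static string ToXml(Customer customer)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Customer));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, customer);
            return writer.ToString();
        }
    }

    public static Customer FromXml(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Customer));
        using (StringReader reader = new StringReader(xml))
        {
            return (Customer)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        Customer c1 = new Customer
        {
            FirstName = "Orlando",
            LastName = "Gee",
            EmailAddress = "orlando@example.com"
        };

        string xml = ToXml(c1);
        Console.WriteLine(xml);

        Customer c2 = FromXml(xml);
        Console.WriteLine("LastName is null: {0}", c2.LastName == null);
    }
}
```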

Runtime XML Serialization Customization

Sometimes it may be necessary to customize the serialization of objects at runtime. For instance, your class may contain an instance of another class. The contained class may be serialized with all its properties as child elements. However, you may want to have them serialized into attributes to save some space. Example 14-7 illustrates how you can achieve this.

Example 14-7. Customizing XML serialization at runtime
using System;
using System.IO;
using System.Reflection;
using System.Xml.Serialization;

namespace Programming_CSharp
{
    // Simple customer class
    public class Customer
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string EmailAddress { get; set; }

        // Overrides the Object.ToString(  ) to provide a
        // string representation of the object properties.
        public override string ToString(  )
        {
            return string.Format("{0} {1}\nEmail:   {2}",
                        FirstName, LastName, EmailAddress);
        }
    }

    // Main program
    public class Tester
    {
        static void Main(  )
        {
            Customer c1 = new Customer
                          {
                              FirstName = "Orlando",
                              LastName = "Gee",
                              EmailAddress = "[email protected]"
                          };

            Type customerType = typeof(Customer);
            XmlAttributeOverrides overrides = new XmlAttributeOverrides(  );
            foreach (PropertyInfo prop in customerType.GetProperties(  ))
            {
                XmlAttributes attrs = new XmlAttributes(  );
                attrs.XmlAttribute = new XmlAttributeAttribute(  );
                overrides.Add(customerType, prop.Name, attrs);
            }

            XmlSerializer serializer = new XmlSerializer(customerType, overrides);
            StringWriter writer = new StringWriter(  );

            serializer.Serialize(writer, c1);
            string xml = writer.ToString(  );
            Console.WriteLine("Customer in XML:\n{0}\n", xml);

            Customer c2 = serializer.Deserialize(new StringReader(xml)) as
                          Customer;
            Console.WriteLine("Customer in Object:\n{0}", c2.ToString(  ));

            Console.ReadKey(  );
        }
    }
}

Output:
Customer in XML:
<?xml version="1.0" encoding="utf-16"?>
<Customer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema"
          FirstName="Orlando" LastName="Gee" EmailAddress="[email protected]" />

Customer in Object:
Orlando Gee
Email:   [email protected]

The Customer class in this example has no custom XML serialization attributes, so by default all of its properties would be serialized as child elements, as you have seen in previous examples. When an instance is serialized at runtime in the Main method, the program combines reflection with attribute overrides to serialize the properties as attributes instead.

In .NET XML serialization, you can instruct the serialization engine to override its default behavior with your own requirements. Because the Customer type is used several times, you store it in a local variable so that you can reuse it:

Type customerType = typeof(Customer);

To specify your custom requirements, you use the XmlAttributeOverrides class:

XmlAttributeOverrides overrides = new XmlAttributeOverrides(  );
foreach (PropertyInfo prop in customerType.GetProperties(  ))
{
    XmlAttributes attrs = new XmlAttributes(  );
    attrs.XmlAttribute = new XmlAttributeAttribute(  );
    overrides.Add(customerType, prop.Name, attrs);
}

The first step is to create a new XmlAttributeOverrides instance. You can now use .NET reflection to go through all the properties of the target class, using its GetProperties method. For each property, you override its default serialization behavior by adding an XmlAttributes object to the XmlAttributeOverrides object. To specify that you want to serialize the property as an attribute, you assign an XmlAttributeAttribute object to the XmlAttributes.XmlAttribute property. This is the equivalent of adding the XmlAttributeAttribute to a property at design time, as you did in the last example.

The XmlAttributeOverrides.Add method used here takes three parameters. The first is the type that declares the property, the second is the name of the property, and the last is the XmlAttributes object that describes the custom serialization behavior.
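The loop in Example 14-7 treats every property the same way, but Add can also be called once per property with different settings. The following is a minimal sketch under that assumption: FirstName becomes an attribute, LastName is skipped by setting XmlAttributes.XmlIgnore, and EmailAddress keeps the default behavior. The SelectiveOverrideDemo class, its method names, and the placeholder email address are hypothetical and exist only for this illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string EmailAddress { get; set; }
}

public static class SelectiveOverrideDemo
{
    public static XmlAttributeOverrides BuildOverrides()
    {
        XmlAttributeOverrides overrides = new XmlAttributeOverrides();

        // FirstName: serialize as an attribute of the Customer element.
        XmlAttributes asAttribute = new XmlAttributes();
        asAttribute.XmlAttribute = new XmlAttributeAttribute();
        overrides.Add(typeof(Customer), "FirstName", asAttribute);

        // LastName: the runtime equivalent of [XmlIgnore].
        XmlAttributes ignored = new XmlAttributes();
        ignored.XmlIgnore = true;
        overrides.Add(typeof(Customer), "LastName", ignored);

        // EmailAddress has no entry, so it stays a child element.
        return overrides;
    }

    public static string ToXml(Customer customer)
    {
        XmlSerializer serializer =
            new XmlSerializer(typeof(Customer), BuildOverrides());
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, customer);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Customer c1 = new Customer
        {
            FirstName = "Orlando",
            LastName = "Gee",
            EmailAddress = "orlando@example.com" // placeholder value
        };
        Console.WriteLine(ToXml(c1));
    }
}
```

Properties without an entry in the overrides object fall back to whatever the class declares, which here is the default child-element form.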

To ensure that the XML serializer uses the custom serialization overrides, you must pass the overrides object to its constructor:

XmlSerializer serializer = new XmlSerializer(customerType, overrides);

The rest of this example is unchanged from the previous one. You can see from the sample output that all the properties are indeed serialized as attributes instead of child elements. Because the same serializer is used to deserialize the object, the custom overrides are applied in that direction as well, and the object is reconstructed correctly.
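The scenario that opened this section, a class that contains an instance of another class, can be handled the same way. The sketch below assumes a hypothetical Order class with a Buyer property of type Customer; neither the Order class nor the ContainedOverrideDemo helper appears in the chapter's examples. The overrides are registered against the contained Customer type, while the serializer is created for the containing Order type.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Xml.Serialization;

public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Hypothetical containing class, used only for this sketch.
public class Order
{
    public int OrderId { get; set; }
    public Customer Buyer { get; set; }
}

public static class ContainedOverrideDemo
{
    public static XmlSerializer CreateSerializer()
    {
        XmlAttributeOverrides overrides = new XmlAttributeOverrides();

        // Override only the contained type; Order keeps its defaults.
        foreach (PropertyInfo prop in typeof(Customer).GetProperties())
        {
            XmlAttributes attrs = new XmlAttributes();
            attrs.XmlAttribute = new XmlAttributeAttribute();
            overrides.Add(typeof(Customer), prop.Name, attrs);
        }

        return new XmlSerializer(typeof(Order), overrides);
    }

    public static void Main()
    {
        Order order = new Order
        {
            OrderId = 1,
            Buyer = new Customer { FirstName = "Orlando", LastName = "Gee" }
        };

        XmlSerializer serializer = CreateSerializer();
        StringWriter writer = new StringWriter();
        serializer.Serialize(writer, order);
        string xml = writer.ToString();
        Console.WriteLine(xml);

        // Deserialize with the same serializer so the overrides apply.
        Order copy = (Order)serializer.Deserialize(new StringReader(xml));
        Console.WriteLine(copy.Buyer.LastName);
    }
}
```

With these overrides, OrderId should still appear as a child element of Order, while the Buyer element carries FirstName and LastName as attributes.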
