Chapter 6. XML Processing

One of the most remarkable aspects of the Microsoft .NET Framework is its deep integration with XML. In many .NET applications, you won't even be aware you're using XML technologies—they'll just be used behind the scenes when you serialize a Microsoft ADO.NET DataSet, call a web service, or read application settings from a Web.config configuration file. In other cases, you'll want to work directly with the System.Xml namespaces to manipulate Extensible Markup Language (XML) data. Common XML tasks don't just include parsing an XML file, but also include validating it against a schema, applying an Extensible Stylesheet Language (XSL) transform to create a new document or Hypertext Markup Language (HTML) page, and searching intelligently with XPath.

In .NET 3.5, Microsoft added LINQ to XML, which integrates XML handling into the LINQ model for querying data sources. You can use the same keywords and syntax to query XML as you would a collection or a database.

The recipes in this chapter describe how to do the following:

Show the Structure of an XML Document in a TreeView

Problem

You need to display the structure and content of an XML document in a Windows-based application.

Solution

Load the XML document using the System.Xml.XmlDocument class. Create a reentrant method that converts a single XmlNode into a System.Windows.Forms.TreeNode, and call it recursively to walk through the entire document.

How It Works

The .NET Framework provides several different ways to process XML documents. The one you use depends in part upon your programming task. One of the most fully featured classes is XmlDocument, which provides an in-memory representation of an XML document that conforms to the W3C Document Object Model (DOM). The XmlDocument class allows you to browse through the nodes in any direction, insert and remove nodes, and change the structure on the fly. For details of the DOM specification, go to www.w3c.org.

Note

The XmlDocument class is not scalable for very large XML documents, because it holds the entire XML content in memory at once. If you want a more memory-efficient alternative, and you can afford to read and process the XML piece by piece, consider the XmlReader and XmlWriter classes described in recipe 6-7.

To use the XmlDocument class, simply create a new instance of the class and call the Load method with a file name, a Stream, a TextReader, or an XmlReader object. It is also possible to read the XML from a simple string with the LoadXML method. You can even supply a string with a URL that points to an XML document on the Web using the Load method. The XmlDocument instance will be populated with the tree of elements, or nodes, from the source document. The entry point for accessing these nodes is the root element, which is provided through the XmlDocument.DocumentElement property. DocumentElement is an XmlElement object that can contain one or more nested XmlNode objects, which in turn can contain more XmlNode objects, and so on. An XmlNode is the basic ingredient of an XML file. Common XML nodes include elements, attributes, comments, and contained text.

When dealing with an XmlNode or a class that derives from it (such as XmlElement or XmlAttribute), you can use the following basic properties:

  • ChildNodes is an XmlNodeList collection that contains the first level of nested nodes.

  • Name is the name of the node.

  • NodeType returns a member of the System.Xml.XmlNodeType enumeration that indicates the type of the node (element, attribute, text, and so on).

  • Value is the content of the node, if it's a text or CDATA node.

  • Attributes provides a collection of node objects representing the attributes applied to the element.

  • InnerText retrieves a string with the concatenated value of the node and all nested nodes.

  • InnerXml retrieves a string with the concatenated XML markup for all nested nodes.

  • OuterXml retrieves a string with the concatenated XML markup for the current node and all nested nodes.

The Code

The following example walks through every element of an XmlDocument using the ChildNodes property and a recursive method. Each node is displayed in a TreeView control, with descriptive text that either identifies it or shows its content.

using System;
using System.Windows.Forms;
using System.Xml;
using System.IO;

namespace Apress.VisualCSharpRecipes.Chapter06
{
    public partial class Recipe06_01 : System.Windows.Forms.Form
    {
        public Recipe06_01()
        {
            InitializeComponent();
        }

        // Default the file name to the sample document.
        private void Recipe06_01_Load(object sender, EventArgs e)
        {
            txtXmlFile.Text = Path.Combine(Application.StartupPath,
               @"....ProductCatalog.xml");
        }

        private void cmdLoad_Click(object sender, System.EventArgs e)
        {
            // Clear the tree.
            treeXml.Nodes.Clear();

            // Load the XML document.
            XmlDocument doc = new XmlDocument();
            try
            {
                doc.Load(txtXmlFile.Text);
            }
            catch (Exception err)
            {
                MessageBox.Show(err.Message);
                return;
            }
// Populate the TreeView.
            ConvertXmlNodeToTreeNode(doc, treeXml.Nodes);

            // Expand all nodes.
            treeXml.Nodes[0].ExpandAll();
        }

        private void ConvertXmlNodeToTreeNode(XmlNode xmlNode,
          TreeNodeCollection treeNodes)
        {
            // Add a TreeNode node that represents this XmlNode.
            TreeNode newTreeNode = treeNodes.Add(xmlNode.Name);

            // Customize the TreeNode text based on the XmlNode
            // type and content.
            switch (xmlNode.NodeType)
            {
                case XmlNodeType.ProcessingInstruction:
                case XmlNodeType.XmlDeclaration:
                    newTreeNode.Text = "<?" + xmlNode.Name + " " +
                      xmlNode.Value + "?>";
                    break;
                case XmlNodeType.Element:
                    newTreeNode.Text = "<" + xmlNode.Name + ">";
                    break;
                case XmlNodeType.Attribute:
                    newTreeNode.Text = "ATTRIBUTE: " + xmlNode.Name;
                    break;
                case XmlNodeType.Text:
                case XmlNodeType.CDATA:
                    newTreeNode.Text = xmlNode.Value;
                    break;
                case XmlNodeType.Comment:
                    newTreeNode.Text = "<!--" + xmlNode.Value + "-->";
                    break;
            }

            // Call this routine recursively for each attribute.
            // (XmlAttribute is a subclass of XmlNode.)
            if (xmlNode.Attributes != null)
            {
                foreach (XmlAttribute attribute in xmlNode.Attributes)
                {
                    ConvertXmlNodeToTreeNode(attribute, newTreeNode.Nodes);
                }
            }
// Call this routine recursively for each child node.
            // Typically, this child node represents a nested element
            // or element content.
            foreach (XmlNode childNode in xmlNode.ChildNodes)
            {
                ConvertXmlNodeToTreeNode(childNode, newTreeNode.Nodes);
            }
        }
    }
}

Usage

As an example, consider the following simple XML file (which is included with the sample code as the ProductCatalog.xml file):

<?xml version="1.0" ?>
<!--This document is a sample catalog for demonstration purposes-->
<productCatalog>
    <catalogName>Freeman and Freeman Unique Catalog 2010</catalogName>
    <expiryDate>2012-01-01</expiryDate>

    <products>
        <product id="1001">
            <productName>Gourmet Coffee</productName>
            <description>Beans from rare Chillean plantations.</description>
            <productPrice>0.99</productPrice>
            <inStock>true</inStock>
        </product>
        <product id="1002">
            <productName>Blue China Tea Pot</productName>
            <description>A trendy update for tea drinkers.</description>
            <productPrice>102.99</productPrice>
            <inStock>true</inStock>
        </product>
    </products>
</productCatalog>

Figure 6-1 shows how this file will be rendered in the Recipe06_01 form.

The displayed structure of an XML document

Figure 6.1. The displayed structure of an XML document

Insert Nodes in an XML Document

Problem

You need to modify an XML document by inserting new data, or you want to create an entirely new XML document in memory.

Solution

Create the node using the appropriate XmlDocument method (such as CreateElement, CreateAttribute, CreateNode, and so on). Then insert it using the appropriate XmlNode method (such as InsertAfter, InsertBefore, or AppendChild).

How It Works

Inserting a node into the XmlDocument class is a two-step process. You must first create the node, and then you insert it at the appropriate location. You can then call XmlDocument.Save to persist changes.

To create a node, you use one of the XmlDocument methods starting with the word Create, depending on the type of node. This ensures that the node will have the same namespace as the rest of the document. (Alternatively, you can supply a namespace as an additional string argument.) Next, you must find a suitable related node and use one of its insertion methods to add the new node to the tree.

The Code

The following example demonstrates this technique by programmatically creating a new XML document:

using System;
using System.Xml;

namespace Apress.VisualCSharpRecipes.Chapter06
{
    public class Recipe06_02
    {
        private static void Main()
        {
            // Create a new, empty document.
            XmlDocument doc = new XmlDocument();
            XmlNode docNode = doc.CreateXmlDeclaration("1.0", "UTF-8", null);
            doc.AppendChild(docNode);

            // Create and insert a new element.
            XmlNode productsNode = doc.CreateElement("products");
            doc.AppendChild(productsNode);

            // Create a nested element (with an attribute).
            XmlNode productNode = doc.CreateElement("product");
            XmlAttribute productAttribute = doc.CreateAttribute("id");
            productAttribute.Value = "1001";
            productNode.Attributes.Append(productAttribute);
            productsNode.AppendChild(productNode);

            // Create and add the subelements for this product node
            // (with contained text data).
            XmlNode nameNode = doc.CreateElement("productName");
            nameNode.AppendChild(doc.CreateTextNode("Gourmet Coffee"));
            productNode.AppendChild(nameNode);
            XmlNode priceNode = doc.CreateElement("productPrice");
            priceNode.AppendChild(doc.CreateTextNode("0.99"));
            productNode.AppendChild(priceNode);

            // Create and add another product node.
            productNode = doc.CreateElement("product");
            productAttribute = doc.CreateAttribute("id");
            productAttribute.Value = "1002";
            productNode.Attributes.Append(productAttribute);
            productsNode.AppendChild(productNode);
nameNode = doc.CreateElement("productName");
            nameNode.AppendChild(doc.CreateTextNode("Blue China Tea Pot"));
            productNode.AppendChild(nameNode);
            priceNode = doc.CreateElement("productPrice");
            priceNode.AppendChild(doc.CreateTextNode("102.99"));
            productNode.AppendChild(priceNode);

            // Save the document (to the console window rather than a file).
            doc.Save(Console.Out);
            Console.ReadLine();
        }
    }
}

When you run this code, the generated XML document looks like this:

<?xml version="1.0"?>

<products>

   <product id="1001">

     <productName>Gourmet Coffee</productName>

     <productPrice>0.99</productPrice>

   </product>

   <product id="1002">

      <productName>Blue China Tea Pot</productName>

      <productPrice>102.99</productPrice>

</product>

</products>

Quickly Append Nodes in an XML Document

Problem

You need to add nodes to an XML document without requiring lengthy, verbose code.

Solution

Create a helper function that accepts a tag name and content, and can generate the entire element at once. Alternatively, use the XmlDocument.CloneNode method to copy branches of an XmlDocument.

How It Works

Inserting a single element into an XmlDocument requires several lines of code. You can shorten this code in several ways. One approach is to create a dedicated helper class with higher-level methods for adding elements and attributes. For example, you could create an AddElement method that generates a new element, inserts it, and adds any contained text—the three operations needed to insert most elements.

The Code

Here's an example of one such helper class:

using System;
using System.Xml;

namespace Apress.VisualCSharpRecipes.Chapter06
{
    public class XmlHelper
    {
        public static XmlNode AddElement(string tagName,
          string textContent, XmlNode parent)
        {
            XmlNode node = parent.OwnerDocument.CreateElement(tagName);
            parent.AppendChild(node);

            if (textContent != null)
            {
                XmlNode content;
                content = parent.OwnerDocument.CreateTextNode(textContent);
                node.AppendChild(content);
            }
            return node;
        }

        public static XmlNode AddAttribute(string attributeName,
          string textContent, XmlNode parent)
        {
            XmlAttribute attribute;
            attribute = parent.OwnerDocument.CreateAttribute(attributeName);
            attribute.Value = textContent;
            parent.Attributes.Append(attribute);
return attribute;
        }
    }
}

You can now condense the XML-generating code from recipe 6-2 with the simpler syntax shown here:

public class Recipe06_03
{
    private static void Main()
    {
        // Create the basic document.
        XmlDocument doc = new XmlDocument();
        XmlNode docNode = doc.CreateXmlDeclaration("1.0", "UTF-8", null);
        doc.AppendChild(docNode);
        XmlNode products = doc.CreateElement("products");
        doc.AppendChild(products);

        // Add two products.
        XmlNode product = XmlHelper.AddElement("product", null, products);
        XmlHelper.AddAttribute("id", "1001", product);
        XmlHelper.AddElement("productName", "Gourmet Coffee", product);
        XmlHelper.AddElement("productPrice", "0.99", product);

        product = XmlHelper.AddElement("product", null, products);
        XmlHelper.AddAttribute("id", "1002", product);
        XmlHelper.AddElement("productName", "Blue China Tea Pot", product);
        XmlHelper.AddElement("productPrice", "102.99", product);

        // Save the document (to the console window rather than a file).
        doc.Save(Console.Out);
        Console.ReadLine();
    }
}

Alternatively, you might want to take the helper methods such as AddAttribute and AddElement and make them instance methods in a custom class you derive from XmlDocument.

Another approach to simplifying writing XML is to duplicate nodes using the XmlNode.CloneNode method. CloneNode accepts a Boolean deep parameter. If you supply true, CloneNode will duplicate the entire branch, with all nested nodes.

Here is an example that creates a new product node by copying the first node:

// (Add first product node.)

// Create a new element based on an existing product.
product = product.CloneNode(true);
// Modify the node data.
product.Attributes[0].Value = "1002";
product.ChildNodes[0].ChildNodes[0].Value = "Blue China Tea Pot";
product.ChildNodes[1].ChildNodes[0].Value = "102.99";

// Add the new element.
products.AppendChild(product);

Notice that in this case, certain assumptions are being made about the existing nodes (for example, that the first child in the item node is always the name, and the second child is always the price). If this assumption is not guaranteed to be true, you might need to examine the node name programmatically.

Find Specific Elements by Name

Problem

You need to retrieve a specific node from an XmlDocument, and you know its name but not its position.

Solution

Use the XmlDocument.GetElementsByTagName method, which searches an entire document and returns a System.Xml.XmlNodeList containing any matches.

How It Works

The XmlDocument class provides a convenient GetElementsByTagName method that searches an entire document for nodes that have the indicated element name. It returns the results as a collection of XmlNode objects.

The Code

The following code demonstrates how you could use GetElementsByTagName to calculate the total price of items in a catalog by retrieving all elements with the name productPrice:

using System;
using System.Xml;

namespace Apress.VisualCSharpRecipes.Chapter06
{
    public class Recipe06_04
    {
        private static void Main()
        {
// Load the document.
            XmlDocument doc = new XmlDocument();
            doc.Load(@"....ProductCatalog.xml");

            // Retrieve all prices.
            XmlNodeList prices = doc.GetElementsByTagName("productPrice");

            decimal totalPrice = 0;
            foreach (XmlNode price in prices)
            {
                // Get the inner text of each matching element.
                totalPrice += Decimal.Parse(price.ChildNodes[0].Value);
            }

            Console.WriteLine("Total catalog value: " + totalPrice.ToString());
            Console.ReadLine();
        }
    }
}

Notes

You can also search portions of an XML document by using the XmlElement.GetElementsByTagName method. It searches all the descendant nodes looking for matches. To use this method, first retrieve an XmlNode that corresponds to an element. Then cast this object to an XmlElement. The following example demonstrates how to find the price node under the first product element:

// Retrieve a reference to the first product.
XmlNode product = doc.GetElementsByTagName("products")[0];

// Find the price under this product.
XmlNode productPrice
    = ((XmlElement)product).GetElementsByTagName("productPrice")[0];
Console.WriteLine("Price is " + productPrice.InnerText);

If your elements include an attribute of type ID, you can also use a method called GetElementById to retrieve an element that has a matching ID value.

Get XML Nodes in a Specific XML Namespace

Problem

You need to retrieve nodes from a specific namespace using an XmlDocument.

Solution

Use the overload of the XmlDocument.GetElementsByTagName method that requires a namespace name as a string argument. Additionally, supply an asterisk (*) for the element name if you want to match all tags.

How It Works

Many XML documents contain nodes from more than one namespace. For example, an XML document that represents a scientific article might use a separate type of markup for denoting math equations and vector diagrams, or an XML document with information about a purchase order might aggregate client and order information with a shipping record. Similarly, an XML document that represents a business-to-business transaction might include portions from both companies, written in separate markup languages.

A common task in XML programming is to retrieve the elements found in a specific namespace. You can perform this task with the overloaded version of the XmlDocument.GetElementsByTagName method that requires a namespace name. You can use this method to find tags by name or to find all the tags in the specified namespace if you supply an asterisk for the tag name parameter.

The Code

As an example, consider the following compound XML document, which includes order and client information, in two different namespaces (http://mycompany/OrderML and http://mycompany/ClientML):

<?xml version="1.0" ?>
<ord:order xmlns:ord="http://mycompany/OrderML"
 xmlns:cli="http://mycompany/ClientML">

  <cli:client>
    <cli:firstName>Sally</cli:firstName>
    <cli:lastName>Sergeyeva</cli:lastName>
  </cli:client>

  <ord:orderItem itemNumber="3211"/>
  <ord:orderItem itemNumber="1155"/>

</ord:order>

Here is a simple console application that selects all the tags in the http://mycompany/OrderML namespace:

using System;
using System.Xml;

namespace Apress.VisualCSharpRecipes.Chapter06
{
    public class Recipe06_05
    {
        private static void Main()
{
            // Load the document.
            XmlDocument doc = new XmlDocument();
            doc.Load(@"....Order.xml");

            // Retrieve all order tags.
            XmlNodeList matches = doc.GetElementsByTagName("*",
              "http://mycompany/OrderML");

            // Display all the information.
            Console.WriteLine("Element 	Attributes");
            Console.WriteLine("******* 	**********");

            foreach (XmlNode node in matches)
            {
                Console.Write(node.Name + "	");
                foreach (XmlAttribute attribute in node.Attributes)
                {
                    Console.Write(attribute.Value + "  ");
                }
                Console.WriteLine();
            }
            Console.ReadLine();
        }
    }
}

The output of this program is as follows:

Element         Attributes

*******         **********

ord:order       http://mycompany/OrderML  http://mycompany/ClientML

ord:orderItem   3211

ord:orderItem   1155

Find Elements with an XPath Search

Problem

You need to search an XML document for nodes using advanced search criteria. For example, you might want to search a particular branch of an XML document for nodes that have certain attributes or contain a specific number of nested child nodes.

Solution

Execute an XPath expression using the SelectNodes or SelectSingleNode method of the XmlDocument class.

How It Works

The XmlNode class defines two methods that perform XPath searches: SelectNodes and SelectSingleNode. These methods operate on all contained child nodes. Because the XmlDocument inherits from XmlNode, you can call XmlDocument.SelectNodes to search an entire document.

The Code

For example, consider the following XML document, which represents an order for two items. This document includes text and numeric data, nested elements, and attributes, and so is a good way to test simple XPath expressions.

<?xml version="1.0"?>
<Order id="2004-01-30.195496">
  <Client id="ROS-930252034">
    <Name>Remarkable Office Supplies</Name>
  </Client>

  <Items>
    <Item id="1001">
      <Name>Electronic Protractor</Name>
      <Price>42.99</Price>
    </Item>
    <Item id="1002">
      <Name>Invisible Ink</Name>
      <Price>200.25</Price>
    </Item>
  </Items>
</Order>

Basic XPath syntax uses a pathlike notation. For example, the path /Order/Items/Item indicates an <Item> element that is nested inside an <Items> element, which in turn is nested in a root <Order> element. This is an absolute path. The following example uses an XPath absolute path to find the name of every item in an order:

using System;
using System.Xml;

namespace Apress.VisualCSharpRecipes.Chapter06
{
    public class Recipe06_06
    {
        private static void Main()
        {
// Load the document.
            XmlDocument doc = new XmlDocument();
            doc.Load(@"....orders.xml");

            // Retrieve the name of every item.
            // This could not be accomplished as easily with the
            // GetElementsByTagName method, because Name elements are
            // used in Item elements and Client elements, and so
            // both types would be returned.
            XmlNodeList nodes = doc.SelectNodes("/Order/Items/Item/Name");

            foreach (XmlNode node in nodes)
            {
                Console.WriteLine(node.InnerText);
            }
            Console.ReadLine();
        }
    }
}

The output of this program is as follows:

Electronic Protractor

Invisible Ink

Notes

XPath provides a rich and powerful search syntax, and it is impossible to explain all the variations you can use in a short recipe. However, Table 6-1 outlines some of the key ingredients in more advanced XPath expressions and includes examples that show how they would work with the order document. For a more detailed reference, refer to the W3C XPath recommendation, at www.w3.org/TR/xpath.

Table 6.1. XPath Expression Syntax

Expression

Description

Example

/

Starts an absolute path that selects from the root node.

/Order/Items/Item selects all Item elements that are children of an Items element, which is itself a child of the root Order element.

//

Starts a relative path that selects nodes anywhere.

//Item/Name selects all the Name elements that are children of an Item element, regardless of where they appear in the document.

@

Selects an attribute of a node.

/Order/@id selects the attribute named id from the root Order element.

*

Selects any element in the path.

/Order/* selects both Items and Client nodes because both are contained by a root Order element.

|

Combines multiple paths.

/Order/Items/Item/Name|Order/Client/Name selects the Name nodes used to describe a Client and the Name nodes used to describe an Item.

.

Indicates the current (default) node.

If the current node is an Order, the expression ./Items refers to the related items for that order.

..

Indicates the parent node.

//Name/.. selects any element that is parent to a Name, which includes the Client and Item elements.

[ ]

Defines selection criteria that can test a contained node or an attribute value.

/Order[@id="2004-01-30.195496"] selects the Order elements with the indicated attribute value. /Order/Items/Item[Price > 50] selects products higher than $50 in price. /Order/Items/Item[Price > 50 and Name="Laser Printer"] selects products that match two criteria.

starts-with

Retrieves elements based on what text a contained element starts with.

/Order/Items/Item[starts-with(Name, "C")] finds all Item elements that have a Name element that starts with the letter C.

position

Retrieves elements based on position.

/Order/Items/Item[position ()=2] selects the second Item element.

count

Counts elements. You specify the name of the child element to count or an asterisk (*) for all children.

/Order/Items/Item[count(Price) = 1] retrieves Item elements that have exactly one nested Price element.

Note

XPath expressions and all element and attribute names you use inside them are always case-sensitive, because XML itself is case-sensitive.

Read and Write XML Without Loading an Entire Document into Memory

Problem

You need to read XML from a stream or write it to a stream. However, you want to process the information one node at a time, rather than loading it all into memory with an XmlDocument.

Solution

To write XML, create an XmlWriter that wraps a stream and use Write methods (such as WriteStartElement and WriteEndElement). To read XML, create an XmlReader that wraps a stream, and call Read to move from node to node.

How It Works

The XmlWriter and XmlReader classes read or write XML directly from a stream one node at a time. These classes do not provide the same features for navigating and manipulating your XML as XmlDocument, but they do provide higher performance and a smaller memory footprint, particularly if you need to deal with large XML documents.

Both XmlWriter and XmlReader are abstract classes, which means you cannot create an instance of them directly. Instead, you should call the Create method of XmlWriter or XmlReader and supply a file or stream. The Create method will return the right derived class based on the options you specify. This allows for a more flexible model. Because your code uses the base classes, it can work seamlessly with any derived class. For example, you could switch to a validating reader (as shown in the next recipe) without needing to modify your code.

To write XML to any stream, you can use the streamlined XmlWriter. It provides Write methods that write one node at a time. These include the following:

  • WriteStartDocument, which writes the document prologue, and WriteEndDocument, which closes any open elements at the end of the document.

  • WriteStartElement, which writes an opening tag for the element you specify. You can then add more elements nested inside this element, or you can call WriteEndElement to write the closing tag.

  • WriteElementString, which writes an entire element, with an opening tag, a closing tag, and text content.

  • WriteAttributeString, which writes an entire attribute for the nearest open element, with a name and value.

Using these methods usually requires less code than creating an XmlDocument by hand, as demonstrated in recipes 6-2 and 6-3.

To read the XML, you use the Read method of the XmlReader. This method advances the reader to the next node and returns true. If no more nodes can be found, it returns false. You can retrieve information about the current node through XmlReader properties, including its Name, Value, and NodeType.

To find out whether an element has attributes, you must explicitly test the HasAttributes property and then use the GetAttribute method to retrieve the attributes by name or index number. The XmlTextReader class can access only one node at a time, and it cannot move backward or jump to an arbitrary node, which gives much less flexibility than the XmlDocument class.

The Code

The following console application writes and reads a simple XML document using the XmlWriter and XmlReader classes. This is the same XML document created in recipes 6-2 and 6-3 using the XmlDocument class.

using System;
using System.Xml;
using System.IO;
using System.Text;

namespace Apress.VisualCSharpRecipes.Chapter06
{
    public class Recipe06_07
    {
        private static void Main()
        {
            // Create the file and writer.
            FileStream fs = new FileStream("products.xml", FileMode.Create);

            // If you want to configure additional details (like indenting,
            // encoding, and new line handling), use the overload of the Create
            // method that accepts an XmlWriterSettings object instead.
            XmlWriter w = XmlWriter.Create(fs);

            // Start the document.
            w.WriteStartDocument();
            w.WriteStartElement("products");

            // Write a product.
            w.WriteStartElement("product");
            w.WriteAttributeString("id", "1001");
            w.WriteElementString("productName", "Gourmet Coffee");
            w.WriteElementString("productPrice", "0.99");
            w.WriteEndElement();

            // Write another product.
            w.WriteStartElement("product");
            w.WriteAttributeString("id", "1002");
            w.WriteElementString("productName", "Blue China Tea Pot");
            w.WriteElementString("productPrice", "102.99");
            w.WriteEndElement();
// End the document.
            w.WriteEndElement();
            w.WriteEndDocument();
            w.Flush();
            fs.Close();

            Console.WriteLine("Document created. " +
             "Press Enter to read the document.");
            Console.ReadLine();

            fs = new FileStream("products.xml", FileMode.Open);

            // If you want to configure additional details (like comments,
            // whitespace handling, or validation), use the overload of the Create
            // method that accepts an XmlReaderSettings object instead.
            XmlReader r = XmlReader.Create(fs);

            // Read all nodes.
            while (r.Read())
            {
                if (r.NodeType == XmlNodeType.Element)
                {
                    Console.WriteLine();
                    Console.WriteLine("<" + r.Name + ">");

                    if (r.HasAttributes)
                    {
                        for (int i = 0; i < r.AttributeCount; i++)
                        {
                            Console.WriteLine("	ATTRIBUTE: " +
                              r.GetAttribute(i));
                        }
                    }
                }
                else if (r.NodeType == XmlNodeType.Text)
                {
                    Console.WriteLine("	VALUE: " + r.Value);
                }
            }
            Console.ReadLine();
        }
    }
}

Often, when using the XmlReader, you are searching for specific nodes rather than processing every element, as in this example. The approach used in this example does not work as well in this situation. It forces you to read element tags, text content, and CDATA sections separately, which means you need to explicitly keep track of where you are in the document. A better approach is to read the entire node and text content at once (for simple text-only nodes) by using the ReadElementString method. You can also use methods such as ReadToDescendant, ReadToFollowing, and ReadToNextSibling, all of which allow you to skip some nodes.

For example, you can use ReadToFollowing("Price"); to skip straight to the next Price element, without worrying about whitespace, comments, or other elements before it. (If a Price element cannot be found, the XmlReader moves to the end of the document, and the ReadToFollowing method returns false.)

Validate an XML Document Against a Schema

Problem

You need to validate the content of an XML document by ensuring that it conforms to an XML schema.

Solution

When you call XmlReader.Create, supply an XmlReaderSettings object that indicates you want to perform validation. Then move through the document one node at a time by calling XmlReader.Read, catching any validation exceptions. To find all the errors in a document without catching exceptions, handle the ValidationEventHandler event on the XmlReaderSettings object given as parameter to XmlReader.

How It Works

An XML schema defines the rules that a given type of XML document must follow. The schema includes rules that define the following:

  • The elements and attributes that can appear in a document

  • The data types for elements and attributes

  • The structure of a document, including what elements are children of other elements

  • The order and number of child elements that appear in a document

  • Whether elements are empty, can include text, or require fixed values

XML schema documents are beyond the scope of this chapter, but you can learn much from a simple example. This recipe uses the product catalog first presented in recipe 6-1.

At its most basic level, XML Schema Definition (XSD) defines the elements that can occur in an XML document. XSD documents are themselves written in XML, and you use a separate predefined element (named <element>) in the XSD document to indicate each element that is required in the target document. The type attribute indicates the data type. Here is an example for a product name:

<xsd:element name="productName" type="xsd:string" />

And here is an example for the product price:

<xsd:element name="productPrice" type="xsd:decimal" />

The basic schema data types are defined at www.w3.org/TR/xmlschema-2. They map closely to .NET data types and include string, int, long, decimal, float, dateTime, boolean, and base64Binary—to name a few of the most frequently used types.

Both productName and productPrice are simple types because they contain only character data. Elements that contain nested elements are called complex types. You can nest them together using a <sequence> tag, if order is important, or an <all> tag if it is not. Here is how you might model the <product> element in the product catalog. Notice that attributes are always declared after elements, and they are not grouped with a <sequence> or <all> tag because order is never important:

<xsd:complexType name="product">
  <xsd:sequence>
    <xsd:element name="productName" type="xsd:string"/>
    <xsd:element name="productPrice" type="xsd:decimal"/>
    <xsd:element name="inStock" type="xsd:boolean"/>
  </xsd:sequence>
  <xsd:attribute name="id" type="xsd:integer"/>
</xsd:complexType>

By default, a listed element can occur exactly one time in a document. You can configure this behavior by specifying the maxOccurs and minOccurs attributes. Here is an example that allows an unlimited number of products in the catalog:

<xsd:element name="product" type="product" maxOccurs="unbounded" />

Here is the complete schema for the product catalog XML:

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

   <!-- Define the complex type product. -->
   <xsd:complexType name="product">
      <xsd:sequence>
         <xsd:element name="productName" type="xsd:string"/>
         <xsd:element name="description" type="xsd:string"/>
         <xsd:element name="productPrice" type="xsd:decimal"/>
         <xsd:element name="inStock" type="xsd:boolean"/>
      </xsd:sequence>
      <xsd:attribute name="id" type="xsd:integer"/>
   </xsd:complexType>

   <!-- This is the structure the document must match.
        It begins with a productCatalog element that nests other elements. -->
   <xsd:element name="productCatalog">
      <xsd:complexType>
         <xsd:sequence>
            <xsd:element name="catalogName" type="xsd:string"/>
            <xsd:element name="expiryDate" type="xsd:date"/>

            <xsd:element name="products">
               <xsd:complexType>
                  <xsd:sequence>
                     <xsd:element name="product" type="product"
                      maxOccurs="unbounded" />
                  </xsd:sequence>
               </xsd:complexType>
            </xsd:element>
</xsd:sequence>
      </xsd:complexType>
   </xsd:element>

</xsd:schema>

The XmlReader class can enforce these schema rules, providing you explicitly request a validating reader when you use the XmlReader.Create method. (Even if you do not use a validating reader, an exception will be thrown if the reader discovers XML that is not well formed, such as an illegal character, improperly nested tags, and so on.)

Once you have created your validating reader, the validation occurs automatically as you read through the document. As soon as an error is found, the XmlReader raises a ValidationEventHandler event with information about the error on the XmlReaderSettings object given at creation time. If you want, you can handle this event and continue processing the document to find more errors. If you do not handle this event, an XmlException will be raised when the first error is encountered, and processing will be aborted.

The Code

The next example shows a utility class that displays all errors in an XML document when the ValidateXml method is called. Errors are displayed in a console window, and a final Boolean variable is returned to indicate the success or failure of the entire validation operation.

using System;
using System.Xml;
using System.Xml.Schema;

namespace Apress.VisualCSharpRecipes.Chapter06
{
    public class ConsoleValidator
    {
        // Set to true if at least one error exists.
        private bool failed;

        public bool Failed
        {
            get {return failed;}
        }

        public bool ValidateXml(string xmlFilename, string schemaFilename)
        {
            // Set the type of validation.
            XmlReaderSettings settings = new XmlReaderSettings();
            settings.ValidationType = ValidationType.Schema;

            // Load the schema file.
            XmlSchemaSet schemas = new XmlSchemaSet();
            settings.Schemas = schemas;
            // When loading the schema, specify the namespace it validates
            // and the location of the file. Use null to use
// the targetNamespace value from the schema.
            schemas.Add(null, schemaFilename);

            // Specify an event handler for validation errors.
            settings.ValidationEventHandler += ValidationEventHandler;

            // Create the validating reader.
            XmlReader validator = XmlReader.Create(xmlFilename, settings);

            failed = false;
            try
            {
                // Read all XML data.
                while (validator.Read()) {}
            }
            catch (XmlException err)
            {
                // This happens if the XML document includes illegal characters
                // or tags that aren't properly nested or closed.
                Console.WriteLine("A critical XML error has occurred.");
                Console.WriteLine(err.Message);
                failed = true;
            }
            finally
            {
                validator.Close();
            }

            return !failed;
        }

        private void ValidationEventHandler(object sender,
          ValidationEventArgs args)
         {
            failed = true;

            // Display the validation error.
            Console.WriteLine("Validation error: " + args.Message);
            Console.WriteLine();
        }
    }
}

Here is how you would use the class to validate the product catalog:

public class Recipe06_08
{
    private static void Main()
    {
        ConsoleValidator consoleValidator = new ConsoleValidator();
        Console.WriteLine("Validating ProductCatalog.xml.");
bool success = consoleValidator.ValidateXml(@"....ProductCatalog.xml",
           @"....ProductCatalog.xsd");
        if (!success)
            Console.WriteLine("Validation failed.");
        else
            Console.WriteLine("Validation succeeded.");

        Console.ReadLine();
    }
}

If the document is valid, no messages will appear, and the success variable will be set to true. But consider what happens if you use a document that breaks schema rules, such as the ProductCatalog_Invalid.xml file shown here:

<?xml version="1.0" ?>
<productCatalog>
    <catalogName>Acme Fall 2003 Catalog</catalogName>
    <expiryDate>Jan 1, 2004</expiryDate>

    <products>
        <product id="1001">
            <productName>Magic Ring</productName>
            <productPrice>$342.10</productPrice>
            <inStock>true</inStock>
        </product>
        <product id="1002">
            <productName>Flying Carpet</productName>
            <productPrice>982.99</productPrice>
            <inStock>Yes</inStock>
        </product>
    </products>
</productCatalog>

If you attempt to validate this document, the success variable will be set to false, and the output will indicate each error:

Validating ProductCatalog_Invalid.xml.

Validation error: The 'expiryDate' element has an invalid value according to

its data type. [path information truncated]

Validation error: The 'productPrice' element has an invalid value according to
its data type. [path information truncated]



Validation error: The 'inStock' element has an invalid value according to its

 data type. [path information truncated]



Validation failed.

Finally, if you want to validate an XML document and load it into an in-memory XmlDocument, you need to take a slightly different approach. The XmlDocument provides its own Schemas property, along with a Validate method that checks the entire document in one step. When you call Validate, you supply a delegate that points to your validation event handler.

Here is how it works:

XmlDocument doc = new XmlDocument();
doc.Load(@"....Product_Catalog.xml");

// Specify the schema information.
XmlSchemaSet schemas = new XmlSchemaSet();
schemas.Add(null, @"....ProductCatalog.xsd");
doc.Schemas = schemas;

// Validate the document.
doc.Validate(new ValidationEventHandler(ValidationEventHandler));

Use XML Serialization with Custom Objects

Problem

You need to use XML as a serialization format. However, you don't want to process the XML directly in your code—instead, you want to interact with the data using custom objects.

Solution

Use the System.Xml.Serialization.XmlSerializer class to transfer data from your object to XML, and vice versa. You can also mark up your class code with attributes to customize its XML representation.

How It Works

The XmlSerializer class allows you to convert objects to XML data, and vice versa. This process is used natively by web services and provides a customizable serialization mechanism that does not require a single line of custom code. The XmlSerializer class is even intelligent enough to correctly create arrays when it finds nested elements.

The only requirements for using XmlSerializer are as follows:

  • The XmlSerializer serializes only properties and public variables.

  • The classes you want to serialize must include a default zero-argument constructor. The XmlSerializer uses this constructor when creating the new object during deserialization.

  • All class properties must be readable and writable. This is because XmlSerializer uses the property get accessor to retrieve information and the property set accessor to restore the data after deserialization.

Note

You can also store your objects in an XML-based format using .NET serialization and System.Runtime.Serialization.Formatters.Soap.SoapFormatter. In this case, you simply need to make your class serializable—you do not need to provide a default constructor or ensure all properties are writable. However, this gives you no control over the format of the serialized XML.

To use XML serialization, you must first mark up your data objects with attributes that indicate the desired XML mapping. You can find these attributes in the System.Xml.Serialization namespace and include the following:

  • XmlRoot specifies the name of the root element of the XML file. By default, XmlSerializer will use the name of the class. You can apply this attribute to the class declaration.

  • XmlElement indicates the element name to use for a property or public variable. By default, XmlSerializer will use the name of the property or public variable.

  • XmlAttribute indicates that a property or public variable should be serialized as an attribute, not an element, and specifies the attribute name.

  • XmlEnum configures the text that should be used when serializing enumerated values. If you don't use XmlEnum, the name of the enumerated constant will be used.

  • XmlIgnore indicates that a property or public variable should not be serialized.

The Code

For example, consider the product catalog first shown in recipe 6-1. You can represent this XML document using ProductCatalog and Product objects. Here's the class code that you might use:

using System;
using System.Xml.Serialization;

namespace Apress.VisualCSharpRecipes.Chapter06
{
    [XmlRoot("productCatalog")]
    public class ProductCatalog
    {
        [XmlElement("catalogName")]
        public string CatalogName;

        // Use the date data type (and ignore the time portion in the
        // serialized XML).
        [XmlElement(ElementName="expiryDate", DataType="date")]
        public DateTime ExpiryDate;

        // Configure the name of the tag that holds all products
        // and the name of the product tag itself.
        [XmlArray("products")]
        [XmlArrayItem("product")]
        public Product[] Products;

        public ProductCatalog()
        {
            // Default constructor for deserialization.
        }

        public ProductCatalog(string catalogName, DateTime expiryDate)
        {
            this.CatalogName = catalogName;
            this.ExpiryDate = expiryDate;
        }
    }

    public class Product
    {
        [XmlElement("productName")]
        public string ProductName;

        [XmlElement("productPrice")]
        public decimal ProductPrice;

        [XmlElement("inStock")]
        public bool InStock;

        [XmlAttributeAttribute(AttributeName="id", DataType="integer")]
public string Id;

        public Product()
        {
            // Default constructor for serialization.
        }

        public Product(string productName, decimal productPrice)
        {
            this.ProductName = productName;
            this.ProductPrice = productPrice;
        }
    }
}

Notice that these classes use the XML serialization attributes to rename element names (using Pascal casing in the class member names and camel casing in the XML tag names), indicate data types that are not obvious, and specify how <product> elements will be nested in the <productCatalog>.

Using these custom classes and the XmlSerializer object, you can translate XML into objects, and vice versa. The following is the code you would need to create a new ProductCatalog object, serialize the results to an XML document, deserialize the document back to an object, and then display the XML document:

using System;
using System.Xml;
using System.Xml.Serialization;
using System.IO;

namespace Apress.VisualCSharpRecipes.Chapter06
{
    public class Recipe06_09
    {
        private static void Main()
        {
            // Create the product catalog.
            ProductCatalog catalog = new ProductCatalog("New Catalog",
              DateTime.Now.AddYears(1));
            Product[] products = new Product[2];
            products[0] = new Product("Product 1", 42.99m);
            products[1] = new Product("Product 2", 202.99m);
            catalog.Products = products;

            // Serialize the order to a file.
            XmlSerializer serializer = new XmlSerializer(typeof(ProductCatalog));
            FileStream fs = new FileStream("ProductCatalog.xml", FileMode.Create);
            serializer.Serialize(fs, catalog);
            fs.Close();

            catalog = null;

            // Deserialize the order from the file.
            fs = new FileStream("ProductCatalog.xml", FileMode.Open);
catalog = (ProductCatalog)serializer.Deserialize(fs);

            // Serialize the order to the console window.
            serializer.Serialize(Console.Out, catalog);
            Console.ReadLine();
        }
    }
}

Create a Schema for a .NET Class

Problem

You need to create an XML schema based on one or more C# classes. This will allow you to validate XML documents before deserializing them with the XmlSerializer.

Solution

Use the XML Schema Definition Tool (xsd.exe) command-line utility included with the .NET Framework. Specify the name of your assembly as a command-line argument, and add the /t:[TypeName] parameter to indicate the types you want to convert.

How It Works

Recipe 6-9 demonstrated how to use the XmlSerializer to serialize .NET objects to XML and deserialize XML into .NET objects. But if you want to use XML as a way to interact with other applications, business processes, or non–.NET Framework applications, you'll need an easy way to validate the XML before you attempt to deserialize it. You will also need to define an XML schema document that defines the structure and data types used in your XML format so that other applications can work with it. One quick solution is to generate an XML schema using the xsd.exe command-line utility.

The xsd.exe utility is included with the .NET Framework. If you have installed Microsoft Visual Studio .NET, you will find it in a directory like C:Program FilesMicrosoft Visual Studio .NETFrameworkSDKBin. The xsd.exe utility can generate schema documents from compiled assemblies. You simply need to supply the file name and indicate the class that represents the XML document with the / t:[TypeName] parameter.

Usage

For example, consider the ProductCatalog and Product classes shown in recipe 6-9. You could create the XML schema for a product catalog with the following command line:

xsd Recipe06-09.exe /t:ProductCatalog

You need to specify only the ProductCatalog class on the command line because this class represents the actual XML document. The generated schema in this example will represent a complete product catalog, with contained product items. It will be given the default file name schema0.xsd. You can now use the validation technique shown in recipe 6-8 to test whether the XML document can be successfully validated with the schema.

Generate a Class from a Schema

Problem

You need to create one or more C# classes based on an XML schema. You can then create an XML document in the appropriate format using these objects and the XmlSerializer.

Solution

Use the xsd.exe command-line utility included with the .NET Framework. Specify the name of your schema file as a command-line argument, and add the /c parameter to indicate you want to generate class code.

How It Works

Recipe 6-10 introduced the xsd.exe command-line utility, which you can use to generate schemas based on class definitions. The reverse operation—generating C# source code based on an XML schema document—is also possible. This is primarily useful if you want to write a certain format of XML document but you do not want to manually create the document by writing individual nodes with the XmlDocument class or the XmlWriter class. Instead, by using xsd.exe, you can generate a set of full .NET objects. You can then serialize these objects to the required XML representation using the XmlSerializer, as described in recipe 6-9.

To generate source code from a schema, you simply need to supply the file name of the schema document and add the /c parameter to indicate you want to generate the required classes.

Usage

For example, consider the schema shown in recipe 6-8. You can generate C# code for this schema with the following command line:

xsd ProductCatalog.xsd /c

This will generate one file (ProductCatalog.cs) with two classes: Product and ProductCalalog. These classes are similar to the ones created in recipe 6-9, except that the class member names match the XML document exactly. Optionally, you can add the /f parameter. If you do, the generated classes will be composed of public fields. If you do not, the generated classes will use public properties instead (which simply wrap private fields).

Perform an XSL Transform

Problem

You need to transform an XML document into another document using an XSLT stylesheet.

Solution

Use the System.Xml.Xsl.XslCompiledTransform class. Load the XSLT stylesheet using the XslCompiledTransform.Load method, and generate the output document by using the Transform method and supplying a source document.

How It Works

XSLT is an XML-based language designed to transform one XML document into another document. You can use XSLT to create a new XML document with the same data but arranged in a different structure, or to select a subset of the data in a document. You can also use it to create a different type of structured document. XSLT is commonly used in this manner to format an XML document into an HTML page.

The Code

XSLT is a rich language, and creating XSL transforms is beyond the scope of this book. However, you can learn how to create simple XSLT documents by looking at a basic example. This recipe transforms the orders.xml document shown in recipe 6-6 into an HTML document with a table, and then displays the results. To perform this transformation, you'll need the following XSLT stylesheet:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0" >

  <xsl:template match="Order">
    <html><body><p>
    Order <b><xsl:value-of select="Client/@id"/></b>
    for <xsl:value-of select="Client/Name"/></p>
    <table border="1">
    <td>ID</td><td>Name</td><td>Price</td>
    <xsl:apply-templates select="Items/Item"/>
    </table></body></html>
  </xsl:template>
<xsl:template match="Items/Item">
    <tr>
    <td><xsl:value-of select="@id"/></td>
    <td><xsl:value-of select="Name"/></td>
    <td><xsl:value-of select="Price"/></td>
    </tr>
  </xsl:template>

</xsl:stylesheet>

Essentially, every XSLT stylesheet consists of a set of templates. Each template matches some set of elements in the source document and then describes the contribution that the matched element will make to the resulting document. To match the template, the XSLT document uses XPath expressions, as described in recipe 6-6.

The orders.xslt stylesheet contains two template elements (as children of the root stylesheet element). The first template matches the root Order element. When the XSLT processor finds an Order element, it outputs the tags necessary to start an HTML table with appropriate column headings and inserts some data about the client using the value-of command, which outputs the text result of an XPath expression. In this case, the XPath expressions (Client/@id and Client/Name) match the id attribute and the Name element.

Next, the apply-templates command branches off and performs processing of any contained Item elements. This is required because there might be multiple Item elements. Each Item element is matched using the XPath expression Items/Item. The root Order node is not specified because Order is the current node. Finally, the initial template writes the tags necessary to end the HTML document.

If you execute this transform on the sample orders.xml file shown in recipe 6-6, you will end up with the following HTML document:

<html>
  <body>
    <p>
    Order <b>ROS-930252034</b>
    for Remarkable Office Supplies</p>
    <table border="1">
      <td>ID</td>
      <td>Name</td>
      <td>Price</td>
      <tr>
        <td>1001</td>
        <td>Electronic Protractor</td>
        <td>42.99</td>
      </tr>
      <tr>
        <td>1002</td>
        <td>Invisible Ink</td>
        <td>200.25</td>
      </tr>
    </table>
  </body>
</html>

To apply an XSLT stylesheet in .NET, you use the XslCompiledTransform class. The following code shows a Windows-based application that programmatically applies the transformation and then displays the transformed file in a window using the WebBrowser control:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using System.Xml.Xsl;
using System.Xml;

namespace Apress.VisualCSharpRecipes.Chapter06
{
    public partial class TransformXml : Form
    {
        public TransformXml()
        {
            InitializeComponent();
        }

        private void TransformXml_Load(object sender, EventArgs e)
        {
            XslCompiledTransform transform = new XslCompiledTransform();

            // Load the XSTL stylesheet.
            transform.Load(@"....orders.xslt");

            // Transform orders.xml into orders.html using orders.xslt.
            transform.Transform(@"....orders.xml", @"....orders.html");

            webBrowser1.Navigate(Application.StartupPath + @"....orders.html");
        }
    }
}

Figure 6-2 shows the application.

The stylesheet output for orders.xml

Figure 6.2. The stylesheet output for orders.xml

In this example, the code uses the overloaded version of the Transform method that saves the result document directly to disk, although you could receive it as a stream and process it inside your application instead. The following code shows an alternate approach that keeps the document content in memory at all times (with no external results file). The XslCompiledTransform writes the results to an XmlWriter that wraps a StringBuilder. The content is then copied from the StringBuilder into the WebBrowser through the handy WebBrowser.DocumentText property. The results are identical.

StringBuilder htmlContent = new StringBuilder();
XmlWriter results = XmlWriter.Create(htmlContent);
transform.Transform(@"....orders.xml", results);
webBrowser1.DocumentText = htmlContent.ToString();

Load XML with LINQ

Problem

You need to load an XML tree in order to perform LINQ to XML operations.

Solution

Use the static methods of the System.Xml.Linq.XElement class to handle existing XML.

How It Works

The System.Xml.Linq.XElement class represents an XML element. LINQ represents XML as a tree of XElements, such that one class is used for every element from the root nodes to the terminating child nodes. The static XElement.Load method will load and parse an XML document from a number of sources, returning an instance of XElement representing the root node. The Load method is overloaded to support a range of data sources, as shown in Table 6-2.

Table 6.2. Overloads of the XElement.Load Method

Method

Description

Load(Stream)

Loads the XML data from a stream

Load(String)

Loads the XML from a file, the name of which is obtained from the string

Load(TextReader)

Loads the XML from a System.IO.TextReader

Load(XMLReader)

Loads the XML from a System.Xml.XmlReader (see recipe 6-7)

You can control some of the load options by using the System.Xml.Linq.LoadOptions enumeration as an additional argument to the Load method—see the .NET Framework documentation for details.

The Code

The following example demonstrates using the four different data sources accepted by the XElement.Load method:

using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;
namespace Recipe06_13
{
    class Recipe06_13
    {
        static void Main(string[] args)
        {
            // Define the path to the sample file.
            string filename = @"....ProductCatalog.xml";

            // Load the XML using the file name.
            Console.WriteLine("Loading using file name");
            XElement root = XElement.Load(filename);
            // Write out the XML.
            Console.WriteLine(root);
            Console.WriteLine("Press enter to continue");
            Console.ReadLine();

            // Load via a stream to the file.
            Console.WriteLine("Loading using a stream");
            FileStream filestream = File.OpenRead(filename);
            root = XElement.Load(filestream);
            // Write out the XML.
            Console.WriteLine(root);
            Console.WriteLine("Press enter to continue");
            Console.ReadLine();

            // Load via a textreader.
            Console.WriteLine("Loading using a TextReader");
            TextReader reader = new StreamReader(filename);
            root = XElement.Load(reader);
            // Write out the XML.
            Console.WriteLine(root);
            Console.WriteLine("Press enter to continue");
            Console.ReadLine();

            // Load via an xmlreader.
            Console.WriteLine("Loading using an XmlReader");
            XmlReader xmlreader = new XmlTextReader(new StreamReader(filename));
            root = XElement.Load(xmlreader);
            // Write out the XML.
            Console.WriteLine(root);
            Console.WriteLine("Press enter to continue");
            Console.ReadLine();

        }
    }
}

Create a New XML Tree with LINQ

Problem

You need to create a new XML tree for use with LINQ.

Solution

Create a new instance of System.Xml.Linq.XElement for the root element. Child nodes and attributes can be added by passing instances of XElement or XAttribute as constructor arguments or by calling the Add instance method.

How It Works

There are two ways to create an XML tree using LINQ to XML. The first is to create an instance of XElement and pass in instances of XElement for child nodes and XAttribute for attributes of the root node as constructor arguments. A simple example follows:

XElement root = new XElement("myrootnode",
    new XAttribute("first_attribute", "first_attribute_value"),
    new XAttribute("second_attribute", "second_attribute_value"),
    new XElement("first_element", "first_element"),
    new XElement("second_element", "second_element",
        new XAttribute("nested_attribute", "nested_attribute_value"))
);

This approach looks confusing, but starts to make sense once you start to write code yourself. You can supply any number of constructor arguments, and each XElement you create to represent a child node will accept constructor arguments itself to represent its own children and attributes. The preceding example is equivalent to the following XML:

<myrootnode

   first_attribute="first_attribute_value" second_attribute="second_attribute_value">

<first_element>first_element</first_element>

<second_element

   nested_attribute="nested_attribute_value">second_element</second_element>

</myrootnode>

The second approach to create an XML tree with XML to LINQ is to create individual instances of XElement and XAttribute and add them to their parent node using the XElement.Add instance method. See the example in this recipe for a demonstration.

In addition to XElement and XAttribute, the System.Xml.Linq namespace includes classes that represent other XML types—including XComment, XDeclaration, and XCData. The .NET Framework contains a full description of each type available, but of particular interest are XDocument and XDeclaration—these classes allow you to create the standard XML declaration at the start of your data.

The Code

The following example creates one element using the constructor arguments and adds another using the XElement.Add method. The root element is then added to an instance of XDocument along with an XDeclaration. The XDocument is written out to the console via an XMLTextWriter, which ensures that the XML header is included (it is omitted if the XDocument instance is passed to Console.WriteLine).

using System;
using System.Xml.Linq;
using System.Xml;

namespace Recipe06_14
{
    class Recipe06_14
    {
        static void Main(string[] args)
        {

            XElement root = new XElement("products",
                new XElement("product",
                    new XAttribute("id", 1001),
                    new XElement("productName", "Gourmet Coffee"),
                    new XElement("description",
                        "The finest beans from rare Chillean plantations."),
                    new XElement("productPrice", 0.99),
                    new XElement("inStock", true)
                ));

            XElement teapot = new XElement("product");
            teapot.Add(new XAttribute("id", 1002));
            teapot.Add(new XElement("productName", "Blue China Tea Pot"));
            teapot.Add(new XElement("description",
                "A trendy update for tea drinkers."));
            teapot.Add(new XElement("productPrice", 102.99));
            teapot.Add(new XElement("inStock", true));
            root.Add(teapot);

            XDocument doc = new XDocument(
                new XDeclaration("1.0", "", ""),
                root);
doc.Save(Console.Out);        }
    }
}

The output of the example is as follows (formatted for easier reading):

<?xml version="1.0" encoding="ibm850"?>

<products>

<product id="1001">

<productName>Gourmet Coffee</productName>

<description>The finest beans from rare Chillean plantations.</description>

<productPrice>0.99</productPrice>

<inStock>true</inStock>

</product>

<product id="1002">

<productName>Blue China Tea Pot</productName>

<description>A trendy update for tea drinkers.</description>

<productPrice>102.99</productPrice>

<inStock>true</inStock>

</product>

</products>

Press any key to continue . . .

Query XML with LINQ

Problem

You need to query an XML tree to find elements with a given name, attribute, or other characteristic.

Solution

Use the XML tree as the data source for a LINQ query.

How It Works

The System.Xml.Linq.XElement class is a valid data source for LINQ queries. The basic sequence for querying an XML tree is the same as the one we used in Chapter 2 when querying a collection:

  1. Start a new LINQ query using the from keyword, providing a variable name that you will use to make selections (for example, from element in root.Elements()).

  2. Identify the conditions to use in selecting elements with the where keyword.

  3. Indicate what value will be added to the result set from each matching element using the select keyword.

  4. Specify the way in which you wish the results to be sorted using the orderby keyword.

When using XElement as the LINQ source, the result is an IEnumerable of XElements, containing those elements from your XML tree that match your search criteria. LINQ queries can be written using the keywords that have been added to C# (from, where, select, etc.), or by using instance methods that fulfill the same purpose—in the case of XElement, you should call the Elements instance method to obtain an IEnumerable<XElement> to use as the basis for your queries.

To select an element, you use the properties and methods of the XElement class. For example, to find all elements in an XML tree that have an attribute color with value blue, you would call

from element in root.Elements() where (string)element.Attribute("color")

    == "blue" select element;

To achieve the same result using instance methods, you would call

root.Elements().Where(e => (string)e.Attribute("color") == "blue").Select(e => e);

The result type from a LINQ query depends on what kind of element you retrieve with the select keyword, but will always return an instance of IEnumerable, which you can use in a foreach loop to run through the matched elements, or as the basis for further LINQ queries. The preceding example calls (which are functionally equivalent) return an IEnumerable<XElement>. If you use the select keyword to obtain a value representing a characteristic of an element, such as an attribute value, then the generic type of IEnumerable will be different.

The Code

The following example loads an XML tree from the file store.xml, which is included in the sample code. Elements that are named Products and that have a value of 16 for the child element CategoryID are selected, and the value of the ModelName child element is printed out. The query is performed once using the LINQ keywords and again using the instance methods and lambda expressions—it is a matter of preference as to which technique you adopt.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;


namespace Recipe06_15
{
    class Recipe06_15
    {
        static void Main(string[] args)
        {
            // Load the XML tree from the sample file.
            XElement rootElement = XElement.Load(@"....store.xml");

            // Select the name of elements who have a category ID of 16.
            IEnumerable<string> catEnum = from elem in rootElement.Elements()
                                          where (elem.Name == "Products" &&
                                            ((string)elem.Element("CategoryID"))
                                                == "16")
                                          select ((string)elem.Element("ModelName"));

            foreach (string stringVal in catEnum)
            {
                Console.WriteLine("Category 16 item: {0}", stringVal);
            }

            Console.WriteLine("Press enter to proceed");
            Console.ReadLine();

            // Perform the select again using instance methods.
            IEnumerable<string> catEnum2 = rootElement.Elements().Where(e
                => e.Name == "Products"
                && (string)e.Element("CategoryID") == "16").Select(

e => (string)e.Element("ModelName"));


            foreach (string stringVal in catEnum2)
            {
                Console.WriteLine("Category 16 item: {0}", stringVal);
            }
        }
    }
}

Modify an XML Tree with LINQ

The Problem

You need to add, remove, or modify elements in an XML tree.

The Solution

Use the Add, Replace*, and Remove* methods of the XElement class.

How It Works

The first step to add, remove, or modify an XML tree is to select the element you wish to change. You can do this by using a LINQ query, or through the Attribute and Element methods of XElement.

To add a new element, call the Add method of the XElement you wish to use as the parent.

To modify elements, perform a LINQ query to select the elements of interest and enumerate the results, calling the ReplaceAttributes or ReplaceNodes methods to modify the XElement you have selected, or the ReplaceWith method to replace the selected element with a new element.

To remove an element, call the Remove method on the XElement instance that you no longer require.

The Code

The following example loads an XML file called ProductCatalog.xml, which is included with the source code for this chapter. After printing out the XML that has been loaded, the following tasks are performed:

  1. Find all elements called product and modify the value of the attribute id.

  2. Remove all elements that contain the word tea in a child node called description.

  3. Create a new element and add it to a suitable parent.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

namespace Recipe06_16
{
    class Recipe06_16
    {
        static void Main(string[] args)
        {
            // Load the XML tree from the file.
            XElement rootElem = XElement.Load(@"....ProductCatalog.xml");
// Write out the XML.
            Console.WriteLine(rootElem);

            Console.WriteLine("Press enter to continue");
            Console.ReadLine();

            // Select all of the product elements.
            IEnumerable<XElement> prodElements
                = from elem in rootElem.Element("products").Elements()
                    where (elem.Name == "product")
                    select elem;

            // Run through the elements and change the ID attribute.
            foreach(XElement elem in prodElements)
            {
                // Get the current product ID.
                int current_id = Int32.Parse((string)elem.Attribute("id"));
                // Perform the replace operation on the attribute.
                elem.ReplaceAttributes(new XAttribute("id", current_id + 500));
            }

            Console.WriteLine(rootElem);
            Console.WriteLine("Press enter to continue");
            Console.ReadLine();

            // Remove all elements that contain the word "tea" in the description.
            IEnumerable<XElement> teaElements = from elem in
rootElem.Element("products").Elements()
                    where (((string)elem.Element("description")).Contains("tea"))
                    select elem;

            foreach (XElement elem in teaElements)
            {
                elem.Remove();
            }

            Console.WriteLine(rootElem);
            Console.WriteLine("Press enter to continue");
            Console.ReadLine();

            // Define and add a new element.
            XElement newElement = new XElement("product",
                new XAttribute("id", 3000),
                new XElement("productName", "Chrome French Press"),
                new XElement("description",
                    "A simple and elegant way of making great coffee"),
                new XElement("productPrice", 25.00),
                new XElement("inStock", true));
rootElem.Element("products").Add(newElement);

            Console.WriteLine(rootElem);
        }
    }
}
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.107.116