Chapter 9. Introducing LINQ to XML

This chapter covers

  • LINQ to XML design principles
  • LINQ to XML class hierarchy
  • Loading, parsing, and manipulating XML

In the first three parts of this book, we introduced you to the new language features in C# and VB that help enable LINQ, to LINQ to Objects (the default implementation of the standard query operators that work over objects), and to LINQ to SQL (the implementation of LINQ for working with relational data). In this chapter, we introduce you to another important piece of LINQ: LINQ to XML.

LINQ to XML allows you to use the powerful query capabilities offered by LINQ with XML data. Rather than learn a new API for querying XML, we can stick with the familiar query syntax that we’ve already learned for querying objects and relational data.

In addition to allowing us to query XML using LINQ, LINQ to XML also provides developers with a new XML programming API. The programming API is a lightweight, in-memory API that has been designed to take advantage of the latest .NET Framework. It provides functionality similar to the DOM, but does so with a redesigned API that is more intuitive.

In this chapter, we’re going to focus on the new XML programming API offered with LINQ. Understanding the LINQ to XML API will give us a foundation that we’ll build upon as we dive deeper later in this chapter, as well as in chapters 10 and 11. Once we have a firm understanding of the LINQ to XML API, we’ll see how we can begin to query and transform XML data with LINQ to XML in chapter 10.

To become experts in the LINQ to XML API, we first need to back up and become familiar with its key design principles. In this chapter, we’ll introduce those design principles, along with several of the key concepts that are at the heart of the API. Once we understand why Microsoft chose to create LINQ to XML, we’ll plunge into the LINQ to XML class hierarchy. As we look at the class hierarchy, we’ll identify several of the key constructors and methods that we’ll use when working with the LINQ to XML API toward the end of this chapter. Once we have an overview of the classes provided by LINQ to XML, we’ll look at how we can begin to use the LINQ to XML classes to perform the key operations necessary for building applications that use XML data, such as how to load, parse, create, update, delete, and save XML.

Before we get too deep into the details of the LINQ to XML API, we first need to understand what an XML API is, and what it’s good for.

9.1. What is an XML API?

An XML API provides developers with a programming interface for working with XML data. By utilizing an XML API, we can build applications that make use of XML. To illustrate our need for such an API, think about how we might build an application that makes use of an XML file that contains a list of web site links, as shown in listing 9.1.

Listing 9.1. Sample XML file containing web site links
<links>
  <link>
    <url>http://linqinaction.net</url>
    <name>LINQ in Action</name>
  </link>
  <link>
    <url>http://hookedonlinq.com</url>
    <name>Hooked on LINQ</name>
  </link>
  <link>
    <url>http://msdn.microsoft.com/data/linq/</url>
    <name>The LINQ Project</name>
  </link>
</links>

To build an application that uses this XML file, we need a way to open the XML file and read its contents. We also might need a way to create a file with the same structure as the file shown, as well as a way to modify specific links contained within the XML file. If you’ve done any work with XML, you already know that these scenarios are exactly what an XML API is designed for. Rather than resorting to brute force string manipulation, we use an API that is designed to make loading, manipulating, and saving XML easy for programmers.

Over the years we’ve seen many different implementations of APIs for working with XML. While they all share some common attributes, they each have a different style and approach that make them unique. Today, when working with XML in .NET, we can choose from a variety of APIs. Our choice largely depends upon what we’re trying to accomplish. If we’re interested in the low-level parsing of XML, we can use the XmlTextReader class. If we’re dealing with large documents, we might choose a streaming API such as XmlReader. And if we’re interested in an API that will make it easy to traverse the XML, we might choose to use the DOM available via the XmlNode class, or the XPathNavigator class, which allows traversal of XML nodes via XPath expressions. Each API provides unique advantages and has specific strengths and weaknesses. But what they all have in common is their goal of allowing developers to build applications that use XML.

With so many .NET XML API choices available today, you might be wondering why we need LINQ to XML at all. After all, it appears we have a lot of specialized XML APIs that are designed for working with XML data. Let’s now take a look at why we need yet another XML programming API.

9.2. Why do we need another XML programming API?

With existing APIs, developers have too much to think about. We have to know when to choose between XSLT, XPath, XQuery, and XML DOM. We have to worry about the subtle points of a lot of different APIs and need to learn technologies that have completely different conceptual models. For those working with XML day in and day out, this might not be a problem, but for the majority of developers, the depth and breadth of technological choices for working with XML is overwhelming.

LINQ to XML aims to solve these problems by providing mainstream developers with a simple, yet powerful, XML programming API. It provides the query and transformation power of XQuery and XPath integrated into .NET programming languages, as well as an in-memory programming API that makes working with XML data consistent and predictable.

In addition to providing developers with a more usable XML API, LINQ to XML also aims to take advantage of the advancements in programming languages that have occurred since the DOM/SAX was created nearly a decade ago. Language features such as nullable types and functional construction are in wide use today, and developers working with XML should be able to leverage these language advancements in their daily work. Additionally, LINQ itself brings many language advancements such as extension methods, anonymous types, and lambda expressions. In order for LINQ to fulfill its goal of providing a single query API for all data, Microsoft needed to ensure the LINQ story surrounding XML was compelling.

It could be argued that instead of creating a brand-new API, Microsoft should have reworked its existing APIs. Although Microsoft considered adding LINQ support to the existing APIs, retrofitting them would be difficult without breaking existing applications. An attempt to do so would cause a great deal of confusion among developers and would raise the complexity of those APIs to a point that they’d be unusable for most tasks. Since one of the primary goals was to make a more usable XML API, the complexity that changing existing APIs would bring made it a less viable option.

If what we’ve just said has yet to convince you, don’t worry, because as you begin to work with LINQ to XML you’ll quickly see why Microsoft chose to create a new XML API. LINQ to XML has been designed for LINQ, and it shows!

Let’s now look at the core LINQ to XML design principles to get a better understanding of how LINQ to XML differs from existing .NET XML APIs.

9.3. LINQ to XML design principles

To make working with XML more productive and enjoyable for the average XML programmer, Microsoft has taken a completely new approach with the design of LINQ to XML. It has been designed to be a lightweight XML-programming API, both conceptually and in terms of memory and performance. As we’ll see in section 9.4, the LINQ to XML data model has been closely aligned with the W3C Information Set.[1]

1 The W3C Information Set is a specification that provides a consistent set of definitions for the information in an XML document. For more information, visit the W3C Information Set web site: http://www.w3.org/TR/xml-infoset/.

To fully appreciate how the design principles that we’re about to discuss make a difference when working with XML, let’s create a simple XML document using today’s most prominent XML-programming API, the DOM, then compare it against how we create the same XML document using LINQ to XML. Our simple example will show how LINQ to XML can make our lives as XML developers easier and more productive.

Our aim is to create an XML document that contains the details of the books contained within our LINQBooks sample application. Let’s start with a simple document that contains the most important book in anyone’s library (see listing 9.2).

Listing 9.2. The most important book in anyone’s library
<books>
  <book>
    <title>LINQ in Action</title>
    <author>Fabrice Marguerie</author>
    <author>Steve Eichert</author>
    <author>Jim Wooley</author>
    <publisher>Manning</publisher>
  </book>
</books>

Now let’s write the code necessary for creating our document using the DOM. See listing 9.3.

Listing 9.3. Create an XML document using the DOM
XmlDocument doc = new XmlDocument();
XmlElement books = doc.CreateElement("books");
XmlElement author1 = doc.CreateElement("author");
author1.InnerText = "Fabrice Marguerie";
XmlElement author2 = doc.CreateElement("author");
author2.InnerText = "Steve Eichert";
XmlElement author3 = doc.CreateElement("author");
author3.InnerText = "Jim Wooley";
XmlElement title = doc.CreateElement("title");
title.InnerText = "LINQ in Action";
XmlElement publisher = doc.CreateElement("publisher");
publisher.InnerText = "Manning";
XmlElement book = doc.CreateElement("book");
// Append the children in the order they appear in listing 9.2
book.AppendChild(title);
book.AppendChild(author1);
book.AppendChild(author2);
book.AppendChild(author3);
book.AppendChild(publisher);
books.AppendChild(book);
doc.AppendChild(books);

As we can see, creating XML documents using the DOM requires us to use an imperative construction model. First, we create our element within the context of our document, and then we append it to its parent. The imperative construction model results in code that looks nothing like the resulting XML. Rather than being hierarchical like the XML we’re trying to produce, the code is flat with everything at a single level. Additionally, we need to create a lot of temporary variables to hold onto each element we create. The result is a block of code that is hard to read, debug, and maintain. The structure of the code has no relationship to the structure of the XML we’re creating. In contrast, let’s look at the code required to create the same XML using LINQ to XML.

Listing 9.4. Create an XML document using LINQ to XML
XElement books = new XElement("books",
  new XElement("book",
    new XElement("title", "LINQ in Action"),
    new XElement("author", "Fabrice Marguerie"),
    new XElement("author", "Steve Eichert"),
    new XElement("author", "Jim Wooley"),
    new XElement("publisher", "Manning")
  )
);

Because LINQ to XML provides convenient constructors for creating elements in a document-free environment, we can quickly write the code necessary for creating our document. We no longer have to worry about creating our elements within the context of a parent document, and we can construct our XML using a structure very similar to that of the resulting XML.

Our simple example demonstrated several of the key design differences between LINQ to XML and the DOM. To highlight the difference even more, we’re now going to explore LINQ to XML’s key concepts and examine the underlying design principles of LINQ to XML in detail.

9.3.1. Key concept: functional construction

LINQ to XML provides a powerful approach to creating XML elements, referred to as functional construction. Functional construction allows a complete XML tree to be created in a single statement. Rather than imperatively building up our XML document by creating a series of temporary variables for each node, we build XML in a functional manner, which allows the XML to be built in a way that closely resembles the resulting XML.

When working with the DOM, notice how much code we need to write just to keep our elements around so we can assign them values and append them to the appropriate parent element. Not only do we need to write a lot more code, but also the code doesn’t look at all like the resulting XML. In order to build XML using the imperative model that the DOM requires, we need to stop thinking about our XML and instead think about how the XML DOM works.

The goal of functional construction is to allow programmers to build XML in a way that fits with how they think about XML. By allowing developers to stay focused on the XML and not have to switch gears, the LINQ to XML API makes developers’ lives more pleasant and enjoyable. Isn’t happiness in life defined by how nicely your XML API lets you create XML documents?

As we can see in figure 9.1, the LINQ to XML code on the left closely resembles the resulting XML that’s shown on the right.

Figure 9.1. LINQ to XML’s functional construction allows the code for creating XML to closely resemble the resulting XML.
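The resemblance between code and markup can be sketched in a few lines. The following is a minimal illustration, not the figure itself; the class name FunctionalConstructionSketch and the method BuildBooks are hypothetical. It also relies on the fact that the XElement constructor expands a sequence passed as content into individual child elements.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class FunctionalConstructionSketch
{
    // Builds the whole <books> tree in a single expression. The nesting of
    // the constructor calls mirrors the nesting of the resulting XML.
    public static XElement BuildBooks(string title, string[] authors, string publisher)
    {
        return new XElement("books",
            new XElement("book",
                new XElement("title", title),
                // A sequence passed as content becomes one child per item.
                authors.Select(a => new XElement("author", a)),
                new XElement("publisher", publisher)));
    }
}
```

Calling BuildBooks with the title, the three authors, and the publisher from listing 9.2 should yield a tree with the same shape as that listing.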

When we start discussing how to create XML in section 9.5.3, we’ll revisit and reexamine how functional construction is made possible in LINQ to XML. For now, let’s move on to the second key concept of LINQ to XML, context-free XML creation.

9.3.2. Key concept: context-free XML creation

When creating XML using the DOM, everything must be done within the context of a parent document. This document-centric approach to creating XML results in code that is hard to read, write, and debug. Within LINQ to XML, elements and attributes have been granted first-class status. They’re standalone values that can be created outside the context of a document or parent element. This allows programmers to work with XML in a much more natural way. Rather than going through factory methods to create elements and attributes, they can be created using the compositional constructors offered by the XElement and XAttribute classes.

The result is code that is much more readable and understandable. In addition, it is easier to create methods that accept and return elements and attributes, since they no longer have to be constructed within the context of their parent document.
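As a rough sketch of what this makes possible, the helper below returns a link element that belongs to no document or parent until the caller adds it somewhere. The names ContextFreeSketch, CreateLink, and CreateLinks are hypothetical; the element names come from listing 9.1.

```csharp
using System.Xml.Linq;

public static class ContextFreeSketch
{
    // No owner document is needed: the returned element is a standalone value.
    public static XElement CreateLink(string url, string name)
    {
        return new XElement("link",
            new XElement("url", url),
            new XElement("name", name));
    }

    // The caller decides later where each element belongs.
    public static XElement CreateLinks()
    {
        XElement links = new XElement("links");
        links.Add(CreateLink("http://linqinaction.net", "LINQ in Action"));
        links.Add(CreateLink("http://hookedonlinq.com", "Hooked on LINQ"));
        return links;
    }
}
```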

Although documents have lost their elite status within LINQ to XML, they still have their place. When creating full XML documents that have XML declarations, document type definitions, and XML processing instructions, LINQ to XML offers the XDocument class.
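A minimal sketch of building such a document follows. The stylesheet name books.xsl, the comment text, and the class name DocumentSketch are assumptions made for illustration.

```csharp
using System.Xml.Linq;

public static class DocumentSketch
{
    // Builds a full document: declaration, processing instruction,
    // comment, and a single root element.
    public static XDocument BuildDocument()
    {
        return new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            new XProcessingInstruction("xml-stylesheet",
                "type=\"text/xsl\" href=\"books.xsl\""),
            new XComment("Generated for illustration"),
            new XElement("books",
                new XElement("book",
                    new XElement("title", "LINQ in Action"))));
    }
}
```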

As we work through the rest of this chapter, you’ll begin to see the benefits of working within a context-free API. We know you’re excited to see more key concepts, so now let’s move on to simplified names.

9.3.3. Key concept: simplified names

One of the most confusing aspects of XML is all the XML names, XML namespaces, and namespace prefixes. When creating elements with the DOM, developers have several overloaded factory methods that allow them to include details of the fully expanded name of an element. How the DOM figures out the name, namespace, and prefix is confusing and complicates the API unnecessarily. Within LINQ to XML, XML names have been greatly simplified. Rather than having to worry about local names, qualified names, namespaces, and namespace prefixes, we can focus on a single fully expanded name. The XName class represents a fully expanded name, which includes the namespace and local name for the elements. When a namespace is included as part of an XName, it takes the following form: {http://schemas.xyxcorp.com/}localname.

In addition to simplifying the process of creating elements that use namespaces, LINQ to XML also makes it much easier to query an XML tree for elements that have a namespace specified. Let’s look at the code for querying the following RSS feed (which has namespaces), shown in listing 9.5.

Listing 9.5. An RSS feed that uses XML namespaces
<?xml-stylesheet href="http://iqueryable.com/friendly-rss.xsl"
 type="text/xsl" media="screen"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
 xmlns:wfw="http://wellformedweb.org/CommentAPI/">
  <channel>
    <title>Steve Eichert</title>
    <link>http://iqueryable.com/</link>
    <generator>ActiveType CMS v0.1</generator>
    <dc:language>en-US</dc:language>
    <description />
    <item>
      <dc:creator>Steve Eichert</dc:creator>
      <title>Parsing WordML using LINQ to XML</title>
      <link>http://iqueryable.com/LINQ/ParsingWordMLusingLINQ to XML</link>
      <pubDate>Wed, 02 Aug 2006 15:52:44 GMT</pubDate>
      <guid>http://iqueryable.com/LINQ/ParsingWordMLusingLINQ to XML</guid>
      <comments>
      http://iqueryable.com/LINQ/ParsingWordMLusingLINQ to XML#comments
      </comments>
      <wfw:commentRss>
      http://iqueryable.com/LINQ/ParsingWordMLusingLINQ to XML/commentRss.aspx
      </wfw:commentRss>
      <slash:comments>1</slash:comments>
      <description>Foo...</description>
    </item>
  </channel>
</rss>

Note that the RSS feed uses several XML namespaces: http://purl.org/dc/elements/1.1/, http://purl.org/rss/1.0/modules/slash/, and http://wellformedweb.org/CommentAPI/. Listing 9.6 shows the code for using the DOM to select values out of elements that use a namespace prefix for the aforementioned namespaces.

Listing 9.6. Working with XML containing namespaces via the DOM
XmlDocument doc = new XmlDocument();
doc.Load("http://iqueryable.com/rss.aspx");

XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
ns.AddNamespace("dc", "http://purl.org/dc/elements/1.1/");
ns.AddNamespace("slash", "http://purl.org/rss/1.0/modules/slash/");
ns.AddNamespace("wfw", "http://wellformedweb.org/CommentAPI/");

XmlNodeList commentNodes = doc.SelectNodes("//slash:comments", ns);
foreach(XmlNode node in commentNodes) {
    Console.WriteLine(node.InnerText);
}

When querying the RSS feed using the DOM, we need to create an XmlNamespaceManager and remember to use it every time we do a search on the document, unless we don’t plan on querying for elements that have a prefix, in which case we can get rid of the namespace manager altogether:

XmlNodeList titleNodes = doc.SelectNodes("/rss/channel/item/title");
foreach(XmlNode node in titleNodes) {
  Console.WriteLine(node.InnerText);
}

Depending on what we’re querying, we have slightly different APIs. We have to remember when to use a namespace manager and when to forgo it. LINQ to XML provides a more natural way of handling namespaces. Instead of working with an XmlNamespaceManager and having to remember when to use it, we remember one simple rule:

  • Always use fully expanded names when working with elements and attributes.

If the element that you’re interested in has a namespace associated with it, use it when constructing your XName; if it doesn’t, then don’t. Listing 9.7 shows the LINQ to XML code for querying our sample XML.

Listing 9.7. Querying XML containing namespaces with LINQ to XML
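What follows is a minimal sketch of such queries, not necessarily the exact code the authors used. To keep it independent of the network, it assumes a trimmed copy of the feed from listing 9.5 held in a string; the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class NamespaceQuerySketch
{
    // A trimmed copy of the feed in listing 9.5 (an assumption for this sketch).
    public const string Feed = @"<rss version=""2.0""
  xmlns:dc=""http://purl.org/dc/elements/1.1/""
  xmlns:slash=""http://purl.org/rss/1.0/modules/slash/"">
  <channel>
    <title>Steve Eichert</title>
    <item>
      <dc:creator>Steve Eichert</dc:creator>
      <title>Parsing WordML using LINQ to XML</title>
      <slash:comments>1</slash:comments>
    </item>
  </channel>
</rss>";

    // First query: the element has a namespace, so the XName is built by
    // appending the local name to the XNamespace.
    public static List<string> CommentCounts(XElement rss)
    {
        XNamespace slash = "http://purl.org/rss/1.0/modules/slash/";
        return rss.Descendants(slash + "comments")
                  .Select(e => (string)e)
                  .ToList();
    }

    // Second query: item titles have no namespace, so the local name suffices.
    public static List<string> ItemTitles(XElement rss)
    {
        return rss.Descendants("item")
                  .Elements("title")
                  .Select(e => (string)e)
                  .ToList();
    }
}
```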

As you can see, the way we deal with namespaces in LINQ to XML is straightforward. In our first query, we build our fully expanded name (XName) by appending the element’s local name to our XNamespace (slash). In the second query for the titles of the items in the RSS feed, we use just the local name (title) since it doesn’t have any namespace associated with it. By combining the namespace and the local name into a single concept, LINQ to XML makes working with XML documents that use namespaces and namespace prefixes much simpler. Everything is wrapped up into a single concept and encapsulated in a single class, XName.

That completes our quick tour of the key concepts within LINQ to XML. Throughout the next chapters, you’ll see many examples of these key concepts, as well as how they can make your life easier when working with XML. Now let’s jump into the class hierarchy itself so that we can see how everything we’ve talked about thus far manifests itself in the classes and objects we’ll use to build XML applications.

9.4. LINQ to XML class hierarchy

Before moving on to look at how we can use the LINQ to XML programming API to load, create, and update XML, we need to understand the major classes we’ll be using. Fortunately, LINQ to XML has a relatively small hierarchy, and only a handful of classes that you’ll work with day to day. The class hierarchy in figure 9.2 shows the major classes defined in the LINQ to XML API.

Figure 9.2. The LINQ to XML class hierarchy consists of a small number of classes, but together they provide developers with a powerful programming API for working with XML.

LINQ to XML is a small, focused API that has been designed to allow programmers to work with XML in a more productive and intuitive manner.

At the top of figure 9.2 we have the abstract XObject class. The XObject class serves as a base class for the majority of the classes within the LINQ to XML class hierarchy. It provides an AddAnnotation method for adding user-defined information, such as line numbers, to LINQ to XML objects, as well as a RemoveAnnotations method for getting rid of annotations when they’re no longer desired. To retrieve annotations, XObject offers the Annotation, Annotation<T>, Annotations, and Annotations<T> methods.
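A short sketch of attaching and reading back an annotation follows. The SourceLine type is a hypothetical user-defined class, and the line number is an arbitrary value chosen for illustration.

```csharp
using System.Xml.Linq;

public static class AnnotationSketch
{
    // A hypothetical user-defined annotation type.
    public sealed class SourceLine
    {
        public int Number { get; set; }
    }

    // Attaches a SourceLine annotation to a new element.
    public static XElement Annotate()
    {
        XElement book = new XElement("book", "LINQ in Action");
        book.AddAnnotation(new SourceLine { Number = 42 });
        return book;
    }
}
```

The annotation can then be read with Annotation<AnnotationSketch.SourceLine>() and dropped with RemoveAnnotations<AnnotationSketch.SourceLine>(). Annotations live only on the in-memory object and are not part of the serialized XML.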

Just below XObject in the class diagram is the abstract XNode class. XNode is the base class for all LINQ to XML classes that represent nodes in an XML tree, such as elements, text, comments, and processing instructions. It provides common operations for updates using the imperative style such as AddAfterSelf, AddBeforeSelf, and Remove, as well as axis methods such as Ancestors, ElementsAfterSelf, ElementsBeforeSelf, NodesAfterSelf, and NodesBeforeSelf.

Just below XNode in the class hierarchy is the XContainer class. XContainer is an abstract base class for all XNode objects that can contain other XNode objects. XContainer adds additional imperative update methods such as Add, AddFirst, RemoveNodes, and ReplaceNodes. It also adds axis methods such as Nodes, Descendants, Element, and Elements. XContainer serves as the base class of two of the most important classes within the LINQ to XML hierarchy, XElement and XDocument.
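To see how the XNode and XContainer update methods combine, consider this small sketch. The class name ContainerSketch and its methods are hypothetical, and the element names are borrowed from listing 9.2.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class ContainerSketch
{
    public static XElement Reorder()
    {
        XElement book = new XElement("book",
            new XElement("author", "Steve Eichert"));

        // XContainer methods: add children at the end or at the start.
        book.Add(new XElement("publisher", "Manning"));
        book.AddFirst(new XElement("title", "LINQ in Action"));

        // XNode methods: position new nodes relative to an existing one.
        XElement steve = book.Element("author");
        steve.AddBeforeSelf(new XElement("author", "Fabrice Marguerie"));
        steve.AddAfterSelf(new XElement("author", "Jim Wooley"));

        return book;
    }

    // Axis method Elements() returns the child elements in document order.
    public static string[] ChildNames(XElement element)
    {
        return element.Elements().Select(e => e.Name.LocalName).ToArray();
    }
}
```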

Although it appears low in the class hierarchy, the most fundamental class within LINQ to XML is XElement. The XElement class represents XML element nodes that contain other elements as child nodes. It adds further axis methods such as Attributes, AncestorsAndSelf, and DescendantsAndSelf, as well as additional imperative update methods such as RemoveAll, RemoveAttributes, SetElementValue, and SetAttributeValue. As the fundamental class within LINQ to XML, XElement also provides a static Load method, which allows XML to be loaded from external sources, as well as a static Parse method that allows an XElement to be created from a string of XML. Finally, XElement offers a Save method for saving the XML tree that it represents to disk, as well as a WriteTo method that allows the XML to be written to an XmlWriter. In addition to being able to contain other XNode objects, XElement also has the ability to have attributes assigned.

The XAttribute class represents attributes within LINQ to XML. Unlike many of the other core classes within LINQ to XML, XAttribute does not inherit from XNode. XAttribute objects are name/value pairs that are associated with XElement objects. The XAttribute class provides a Parent axis property, as well as a single imperative Remove method.
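A brief sketch of working with attributes is shown below. The attribute names lang and edition, along with their values, are placeholders invented for illustration.

```csharp
using System.Xml.Linq;

public static class AttributeSketch
{
    public static XElement BuildBook()
    {
        // Attributes are passed as content alongside child elements.
        XElement book = new XElement("book",
            new XAttribute("lang", "en"),
            new XElement("title", "LINQ in Action"));

        // SetAttributeValue adds the attribute when it does not exist yet.
        book.SetAttributeValue("edition", 1);
        return book;
    }
}
```

Each attribute’s Parent property points back to the owning element, and calling Remove on an attribute detaches it from that element.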

As we mentioned earlier in this chapter, the importance of XML documents has been greatly deemphasized in LINQ to XML, but they’re still needed from time to time. It’s for this purpose that LINQ to XML provides the XDocument class. The XDocument class represents a complete XML document. Like the XElement class, it offers both a static Load method for loading XML documents from external sources and a static Parse method that allows XML documents to be created from a string. It also offers the same Save and WriteTo methods for saving the XML document that it represents. The primary difference between the XElement and XDocument classes is that an XDocument contains a single root XElement, as well as the ability to contain

  • One XML declaration
  • One XML document type
  • XML processing instructions

As mentioned earlier, one of the key concepts of LINQ to XML is the simplification of XML names. The two classes that help with this simplification are the XName and XNamespace classes. XName represents a fully expanded name for an XElement or XAttribute. The fully expanded name is represented in the string format {namespace}localname. The XNamespace class represents the namespace portion of an XName, and as such can be retrieved using the Namespace property of an XName. The XName and XNamespace classes have an implicit operator overload defined that allows a string formatted in expanded XML Name format to automatically be converted into an XName and XNamespace. The implicit overloads allow us to use strings in place of XName and XNamespace objects when constructing XElement and XAttribute objects.
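The two implicit conversions can be sketched as follows. The class and method names are hypothetical; the namespace URI is the Dublin Core namespace from listing 9.5.

```csharp
using System.Xml.Linq;

public static class ExpandedNameSketch
{
    public static XName FromExpandedString()
    {
        // Implicit string-to-XName conversion, using {namespace}localname.
        XName name = "{http://purl.org/dc/elements/1.1/}creator";
        return name;
    }

    public static XName FromNamespace()
    {
        // Implicit string-to-XNamespace conversion, then namespace + local name.
        XNamespace dc = "http://purl.org/dc/elements/1.1/";
        return dc + "creator";
    }
}
```

Both methods should produce equal XName values, whose Namespace and LocalName properties expose the two parts of the expanded name.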

While there are several other classes within the LINQ to XML hierarchy, they’re complementary classes that you likely won’t see in your everyday programming efforts. See table 9.1.

Table 9.1. Complementary LINQ to XML classes

Class                    Description
XDeclaration             Represents an XML declaration. An XML declaration is used to declare the XML version, the encoding, and whether or not the XML document is standalone.
XComment                 Represents an XML comment.
XDocumentType            Represents an XML DTD.
XProcessingInstruction   Represents an XML processing instruction. A processing instruction is intended to convey information to an application that processes the XML.
XStreamingElement        Allows elements to be streamed on input and output.
XText and XCData         The LINQ to XML text node classes. Text nodes are used when creating CData sections or working with mixed content.

The simplicity of LINQ to XML can be seen in the low number of classes within the API, and in how few classes you need to be intimately familiar with to complete common XML programming tasks. Now that we have a base understanding of the core classes within the LINQ to XML object hierarchy, let’s start to get our hands dirty with LINQ to XML.

9.5. Working with XML using LINQ

Now that we’ve seen the LINQ to XML class hierarchy, it’s time to look at how we use the API provided by the LINQ to XML classes to perform the common XML operations we encounter when developing applications. The LINQ to XML API provides developers with an in-memory programming interface for reading, parsing, creating, and manipulating XML. Using the LINQ to XML API, we can quickly build applications that leverage XML data throughout.

In this section, we’re going to cover all the fundamental operations that are required for building applications that use XML. As we work our way through the API, we’re not going to cover every class within the LINQ to XML hierarchy, and we’re not going to cover every method available. Instead we’ll focus on the key classes and methods that will be the most valuable when building applications with LINQ to XML.

We’re going to start by looking at how to load data from a file on disk, as well as from an external web site. Once we have a firm understanding of how to read XML, we’ll be able to leverage the various XML data feeds available on the web within our applications. While it’s nice to have a well-formatted XML document, we’ll sometimes need to parse a string of text formatted as XML into a LINQ to XML object. Because of this, we’ll explore the Parse methods available on XElement and XDocument.

Next, we’ll learn how to create our own XML, and we’ll finish by looking at how we can alter and modify existing XML trees using the LINQ to XML API. By the time we finish, we’ll have covered all the common operations that you’ll need to start building applications that use XML. Without further ado, let’s take a look at how to load existing XML documents with LINQ to XML.

9.5.1. Loading XML

LINQ to XML allows XML to be loaded from a variety of input sources. These sources include a file, a URL, and an XmlReader. To load XML, the static Load method on XElement can be used. To load an XML file from your hard drive into an XElement, you can use the following C# code:

XElement x = XElement.Load(@"c:\books.xml");

Loading XML from a web site (or any URL) is also supported by the Load method. To load the RSS feed from the MSDN web site, we can alter the Load method to take in a URL instead of a file path.

XElement x = XElement.Load("http://msdn.microsoft.com/rss.xml");

By default, when XML is loaded into an XDocument or XElement, insignificant whitespace within the document is removed. If you want to preserve the whitespace within the source document, you can use the overload of the Load method that takes a LoadOptions flag. The LoadOptions flag can be used to indicate the options to use when loading the XML. The available options are None, PreserveWhitespace, SetBaseUri, and SetLineInfo. Let’s load the RSS from MSDN again, but this time preserve whitespace by passing the LoadOptions.PreserveWhitespace flag.

string xmlUrl = "http://msdn.microsoft.com/rss.xml";
XElement x = XElement.Load(xmlUrl, LoadOptions.PreserveWhitespace);

When loading XML from a file or URL, LINQ to XML uses the XmlReader class. The XmlReader first retrieves the XML requested in the Load method, either by reading the file from the local filesystem or by requesting the file with the provided URL. Once the file is retrieved, the XmlReader reads the XML within the file and parses it into an in-memory tree of LINQ to XML objects. Given XElement’s use of XmlReader for loading XML, it’s not surprising that LINQ to XML also supports loading XML directly from an existing XmlReader. To load XML from an XmlReader, you must first position the XmlReader on an element node.

In listing 9.8, we load our books.xml file into an XmlReader using its static Create method. We then read each node within the XmlReader until we find a node with a NodeType of XmlNodeType.Element. Once our XmlReader is positioned on an element node, we use the static XNode.ReadFrom method, which accepts an XmlReader as a parameter, to create an XElement from the existing XmlReader instance.

Listing 9.8. Creating an XElement from an existing XmlReader
using(XmlReader reader = XmlReader.Create("books.xml")) {
  while(reader.Read()) {
    if(reader.NodeType == XmlNodeType.Element)
      break;
  }
  XElement booksXml = (XElement) XNode.ReadFrom(reader);
}

If you want to create an XElement object from a fragment of XML contained within an XmlReader, you need to navigate to the proper node using the XmlReader API and once again pass the reader to the ReadFrom method. For example, to load the first book element within our books.xml file, we can use listing 9.9.

Listing 9.9. Creating an XElement object from a fragment of XML contained within an XmlReader
using(XmlReader reader = XmlReader.Create("books.xml")) {
  while (reader.Read()) {
    // Stop at the first <book> element
    if (reader.NodeType == XmlNodeType.Element && reader.Name == "book")
      break;
  }
  XElement bookXml = (XElement) XNode.ReadFrom(reader);
}
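The same idea can be sketched without a file on disk. The variant below assumes the XML is already in a string and uses XmlReader.ReadToFollowing, rather than a manual Read loop, to position the reader; the class and method names are hypothetical.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class ReaderFragmentSketch
{
    // Returns the first <book> element in the given XML, or null if none exists.
    public static XElement FirstBook(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            // ReadToFollowing advances to the next element with the given name.
            if (!reader.ReadToFollowing("book"))
                return null;

            // The reader is now positioned on an element node.
            return (XElement)XNode.ReadFrom(reader);
        }
    }
}
```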

Thus far we’ve only explored how to load XML into XElement objects. If you’re interested in accessing the XML declaration (XDeclaration), top-level XML processing instructions (XProcessingInstruction), XML document type definitions (XDocumentType), or XML comments (XComment) within an XML document, you’ll need to load your XML into an XDocument object instead of an XElement. To load XML into an XDocument object, you can use the same mechanisms that we just discussed. The static Load method on XDocument has the same overloads as XElement and provides the same basic behavior. The only difference is that XDocument can contain additional node types as children. If we again want to load the MSDN RSS feed, but this time we’re interested in being able to access every child node (including the XML declaration, DTDs, processing instructions, and comments) we can load the RSS feed into an XDocument object using the following code:

XDocument msdnDoc = XDocument.Load("http://msdn.microsoft.com/rss.xml");
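
Once the document is loaded, the extra information is reachable through the XDocument itself. The sketch below assumes a small in-memory feed instead of the MSDN URL so that it runs offline; the DocumentLoadSketch class and its sample content are hypothetical.

```csharp
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class DocumentLoadSketch
{
    // Hypothetical feed content; any well-formed document would do.
    public const string SampleFeed =
        "<?xml version=\"1.0\" encoding=\"utf-8\"?>" +
        "<!-- generated feed -->" +
        "<rss version=\"2.0\"><channel /></rss>";

    public static XDocument LoadSample()
    {
        // XDocument.Load accepts a TextReader as well as a file name or URI.
        return XDocument.Load(new StringReader(SampleFeed));
    }

    // Counts the comments that sit beside the root element.
    public static int CountTopLevelComments(XDocument doc)
    {
        return doc.Nodes().OfType<XComment>().Count();
    }
}
```

The declaration is exposed through the Declaration property, while comments and processing instructions appear in the document’s Nodes() sequence alongside the root element.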

Now that we’ve discussed how to load XML from external sources such as files, URLs, and XmlReader objects, let’s look at how we can deal with XML that is contained within a simple string rather than a file.

9.5.2. Parsing XML

In some cases the XML that we want to use won’t be in a file, or located at a URL. It will be a simple string that is being built by some other part of our application. For these cases, the XElement class provides a static Parse method that creates a new XElement from a string of XML. The Parse method has a similar interface to the Load method, so there isn’t much new to learn. Listing 9.10 shows how we can use the Parse method to create an XElement from a string of XML.

Listing 9.10. Parsing a string of XML to an XElement
XElement x = XElement.Parse(
@"<books>
     <book>
       <author>Don Box</author>
       <title>Essential .NET</title>
     </book>
     <book>
       <author>Martin Fowler</author>
       <title>Patterns of Enterprise Application Architecture</title>
     </book>
  </books>");

Just like the Load method, the Parse method allows you to control whether whitespace is preserved by passing LoadOptions.PreserveWhitespace as the second parameter:

XElement x = XElement.Parse("<books/>", LoadOptions.PreserveWhitespace);
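
The effect of the option is easiest to see by counting child nodes. In this minimal sketch (the WhitespaceSketch class and its sample string are hypothetical), the whitespace around the book element is dropped by default and kept as text nodes when PreserveWhitespace is specified.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class WhitespaceSketch
{
    // A root element whose only child is surrounded by single spaces.
    public const string Xml = "<books> <book/> </books>";

    // Number of child nodes the root has after parsing with the given options.
    public static int NodeCount(LoadOptions options)
    {
        return XElement.Parse(Xml, options).Nodes().Count();
    }
}
```

With LoadOptions.None the root has a single child node (the book element); with LoadOptions.PreserveWhitespace it has three, because each run of whitespace becomes its own text node.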

As noted earlier, LINQ to XML uses an XmlReader to parse XML. If malformed XML is passed to Parse then the underlying XmlReader will throw an exception. The Load and Parse methods do not catch the exceptions thrown by XmlReader; instead the exception bubbles up so that application code can catch the exception and handle it appropriately. The following code shows the general structure that should be followed when loading or parsing XML:

try {
  XElement xml = XElement.Parse("<bad xml>");
}
catch (System.Xml.XmlException e) {
  // log the exception
}
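
One way to package this pattern is a helper that reports where parsing failed. The sketch below is only one possible shape; the SafeParseSketch class and its TryParse method are hypothetical, while LineNumber and LinePosition are properties of XmlException.

```csharp
using System.Xml;
using System.Xml.Linq;

public static class SafeParseSketch
{
    // Returns true and the parsed element on success; otherwise returns
    // false and a message describing where the XML was malformed.
    public static bool TryParse(string xml, out XElement result, out string error)
    {
        try
        {
            result = XElement.Parse(xml);
            error = null;
            return true;
        }
        catch (XmlException e)
        {
            result = null;
            error = string.Format("Line {0}, position {1}: {2}",
                e.LineNumber, e.LinePosition, e.Message);
            return false;
        }
    }
}
```

Only XmlException is caught here, so unrelated failures (a null argument, for instance) still surface to the caller.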

As we’ve seen throughout this section, the way we load and parse XML hasn’t changed much with LINQ to XML. Under the covers, LINQ to XML leverages the power of the existing XmlReader infrastructure to perform all the XML parsing.

This allows the LINQ to XML classes to focus on providing an intuitive API for working with XML, rather than on the nitty-gritty details required to parse the XML.

Now that we’ve covered the basics of loading and parsing existing XML, let’s move on to creating XML from scratch.

9.5.3. Creating XML

As we discussed earlier in this chapter, LINQ to XML provides a powerful approach to creating XML elements, referred to as functional construction. Functional construction allows a complete XML tree to be created in a single statement. As an example, let’s look at how we can create the following XML using functional construction.

<books>
    <book>
       <author>Don Box</author>
       <title>Essential .NET</title>
    </book>
</books>

To create this XML, we can use one of the XElement constructors that allow us to pass in an entire XML fragment as a set of nested XElement objects. See listing 9.11.

Listing 9.11. Creating an XElement with functional construction
XElement books = new XElement("books",
  new XElement("book",
    new XElement("author", "Don Box"),
    new XElement("title", "Essential .NET")
  )
);

By indenting the C# code used to create the XML, we can see it take the shape of the resulting XML. Compare this to listing 9.12, which creates the same XML using the imperative construction model provided by LINQ to XML.

Listing 9.12. Creating an XElement using the imperative construction model provided by LINQ to XML
XElement book = new XElement("book");
book.Add(new XElement("author", "Don Box"));
book.Add(new XElement("title", "Essential .NET"));

XElement books = new XElement("books");
books.Add(book);

While the overall number of lines to create the XML in the two code samples is comparable, the first sample that used functional construction is more readable and more closely resembles the resulting XML. When creating XML using the imperative model, we need to create temporary variables for the various elements that make up the resulting XML. The result is code that is less readable and more prone to errors.

When thinking about XML, we often visualize the hierarchy of nodes that make up the XML. When building XML using imperative Add method calls, the code can’t easily take on a shape similar to the resulting XML. With functional construction, we can write code that has a shape and feel similar to the resulting XML. This allows us to stay focused on the XML and not have to switch gears. The end result is a more pleasant and enjoyable programming experience.

To enable functional construction, the following three constructors are available on XElement.

public XElement(XName name)
public XElement(XName name, object content)
public XElement(XName name, params object[] content)

The content parameter can be any type of object that is a legitimate child of an XElement. Legitimate child content includes

  • A string, which is added as text content. This is the recommended pattern to add a string as the value of an element; the LINQ to XML implementation will create the internal XText node.
  • An XText, which can have either a string or CData value, added as child content. This is mainly useful for CData values; using a string is simpler for ordinary string values.
  • An XElement, which is added as a child element.
  • An XAttribute, which is added as an attribute.
  • An XProcessingInstruction or XComment, which is added as child content.
  • An IEnumerable, which is enumerated, and these rules are applied recursively.
  • Anything else, in which case ToString() is called and the result is added as text content.
  • null, which is ignored.
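
Several of the content rules listed above can be exercised in a single constructor call. The following is a minimal sketch; the ContentRulesSketch class and its sample values are hypothetical.

```csharp
using System.Xml.Linq;

public static class ContentRulesSketch
{
    public static XElement Build()
    {
        // Hypothetical sample values chosen to exercise several rules.
        XElement[] authors = {
            new XElement("author", "Fabrice Marguerie"),
            new XElement("author", "Steve Eichert")
        };

        return new XElement("book",
            new XAttribute("id", 7),       // XAttribute: added as an attribute
            null,                          // null: ignored
            authors,                       // IEnumerable: each item is added in turn
            new XElement("pages", 350),    // other object: converted to text content
            new XComment("sample data"));  // XComment: added as child content
    }
}
```

The resulting book element has one attribute and four child nodes: two author elements, a pages element whose text is 350, and the comment.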

The simplest way to create an XElement is by using a constructor that takes an XName.

XElement book = new XElement("book");

To make working with the LINQ to XML API more usable, the XName class has an implicit conversion from string. This means that LINQ to XML can convert a string, such as “book”, into an XName object without you explicitly specifying a cast or creating a new XName object. Because of this, we can pass the name of the element (“book”) directly to the XElement constructor. Under the covers, LINQ to XML implicitly converts the string into an XName and initializes the XElement with the XName.
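
The conversion can also be observed on its own. In this small sketch, the XNameSketch class and its FromString helper are hypothetical; the implicit conversion and the LocalName property are part of XName.

```csharp
using System.Xml.Linq;

public static class XNameSketch
{
    // Relies on the implicit string-to-XName conversion.
    public static XName FromString(string name)
    {
        XName converted = name;
        return converted;
    }
}
```

An XName obtained this way compares equal to the Name of an element constructed from the same string, so XNameSketch.FromString("book") == new XElement("book").Name evaluates to true.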

Creating leaf elements that have text content is as easy as passing the content as the second parameter to the XElement constructor.

XElement name = new XElement("name", "Steve Eichert");

Which will produce

<name>Steve Eichert</name>

As you would expect, the string could have been stored in a variable or returned from a method call.

XElement name = new XElement("name", usersName);
XElement name = new XElement("name", GetUsersName());

To create an XML element with child nodes, we can take advantage of the third XElement constructor that is declared with the params keyword. The params keyword allows a variable number of arguments to be passed as content. To create this XML:

<books>
  <book>LINQ in Action</book>
  <book>Ajax in Action</book>
</books>

We can use the following code:

XElement books = new XElement("books",
  new XElement("book", "LINQ in Action"),
  new XElement("book", "Ajax in Action")
);

Since each child node in the previous sample is itself an XElement, we can extend the code to create an entire XML tree, as in listing 9.13.

Listing 9.13. Creating an XML tree using LINQ to XML
XElement books = new XElement("books",
  new XElement("book",
    new XElement("title", "LINQ in Action"),
    new XElement("authors",
      new XElement("author", "Fabrice Marguerie"),
      new XElement("author", "Steve Eichert"),
      new XElement("author", "Jim Wooley")
     ),
     new XElement("publicationDate", "January 2008")
  ),
  new XElement("book",
    new XElement("title", "Ajax in Action"),
    new XElement("authors",
      new XElement("author", "Dave Crane"),
      new XElement("author", "Eric Pascarello"),
      new XElement("author", "Darren James")
    ),
    new XElement("publicationDate", "October 2005")
  )
);

Of course, as you encounter real-life scenarios for creating XML, it’s pretty unlikely that you’ll be dealing with XML that doesn’t contain namespaces and namespace prefixes. To create an element with a namespace, you can either pass the fully expanded XML name as the first parameter to the XElement constructor or you can create an XNamespace and append the local name when creating the element. Listing 9.14 shows how to create an XElement with a full XML name, as well as with an XNamespace.

Listing 9.14. Creating an XElement with a full XML name and an XNamespace
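
A minimal sketch of the two approaches described above, assuming the http://linqinaction.net namespace that the later listings use (the NamespaceSketch class and its method names are hypothetical):

```csharp
using System.Xml.Linq;

public static class NamespaceSketch
{
    // Approach 1: pass the fully expanded name, "{namespace}localName".
    public static XElement WithExpandedName()
    {
        return new XElement("{http://linqinaction.net}book");
    }

    // Approach 2: create an XNamespace and append the local name.
    public static XElement WithXNamespace()
    {
        XNamespace ns = "http://linqinaction.net";
        return new XElement(ns + "book");
    }
}
```

Both methods produce an element with the same name, which serializes as <book xmlns="http://linqinaction.net" />.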

If you’re creating a single element that uses a namespace, you’ll most likely pass the fully expanded name and not explicitly create an XNamespace. If you’re creating several elements that all use the same namespace, your code will look a lot cleaner if you declare the XNamespace once and use it throughout all the relevant elements, as shown in listing 9.15.

Listing 9.15. Creating several elements that all use an XNamespace
XNamespace ns = "http://linqinaction.net";
XElement book = new XElement(ns + "book",
  new XElement(ns + "title", "LINQ in Action"),
  new XElement(ns + "author", "Fabrice Marguerie"),
  new XElement(ns + "author", "Steve Eichert"),
  new XElement(ns + "author", "Jim Wooley"),
  new XElement(ns + "publisher", "Manning")
);

This will produce the following XML:

<book xmlns="http://linqinaction.net">
  <title>LINQ in Action</title>
  <author>Fabrice Marguerie</author>
  <author>Steve Eichert</author>
  <author>Jim Wooley</author>
  <publisher>Manning</publisher>
</book>

If you need to include namespace prefixes in your XML, you’ll have to alter your code to explicitly associate a prefix with an XML namespace. To associate a prefix with a namespace, you can add an XAttribute object to the element requiring the prefix and append the prefix to the XNamespace.Xmlns namespace, as seen in listing 9.16.

Listing 9.16. Associating a prefix with a namespace
XNamespace ns = "http://linqinaction.net";
XElement book = new XElement(ns + "book",
  new XAttribute(XNamespace.Xmlns + "l", ns)
);

The resulting XML will look like this:

<l:book xmlns:l="http://linqinaction.net" />
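
The prefix declared on an element is also used when serializing descendants that belong to the same namespace. The sketch below illustrates this with a hypothetical PrefixSketch class and a single child element.

```csharp
using System.Xml.Linq;

public static class PrefixSketch
{
    public static string Build()
    {
        XNamespace ns = "http://linqinaction.net";
        XElement book = new XElement(ns + "book",
            // Associates the prefix "l" with the namespace.
            new XAttribute(XNamespace.Xmlns + "l", ns),
            // The child only names its namespace; the prefix is
            // picked up from the declaration on the parent.
            new XElement(ns + "title", "LINQ in Action"));

        // DisableFormatting keeps the output on a single line.
        return book.ToString(SaveOptions.DisableFormatting);
    }
}
```

Build returns <l:book xmlns:l="http://linqinaction.net"><l:title>LINQ in Action</l:title></l:book>, so the namespace declaration appears only once.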

Thus far we’ve primarily focused on producing XML that contains elements. When creating XML in real-world scenarios, the XML that we produce may include attributes, processing instructions, XML DTDs, comments, and more.

To include any of these in our XML is simply a matter of passing them in at the appropriate place within the functional construction statement. For example, to add an attribute to our book element, we can create a new XAttribute and pass it as one of the content parameters of our XElement, as in listing 9.17.

Listing 9.17. Creating XML with an attribute
XElement book = new XElement("book",
  new XAttribute("publicationDate", "October 2005"),
  new XElement("title", "Ajax in Action")
);

In this section we’ve focused exclusively on using functional construction and the LINQ to XML API for creating XML. We’ve also focused on doing so with C# as our programming language. Those VB programmers in the crowd will be excited to know that you’re privy to a nice feature called XML literals, which allows you to embed XML directly within your Visual Basic 9.0 code.

9.5.4. Creating XML with Visual Basic XML literals

When creating XML in Visual Basic 9.0 using LINQ to XML, we can use the functional construction pattern as well as the imperative methods available within the LINQ to XML API. In addition, XML can be embedded directly within VB code using the XML literal syntax. To illustrate the power of the XML literals feature, let’s look at how we can construct XML using functional construction and compare it against the code for creating the same XML using XML literals. Let’s start by taking a look at the XML we’re going to produce.

<book>
  <title>Naked Conversations</title>
  <author>Robert Scoble</author>
  <author>Shel Israel</author>
  <publisher>Wiley</publisher>
</book>

Before checking out how to create this XML using XML literals let’s first do so using functional construction. The VB code to construct the XML using functional construction is shown in listing 9.18.

Listing 9.18. Creating XML using Visual Basic and functional construction
Dim xml As New XElement("book", _
  New XElement("title", "Naked Conversations"), _
  New XElement("author", "Robert Scoble"), _
  New XElement("author", "Shel Israel"), _
  New XElement("publisher", "Wiley") _
)

As you can see, the code for creating the XML using functional construction is exactly the same as the code we’ve already seen when creating XML using C# (besides the minor syntactical differences). Let’s now take a look at listing 9.19, which shows the code for creating the XML using the XML literal syntax offered by VB9.

Listing 9.19. Creating XML using XML literals
Dim xml As XElement = <book>
  <title>Naked Conversations</title>
  <author>Robert Scoble</author>
  <author>Shel Israel</author>
  <publisher>Wiley</publisher>
</book>

With XML literals, we can embed XML directly into our Visual Basic code. Rather than creating LINQ to XML object hierarchies that represent the XML, we instead can define the XML using XML syntax. The result is code that exactly mirrors the resulting XML and is more clear and concise.

In listing 9.19, we create a static XML fragment using XML literals. When building real applications, we need to build XML in a more dynamic fashion. XML literals allow us to embed expressions into the XML literal code using syntax that is similar to the syntax used in ASP.NET. Let’s modify the code to create an XML fragment using values stored in a set of local variables to illustrate how we can embed expressions in our XML. See listing 9.20.

Listing 9.20. Embedding expressions in XML literal expression holes
Dim title as String = "NHibernate in Action"
Dim author as String = "Pierre Kuate"
Dim publisher as String = "Manning"

Dim xml As XElement = <book>
  <title><%= title %></title>
  <author><%= author %></author>
  <publisher><%= publisher %></publisher>
</book>

In the listing code, we use an expression hole, which is expressed with the <%= expression %> syntax, to embed dynamic values into our XML literals. While our expressions use local variables, we could just as easily use values returned from a function or pulled from a database. By allowing us to embed our own expressions within the XML literal code, VB9 provides us with an intuitive method for dynamically creating XML fragments using familiar XML syntax.

In addition to supporting expression holes as the content of XML tags, we can also use expression holes to create XML elements dynamically. For instance, if we want to store the name of the root element of our XML in a variable, we can modify our code to look like listing 9.21.

Listing 9.21. Using expression holes to populate the element name of an XML element
Dim elementName as String = "book_tag"
Dim title as String = "NHibernate in Action"
Dim author as String = "Pierre Kuate"
Dim publisher as String = "Manning"

Dim xml As XElement = <<%= elementName %>>
  <title><%= title %></title>
  <author><%= author %></author>
  <publisher><%= publisher %></publisher>
</>

Which results in the following output:

<book_tag>
  <title>NHibernate in Action</title>
  <author>Pierre Kuate</author>
  <publisher>Manning</publisher>
</book_tag>

As we can see, using expression holes as element names is a matter of placing the expression that builds the tag inside the expression hole. Since tags created with expression holes aren’t known until run-time, VB9 allows an empty tag </> to denote the close of an element.

In addition to supporting expression holes as element names and as content of elements, expressions can also be used in place of attribute values:

Dim linkXml = _
       <link updatedDate=<%=Now()%>>http://www.linqinaction.net/</link>

The addition of XML literals provides an intuitive syntax for creating XML in Visual Basic. Rather than having to learn the details of an XML API, XML literals allow programmers to embed XML directly within their code. Under the covers, the Visual Basic compiler converts the XML literals into the corresponding LINQ to XML API calls that we discussed earlier in this chapter. This allows XML code created within XML literals to interoperate with code written in languages that don’t support XML literals, such as C#.
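
To make that interoperability concrete, here is how a C# caller might build the same tree as the literal in listing 9.20 using functional construction. This is an approximation written by hand, not the compiler’s actual output; the LiteralEquivalentSketch class and its Build method are hypothetical.

```csharp
using System.Xml.Linq;

public static class LiteralEquivalentSketch
{
    // Each parameter plays the role of one expression hole in the literal.
    public static XElement Build(string title, string author, string publisher)
    {
        return new XElement("book",
            new XElement("title", title),
            new XElement("author", author),
            new XElement("publisher", publisher));
    }
}
```

Calling Build("NHibernate in Action", "Pierre Kuate", "Manning") yields an XElement with the same structure as the one produced by the VB literal, and either one can be passed to code written in the other language.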

Now that we’ve covered how to create XML using functional construction, as well as Visual Basic’s XML literals, let’s move on to look at how we can create full XML documents using LINQ to XML’s XDocument class.

9.5.5. Creating XML documents

When working with XDocument objects, you’ll find yourself in familiar territory. All of the methods that we’ve talked about thus far, within the context of elements, apply equally to XDocument. The main difference between the two is what is considered allowable content. When working with XElement objects, we allow XElement objects, XAttribute objects, XText, IEnumerable, and strings to be added as content. XDocument allows the following to be added as child content:

  • One XDocumentType for the DTD.
  • One XDeclaration object, which allows you to specify the pertinent parts of an XML declaration: the XML version, the encoding of the document, and whether the XML document is standalone.
  • Any number of XProcessingInstruction objects. A processing instruction conveys information to an application that processes the XML.
  • One XElement object. This is the root node of the XML document.
  • Any number of XComment objects. Comments added at this level are siblings of the root element and may appear before or after it; only the XML declaration, when present, must come first in the serialized document.

In most usage scenarios, XML documents will be created using the functional construction pattern, as shown in listing 9.22.

Listing 9.22. Create an XML document using the XDocument class and functional construction
XDocument doc = new XDocument(
  new XDeclaration("1.0", "utf-8", "yes"),
  new XProcessingInstruction("xml-stylesheet", "friendly-rss.xsl"),
  new XElement("rss",
    new XElement("channel", "my channel")
  )
);
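
A document built this way can be inspected and serialized as in the following sketch. The DocumentSketch class is hypothetical, and the processing instruction data is an assumed example value. Note that the declaration is held in the Declaration property and is not one of the document’s child nodes.

```csharp
using System.IO;
using System.Xml.Linq;

public static class DocumentSketch
{
    public static XDocument Build()
    {
        return new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            new XProcessingInstruction("xml-stylesheet",
                "href='friendly-rss.xsl' type='text/xsl'"),
            new XElement("rss",
                new XElement("channel", "my channel")));
    }

    // ToString() on an XDocument omits the XML declaration; Save writes it.
    // When saving to a StringWriter, the encoding in the written declaration
    // reflects the writer (UTF-16) and not the value given to XDeclaration.
    public static string Serialize(XDocument doc)
    {
        using (StringWriter writer = new StringWriter())
        {
            doc.Save(writer);
            return writer.ToString();
        }
    }
}
```

The document returned by Build has two child nodes, the processing instruction and the rss root element.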

Now that we’ve constructed our initial XDocument, let’s talk about some of the classes that we may use during its construction.

XDeclaration

The XDeclaration class represents an XML declaration. An XML declaration is used to declare the version and encoding of the document, as well as to indicate whether the XML document is standalone.[2] As such, the XDeclaration class has the following constructor:

public XDeclaration(string version, string encoding, string standalone)

2 A standalone XML document does not rely on information from external sources, such as external DTDs, for its content.

The XDeclaration class can be constructed using an existing XDeclaration or XmlReader. When an existing XmlReader is passed into the constructor, the XML declaration from the XmlReader is read into the XDeclaration. In order for the XML declaration to be read from the XmlReader, it must be positioned on the XML declaration. If the XmlReader is not positioned on an XML declaration, an InvalidOperationException will be thrown.

XProcessingInstruction

The second class that becomes relevant when we start working with XML documents is XProcessingInstruction. The XProcessingInstruction class represents an XML processing instruction. Processing instructions convey information to an application that processes the XML. Like the XDeclaration class, the XProcessingInstruction class can be constructed with an existing XmlReader instance. Another way an XProcessingInstruction can be constructed is via the following constructor:

public XProcessingInstruction(string target, string data)

One of the most common uses of XML processing instructions is to indicate what XSLT stylesheet should be used to display an XML document. For example, to display a human-readable page when visitors click the RSS feed for your blog, you may want to add an xml-stylesheet processing instruction to tell browsers to apply a custom XSL stylesheet when displaying the raw XML feed, as in listing 9.23.

Listing 9.23. Create an XML document with an XML stylesheet processing instruction
XDocument d = new XDocument(
  new XProcessingInstruction("xml-stylesheet",
    "href='http://iqueryable.com/friendly-rss.xsl' type='text/xsl' " +
    "media='screen'"),
  new XElement("rss", new XAttribute("version", "2.0"),
    new XElement("channel",
      new XElement("item", "my item")
    )
  )
);

As we can see, the process of adding XML processing instructions to XML documents is easy as pie when we have the powerful functional construction capabilities offered by LINQ to XML at our disposal. Let’s now move on to the next class that may be necessary for the XML document you’re creating, XDocumentType.

XDocumentType

The XDocumentType class represents an XML document type definition. When constructing XML, we can use a DTD to define the rules for the document, such as what elements are present, as well as the relationships that exist between elements. Like every other class we’ve talked about in this section, XDocumentType has one constructor that allows it to be constructed from an XmlReader and another that gives it the freedom to be created without an XmlReader object. Here is the constructor definition:

public XDocumentType(string name, string publicId,
                     string systemId, string internalSubset)

To see an example of the XDocumentType class, let’s create a valid HTML page using LINQ to XML. For an HTML document to be considered valid, it must declare the version of HTML that is used in the document; we’ll do so using an XDocumentType object. When we’re finished we’ll end up with the following HTML document:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <body>This is the body!</body>
</html>

To create this HTML, we can create a new XDocument object using functional construction and pass along an XDocumentType, as well as an XElement as in listing 9.24.

Listing 9.24. Create an HTML document with a document type via the XDocumentType class
XDocument html = new XDocument(
  new XDocumentType("HTML", "-//W3C//DTD HTML 4.01//EN",
                    "http://www.w3.org/TR/html4/strict.dtd", null),
  new XElement("html",
    new XElement("body", "This is the body!")
  )
);
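
Once the document exists, the DTD information can be read back through the DocumentType property. The following sketch wraps the construction above in a hypothetical DocTypeSketch class and adds a small Describe helper for illustration.

```csharp
using System.Xml.Linq;

public static class DocTypeSketch
{
    public static XDocument Build()
    {
        return new XDocument(
            new XDocumentType("HTML", "-//W3C//DTD HTML 4.01//EN",
                "http://www.w3.org/TR/html4/strict.dtd", null),
            new XElement("html",
                new XElement("body", "This is the body!")));
    }

    // DocumentType is null when the document has no DTD.
    public static string Describe(XDocument doc)
    {
        XDocumentType type = doc.DocumentType;
        return type == null ? "(no DTD)" : type.Name + " " + type.PublicId;
    }
}
```

For the document returned by Build, Describe yields the name HTML followed by the public identifier; for a document created without an XDocumentType it yields the placeholder text.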

Now that we’ve seen how to add XML document type declarations to our XML documents, let’s finish off our discussion of the classes that we’ll use when creating XML documents by looking at how we can include XML comments.

XComment

Like the comments that we place within our C# and VB.NET code, XML comments can be added to XML documents to provide an explanation of what is contained within the document. An XML comment can be constructed with a string or by reading the XML comment that an XmlReader is currently positioned on. The XComment class is not limited to use within XDocument objects, but we don’t see it as important so we’ve buried it down here where nobody except you will ever see it. After all, XML is supposed to be human readable; why should we need comments?
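
For completeness, here is a small sketch of an XComment constructed from a string and placed inside an element. The CommentSketch class and the comment text are hypothetical.

```csharp
using System.Xml.Linq;

public static class CommentSketch
{
    public static XElement Build()
    {
        return new XElement("books",
            new XComment("Manning titles"),
            new XElement("book", "LINQ in Action"));
    }
}
```

Serialized without formatting, the element reads <books><!--Manning titles--><book>LINQ in Action</book></books>, and the comment text is available through the Value property of the XComment.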

Now that we’ve covered how to use LINQ to XML to create XML from scratch using XElement and XDocument, it’s time to look at how we can update and modify the XML we’ve created.

9.5.6. Adding content to XML

LINQ to XML provides a full set of methods for manipulating XML. Let’s start by looking at how we can insert new elements and attributes into an existing XElement.

After loading or constructing an XElement, you may want to add additional child items to the element. The Add method allows content to be added to an existing XElement. The definition of the Add method is similar in nature to the constructors offered by XElement. It provides two overloads. The first overload takes in a single object, while the second allows a variable number of items to be added as content. The following are the two overloads for Add:

public void Add(object content)
public void Add(params object[] content)

These two overloads on Add allow content to be added using the functional construction pattern we discussed in section 9.5.3. To add a single child element to an existing XElement, we can use the following code:

XElement book = new XElement("book");
book.Add(new XElement("author", "Dr. Seuss"));

Of course the content parameter can be anything that is allowable as a child of XElement. We can add an attribute to our XElement by passing an XAttribute to our Add method instead of an XElement.

XElement book = new XElement("book");
book.Add(new XAttribute("publicationDate", "October 2005"));

As shown in listing 9.25, we can also use the second overload that accepts a variable number of objects to assign as content.

Listing 9.25. Add content to an XElement using the Add method
XElement books = new XElement("books");
books.Add(new XElement("book",
    new XAttribute("publicationDate", "May 2006"),
    new XElement("author", "Chris Sells"),
    new XElement("title", "Windows Forms Programming")
  )
);

It’s also important to note that Add will properly handle content that implements IEnumerable. When a content item that implements IEnumerable is passed to the Add method, each item within the IEnumerable is recursively added to the XElement. This allows LINQ queries to be used to construct XML, since the standard query operators, as well as all of the XML query axis methods provided by LINQ to XML, return IEnumerable<XElement>. In chapter 10, we’ll discuss the querying capabilities of LINQ to XML in depth and show how the functional construction pattern of creating XML can be combined with query expressions to create and transform XML. For the time being, let’s look at the following example, which shows how we can leverage the support for IEnumerable within the Add method to add all the child elements in an existing XML document to an XElement.

XElement existingBooks = XElement.Load("existingBooks.xml");
XElement books = new XElement("books");
books.Add(existingBooks.Elements("book"));
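
As a preview of what chapter 10 covers, the sequence passed to Add can itself be a query. The sketch below assumes each book element carries a year attribute and a title child element; that shape, along with the AddFromQuerySketch class, is hypothetical.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class AddFromQuerySketch
{
    // Copies the titles of books published in or after minYear into a
    // new books element. Assumes every book has a year attribute.
    public static XElement CopyRecent(XElement existingBooks, int minYear)
    {
        XElement books = new XElement("books");
        books.Add(
            from b in existingBooks.Elements("book")
            where (int)b.Attribute("year") >= minYear
            select new XElement("book", (string)b.Element("title")));
        return books;
    }
}
```

Because the query result implements IEnumerable, Add enumerates it and appends one book element per match.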

By default, when an item is added to an XElement, it is added as the last child of the element. If the content being added is an XElement, the element is added as the last child element. If the content is an XAttribute, the attribute is the last attribute defined within the element. If this isn’t the behavior you’re after, don’t worry. XElement offers several alternate methods. To add the child as the first child, you can call AddFirst. Alternatively, if you know precisely where you want the element placed, you can navigate to an element and call AddAfterSelf or AddBeforeSelf. For example, to add a book element as the second child of our books XElement, we can do the following:

XElement newBook = new XElement("book", "LINQ in Action");
XElement firstBook = books.Element("book");
firstBook.AddAfterSelf(newBook);
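
The placement methods can be compared side by side. In the following sketch (the AddOrderSketch class and the single-letter titles are hypothetical), the comments track the order of the book elements after each call.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class AddOrderSketch
{
    public static string[] Titles()
    {
        XElement books = new XElement("books",
            new XElement("book", "B"));

        books.AddFirst(new XElement("book", "A"));      // A, B
        books.Add(new XElement("book", "D"));           // A, B, D

        XElement last = books.Elements("book").Last();
        last.AddBeforeSelf(new XElement("book", "C"));  // A, B, C, D

        return books.Elements("book").Select(b => b.Value).ToArray();
    }
}
```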

The AddFirst, AddAfterSelf, and AddBeforeSelf methods all provide the same two overloads as Add, and they all process the content parameter in the same way. As you explore the LINQ to XML API, you’ll see that it has been designed to work the same way throughout. Rather than finding unexpected behavior when exploring new methods, you’ll find that they work just as you would expect.

Now that we’ve figured out how to add content to our XML, let’s look at how we can remove it.

9.5.7. Removing content from XML

XElement provides several methods for removing child content. The most straightforward approach is to navigate to the item to be deleted and call Remove. Remove works over a single element as well as with an IEnumerable. Calling it on an IEnumerable will remove all elements within the IEnumerable with a single call. In listing 9.26, we show how to delete a single book element, as well as how to remove all book elements.

Listing 9.26. Removing one or many elements from an XElement with Remove
books.Element("book").Remove();  // remove the first book
books.Elements("book").Remove(); // remove all books

Although not as straightforward, the SetElementValue method on XElement can also be used to remove elements. To remove an element using SetElementValue, pass null as the parameter.

books.SetElementValue("book", null);
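
Keep in mind that SetElementValue works on the first child element with the given name, so passing null removes only that one element. A minimal sketch, using a hypothetical SetNullSketch class:

```csharp
using System.Xml.Linq;

public static class SetNullSketch
{
    public static XElement RemoveFirstBook()
    {
        XElement books = new XElement("books",
            new XElement("book", "LINQ in Action"),
            new XElement("book", "Ajax in Action"));

        // Removes the first book child; later book elements are untouched.
        books.SetElementValue("book", null);
        return books;
    }
}
```

After the call, the books element still contains the Ajax in Action book. To remove every matching element, the Remove extension method over Elements("book") shown in listing 9.26 is the better fit.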

If you’re interested in keeping your element around but removing all of its content, you can use the Value property. To delete the content of the author element (“Steve Eichert”) in the following XML:

<books>
  <book>
    <author>Steve Eichert</author>
  </book>
</books>

You can navigate to the element and then set the Value property to an empty string.

books.Element("book").Element("author").Value = String.Empty;

Which results in the following XML:

<books>
  <book>
    <author></author>
  </book>
</books>
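
If an empty-element tag is preferred over a start and end tag pair, the RemoveNodes method on XContainer is an alternative. The following sketch contrasts the two; the ClearContentSketch class is hypothetical.

```csharp
using System.Xml.Linq;

public static class ClearContentSketch
{
    // Setting Value to an empty string leaves empty text content behind.
    public static XElement ClearWithValue()
    {
        XElement author = new XElement("author", "Steve Eichert");
        author.Value = string.Empty;
        return author;   // serializes as <author></author>
    }

    // RemoveNodes leaves the element with no content at all.
    public static XElement ClearWithRemoveNodes()
    {
        XElement author = new XElement("author", "Steve Eichert");
        author.RemoveNodes();
        return author;   // serializes as <author />
    }
}
```

The difference shows up in the IsEmpty property as well: it is false after setting Value to an empty string and true after RemoveNodes.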

Several of the methods mentioned in this section can also be used to update XML. We explore their use within that context next. Before moving on, let’s take a deep breath. We’ve been covering a lot of ground and realize that you may be getting tired. Luckily we only have two more sections before we’re finished with our introduction to the LINQ to XML API.

Let’s move on to take a look at how we can update XML content using LINQ to XML.

9.5.8. Updating XML content

LINQ to XML offers several alternatives when it comes to updating XML. The most direct approach is to use the SetElementValue method defined on XElement. SetElementValue allows simple content of child elements to be replaced. Let’s replace Steve Eichert as the author of this book with someone a little more prominent. Let’s first take a look at the XML we’ll be updating.

<books>
  <book>
    <title>LINQ in Action</title>
    <author>Steve Eichert</author>
  </book>
</books>

To update the <author/> element, we navigate to the first book element using the Element axis method. Once we’re positioned on the <book/> element, we call SetElementValue and pass the name of the element that we want to update (author), as well as the new value.

XElement books = XElement.Load("books.xml");
books.Element("book").SetElementValue("author", "Bill Gates");

After calling SetElementValue, the value of the author element has been updated to Bill Gates:

<books>
  <book>
    <title>LINQ in Action</title>
    <author>Bill Gates</author>
  </book>
</books>

It’s important to remember that SetElementValue only supports simple content. If we try to pass more advanced content, SetElementValue will attempt to convert the content to a string using the GetStringValue method on XContainer. For example, if we update our code to pass an XElement as the value for our author element instead of the string, like so:

books.Element("book").SetElementValue("author", new XElement("foo"));

we’ll end up with an exception being thrown by XContainer, since it does not accept anything that inherits from XObject to be used as content.
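
The failure can be demonstrated with a few lines. In this sketch, the SetValueErrorSketch class is hypothetical, and catching ArgumentException reflects the behavior of current .NET implementations as far as we’re aware; treat the exact exception type as an assumption.

```csharp
using System;
using System.Xml.Linq;

public static class SetValueErrorSketch
{
    // Returns true if SetElementValue rejects an XElement as the new value.
    public static bool RejectsElementValue()
    {
        XElement book = new XElement("book",
            new XElement("author", "Steve Eichert"));
        try
        {
            book.SetElementValue("author", new XElement("foo"));
            return false;
        }
        catch (ArgumentException)
        {
            // Assumed exception type; the existing author is left unchanged.
            return true;
        }
    }
}
```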

To handle more complex content, the ReplaceNodes method that is defined on XContainer should be used. ReplaceNodes supports passing in all different types of content and provides overloads for passing in a variable number of content items. If we update our code to use ReplaceNodes instead of SetElementValue, we end up with the results we’re after. The following code:

books.Element("book").Element("author").ReplaceNodes(new XElement("foo"));

results in

<books>
  <book>
    <title>LINQ in Action</title>
    <author>
      <foo/>
    </author>
  </book>
</books>

Calling ReplaceNodes on an XElement results in all existing content being removed and the content parameter passed to ReplaceNodes being added. The content parameter can be any valid child element of XElement, as well as an IEnumerable. If an IEnumerable is encountered, each item in the enumeration is added as a child content item. ReplaceNodes also has an overload that accepts a variable number of content parameters. This allows multiple child content items to be used as the replacement for the existing content. If we want to replace the entire contents of a <book/> element with a new set of child elements, we can use listing 9.27.

Listing 9.27. Replacing the contents of a <book/> element with new content
books.Element("book").ReplaceNodes(
  new XElement("title", "Ajax in Action"),
  new XElement("author", "Dave Crane")
);
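The IEnumerable case mentioned above can be sketched in the same way. In this hedged example, the class ReplaceNodesDemo and the method ReplaceAuthors are hypothetical names; the point is only that a sequence passed to ReplaceNodes is expanded, so each item becomes a child of the element.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ReplaceNodesDemo
{
    // Replaces everything inside a <book/> element with one <author/>
    // element per name in the array.
    public static XElement ReplaceAuthors(string[] authorNames)
    {
        XElement book = new XElement("book",
            new XElement("title", "LINQ in Action"),
            new XElement("author", "Steve Eichert"));

        // The IEnumerable<XElement> is expanded: each item becomes a child.
        // All previous content, including <title/>, is removed.
        book.ReplaceNodes(
            authorNames.Select(name => new XElement("author", name)));
        return book;
    }

    public static void Main()
    {
        XElement book = ReplaceAuthors(
            new[] { "Jim Wooley", "Fabrice Marguerie" });
        Console.WriteLine(book);
    }
}
```

Note that the title element disappears along with the original author, since ReplaceNodes replaces all of the element's content rather than only the nodes that match what is passed in.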

Both SetElementValue and ReplaceNodes operate over the content of an element. If you need to replace an entire node rather than update its contents, you can use the ReplaceWith method defined on XNode. ReplaceWith operates over the element itself, rather than its content. This allows entire elements to be replaced. For example, if we want to replace all the <title/> elements within our XML file with <book_title/> elements, we could use listing 9.28.

Listing 9.28. Replace an entire node with ReplaceWith
var titles = books.Descendants("title").ToList();
foreach(XElement title in titles) {
  title.ReplaceWith(new XElement("book_title", (string) title));
}

In the listing, we first select all the <title/> elements using the Descendants axis method (which we’ll discuss in the next chapter). Once we have all of the elements, we loop over each element and call the ReplaceWith method, passing a new XElement with book_title as its XName and the value of the current element as the value. This results in all the <title/> elements being replaced with <book_title/> elements.
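A self-contained version of this loop might look like the following sketch. The class and method names are hypothetical, and the two-book tree is made up for the example. The call to ToList is worth noting: it copies the query results into a list before the loop starts, so the tree is not modified while the lazy Descendants query is still being enumerated.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ReplaceWithDemo
{
    // Replaces every <title/> element with a <book_title/> element
    // that carries the same text.
    public static XElement RenameTitles()
    {
        XElement books = new XElement("books",
            new XElement("book", new XElement("title", "LINQ in Action")),
            new XElement("book", new XElement("title", "Ajax in Action")));

        // Materialize the matches first, then modify the tree.
        foreach (XElement title in books.Descendants("title").ToList())
        {
            title.ReplaceWith(new XElement("book_title", (string)title));
        }
        return books;
    }

    public static void Main()
    {
        Console.WriteLine(RenameTitles());
    }
}
```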

As we’ve seen, when updating XML, we have several options at our disposal. ReplaceWith allows entire nodes to be replaced and is ideal for scenarios where we need to replace all instances of a given element with a new element. SetElementValue and ReplaceNodes offer us the ability to replace the contents of elements. SetElementValue is only meant for simple content, while ReplaceNodes supports more advanced content.

Throughout the last several sections, we’ve focused on how to add, delete, and update XML with a strong focus on elements. Since elements are the fundamental building block that we use to build XML, it’s understandable that they’ve received most of our attention. Now that we have a firm grasp of how to work with elements, we need to look into how we can annotate our elements with attributes. After all, attributes are used in the majority of XML documents today. If we’re going to be able to create and read real-world XML documents, we’ll need to understand how LINQ to XML makes that possible. In the next section, we provide a complete rundown of how to deal with attributes when working with LINQ to XML.

9.5.9. Working with attributes

The XAttribute class is used to represent an attribute within LINQ to XML. Unlike in earlier XML APIs such as the DOM, attributes are not nodes: XAttribute does not derive from XNode, so it sits outside the node hierarchy that elements belong to. In LINQ to XML, attributes are simply name-value pairs. As such, it’s not surprising to find a constructor that allows XAttribute objects to be constructed with a name and value.

public XAttribute(XName name, object value)

During the creation of XML, we can include attributes within our XML by passing them as one of the parameters to the functional construction statements and/or the Add method. To create a book element with a publication date attribute, we can either add the attribute during construction time:

new XElement("book", new XAttribute("pubDate", "July 31, 2006"));

or we can add the attribute after the fact by calling Add and passing the attribute as the content parameter.

book.Add(new XAttribute("pubDate", "July 31, 2006"));

In either case, we end up with the following XML:

<book pubDate="July 31, 2006"/>

In addition to the Add method, we can add attributes to elements with SetAttributeValue, which is similar to the SetElementValue method we discussed earlier. SetAttributeValue adds or updates an attribute on an existing XElement: if the attribute already exists on the element, it is updated; if it doesn’t, it is added. To update the pubDate attribute, we can call SetAttributeValue.

book.SetAttributeValue("pubDate", "October 1, 2006");

Again, like its closely related friend SetElementValue, SetAttributeValue can also be used to remove attributes by passing null as the value parameter. In addition to allowing attributes to be removed with SetAttributeValue, the XAttribute class has a Remove method.

book.Attribute("pubDate").Remove();

Remove can be called on a single XAttribute as well as on an IEnumerable<XAttribute>. Calling Remove on the latter results in all the attributes within the IEnumerable being removed from their associated elements.
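The attribute operations described in this section can be pulled together into one short sketch. The class AttributeDemo, its methods, and the format attribute are hypothetical names used only for illustration; the calls themselves (SetAttributeValue, Attributes, and the Remove extension method) are the ones discussed above.

```csharp
using System;
using System.Xml.Linq;

public static class AttributeDemo
{
    // Creates a <book/> with one attribute, then updates it and adds another.
    public static XElement AddAndUpdate()
    {
        XElement book = new XElement("book",
            new XAttribute("pubDate", "July 31, 2006"));

        // pubDate exists, so this call updates it.
        book.SetAttributeValue("pubDate", "October 1, 2006");

        // "format" (a hypothetical attribute) does not exist, so it is added.
        book.SetAttributeValue("format", "paperback");
        return book;
    }

    // Removes one attribute by passing null, then removes whatever is left.
    public static XElement RemoveAll(XElement book)
    {
        book.SetAttributeValue("format", null);

        // Remove on IEnumerable<XAttribute> detaches every attribute
        // in the sequence from its element.
        book.Attributes().Remove();
        return book;
    }

    public static void Main()
    {
        XElement book = AddAndUpdate();
        Console.WriteLine(book);
        Console.WriteLine(RemoveAll(book));
    }
}
```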

As you can see, the way we work with attributes within LINQ to XML closely parallels how we work with elements. The key difference is that XAttribute objects are not nodes in the element tree, but are name-value pairs associated with an XML element.

We’ve gotten to a point where we can create XML from scratch using functional construction as well as manipulate that XML in all ways possible. As we continue, we’ll likely want to figure out how our modified XML can be saved. Lucky for us, that’s the focus of our next section.

9.5.10. Saving XML

The process of saving XML is extremely straightforward. The XElement and XDocument classes provide a Save method that will save your XML to a file, a TextWriter, or an XmlWriter. To save an XElement to disk, we can call Save and pass a file path as a parameter, as in listing 9.29.

Listing 9.29. Saving an XElement to disk with the Save method
XElement books = new XElement("books",
  new XElement("book",
    new XElement("title", "LINQ in Action"),
    new XElement("author", "Steve Eichert"),
    new XElement("author", "Jim Wooley"),
    new XElement("author", "Fabrice Marguerie")
  )
);
books.Save(@"c:\books.XML");

That’s it! Well, not entirely; you do have the ability to disable formatting of the XML during save by passing SaveOptions.DisableFormatting as a second parameter to the Save method, but it doesn’t get much simpler than that, does it?
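To see what SaveOptions.DisableFormatting changes, it can help to save to a TextWriter and inspect the result. The following is a rough sketch under a few assumptions: SaveDemo and its methods are hypothetical names, and the file is written to the system temporary directory rather than to a fixed path.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

public static class SaveDemo
{
    // Saves the element to an in-memory TextWriter and returns the text.
    public static string SaveToString(XElement element, SaveOptions options)
    {
        using (StringWriter writer = new StringWriter())
        {
            element.Save(writer, options);
            return writer.ToString();
        }
    }

    // Saves the element to a file without formatting and loads it back.
    // Assumption: the temp directory is writable; "books.xml" is arbitrary.
    public static XElement RoundTrip(XElement element)
    {
        string path = Path.Combine(Path.GetTempPath(), "books.xml");
        element.Save(path, SaveOptions.DisableFormatting);
        return XElement.Load(path);
    }

    public static void Main()
    {
        XElement books = new XElement("books",
            new XElement("book",
                new XElement("title", "LINQ in Action")));

        // Default: child elements are indented on separate lines.
        Console.WriteLine(SaveToString(books, SaveOptions.None));

        // DisableFormatting: elements are written with no added whitespace.
        Console.WriteLine(SaveToString(books, SaveOptions.DisableFormatting));

        Console.WriteLine(RoundTrip(books));
    }
}
```

Disabling formatting mainly matters when the extra whitespace is unwanted, for example when the output is consumed by another program rather than read by a person.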

Now that we can save our XML, we’ve come full circle with the LINQ to XML programming API. We’ve covered how to load and parse XML from files, URLs, and text, as well as how to create XML using functional construction. Additionally, we’ve covered how we can use the imperative update methods available on XElement and XAttribute (such as Add, SetElementValue, and Remove) to add, update, and delete XML. We finished by looking at how we can use the Save method on XElement and XDocument to save our XML to a file. While we haven’t covered every detail of every class, we’ve covered the major classes and methods that will allow you to start building applications with LINQ to XML. As with any new technology, the best way to learn the intricacies of the LINQ to XML programming API is to start writing applications that use it today.

9.6. Summary

LINQ to XML builds on the infrastructure provided by LINQ to allow XML to be queried using the standard query operators. LINQ to XML provides several XML axis methods that make retrieving elements or attributes easy. While the query capabilities offered by LINQ to XML are significant, just as significant if not more so is the LINQ to XML programming API. It provides a much better programming experience for developers working with XML and has an intuitive API that makes building applications that use XML simpler and more enjoyable.

The LINQ to XML programming API is a new lightweight XML API that was designed for LINQ. It builds on the language innovations brought by LINQ and introduces several new key concepts such as functional construction, context-free XML creation, and simplification of XML names. While Microsoft could have retrofitted existing XML APIs to work with LINQ, creating a new API designed and tuned specifically for LINQ has resulted in an API that makes working with XML productive and enjoyable.

At the heart of the LINQ to XML class hierarchy is XElement. It is the fundamental class that you work with in LINQ to XML. In addition to XElement, the XAttribute, XDocument, and XName classes are prominent. These core classes, as well as the rest of the programming API, have been designed with the programmer in mind, and as such provide an intuitive API for loading, parsing, creating, updating, and saving XML.

Now that we’ve introduced you to LINQ to XML and provided a detailed overview of the XML class hierarchy and programming API, it’s time to move on to a detailed discussion of querying and transforming XML using LINQ to XML. We do that in the next chapter.
