Chapter 18
Extracting Data from XML

In This Chapter

Loading XML Documents

Querying XML Documents

Loading XML from a String

Handling Missing Data

Using Query Expressions with XML Data

Annotating Nodes

“Don’t worry. If we screw up they will shoot me first. If I go down stop sucking on my face and run.”

–Jackson Wayfare

Until now, XPath has been the de facto standard for querying Extensible Markup Language (XML). LINQ to XML offers an alternative way to query XML. Each approach has its strengths, but LINQ to XML may be easier if you don’t already know XPath.

The key to understanding LINQ to XML is that the basic query structure, grammar, and elements of a LINQ query don’t change just because the source is XML. As with LINQ to SQL, you have to extract the underlying data items (documents, elements, attributes, and comments), but these all fit in the basic structure of LINQ. In short, everything you have learned thus far is relevant for LINQ to XML. You only need to layer in knowledge about the classes that represent the structure of an XML document and you are up and running.

LINQ to XML supports reading, modifying, transforming, and saving changes to XML documents, as XPath does. This chapter begins with loading and querying documents, handling missing data, annotating nodes, and validating data. The additional capabilities of LINQ to XML are covered in the remaining chapters of this book.

Loading XML Documents

When working with LINQ to Objects, the data is in memory. When working with LINQ to SQL, you need to connect to a data provider (SQL Server) and use a DataContext to get the data in memory. For LINQ to XML, you need to load an XML document to get the data in memory. An XML document can be loaded with the static Load method of System.Xml.Linq.XDocument or System.Xml.Linq.XElement, or it can be represented as a literal string and then converted to XML using XElement.Parse.

Although the following example is a little morbid, assume you have an XML document that tracks a lifetime of family pets called PetCemetary.xml. You can use the static method XDocument.Load("PetCemetary.xml") to load the entire XML document, which you then query starting at the root, or you can use XElement.Load, which returns the root element itself so that queries can start directly at the root’s children.

XDocument.Load and XElement.Load are overloaded to accept a string filename, a TextReader, or an XmlReader. One overload, Load(string, LoadOptions), loads the XML from a file, and the LoadOptions argument lets you preserve whitespace, set the base uniform resource identifier (URI), or request line information from the XmlReader. LoadOptions is defined with the FlagsAttribute, which means the enumeration is a set of flags; that is, you can combine one or more values. The members of the LoadOptions enumeration are None, PreserveWhitespace, SetBaseUri, and SetLineInfo. The following sections demonstrate XDocument.Load and XElement.Load.
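
Because LoadOptions is a flags enumeration, values can be combined with the bitwise OR operator. The following is a minimal sketch, not one of the chapter’s listings: it uses XElement.Parse, which accepts the same LoadOptions argument, on a small in-memory sample (the class name LoadOptionsSketch and the sample text are assumptions made for illustration) and then reads back the line information that SetLineInfo records.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

static class LoadOptionsSketch
{
    // Hypothetical sample text; not the chapter's PetCemetary.xml file.
    const string Text = "<pets>\n  <pet>\n    <name>Duke</name>\n  </pet>\n</pets>";

    // Parses Text with two LoadOptions flags combined and
    // returns the source line on which the <name> element starts.
    public static int NameLine()
    {
        XElement pets = XElement.Parse(
            Text, LoadOptions.PreserveWhitespace | LoadOptions.SetLineInfo);

        // Line information is exposed through the IXmlLineInfo interface.
        IXmlLineInfo info = pets.Element("pet").Element("name");
        return info.HasLineInfo() ? info.LineNumber : -1;
    }

    static void Main()
    {
        Console.WriteLine("name element starts on line {0}", NameLine());
    }
}
```

Line information of this kind is mostly useful for error messages that point a user back to the offending spot in the source file.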

Querying XML Documents

All of the techniques and underpinnings of LINQ that you have learned in previous chapters apply to this part and to this chapter. The primary difference is that the sequences (or collections, if you prefer) come from XML documents, nodes, attributes, and objects, and these things have their own object representation in the framework.

In this section, you will learn how to use XDocument and XElement, how to manage attributes, and how to write queries against XML documents using these features of the System.Xml.Linq namespace.

Using XDocument

System.Xml.Linq.XDocument inherits from XContainer. XDocument represents a complete XML document for our purposes. After you have an XDocument—you will use the static Load method in the example—you can perform LINQ queries against the contents of the document.

Listing 18.1 loads the XML document called PetCemetary—an homage to one of my favorite authors, Stephen King, and a history of all of my feline and canine pets over the years. The XML document contains a root node <pets> and each immediate descendant is a <pet>—one of my pets.

Listing 18.1 A LINQ to XML Example That Loads an XML Document and Queries the Pets Defined in That Document

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace XDocumentDemo
{
  class Program
  {
    static void Main(string[] args)
    {
      // The verbatim string (@) keeps the backslashes in the relative path literal.
      XDocument xml = XDocument.Load(@"..\..\PetCemetary.xml");
      var pets = from pet in xml.Elements("pets").Elements("pet")
                 select pet;

      Array.ForEach(pets.ToArray(), p => Console.WriteLine(p.Element("name").Value));
      Console.ReadLine();
    }
  }
}

In the example, the XML document is loaded via the static method XDocument.Load. The very next line is a LINQ query. Note that the structure of the query is consistent with previous examples, including the from clause, the range variable and input sequence, and the select clause. The sequence is represented in the example by xml.Elements("pets").Elements("pet"), which, when evaluated, returns a collection of pet nodes. The Array.ForEach statement iterates through each pet node, and the lambda expression requests the name element and prints its value. Listing 18.2 contains the complete listing of the sample XML file.

Listing 18.2 The Sample PetCemetary.xml XML File Contents

<?xml version="1.0" encoding="utf-8" ?>
<pets>
  <pet>
    <id>1</id>
    <name>Duke</name>
    <species>Great Dane</species>
    <sex>Male</sex>
    <startYear>1968</startYear>
    <endYear>1970</endYear>
    <causeOfDeath>Suicide</causeOfDeath>
    <specialQuality>Big and goofy</specialQuality>
  </pet>
  <pet>
    <id>2</id>
    <name>Dog</name>
    <species>Some Kind of Cat</species>
    <sex>Female</sex>
    <startYear>1972</startYear>
    <endYear>1974</endYear>
    <causeOfDeath>Car</causeOfDeath>
    <specialQuality>Best mouser</specialQuality>
  </pet>
  <pet>
    <id>3</id>
    <name>Sam</name>
    <species>Labrador</species>
    <sex>Female</sex>
    <startYear>1973</startYear>
    <endYear>1980</endYear>
    <causeOfDeath>Old Age</causeOfDeath>
    <specialQuality>Great hunting dog</specialQuality>
  </pet>
  <pet>
    <id>4</id>
    <name>Hogan</name>
    <species>Yellow Lab Mix</species>
    <sex>Male</sex>
    <startYear>1994</startYear>
    <endYear>2004</endYear>
    <causeOfDeath>Seizure</causeOfDeath>
    <specialQuality>A very good dog</specialQuality>
  </pet>
  <pet>
    <id>5</id>
    <name>Leda</name>
    <species>Chocolate Labrador</species>
    <sex>Female</sex>
    <startYear>2004</startYear>
    <endYear></endYear>
    <causeOfDeath></causeOfDeath>
    <specialQuality>Thumper</specialQuality>
  </pet>
  <pet>
    <id>6</id>
    <name>Po</name>
    <species>Toy Poodle</species>
    <sex>Female</sex>
    <startYear>2003</startYear>
    <endYear>2004</endYear>
    <causeOfDeath>Lethal Injection</causeOfDeath>
    <specialQuality>Mental</specialQuality>
  </pet>
  <pet>
    <id>7</id>
    <name>Big Mama</name>
    <species>Tabby</species>
    <sex>Female</sex>
    <startYear>1998</startYear>
    <endYear></endYear>
    <causeOfDeath></causeOfDeath>
    <specialQuality>Quarterback</specialQuality>
  </pet>
  <pet>
    <id>8</id>
    <name>Ruby</name>
    <species>Rottweiler</species>
    <sex>Female</sex>
    <startYear>1997</startYear>
    <endYear></endYear>
    <causeOfDeath></causeOfDeath>
    <specialQuality>Big baby</specialQuality>
  </pet>
  <pet>
    <id>9</id>
    <name>Nala</name>
    <species>Maine Coon</species>
    <sex>Female</sex>
    <startYear>2007</startYear>
    <endYear></endYear>
    <causeOfDeath></causeOfDeath>
    <specialQuality>La Freaka</specialQuality>
  </pet>
</pets>

Using XElement

When the document (in Listing 18.1) was loaded with the XDocument class, you had to start requesting elements at the root node level, pets. If you load the document with XElement, as demonstrated in Listing 18.3, you can shorten the chain of element requests—references to Elements.

Listing 18.3 Load the Document With XElement and You Can Request the Desired Elements Directly Without Requesting the Root First

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace XElementDemo
{
  class Program
  {
    static void Main(string[] args)
    {
      // XElement.Load returns the root <pets> element, so the query starts at its children.
      XElement xml = XElement.Load(@"..\..\PetCemetary.xml");
      var pets = from pet in xml.Elements("pet")
                 select pet;

      Array.ForEach(pets.ToArray(), p => Console.WriteLine(p.Element("name").Value));
      Console.ReadLine();
    }
  }
}

Listing 18.3 is almost identical to Listing 18.1. The difference is that Listing 18.3 skipped the Elements("pets") piece of the request chain, shortening the LINQ query a bit.
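
The relationship between the two starting points can be sketched with a small in-memory document. This is an illustrative example rather than one of the chapter’s listings; the class name RootSketch and the two-pet sample text are assumptions. It relies on the XDocument.Root property, which returns the same root element that XElement.Parse (or XElement.Load) hands back directly.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class RootSketch
{
    // Hypothetical two-pet sample; the real file has more elements per pet.
    const string Text =
        "<pets><pet><name>Duke</name></pet><pet><name>Sam</name></pet></pets>";

    // Starts at the document, then steps to the root <pets> element.
    public static string[] NamesViaDocument()
    {
        XDocument doc = XDocument.Parse(Text);
        return doc.Root.Elements("pet")
                  .Select(p => (string)p.Element("name"))
                  .ToArray();
    }

    // Starts at the root element, so the <pets> step is not needed.
    public static string[] NamesViaElement()
    {
        XElement pets = XElement.Parse(Text);
        return pets.Elements("pet")
                   .Select(p => (string)p.Element("name"))
                   .ToArray();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", NamesViaDocument()));
        Console.WriteLine(string.Join(", ", NamesViaElement()));
    }
}
```

Both methods return the same names; the only difference is where the query begins.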

Managing Attributes

The Elements method returns an IEnumerable<XElement>. XElement objects can have attributes (additional values in the element tag) that contain additional information about the tag. In the original PetCemetary.xml file, the pet element has a child element species. In the hierarchy of classifications, genus is next, so you could add a child element genus or add a genus attribute to the species element. The latter would make the species element look like the following:

<species genus="Dog">Labrador</species>

Incorporating information about the genus as an attribute allows you to refine your LINQ queries with that attribute. For example (as shown in Listing 18.4), you can define a temporary range variable genus in a let clause and add a where clause whose predicate keeps only the elements that have a genus equal to Feline.

Listing 18.4 Querying by Attributes of Elements; in the Example, the Resultset Includes Only Pets in the Genus Feline

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace ManagingAttributesDemo
{
  class Program
  {
    static void Main(string[] args)
    {
      XElement xml = XElement.Load(@"..\..\PetCemetary.xml");
      var pets = from pet in xml.Elements("pet")
                 let genus = pet.Element("species").Attribute("genus")
                 // The cast yields null, rather than throwing, when genus is missing.
                 where (string)genus == "Feline"
                 select pet;

      Array.ForEach(pets.ToArray(), p => Console.WriteLine(p.Element("name").Value));
      Console.ReadLine();
    }
  }
}

In the example, the temporary range variable genus is defined in the let clause, so the rest of the query can refer to the attribute by that short name. Using let is useful for long expressions and for those that are used multiple times. The where clause’s predicate indicates that the LINQ query should only return kitty cats.
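
One detail worth seeing in isolation is what happens when an element lacks the attribute being tested. The sketch below is an illustration under assumed sample data (the class name AttributeFilterSketch and the three-pet text are not from the chapter). It casts the XAttribute to string inside the let clause; that explicit conversion returns null for a missing attribute, so the comparison simply fails instead of raising a NullReferenceException.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class AttributeFilterSketch
{
    // Hypothetical sample: the last pet has no genus attribute at all.
    const string Text =
        "<pets>" +
        "<pet><name>Dog</name><species genus='Feline'>Some Kind of Cat</species></pet>" +
        "<pet><name>Sam</name><species genus='Dog'>Labrador</species></pet>" +
        "<pet><name>Duke</name><species>Great Dane</species></pet>" +
        "</pets>";

    public static string[] Felines()
    {
        XElement xml = XElement.Parse(Text);
        var names =
            from pet in xml.Elements("pet")
            // (string) on a null XAttribute gives null instead of throwing.
            let genus = (string)pet.Element("species").Attribute("genus")
            where genus == "Feline"
            select (string)pet.Element("name");
        return names.ToArray();
    }

    static void Main()
    {
        foreach (string name in Felines())
            Console.WriteLine(name);
    }
}
```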

Adding Attributes

You can change attribute values, add new attributes, and remove existing attributes. Attributes are added through the XElement.Add method. Listing 18.5 shows a LINQ query that returns pets without the genus attribute on the species element and adds the attribute. (In the example, all pets missing the genus attribute are assigned the genus attribute and the value “Dog”.)

Listing 18.5 Adding New Attributes to an XElement

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace AddingAttributes
{
  class Program
  {
    static void Main(string[] args)
    {
      string filename = @"..\..\PetCemetary.xml";

      XElement xml = XElement.Load(filename);
      var pets = from pet in xml.Elements("pet")
                 let genus = pet.Element("species").Attribute("genus")
                 where genus == null
                 select pet;

      Array.ForEach(pets.ToArray(), p => Console.WriteLine(p.Element("name").Value));

      // Add the missing genus attribute to each species element found by the query.
      foreach (var pet in pets)
      {
        pet.Element("species").Add(new XAttribute("genus", "Dog"));
      }

      xml.Save(filename);

      Console.ReadLine();
    }
  }
}

Removing Attributes

To remove an attribute, you need to invoke the Remove method of the XAttribute object. As the example in Listing 18.6 shows, navigate to the XElement node containing the target attribute. Use the Attribute method, passing in the name of the attribute; this step returns the XAttribute object. Invoke Remove on the returned object. As demonstrated in the listing, these elements can be strung together in a single statement.

Listing 18.6 Removing an Attribute By Calling the Remove Method of the XAttribute (Attributes Are Added at the XElement Object Level)

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace RemovingAttributes
{
  class Program
  {
    static void Main(string[] args)
    {
      string filename = @"..\..\PetCemetary.xml";

      XElement xml = XElement.Load(filename);
      var pets = from pet in xml.Elements("pet")
                 where pet.Element("name").Value == "Ruby"
                 select pet;

      Array.ForEach(pets.ToArray(), p => Console.WriteLine(p.Element("name").Value));

      // Navigate to the attribute and remove it; this assumes the genus attribute exists.
      foreach (var pet in pets)
      {
        pet.Element("species").Attribute("genus").Remove();
      }

      xml.Save(filename);

      Console.ReadLine();
    }
  }
}
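
Changing an attribute value, which the listings do not show, can be sketched with XElement.SetAttributeValue. This is an alternative to the Add and Remove calls used in Listings 18.5 and 18.6, not the chapter’s own approach: the method adds the attribute if it is absent, overwrites it if it is present, and removes it when passed null. The class name SetAttributeSketch and the value "Canine" are assumptions for illustration.

```csharp
using System;
using System.Xml.Linq;

static class SetAttributeSketch
{
    // Returns the element text after a change and again after a removal,
    // separated by a vertical bar.
    public static string Run()
    {
        XElement species = XElement.Parse("<species>Labrador</species>");

        species.SetAttributeValue("genus", "Dog");     // attribute absent: added
        species.SetAttributeValue("genus", "Canine");  // attribute present: value replaced
        string afterChange = species.ToString();

        species.SetAttributeValue("genus", null);      // null value: attribute removed
        return afterChange + "|" + species.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Unlike Attribute("genus").Remove(), passing null to SetAttributeValue does not fail when the attribute is already missing.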

Loading XML from a String

Sometimes, Visual Basic (VB) gets left out in the cold, as was the case with anonymous delegates, and sometimes VB gets cool features like My and Literal XML. In VB, you can define literal XML in your VB code. That’s cool. In C#, you can define XML in code, but the way you have to do it is to define the XML as a string and then invoke the XElement.Parse method.

Listing 18.7 demonstrates how you can define XML in your code as a string, call XElement.Parse to convert that string to queryable XML, and then use LINQ to query the XML. Listing 18.7 queries all of the pet elements—there is only one in the sample—and then uses Array.ForEach to display the name of each pet.

Listing 18.7 Converting a String Containing XML into Queryable XML with the XElement.Parse Method

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;

namespace XElementParseDemo
{
  class Program
  {
    static void Main(string[] args)
    {
      string xml = "<pets>" +
        "  <pet>" +
        "    <id>2</id>" +
        "    <name>Dog</name>" +
        "    <species>Some Kind of Cat</species>" +
        "    <sex>Female</sex>" +
        "    <startYear>1972</startYear>" +
        "    <endYear>1974</endYear>" +
        "    <causeOfDeath>Car</causeOfDeath>" +
        "    <specialQuality>Best mouser</specialQuality>" +
        "  </pet>" +
        "</pets>";

      XElement elem = XElement.Parse(xml);
      var pets = from pet in elem.Elements("pet")
                 select pet;

      Array.ForEach(pets.ToArray(), p => Console.WriteLine(p.Element("name").Value));
      Console.ReadLine();
    }
  }
}

Handling Missing Data

Some data might not exist in your XML. For example, it’s valid to have an element with no data, and as you read in the previous section, items such as attributes may not be defined at all. To handle these scenarios, you can use a combination of nullable types and checks for empty elements or null strings.

Listing 18.8 is based on the fact that I have a whole house full of pets that aren’t interred in the pet cemetery. The net effect is that the endYear elements for cuddly little critters on this side of the hereafter contain no data.

Listing 18.8 Using Nullable Types and Inline Conditional Checks to Handle Missing Data

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;

namespace HandlingMissingDataDemo
{
  class Program
  {
    static void Main(string[] args)
    {
      XElement elem = XElement.Load(@"..\..\PetCemetary.xml");
      var pets = from pet in elem.Elements("pet")
                 let endYear = pet.Element("endYear")
                 select new
                 {
                   Name = pet.Element("name").Value,
                   StartYear = (int)pet.Element("startYear"),
                   // An empty endYear element carries no year, so project null.
                   EndYear = (int?)(endYear.IsEmpty || endYear.Value.Length == 0
                     ? null : (int?)Convert.ToInt32(endYear.Value))
                 };

      Array.ForEach(pets.ToArray(), p =>
      {
        Console.WriteLine("Name: {0}", p.Name);
        Console.WriteLine("Entered family: {0}", p.StartYear);
        Console.WriteLine("Left family: {0}", p.EndYear);
      });
      Console.ReadLine();
    }
  }
}

In the example, the endYear element is assigned to a range variable of the same name in a let clause. In the projection, the EndYear property is initialized based on an IsEmpty check and a Value.Length == 0 check. If either of these conditions is true, null is assigned to the nullable integer; otherwise, the year converted to an integer is assigned to the EndYear property of the projection.

It is worth noting the values of attributes and elements always originate from the XML as strings. You can project specific types by converting the string values to the desired type, assuming the types are convertible; for example, “1966” can be converted to an integer.
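
The checks described above can be gathered into a small helper. The sketch below is one possible arrangement, not the chapter’s code: the helper name ToNullableInt, the class name, and the three-pet sample are assumptions. It treats an element that is absent altogether the same way as one that is present but blank, and otherwise uses the explicit XElement-to-int conversion.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class MissingDataSketch
{
    // Hypothetical helper: null for a missing or blank element,
    // otherwise the element's content converted to an integer.
    public static int? ToNullableInt(XElement element)
    {
        if (element == null || string.IsNullOrWhiteSpace(element.Value))
            return null;
        return (int)element;  // explicit conversion defined by XElement
    }

    public static int?[] EndYears()
    {
        // Assumed sample: one filled, one empty, and one absent endYear.
        XElement pets = XElement.Parse(
            "<pets>" +
            "<pet><name>Duke</name><endYear>1970</endYear></pet>" +
            "<pet><name>Leda</name><endYear></endYear></pet>" +
            "<pet><name>Nala</name></pet>" +
            "</pets>");

        return pets.Elements("pet")
                   .Select(p => ToNullableInt(p.Element("endYear")))
                   .ToArray();
    }

    static void Main()
    {
        foreach (int? year in EndYears())
            Console.WriteLine(year.HasValue ? year.Value.ToString() : "(none)");
    }
}
```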

Using Query Expressions with XML Data

The query techniques you have learned in earlier chapters apply to LINQ to XML, too. To be thorough and to demonstrate subtle differences in how the XML elements are incorporated in queries, the examples in this section demonstrate some common kinds of queries using Yahoo!’s stock quote feature and XML derived from queries to Yahoo!

To request a quote from Yahoo!, enter a uniform resource locator (URL) similar to the following in your browser:

http://download.finance.yahoo.com/d/?s=goog&f=ncbh

Listing 18.9 shows the XML file created manually by taking the return values from a couple of quotes and using them to structure an XML file. (The layout was arbitrarily chosen to facilitate demonstrating aspects of LINQ to XML, which includes the fictitious namespace.)

Listing 18.9 An Arbitrary XML File Created from Actual Stock Quotes from Yahoo!

<?xml version="1.0" encoding="utf-8" ?>
<sq:Stocks xmlns:sq="http://www.stock_quotes.com">
  <sq:Stock>
    <sq:Symbol>MSFT</sq:Symbol>
    <sq:Price Change="0.6" Low="42.1" High="51.0">56.0</sq:Price>
  </sq:Stock>
  <sq:Stock>
    <sq:Symbol>MVK</sq:Symbol>
    <sq:Price Change="-3.2" Low="22.8" High="32.4">25.5</sq:Price>
  </sq:Stock>
  <sq:Stock>
    <sq:Symbol>GOOG</sq:Symbol>
    <sq:Price Change="8.0" Low="24.4" High="34.5">32.0</sq:Price>
  </sq:Stock>
  <sq:Stock>
    <sq:Symbol>VFINX</sq:Symbol>
    <sq:Price Change="8.0" Low="24.4" High="34.5">32.0</sq:Price>
  </sq:Stock>
  <sq:Stock>
    <sq:Symbol>HDPMX</sq:Symbol>
    <sq:Price Change="8.0" Low="24.4" High="34.5">32.0</sq:Price>
  </sq:Stock>
</sq:Stocks>

Using Namespaces

The first thing you will note about Listing 18.9 is that a namespace was added (the xmlns attribute). Listing 18.10 demonstrates how you can define an XNamespace object and then use that variable as a prefix to all of the XElement requests for XML files that use a namespace.

Listing 18.10 Uses an XNamespace to Manage XML Files—from Listing 18.9 Here—Containing Namespaces

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;

namespace UsingNamespacesWithLinqToXML
{
  class Program
  {
    static void Main(string[] args)
    {
      const string filename = @"..\..\Stocks.xml";
      XElement xml = XElement.Load(filename);
      // The XNamespace value must match the xmlns:sq URI in the file exactly.
      XNamespace sq = "http://www.stock_quotes.com";

      var stocks =
        from stock in xml.Elements(sq + "Stock")
        select new {Name=stock.Element(sq + "Symbol").Value};
      Array.ForEach(stocks.ToArray(),
        o=>Console.WriteLine(o.Name));
      Console.ReadLine();
    }
  }
}
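
It may help to see what the expression sq + "Stock" actually produces. The following sketch (the class name XNameSketch is an assumption) shows that adding a string to an XNamespace yields an XName made of a namespace part and a local name, and that the same name can be obtained through XName.Get. The prefix sq used in the file plays no part in the comparison; only the namespace URI and the local name matter.

```csharp
using System;
using System.Xml.Linq;

static class XNameSketch
{
    const string Uri = "http://www.stock_quotes.com";

    // Builds the qualified name the same way the listing does.
    public static XName StockName()
    {
        XNamespace sq = Uri;
        return sq + "Stock";
    }

    static void Main()
    {
        XName name = StockName();
        Console.WriteLine(name.NamespaceName);  // the namespace URI
        Console.WriteLine(name.LocalName);      // the unqualified element name
        Console.WriteLine(name);                // expanded form: {uri}local
        Console.WriteLine(name == XName.Get("Stock", Uri));
    }
}
```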

Nesting Queries

Nested queries are supported by LINQ and, thus, are supported by LINQ to XML. Nested queries make sense for LINQ to XML because of the hierarchical nature of XML files that have nested nodes. Listing 18.11 produces the same results as Listing 18.10, but uses nesting. The code finds all of the XElement Stock nodes and then the Symbol nodes within the Stock nodes.

Although Listing 18.10 is easier and produces the same results as Listing 18.11, Listing 18.11 does show how to nest queries. Nested queries are especially useful when child nodes contain duplicate elements, for example, when a customer has multiple addresses.

Listing 18.11 Demonstrating How to Nest LINQ to XML Queries Syntactically

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;

namespace NestedLinqQueries
{
  class Program
  {
    static void Main(string[] args)
    {
      const string filename = @"..\..\Stocks.xml";
      XElement xml = XElement.Load(filename);
      XNamespace sq = "http://www.stock_quotes.com";

      var stocks =
        from stock in xml.Elements(sq + "Stock")
          // The inner query keeps only stocks that contain at least one Symbol element.
          where (
            from symbol in stock.Elements(sq + "Symbol")
            select symbol).Any()
        select new {Name=stock.Element(sq + "Symbol").Value};

      Array.ForEach(stocks.ToArray(),
        o=>Console.WriteLine(o.Name));
      Console.ReadLine();
    }
  }
}
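
The customer-with-multiple-addresses case mentioned earlier can be sketched as follows. The customers document, its element and attribute names, and the class name NestedProjectionSketch are all hypothetical; they are not part of the chapter’s sample files. Here the nested query sits in the select clause rather than the where clause, so each outer result carries its own list of child values.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class NestedProjectionSketch
{
    // Hypothetical document in which a parent has repeated child elements.
    const string Text =
        "<customers>" +
        "<customer name='Ann'><address>1 Main St</address><address>9 Lake Rd</address></customer>" +
        "<customer name='Bob'><address>4 Elm Ave</address></customer>" +
        "</customers>";

    public static string[] Describe()
    {
        XElement xml = XElement.Parse(Text);

        var customers =
            from customer in xml.Elements("customer")
            select new
            {
                Name = (string)customer.Attribute("name"),
                // Inner query: one entry per address of this customer.
                Addresses = (from address in customer.Elements("address")
                             select address.Value).ToArray()
            };

        return customers
            .Select(c => c.Name + ": " + string.Join("; ", c.Addresses))
            .ToArray();
    }

    static void Main()
    {
        foreach (string line in Describe())
            Console.WriteLine(line);
    }
}
```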

Filtering with Where Clauses

Where clauses, also called filters, are used to refine the query to selectively return a more specific subset of results. In the example in Listing 18.12, a nested query is used in the outer where clause and an inner where clause is used to check the Change attribute of the Price element. In the example, stocks with a negative price change are returned.

Listing 18.12 A Nested Query That Returns Stocks with a Price Change—Using the Change Attribute of the Price Node—Less Than 0; That Is, Stocks That Are Going Down in Price

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;

namespace FilteringWithWhereClauses
{
  class Program
  {
    static void Main(string[] args)
    {
      const string filename = @"..\..\Stocks.xml";
      XElement xml = XElement.Load(filename);
      XNamespace sq = "http://www.stock_quotes.com";

      var stocksThatLostGround =
        from stock in xml.Elements(sq + "Stock")
          where (
            from price in stock.Elements(sq + "Price")
            // The Change attribute has no prefix, so it is requested without sq.
            where (decimal)price.Attribute("Change") < 0
            select price).Any()
        select stock;

      Array.ForEach(stocksThatLostGround.ToArray(),
        o=>Console.WriteLine(o.Element(sq + "Symbol").Value));
      Console.ReadLine();
    }
  }
}

Finding Elements Based on Context

XML documents have a context. For example, the list of stock quotes has a root element Stocks. Each Stock element is contained within the root context. Based on the way the stock elements are defined, each stock element has a symbol and price information. The symbol and price are subordinate to the stock element, which defines their context.

LINQ to XML supports navigating to elements based on their context. This is accomplished by invoking members such as XElement.ElementsAfterSelf, XElement.ElementsBeforeSelf, XElement.NextNode, and XElement.HasElements. In Listing 18.13, an XML document defined as a string is parsed into an XElement, the XNamespace is defined, and a query looks for a price change less than 1.

The range variable symbol iterates over the Symbol elements. The let clause defines a second range variable, price, by calling the context method XElement.ElementsAfterSelf followed by FirstOrDefault; the result is the first sibling element after each Symbol, which in this document is its Price element. The where clause filters the resultset by Price’s Change attribute (see Listing 18.13).

Listing 18.13 Defining a Local Range Variable price Using the Context Method ElementsAfterSelf

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;

namespace FindingElementsBasedOnContext
{
  class Program
  {
    static void Main(string[] args)
    {
      // Single-quoted attribute values avoid escaping double quotes inside the C# strings.
      XElement elem = XElement.Parse(
        "<?xml version='1.0' encoding='utf-8' ?>" +
        "<sq:Stocks xmlns:sq='http://www.stock_quotes.com'>" +
        "  <sq:Stock>" +
        "    <sq:Symbol>MSFT</sq:Symbol>" +
        "    <sq:Price Change='0.6' Low='42.1' High='51.0'>56.0</sq:Price>" +
        "  </sq:Stock>" +
        "  <sq:Stock>" +
        "    <sq:Symbol>MVK</sq:Symbol>" +
        "    <sq:Price Change='-3.2' Low='22.8' High='32.4'>25.5</sq:Price>" +
        "  </sq:Stock>" +
        "  <sq:Stock>" +
        "    <sq:Symbol>GOOG</sq:Symbol>" +
        "    <sq:Price Change='8.0' Low='24.4' High='34.5'>32.0</sq:Price>" +
        "  </sq:Stock>" +
        "  <sq:Stock>" +
        "    <sq:Symbol>VFINX</sq:Symbol>" +
        "    <sq:Price Change='8.0' Low='24.4' High='34.5'>32.0</sq:Price>" +
        "  </sq:Stock>" +
        "  <sq:Stock>" +
        "    <sq:Symbol>HDPMX</sq:Symbol>" +
        "    <sq:Price Change='8.0' Low='24.4' High='34.5'>32.0</sq:Price>" +
        "  </sq:Stock>" +
        "</sq:Stocks>");

      XNamespace sq = "http://www.stock_quotes.com";

      var contextStock =
        from symbol in elem.Elements(sq + "Stock").Elements(sq + "Symbol")
        // The first sibling element after Symbol is the Price element.
        let price = symbol.ElementsAfterSelf().FirstOrDefault()
        where (decimal)(price.Attribute("Change")) < 1M
        select symbol;

      Array.ForEach(contextStock.ToArray(), o=>Console.WriteLine(o.Value));
      Console.ReadLine();
    }
  }
}
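
Context navigation also works in the other direction. The sketch below is an illustration on a cut-down version of the stock data (two stocks, Change attribute only; the class name ContextSketch is an assumption). It starts from a Price element, walks back to the preceding Symbol sibling with ElementsBeforeSelf, and reads the enclosing element through the Parent property.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class ContextSketch
{
    public static string SymbolForFirstLoser()
    {
        XNamespace sq = "http://www.stock_quotes.com";

        // Reduced sample based on the chapter's stock data.
        XElement stocks = XElement.Parse(
            "<sq:Stocks xmlns:sq='http://www.stock_quotes.com'>" +
            "<sq:Stock><sq:Symbol>MSFT</sq:Symbol><sq:Price Change='0.6'>56.0</sq:Price></sq:Stock>" +
            "<sq:Stock><sq:Symbol>MVK</sq:Symbol><sq:Price Change='-3.2'>25.5</sq:Price></sq:Stock>" +
            "</sq:Stocks>");

        // Find the first Price element with a negative change.
        XElement price = stocks.Descendants(sq + "Price")
            .First(p => (decimal)p.Attribute("Change") < 0);

        // Walk back to the sibling Symbol and up to the containing Stock.
        XElement symbol = price.ElementsBeforeSelf(sq + "Symbol").First();
        return symbol.Value + " (parent: " + price.Parent.Name.LocalName + ")";
    }

    static void Main()
    {
        Console.WriteLine(SymbolForFirstLoser());
    }
}
```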

Sorting XML Queries

Sorting is straightforward. Add an orderby clause based on the desired criteria and LINQ does the rest. If you modify the query in Listing 18.13 as shown in Listing 18.14, the resultset will be sorted by the Price’s Change attribute.

Listing 18.14 Modifying the Query in Listing 18.13 with an orderby Clause Permits Sorting the Query Results as You Would Expect

  var contextStock =
       from symbol in elem.Elements(sq + "Stock").Elements(sq + "Symbol")
       let price = symbol.ElementsAfterSelf().FirstOrDefault()
       where (decimal)(price.Attribute("Change")) < 1M
       orderby (decimal)price.Attribute("Change")
       select symbol;

Calculating Intermediate Values with Let

The let LINQ keyword is an excellent aid in creating temporary range variables. For example, if you want to indicate the difference between the day’s low and high values, you could introduce a new value, spread, that is the difference between the Price’s High and Low attributes. This value could then be assigned to an anonymous type using the projection syntax.

In Listing 18.15, the XML string is converted to an XElement and the query introduces a calculated value—the spread between the low and high day price—orders the results, and defines a new type that includes the spread using projection syntax in the select clause.

Listing 18.15 Calculating Values Using the let Clause

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;

namespace IntermediateValuesWithLet
{
  class Program
  {
    static void Main(string[] args)
    {
      // Single-quoted attribute values avoid escaping double quotes inside the C# strings.
      XElement elem = XElement.Parse(
         "<?xml version='1.0' encoding='utf-8' ?>" +
         "<sq:Stocks xmlns:sq='http://www.stock_quotes.com'>" +
         "  <sq:Stock>" +
         "    <sq:Symbol>MSFT</sq:Symbol>" +
         "    <sq:Price Change='0.6' Low='42.1' High='51.0'>56.0</sq:Price>" +
         "  </sq:Stock>" +
         "  <sq:Stock>" +
         "    <sq:Symbol>MVK</sq:Symbol>" +
         "    <sq:Price Change='-3.2' Low='22.8' High='32.4'>25.5</sq:Price>" +
         "  </sq:Stock>" +
         "  <sq:Stock>" +
         "    <sq:Symbol>GOOG</sq:Symbol>" +
         "    <sq:Price Change='8.0' Low='24.4' High='34.5'>32.0</sq:Price>" +
         "  </sq:Stock>" +
         "  <sq:Stock>" +
         "    <sq:Symbol>VFINX</sq:Symbol>" +
         "    <sq:Price Change='8.0' Low='24.4' High='34.5'>32.0</sq:Price>" +
         "  </sq:Stock>" +
         "  <sq:Stock>" +
         "    <sq:Symbol>HDPMX</sq:Symbol>" +
         "    <sq:Price Change='8.0' Low='24.4' High='34.5'>32.0</sq:Price>" +
         "  </sq:Stock>" +
         "</sq:Stocks>");

      XNamespace sq = "http://www.stock_quotes.com";

      var stockSpreads =
        from stock in elem.Elements(sq + "Stock")
        // spread is computed once per stock and reused by orderby and select.
        let spread = (decimal)stock.Element(sq + "Price").Attribute("High") -
          (decimal)stock.Element(sq + "Price").Attribute("Low")
        orderby spread descending
        select new {Symbol=stock.Element(sq + "Symbol").Value, Spread=spread};

      Array.ForEach(stockSpreads.ToArray(), o=>Console.WriteLine(o));
      Console.ReadLine();
    }
  }
}

Annotating Nodes

Node annotations are used to add useful information for coding purposes. For example, you could annotate stock nodes with the query used to return the stock price if you wanted to update the node at some future point. Annotations are not serialized to the XML file, but you can add annotations, remove them from the XElement, and access them for use in code by using the Annotation method of the XElement object.

Annotations are supported for XElement, XAttribute, XCData (CDATA nodes), and XDocument objects. AddAnnotation is defined on XObject, the abstract base class of both XNode and XAttribute, and the classes that represent the parts of an XML document derive from it either directly or through XNode and XContainer.

Listing 18.16 Demonstrating How to Annotate an XElement with a Query Used to Request the Stock Price from Yahoo!—Even Though This Information Is Useful in Code, It Won’t Be Serialized with the XML

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;

namespace AddAnnotationToXML
{
  class Program
  {
    static void Main(string[] args)
    {
      const string filename = @"..\..\Stocks.xml";
      XElement elem = XElement.Load(filename);
      XNamespace sq = "http://www.stock_quotes.com";

      var stocksToAnnotate =
        from stock in elem.Elements(sq + "Stock")
        select stock;

      const string yahooQuery =
       "http://download.finance.yahoo.com/d/?s={0}&f=ncbh";
      foreach(var stock in stocksToAnnotate)
      {
        // Attach the formatted query string to the node, then read it back by type.
        stock.AddAnnotation(string.Format(yahooQuery,
          stock.Element(sq + "Symbol").Value));
        Console.WriteLine(stock.Annotation(typeof(Object)));
      }
      }
      Console.ReadLine();
    }
  }
}

In Listing 18.16, the annotation is added to the Stock element and immediately read back—for demonstration purposes—illustrating how the annotation is retrieved. Note that the argument to Annotation is a Type argument. This implies that an annotation can be any object, and, consequently, it is possible to associate behaviors in the form of objects as annotations to XML elements.
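
Because an annotation can be any object, a small class of your own can serve as one. The sketch below is illustrative only: the QuoteSource class, its Url field, and the example.com address are hypothetical. It uses the generic Annotation<T> and RemoveAnnotations<T> methods to retrieve and remove the annotation by type, and it converts the element to text to show that the annotation does not appear in the serialized XML.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical annotation type; any reference type can be used.
sealed class QuoteSource
{
    public string Url;
}

static class AnnotationSketch
{
    public static string Run()
    {
        XElement stock = XElement.Parse("<Stock><Symbol>GOOG</Symbol></Stock>");

        // Attach the object; the URL is a placeholder, not a real service.
        stock.AddAnnotation(new QuoteSource { Url = "http://example.com/quote?s=GOOG" });

        // Retrieve by type without casting.
        QuoteSource source = stock.Annotation<QuoteSource>();

        // The serialized text contains only the XML, not the annotation.
        string xmlText = stock.ToString(SaveOptions.DisableFormatting);

        // Remove every annotation of this type and confirm it is gone.
        stock.RemoveAnnotations<QuoteSource>();
        bool removed = stock.Annotation<QuoteSource>() == null;

        return source.Url + "|" + xmlText + "|" + removed;
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Retrieving by a specific type, rather than by typeof(Object), keeps unrelated annotations on the same node from being picked up by accident.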

Summary

XML looks hard to read at first glance. However, at its core, XML is matching tag pairs that are nested and can have attributes. The very general nature and simplicity of XML make it powerful; CData, XML Transforms, and XPath make XML more powerful but also introduce usage complexities related to learning additional technologies, such as XPath or XSLT. You will learn about these subjects in Chapter 19, “Comparing LINQ to XML with Other XML Technologies.”

LINQ to XML makes manipulating XML uniform. Until now, programmers had to know C# to manipulate objects, SQL to manipulate databases, and XPath and XSLT to manipulate XML. LINQ offers a uniform way to handle all of these different kinds of things. A uniform way of handling things permits you to focus on the problem rather than wrangling with the technology.
