3.2. Querying XML

Since LINQ to XML supports the LINQ standard query operators, an XML document can be loaded in memory and then queried with the usual LINQ query syntax.

Let's start by analyzing a simple query using a couple of important LINQ to XML classes. Listing 3-1 is the XML representation of our People database.

Example 3-1. The XML Representation of the People Database
<?xml version="1.0" encoding="utf-8" ?>
<people>
    <!--Person section-->
    <person>
        <id>1</id>
        <firstname>Carl</firstname>
        <lastname>Lewis</lastname>
        <idrole>1</idrole>
    </person>
    <person>
        <id>2</id>
        <firstname>Tom</firstname>
        <lastname>Gray</lastname>
        <idrole>2</idrole>
    </person>
    <person>
        <id>3</id>
        <firstname>Mary</firstname>
        <lastname>Grant</lastname>
        <idrole>2</idrole>
    </person>
    <person>
        <id>4</id>
        <firstname>Fabio Claudio</firstname>
        <lastname>Ferracchiati</lastname>
        <idrole>1</idrole>
    </person>
    <!--Role section-->
    <role>
        <id>1</id>
        <roledescription>Manager</roledescription>
    </role>
    <role>
        <id>2</id>
        <roledescription>Developer</roledescription>
    </role>

    <!--Salary section-->
    <salary>
        <idperson id="1" year="2004" salaryyear="10000,0000" />
        <idperson id="1" year="2005" salaryyear="15000,0000" />
    </salary>
</people>

In Listing 3-2 the XDocument class provides the Elements method, which returns XElement objects. The XElement class is the core of the LINQ to XML library: an XElement object represents one element of an XML document, whether that element contains child elements or only a text value. As the code in Listing 3-2 shows, you reach a nested element by calling the Elements method repeatedly. In the XML structure, the people element is the root and the person element appears four times, once for each row of the people data source. The Elements method returns a collection of child elements that you can iterate through, so appending the Elements("person") method call to the Elements("people") method call retrieves all four person elements in the XML document.

The where condition filters the person elements to retrieve the one whose identifier is equal to one.

Example 3-2. Retrieving a Person's Record from an XML Document
XDocument xml = XDocument.Load(@"....People.xml");

var query = from p in xml.Elements("people").Elements("person")
            where (int)p.Element("id") == 1
            select p;

foreach(var record in query)
{
    Console.WriteLine("Person: {0} {1}",
       record.Element("firstname").Value,
       record.Element("lastname").Value);
}

NOTE

The value of an XElement object is always stored as a string. To work with it as another type, cast the element to that type. The code in Listing 3-2 casts the id element to int to compare the record identifier in the where condition.
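
The casts described in this note can be tried with a minimal sketch. It parses a small inline fragment, which is an assumption made here so the example runs without People.xml, and it is written with top-level statements, so it assumes C# 9 or later. The nullable cast to int? is included because it returns null for a missing element instead of throwing.

```csharp
using System;
using System.Xml.Linq;

// Inline stand-in for one <person> record from Listing 3-1 (illustrative only).
XElement person = XElement.Parse(
    "<person><id>1</id><firstname>Carl</firstname></person>");

// Explicit conversion defined by XElement: parses the element's text as an int.
int id = (int)person.Element("id");

// The fragment has no <idrole>, so Element returns null;
// the nullable cast passes that null through instead of throwing.
int? idRole = (int?)person.Element("idrole");

Console.WriteLine("id = {0}", id);                          // id = 1
Console.WriteLine("idrole present: {0}", idRole.HasValue);  // idrole present: False
```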

Finally, the foreach statement iterates through the elements and prints the name of each person. You use the Value property to retrieve an element's value.

The XDocument class is similar to the XElement class and shares many of its methods and properties, but it represents the whole XML document rather than a single element. In our example the document's root is the people element, which is why the query in Listing 3-2 starts with Elements("people"). The Load method loads the XML document into memory so that the XDocument object can be queried.
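
The relationship between an XDocument and its root element can be checked with a short sketch. The inline XML is a reduced stand-in for People.xml, assumed here so the example is self-contained, and the code uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Reduced stand-in for People.xml with two records (illustrative only).
XDocument doc = XDocument.Parse(
    "<people>" +
    "<person><id>1</id></person>" +
    "<person><id>2</id></person>" +
    "</people>");

// The Root property exposes the document's root element.
Console.WriteLine(doc.Root.Name);   // people

// <person> is not a direct child of the document, so this finds nothing.
int fromDocument = doc.Elements("person").Count();

// Starting from the root element reaches the <person> children.
int fromRoot = doc.Root.Elements("person").Count();

Console.WriteLine("{0} {1}", fromDocument, fromRoot);   // 0 2
```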

If you don't need document-level information and want to start directly from the root element, you can use the Load method of the XElement class, which returns the root element itself. Listing 3-3 applies the same query to our XML data source with less code.

Example 3-3. Retrieving Person Data by Using Less Code
XElement xml = XElement.Load(@"....People.xml");

var query = from p in xml.Elements("person")
            where (int)p.Element("id") == 1
            select p;

foreach(var record in query)
{
    Console.WriteLine("Person: {0} {1}",
                                        record.Element("firstname"),
                                        record.Element("lastname"));
}

Because the xml variable now holds the people element, you can search for person records directly, without a separate Elements call for the root. Moreover, if you omit the Value property (used in Listing 3-2), Console.WriteLine calls the element's ToString method, which returns the full element with its start and end tags (see Figure 3-1).

Figure 3-1. Omitting the Value property, the output will be the full element.

NOTE

Casting an XElement to a string returns the same text as its Value property.
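
The difference between ToString, Value, and the string cast can be seen side by side in a small sketch. The inline fragment is an assumption standing in for People.xml, and the code uses top-level statements (C# 9 or later). It also shows one practical difference: the cast tolerates a missing element, because casting a null XElement to string yields null.

```csharp
using System;
using System.Xml.Linq;

// Inline stand-in for one <person> record (illustrative only).
XElement person = XElement.Parse(
    "<person><firstname>Carl</firstname></person>");

var firstName = person.Element("firstname");

var markup = firstName.ToString();   // full element, tags included
var viaValue = firstName.Value;      // text content only
var viaCast = (string)firstName;     // same text as Value

// No <lastname> in the fragment: Element returns null, and the cast
// passes that null through rather than throwing.
var missing = (string)person.Element("lastname");

Console.WriteLine(markup);               // <firstname>Carl</firstname>
Console.WriteLine(viaValue == viaCast);  // True
Console.WriteLine(missing == null);      // True
```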

3.2.1. Searching for Attribute Values

The following code shows salary information stored in idperson attributes:

<salary>
    <idperson id="1" year="2004" salaryyear="10000,0000" />
    <idperson id="1" year="2005" salaryyear="15000,0000" />
</salary>

LINQ to XML also provides a way to query elements by their attributes. (See Listing 3-4.)

Example 3-4. Querying by Attribute Values
XElement xml = XElement.Load(@"....People.xml");

var query = from s in xml.Elements("salary").Elements("idperson")
            where (int)s.Attribute("year") == 2004
            select s;

foreach(var record in query)
{
    Console.WriteLine("Amount: {0}",
        (string)record.Attribute("salaryyear"));
}

The XAttribute class represents an attribute of an element within an XML document. The Attribute method returns the XAttribute whose name is specified as its argument, or null if the element has no such attribute; casting the result, as Listing 3-4 does, retrieves its value.
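
As a hedged illustration of how the Attribute method behaves, the sketch below works on a single idperson element parsed from an inline string (an assumption, so the example does not depend on People.xml). The month attribute is hypothetical and does not exist in the data; it is used only to show the missing-attribute case. The code uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Xml.Linq;

// One salary row, copied from Listing 3-1 into an inline string.
XElement salary = XElement.Parse(
    "<idperson id=\"1\" year=\"2004\" salaryyear=\"10000,0000\" />");

// Attribute returns an XAttribute object, not the raw value.
var year = salary.Attribute("year");
Console.WriteLine(year.Name);    // year
Console.WriteLine(year.Value);   // 2004

// Casting the XAttribute converts its value to the requested type.
int yearNumber = (int)salary.Attribute("year");

// "month" is a hypothetical attribute that is not present:
// Attribute returns null and the nullable cast passes it through.
int? month = (int?)salary.Attribute("month");

Console.WriteLine("{0} {1}", yearNumber, month.HasValue);   // 2004 False
```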

3.2.2. The Descendants and Ancestors Methods

When elements are deeply nested, you can use the Descendants method to reach the desired elements without spelling out every intermediate level. Listing 3-5 shows this shortcut.

Example 3-5. Using the Descendants Method to Navigate Down an XML Tree
XElement xml = XElement.Load(@"....People.xml");

var query = from p in xml.Descendants("person")
            join s in xml.Descendants("idperson")
            on (int)p.Element("id") equals (int)s.Attribute("id")
            select new {FirstName=p.Element("firstname").Value,
                        LastName=p.Element("lastname").Value,
                        Amount=s.Attribute("salaryyear").Value};

foreach(var record in query)
{
    Console.WriteLine("Person: {0} {1}, Salary {2}",record.FirstName,
                                                    record.LastName,
                                                    record.Amount);
}

The code in Listing 3-5 joins two sections within the XML data source: person and salary. As you can see, the query syntax is the same as that used for in-memory objects and database tables and views.

Conversely, the Ancestors method goes up through an XML tree until it reaches the root element. In Listing 3-6, both methods are used to navigate the XML document.

Example 3-6. Using the Ancestors and Descendants Methods to Navigate in the XML Document Tree
XElement xml = XElement.Load(@"....People.xml");

var record = xml.Descendants("firstname").First();
foreach(var tag in record.Ancestors())
    Console.WriteLine(tag.Name);

First the Descendants method returns a collection of firstname elements; however, if we use the First standard operator, just the first element will be retrieved. The cursor in the XML tree now points to the first firstname element, containing the Carl value, so if we use the Ancestors method to rise to the top of the document, the collection of the XElement items will contain two tags: person and people.

NOTE

The Descendants and Ancestors methods do not include the current node. For example, if you start from the root node you'll retrieve all the elements except the root. You can use the DescendantsAndSelf and AncestorsAndSelf methods to include the current node.
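
In the released System.Xml.Linq API, the methods that include the current node are named DescendantsAndSelf and AncestorsAndSelf. The sketch below compares them with Descendants and Ancestors on a reduced inline document (an assumption standing in for People.xml), using top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Reduced stand-in for People.xml (illustrative only).
XElement people = XElement.Parse(
    "<people><person><firstname>Carl</firstname></person></people>");

var firstName = people.Descendants("firstname").First();

// Ancestors excludes the starting node and walks upward: person, people.
var ancestors = string.Join(", ",
    firstName.Ancestors().Select(e => e.Name.LocalName));

// AncestorsAndSelf starts with the node itself: firstname, person, people.
var withSelf = string.Join(", ",
    firstName.AncestorsAndSelf().Select(e => e.Name.LocalName));

Console.WriteLine(ancestors);
Console.WriteLine(withSelf);

// Descendants of the root excludes the root; DescendantsAndSelf adds it.
int below = people.Descendants().Count();               // 2
int belowAndSelf = people.DescendantsAndSelf().Count(); // 3
Console.WriteLine("{0} {1}", below, belowAndSelf);
```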

3.2.3. Querying XML for Content Type

We can use LINQ to XML to query not only for values, but also for types. For instance, our sample XML data source contains three comments. We can search for all the comments as in Listing 3-7.

Example 3-7. Retrieving All the Comments in an XML Document
XElement xml = XElement.Load(@"....People.xml");

IEnumerable<XComment> record = xml.Nodes().OfType<XComment>();
foreach(XComment comment in record)
    Console.WriteLine(comment);

The XComment class represents XML comments. Note that we used the Nodes method instead of the Elements method. Examine Figure 3-2 carefully to better understand the hierarchy between LINQ to XML classes.

Figure 3-2. The LINQ to XML class hierarchy

The XElement class is not directly linked to the XComment class; both derive from XNode, so to retrieve comments we have to work at the XNode level. The Nodes method of XElement returns a collection of XNode objects, and the OfType standard operator filters them by the specified type, which in Listing 3-7 is XComment.
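
One detail worth checking is that Nodes returns only the direct children of the element it is called on. The sketch below is a minimal illustration under stated assumptions: the inline document is a stand-in for People.xml, and the nested comment inside person is hypothetical (Listing 3-1 has comments only at the top level). DescendantNodes is the counterpart that walks the whole subtree. The code uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Stand-in document; the nested comment is added only for this illustration.
XElement people = XElement.Parse(
    "<people>" +
    "<!--Person section-->" +
    "<person><!--nested comment--><id>1</id></person>" +
    "</people>");

// Nodes() returns only the direct children of <people>.
var topLevel = people.Nodes().OfType<XComment>().ToList();

// DescendantNodes() walks the whole subtree in document order.
var all = people.DescendantNodes().OfType<XComment>().ToList();

Console.WriteLine(topLevel.Count);   // 1
Console.WriteLine(all.Count);        // 2

// XComment.Value holds the comment text without the <!-- --> delimiters.
foreach (XComment comment in all)
    Console.WriteLine(comment.Value);
```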

3.2.4. Querying an XML Document That Uses Schemas

XML elements are often associated with specific namespaces, and their names are prefixed with a namespace identifier. For example, Microsoft Office 2003 added XML support to applications such as Microsoft Word and Microsoft Excel. If you save a simple Word file in XML format, you get an XML document similar to the snippet in Listing 3-8.

Example 3-8. A Microsoft Word Document Saved Using the XML Format
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument
xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:w10="urn:schemas-microsoft-com:office:word"
xmlns:sl="http://schemas.microsoft.com/schemaLibrary/2003/core"
xmlns:aml="http://schemas.microsoft.com/aml/2001/core"
xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882"
xmlns:wsp="http://schemas.microsoft.com/office/word/2003/wordml/sp2"
w:macrosPresent="no"
w:embeddedObjPresent="no"
w:ocxPresent="no"
xml:space="preserve">
<w:ignoreElements
w:val="http://schemas.microsoft.com/office/word/2003/wordml/sp2"/>
<o:DocumentProperties>
    <o:Title>Hello LINQ to XML</o:Title>
    <o:Author>Fabio Claudio Ferracchiati</o:Author>
    <o:LastAuthor>Fabio Claudio Ferracchiati</o:LastAuthor>
    <o:Revision>1</o:Revision>
    <o:TotalTime>1</o:TotalTime>
    <o:Created>2006-08-20T07:54:00Z</o:Created>
    <o:LastSaved>2006-08-20T07:55:00Z</o:LastSaved>
    <o:Pages>1</o:Pages>
    <o:Words>1</o:Words>
    <o:Characters>12</o:Characters>
    <o:Company>APress</o:Company>
    <o:Lines>1</o:Lines>
    <o:Paragraphs>1</o:Paragraphs>
    <o:CharactersWithSpaces>12</o:CharactersWithSpaces>
    <o:Version>11.8026</o:Version>
</o:DocumentProperties>
<w:fonts>
    <w:defaultFonts w:ascii="Times New Roman"
        w:fareast="Times New Roman"
        w:h-ansi="Times New Roman"
        w:cs="Times New Roman"/>
</w:fonts>
...
...

</w:wordDocument>

To query for the Word file's author, we have to search for the <o:Author> tag. The o: prefix identifies the XML namespace defined as

xmlns:o="urn:schemas-microsoft-com:office:office"

We have to add this information to our LINQ to XML code as shown in Listing 3-9.

Example 3-9. Querying an XML Document That Uses Namespaces
XElement xml = XElement.Load(@"....Hello_LINQ to XML.xml");
XNamespace o = "urn:schemas-microsoft-com:office:office";

var query = from w in xml.Descendants(o + "Author")
            select w;
foreach (var record in query)
    Console.WriteLine("Author: {0}", (string)record);

The XNamespace class represents an XML namespace. Concatenating the o object with "Author" produces a namespace-qualified name, so the Descendants method returns the Author elements that belong to the namespace represented by the o object.
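
What the concatenation produces can be inspected directly. The sketch below is a minimal illustration that assumes a reduced inline copy of the DocumentProperties part of Listing 3-8 rather than the full Word file, and it uses top-level statements (C# 9 or later). It also shows that a bare local name does not match a namespaced element.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

XNamespace o = "urn:schemas-microsoft-com:office:office";

// Reduced stand-in for the DocumentProperties part of Listing 3-8.
XElement props = XElement.Parse(
    "<o:DocumentProperties xmlns:o=\"urn:schemas-microsoft-com:office:office\">" +
    "<o:Author>Fabio Claudio Ferracchiati</o:Author>" +
    "</o:DocumentProperties>");

// XNamespace + string produces an XName carrying both parts.
XName author = o + "Author";
Console.WriteLine(author.NamespaceName);  // urn:schemas-microsoft-com:office:office
Console.WriteLine(author.LocalName);      // Author

// The qualified name matches; the bare local name does not.
int qualified = props.Descendants(author).Count();
int unqualified = props.Descendants("Author").Count();

Console.WriteLine("{0} {1}", qualified, unqualified);   // 1 0
```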

You can search for a particular attribute in a similar way. In Listing 3-10 the code searches for the default font style used in the Word document.

Example 3-10. Retrieving the Attribute's Value Prefixed with the Namespace Shortcut
XElement xml = XElement.Load(@"....Hello_LINQ to XML.xml");
XNamespace w = "http://schemas.microsoft.com/office/word/2003/wordml";

XElement defaultFonts = xml.Descendants(w + "defaultFonts").First();

Console.WriteLine("Default Fonts: {0}",
    (string)defaultFonts.Attribute(w + "ascii"));

After the Descendants method reaches the first w:defaultFonts element, we use the XElement object's Attribute method to retrieve the w:ascii attribute. Note that the namespace must be concatenated to the attribute name in the same way as for the element name.
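
The need to qualify the attribute name can be confirmed with a small sketch. It assumes a reduced inline copy of the w:fonts fragment from Listing 3-8 instead of the full Word file and uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Xml.Linq;

XNamespace w = "http://schemas.microsoft.com/office/word/2003/wordml";

// Reduced stand-in for the <w:fonts> fragment of Listing 3-8.
XElement fonts = XElement.Parse(
    "<w:fonts xmlns:w=\"http://schemas.microsoft.com/office/word/2003/wordml\">" +
    "<w:defaultFonts w:ascii=\"Times New Roman\" />" +
    "</w:fonts>");

var defaultFonts = fonts.Element(w + "defaultFonts");

// The attribute is prefixed in the XML, so its name must be qualified too.
var qualified = (string)defaultFonts.Attribute(w + "ascii");

// Without the namespace the lookup finds nothing and returns null.
bool bareFound = defaultFonts.Attribute("ascii") != null;

Console.WriteLine(qualified);   // Times New Roman
Console.WriteLine(bareFound);   // False
```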

3.2.5. The ElementsBeforeSelf and ElementsAfterSelf Methods

It's often necessary to retrieve the siblings of the current node. This is easy to do by using the ElementsBeforeSelf and ElementsAfterSelf methods, which retrieve a collection of sibling XElement items that occur before the current element and after the current element, respectively. Listing 3-11 shows both methods in action.

Example 3-11. Using the ElementsBeforeSelf and ElementsAfterSelf Methods
XElement xml = XElement.Load(@"....People.xml");

XElement firstName = xml.Descendants("firstname").First();

Console.WriteLine("Before <firstname>");
foreach(var tag in firstName.ElementsBeforeSelf())
    Console.WriteLine(tag.Name);

Console.WriteLine("");
Console.WriteLine("After <firstname>");
foreach(var tag in firstName.ElementsAfterSelf())
    Console.WriteLine(tag.Name);

After we've positioned the cursor over the first firstname element, calling the two methods will produce the output shown in Figure 3-3.

Figure 3-3. The output shows the XML tags before and after the current XML tag.

NOTE

If you want to retrieve other XML information, such as comments, you have to use the node versions of ElementsBeforeSelf and ElementsAfterSelf: NodesBeforeSelf and NodesAfterSelf.
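
The difference between the element and node versions shows up as soon as comments sit between sibling elements. The following sketch assumes a reduced inline copy of People.xml with one person, one role, and the two section comments, and uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Reduced stand-in for People.xml (illustrative only).
XElement people = XElement.Parse(
    "<people>" +
    "<!--Person section-->" +
    "<person><id>1</id></person>" +
    "<!--Role section-->" +
    "<role><id>1</id></role>" +
    "</people>");

var role = people.Element("role");

// Elements only: the single <person> sibling.
int elementsBefore = role.ElementsBeforeSelf().Count();   // 1

// All node types: two comments plus <person>.
int nodesBefore = role.NodesBeforeSelf().Count();         // 3

Console.WriteLine("{0} {1}", elementsBefore, nodesBefore);

// Prints Comment, Element, Comment (document order).
foreach (XNode node in role.NodesBeforeSelf())
    Console.WriteLine(node.NodeType);
```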

3.2.6. Miscellaneous Functionalities

The XElement class provides other useful properties to easily obtain access to XML-document information. In this section we will look at them individually.

Parent

This property allows us to retrieve the parent element of the current node, as Listing 3-12 shows.

Example 3-12. Using the Parent Property to Retrieve the Parent Node of the firstname Element
XElement xml = XElement.Load(@"....People.xml");

XElement firstName = xml.Descendants("firstname").First();

Console.WriteLine(firstName.Parent);

The parent node of firstname is person, so the output of this code snippet will be the full person element (see Figure 3-4).

Figure 3-4. The output of the code snippet in Listing 3-12

HasElements and HasAttributes

These properties check if the current element has child elements or attributes. (See Listing 3-13.)

Example 3-13. Using HasElements and HasAttributes to Check if the Current Node Has Child Elements and Attributes, Respectively
XElement xml = XElement.Load(@"....People.xml");

XElement firstName = xml.Descendants("firstname").First();

Console.WriteLine("FirstName tag has attributes: {0}",
    firstName.HasAttributes);
Console.WriteLine("FirstName tag has child elements: {0}",
    firstName.HasElements);
Console.WriteLine("FirstName tag's parent has attributes: {0}",
    firstName.Parent.HasAttributes);
Console.WriteLine("FirstName tag's parent has child elements: {0}",
    firstName.Parent.HasElements);

After the cursor reaches the first firstname element by using the Descendants method, the HasAttributes and HasElements properties check if both the firstname element and its parent have attributes and child elements. Figure 3-5 shows the output of this code snippet.

Figure 3-5. The HasElements and HasAttributes properties in action

IsEmpty

This property checks whether the current element is empty, that is, written as a self-closing tag with no content. (See Listing 3-14.)

Example 3-14. Using the IsEmpty Property to Check if Some Elements Are Empty
XElement xml = XElement.Load(@"....People.xml");

XElement firstName = xml.Descendants("firstname").First();

Console.WriteLine("Is FirstName tag empty? {0}", firstName.IsEmpty ?
               "Yes" : "No");

XElement idPerson = xml.Descendants("idperson").First();

Console.WriteLine("Is idperson tag empty? {0}",
                   idPerson.IsEmpty ? "Yes" : "No");

Figure 3-6 shows the output of this code snippet. If you look at the XML data source (Listing 3-1) you'll see that the firstname element is not empty because it contains the Carl value. You'll also see that the idperson tag is empty (and has only attributes).

Figure 3-6. The output of the code snippet using IsEmpty
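
One subtlety of IsEmpty is that it reports how the element is written, not whether its text is blank: an element with a start tag and an end tag but nothing between them is not considered empty. The sketch below illustrates this with two inline elements (assumed here for the example; the blank firstname does not occur in People.xml), using top-level statements (C# 9 or later).

```csharp
using System;
using System.Xml.Linq;

// Self-closing element, as used for idperson in Listing 3-1.
XElement selfClosing = XElement.Parse("<idperson id=\"1\" />");

// Start and end tag with nothing in between (hypothetical blank value).
XElement startEnd = XElement.Parse("<firstname></firstname>");

Console.WriteLine(selfClosing.IsEmpty);          // True
Console.WriteLine(startEnd.IsEmpty);             // False
Console.WriteLine(startEnd.Value.Length == 0);   // True: content of zero length
```

To test for blank text regardless of how the element is written, checking the Value property (for example with string.IsNullOrEmpty) is one option.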

Declaration

Using this property we can retrieve information about the XML document declaration. Declaration is a property of the XDocument class rather than XElement, so in Listing 3-15 we load the XML document with XDocument, which keeps document-level information such as the XML declaration.

In Listing 3-15 we use the Declaration property to retrieve Encoding, Version, and Standalone information from the XML document declaration.

Example 3-15. Using the Declaration Property
XDocument xml = XDocument.Load(@"....Hello_LINQ to XML.xml");

Console.WriteLine("Encoding: {0}", xml.Declaration.Encoding);
Console.WriteLine("Version: {0}", xml.Declaration.Version);
Console.WriteLine("Standalone: {0}", xml.Declaration.Standalone);

Figure 3-7 shows the output of Listing 3-15.

Figure 3-7. The output of the code snippet using the Declaration property
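
Not every XML document starts with a declaration, and in that case the Declaration property is null. The sketch below is a minimal illustration that assumes two inline documents instead of the Word file, and uses top-level statements (C# 9 or later). It guards the access with a null check before reading the declaration's properties.

```csharp
using System;
using System.Xml.Linq;

// Document with an XML declaration (inline stand-in for a file on disk).
XDocument withDecl = XDocument.Parse(
    "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?><people />");

// Document without a declaration.
XDocument withoutDecl = XDocument.Parse("<people />");

foreach (XDocument doc in new[] { withDecl, withoutDecl })
{
    if (doc.Declaration == null)
    {
        Console.WriteLine("No XML declaration");
        continue;
    }

    Console.WriteLine("Version: {0}", doc.Declaration.Version);
    Console.WriteLine("Standalone: {0}", doc.Declaration.Standalone);
}
```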
