Chapter 20
Constructing XML from Non-XML Data

In This Chapter

Constructing XML from CSV Files

Generating Text Files from XML

Using XML and Embedded LINQ Expressions (in VB)

“Thinking is the hardest work there is, which is probably the reason why so few engage in it.”

—Henry Ford

The previous chapter provided some technical comparisons and contrasts between LINQ to XML, XSLT, and XPath. (LINQ is Language Integrated Query, XML is eXtensible Markup Language, and XSLT is eXtensible Stylesheet Language Transformations.) This chapter is more of a “how to get some routine tasks done” chapter: the first two sections show how simply those tasks can be handled with LINQ to XML and functional construction, and the last section demonstrates literal XML with embedded LINQ. (Although literal XML is a Visual Basic [VB] feature, you will learn how to use literal XML and embedded LINQ from C# by placing the VB code in a class library.)

Sometimes being surprised surprises me. For instance, it is surprising that a staggeringly large amount of data is still moved around using File Transfer Protocol (FTP) and comma-delimited text files (often indicated by the .csv file extension, where CSV means comma-separated values). On a recent project with somewhere in the neighborhood of 100 third-party providers, motor vehicle data was moved around as fixed-length files over FTP. This is important data, too, such as problem drivers, driving records for truck drivers, birth and death records, Social Security numbers, address standardization, and much, much more. What’s surprising about this combo platter of fixed-length records, text, and FTP is that it is based on technology that is at least 20 years old. (Clearly, “services” have not ubiquitously replaced the old ways of doing things.)

The problem with all this fixed-length data and FTPing of files is that it’s not very robust, it’s not real time, and every programmer has to write parsing algorithms to convert this data, whether fixed-length or comma-separated values, into something usable, such as objects. XML, by contrast, can represent serialized objects that can be reconstituted with a couple of lines of code (in .NET). This chapter shows you just how easy it is to convert between text data and XML, and from there to objects, using LINQ to XML and functional construction. (Hopefully, as a practical chapter, it will save you a lot of work.)

Constructing XML from CSV Files

A comma-separated values (CSV) file is a text file containing values that are separated by commas; generally, each line of text represents an individual record. You can split each line on the comma and use functional construction to convert the text file to XML. Once the data is in XML, deserialization (or LINQ) can be used to convert the XML neatly into an object or collection of objects.
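The split-and-construct idea can be tried without any external data source. The following is a minimal sketch, assuming a hypothetical three-column layout (name, last price, day’s high) and made-up sample values; the class and method names are illustrative only and do not come from the chapter’s listings.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class CsvToXmlSketch
{
    // Hypothetical sample data: name, last price, day's high (not real quotes).
    const string SampleCsv =
        "ACME CORP,10.50,10.75\n" +
        "GLOBEX,20.00,21.25\n";

    // Converts each CSV line into a <Company> element under a <Root> element.
    public static XElement ToXml(string csv)
    {
        return new XElement("Root",
            from line in csv.Split(new[] { '\r', '\n' },
                StringSplitOptions.RemoveEmptyEntries)
            let fields = line.Split(',')
            select new XElement("Company",
                new XAttribute("Name", fields[0].Trim()),
                new XElement("LastPrice", fields[1].Trim()),
                new XElement("HighToday", fields[2].Trim())));
    }

    static void Main()
    {
        Console.WriteLine(ToXml(SampleCsv));
    }
}
```

The sketch assumes that no field contains an embedded comma; quoted fields with commas would need a real CSV parser rather than a plain Split.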

For the example, use Yahoo!’s stock-quoting capability. The quote string can be defined to return a .csv file. The following query

http://quote.yahoo.com/d/quotes.csv?s={0}&f=nlh

requests one or more quotes and returns the data in quotes.csv. The parameter s={0} contains the stocks to obtain quotes for, and the parameter f=nlh returns the name, the last trade (time and price), and the intraday high. Listing 20.1 contains the complete example. The listing is followed by a decomposed explanation of the technical solution.

Listing 20.1 Obtaining Quotes from Yahoo! in a .csv File and Using Functional Construction to Convert That Data to XML

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml.Linq;
using System.Net;
using System.Text.RegularExpressions;

namespace XMLFromCommaSeparatedValues
{
  class Program
  {
    static void Main(string[] args)
    {
      string all = GetQuotes("MSFT GOOG DELL");

      // Strip the HTML bold tags and quotation marks, then split by line.
      string[] quotes =
        all.Replace("<b>", "").Replace("</b>", "")
           .Replace("\"", "").Split(new char[]{'\r', '\n'},
           StringSplitOptions.RemoveEmptyEntries);

      // Build one Company element for each line of the response.
      XElement stockQuotes = new XElement("Root",
        from quote in quotes
        let fields = quote
          .Split(new char[]{',', '-'},
            StringSplitOptions.RemoveEmptyEntries)
        select
          new XElement("Company", fields[0].Trim(),
            new XElement("LastPrice", fields[2].Trim(),
              new XAttribute("Time", fields[1].Trim())),
            new XElement("HighToday", fields[3].Trim())));

      stockQuotes.Save(@"..\..\quotes.xml");
      Console.WriteLine(stockQuotes);
      Console.ReadLine();
    }

    static string GetQuotes(string stocks)
    {
      string url =
        @"http://quote.yahoo.com/d/quotes.csv?s={0}&f=nlh";

      HttpWebRequest request =
        (HttpWebRequest)WebRequest.Create(string.Format(url, stocks));

      HttpWebResponse response = (HttpWebResponse)request.GetResponse();

      using(StreamReader reader = new StreamReader(
        response.GetResponseStream(), Encoding.ASCII))
      {
        try
        {
          return reader.ReadToEnd();
        }
        finally
        {
          // don't need to close the reader because Dispose does
          response.Close();
        }
      }
    }
  }
}

The Main function calls GetQuotes, passing the symbols for Microsoft, Google, and Dell. The result is cleaned up by removing the HTML <b> tags and the quotation marks, and it is then split by line into a string array. The remaining code is a LINQ query that uses functional construction to project each line into XML. A Root element is created, and for each quote a nested Company element is added that contains the company name, a LastPrice element (with the time of the trade as a Time attribute), and a HighToday element (see Listing 20.2 for the resulting XML). Finally, the XML is written to quotes.xml.

GetQuotes formats the uniform resource locator (URL) and sends the request to Yahoo! using an HttpWebRequest. The response is obtained from the HttpWebRequest as an HttpWebResponse, and a StreamReader reads the entire response, which StreamReader.ReadToEnd returns as a string. Listing 20.2 shows the resulting XML after the code runs.

Listing 20.2 The Formatted XML After the LINQ Query Uses Functional Construction to Build the XML

<Root>
  <Company>MICROSOFT CP
    <LastPrice Time="2:12pm">31.38</LastPrice>
    <HighToday>31.45</HighToday>
  </Company>
  <Company>GOOGLE
    <LastPrice Time="2:12pm">547.08</LastPrice>
    <HighToday>559.31</HighToday>
  </Company>
  <Company>DELL INC
    <LastPrice Time="2:12pm">19.06</LastPrice>
    <HighToday>19.20</HighToday>
  </Company>
</Root>

It goes without saying that the actual data will change each time you run the query. When you have the XML, the data can be used as is and formatted with XSLT, projected into new objects with LINQ to XML, or deserialized into a class containing the desired properties.
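Projecting the XML into objects might look like the following. This is a sketch rather than code from the chapter: the Quote class and the QuoteReader name are hypothetical, and the inline XML simply mirrors the shape of Listing 20.2 (company name as text under Company, a Time attribute on LastPrice).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical class for illustration; not part of the chapter's listings.
class Quote
{
    public string Company { get; set; }
    public string Time { get; set; }
    public decimal LastPrice { get; set; }
    public decimal HighToday { get; set; }
}

static class QuoteReader
{
    // Projects each <Company> element into a Quote object.
    public static List<Quote> Read(XElement root)
    {
        return (from company in root.Elements("Company")
                let last = company.Element("LastPrice")
                select new Quote
                {
                    // The company name is the text node directly under <Company>.
                    Company = company.Nodes().OfType<XText>().First().Value.Trim(),
                    Time = (string)last.Attribute("Time"),
                    LastPrice = (decimal)last,
                    HighToday = (decimal)company.Element("HighToday")
                }).ToList();
    }

    static void Main()
    {
        XElement root = XElement.Parse(
            "<Root><Company>MICROSOFT CP" +
            "<LastPrice Time=\"2:12pm\">31.38</LastPrice>" +
            "<HighToday>31.45</HighToday></Company></Root>");

        foreach (Quote q in Read(root))
            Console.WriteLine("{0} {1} {2} {3}",
                q.Company, q.Time, q.LastPrice, q.HighToday);
    }
}
```

The explicit casts from XElement and XAttribute handle the string-to-number conversion, so no separate parsing call is needed.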

Generating Text Files from XML

The reverse operation can be performed to convert an XML file into a text file, for example, if you are sending data to an entity that expects .csv text files. (With slight modifications, you could send fixed-length text files, too, by changing the formatting string in the example.)

Listing 20.3 uses LINQ to XML to read the elements of an XML document and format them as a .csv file. The decomposition follows the listing.

Listing 20.3 Using LINQ to XML to Convert XML to a Comma-Separated Data File

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;

namespace CommaSeparatedFileFromXML
{
  class Program
  {
    static void Main(string[] args)
    {
      XElement blackjackStats = XElement.Load(@"..\..\CurrentStats.xml");
      string file =
        (from elem in blackjackStats.Elements("Player")
         let statistics = elem.Element("Statistics")
         select string.Format("{0},{1},{2},{3},{4},{5},{6},{7},{8}," +
           "{9},{10},{11},{12},{13},{14},{15}{16}",
           (string)elem.Attribute("Name"),
           (string)statistics.Element("AverageAmountLost"),
           (string)statistics.Element("AverageAmountWon"),
           (string)statistics.Element("Blackjacks"),
           (string)statistics.Element("Losses"),
           (string)statistics.Element("NetAverageWinLoss"),
           (string)statistics.Element("NetWinLoss"),
           (string)statistics.Element("PercentageOfBlackJacks"),
           (string)statistics.Element("PercentageOfLosses"),
           (string)statistics.Element("PercentageOfPushes"),
           (string)statistics.Element("PercentageOfWins"),
           (string)statistics.Element("Pushes"),
           (string)statistics.Element("Surrenders"),
           (string)statistics.Element("TotalAmountLost"),
           (string)statistics.Element("TotalAmountWon"),
           (string)statistics.Element("Wins"),
           Environment.NewLine))
        .Aggregate(new StringBuilder(),
          (builder, str) => builder.Append(str),
          builder => builder.ToString());

      Console.WriteLine(file);
      Console.ReadLine();
    }
  }
}

Using CurrentStats.xml from the Blackjack example in Chapter 18, “Extracting Data from XML,” you begin by loading the XML document. Next, the from clause defines a range variable, elem, which represents each of the Player nodes in turn. (There is only one in the file, but this example would work correctly if there were multiple player stats.) After the from clause, the let clause introduces a second range variable, statistics, which is initialized with the Player’s Statistics node. The projection is the result of a string-formatting call that creates a single, comma-delimited string from the player’s name and each of the statistics’ child nodes.

The string items are read from the Name attribute of the Player element and from the elements subordinate to the Statistics element, followed by a new line. All of the results of the query are aggregated using the extension method Aggregate. A StringBuilder, passed as the seed, accumulates the CSV lines. The first lambda expression (the second parameter) appends each of the strings to the builder, and the second lambda expression (the third parameter) converts the final result into a string. The result of the Aggregate operation is the complete CSV text.

In Listing 20.3, the result is written to the console, but you could just as easily save the result to a file. Listing 20.4 contains the output from Listing 20.3.
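The three-argument form of Aggregate and the save-to-file variation can be seen in isolation in a smaller program. The sketch below assumes made-up CSV lines and a hypothetical file name (stats.csv in the temporary directory); only Aggregate, StringBuilder, and File.WriteAllText are doing the work.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

static class AggregateSketch
{
    // Concatenates the lines by folding them into a StringBuilder.
    public static string Join(string[] lines)
    {
        return lines.Aggregate(
            new StringBuilder(),                      // seed
            (builder, line) => builder.Append(line),  // accumulator
            builder => builder.ToString());           // result selector
    }

    static void Main()
    {
        // Made-up sample lines; each already ends with a new line.
        string[] lines =
        {
            "Player 1,11,8" + Environment.NewLine,
            "Player 2,7,12" + Environment.NewLine
        };

        string csv = Join(lines);

        // Hypothetical output path in the temp directory.
        string path = Path.Combine(Path.GetTempPath(), "stats.csv");
        File.WriteAllText(path, csv);

        Console.WriteLine(File.ReadAllText(path));
    }
}
```

Using a StringBuilder as the accumulator avoids creating a new string for every appended line, which is what a plain string seed would do.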

Listing 20.4 The Output From the LINQ to XML Statement and the Aggregate Operation

Player 1,-28.125,30.681818181818183,1,8,5.9210526315789478,
→112.5,0.041666666666666664,33.333333333333329,16.666666666666664,
→45.833333333333329,4,1,-225,337.5,11

The output is actually written as a single line for each Player object in the XML file but is wrapped here because of page-space limitations.
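For the fixed-length variation mentioned at the start of this section, each value has to be padded or truncated to an exact width instead of being joined with commas. The following sketch assumes a hypothetical record layout (a 12-character name followed by two 8-character numeric fields); the helper names are illustrative and the widths are not taken from any real file specification.

```csharp
using System;

static class FixedWidthSketch
{
    // Pads or truncates a value to exactly `width` characters (left-aligned).
    public static string Field(string value, int width)
    {
        string text = value ?? string.Empty;
        return text.Length > width
            ? text.Substring(0, width)
            : text.PadRight(width);
    }

    // Hypothetical layout: 12-character name, 8-character wins, 8-character losses.
    public static string FormatRecord(string name, string wins, string losses)
    {
        return Field(name, 12) + Field(wins, 8) + Field(losses, 8);
    }

    static void Main()
    {
        // Brackets make the padding visible on the console.
        Console.WriteLine("[" + FormatRecord("Player 1", "11", "8") + "]");
    }
}
```

A composite format string with alignment, such as {0,-12}, pads values as well, but it does not truncate values that are too long, which is why the sketch uses an explicit helper.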

Using XML and Embedded LINQ Expressions (in VB)

The potential exists in .NET for feature envy. For example, C# has anonymous methods, but VB (as of Visual Basic 2008) doesn’t. VB has literal XML, and C# doesn’t. Although it would be easier if the two languages reconciled their features, given how similar their purposes are, technically it really doesn’t matter. If a feature exists in only one of the languages, simply use that feature in a class library and reference the class library from the other language.

Because literal XML in VB and literal XML with embedded LINQ queries are so cool, an example was added here. (Lobby Microsoft to add it to C#.)

The example is composed of three assemblies. One assembly contains a LINQ to SQL mapped ORM for the Northwind Customers table. This assembly is shared by the console application and the VB assembly. The VB assembly contains literal XML and embedded LINQ. The VB assembly converts the LINQ to SQL Customer objects to XML. The third assembly, the console application, contains the DataContext, requests the Customers from the Northwind Traders database, and uses the VB assembly to easily convert the Table<Customer> collection to formatted XML.

Listing 20.5 contains the ORM mapped Customers table. This information was covered in Chapter 15, “Joining Database Tables with LINQ Queries,” so the code is shown without elaboration.

Listing 20.5 The Customer Class Defined Using LINQ to SQL

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data.Linq;
using System.Data.Linq.Mapping;

namespace CustomerClass
{
  [Table(Name="Customers")]
  public class Customer
  {
    [Column(IsPrimaryKey=true)]
    public string CustomerID{ get; set; }

    [Column]
    public string CompanyName{ get; set; }

    [Column]
    public string ContactName{ get; set; }

    [Column]
    public string ContactTitle{ get; set; }

    [Column]
    public string Address{ get; set; }

    [Column]
    public string City{ get; set; }

    [Column]
    public string Region{ get; set; }

    [Column]
    public string PostalCode{ get; set; }

    [Column]
    public string Country{ get; set; }

    [Column]
    public string Phone{ get; set; }

    [Column]
    public string Fax{ get; set; }
  }
}

In Listing 20.5, the Customer class uses automatic properties because this table is a dumb entity—a record structure really.

Listing 20.6 contains the console application. You have seen how to define a custom DataContext—also in Chapter 15—so this code is also presented without elaboration.

Listing 20.6 A C# Application That Defines the DataContext and Orchestrates Reading the Customers Table Using LINQ to SQL and Calling the VB Class Library’s GetXML Method

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data.Linq;
using System.Xml.Linq;
using System.Data.Linq.Mapping;
using CustomerClass;
using Temp;

namespace LinqToSQlToXmlWithEmbeddedExpression
{
  class Program
  {
   static void Main(string[] args)
   {
     Northwind northwind = new Northwind();
     Table<Customer> customers = northwind.GetTable<Customer>();

     Console.WriteLine(LinqtoSqlToXml.GetXML(customers.ToList<Customer>()));
     Console.ReadLine();


   }
  }

  public class Northwind : DataContext
  {
    private static readonly string connectionString =
      "Data Source=BUTLER;Initial Catalog=Northwind;Integrated Security=True";
    public Northwind() : base(connectionString){}
  }
 }

Finally, here is the VB code demonstrating literal XML and embedded LINQ. In Listing 20.7, a local variable, xmlLiteral, is declared with Dim and assigned literal XML; its type is inferred from the initializer to be XElement. The root node is <Customers>. The part that looks like an ASP-style script block is called an embedded expression. Embedded expressions are denoted by the <%= %> pairings.

Notice that the embedded expression is really a LINQ query with From and Select clauses. The Select clause is a projection whose result is XML. Here cust is the range variable, and the attribute and element values are derived from nested embedded expressions that are evaluated for each customer. In Listing 20.7, for example, the Select clause produces a <Customer> tag for each customer, and the <Customer> tag has the attributes CustomerID, CompanyName, and ContactName. The values for the attributes are defined by evaluating the embedded expressions, such as cust.CustomerID. Finally, the resultant XElement (XML) is returned as a string to the caller.

Listing 20.7 Literal XML (in VB) with Embedded Expressions and LINQ

Imports CustomerClass

Public Class LinqtoSqlToXml
   Public Shared Function GetXML(ByVal customers As List(Of Customer)) As String
    Dim xmlLiteral = <Customers>
      <%= From cust In customers _
      Select <Customer CustomerID=<%= cust.CustomerID %>
             CompanyName=<%= cust.CompanyName %>
             ContactName=<%= cust.ContactName %>>
      <Address><%= cust.Address %></Address>
      <City><%= cust.City %></City>
      <State><%= cust.Region %></State>
      <ZipCode><%= cust.PostalCode %></ZipCode>
      </Customer> %>
      </Customers>
     Return xmlLiteral.ToString()
  End Function
 End Class

The compiler treats each XML literal and embedded expression as a constructor call to the appropriate XML type, passing the literal or expression as an argument to the constructor. The result is an XElement with element and attribute child nodes.
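To make the constructor-call idea concrete, here is roughly what Listing 20.7 looks like when written by hand in C# with functional construction. This is a sketch for comparison, not the compiler’s actual output: the Customer class is a trimmed stand-in for the LINQ to SQL class in Listing 20.5, the CustomersToXml class name is hypothetical, and the sample customer is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Stand-in for the Customer class, reduced to the properties used here.
class Customer
{
    public string CustomerID { get; set; }
    public string CompanyName { get; set; }
    public string ContactName { get; set; }
    public string Address { get; set; }
    public string City { get; set; }
    public string Region { get; set; }
    public string PostalCode { get; set; }
}

static class CustomersToXml
{
    public static string GetXML(IEnumerable<Customer> customers)
    {
        // Each XML literal becomes a constructor call; each embedded
        // expression becomes an argument to that constructor.
        // Note: XAttribute throws if its value is null, so this sketch
        // assumes the three attribute properties are always populated.
        XElement xml = new XElement("Customers",
            from cust in customers
            select new XElement("Customer",
                new XAttribute("CustomerID", cust.CustomerID),
                new XAttribute("CompanyName", cust.CompanyName),
                new XAttribute("ContactName", cust.ContactName),
                new XElement("Address", cust.Address),
                new XElement("City", cust.City),
                new XElement("State", cust.Region),
                new XElement("ZipCode", cust.PostalCode)));

        return xml.ToString();
    }

    static void Main()
    {
        // Made-up customer for illustration only.
        var customers = new List<Customer>
        {
            new Customer
            {
                CustomerID = "DEMO1", CompanyName = "Example Traders",
                ContactName = "Pat Smith", Address = "1 Main St",
                City = "Springfield", Region = "IL", PostalCode = "62701"
            }
        };

        Console.WriteLine(GetXML(customers));
    }
}
```

Comparing the two versions side by side shows what the VB literal buys you: the same tree, but written in the shape of the XML itself rather than as nested constructor calls.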

For more information on literal XML in VB, see the help topics “XML Element Literal” and “Embedded Expressions in XML” in Visual Studio’s help documentation.

Summary

A real problem in the enterprise space is how to deal with legacy data in all of its various formats. The public, commercial answer is to use services, with service-oriented architecture (SOA) being the catchall phrase. At its essence, a service is a façade that simplifies access to things. But what is underneath the service? The legacy data still has to be put in a form that is consumable in the first place.

XML is very consumable data, and one of the easiest ways to get data into an XML format is to use LINQ to XML. In this chapter, the examples were intended to illustrate just how little code is needed if you use LINQ to XML to convert common kinds of legacy data, such as comma-separated values (or fixed-length records), to XML and back to highly transmittable text.
