Chapter 14. Language Integrated Query (LINQ)

One of the most exciting new features in the .NET Framework v3.5 is the Language Integrated Query (LINQ). LINQ introduces to developers a standard and consistent language for querying and updating data, which include objects (such as arrays and collections), databases, XML documents, ADO.NET DataSets, and so forth.

Today, most developers need to know a myriad of technologies to successfully manipulate data. For example, if you are dealing with databases, you have to understand Structured Query Language (SQL). If you are dealing with XML documents, you must understand technologies such as XPath, XQuery, and XSLT. And if you are working with ADO.NET DataSets, then you need to know the various classes and properties in ADO.NET that you can use.

A better approach would be to have a unified view of the data, regardless of its form and structure. That is the motivation behind the design of LINQ. This chapter provides the basics of LINQ and shows how you can use LINQ to access objects, DataSets, and XML documents, as well as SQL databases.

LINQ Architecture

Figure 14-1 shows the architecture of LINQ. The bottom layer contains the various data sources with which your applications could be working. On top of the data sources are the LINQ-enabled data sources: LINQ to Objects, LINQ to DataSet, LINQ to SQL, LINQ to Entities, and LINQ to XML. LINQ-enabled data sources are also known as LINQ providers; they translate queries expressed in Visual Basic or C# into the native language of the data source. To access all these data sources through LINQ, developers use either C# or Visual Basic and write LINQ queries.

Figure 14-1

Figure 14.1. Figure 14-1

Note

LINQ to Entities is beyond the scope of this book. It was slated to be released later in 2008 and is not part of Visual Studio 2008.

So how does your application view the LINQ-enabled data sources?

  • In LINQ to Objects, the source data is made visible as an IEnumerable<T> or IQueryable<T> collection.

  • In LINQ to XML, the source data is made visible as an IEnumerable<XElement>.

  • In LINQ to DataSet, the source data is made visible as an IEnumerable<DataRow>.

  • In LINQ to SQL, the source data is made visible as an IEnumerable or IQueryable of whatever custom objects you have defined to represent the data in the SQL table.

LINQ to Objects

Let's start with LINQ to Objects. It enables you to use LINQ to directly query any IEnumerable<T> or IQueryable<T> collections (such as string[], int[], and List<T>) directly without needing to use an immediate LINQ provider or API such as the LINQ to SQL or LINQ to XML.

Say that you have a collection of data stored in an array, and you want to be able to retrieve a subset of the data quickly. In the old way of doing things, you write a loop and iteratively retrieve all the data that matches your criteria. That's time-consuming because you have to write all the logic to perform the comparison and so on. Using LINQ, you can declaratively write the condition using an SQL-like statement, and the compiler des the job of retrieving the relevant data for you.

Suppose that you have an array of type string that contains a list of names. The following program prints out all the names in the string array that start with the character G:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace LINQ
{
    class Program
    {
        static void Main(string[] args)
        {
            string[] allNames = new string[] {
                "Jeffrey", "Kirby", "Gabriel",
                "Philip", "Ross", "Adam",
                "Alston", "Warren", "Garfield"};

            foreach (string str in allNames)
            {
                if (str.StartsWith("G"))
                {
                    Console.WriteLine(str);
                }
            }

            Console.ReadLine();
        }
    }
}

Using LINQ to Objects, you can rewrite the program as follows:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace LINQ
{
    class Program
    {
        static void Main(string[] args)
        {
            string[] allNames = new string[] {
                "Jeffrey", "Kirby", "Gabriel",
                "Philip", "Ross", "Adam",
"Alston", "Warren", "Garfield"};

            IEnumerable<string> foundNames =
                from name in allNames
                where name.StartsWith("G")
                select name;

            foreach (string str in foundNames)
                Console.WriteLine(str);

            Console.ReadLine();
        }
    }
}

Notice that you have declared the foundNames variable to be of type IEnumerable<string>, and the expression looks similar to that of SQL:

IEnumerable<string> foundNames =
                from name in allNames
                where name.StartsWith("G")
                select name;

The one important difference from SQL queries is that in a LINQ query the operator sequence is reversed. In SQL, you use the select-from-where format, while LINQ queries use the format from-where-select. This reversal in order allows IntelliSense to know which data source you are using so that it can provide useful suggestions for the where and select clauses.

The result of the query in this case is IEnumerable<string>. You can also use the new implicit typing feature in C# 3.0 to let the C# compiler automatically infer the type for you, like this:

var foundNames =
                from name in allNames
                where name.StartsWith("G")
                select name;

When you now use a foreach loop to go into the foundNames variable, it will contain a collection of names that starts with the letter G. In this case, it returns Gabriel, Garfield.

The usefulness of LINQ is more evident when you have more complex filters. For example:

var foundNames =
                from name in allNames
                where name.StartsWith("G") && name.EndsWith("l")
                select name;

In this case, only names that begin with G and end with "l" will be retrieved (Gabriel).

Here's an example where you have an array of integer values. You want to retrieve all the odd numbers in the array and sort them in descending order (that is, the bigger numbers come before the smaller numbers). Using LINQ, your code looks like this:

int[] nums = { 12, 34, 10, 3, 45, 6, 90, 22, 87, 49, 13, 32 };
            var oddNums = from n in nums
                          where (n % 2 == 1)
                          orderby n descending
                          select n;
            foreach (int n in oddNums)
                Console.WriteLine(n);

And here's what the code will print out:

87
49
45
13
3

To find out the total number of odd numbers found by the query, you can use the Count() method from the oddNums variable (of type IEnumerable<int>):

int count = oddNums.Count();

You can also convert the result into an int array, like this:

int[] oddNumsArray = oddNums.ToArray();

Query Syntax versus Method Syntax and Lambda Expressions

The two LINQ queries in the previous section use the query syntax, which is written in a declarative manner, like this:

var oddNums = from n in nums
                         where (n % 2 == 1)
                         orderby n descending
                         select n;

In addition to using the query syntax, you can also use the method syntax, which is written using method calls like Where and Select, like this:

int[] nums = { 12, 34, 10, 3, 45, 6, 90, 22, 87, 49, 13, 32 };
IEnumerable<int> oddNums = nums.Where(n => n % 2 == 1). OrderByDescending(n => n);

To find the total number of odd numbers in the array, you can also use the method syntax to query the array directly, like this:

int count = (nums.Where(n => n % 2 == 1).OrderBy(n => n)).Count();

Let's take a look at method syntax and how it works. First, the expression:

(n => n % 2 == 1)

is known as the lambda expression. The => is the lambda operator. You read it as "goes to," so this expression reads as "n goes to n modulus 2 equals to 1." Think of this lambda expression as a function that accepts a single input parameter, contains a single statement, and returns a value, like this:

static bool function(int n)
        {
            return (n % 2 == 1);
        }

The compiler automatically infers the type of n (which is int in this case because nums is an int array) in the lambda expression. However, you can also explicitly specify the type of n, like this:

IEnumerable<int> oddNums =
    nums.Where((int n) => n % 2 == 1).OrderByDescending(n => n);

The earlier example of the string array can also be rewritten using the method syntax as follows:

string[] allNames = new string[] {
       "Jeffrey", "Kirby", "Gabriel",
       "Philip", "Ross", "Adam",
       "Alston", "Warren", "Garfield"};

   var foundNames = allNames.Where(name = name.StartsWith("G") &&
                                           name.EndsWith("l"));

Which syntax should you use? Here's some information regarding the two syntaxes:

  • There is no performance difference between the method syntax and the query syntax.

  • The query syntax is much more readable, so use it whenever possible.

  • Use the method syntax for cases where there is no query syntax equivalent. For example, the Count and Max methods have no query equivalent syntax.

LINQ and Extension Methods

Chapter 4 explored extension methods and how you can use them to extend functionality to an existing class without needing to subclass it. One of the main reasons why the extension method feature was incorporated into the C# 3.0 language was because of LINQ.

Consider the earlier example where you have an array called allNames containing an array of strings. In .NET, objects that contain a collection of objects must implement the IEnumerable interface, so the allNames variable implicitly implements the IEnumerable interface, which only exposes one method — GetEnumerator. But when you use IntelliSense in Visual Studio 2008 to view the list of methods available in the allNames object, you see a list of additional methods, such as Select, Take, TakeWhile, Where, and so on (see Figure 14-2).

Figure 14-2

Figure 14.2. Figure 14-2

In C# 3.0, all these additional methods are known as extension methods, and they are extended to objects that implement the IEnumerable interface. These extension methods are the LINQ standard query operators.

In Visual Studio 2008, all extension methods are denoted by an additional arrow icon, as shown in Figure 14-3.

Figure 14-3

Figure 14.3. Figure 14-3

To add extension methods to objects implementing the IEnumerable interface, you need a reference to System.Core.dll and import the namespace by specifying the namespace:

using System.Linq;

The following table lists the LINQ standard query operators.

Operator Type

Operator Name

Aggregation

Aggregate, Average, Count, LongCount, Max, Min, Sum

Conversion

Cast, OfType, ToArray, ToDictionary, ToList, ToLookup, ToSequence

Element

DefaultIfEmpty, ElementAt, ElementAtOrDefault, First, FirstOrDefault, Last, LastOrDefault, Single, SingleOrDefault

Equality

EqualAll

Generation

Empty, Range, Repeat

Grouping

GroupBy

Joining

GroupJoin, Join

Ordering

OrderBy, ThenBy, OrderByDescending, ThenByDescending, Reverse

Partitioning

Skip, SkipWhile, Take, TakeWhile

Quantifiers

All, Any, Contains

Restriction

Where

Selection

Select, SelectMany

Set

Concat, Distinct, Except, Intersect, Union

Deferred Query Execution

The query variable itself only stores the query; it does not execute the query or store the result.

Take another look at the preceding example:

int[] nums = { 12, 34, 10, 3, 45, 6, 90, 22, 87, 49, 13, 32 };
         var oddNums = nums.Where
            (n => n % 2 == 1). OrderByDescending(n => n);

The oddNums variable simply stores the query (not the result of the query). The query is only executed when you iterate over the query variable, like this:

foreach (int n in oddNums)
             Console.WriteLine(n);

This concept is known as deferred execution, and it means that every time you access the query variable, the query is executed again. This is useful because you can just create one query and every time you execute it you will always get the most recent result.

To prove that deferred execution really works, the following program first defines a query and then prints out the result using a foreach loop. Twenty is added to each element in the array, and then the foreach loop is executed again.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication5
{
    class Program
    {
        static void Main(string[] args)
        {

            int[] nums = { 12, 34, 10, 3, 45, 6, 90, 22, 87, 49, 13, 32 };
            var oddNums = nums.Where(n => n % 2 == 1).OrderByDescending(n => n);

            Console.WriteLine("First execution");
            Console.WriteLine("---------------");
            foreach (int n in oddNums)
                Console.WriteLine(n);

            //---add 20 to each number in the array---
            for (int i = 0; i < 11; i++)
                nums[i] += 20;

            Console.WriteLine("Second execution");
            Console.WriteLine("----------------");
            foreach (int n in oddNums)
                Console.WriteLine(n);

            Console.ReadLine();
        }
    }
}

The program prints out the following output:

First execution
---------------
87
49
45
13
3
Second execution
----------------
107
69
65
33
23

Because the output for the second foreach loop is different from the first, the program effectively proves that the query is not executed until it is accessed.

Note

Deferred execution works regardless of whether you are using the query or method syntax.

Forced Immediate Query Execution

One way to force an immediate execution of the query is to explicitly convert the query result into a List object. For example, the following query converts the result to a List object:

var oddNums = nums.Where
                (n => n % 2 == 1).OrderByDescending(n => n).ToList();

In this case, the query is executed immediately, as proven by the following program and its output:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication5
{
    class Program
    {
        static void Main(string[] args)
        {
            int[] nums = { 12, 34, 10, 3, 45, 6, 90, 22, 87, 49, 13, 32 };
            var oddNums = nums.Where
                (n => n % 2 == 1).OrderByDescending(n => n).ToList();

            Console.WriteLine("First execution");
            Console.WriteLine("---------------");
            foreach (int n in oddNums)
                Console.WriteLine(n);

            //---add 20 to each number in the array---
            for (int i = 0; i < 11; i++)
                nums[i] += 20;

            Console.WriteLine("Second execution");
            Console.WriteLine("----------------");
            foreach (int n in oddNums)
                Console.WriteLine(n);

            Console.ReadLine();
        }
    }
}

Here's the program's output:

First execution
---------------
87
49
45
13
3
Second execution
----------------
87
49
45
13
3

The output of the first and second execution is the same, proving that the query is executed immediately after it's defined.

To force a LINQ query to execute immediately, you can use aggregate functions so that the query must iterate over the elements at once. An aggregate function takes a collection of values and returns a scalar value.

Aggregate functions are discussed in more detail later in this chapter.

Following is an example that uses the Count() aggregate function. The program selects all the odd numbers from an array and then counts the total number of odd numbers. Each number is then multiplied by two (which makes them all become even numbers).

static void Main(string[] args)
        {
            int[] nums = { 12, 34, 10, 3, 45, 6, 90, 22, 87, 49, 13, 32 };
            var oddNumsCount = nums.Where
                (n => n % 2 == 1).OrderByDescending(n => n).Count();

            Console.WriteLine("First execution");
            Console.WriteLine("---------------");
            Console.WriteLine("Count: {0}", oddNumsCount);

            //---add 20 to each number in the array---
            for (int i = 0; i < 11; i++)
                nums[i] *= 2; //---all number should now be even---

            Console.WriteLine("Second execution");
            Console.WriteLine("----------------");
            Console.WriteLine("Count: {0}", oddNumsCount);

            Console.ReadLine();
        }

The output shows that once the query is executed, its value does not change:

First execution
---------------
Count: 5
Second execution
----------------
Count: 5

LINQ and Anonymous Types

Although Chapter 4 explored anonymous types and how they allow you to define data types without having to formally define a class, you have not yet seen their real use. In fact, anonymous type is another new feature that Microsoft has designed with LINQ in mind.

Consider the following Contact class definition:

public class Contact
{
    public int id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

Suppose that you have a list containing Contact objects, like this:

List<Contact> Contacts = new List<Contact>() {
                new Contact() {id = 1, FirstName = "John", LastName = "Chen"},
                new Contact() {id = 2, FirstName = "Maryann", LastName = "Chen" },
                new Contact() {id = 3, FirstName = "Richard", LastName = "Wells" }
            };

You can use LINQ to query all contacts with Chen as the last name:

IEnumerable<Contact> foundContacts = from c in Contacts
                                     where c.LastName == "Chen"
                                     select c;

The foundContacts object is of type IEnumerable<Contact>. To print out all the contacts in the result, you can use the foreach loop:

foreach (var c in foundContacts)
            {
                Console.WriteLine("{0} - {1} {2}", c.id, c.FirstName, c.LastName);
            }

The output looks like this:

1 - John Chen
2 - Maryann Chen

However, you can modify your query such that the result can be shaped into a custom class instead of type Contact. To do so, modify the query as the following highlighted code shows:

var foundContacts = from c in Contacts
                                where c.LastName == "Chen"
                                select new
                                {
                                    id = c.id,
                                    Name = c.FirstName + " " + c.LastName
                                };

Here, you reshape the result using the anonymous type feature new in C# 3.0. Notice that you now have to use the var keyword to let the compiler automatically infer the type of foundContacts. Because the result is an anoymous type that you are defining, the following generates an error:

IEnumerable<Contact> foundContacts = from c in Contacts
                                         where c.LastName == "Chen"
                                         select new
                                         {
                                             id = c.id,
                                             Name = c.FirstName + " " + c.LastName
                                         };

To print the results, use the foreach loop as usual:

foreach (var c in foundContacts)
            {
                Console.WriteLine("{0} - {1}", c.id, c.Name);
            }

Figure 14-4 shows that IntelliSense automatically knows that the result is an anonymous type with two fields — id and Name.

Figure 14-4

Figure 14.4. Figure 14-4

LINQ to DataSet

Besides manipulating data in memory, LINQ can also be used to query data stored in structures like DataSets and DataTables.

ADO.NET is the data access technology in .NET that allows you to manipulate data sources such as databases. If you are familiar with ADO.NET, you are familiar with the DataSet object, which represents an in-memory cache of data. Using LINQ to DataSet, you can use LINQ queries to access data stored in a DataSet object. Figure 14-5 shows the relationships between LINQ to DataSet and ADO.NET 2.0.

Figure 14-5

Figure 14.5. Figure 14-5

Notice that LINQ to DataSet is built on top of ADO.NET 2.0. You can continue using your ADO.NET code to access data stored in a DataSet, but using LINQ to DataSet will greatly simplify your tasks.

The best way to understand LINQ to DataSet is to look at an example and see how it can simplify your coding. The following code shows how, using ADO.NET, you can connect to the pubs sample database, retrieve all the authors from the Authors table, and then print their IDs and names to the output window:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Data.SqlClient;

namespace LINQtoDataset
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            SqlConnection conn;
            SqlCommand comm;
            SqlDataAdapter adapter;
            DataSet ds = new DataSet();

            //---loads the Authors table into the dataset---
            conn = new SqlConnection(@"Data Source=.SQLEXPRESS;" +
                   "Initial Catalog=pubs;Integrated Security=True");
            comm = new SqlCommand("SELECT * FROM Authors", conn);
            adapter = new SqlDataAdapter(comm);
adapter.Fill(ds);

            foreach (DataRow row in ds.Tables[0].Rows)
            {
                Console.WriteLine("{0} - {1} {2}",
                    row["au_id"], row["au_fname"], row["au_lname"]);
            }
        }
    }
}

Observe that all the rows in the Authors table are now stored in the ds DataSet object (in ds.Tables[0]). To print only those authors living in CA, you would need to write the code to do the filtering:

foreach (DataRow row in ds.Tables[0].Rows)
            {
                if (row["state"].ToString() == "CA")
                {
                    Console.WriteLine("{0} - {1} {2}",
                        row["au_id"], row["au_fname"], row["au_lname"]);
                }
            }

Using LINQ to DataSet, you can write a query that only retrieves authors living in CA:

//---query for authors living in CA---
            EnumerableRowCollection<DataRow> authors =
                from author in ds.Tables[0].AsEnumerable()
                where author.Field<string>("State") == "CA"
                select author;

The result of the query is of type EnumerableRowCollection<DataRow>. Alternatively, you can also use the var keyword to let the compiler determine the correct data type:

var authors =
                from author in ds.Tables[0].AsEnumerable()
                where author.Field<string>("State") == "CA"
                select author;

Note

To make use of LINQ to DataSet, ensure that you have a reference to System.Data.DataSetExtensions.dll in your project.

To display the result, you can either bind the result to a DataGridView control using the AsDataView() method:

//---bind to a datagridview control---
            dataGridView1.DataSource = authors.AsDataView();

Or, iteratively loop through the result using a foreach loop:

foreach (DataRow row in authors)
            {
                Console.WriteLine("{0} - {1}, {2}",
                   row["au_id"], row["au_fname"], row["au_lname"]);
            }

To query the authors based on their contract status, use the following query:

EnumerableRowCollection<DataRow> authors =
               from author in ds.Tables[0].AsEnumerable()
               where author.Field<Boolean>("Contract") == true
               select author;

Reshaping Data

Using the new anonymous types feature in C# 3.0, you can define a new type without needing to define a new class. Consider the following statement:

//---query for authors living in CA---
            var authors =
                from author in ds.Tables[0].AsEnumerable()
                where author.Field<string>("State") == "CA"
                select new
                {
                    ID = author.Field<string>("au_id"),
                    FirstName = author.Field<string>("au_fname"),
                    LastName = author.Field<string>("au_lname")
                };

Here, you select all the authors living in the CA state and at the same time create a new type consisting of three properties: ID, FirstName, and LastName. If you now type the word authors, IntelliSense will show you that authors is of type EnumerableRowCollection <'a> authors, and 'a is an anonymous type containing the three fields (see Figure 14-6).

Figure 14-6

Figure 14.6. Figure 14-6

You can now print out the result using a foreach loop:

foreach (var row in authors)
            {
                Console.WriteLine("{0} - {1}, {2}",
                   row.ID, row.FirstName, row.LastName);
            }

To databind to a DataGridView control, you first must convert the result of the query to a List object:

//---query for authors living in CA---
            var authors =
                (from author in ds.Tables[0].AsEnumerable()
                where author.Field<string>("State") == "CA"
                select new
                {
                    ID = author.Field<string>("au_id"),
                    FirstName = author.Field<string>("au_fname"),
                    LastName = author.Field<string>("au_lname")
                }).ToList();

            //---bind to a datagridview control---
            dataGridView1.DataSource = authors;

Aggregate Functions

In an earlier section, you used the following query to obtain a list of authors living in CA:

var authors =
                from author in ds.Tables[0].AsEnumerable()
                where author.Field<string>("State") == "CA"
                select author;

To get the total number of authors living in CA, you can use the Count() extension method (also known as an aggregate function), like this:

Console.WriteLine(authors.Count());

A much more efficient way would be to use the following query in method syntax:

var query =
                ds.Tables[0].AsEnumerable()
                .Count(a => a.Field<string>("State")=="CA");
            Console.WriteLine(query);

LINQ supports the following standard aggregate functions:

Aggregate function

Description

Aggregate

Performs a custom aggregation operation on the values of a collection.

Average

Calculates the average value of a collection of values.

Count

Counts the elements in a collection, optionally only those elements that satisfy a predicate function.

LongCount

Counts the elements in a large collection, optionally only those elements that satisfy a predicate function.

Max

Determines the maximum value in a collection.

Min

Determines the minimum value in a collection.

Sum

Calculates the sum of the values in a collection.

For example, the following statements print out the largest odd number contained in the nums array:

int[] nums = { 12, 34, 10, 3, 45, 6, 90, 22, 87, 49, 13, 32 };
       var maxOddNums = nums.Where
           (n => n % 2 == 1).OrderByDescending(n => n).Max();
       Console.WriteLine("Largest odd number: {0}", maxOddNums); //---87---

The following statements print out the sum of all the odd numbers in nums:

int[] nums = { 12, 34, 10, 3, 45, 6, 90, 22, 87, 49, 13, 32 };
       var sumOfOddNums = nums.Where
           (n => n % 2 == 1).OrderByDescending(n => n).Sum();
       Console.WriteLine("Sum of all odd number: {0}", sumOfOddNums); //---197---

Joining Tables

So far you've been dealing with a single table. In real life, you often have multiple, related tables. A good example is the Northwind sample database, which contains a number of related tables, three of which are shown in Figure 14-7.

Figure 14-7

Figure 14.7. Figure 14-7

Here, the Customers table is related to the Orders table via the CustomerID field, while the Orders table is related to the Order_Details table via the OrderID field.

You can use LINQ to DataSet to join several tables stored in a DataSet. Here's how. First, load the three tables into the DataSet, using the following code:

conn = new SqlConnection(@"Data Source=.SQLEXPRESS;" +
                   "Initial Catalog=Northwind;Integrated Security=True");
            comm = new SqlCommand("SELECT * FROM Customers; SELECT * FROM Orders;
SELECT * FROM [Order Details]", conn);
            adapter = new SqlDataAdapter(comm);
            adapter.Fill(ds);

The three tables loaded onto the DataSet can now be referenced using three DataTable objects:

DataTable customersTable = ds.Tables[0];    //---Customers---
            DataTable ordersTable = ds.Tables[1];       //---Orders---
            DataTable orderDetailsTable = ds.Tables[2]; //---Order Details---

The following LINQ query joins two DataTable objects — customersTable and ordersTable — using the query syntax:

//---using query syntax to join two tables - Customers and Orders---
            var query1 =
                (from customer in customersTable.AsEnumerable()
                 join order in ordersTable.AsEnumerable() on
                 customer.Field<string>("CustomerID") equals
                 order.Field<string>("CustomerID")
                 select new
                 {
                     id = customer.Field<string>("CustomerID"),
                     CompanyName = customer.Field<string>("CompanyName"),
                     ContactName = customer.Field<string>("ContactName"),
                     OrderDate = order.Field<DateTime>("OrderDate"),
                     ShipCountry = order.Field<string>("ShipCountry")
                 }).ToList();

As evident in the query, the Customers and Orders table are joined using the CustomerID field. The result is reshaped using an anonymous type and then converted to a List object using the ToList() extension method. You can now bind the result to a DataGridView control if desired. Figure 14-8 shows the result bound to a DataGridView control.

Figure 14-8

Figure 14.8. Figure 14-8

You can also rewrite the query using the method syntax:

//---using method syntax to join two tables - Customers and Orders---
            var query1 =
                (customersTable.AsEnumerable().Join(ordersTable.AsEnumerable(),
                 customer => customer.Field<string>("CustomerID"),
                 order => order.Field<string>("CustomerID"),
                 (customer, order) => new
                 {
                     id = customer.Field<string>("CustomerID"),
                     CompanyName = customer.Field<string>("CompanyName"),
                     ContactName = customer.Field<string>("ContactName"),
                     OrderDate = order.Field<DateTime>("OrderDate"),
                     ShipCountry = order.Field<string>("ShipCountry")
                 })).ToList();

The following query joins three DataTable objects — customersTable, ordersTable, and orderDetailsTable — and sorts the result according to the OrderID field:

//---three tables join---
            var query2 =
                (from customer in customersTable.AsEnumerable()
                 join order in ordersTable.AsEnumerable() on
                 customer.Field<string>("CustomerID") equals
                 order.Field<string>("CustomerID")
                 join orderDetail in orderDetailsTable.AsEnumerable() on
                 order.Field<int>("OrderID") equals
                 orderDetail.Field<int>("OrderID")
                 orderby order.Field<int>("OrderID")
                 select new
                 {
                     id = customer.Field<string>("CustomerID"),
                     CompanyName = customer.Field<string>("CompanyName"),
                     ContactName = customer.Field<string>("ContactName"),
                     OrderDate = order.Field<DateTime>("OrderDate"),
                     ShipCountry = order.Field<string>("ShipCountry"),
                     OrderID = orderDetail.Field<int>("OrderID"),
                     ProductID = orderDetail.Field<int>("ProductID")
                 }).ToList();

As evident from the query, the Customers table is related to the Orders table via the CustomerID field, and the Orders table is related to the Order Details table via the OrderID field.

Figure 14-9 shows the result of the query.

Figure 14-9

Figure 14.9. Figure 14-9

Typed DataSet

So far you've used the Field() extension method to access the field of a DataTable object. For example, the following program uses LINQ to DataSet to query all the customers living in the USA. The result is then reshaped using an anonymous type:

SqlConnection conn;
            SqlCommand comm;
            SqlDataAdapter adapter;
            DataSet ds = new DataSet();

            conn = new SqlConnection(@"Data Source=.SQLEXPRESS;" +
                   "Initial Catalog=Northwind;Integrated Security=True");
            comm = new SqlCommand("SELECT * FROM Customers", conn);
            adapter = new SqlDataAdapter(comm);
            adapter.Fill(ds, "Customers");

            var query1 =
                (from customer in ds.Tables[0].AsEnumerable()
                 where customer.Field<string>("Country") == "USA"
                 select new
                 {
                     CustomerID = customer.Field<string>("CustomerID"),
                     CompanyName = customer.Field<string>("CompanyName"),
                     ContactName = customer.Field<string>("ContactName"),
                     ContactTitle = customer.Field<string>("ContactTitle")
                 }).ToList();

            dataGridView1.DataSource = query1;

As your query gets more complex, the use of the Field() extension method makes the query unwieldy. A good way to resolve this is to use the typed DataSet feature in ADO.NET. A typed DataSet provides strongly typed methods, events, and properties and so this means you can access tables and columns by name, instead of using collection-based methods.

To add a typed DataSet to your project, first add a DataSet item to your project in Visual Studio 2008 (see Figure 14-10). Name it TypedCustomersDataset.xsd.

Figure 14-10

Figure 14.10. Figure 14-10

In the Server Explorer window, open a connection to the database you want to use (in this case it is the Northwind database) and drag and drop the Customers table onto the design surface of TypedCustomersDataSet.xsd (see Figure 14-11). Save the TypedCustomersDataSet.xsd file.

Figure 14-11

Figure 14.11. Figure 14-11

With the typed DataSet created, rewrite the query as follows:

SqlConnection conn;
            SqlCommand comm;
            SqlDataAdapter adapter;
            TypedCustomersDataSet ds = new TypedCustomersDataSet();

            conn = new SqlConnection(@"Data Source=.SQLEXPRESS;" +
                   "Initial Catalog=Northwind;Integrated Security=True");
            comm = new SqlCommand("SELECT * FROM Customers", conn);
            adapter = new SqlDataAdapter(comm);
            adapter.Fill(ds, "Customers");

            var query1 =
                (from customer in ds.Customers
                 where customer.Country == "USA"
                 select new
                 {
                     customer.CustomerID,
                     customer.CompanyName,
                     customer.ContactName,
                     customer.ContactTitle
                 }).ToList();

            dataGridView1.DataSource = query1;

Notice that the query is now much clearer because there is no need to use the Field() extension method. Figure 14-12 shows the output.

Figure 14-12

Figure 14.12. Figure 14-12

Detecting Null Fields

Using the same query used in the previous section, let's modify it so that you can retrieve all customers living in the WA region:

var query1 =
                (from customer in ds.Customers
                 where customer.Region=="WA"
                 select new
                 {
                     customer.CustomerID,
                     customer.CompanyName,
                     customer.ContactName,
                     customer.ContactTitle
                 }).ToList();

When you execute the query, the program raises an exception. That's because some of the rows in the Customers table have null values for the Region field. To prevent this from happening, you need to use the IsNull() method to check for null values, like this:

var query1 =
                (from customer in ds.Customers
                 where !customer.IsNull("Region") &&  customer.Region == "WA"
                 select new
                 {
                     customer.CustomerID,
                     customer.CompanyName,
                     customer.ContactName,
                     customer.ContactTitle
                 }).ToList();

Notice that LINQ uses short-circuiting when evaluating the conditions in the where statement, so the IsNull() method must be placed before other conditions.

Interestingly, the Field() extension method handles nullable types, so you do not have to explicitly check for null values if you are not using typed DataSets.

Saving the Result of a Query to a DataTable

The result of a LINQ query can be saved into a DataTable object by using the CopyToDataTable() method. The CopyToDataTable() method takes the result of a query and copies the data into a DataTable, which can then be used for data binding.

The following example shows a LINQ query using typed DataSet with the result copied to a DataTable object and then bound to a DataGridView control:

var query1 =
               from customer in ds.Customers
               where customer.Country == "USA"
               select customer;

            DataTable USACustomers = query1.CopyToDataTable ();
            dataGridView1.DataSource = USACustomers;

Note that the CopyToDataTable() method only operates on an IEnumerable<T> source where the generic parameter T is of type DataRow. Hence, it does not work for queries that project anonymous types or queries that perform table joins.

LINQ to XML

Also very cool is LINQ's capability to manipulate XML documents. In the past, you had to use XPath or XQuery whenever you need to manipulate XML documents. Using LINQ to XML, you can now query XML trees and documents using the familiar LINQ syntax.

Note

To use the LINQ to XML, you must add a reference to the System.Xml.Linq.dll in your project and also import the System.Xml.Linq namespace.

Creating XML Trees

To create an XML document tree in memory, use the XDocument object, which represents an XML document. To create an XML element, use the XElement class; for attributes, use the XAttribute class. The following code shows how to build an XML document using these objects:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;

namespace LINQtoXML
{
    class Program
    {
        static void Main(string[] args)
        {
            XDocument library = new XDocument(
               new XElement("Library",
                   new XElement("Book",
                     new XAttribute("published", "NYP"),
                     new XElement("Title", "C# 2008 Programmers' Reference"),
                     new XElement("Publisher", "Wrox")
                  ),
                  new XElement("Book",
new XAttribute("published", "Published"),
                     new XElement("Title", "Professional Windows Vista " +
                                  "Gadgets Programming"),
                     new XElement("Publisher", "Wrox")
                  ),
                  new XElement("Book",
                     new XAttribute("published", "Published"),
                     new XElement("Title", "ASP.NET 2.0 - A Developer's " +
                                  "Notebook"),
                     new XElement("Publisher", "O'Reilly")
                  ),
                  new XElement("Book",
                     new XAttribute("published", "Published"),
                     new XElement("Title", ".NET 2.0 Networking Projects"),
                     new XElement("Publisher", "Apress")
                  ),
                  new XElement("Book",
                     new XAttribute("published", "Published"),
                     new XElement("Title", "Windows XP Unwired"),
                     new XElement("Publisher", "O'Reilly")
                  )
               )
            );
        }
    }
}

The indentation gives you an overall visualization of the document structure.

To save the XML document to file, use the Save() method:

library.Save("Books.xml");

To print out the XML document as a string, use the ToString() method:

Console.WriteLine(library.ToString());

When printed, the XML document looks like this:

<Library>
  <Book published="NYP">
    <Title>C# 2008 Programmers' Reference</Title>
    <Publisher>Wrox</Publisher>
  </Book>
  <Book published="Published">
    <Title>Professional Windows Vista Gadgets Programming</Title>
    <Publisher>Wrox</Publisher>
  </Book>
  <Book published="Published">
    <Title>ASP.NET 2.0 - A Developer's Notebook</Title>
    <Publisher>O'Reilly</Publisher>
  </Book>
  <Book published="Published">
    <Title>.NET 2.0 Networking Projects</Title>
    <Publisher>Apress</Publisher>
  </Book>
  <Book published="Published">
    <Title>Windows XP Unwired</Title>
    <Publisher>O'Reilly</Publisher>
  </Book>
</Library>

To load an XML document into the XDocument object, use the Load() method:

XDocument LibraryBooks = new XDocument();
            LibraryBooks = XDocument.Load("Books.xml");

Querying Elements

You can use LINQ to XML to locate specific elements. For example, to retrieve all books published by Wrox, you can use the following query:

var query1 =
                from book in LibraryBooks.Descendants("Book")
                where book.Element("Publisher").Value == "Wrox"
                select book.Element("Title").Value;
Console.WriteLine("------");
                Console.WriteLine("Result");
                Console.WriteLine("------");
                foreach (var book in query1)
                {
                    Console.WriteLine(book);
                }

This query generates the following output:

------
Result
------
C# 2008 Programmers' Reference
Professional Windows Vista Gadgets Programming

To retrieve all not-yet-published (NYP) books from Wrox, you can use the following query:

var query2 =
               from book in library.Descendants("Book")
               where book.Attribute("published").Value == "NYP" &&
                     book.Element("Publisher").Value=="Wrox"
               select book.Element("Title").Value;

You can shape the result of a query as you've seen in earlier sections:

var query3 =
                from book in library.Descendants("Book")
                where book.Element("Publisher").Value == "Wrox"
                select new
                {
                    Name = book.Element("Title").Value,
                    Pub = book.Element("Publisher").Value
                };

            Console.WriteLine("------");
            Console.WriteLine("Result");
            Console.WriteLine("------");
            foreach (var book in query3)
            {
                Console.WriteLine("{0} ({1})", book.Name, book.Pub);
            }

This code generates the following output:

------
Result
------
C# 2008 Programmers' Reference (Wrox)
Professional Windows Vista Gadgets Programming (Wrox)

Besides using an anonymous type to reshape the result, you can also pass the result to a non-anonymous type. For example, suppose that you have the following class definition:

public class Book
    {
        public string Name { get; set; }
        public string Pub { get; set; }
    }

You can shape the result of a query to the Book class, as the following example shows:

var query4 =
                 from book in library.Descendants("Book")
                 where book.Element("Publisher").Value == "Wrox"
                 select new Book
                 {
                     Name = book.Element("Title").Value,
                     Pub = book.Element("Publisher").Value
                 };

            List<Book> books = query4.ToList();

An Example Using RSS

Let's now take a look at the usefulness of LINQ to XML. Suppose that you want to build an application that downloads an RSS document, extracts the title of each posting, and displays the link to each post.

Figure 14-13 shows an example of an RSS document.

Figure 14-13

Figure 14.13. Figure 14-13

To load an XML document directly from the Internet, you can use the Load() method from the XDocument class:

XDocument rss =
XDocument.Load(@"http://www.wrox.com/WileyCDA/feed/RSS_WROX_ALLNEW.xml");

To retrieve the title of each posting and then reshape the result, use the following query:

var posts =
                from item in rss.Descendants("item")
                select new
                {
                    Title = item.Element("title").Value,
                    URL = item.Element("link").Value
                };

In particular, you are looking for all the <item> elements and then for each <item> element found you would extract the values of the <title> and <link> elements.

<rss>
   <channel>
   ...
      <item>
         <title>...</title>
         <link>...</link>
         ...
      </item>

      <item>
         <title>...</title>
         <link>...</link>
         ...
      </item>

      <item>
         <title>...</title>
         <link>...</link>
         ...
      </item>

...

Finally, print out the title and URL for each post:

foreach (var post in posts)
            {
                Console.WriteLine("{0}", post.Title);
                Console.WriteLine("{0}", post.URL);
                Console.WriteLine();
            }

Figure 14-14 shows the output.

Figure 14-14

Figure 14.14. Figure 14-14

Query Elements with a Namespace

If you observe the RSS document structure carefully, you notice that the <creator> element has the dc namespace defined (see Figure 14-15).

Figure 14-15

Figure 14.15. Figure 14-15

The dc namespace is defined at the top of the document, within the <rss> element (see Figure 14-16).

Figure 14-16

Figure 14.16. Figure 14-16

When using LINQ to XML to query elements defined with a namespace, you need to specify the namespace explicitly. The following example shows how you can do so using the XNamespace element and then using it in your code:

XDocument rss =
XDocument.Load(@"http://www.wrox.com/WileyCDA/feed/RSS_WROX_ALLNEW.xml");

            XNamespace dcNamespace = "http://purl.org/dc/elements/1.1/";

            var posts =
                from item in rss.Descendants("item")
                select new
                {
                    Title = item.Element("title").Value,
                    URL = item.Element("link").Value,
                    Creator = item.Element(dcNamespace + "creator").Value
                };

            foreach (var post in posts)
            {
                Console.WriteLine("{0}", post.Title);
                Console.WriteLine("{0}", post.URL);
                Console.WriteLine("{0}", post.Creator);
                Console.WriteLine();
            }

Figure 14-17 shows the query result.

Figure 14-17

Figure 14.17. Figure 14-17

Retrieving Postings in the Last 10 Days

The <pubDate> element in the RSS document contains the date the posting was created. To retrieve all postings published in the last 10 days, you would need to use the Parse() method (from the DateTime class) to convert the string into a DateTime type and then deduct it from the current time. Here's how that can be done:

XDocument rss =
                XDocument.Load(
                @"http://www.wrox.com/WileyCDA/feed/RSS_WROX_ALLNEW.xml");

            XNamespace dcNamespace = "http://purl.org/dc/elements/1.1/";

            var posts =
                from item in rss.Descendants("item")
                where (DateTime.Now -
                       DateTime.Parse(item.Element("pubDate").Value)).Days < 10
                select new
                {
                    Title = item.Element("title").Value,
                    URL = item.Element("link").Value,
                    Creator = item.Element(dcNamespace + "creator").Value,
                    PubDate = DateTime.Parse(item.Element("pubDate").Value)
                };

            Console.WriteLine("Today's date: {0}",
                               DateTime.Now.ToShortDateString());
            foreach (var post in posts)
            {
                Console.WriteLine("{0}", post.Title);
                Console.WriteLine("{0}", post.URL);
                Console.WriteLine("{0}", post.Creator);
                Console.WriteLine("{0}", post.PubDate.ToShortDateString());
                Console.WriteLine();
            }

LINQ to SQL

LINQ to SQL is a component of the .NET Framework (v3.5) that provides a runtime infrastructure for managing relational data as objects.

With LINQ to SQL, a relational database is mapped to an object model. Instead of manipulating the database directly, developers manipulate the object model, which represents the database. After changes are made to it, the object model is submitted to the database for execution.

Visual Studio 2008 includes the new Object Relational Designer (O/R Designer), which provides a user interface for creating LINQ to SQL entity classes and relationships. It enables you to easily model and visualize a database as a LINQ to SQL object model.

Using the Object Relational Designer

To see how LINQ to SQL works, create a new Windows application using Visual Studio 2008.

First, add a new LINQ to SQL Classes item to the project. Use the default name of DataClasses1.dbml (see Figure 14-18).

Figure 14-18

Figure 14.18. Figure 14-18

In Server Explorer, open a connection to the database you want to use. For this example, use the pubs sample database. Drag and drop the following tables onto the design surface of DataClasses1.dbml:

  • authors

  • publishers

  • titleauthor

  • titles

Figure 14-19 shows the relationships among these four tables.

Figure 14-19

Figure 14.19. Figure 14-19

Now save the DataClasses1.dbml file, and Visual Studio 2008 will create the relevant classes to represent the tables and relationships that you just modeled. For every LINQ to SQL file you added to your solution, a DataContext class is generated. You can view this using the Class Viewer (View

Figure 14-19
Figure 14-20

Figure 14.20. Figure 14-20

Querying

With the database modeled using the LINQ to SQL designer, it's time to write some code to query the database. First, create an instance of the DataClasses1DataContext class:

DataClasses1DataContext database = new DataClasses1DataContext();

To retrieve all the authors living in CA, use the following code:

var authors = from a in database.authors
                          where (a.state == "CA")
                          select new
                          {
                              Name = a.au_fname + " " + a.au_lname
                          };

            foreach (var a in authors)
                Console.WriteLine(a.Name);

To retrieve all the titles in the titles table and at the same time print out the publisher name of each title, you first retrieve all the titles from the titles table:

var titles = from t in database.titles
                         select t;

And then you retrieve each title's associated publisher:

foreach (var t in titles)
            {
                Console.Write("{0} ", t.title1);
                var publisher = from p in database.publishers
                                where p.pub_id == t.pub_id
                                select p;
                if (publisher.Count() > 0)
                    Console.WriteLine("({0})", publisher.First().pub_name);
            }

The output looks something like this:

Cooking with Computers: Surreptitious Balance Sheets (Algodata Infosystems)
        You Can Combat Computer Stress! (New Moon Books)
        How to Motivate Your Employees Straight Talk About Computers (Algodata Infosystems)
        Silicon Valley Gastronomic Treats (Binnet & Hardley)
        The Gourmet Microwave (Binnet & Hardley)
        The Psychology of Computer Cooking (Binnet & Hardley)
        But Is It User Friendly? (Algodata Infosystems)
        Secrets of Silicon Valley (Algodata Infosystems)
        Net Etiquette (Algodata Infosystems)
        Computer Phobic AND Non-Phobic Individuals: Behavior Variations (Binnet & Hardley)
        Is Anger the Enemy? (New Moon Books)
        Life Without Fear (New Moon Books)
        Prolonged Data Deprivation: Four Case Studies (New Moon Books)
        Emotional Security: A New Algorithm (New Moon Books)
        Onions, Leeks, and Garlic: Cooking Secrets of the Mediterranean (Binnet & Hardley)
        Fifty Years in Buckingham Palace Kitchens (Binnet & Hardley)
        Sushi, Anyone? (Binnet & Hardley)

Inserting New Rows

To insert a row into a table, use the InsertOnSubmit() method. For example, the following code inserts a new author into the authors table:

DataClasses1DataContext database = new DataClasses1DataContext();

            author a = new author()
            {
                au_id = "789-12-3456",
                au_fname = "James",
                au_lname = "Bond",
                phone = "987654321"
            };

            //---record is saved to object model---
            database.authors.InsertOnSubmit(a);

Note that the InsertOnSubmit() method only affects the object model; it does not save the changes back to the database. To save the changes back to the database, you need to use the SubmitChanges() method:

//---send changes to database---
            database.SubmitChanges();

What happens when you need to insert a new book title from a new author? As you saw earlier, the titles table is related to the titleauthors via the title_id field, while the authors table is related to the titleauthors table via the author_id field. Therefore, if you insert a new row into the titles table, you need to insert a new row into the authors and titleauthors tables as well.

To do so, you first create a new author and title row:

DataClasses1DataContext database = new DataClasses1DataContext();
            author a = new author()
            {
                au_id = "123-45-6789",
                au_fname = "Wei-Meng",
                au_lname = "Lee",
                phone = "123456789"
            };

            title t = new title()
            {
                title_id = "BU5555",
                title1 = "How to Motivate Your Employees",
                pubdate = System.DateTime.Now,
                type = "business"
            };

Then, add a new titleauthor row by associating its author and title properties with the new title and author row you just created:

titleauthor ta = new titleauthor()
            {
                author = a,
                title = t
            };

Finally, save the changes to the object model and submit the changes to the database:

//---record is saved to object model---
            database.titleauthors.InsertOnSubmit(ta);

            //---send changes to database---
            database.SubmitChanges();

Notice that you do not need to worry about indicating the title_id and author_id fields in the titleauthors table; LINQ to SQL does those for you automatically.

Updating Rows

Updating rows using LINQ to SQL is straightforward — you retrieve the record you need to modify:

DataClasses1DataContext database = new DataClasses1DataContext();
            title bookTitle = (from t in database.titles
                               where (t.title_id == "BU5555")
                               select t).Single();

The Single() method returns the only element of a sequence, and throws an exception if there is not exactly one element in the sequence.

Modify the field you want to change:

bookTitle.title1 = "How to Motivate Your Staff";

And submit the changes using the SubmitChanges() method:

database.SubmitChanges();

The query can alternatively be written using the method syntax, like this:

title bookTitle = database.titles.Single(t => t.title_id == "BU5555");

Deleting Rows

To delete a row, you first retrieve the row to delete:

DataClasses1DataContext database = new DataClasses1DataContext();
            //---find author ---
            var author = from a in database.authors
                         where a.au_id == "789-12-3456"
                         select a;

Then, locate the row to delete by using the First() method, and finally call the DeleteOnSubmit() method to delete the row:

if (author.Count() > 0)
            {
                database.authors.DeleteOnSubmit(author.First());
                database.SubmitChanges();
            }

The First() method returns the first element of a sequence.

If you have multiple rows to delete, you need to delete each row individually, like this:

//---find author ---
            var authors = from a in database.authors
                          where a.au_id == "111-11-1111" ||
                                a.au_id == "222-22-1111"
                          select a;

            foreach (author a in authors)
            {
                database.authors.DeleteOnSubmit(a);
            }
            database.SubmitChanges();

So far the deletion works only if the author to be deleted has no related rows in the titleauthors and titles tables. If the author has associated rows in the titleauthors and titles tables, these examples cause an exception to be thrown because the deletions violate the referential integrity of the database (see Figure 14-21).

Figure 14-21

Figure 14.21. Figure 14-21

Because LINQ to SQL does not support cascade-delete operations, you need to make sure that rows in related tables are also deleted when you delete a row. The following code example shows how to delete a title from the titles and titleauthors tables:

DataClasses1DataContext database = new DataClasses1DataContext();
            string titleid_to_remove = "BU5555";

            //---find all associated row in Titles table---
            var title = from t in database.titles
                        where t.title_id == titleid_to_remove
                        select t;

            //---delete the row in the Titles table---
            foreach (var t in title)
                database.titles.DeleteOnSubmit(t);

            //---find all associated row in TitleAuthors table---
var titleauthor = from ta in database.titleauthors
                              where ta.title_id == titleid_to_remove
                              select ta;

            //---delete the row in the TitleAuthors table---
            foreach (var ta in titleauthor)
                database.titleauthors.DeleteOnSubmit(ta);

            //---submit changes to database---
            database.SubmitChanges();

Summary

This chapter, provides a quick introduction to the Language Integrated Query (LINQ) feature, which is new in .NET 3.5. It covered LINQ's four key implementations: LINQ to Objects, LINQ to XML, LINQ to Dataset, and LINQ to SQL. LINQ enables you to query various types of data sources, using a unified query language, making data access easy and efficient.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.225.57.67