Chapter 13. Introducing LINQ

One of the common programming tasks C# programmers perform every day is finding and retrieving objects in memory, a database, or an XML file. For example, you may be developing a cell phone customer support system that will allow a customer to see how much each member of the family has spent in phone calls. To do so, you’ll need to retrieve records from various sources (phone company records online, phone books kept locally, etc.), filtered by various criteria (by name or by month), and sorted in various ways (e.g., by date, by family member).

One way you might have implemented this in the past would be to search a database by address, returning all the records to the user, perhaps presenting them in a listbox. The user would pick the name she was interested in and the data of interest (e.g., the number of ringtones downloaded in the past three months), and you would go back to the database (or perhaps to a different database) and retrieve the required data, using the chosen family member’s unique ID as a key.

Although C# provides support for in-memory searches such as finding a name in a collection, traditionally, you were required to turn to another technology (such as ADO.NET) to retrieve data from a database. Although ADO.NET made this fairly easy, a sharp distinction was drawn between retrieving data from in-memory collections and retrieving data from persistent storage.

In-memory searches lacked the powerful and flexible query capabilities of SQL, whereas ADO.NET was not integrated into C#, and SQL itself was not object-oriented (in fact, the point of ADO.NET was to bridge the gap between the object and relational models). LINQ is an integrated feature of C# 3.0 itself, and thus (at long last) provides an object-oriented bridge across the impedance mismatch between object-oriented languages and relational databases.

The goal of LINQ (Language-INtegrated Query) is to integrate extensive query capabilities into the C# language, to make SQL-like capabilities part of the language, and to remove the distinctions among searching a database, an XML document, or an in-memory data collection.

This chapter will introduce LINQ and show how it fits into C# and into your programming. Subsequent chapters will dive into the details of using LINQ to retrieve and manipulate data in databases and in other data repositories. You’ll learn about ADO.NET in Chapter 16.

Defining and Executing a Query

In previous versions of C#, if you wanted to find an object in a database you had to leave C# and turn to the Framework (most often ADO.NET). With LINQ, you can stay within C#, and thus within a fully class-based perspective.

Tip

Many books start with anonymous methods, then introduce lambda expressions, and finally introduce LINQ. It is my experience that it is far easier to understand each of these concepts by going in the opposite direction, starting with queries and introducing lambda expressions for what they are: enabling technologies. Each of these topics will, however, be covered here and in subsequent chapters.

Let’s start simply by searching a collection for objects that match a given criterion, as demonstrated in Example 13-1.

Example 13-1. A simple LINQ query
using System;
using System.Collections.Generic;
using System.Linq;
namespace Programming_CSharp
{
    // Simple customer class
    public class Customer
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string EmailAddress { get; set; }

        // Overrides the Object.ToString(  ) to provide a
        // string representation of the object properties.
        public override string ToString(  )
        {
            return string.Format("{0} {1}\nEmail:   {2}",
                        FirstName, LastName, EmailAddress);
        }
    }

    // Main program
    public class Tester
    {
        static void Main(  )
        {
            List<Customer> customers = CreateCustomerList(  );

            // Find customer by first name
            IEnumerable<Customer> result =
                from   customer in customers
                where  customer.FirstName == "Donna"
                select customer;
            Console.WriteLine("FirstName == \"Donna\"");
            foreach (Customer customer in result)
                Console.WriteLine(customer.ToString(  ));

            customers[3].FirstName = "Donna";
            Console.WriteLine("FirstName == \"Donna\" (take two)");
            foreach (Customer customer in result)
                Console.WriteLine(customer.ToString(  ));
       }

        // Create a customer list with sample data
        private static List<Customer> CreateCustomerList(  )
        {
            List<Customer> customers = new List<Customer>
                {
                    new Customer { FirstName = "Orlando",
                                    LastName = "Gee",
                                    EmailAddress = "[email protected]"},
                    new Customer { FirstName = "Keith",
                                    LastName = "Harris",
                                    EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Donna",
                                    LastName = "Carreras",
                                    EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Janet",
                                    LastName = "Gates",
                                    EmailAddress = "[email protected]" },
                    new Customer { FirstName = "Lucy",
                                    LastName = "Harrington",
                                    EmailAddress = "[email protected]" }
                };
            return customers;
        }
    }
}

Output:
FirstName == "Donna"
Donna Carreras
Email:   [email protected]
FirstName == "Donna" (take two)
Donna Carreras
Email:   [email protected]
Donna Gates
Email:   [email protected]

Example 13-1 defines a simple Customer class with three properties: FirstName, LastName, and EmailAddress. It overrides the Object.ToString( ) method to provide a string representation of its instances.

Creating the Query

The program starts by creating a customer list with some sample data, taking advantage of object initialization as discussed in Chapter 4. Once the list of customers is created, Example 13-1 defines a LINQ query:

IEnumerable<Customer> result =
                from   customer in customers
                where  customer.FirstName == "Donna"
                select customer;

The result variable is initialized with a query expression. In this example, the query will retrieve all Customer objects whose first name is “Donna” from the customer list. The result of such a query is a collection that implements IEnumerable<T>, where T is the type of the result object. In this example, because the query result is a set of Customer objects, the type of the result variable is IEnumerable<Customer>.

Let’s dissect the query and look at each part in more detail.

The from clause

The first part of a LINQ query is the from clause:

from   customer in customers

The generator of a LINQ query specifies the data source and a range variable. A LINQ data source can be any collection that implements the System.Collections.Generic.IEnumerable<T> interface. In this example, the data source is customers, an instance of List<Customer> that implements IEnumerable<T>.

Tip

You’ll see how to do the same query against a SQL database in Chapter 15.

A LINQ range variable is like an iteration variable in a foreach loop, iterating over the data source. Because the data source implements IEnumerable<T>, the C# compiler can infer the type of the range variable from the data source. In this example, because the type of the data source is List<Customer>, the range variable customer is of type Customer.
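Type inference depends on the generic interface. As a minimal sketch (the class name RangeVariableDemo and the sample names are made up for illustration), the following shows what you can do when the data source implements only the non-generic IEnumerable, as ArrayList does: you state the type of the range variable yourself, and each element is cast to that type.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class RangeVariableDemo
{
    // Returns the names that start with "D" from a non-generic ArrayList.
    public static List<string> NamesStartingWithD()
    {
        // ArrayList implements only the non-generic IEnumerable,
        // so the compiler cannot infer the element type.
        ArrayList names = new ArrayList { "Donna", "Keith", "Dave" };

        // Declaring the range variable as string makes the query
        // cast each element to string before filtering.
        IEnumerable<string> result =
            from string name in names
            where name.StartsWith("D")
            select name;

        return result.ToList();
    }

    public static void Main()
    {
        foreach (string name in NamesStartingWithD())
            Console.WriteLine(name);   // prints Donna, then Dave
    }
}
```

With a List<T> source, as in Example 13-1, the explicit type is unnecessary and is usually omitted.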

Filtering

The second part of this LINQ query is the where clause, which is also called a filter. This portion of the query is optional:

where  customer.FirstName == "Donna"

The filter is a Boolean expression. It is common to use the range variable in a where clause to filter the objects in the data source. Because customer in this example is of type Customer, you use one of its properties, in this case FirstName, to apply the filter for your query.

Of course, you may use any Boolean expression as your filter. For instance, you can invoke the String.StartsWith( ) method to filter customers by the first letter of their last name:

where  customer.LastName.StartsWith("G")

You can also use composite expressions to construct more complex queries. In addition, you can use nested queries where the result of one query (the inner query) is used to filter another query (the outer query).
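To make this concrete, here is a small sketch of both ideas. The Customer class is a trimmed-down stand-in for the one in Example 13-1, and the method names CompositeFilter and NestedFilter are invented for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down stand-in for the Customer class in Example 13-1.
public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class FilterDemo
{
    public static List<Customer> CreateCustomers()
    {
        return new List<Customer>
        {
            new Customer { FirstName = "Orlando", LastName = "Gee" },
            new Customer { FirstName = "Keith",   LastName = "Harris" },
            new Customer { FirstName = "Donna",   LastName = "Carreras" },
            new Customer { FirstName = "Janet",   LastName = "Gates" },
            new Customer { FirstName = "Lucy",    LastName = "Harrington" }
        };
    }

    // Composite filter: both conditions must hold.
    public static List<string> CompositeFilter(List<Customer> customers)
    {
        return (from customer in customers
                where customer.LastName.StartsWith("G")
                      && customer.FirstName != "Janet"
                select customer.FirstName).ToList();
    }

    // Nested query: the inner query finds the longest last-name length,
    // and the outer query keeps only the customers that match it.
    public static List<string> NestedFilter(List<Customer> customers)
    {
        return (from customer in customers
                where customer.LastName.Length ==
                      (from c in customers select c.LastName.Length).Max()
                select customer.FirstName).ToList();
    }

    public static void Main()
    {
        List<Customer> customers = CreateCustomers();
        Console.WriteLine(CompositeFilter(customers)[0]);   // Orlando
        Console.WriteLine(NestedFilter(customers)[0]);      // Lucy
    }
}
```

Note that the inner query in NestedFilter is evaluated once for every customer the outer query examines; for a large data source you would probably compute the maximum once, before the query.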

Projection (or select)

The last part of a LINQ query is the select clause (known to database geeks as the “projection”), which defines (or projects) the results:

select customer;

In this example, the query returns the customer objects that satisfy the query condition. You may constrain which fields you project, much as you would with SQL. For instance, you can return only the email addresses of the qualifying customers:

  select customer.EmailAddress;
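Projecting a single property changes the type of the query result. The sketch below assumes a cut-down Customer class and made-up example.com addresses; because the select clause yields a string, the result is an IEnumerable<string> rather than an IEnumerable<Customer>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down stand-in for the Customer class in Example 13-1.
public class Customer
{
    public string LastName { get; set; }
    public string EmailAddress { get; set; }
}

public static class ProjectionDemo
{
    public static List<string> EmailsForLastNamesStartingWithG()
    {
        // The email addresses are invented for this illustration.
        List<Customer> customers = new List<Customer>
        {
            new Customer { LastName = "Gee",    EmailAddress = "orlando@example.com" },
            new Customer { LastName = "Harris", EmailAddress = "keith@example.com" },
            new Customer { LastName = "Gates",  EmailAddress = "janet@example.com" }
        };

        // The select clause projects a string, so the query is typed
        // as IEnumerable<string>, not IEnumerable<Customer>.
        IEnumerable<string> emails =
            from customer in customers
            where customer.LastName.StartsWith("G")
            select customer.EmailAddress;

        return emails.ToList();
    }

    public static void Main()
    {
        foreach (string email in EmailsForLastNamesStartingWithG())
            Console.WriteLine(email);
        // prints orlando@example.com, then janet@example.com
    }
}
```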

Deferred Query Evaluation

LINQ implements deferred query evaluation, meaning that the declaration and initialization of a query expression do not actually execute the query. Instead, a LINQ query is executed, or evaluated, when you iterate through the query result:

foreach (Customer customer in result)
    Console.WriteLine(customer.ToString(  ));

Because the query returns a collection of Customer objects, the iteration variable is an instance of the Customer class. You can use it as you would any Customer object. This example simply calls each Customer object’s ToString( ) method to output its property values to the console.

Each time you iterate through this foreach loop, the query will be reevaluated. If the data source has changed between executions, the result will be different. This is demonstrated in the next code section:

customers[3].FirstName = "Donna";

Here, you modify the first name of the customer “Janet Gates” to “Donna” and then iterate through the result again:

Console.WriteLine("FirstName == \"Donna\" (take two)");
foreach (Customer customer in result)
    Console.WriteLine(customer.ToString(  ));

As shown in the sample output, you can see that the result now includes Donna Gates as well.
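One way to watch deferred evaluation happen is to count how often the filter actually runs. The following sketch is an illustration only; the DeferredDemo class, the IsDonna helper, and the PredicateCalls counter are all hypothetical names introduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many times the filter has been invoked.
    public static int PredicateCalls;

    private static bool IsDonna(string name)
    {
        PredicateCalls++;
        return name == "Donna";
    }

    // Returns { calls after defining the query, matches on first run,
    //           calls after first run, matches on second run, total calls }.
    public static int[] Run()
    {
        List<string> names = new List<string> { "Orlando", "Keith", "Donna" };
        PredicateCalls = 0;

        // Defining the query does not invoke the filter.
        IEnumerable<string> result =
            from name in names
            where IsDonna(name)
            select name;
        int callsAfterDefinition = PredicateCalls;     // 0

        // Enumerating the result runs the filter once per element.
        int firstMatches = result.Count();             // 1
        int callsAfterFirstRun = PredicateCalls;       // 3

        // Changing the source and enumerating again reruns the query.
        names.Add("Donna");
        int secondMatches = result.Count();            // 2

        return new[] { callsAfterDefinition, firstMatches,
                       callsAfterFirstRun, secondMatches, PredicateCalls };
    }

    public static void Main()
    {
        foreach (int value in Run())
            Console.WriteLine(value);   // prints 0, 1, 3, 2, 7
    }
}
```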

In most situations, deferred query evaluation is desired because you want to obtain the most recent data in the data source each time you run the query. However, if you want to cache the result so that it can be processed later without having to reexecute the query, you can call either the ToList( ) or the ToArray( ) method to save a copy of the result. Example 13-2 demonstrates this technique.

Example 13-2. A simple LINQ query with cached results
using System;
using System.Collections.Generic;
using System.Linq;
namespace Programming_CSharp
{
    // Simple customer class
    public class Customer
    {
        // Same as in Example 13-1
    }

    // Main program
    public class Tester
    {
        static void Main(  )
        {
            List<Customer> customers = CreateCustomerList(  );

            // Find customer by first name
            IEnumerable<Customer> result =
                from customer in customers
                where customer.FirstName == "Donna"
                select customer;
            List<Customer> cachedResult = result.ToList<Customer>(  );

            Console.WriteLine("FirstName == \"Donna\"");
            foreach (Customer customer in cachedResult)
                Console.WriteLine(customer.ToString(  ));

            customers[3].FirstName = "Donna";
            Console.WriteLine("FirstName == \"Donna\" (take two)");
            foreach (Customer customer in cachedResult)
                Console.WriteLine(customer.ToString(  ));
        }

        // Create a customer list with sample data
        private static List<Customer> CreateCustomerList(  )
        {
            // Same as in Example 13-1
        }
    }
}

Output:
FirstName == "Donna"
Donna Carreras
Email:   [email protected]
FirstName == "Donna" (take two)
Donna Carreras
Email:   [email protected]

In this example, you call the ToList<T> method of the result collection to cache the result. Note that calling this method causes the query to be evaluated immediately. If the data source is changed after this, the change will not be reflected in the cached result. You can see from the output that there is no Donna Gates in the result.

One interesting point here is that the ToList<T> and ToArray<T> methods are not actually methods of IEnumerable; that is, if you look in the documentation for IEnumerable, you will not see them in the methods list. They are actually extension methods provided by LINQ. We will look at extension methods in more detail later in this chapter.

If you are familiar with SQL, you will notice a striking similarity between LINQ and SQL, at least in their syntax. The one oddity at this stage is that the select clause in LINQ appears at the end of a query expression, instead of at the beginning, as in SQL. Because the generator, or from clause, defines the range variable, it must be stated first; the projection is therefore pushed to the end.

LINQ and C#

LINQ provides many of the common SQL operations, such as join queries, grouping, aggregation, and sorting of results. In addition, it allows you to use the object-oriented features of C# in query expressions and processing, such as hierarchical query results.

Joining

You will often want to search for objects from more than one data source. LINQ provides the join clause that offers the ability to join many data sources, not all of which need be databases. Suppose you have a list of customers containing customer names and email addresses, and a list of customer home addresses. You can use LINQ to combine both lists to produce a list of customers, with access to both their email and home addresses:

    from customer in customers
         join address in addresses on
              customer.Name equals address.Name
    ...

The join condition is specified in the on subclause, similar to SQL, except that the objects joined need not be tables or views in a database. The join clause syntax is:

 [data source 1] join [data source 2] on [join condition]

Here, we are joining two data sources, customers and addresses, based on the customer name properties in each object. In fact, you can join more than two data sources using a combination of join clauses:

from customer in customers
    join address in addresses on
        customer.Name equals address.Name
    join invoice in invoices  on
        customer.Id   equals invoice.CustomerId
    join invoiceItem in invoiceItems on
        invoice.Id    equals invoiceItem.InvoiceId

A LINQ join clause returns a result only when objects satisfying the join condition exist in all data sources. For instance, if a customer has no invoice, the query will not return anything for that customer, not even her name and email address. This is the equivalent of a SQL inner join clause.

Tip

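The join clause by itself always performs an inner join. LINQ has no dedicated outer join keyword, but you can get the effect of a left outer join (which returns every object from the first data source, whether or not the second data source contains a match) by combining join ... into with the DefaultIfEmpty( ) method.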
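The difference between an inner join and a left outer join is easiest to see side by side. The sketch below is an illustration under stated assumptions: the NamedAddress class, the sample data, and the "(no address)" placeholder are invented here. The second query uses join ... into together with the standard DefaultIfEmpty( ) operator, a common way to emulate a left outer join even though the query syntax has no dedicated keyword for it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical address type used only for this illustration.
public class NamedAddress
{
    public string Name { get; set; }
    public string Street { get; set; }
}

public static class JoinDemo
{
    private static readonly string[] Names = { "Donna Carreras", "Janet Gates" };

    private static readonly List<NamedAddress> Addresses = new List<NamedAddress>
    {
        new NamedAddress { Name = "Janet Gates", Street = "165 North Main" }
    };

    // Inner join: Donna has no address, so she is dropped from the result.
    public static List<string> InnerJoin()
    {
        return (from name in Names
                join address in Addresses on name equals address.Name
                select name + ": " + address.Street).ToList();
    }

    // Left outer join: "into matches" collects the addresses for each name,
    // and DefaultIfEmpty( ) supplies a single null when there are none.
    public static List<string> LeftOuterJoin()
    {
        return (from name in Names
                join address in Addresses on name equals address.Name into matches
                from match in matches.DefaultIfEmpty()
                select name + ": " +
                       (match == null ? "(no address)" : match.Street)).ToList();
    }

    public static void Main()
    {
        foreach (string line in InnerJoin())
            Console.WriteLine(line);    // Janet Gates: 165 North Main
        foreach (string line in LeftOuterJoin())
            Console.WriteLine(line);    // Donna Carreras: (no address)
                                        // Janet Gates: 165 North Main
    }
}
```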

Ordering and the var Keyword

You can also specify the sort order in LINQ queries with the orderby clause:

from customer in customers
    orderby customer.LastName
    select customer;

This sorts the result by customer last name in ascending order. Example 13-3 shows how you can sort the results of a join query.

Example 13-3. A sorted join query
using System;
using System.Collections.Generic;
using System.Linq;

namespace Programming_CSharp
{
    // Simple customer class
    public class Customer
    {
        // Same as in Example 13-1
    }

    // Customer address class
    public class Address
    {
        public string Name   { get; set; }
        public string Street { get; set; }
        public string City   { get; set; }

        // Overrides the Object.ToString(  ) to provide a
        // string representation of the object properties.
        public override string ToString(  )
        {
            return string.Format("{0}, {1}", Street, City);
        }
    }

    // Main program
    public class Tester
    {
        static void Main(  )
        {
            List<Customer> customers = CreateCustomerList(  );
            List<Address>  addresses = CreateAddressList(  );

            // Find all addresses of a customer
            var result =
                from customer in customers
                join address in addresses on
                     string.Format("{0} {1}", customer.FirstName,
                         customer.LastName)
                     equals address.Name
                orderby customer.LastName, address.Street descending
                select new { Customer = customer, Address = address };

            foreach (var ca in result)
            {
                Console.WriteLine(string.Format("{0}\nAddress: {1}",
                    ca.Customer, ca.Address));
            }
        }

        // Create a customer list with sample data
        private static List<Customer> CreateCustomerList(  )
        {
            // Same as in Example 13-1
        }

        // Create an address list with sample data
        private static List<Address> CreateAddressList(  )
        {
            List<Address> addresses = new List<Address>
                {
                    new Address { Name   = "Janet Gates",
                                  Street = "165 North Main",
                                    City = "Austin" },
                    new Address { Name   = "Keith Harris",
                                  Street = "3207 S Grady Way",
                                    City = "Renton" },
                    new Address { Name   = "Janet Gates",
                                  Street = "800 Interchange Blvd.",
                                    City = "Austin" },
                    new Address { Name   = "Keith Harris",
                                  Street = "7943 Walnut Ave",
                                    City = "Renton" },
                    new Address { Name   = "Orlando Gee",
                                  Street = "2251 Elliot Avenue",
                                    City = "Seattle" }
                };
            return addresses;
        }
    }
}

Output:
Janet Gates
Email:   [email protected]
Address: 800 Interchange Blvd., Austin
Janet Gates
Email:   [email protected]
Address: 165 North Main, Austin
Orlando Gee
Email:   [email protected]
Address: 2251 Elliot Avenue, Seattle
Keith Harris
Email:   [email protected]
Address: 7943 Walnut Ave, Renton
Keith Harris
Email:   [email protected]
Address: 3207 S Grady Way, Renton

The Customer class is identical to the one used in Example 13-1. The Address class is also very simple, with a customer name field containing customer names in the <first name> <last name> form, and the street and city of customer addresses.

The CreateCustomerList( ) and CreateAddressList( ) methods are just helper functions to create sample data for this example. This example also uses the new C# object and collection initializers, as explained in Chapter 4.

The query definition, however, looks quite different from the last example:

var result =
    from   customer in customers
           join address in addresses on
                string.Format("{0} {1}", customer.FirstName, customer.LastName)
                equals address.Name
    orderby customer.LastName, address.Street descending
    select new { Customer = customer, Address = address };

The first difference is the declaration of the result. Instead of declaring the result as an explicitly typed IEnumerable<Customer> instance, this example declares the result as an implicitly typed variable using the new var keyword. We will leave this for just a moment, and jump to the query definition itself.

The generator now contains a join clause to signify that the query operates on two data sources: customers and addresses. Because the customer name property in the Address class is a concatenation of the customer’s first and last names, you build the full name from each Customer object in the same format:

string.Format("{0} {1}", customer.FirstName, customer.LastName)

The dynamically constructed customer full name is then compared with the customer name property in the Address objects using the equals operator:

string.Format("{0} {1}", customer.FirstName, customer.LastName)
equals address.Name

The orderby clause indicates the order in which the result should be sorted:

    orderby customer.LastName, address.Street descending

In the example, the result will be sorted first by customer last name in ascending order, then by street address in descending order.

The combined customer name, email address, and home address are returned. Here you have a problem—LINQ can return a collection of objects of any type, but it can’t return multiple objects of different types in the same query, unless they are encapsulated in one type. For instance, you can select either an instance of the Customer class or an instance of the Address class, but you cannot select both, like this:

    select customer, address

The solution is to define a new type containing both objects. An obvious way is to define a CustomerAddress class:

    public class CustomerAddress
    {
        public Customer Customer { get; set; }
        public Address Address   { get; set; }
    }

You can then return customers and their addresses from the query in a collection of CustomerAddress objects:

var result =
    from   customer in customers
           join address in addresses on
                string.Format("{0} {1}", customer.FirstName, customer.LastName)
                equals address.Name
    orderby customer.LastName, address.Street descending
    select new CustomerAddress { Customer = customer, Address = address };

Grouping and the group Keyword

Another powerful feature of LINQ, commonly used by SQL programmers but now integrated into the language itself, is grouping, as shown in Example 13-4.

Example 13-4. A group query
using System;
using System.Collections.Generic;
using System.Linq;

namespace Programming_CSharp
{
    // Customer address class
    public class Address
    {
        // Same as in Example 13-3
    }

    // Main program
    public class Tester
    {
        static void Main(  )
        {
            List<Address>  addresses = CreateAddressList(  );

            // Find addresses grouped by customer name
            var result =
                from address in addresses
                group address by address.Name;
            foreach (var group in result)
            {
                Console.WriteLine("{0}", group.Key);
                foreach (var a in group)
                    Console.WriteLine("\t{0}", a);
            }
        }

        // Create an address list with sample data
        private static List<Address> CreateAddressList(  )
        {
        // Same as in Example 13-3
        }
    }
}

Output:
Janet Gates
        165 North Main, Austin
        800 Interchange Blvd., Austin
Keith Harris
        3207 S Grady Way, Renton
        7943 Walnut Ave, Renton
Orlando Gee
        2251 Elliot Avenue, Seattle

Example 13-4 makes use of the group keyword, a query operator that splits a sequence into groups according to a key value; in this case, the key is the customer name (address.Name). The result is a collection of groups, and you’ll need to enumerate each group to get the objects belonging to it.
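Groups are often a stepping stone to a summary. As a sketch only, the following variation counts the addresses per customer instead of listing them. It relies on the into keyword to name each group so that the query can continue after the group clause; the Address class here is a cut-down stand-in for the one in Example 13-3, and the GroupCountDemo name is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down stand-in for the Address class in Example 13-3.
public class Address
{
    public string Name { get; set; }
    public string Street { get; set; }
}

public static class GroupCountDemo
{
    public static List<string> CountAddressesPerCustomer()
    {
        List<Address> addresses = new List<Address>
        {
            new Address { Name = "Janet Gates",  Street = "165 North Main" },
            new Address { Name = "Keith Harris", Street = "3207 S Grady Way" },
            new Address { Name = "Janet Gates",  Street = "800 Interchange Blvd." },
            new Address { Name = "Keith Harris", Street = "7943 Walnut Ave" },
            new Address { Name = "Orlando Gee",  Street = "2251 Elliot Avenue" }
        };

        // "into customerGroup" names each group so that the query can
        // sort the groups and project a summary of each one.
        var counts =
            from address in addresses
            group address by address.Name into customerGroup
            orderby customerGroup.Key
            select new { Name = customerGroup.Key, Count = customerGroup.Count() };

        List<string> lines = new List<string>();
        foreach (var entry in counts)
            lines.Add(string.Format("{0}: {1}", entry.Name, entry.Count));
        return lines;
    }

    public static void Main()
    {
        foreach (string line in CountAddressesPerCustomer())
            Console.WriteLine(line);
        // Janet Gates: 2
        // Keith Harris: 2
        // Orlando Gee: 1
    }
}
```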

Anonymous Types

Often, you do not want to create a new class just for storing the result of a query. C# 3.0 provides anonymous types that allow us to declare both an anonymous class and an instance of that class using object initializers. For instance, we can initialize an anonymous customer address object:

new { Customer = customer, Address = address }

This declares an anonymous class with two properties, Customer and Address, and initializes it with an instance of the Customer class and an instance of the Address class. The C# compiler can infer the property types from the types of the assigned values, so here, the Customer property type is the Customer class, and the Address property type is the Address class. Like a normal, named class, an anonymous class can have properties of any type.

Behind the scenes, the C# compiler generates a unique name for the new type. This name cannot be referenced in application code; therefore, it is considered nameless.
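Although you cannot name the generated type, you can still observe some of its behavior. The following sketch (the AnonymousTypeDemo class and its method names are invented for illustration) shows that two anonymous objects declared in the same assembly with the same property names and types, in the same order, share one generated type, and that the compiler supplies value-based Equals( ) and a readable ToString( ).

```csharp
using System;

public static class AnonymousTypeDemo
{
    // Two anonymous objects with the same property names and types,
    // in the same order, are instances of the same generated class.
    public static bool SameShapeSharesType()
    {
        var first  = new { FirstName = "Donna", LastName = "Gates" };
        var second = new { FirstName = "Janet", LastName = "Gates" };
        return first.GetType() == second.GetType();
    }

    // The generated Equals( ) compares property values, not references.
    public static bool EqualValuesAreEqual()
    {
        var first  = new { FirstName = "Donna", LastName = "Gates" };
        var second = new { FirstName = "Donna", LastName = "Gates" };
        return first.Equals(second);
    }

    // The generated ToString( ) lists each property and its value.
    public static string Describe()
    {
        var person = new { FirstName = "Donna", LastName = "Gates" };
        return person.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(SameShapeSharesType());   // True
        Console.WriteLine(EqualValuesAreEqual());   // True
        Console.WriteLine(Describe());   // { FirstName = Donna, LastName = Gates }
    }
}
```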

Implicitly Typed Local Variables

Now, let’s go back to the declaration of query results where you declare the result as type var:

var result = ...

Because the select clause returns an instance of an anonymous type, you cannot define an explicit type IEnumerable<T>. Fortunately, C# 3.0 provides another feature—implicitly typed local variables—that solves this problem.

You can declare an implicitly typed local variable by specifying its type as var:

var id = 1;
var name = "Keith";
var customers = new List<Customer>(  );
var person = new {FirstName = "Donna", LastName = "Gates", Phone="123-456-7890" };

The C# compiler infers the type of an implicitly typed local variable from its initialized value. Therefore, you must initialize such a variable when you declare it. In the preceding code snippet, the type of id will be set as an integer, the type of name as a string, and the type of customers as a strongly typed List<T> of Customer objects. The type of the last variable, person, is an anonymous type containing three properties: FirstName, LastName, and Phone. Although this type has no name in our code, the C# compiler secretly assigns it one and keeps track of its instances. In fact, the Visual Studio IDE IntelliSense is also aware of anonymous types, as shown in Figure 13-1.

Figure 13-1. Visual Studio IntelliSense recognizes anonymous types
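It is worth stressing that var does not make a variable loosely typed; the type is fixed at compile time by the initializer. A brief sketch, using an invented VarDemo class, confirms this by asking each variable for its runtime type:

```csharp
using System;
using System.Collections.Generic;

public static class VarDemo
{
    public static string[] InferredTypeNames()
    {
        var id = 1;                            // inferred as int
        var name = "Keith";                    // inferred as string
        var customers = new List<string>();    // inferred as List<string>

        // The inferred types are fixed at compile time. The next line,
        // if uncommented, would be rejected by the compiler because
        // id is an int:
        // id = "one";

        id = id + 1;            // ordinary int arithmetic
        customers.Add(name);    // ordinary List<string> member call

        return new[]
        {
            id.GetType().Name,
            name.GetType().Name,
            customers.GetType().Name
        };
    }

    public static void Main()
    {
        foreach (string typeName in InferredTypeNames())
            Console.WriteLine(typeName);   // Int32, String, List`1
    }
}
```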

Back in Example 13-3, result is an instance of the constructed IEnumerable<T> that contains query results, where the type of the argument T is the anonymous type that contains two properties: Customer and Address.

Now that the query is defined, the next statement executes it using the foreach loop:

foreach (var ca in result)
{
    Console.WriteLine(string.Format("{0}\nAddress: {1}",
        ca.Customer, ca.Address));
}

As the result is an implicitly typed IEnumerable<T> of the anonymous class {Customer, Address}, the iteration variable is also implicitly typed to the same class. For each object in the result list, this example simply prints its properties.

Extension Methods

If you already know a little SQL, the query expressions introduced in previous sections are quite intuitive and easy to understand because LINQ is similar to SQL. As C# code is ultimately executed by the .NET CLR, the C# compiler has to translate query expressions to the format understandable by .NET. Because the .NET runtime understands method calls that can be executed, the LINQ query expressions written in C# are translated into a series of method calls. Such methods are called extension methods, and they are defined in a slightly different way than normal methods.

Example 13-5 is identical to Example 13-1 except it uses query operator extension methods instead of query expressions. The parts of the code that have not changed are omitted for brevity.

Example 13-5. Using query operator extension methods
using System;
using System.Collections.Generic;
using System.Linq;
namespace Programming_CSharp
{
    // Simple customer class
    public class Customer
    {
        // Same as in Example 13-1
    }

    // Main program
    public class Tester
    {
        static void Main(  )
        {
            List<Customer> customers = CreateCustomerList(  );

            // Find customer by first name
            IEnumerable<Customer> result =
                customers.Where(customer => customer.FirstName == "Donna");
            Console.WriteLine("FirstName == \"Donna\"");
            foreach (Customer customer in result)
                Console.WriteLine(customer.ToString(  ));
        }

        // Create a customer list with sample data
        private static List<Customer> CreateCustomerList(  )
        {
            // Same as in Example 13-1
        }
    }
}

Output:
(Same as the first half of the output of Example 13-1)

Example 13-5 searches for customers whose first name is “Donna” by calling the Where( ) extension method rather than using a query expression with a where clause. Here’s the original query expression from Example 13-1:

IEnumerable<Customer> result =
    from   customer in customers
    where  customer.FirstName == "Donna"
    select customer;

Here is the equivalent call to the Where( ) extension method:

IEnumerable<Customer> result =
    customers.Where(customer => customer.FirstName == "Donna");

You may have noticed that the select clause seems to have vanished in this example. For details on this, please see the sidebar, "Whither the select Clause?" (And try to remember, as Chico Marx reminded us, “There ain’t no such thing as a Sanity Clause.”)
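When the select clause does more than hand back the range variable unchanged, it does show up as an explicit Select( ) call. The sketch below places a query expression next to what I believe is its equivalent chain of Where( ), OrderBy( ), and Select( ) calls; the TranslationDemo class and the sample names are made up for the comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TranslationDemo
{
    private static readonly List<string> Names = new List<string>
    {
        "Orlando Gee", "Keith Harris", "Donna Carreras",
        "Janet Gates", "Lucy Harrington"
    };

    // Query-expression form.
    public static List<string> QueryExpression()
    {
        return (from name in Names
                where name.Contains("G")
                orderby name
                select name.ToUpperInvariant()).ToList();
    }

    // Method-call form: each clause becomes an extension method call
    // that takes a lambda expression.
    public static List<string> MethodCalls()
    {
        return Names
            .Where(name => name.Contains("G"))
            .OrderBy(name => name)
            .Select(name => name.ToUpperInvariant())
            .ToList();
    }

    public static void Main()
    {
        // Both forms produce JANET GATES, then ORLANDO GEE.
        Console.WriteLine(QueryExpression().SequenceEqual(MethodCalls()));   // True
    }
}
```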

Recall that customers is of type List<Customer>, which might lead you to think that List<T> must have implemented the Where method to support LINQ. It does not. The Where method is called an extension method because it extends an existing type. Before we go into more details in this example, let’s take a closer look at extension methods.

Defining and Using Extension Methods

C# 3.0 introduces extension methods that provide the ability for programmers to add methods to existing types. For instance, System.String does not provide a Right( ) function that returns the rightmost n characters of a string. If you use this functionality a lot in your application, you may have considered building and adding it to your library. However, System.String is defined as sealed, so you can’t subclass it. It is not a partial class, so you can’t extend it using that feature.

Of course, you can’t modify the .NET core library directly either. Therefore, you would have to define your own helper method outside of System.String and call it with syntax such as this:

MyHelperClass.GetRight(aString, n)

This is not exactly intuitive. With C# 3.0, however, there is a more elegant solution. You can actually add a method to the System.String class; in other words, you can extend the System.String class without having to modify the class itself. Such a method is called an extension method. Example 13-6 demonstrates how to define and use an extension method.

Example 13-6. Defining and using extension methods
using System;

namespace Programming_CSharp_Extensions
{
    // Container class for extension methods.
    public static class ExtensionMethods
    {
        // Returns a substring containing the rightmost
        // n characters in a specific string.
        public static string Right(this string s, int n)
        {
            if (n < 0 || n > s.Length)
                return s;
            else
                return s.Substring(s.Length - n);
        }
    }

    public class Tester
    {
        public static void Main(  )
        {
            string hello = "Hello";
            Console.WriteLine("hello.Right(-1) = {0}", hello.Right(-1));
            Console.WriteLine("hello.Right(0) = {0}", hello.Right(0));
            Console.WriteLine("hello.Right(3) = {0}", hello.Right(3));
            Console.WriteLine("hello.Right(5) = {0}", hello.Right(5));
            Console.WriteLine("hello.Right(6) = {0}", hello.Right(6));
        }
    }
}

Output:
hello.Right(-1) = Hello
hello.Right(0) =
hello.Right(3) = llo
hello.Right(5) = Hello
hello.Right(6) = Hello

The first parameter of an extension method is always the target type, which is the string class in this example. Therefore, this example effectively defines a Right( ) function for the string class. You want to be able to call this method on any string, just like calling a normal System.String member method:

aString.Right(n)

In C#, an extension method must be defined as a static method in a static class. Therefore, this example defines a static class, ExtensionMethods, and a static method in this class:

public static string Right(this string s, int n)
{
    if (n < 0 || n > s.Length)
        return s;
    else
        return s.Substring(s.Length - n);
}

Compared to a regular method, the only notable difference is that the first parameter of an extension method always consists of the this keyword, followed by the target type, and finally an instance of the target type:

this string s

The subsequent parameters are just normal parameters of the extension method. The method body has no special treatment compared to regular methods either. Here, this function simply returns the desired substring or, if the length argument n is invalid, the original string.

To use an extension method, you must bring it into scope in the client code. If the extension method is defined in another namespace, add a using directive to import that namespace. When you call an extension method with instance syntax, you can’t qualify its name with the namespace or class that defines it, as you can with a normal static method. Otherwise, using an extension method is identical to using any built-in method of the target type. In this example, you simply call it like a regular System.String method:

hello.Right(3)
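
As a minimal sketch of the scoping rule, the listing below puts the extension method in one namespace and the client code in another. The names StringHelpers, StringExtensions, and ClientCode are hypothetical and chosen only for illustration; the Right( ) logic is the same as in Example 13-6.

```csharp
using System;
using StringHelpers;   // hypothetical namespace; brings Right() into scope

namespace StringHelpers
{
    // Container class for the extension method (name is illustrative).
    public static class StringExtensions
    {
        // Same logic as Right() in Example 13-6.
        public static string Right(this string s, int n)
        {
            if (n < 0 || n > s.Length)
                return s;
            return s.Substring(s.Length - n);
        }
    }
}

namespace ClientCode
{
    public class Tester
    {
        public static void Main()
        {
            string hello = "Hello";

            // Instance syntax: compiles only because of the using directive.
            Console.WriteLine(hello.Right(3));

            // An extension method is still an ordinary static method, so a
            // fully qualified static call works without the using directive.
            Console.WriteLine(StringHelpers.StringExtensions.Right(hello, 3));
        }
    }
}
```

Both calls print llo. If you remove the using directive, only the second, static-style call should still compile.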

Extension Method Restrictions

It is worth mentioning, however, that extension methods are somewhat more restrictive than regular member methods—extension methods can only access public members of target types. This prevents the breach of encapsulation of the target types.

Another restriction is that if an extension method conflicts with a member method in the target class, the member method is always used instead of the extension method, as you can see in Example 13-7.

Example 13-7. Conflicting extension methods
using System;

namespace Programming_CSharp_Extensions
{
    // Container class for extension methods.
    public static class ExtensionMethods
    {
        // Returns a substring between the specific
        // start and end index of a string.
        public static string Substring(this string s, int startIndex, int endIndex)
        {
            if (startIndex >= 0 && startIndex <= endIndex && endIndex < s.Length)
                return s.Substring(startIndex, endIndex - startIndex);
            else
                return s;
        }
    }

    public class Tester
    {
        public static void Main(  )
        {
            string hello = "Hello";
            Console.WriteLine("hello.Substring(2, 3) = {0}",
                              hello.Substring(2, 3));
        }
    }
}

Output:
hello.Substring(2, 3) = llo

The Substring( ) extension method in this example has exactly the same signature as the built-in String.Substring(int startIndex, int length) method. As you can see from the output, it is the built-in Substring( ) method that is executed in this example. Now, we’ll go back to Example 13-4, where we used the LINQ extension method, Where, to search a customer list:

IEnumerable<Customer> result =
    customers.Where(customer => customer.FirstName == "Donna");

This method takes a predicate as an input argument.

Tip

In C# and LINQ, a predicate is a delegate that examines certain conditions and returns a Boolean value indicating whether the conditions are met.

The predicate performs a filtering operation on queries. The argument to this method is quite different from a normal method argument. In fact, it’s a lambda expression, which I introduced in Chapter 12.

Lambda Expressions in LINQ

In Chapter 12, I mentioned that you can use lambda expressions to define delegates inline. In the following expression:

customer => customer.FirstName == "Donna"

the left operand, customer, is the input parameter. The right operand is the body of the lambda expression, which checks whether the customer’s FirstName property is equal to “Donna.” Therefore, for a given customer object, you’re checking whether its first name is Donna. This lambda expression is then passed into the Where method to perform this comparison operation on each customer in the customer list.
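
To make the "lambda as delegate" idea concrete, here is a small sketch that assigns the same predicate to a Func<Customer, bool> twice: once as a lambda expression and once as a named method. The Customer class here is a hypothetical, stripped-down stand-in for the one in Example 13-1, and IsDonna is an illustrative name.

```csharp
using System;

// Hypothetical minimal stand-in for the Customer class in Example 13-1.
public class Customer
{
    public string FirstName;
}

public class Tester
{
    // Named method with the same logic as the lambda expression.
    static bool IsDonna(Customer customer)
    {
        return customer.FirstName == "Donna";
    }

    public static void Main()
    {
        // The predicate written inline as a lambda expression...
        Func<Customer, bool> byLambda = customer => customer.FirstName == "Donna";
        // ...and the same predicate as a reference to a named method.
        Func<Customer, bool> byMethod = IsDonna;

        Customer donna = new Customer { FirstName = "Donna" };
        Customer janet = new Customer { FirstName = "Janet" };

        Console.WriteLine(byLambda(donna));  // True
        Console.WriteLine(byMethod(donna));  // True
        Console.WriteLine(byLambda(janet));  // False
    }
}
```

Either delegate could be passed to Where; the lambda form simply saves you from declaring a separate method.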

Queries defined using extension methods are called method-based queries. Although the query and method syntaxes are different, they are semantically identical, and the compiler translates them into the same IL code. You can use either of them based on your preference.

Let’s start with a very simple query, as shown in Example 13-8.

Example 13-8. A simple method-based query
using System;
using System.Linq;

namespace SimpleLambda
{
    class Program
    {
        static void Main(string[] args)
        {
            string[] names = { "Jesse", "Donald", "Douglas" };
            var dNames = names.Where(n => n.StartsWith("D"));
            foreach (string foundName in dNames)
            {
                Console.WriteLine("Found: " + foundName);
            }
        }
    }
}

Output:
Found: Donald
Found: Douglas

The statement names.Where is shorthand for:

System.Linq.Enumerable.Where(names,n=>n.StartsWith("D"));

Because Where is an extension method, you can move its first argument (names) in front of the method name, and because the System.Linq namespace is imported, you can call Where directly on the names object rather than through Enumerable.

Further, the type of dNames is IEnumerable<string>; we are using the new ability of the compiler to infer this by using the keyword var. This does not undermine type safety, however, because the compiler replaces var with the type IEnumerable<string> through that inference.

Thus, you can read this line:

var dNames = names.Where(n => n.StartsWith("D"));

as “fill the IEnumerable collection dNames from the collection names with each member where the member starts with the letter D.”
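
As a rough illustration of both points, the sketch below spells out the type that var would have inferred and writes the same filter in query syntax. It reuses the names array from Example 13-8; the variable names methodSyntax and querySyntax are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Tester
{
    public static void Main()
    {
        string[] names = { "Jesse", "Donald", "Douglas" };

        // Method syntax, with the type that var would have inferred spelled out.
        IEnumerable<string> methodSyntax = names.Where(n => n.StartsWith("D"));

        // The same filter in query syntax.
        IEnumerable<string> querySyntax =
            from n in names
            where n.StartsWith("D")
            select n;

        // Both queries yield the same elements in the same order.
        Console.WriteLine(methodSyntax.SequenceEqual(querySyntax));  // True

        foreach (string foundName in querySyntax)
            Console.WriteLine("Found: " + foundName);
    }
}
```

The loop prints Found: Donald and Found: Douglas, just as Example 13-8 does.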

As the method syntax is closer to how the C# compiler processes queries, it is worth spending a little more time to look at how a more complex query is expressed to gain a better understanding of LINQ. Let’s translate Example 13-3 into a method-based query to see how it would look (see Example 13-9).

Example 13-9. Complex query in method syntax
using System;
using System.Collections.Generic;
using System.Linq;

namespace Programming_CSharp
{
    // Simple customer class
    public class Customer
    {
        // Same as in Example 13-1
    }

    // Customer address class
    public class Address
    {
        // Same as in Example 13-3
    }

    // Main program
    public class Tester
    {
        static void Main(  )
        {
            List<Customer> customers = CreateCustomerList(  );
            List<Address> addresses = CreateAddressList(  );

            var result = customers.Join(addresses,
                 customer => string.Format("{0} {1}", customer.FirstName,
                             customer.LastName),
                 address => address.Name,
                 (customer, address) => new { Customer = customer, Address =
                  address })
                 .OrderBy(ca => ca.Customer.LastName)
                 .ThenByDescending(ca => ca.Address.Street);

            foreach (var ca in result)
            {
                Console.WriteLine(string.Format("{0}\nAddress: {1}",
                    ca.Customer, ca.Address));
            }
        }

        // Create a customer list with sample data
        private static List<Customer> CreateCustomerList(  )
        {
            // Same as in Example 13-3
        }

        // Create an address list with sample data
        private static List<Address> CreateAddressList(  )
        {
            // Same as in Example 13-3
        }
    }
}

Output:
Janet Gates
Email:   [email protected]
Address: 800 Interchange Blvd., Austin
Janet Gates
Email:   [email protected]
Address: 165 North Main, Austin
Orlando Gee
Email:   [email protected]
Address: 2251 Elliot Avenue, Seattle
Keith Harris
Email:   [email protected]
Address: 7943 Walnut Ave, Renton
Keith Harris
Email:   [email protected]
Address: 3207 S Grady Way, Renton

In Example 13-3, the query is written in query syntax:

var result =
    from   customer in customers
           join address in addresses on
                string.Format("{0} {1}", customer.FirstName, customer.LastName)
                equals address.Name
    orderby customer.LastName, address.Street descending
    select new { Customer = customer, Address = address.Street };

It is translated into the method syntax:

var result = customers.Join(addresses,
        customer => string.Format("{0} {1}", customer.FirstName,
                    customer.LastName),
        address => address.Name,
        (customer, address) => new { Customer = customer, Address = address })
        .OrderBy(ca => ca.Customer.LastName)
        .ThenByDescending(ca => ca.Address.Street);

The lambda expressions take some getting used to. Start with the OrderBy call; you read it as “Order in this way: for each joined customer/address pair (ca), get the Customer’s LastName.” You read the entire statement as, “Start with customers and join to addresses as follows. For each customer, concatenate the FirstName and LastName; for each address, fetch the Name; and join the two where those values are equal. For each resulting match, create an anonymous object whose Customer property holds the customer and whose Address property holds the address. Now order these first by each customer’s LastName and then, in descending order, by each address’s Street.”

The main data source, the customers collection, is still the main target object. The extension method, Join( ), is applied to it to perform the join operation. Its first argument is the second data source, addresses. The next two arguments are key selectors: lambda expressions that extract the join key from each element of the first and second data sources, respectively. The final argument is the result selector, which builds the result for each matching pair and plays the role of the select clause in the query.
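
The four arguments may be easier to see in a stripped-down sketch. The data below is hypothetical sample data built from anonymous types, not the lists created in Example 13-3, and the short parameter names c and a are chosen only for brevity.

```csharp
using System;
using System.Linq;

public class Tester
{
    public static void Main()
    {
        // Hypothetical sample data, not the lists from Example 13-3.
        var customers = new[]
        {
            new { FirstName = "Janet", LastName = "Gates" },
            new { FirstName = "Orlando", LastName = "Gee" }
        };
        var addresses = new[]
        {
            new { Name = "Janet Gates", Street = "165 North Main" },
            new { Name = "Orlando Gee", Street = "2251 Elliot Avenue" },
            new { Name = "Nobody Home", Street = "1 Unmatched Way" }
        };

        var result = customers.Join(
            addresses,                                 // second data source
            c => c.FirstName + " " + c.LastName,       // key from each customer
            a => a.Name,                               // key from each address
            (c, a) => new { c.LastName, a.Street });   // result for each match

        foreach (var row in result)
            Console.WriteLine("{0}: {1}", row.LastName, row.Street);
    }
}
```

This prints one line for Gates and one for Gee. The "Nobody Home" address has no customer with a matching key, so it does not appear in the result.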

The orderby clause in the query expression indicates that you want to order by the customers’ last name in ascending order, and then by their street address in descending order. In the method syntax, you specify this by calling the OrderBy method followed by the ThenByDescending method.

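Alternatively, you can call the OrderBy methods in sequence without ThenBy, but then the calls must be in reverse order. That is, you must first invoke the method that orders by the last field in the query’s orderby list, and invoke the method that orders by the first field last. This works because the LINQ to Objects OrderBy methods perform a stable sort, so elements with equal last names keep the street ordering established by the earlier call. In this example, you invoke the order-by-street method first, followed by the order-by-last-name method: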

var result = customers.Join(addresses,
        customer => string.Format("{0} {1}", customer.FirstName,
                    customer.LastName),
        address => address.Name,
        (customer, address) => new { Customer = customer, Address = address })
        .OrderByDescending(ca => ca.Address.Street)
        .OrderBy(ca => ca.Customer.LastName);

The results of both versions are identical, so you can choose either based on your own preference.
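
One way to convince yourself that the two orderings agree is to run both over the same data and compare the sequences. The sketch below does this for LINQ to Objects with a few hypothetical rows; the names rows, chained, and reversed are illustrative.

```csharp
using System;
using System.Linq;

public class Tester
{
    public static void Main()
    {
        // Hypothetical rows standing in for the joined customer/address pairs.
        var rows = new[]
        {
            new { LastName = "Harris", Street = "3207 S Grady Way" },
            new { LastName = "Gates",  Street = "165 North Main" },
            new { LastName = "Harris", Street = "7943 Walnut Ave" },
            new { LastName = "Gates",  Street = "800 Interchange Blvd." }
        };

        // Primary key first, then the secondary key.
        var chained = rows.OrderBy(r => r.LastName)
                          .ThenByDescending(r => r.Street);

        // Secondary key first, then the primary key; this relies on
        // OrderBy being a stable sort.
        var reversed = rows.OrderByDescending(r => r.Street)
                           .OrderBy(r => r.LastName);

        Console.WriteLine(chained.SequenceEqual(reversed));  // True

        foreach (var r in chained)
            Console.WriteLine("{0}: {1}", r.LastName, r.Street);
    }
}
```

The OrderBy/ThenByDescending form states the intent more directly and sorts only once, so it is usually the easier one to read.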

Tip

Ian Griffiths, one of the smarter C# programmers on Earth, who blogs at IanG on Tap (http://www.interact-sw.co.uk/iangblog/), makes the following point, which I will illustrate in Chapter 15, but which I did not want to leave hanging here: “You can use exactly these same two syntaxes on a variety of different sources, but the behavior isn’t always the same. The meaning of a lambda expression varies according to the signature of the function it is passed to. In these examples, it’s a succinct syntax for a delegate. But if you were to use exactly the same form of queries against a SQL data source, the lambda expression is turned into something else.”

All the LINQ extension methods—Join, Select, Where, and so on—have multiple implementations, each with different target types. Here, we’re looking at the ones that operate over IEnumerable. The ones that operate over IQueryable are subtly different. Rather than taking delegates for the join, projection, where, and other clauses, they take expressions. Those are wonderful and magical things that enable the C# source code to be transformed into an equivalent SQL query.
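
The difference between a delegate and an expression can be sketched without a database. In the listing below, the same lambda is assigned once to a Func<string, bool> and once to an Expression<Func<string, bool>>; the variable names are illustrative, and nothing here shows how a particular LINQ provider generates SQL.

```csharp
using System;
using System.Linq.Expressions;

public class Tester
{
    public static void Main()
    {
        // As a delegate: compiled code that you can only invoke.
        Func<string, bool> asDelegate = n => n.StartsWith("D");

        // As an expression tree: a data structure describing the same lambda.
        Expression<Func<string, bool>> asExpression = n => n.StartsWith("D");

        Console.WriteLine(asDelegate("Donald"));              // True

        // The tree can be inspected at runtime.
        Console.WriteLine(asExpression.Body.NodeType);        // Call
        Console.WriteLine(asExpression.Parameters[0].Name);   // n

        // An expression tree can still be compiled into a delegate on demand.
        Func<string, bool> compiled = asExpression.Compile();
        Console.WriteLine(compiled("Jesse"));                 // False
    }
}
```

Because the expression form exposes the structure of the lambda (a method call on parameter n), a query provider can examine it and translate it rather than simply run it.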
