LINQ Essentials

Here is a regular LINQ query expression. It uses the C# extensions discussed in the previous section. This query returns the list of multithreaded processes that have more than five active threads. For each multithreaded process, the process identifier and process name are returned:

var processes =
    Process.GetProcesses()
    .Where(p => p.Threads.Count > 5)
    .Select(p => new { p.Id, Name = p.ProcessName });

Figure 6-1 diagrams the preceding LINQ query expression. New features of C# 3.0 are mapped to various aspects of the query expression. The processes variable is an implicitly typed variable. Its type is defined by the result of the query expression. The Where operator is an extension method. As with any extension method, the first parameter is marked with the this modifier. LINQ operators are reviewed in the section "LINQ Operators," later in this chapter. A lambda expression, as identified by the lambda (=>) operator, is passed to both the Where operator and the Select operator. An object initializer is used to initialize an anonymous type that has an Id and a Name property.

Figure 6-1. A query expression mapped to C# language extensions

Core Elements

The core elements of LINQ query expressions are implicitly typed variables, operators, and sequences. The result of a query expression typically is assigned to an implicitly typed variable. One reason is that query expressions sometimes return anonymous types. For the implicitly typed variable, the actual type is set from the result of the query expression.

Operators are extension methods and the building blocks of query expressions. There are many operators, which are grouped into categories for clarity. Filtering, Generation, Partitioning, and Sorting are some of the categories, and the entire list is presented in the section "LINQ Operators," later in this chapter.

A sequence in a query expression is a series of dependent operators. The result of one operator is the input to the next operator, the result of that operator is the input to the following operator, and so on. Because each operator is an extension method that returns a sequence, the next operator can be called directly on that result. The following code contains a sequence in which the result of the Where operator is passed to the OrderBy operator, whose result is in turn passed to the Select operator. The result of the Select operator is stored in the processes variable.

var processes =
    Process.GetProcesses()
    .Where(p => p.Threads.Count > 5)
    .OrderBy(p => p.ProcessName)
    .Select(p => new { p.Id, Name = p.ProcessName });

The Where, OrderBy, and Select operators are extension methods and members of the Enumerable class. In the following code, the operators are called directly as static methods of that class instead of with extension method syntax. For example, the Where operator is called as Enumerable.Where, with the source collection passed as the first argument. This is semantically equivalent to the previous code, and the results are identical. Written this way, the calls are nested and are evaluated inside-out: the innermost call, Enumerable.Where, comes first.

var processes =
    Enumerable.Select(
        (Enumerable.OrderBy(
            (Enumerable.Where(Process.GetProcesses(),
                (p => p.Threads.Count > 5))),
            (p => p.ProcessName))),
        p => new { p.Id, Name = p.ProcessName });
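To make the relationship between an operator and an extension method concrete, here is a minimal sketch of a filtering operator. The SketchOperators class and its Filter method are hypothetical names invented for this illustration; they are not part of System.Linq, and the real Where operator may be implemented differently. The sketch only shows that the same method can be called with either extension method syntax or static method syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical operator for illustration only; not part of System.Linq.
public static class SketchOperators {
    // The this modifier on the first parameter makes Filter an extension method.
    public static IEnumerable<T> Filter<T>(this IEnumerable<T> source,
                                           Func<T, bool> predicate) {
        foreach (T item in source) {
            if (predicate(item)) {
                yield return item;   // items are produced one at a time, on demand
            }
        }
    }
}

public static class FilterDemo {
    public static List<int> Run() {
        int[] numbers = { 1, 2, 3, 4, 5, 6 };

        // Extension method syntax.
        IEnumerable<int> viaExtension = numbers.Filter(n => n % 2 == 0);

        // Static method syntax: the source becomes the first argument.
        IEnumerable<int> viaStatic = SketchOperators.Filter(numbers, n => n % 2 == 0);

        // Both forms are expected to yield the same elements.
        if (!viaExtension.SequenceEqual(viaStatic)) {
            throw new InvalidOperationException("The two call forms disagree.");
        }
        return viaExtension.ToList();
    }

    public static void Main() {
        foreach (int n in Run()) {
            Console.WriteLine(n);
        }
    }
}
```

The compiler translates the extension method call into the static call, which is why the two forms behave identically.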

To improve efficiency, LINQ supports lazy evaluation, meaning that query expressions do not execute and return results immediately. Instead, the results are produced as they are iterated. If a query expression returned a list of 10,000 people, would you want to retrieve that list all at once? Probably not: it would be a huge drain on performance and might have noticeable effects on your application. Distributing the cost of retrieval over the iteration of the collection is far less noticeable.

Another advantage of this approach is that it allows you to query and retrieve only a portion of the list. This also helps performance. Without lazy evaluation, if your query returned 10,000 people, there would be no performance benefit to enumerating only 10 names, because the cost would already have been paid. LINQ does provide the ability to force an immediate retrieval of the query results. For example, Enumerable.ToList and Enumerable.ToArray force the entire result of a query expression to be retrieved immediately.
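The benefit of retrieving only a portion of a result can be sketched as follows. The PersonIds method is a hypothetical stand-in for a 10,000-person data source, and the Produced counter is added purely to observe how many items the source is asked for; neither comes from the chapter's sample code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyTakeDemo {
    // Counts how many elements the source has been asked to produce.
    public static int Produced;

    // Hypothetical stand-in for a 10,000-person data source.
    private static IEnumerable<int> PersonIds() {
        for (int id = 1; id <= 10000; id++) {
            Produced++;
            yield return id;
        }
    }

    // Returns the counter before enumeration, the counter after taking
    // ten results, and the number of results taken.
    public static int[] Run() {
        Produced = 0;
        IEnumerable<int> query = PersonIds().Where(id => id % 2 == 0);
        int beforeEnumeration = Produced;   // defining the query produces nothing

        // Take(10) stops pulling from the source once ten matches are found.
        List<int> firstTen = query.Take(10).ToList();
        int afterTake = Produced;           // far fewer than 10,000

        return new[] { beforeEnumeration, afterTake, firstTen.Count };
    }

    public static void Main() {
        int[] counts = Run();
        Console.WriteLine("Produced before enumeration: " + counts[0]);
        Console.WriteLine("Produced after Take(10): " + counts[1]);
        Console.WriteLine("Items taken: " + counts[2]);
    }
}
```

Because the source is only asked for items until the tenth match is found, the remaining entries are never generated.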

The following code demonstrates lazy evaluation. Actually, this code depends completely on lazy evaluation. Without it, this code would break. In the example, a series of factorials is being calculated. For example, a seed of 5 would return the factorial of every integral value from 1 to 5 (that is, 1, 2, 6, 24, and 120). However, the seed is specified after the query expression that returns the results. This is the line where the seed is set; note that it appears after the LINQ query:

end = 5;

In this example, the query could not have been performed without the seed. Without the seed, there is nothing to return, especially because the query expression is not enumerating a static list. Look closely at the code. There is no collection defined at compile time. Rather, the collection is generated at run time within the Factorial class inside the GetEnumerator method. The GetEnumerator method uses the seed, which is not set until after the query expression. (The use of the Cast operator is explained in the section "Conversion Operators," later in this chapter.)

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace Donis.CSharpBook {
    public class Starter {

        public static int end = 0;

        private static void Main() {
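            // Factorial yields boxed uint values, so the cast targets uint;
            // unboxing a boxed uint as int throws InvalidCastException.
            var result =
                new Factorial().Cast<uint>()
                .Select(f => f);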

            end = 5;
            foreach (uint item in result) {
                Console.WriteLine(item.ToString());
            }
        }
    }

    class Factorial : IEnumerable {

        private uint count = 1;
        public uint _factorial = 1;

        public IEnumerator GetEnumerator() {
            for (; 0 < Starter.end--; ) {
                _factorial = _factorial * (count++);
                yield return _factorial;
            }
        }
    }
}

As mentioned, the developer can force the immediate retrieval of query results with ToList or ToArray. The following code first queries an array of Person objects using lazy evaluation. The Person class has a Name property whose get accessor traces to the console, so you will know any time the property is accessed. In addition, the remainder of the application is instrumented to document the sequence of events. For example, when the list is queried, is the Name property accessed immediately, or is it accessed in the foreach loop?

using System;
using System.Diagnostics;
using System.Linq;

namespace CSharp {
    public class Starter {
        private static void Main() {
            Person[] names = { new Person{ Name = "Wilson" },
                               new Person{ Name = "Bob" },
                               new Person{ Name = "Lisa" },
                               new Person{ Name = "Sally" } };
            Console.WriteLine("***query expression***");
            var result =
                names
                .Where(n => n.Name.StartsWith("B"))
                .Select(n => n);
            Console.WriteLine("***after query expression***");
            Console.WriteLine("***foreach***");
            foreach (var item in result) {
                Console.WriteLine("{0}", item);
            }
            Console.WriteLine("***after foreach***");
        }
    }

    class Person {
        string _name = "";
        public string Name {
            get {
                Console.WriteLine("Retrieving " + _name);
                return _name;
            }
            set {
                _name = value;
            }
        }
        public override string ToString() {
            return _name;
        }
    }
}

Here is the output from the preceding program:

***query expression***
***after query expression***
***foreach***
Retrieving Wilson
Retrieving Bob
Bob
Retrieving Lisa
Retrieving Sally
***after foreach***

From examining the output of the program, you can see that the Name property is not accessed when the query is defined. This is lazy evaluation. The Name property is accessed later, in the foreach loop, as each individual item is iterated. The following code is modified to request the names immediately with a call to Enumerable.ToList. The pivotal line is the call to result.ToList():

private static void Main() {
    Person[] names = { new Person{ Name = "Wilson" },
                       new Person{ Name = "Bob" },
                       new Person{ Name = "Lisa" },
                       new Person{ Name = "Sally" } };
    Console.WriteLine("***query expression***");
    var result =
        names
        .Where(n => n.Name.StartsWith("B"))
        .Select(n => n);
    Console.WriteLine("***after query expression***");
    // List<T> requires: using System.Collections.Generic;
    List<Person> immediate = result.ToList();
    Console.WriteLine("***foreach***");
    foreach (var item in immediate) {
        Console.WriteLine("{0}", item);
    }
    Console.WriteLine("***after foreach***");
}

Here are the results from the modified program:

***query expression***
***after query expression***
Retrieving Wilson
Retrieving Bob
Retrieving Lisa
Retrieving Sally
***foreach***
Bob
***after foreach***

The results show that the Name property is used after the query expression but before the foreach loop. It is used when the ToList method is invoked.
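One consequence of this difference can be sketched with a small, self-contained example: a lazily evaluated query reflects changes made to its source before enumeration, whereas a list produced by ToList is a snapshot. The SnapshotDemo class and the added name "Betty" are assumptions made for this illustration and do not appear in the sample programs above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotDemo {
    // Returns the match count seen by the deferred query and by the snapshot.
    public static int[] Run() {
        var names = new List<string> { "Wilson", "Bob", "Lisa", "Sally" };

        // Deferred: nothing is evaluated when the query is defined.
        IEnumerable<string> deferred = names.Where(n => n.StartsWith("B"));

        // Immediate: ToList evaluates the query now and stores the result.
        List<string> snapshot = deferred.ToList();

        // Change the source after the query was defined (hypothetical name).
        names.Add("Betty");

        // The deferred query runs again over the current contents of names;
        // the snapshot still holds only what matched when ToList was called.
        return new[] { deferred.Count(), snapshot.Count };
    }

    public static void Main() {
        int[] counts = Run();
        Console.WriteLine("Deferred query matches: " + counts[0]);   // 2
        Console.WriteLine("Snapshot matches: " + counts[1]);         // 1
    }
}
```

Whether the snapshot or the live view is preferable depends on the application; the sketch only illustrates that the two behave differently.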

Conversion Operators

You may have noticed that in the Factorial example presented earlier in the chapter, the Cast operator was called. The reason is that the type being queried was not LINQ-compliant. In LINQ to Objects, a collection must implement the IEnumerable<T> interface to be the source in a query expression. (Generic classes are discussed in Chapter 7.) A conversion is therefore necessary when using non-generic collection types with LINQ. The most common culprit is ArrayList, which was commonly used before .NET 2.0, when generic collections were introduced. The Cast operator converts a non-generic collection, such as ArrayList, into an IEnumerable<T> type. (Generic and non-generic collections were discussed in Chapter 5.)

Here is the Cast operator. It is an extension method that accepts a non-generic collection as input and returns a generic type:

public static IEnumerable<TResult> Cast<TResult>(this IEnumerable source)

Here is the relevant code from the Factorial example. The Cast operator converts the non-generic Factorial collection to an IEnumerable<T> type:

// The target type matches the uint values that Factorial yields.
var result =
    new Factorial().Cast<uint>()
    .Select(f => f);

The OfType operator, another extension method, is an alternative to the Cast operator. It filters a non-generic collection into a generic collection of a specific type. Elements that are not of that type are skipped and are not included in the result. Here is the syntax of the OfType operator. The operator accepts a non-generic collection and returns a generic collection, and the TResult generic parameter is the target type:

public static IEnumerable<TResult> OfType<TResult>(this IEnumerable source)

Here is the modified query using the OfType operator:

var result =
    new Factorial().OfType<uint>()
    .Select(f => f);
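The difference between the two conversion operators shows up most clearly on a collection with mixed element types. The following is a minimal sketch that uses an ArrayList with made-up contents (the ConversionDemo class and its values are assumptions for illustration): OfType skips the element that does not match, while Cast throws an InvalidCastException when it reaches it.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class ConversionDemo {
    // A non-generic collection with mixed element types (illustrative data).
    private static ArrayList Mixed() {
        return new ArrayList { 1u, "two", 3u };
    }

    // OfType keeps only the elements that are uint values.
    public static List<uint> WithOfType() {
        return Mixed().OfType<uint>().ToList();
    }

    // Cast fails when it reaches the string element during enumeration.
    public static bool CastThrows() {
        try {
            Mixed().Cast<uint>().ToList();
            return false;
        } catch (InvalidCastException) {
            return true;
        }
    }

    public static void Main() {
        foreach (uint value in WithOfType()) {
            Console.WriteLine(value);
        }
        Console.WriteLine("Cast threw: " + CastThrows());
    }
}
```

Cast is the better fit when every element is known to be of the target type, because a mismatch then surfaces as an error instead of being silently dropped.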

LINQ Query Expression Syntax

In this chapter, the sample code has used the extension method syntax for LINQ query expressions. In the following code, the Enumerable.Where and Enumerable.Select extension methods are called:

var result =
    names
    .Where(n => n.Name.StartsWith("B"))
    .Select(n => n);

You have seen how to call the extension methods directly. However, you can also use the query syntax intrinsic to C#. In C# 3.0, query expressions have been promoted to the language level, which provides a single interface for writing query expressions that transcends the domain of the data source. You don’t have to learn a separate query syntax for SQL, ADO.NET, XML, and others. In addition, the language syntax for query expressions is more implicit than the extension method syntax. For example, you do not have to write lambda expressions explicitly into the query expression. The shortcoming of this approach is that the C# query syntax does not support every LINQ operator. For example, the SequenceEqual, Distinct, and Range operators have no corresponding query keywords. In those circumstances, you have to use the extension method syntax. The two syntaxes cannot be interleaved clause by clause, but a query expression enclosed in parentheses can be followed by extension method calls when an unsupported operator is needed. Here is the previous query using the C# language syntax:

var result =
    from item in names
    where item.Name.StartsWith("B")
    select item;

The from clause identifies the source collection (names) and the name (item) for each element of the collection. The where and select clauses are semantically equivalent to the Where and Select operators. This approach is less wordy and more transparent as to intent. Therefore, for the remainder of this book, the C# language query syntax is used whenever possible.
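For operators that have no query keyword, the extension method syntax remains available. The following sketch exercises the three operators named earlier (Distinct, SequenceEqual, and Range) on made-up integer data; the MethodOnlyOperators class is a hypothetical name used only for this illustration.

```csharp
using System;
using System.Linq;

public static class MethodOnlyOperators {
    // Distinct removes duplicate values from the sequence.
    public static int[] DistinctValues() {
        int[] values = { 3, 1, 3, 2, 1 };
        return values.Distinct().ToArray();
    }

    // Range generates 1, 2, 3; SequenceEqual compares element by element.
    public static bool SameSequence() {
        return Enumerable.Range(1, 3).SequenceEqual(new[] { 1, 2, 3 });
    }

    public static void Main() {
        foreach (int value in DistinctValues()) {
            Console.WriteLine(value);
        }
        Console.WriteLine("Sequences equal: " + SameSequence());
    }
}
```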

Where Is LINQ?

The core library where LINQ is implemented is System.Core.dll. This library is an addition to the .NET Framework 3.5 and is referenced automatically for any .NET Framework 3.5 application. System.Core.dll contains most of the infrastructure of LINQ but also includes other classes, such as the HashSet<T> generic type. My favorite additions are the types for supporting anonymous and named pipes: the AnonymousPipeServerStream and NamedPipeServerStream types, respectively. Pipes are relatively common in Win32 applications, and it is nice to see them seamlessly integrated into the .NET Framework. Other dynamic-link libraries (DLLs) were added in support of different providers. System.Xml.Linq.dll supports LINQ to XML, while System.Data.Linq.dll supports LINQ to SQL.

Namespaces

Along with the new DLLs for LINQ, there are also new namespaces. Table 6-1 lists some of the namespaces related to LINQ.

Table 6-1. LINQ namespaces

| Namespace | Description |
| --- | --- |
| System.Linq | The core namespace of LINQ. It contains the essential ingredients, such as the extension methods for the query operators. The query operators are found in the Enumerable class. The System.Linq namespace supports LINQ to Objects types. This namespace is found in the System.Core.dll library. |
| System.Linq.Expressions | Provides support for expression trees. The primary types in this namespace are BinaryExpression, ConditionalExpression, ConstantExpression, LambdaExpression, and UnaryExpression. The types represent binary, conditional, constant, lambda, and unary expressions, respectively. This namespace is found in the System.Core.dll library. |
| System.Xml.Linq | The core namespace for LINQ to XML. The primary types in this namespace are XAttribute, XDocument, and XElement. These types represent an XML attribute, a document, and an element, respectively. This namespace is found in the System.Xml.Linq.dll library. |
| System.Data.Linq | The core namespace for LINQ to SQL. The primary classes in this namespace are the ChangeSet and Table<TEntity> classes. The ChangeSet type is a collection of changes to the data. The generic Table class represents a database table. This namespace is found in the System.Data.Linq.dll library. |
| System.Data.Linq.Mapping | This namespace is for mapping a relational database to LINQ to SQL objects. The primary classes in this namespace are ColumnAttribute, DatabaseAttribute, FunctionAttribute, and TableAttribute. ColumnAttribute maps a database column to a class member. DatabaseAttribute defines attributes for a class that maps to a database. FunctionAttribute maps a stored procedure to a class method. Finally, TableAttribute maps a database table onto a class. This namespace is found in the System.Data.Linq.dll library. |
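As a brief illustration of the System.Linq.Expressions entry in Table 6-1, the following sketch assigns a lambda expression to an expression tree and inspects its parts. The ExpressionDemo class and the particular lambda (n > 5) are assumptions chosen for this example; the types used are the ones named in the table.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionDemo {
    // Summarizes the node types of a lambda whose body is a binary expression.
    public static string Describe(Expression<Func<int, bool>> lambda) {
        BinaryExpression body = (BinaryExpression)lambda.Body;
        return body.NodeType + " " + body.Left.NodeType + " " + body.Right.NodeType;
    }

    public static void Main() {
        // Assigning a lambda to Expression<T> builds a tree instead of a delegate.
        Expression<Func<int, bool>> isLarge = n => n > 5;

        Console.WriteLine(Describe(isLarge));   // GreaterThan Parameter Constant

        // The right operand is a ConstantExpression that holds the value 5.
        ConstantExpression right = (ConstantExpression)((BinaryExpression)isLarge.Body).Right;
        Console.WriteLine(right.Value);

        // Compile turns the tree back into an executable delegate.
        Func<int, bool> compiled = isLarge.Compile();
        Console.WriteLine(compiled(7));
    }
}
```

Providers such as LINQ to SQL rely on this representation to examine a query rather than simply run it.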
