Chapter 9. Introduction to LINQ and the List Collection

 

To write it, it took three months; to conceive it three minutes; to collect the data in it—all my life.

 
 --F. Scott Fitzgerald
 

Science is feasible when the variables are few and can be enumerated...

 
 --Paul Valéry
 

You shall listen to all sides and filter them from your self.

 
 --Walt Whitman
 

The portraitist can select one tiny aspect of everything shown at a moment to incorporate into the final painting.

 
 --Robert Nozick
 

List, list, O, list!

 
 --William Shakespeare
<feature> <supertitle>Objectives</supertitle>

In this chapter you’ll learn:

<objective>

Basic LINQ concepts.

</objective>
<objective>

How to query an array using LINQ.

</objective>
<objective>

Basic .NET collections concepts.

</objective>
<objective>

How to create and use a generic List collection.

</objective>
<objective>

How to query a generic List collection using LINQ.

</objective>
</feature>

Introduction

The preceding chapter introduced arrays, simple data structures used to store data items of a specific type. Although commonly used, arrays have limited capabilities. For instance, you must specify an array’s size when you create it, and if you wish to change that size at execution time, you must do so manually by creating a new array or by using the Array class’s Resize method, which creates a new array and copies the existing elements into it for you.

Here, we introduce a set of prepackaged data structures—the .NET Framework’s collection classes—that offer greater capabilities than traditional arrays. They’re reusable, reliable, powerful and efficient and have been carefully designed and tested to ensure quality and performance. This chapter focuses on the List collection. Lists are similar to arrays but provide additional functionality, such as dynamic resizing—they automatically increase their size at execution time to accommodate additional elements. We use the List collection to implement several examples similar to those used in the preceding chapter.

Large amounts of data are often stored in a database—an organized collection of data. (We discuss databases in detail in Chapter 18.) A database management system (DBMS) provides mechanisms for storing, organizing, retrieving and modifying data in the database. A language called SQL—pronounced “sequel”—is the international standard used to perform queries (i.e., to request information that satisfies given criteria) and to manipulate data. For years, programs accessing a relational database passed SQL queries to the database management system, then processed the results. This chapter introduces C#’s new LINQ (Language Integrated Query) capabilities. LINQ allows you to write query expressions, similar to SQL queries, that retrieve information from a wide variety of data sources, not just databases. We use LINQ to Objects in this chapter to query arrays and Lists, selecting elements that satisfy a set of conditions—this is known as filtering. Figure 9.1 shows where and how we use LINQ throughout the book to retrieve information from many data sources.

Table 9.1. LINQ usage throughout the book.

| Chapter | Used to |
|---|---|
| Chapter 9, Introduction to LINQ and the List Collection | Query arrays and Lists. |
| Chapter 16, Strings and Characters | Select GUI controls in a Windows Forms application. |
| Chapter 17, Files and Streams | Search a directory and manipulate text files. |
| Chapter 18, Databases and LINQ | Retrieve information from a database. |
| Chapter 19, Web App Development with ASP.NET | Retrieve information from a database to be used in a web-based application. |
| Chapter 26, XML and LINQ to XML | Query an XML document. |
| Chapter 28, Windows Communication Foundation (WCF) Web Services | Query and update a database. Process XML returned by WCF services. |
| Chapter 29, Silverlight and Rich Internet Applications | Process XML returned by web services to a Silverlight application. |

LINQ Providers

The syntax of LINQ is built into C#, but LINQ queries may be used in many different contexts because of libraries known as providers. A LINQ provider is a set of classes that implement LINQ operations and enable programs to interact with data sources to perform tasks such as sorting, grouping and filtering elements.

In this book, we discuss LINQ to SQL and LINQ to XML, which allow you to query databases and XML documents using LINQ. These providers, along with LINQ to Objects, mentioned above, are included with Visual Studio and the .NET Framework. There are many providers that are more specialized, allowing you to interact with a specific website or data format. An extensive list of available providers is located at:

blogs.msdn.com/charlie/archive/2006/10/05/Links-to-LINQ.aspx

Querying an Array of int Values Using LINQ

Figure 9.2 demonstrates querying an array of integers using LINQ. Repetition statements that filter arrays focus on the process of getting the results—iterating through the elements and checking whether they satisfy the desired criteria. LINQ specifies the conditions that selected elements must satisfy. This is known as declarative programming—as opposed to imperative programming (which we’ve been doing so far) in which you specify the actual steps to perform a task. The query in lines 20–22 specifies that the results should consist of all the ints in the values array that are greater than 4. It does not specify how those results are obtained—the C# compiler generates all the necessary code automatically, which is one of the great strengths of LINQ. To use LINQ to Objects, you must import the System.Linq namespace (line 4).

Example 9.2. LINQ to Objects using an int array.

 1   // Fig. 9.2: LINQWithSimpleTypeArray.cs
 2   // LINQ to Objects using an int array.
 3   using System;
 4   using System.Linq;
 5
 6   class LINQWithSimpleTypeArray
 7   {
 8      public static void Main( string[] args )
 9      {
10         // create an integer array
11         int[] values = { 2, 9, 5, 0, 3, 7, 1, 4, 8, 5 };
12
13         // display original values
14         Console.Write( "Original array:" );
15         foreach ( var element in values )
16            Console.Write( " {0}", element );
17
18         // LINQ query that obtains values greater than 4 from the array
19         var filtered =
20            from value in values
21            where value > 4     
22            select value;       
23
24         // display filtered results
25         Console.Write( "\nArray values greater than 4:" );
26         foreach ( var element in filtered )
27            Console.Write( " {0}", element );
28
29         // use orderby clause to sort original array in ascending order
30         var sorted =
31            from value in values
32            orderby value       
33            select value;       
34
35         // display sorted results
36         Console.Write( "\nOriginal array, sorted:" );
37         foreach ( var element in sorted )
38            Console.Write( " {0}", element );
39
40         // sort the filtered results into descending order
41         var sortFilteredResults =
42            from value in filtered  
43            orderby value descending
44            select value;           
45
46         // display the sorted results
47         Console.Write(
48            "\nValues greater than 4, descending order (separately):" );
49         foreach ( var element in sortFilteredResults )
50            Console.Write( " {0}", element );
51
52         // filter original array and sort in descending order
53         var sortAndFilter =
54            from value in values    
55            where value > 4         
56            orderby value descending
57            select value;           
58
59         // display the filtered and sorted results
60         Console.Write(
61            "\nValues greater than 4, descending order (one query):" );
62         foreach ( var element in sortAndFilter )
63            Console.Write( " {0}", element );
64
65         Console.WriteLine();
66      } // end Main
67   } // end class LINQWithSimpleTypeArray

Original array: 2 9 5 0 3 7 1 4 8 5
Array values greater than 4: 9 5 7 8 5
Original array, sorted: 0 1 2 3 4 5 5 7 8 9
Values greater than 4, descending order (separately): 9 8 7 5 5
Values greater than 4, descending order (one query): 9 8 7 5 5

The from Clause and Implicitly Typed Local Variables

A LINQ query begins with a from clause (line 20), which specifies a range variable (value) and the data source to query (values). The range variable represents each item in the data source (one at a time), much like the control variable in a foreach statement. We do not specify the range variable’s type. Since it is assigned one element at a time from the array values, which is an int array, the compiler determines that the range variable value should be of type int. This is a C# feature called implicitly typed local variables, which enables the compiler to infer a local variable’s type based on the context in which it’s used.

Introducing the range variable in the from clause at the beginning of the query allows the IDE to provide IntelliSense while you write the rest of the query. The IDE knows the range variable’s type, so when you enter the range variable’s name followed by a dot (.) in the code editor, the IDE can display the range variable’s methods and properties.

The var Keyword and Implicitly Typed Local Variables

You can also declare a local variable and let the compiler infer the variable’s type based on the variable’s initializer. To do so, the var keyword is used in place of the variable’s type when declaring the variable. Consider the declaration

var x = 7;

Here, the compiler infers that the variable x should be of type int, because the compiler assumes that whole-number values, like 7, are of type int. Similarly, in the declaration

var y = -123.45;

the compiler infers that y should be of type double, because the compiler assumes that floating-point number values, like -123.45, are of type double. Typically, implicitly typed local variables are used for more complex types, such as the collections of data returned by LINQ queries. We use this feature in lines 19, 30, 41 and 53 to enable the compiler to determine the type of each variable that stores the results of a LINQ query. We also use this feature to declare the control variable in the foreach statements at lines 15–16, 26–27, 37–38, 49–50 and 62–63. In each case, the compiler infers that the control variable is of type int because the array values and the LINQ query results all contain int values.
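One way to check what the compiler infers is to ask each variable for its run-time type. The following is a minimal sketch rather than one of the chapter’s figures; the class name VarInference is our own choice, and the three-element array is only sample data.

```csharp
using System;
using System.Linq;

// VarInference is an illustrative name, not one of the chapter's figures
public class VarInference
{
   public static void Main( string[] args )
   {
      var x = 7; // whole-number literal, so x is an int
      var y = -123.45; // floating-point literal, so y is a double

      Console.WriteLine( "x is {0}", x.GetType() ); // x is System.Int32
      Console.WriteLine( "y is {0}", y.GetType() ); // y is System.Double

      int[] values = { 2, 9, 5 }; // sample data

      // the compiler infers the type of the query result for us
      var filtered =
         from value in values
         where value > 4
         select value;

      // the control variable is inferred to be an int
      foreach ( var element in filtered )
         Console.WriteLine( "{0} is {1}", element, element.GetType() );
   } // end Main
} // end class VarInference
```

Note that var does not make a variable loosely typed: once the compiler has inferred int for x, assigning a string to x is a compilation error.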

The where Clause

If the condition in the where clause (line 21) evaluates to true, the element is selected—i.e., it’s included in the results. Here, the ints in the array are included only if they’re greater than 4. An expression that takes an element of a collection and returns true or false by testing a condition on that element is known as a predicate.

The select Clause

For each item in the data source, the select clause (line 22) determines what value appears in the results. In this case, it’s the int that the range variable currently represents. A LINQ query typically ends with a select clause.

Iterating Through the Results of the LINQ Query

Lines 26–27 use a foreach statement to display the query results. As you know, a foreach statement can iterate through the contents of an array, allowing you to process each element in the array. Actually, the foreach statement can iterate through the contents of arrays, collections and the results of LINQ queries. The foreach statement in lines 26–27 iterates over the query result filtered, displaying each of its items.

LINQ vs. Repetition Statements

It would be simple to display the integers greater than 4 using a repetition statement that tests each value before displaying it. However, this would intertwine the code that selects elements and the code that displays them. With LINQ, these are kept separate, making the code easier to understand and maintain.
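To make the contrast concrete, the sketch below performs the same task both ways on the array from Fig. 9.2. It is an illustration under our own naming (the class LoopVersusQuery is hypothetical), not code from the chapter’s figures.

```csharp
using System;
using System.Linq;

// LoopVersusQuery is an illustrative name, not one of the chapter's figures
public class LoopVersusQuery
{
   public static void Main( string[] args )
   {
      int[] values = { 2, 9, 5, 0, 3, 7, 1, 4, 8, 5 };

      // repetition statement: the test and the display share one loop
      Console.Write( "Loop:" );
      foreach ( var element in values )
      {
         if ( element > 4 )
            Console.Write( " {0}", element );
      } // end foreach

      // LINQ: the query only states which elements to select
      var filtered =
         from value in values
         where value > 4
         select value;

      // the display code knows nothing about the selection condition
      Console.Write( "\nQuery:" );
      foreach ( var element in filtered )
         Console.Write( " {0}", element );

      Console.WriteLine();
   } // end Main
} // end class LoopVersusQuery
```

Both lines of output list 9 5 7 8 5. In the second version, changing the condition touches only the query, and changing the display touches only the foreach statement.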

The orderby Clause

The orderby clause (line 32) sorts the query results in ascending order. Lines 43 and 56 use the descending modifier in the orderby clause to sort the results in descending order. An ascending modifier also exists but isn’t normally used, because it’s the default. Any value that can be compared with other values of the same type may be used with the orderby clause. A value of a simple type (e.g., int) can always be compared to another value of the same type; we’ll say more about comparing values of reference types in Chapter 12.

The queries in lines 42–44 and 54–57 generate the same results, but in different ways. The first query uses LINQ to sort the results of the query from lines 20–22. The second query uses both the where and orderby clauses. Because queries can operate on the results of other queries, it’s possible to build a query one step at a time, and pass the results of queries between methods for further processing.
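As a rough sketch of passing query results between methods, the following example splits the filtering and the sorting into two methods and feeds the result of one into the other. The class and method names (ComposedQueries, GreaterThanFour, Descending) are hypothetical, and the IEnumerable< int > type used for the parameters and return values is the interface described in the aside later in this section.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// all names in this class are illustrative, not part of the chapter
public class ComposedQueries
{
   // returns a query over the elements of source that are greater than 4
   public static IEnumerable< int > GreaterThanFour(
      IEnumerable< int > source )
   {
      return
         from value in source
         where value > 4
         select value;
   } // end method GreaterThanFour

   // returns a query that sorts a sequence of ints in descending order
   public static IEnumerable< int > Descending( IEnumerable< int > source )
   {
      return
         from value in source
         orderby value descending
         select value;
   } // end method Descending

   public static void Main( string[] args )
   {
      int[] values = { 2, 9, 5, 0, 3, 7, 1, 4, 8, 5 };

      // the result of the first query is the data source of the second
      var result = Descending( GreaterThanFour( values ) );

      foreach ( var element in result )
         Console.Write( " {0}", element ); // displays 9 8 7 5 5

      Console.WriteLine();
   } // end Main
} // end class ComposedQueries
```

Because each method accepts and returns the same kind of object, the two steps can be combined in either order or reused separately.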

More on Implicitly Typed Local Variables

Implicitly typed local variables can also be used to initialize arrays without explicitly giving their type. For example, the following statement creates an array of int values:

var array = new[] { 32, 27, 64, 18, 95, 14, 90, 70, 60, 37 };

Note that there are no square brackets on the left side of the assignment operator, and that new[] is used to specify that the variable is an array.

An Aside: Interface IEnumerable<T>

As we mentioned, the foreach statement can iterate through the contents of arrays, collections and LINQ query results. Actually, foreach iterates over any so-called IEnumerable<T> object, which just happens to be what a LINQ query returns.

IEnumerable<T> is an interface. Interfaces define and standardize the ways in which people and systems can interact with one another. For example, the controls on a radio serve as an interface between radio users and the radio’s internal components. The controls allow users to perform a limited set of operations (e.g., changing the station, adjusting the volume, and choosing between AM and FM), and different radios may implement the controls in different ways (e.g., using push buttons, dials or voice commands). The interface specifies what operations a radio permits users to perform but does not specify how the operations are implemented. Similarly, the interface between a driver and a car with a manual transmission includes the steering wheel, the gear shift, the clutch, the gas pedal and the brake pedal. This same interface is found in nearly all manual-transmission cars, enabling someone who knows how to drive one manual-transmission car to drive another.

Software objects also communicate via interfaces. A C# interface describes a set of methods that can be called on an object—to tell the object, for example, to perform some task or return some piece of information. The IEnumerable<T> interface describes the functionality of any object that can be iterated over and thus offers methods to access each element. A class that implements an interface must define each method in the interface with a signature identical to the one in the interface definition. Implementing an interface is like signing a contract with the compiler that states, “I will declare all the methods specified by the interface.” Chapter 12 covers use of interfaces in more detail, as well as how to define your own interfaces.

Arrays are IEnumerable<T> objects, so a foreach statement can iterate over an array’s elements. Similarly, each LINQ query returns an IEnumerable<T> object. Therefore, you can use a foreach statement to iterate over the results of any LINQ query. The notation <T> indicates that the interface is a generic interface that can be used with any type of data (for example, ints, strings or Employees). You’ll learn more about the <T> notation in Section 9.4. You’ll learn more about interfaces in Section 12.7.
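The practical benefit is that one method can process all of these data sources. The sketch below assumes a hypothetical method Total that accepts an IEnumerable< int > and is called with an array, a List< int > (a collection introduced later in this chapter) and a LINQ query result; none of these names come from the chapter’s figures.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// EnumerableDemo and Total are illustrative names, not part of the chapter
public class EnumerableDemo
{
   // accepts any object that implements IEnumerable< int >
   public static int Total( IEnumerable< int > source )
   {
      int total = 0;

      foreach ( var element in source )
         total += element;

      return total;
   } // end method Total

   public static void Main( string[] args )
   {
      int[] array = { 2, 9, 5 };

      List< int > list = new List< int >();
      list.Add( 1 );
      list.Add( 2 );
      list.Add( 3 );

      var query =
         from value in array
         where value > 4
         select value;

      Console.WriteLine( "Array total: {0}", Total( array ) ); // 16
      Console.WriteLine( "List total: {0}", Total( list ) ); // 6
      Console.WriteLine( "Query total: {0}", Total( query ) ); // 14
   } // end Main
} // end class EnumerableDemo
```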

Querying an Array of Employee Objects Using LINQ

LINQ is not limited to querying arrays of simple types such as ints. It can be used with most data types, including strings and user-defined classes. It cannot be used when a query does not have a defined meaning; for example, you cannot use orderby on objects that are not comparable. Comparable types in .NET are those that implement the IComparable interface, which is discussed in Section 22.4. The built-in simple types, such as int and double, and the string type all implement IComparable. Figure 9.3 presents the Employee class. Figure 9.4 uses LINQ to query an array of Employee objects.

Example 9.3. Employee class.

 1   // Fig. 9.3: Employee.cs
 2   // Employee class with FirstName, LastName and MonthlySalary properties.
 3   public class Employee
 4   {
 5      private decimal monthlySalaryValue; // monthly salary of employee
 6
 7      // auto-implemented property FirstName
 8      public string FirstName { get; set; }
 9
10      // auto-implemented property LastName
11      public string LastName { get; set; }
12
13      // constructor initializes first name, last name and monthly salary
14      public Employee( string first, string last, decimal salary )
15      {
16         FirstName = first;
17         LastName = last;
18         MonthlySalary = salary;
19      } // end constructor
20
21      // property that gets and sets the employee's monthly salary
22      public decimal MonthlySalary
23      {
24         get
25         {
26            return monthlySalaryValue;
27         } // end get
28         set
29         {
30            if ( value >= 0M ) // if salary is nonnegative
31            {
32               monthlySalaryValue = value;
33            } // end if
34         } // end set
35      } // end property MonthlySalary
36
37      // return a string containing the employee's information
38      public override string ToString()
39      {
40         return string.Format( "{0,-10} {1,-10} {2,10:C}",
41            FirstName, LastName, MonthlySalary );
42      } // end method ToString
43   } // end class Employee

Example 9.4. LINQ to Objects using an array of Employee objects.

 1   // Fig. 9.4: LINQWithArrayOfObjects.cs
 2   // LINQ to Objects using an array of Employee objects.
 3   using System;
 4   using System.Linq;
 5
 6   public class LINQWithArrayOfObjects
 7   {
 8      public static void Main( string[] args )
 9      {
10         // initialize array of employees
11         Employee[] employees = {
12            new Employee( "Jason", "Red", 5000M ),
13            new Employee( "Ashley", "Green", 7600M ),
14            new Employee( "Matthew", "Indigo", 3587.5M ),
15            new Employee( "James", "Indigo", 4700.77M ),
16            new Employee( "Luke", "Indigo", 6200M ),
17            new Employee( "Jason", "Blue", 3200M ),
18            new Employee( "Wendy", "Brown", 4236.4M ) }; // end init list
19
20         // display all employees
21         Console.WriteLine( "Original array:" );
22         foreach ( var element in employees )
23            Console.WriteLine( element );
24
25         // filter a range of salaries using && in a LINQ query
26         var between4K6K =
27            from e in employees                                       
28            where e.MonthlySalary >= 4000M && e.MonthlySalary <= 6000M
29            select e;                                                 
30
31         // display employees making between 4000 and 6000 per month
32         Console.WriteLine( string.Format(
33            "\nEmployees earning in the range {0:C}-{1:C} per month:",
34            4000, 6000 ) );
35         foreach ( var element in between4K6K )
36            Console.WriteLine( element );
37
38         // order the employees by last name, then first name with LINQ
39         var nameSorted =
40            from e in employees            
41            orderby e.LastName, e.FirstName
42            select e;                      
43
44         // header
45         Console.WriteLine( "\nFirst employee when sorted by name:" );
46
47         // attempt to display the first result of the above LINQ query
48         if ( nameSorted.Any() )
49            Console.WriteLine( nameSorted.First() );
50         else
51            Console.WriteLine( "not found" );
52
53         // use LINQ to select employee last names
54         var lastNames =
55            from e in employees
56            select e.LastName; 
57
58         // use method Distinct to select unique last names
59         Console.WriteLine( "\nUnique employee last names:" );
60         foreach ( var element in lastNames.Distinct() )
61            Console.WriteLine( element );
62
63         // use LINQ to select first and last names
64         var names =
65            from e in employees                           
66            select new { e.FirstName, Last = e.LastName };
67
68         // display full names
69         Console.WriteLine( "\nNames only:" );
70         foreach ( var element in names )
71            Console.WriteLine( element );
72
73         Console.WriteLine();
74      } // end Main
75   } // end class LINQWithArrayOfObjects

Original array:
Jason      Red         $5,000.00
Ashley     Green       $7,600.00
Matthew    Indigo      $3,587.50
James      Indigo      $4,700.77
Luke       Indigo      $6,200.00
Jason      Blue        $3,200.00
Wendy      Brown       $4,236.40

Employees earning in the range $4,000.00-$6,000.00 per month:
Jason      Red         $5,000.00
James      Indigo      $4,700.77
Wendy      Brown       $4,236.40

First employee when sorted by name:
Jason      Blue        $3,200.00

Unique employee last names:
Red
Green
Indigo
Blue
Brown

Names only:
{ FirstName = Jason, Last = Red }
{ FirstName = Ashley, Last = Green }
{ FirstName = Matthew, Last = Indigo }
{ FirstName = James, Last = Indigo }
{ FirstName = Luke, Last = Indigo }
{ FirstName = Jason, Last = Blue }
{ FirstName = Wendy, Last = Brown }

Accessing the Properties of a LINQ Query’s Range Variable

Line 28 of Fig. 9.4 shows a where clause that accesses the properties of the range variable. In this example, the compiler infers that the range variable is of type Employee based on its knowledge that employees was defined as an array of Employee objects (lines 11–18). Any bool expression can be used in a where clause. Line 28 uses the conditional AND (&&) operator to combine conditions. Here, only employees that have a salary between $4,000 and $6,000 per month, inclusive, are included in the query result, which is displayed in lines 35–36.

Sorting a LINQ Query’s Results By Multiple Properties

Line 41 uses an orderby clause to sort the results according to multiple properties—specified in a comma-separated list. In this query, the employees are sorted alphabetically by last name. Each group of Employees that have the same last name is then sorted within the group by first name.

Any, First and Count Extension Methods

Line 48 introduces the query result’s Any method, which returns true if there’s at least one element, and false if there are no elements. The query result’s First method (line 49) returns the first element in the result. You should check that the query result is not empty (line 48) before calling First, because First throws an InvalidOperationException when there are no elements.

We’ve not specified the class that defines methods First and Any. Your intuition probably tells you they’re methods declared in the IEnumerable<T> interface, but they aren’t. They’re actually extension methods, but they can be used as if they were methods of IEnumerable<T>.

LINQ defines many more extension methods, such as Count, which returns the number of elements in the results. Rather than using Any, we could have checked that Count was nonzero, but it’s more efficient to determine whether there’s at least one element than to count all the elements. The LINQ query syntax is actually transformed by the compiler into extension method calls, with the results of one method call used in the next. It’s this design that allows queries to be run on the results of previous queries, as it simply involves passing the result of a method call to another method.
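To illustrate that transformation, the sketch below writes the one-query filter-and-sort from Fig. 9.2 in both forms. The class name QueryVersusMethods is hypothetical, and the arguments of the form value => value > 4 are lambda expressions, a notation this chapter has not introduced; read each one here as “given value, produce the expression on the right.”

```csharp
using System;
using System.Linq;

// QueryVersusMethods is an illustrative name, not one of the chapter's figures
public class QueryVersusMethods
{
   public static void Main( string[] args )
   {
      int[] values = { 2, 9, 5, 0, 3, 7, 1, 4, 8, 5 };

      // query syntax, as in Fig. 9.2
      var query =
         from value in values
         where value > 4
         orderby value descending
         select value;

      // the equivalent chain of extension-method calls; the result of
      // Where is passed directly to OrderByDescending
      var methods = values
         .Where( value => value > 4 )
         .OrderByDescending( value => value );

      // SequenceEqual is another LINQ extension method
      Console.WriteLine( "Same results: {0}",
         query.SequenceEqual( methods ) ); // Same results: True

      Console.WriteLine( "Any: {0}, Count: {1}, First: {2}",
         methods.Any(), methods.Count(), methods.First() );
   } // end Main
} // end class QueryVersusMethods
```

The second line of output is Any: True, Count: 5, First: 9, since the selected values in descending order are 9, 8, 7, 5 and 5.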

Selecting a Portion of an Object

Line 56 uses the select clause to select the range variable’s LastName property rather than the range variable itself. This causes the results of the query to consist of only the last names (as strings), instead of complete Employee objects. Lines 60–61 display the unique last names. The Distinct extension method (line 60) removes duplicate elements, causing all elements in the result to be unique.

Creating New Types in the select Clause of a LINQ Query

The last LINQ query in the example (lines 65–66) selects the properties FirstName and LastName. The syntax

new { e.FirstName, Last = e.LastName }

creates a new object of an anonymous type (a type with no name), which the compiler generates for you based on the properties listed in the curly braces ({}). In this case, the anonymous type consists of properties for the first and last names of the selected Employee. The LastName property is assigned to the property Last in the select clause. This shows how you can specify a new name for the selected property. If you don’t specify a new name, the property’s original name is used—this is the case for FirstName in this example. The preceding query is an example of a projection—it performs a transformation on the data. In this case, the transformation creates new objects containing only the FirstName and Last properties. Transformations can also manipulate the data. For example, you could give all employees a 10% raise by multiplying their MonthlySalary properties by 1.1.
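A projection that applies such a raise might look like the sketch below. It uses a simplified stand-in for the Employee class of Fig. 9.3 so that the example is self-contained, and the names RaiseProjection and Raised are our own. One detail worth noting: because MonthlySalary is a decimal, the multiplier must be written as the decimal literal 1.1M; multiplying a decimal by the double literal 1.1 does not compile.

```csharp
using System;
using System.Linq;

// simplified stand-in for the Employee class of Fig. 9.3
public class Employee
{
   public string FirstName { get; set; }
   public string LastName { get; set; }
   public decimal MonthlySalary { get; set; }

   public Employee( string first, string last, decimal salary )
   {
      FirstName = first;
      LastName = last;
      MonthlySalary = salary;
   } // end constructor
} // end class Employee

// RaiseProjection is an illustrative name, not one of the chapter's figures
public class RaiseProjection
{
   public static void Main( string[] args )
   {
      Employee[] employees = {
         new Employee( "Jason", "Red", 5000M ),
         new Employee( "Ashley", "Green", 7600M ) };

      // project each Employee into an anonymous type whose Raised
      // property holds the salary after a 10% raise
      var raised =
         from e in employees
         select new { e.FirstName, e.LastName,
            Raised = e.MonthlySalary * 1.1M };

      foreach ( var element in raised )
         Console.WriteLine( "{0} {1}: {2:C}",
            element.FirstName, element.LastName, element.Raised );
   } // end Main
} // end class RaiseProjection
```

The projection creates new objects; the MonthlySalary values stored in the original Employee objects are not changed.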

When creating a new anonymous type, you can select any number of properties by specifying them in a comma-separated list within the curly braces ({}) that delineate the anonymous type definition. In this example, the compiler automatically creates a new class having properties FirstName and Last, and the values are copied from the Employee objects. These selected properties can then be accessed when iterating over the results. Implicitly typed local variables allow you to use anonymous types because you do not have to explicitly state the type when declaring such variables.

When the compiler creates an anonymous type, it automatically generates a ToString method that returns a string representation of the object. You can see this in the program’s output—it consists of the property names and their values, enclosed in braces. Anonymous types are discussed in more detail in Chapter 18.

Introduction to Collections

The .NET Framework Class Library provides several classes, called collections, used to store groups of related objects. These classes provide efficient methods that organize, store and retrieve your data without requiring knowledge of how the data is being stored. This reduces application-development time.

You’ve used arrays to store sequences of objects. Arrays do not automatically change their size at execution time to accommodate additional elements—you must do so manually by creating a new array or by using the Array class’s Resize method.
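For comparison with what follows, here is a minimal sketch of the manual approach using Array.Resize. The class name ArrayResizeDemo and the sample values are ours. Keeping a second reference to the original array shows that Resize really does create a new array rather than enlarging the existing one.

```csharp
using System;

// ArrayResizeDemo is an illustrative name, not one of the chapter's figures
public class ArrayResizeDemo
{
   public static void Main( string[] args )
   {
      int[] values = { 1, 2, 3 };
      int[] original = values; // second reference to the same array

      // Resize creates a new five-element array, copies the existing
      // elements into it and makes values refer to the new array
      Array.Resize( ref values, 5 );
      values[ 3 ] = 4; // new elements are 0 until assigned

      Console.WriteLine( "Original length: {0}", original.Length ); // 3
      Console.WriteLine( "Resized length: {0}", values.Length ); // 5

      Console.Write( "Resized contents:" );
      foreach ( var element in values )
         Console.Write( " {0}", element ); // displays 1 2 3 4 0

      Console.WriteLine();
   } // end Main
} // end class ArrayResizeDemo
```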

The collection class List<T> (from namespace System.Collections.Generic) provides a convenient solution to this problem. The T is a placeholder—when declaring a new List, replace it with the type of elements that you want the List to hold. This is similar to specifying the type when declaring an array. For example,

List< int > list1;

declares list1 as a List collection that can store only int values, and

List< string > list2;

declares list2 as a List of strings. Classes with this kind of placeholder that can be used with any type are called generic classes. Generic classes and additional generic collection classes are discussed in Chapters 22 and 23, respectively. Figure 23.2 provides a table of collection classes. Figure 9.5 shows some common methods and properties of class List<T>.

Table 9.5. Some methods and properties of class List<T>.

| Method or property | Description |
|---|---|
| Add | Adds an element to the end of the List. |
| Capacity | Property that gets or sets the number of elements a List can store without resizing. |
| Clear | Removes all the elements from the List. |
| Contains | Returns true if the List contains the specified element; otherwise, returns false. |
| Count | Property that returns the number of elements stored in the List. |
| IndexOf | Returns the index of the first occurrence of the specified value in the List. |
| Insert | Inserts an element at the specified index. |
| Remove | Removes the first occurrence of the specified value. |
| RemoveAt | Removes the element at the specified index. |
| RemoveRange | Removes a specified number of elements starting at a specified index. |
| Sort | Sorts the List. |
| TrimExcess | Sets the Capacity of the List to the number of elements the List currently contains (Count). |

Figure 9.6 demonstrates dynamically resizing a List object. The Add and Insert methods add elements to the List (lines 13–14). The Add method appends its argument to the end of the List. The Insert method inserts a new element at the specified position. The first argument is an index—as with arrays, collection indices start at zero. The second argument is the value that’s to be inserted at the specified index. All elements at the specified index and above are shifted up by one position. This is usually slower than adding an element to the end of the List.

Example 9.6. Generic List<T> collection demonstration.

 1   // Fig. 9.6: ListCollection.cs
 2   // Generic List collection demonstration.
 3   using System;
 4   using System.Collections.Generic;
 5
 6   public class ListCollection
 7   {
 8      public static void Main( string[] args )
 9      {
10         // create a new List of strings
11         List< string > items = new List< string >();
12
13         items.Add( "red" ); // append an item to the List          
14         items.Insert( 0, "yellow" ); // insert the value at index 0
15
16         // display the colors in the list
17         Console.Write(
18            "Display list contents with counter-controlled loop:" );
19         for ( int i = 0; i < items.Count; i++ )
20            Console.Write( " {0}", items[ i ] );
21
22         // display colors using foreach
23         Console.Write(
24            "\nDisplay list contents with foreach statement:" );
25         foreach ( var item in items )
26            Console.Write( " {0}", item );
27
28         items.Add( "green" ); // add "green" to the end of the List
29         items.Add( "yellow" ); // add "yellow" to the end of the List
30
31         // display the List
32         Console.Write( "\nList with two new elements:" );
33         foreach ( var item in items )
34            Console.Write( " {0}", item );
35
36         items.Remove( "yellow" ); // remove the first "yellow"
37
38         // display the List
39         Console.Write( "\nRemove first instance of yellow:" );
40         foreach ( var item in items )
41            Console.Write( " {0}", item );
42
43         items.RemoveAt( 1 ); // remove item at index 1
44
45         // display the List
46         Console.Write( "\nRemove second list element (green):" );
47         foreach ( var item in items )
48            Console.Write( " {0}", item );
49
50         // check if a value is in the List
51         Console.WriteLine( "\n\"red\" is {0}in the list",
52            items.Contains( "red" ) ? string.Empty : "not " );
53
54         // display number of elements in the List
55         Console.WriteLine( "Count: {0}", items.Count );
56
57         // display the capacity of the List
58         Console.WriteLine( "Capacity: {0}", items.Capacity );
59      } // end Main
60   } // end class ListCollection

Display list contents with counter-controlled loop: yellow red
Display list contents with foreach statement: yellow red
List with two new elements: yellow red green yellow
Remove first instance of yellow: red green yellow
Remove second list element (green): red yellow
"red" is in the list
Count: 2
Capacity: 4

Lines 19–20 display the items in the List. The Count property returns the number of elements currently in the List. Lists can be indexed like arrays by placing the index in square brackets after the List variable’s name. The indexed List expression can be used to modify the element at the index. Lines 25–26 output the List by using a foreach statement. More elements are then added to the List, and it’s displayed again (lines 28–34).

The Remove method removes the first element with a specific value (line 36). If no such element is in the List, Remove leaves the List unchanged and returns false. A similar method, RemoveAt, removes the element at the specified index (line 43). When an element is removed through either of these methods, all elements at higher indices are shifted down by one position, the opposite of what the Insert method does.
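
The following minimal sketch illustrates the behavior just described. The class name RemoveDemo and the value "purple" are our own choices for illustration and are not part of the chapter's figure. Remove reports through its bool return value whether it found a matching element, and RemoveAt shifts the later elements down by one position.

```csharp
using System;
using System.Collections.Generic;

public static class RemoveDemo
{
   // Removes by value, then by index, and returns the remaining elements.
   public static List< string > Run()
   {
      var items = new List< string > { "yellow", "red", "green", "yellow" };

      bool removed = items.Remove( "yellow" ); // true: first "yellow" removed
      bool missing = items.Remove( "purple" ); // false: List is unchanged
      Console.WriteLine( "{0} {1}", removed, missing );

      items.RemoveAt( 1 ); // removes "green"; the last "yellow" shifts down
      return items;
   } // end method Run

   public static void Main()
   {
      Console.WriteLine( string.Join( " ", Run() ) );
   } // end Main
} // end class RemoveDemo
```

Note that RemoveAt, unlike Remove, throws an ArgumentOutOfRangeException if the index is not valid for the List.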

Line 52 uses the Contains method to check if an item is in the List. The Contains method returns true if the element is found in the List, and false otherwise. The method performs a linear search: it compares its argument to each element of the List in order until the item is found or the end of the List is reached. The time required therefore grows in proportion to the number of elements, so calling Contains repeatedly on a large List is inefficient.

Lines 55 and 58 display the List’s Count and Capacity. Recall that the Count property (line 55) indicates the number of items in the List. The Capacity property (line 58) indicates how many items the List can hold without growing. List is implemented using an array behind the scenes. When the List grows, it must create a larger internal array and copy each element to the new array. This is a time-consuming operation. It would be inefficient for the List to grow each time an element is added. Instead, the List grows only when an element is added and the Count and Capacity properties are equal—there’s no space for the new element.
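
To see this growth policy at work, the sketch below adds integers to an empty List and reports each point at which Capacity changes. The class and method names are hypothetical, and the program prints whatever capacities it observes, because the exact growth sequence is an implementation detail of the List class rather than something this chapter specifies.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
   // Adds 'count' integers and returns how many times Capacity changed.
   public static int CountGrowths( int count )
   {
      var numbers = new List< int >();
      int growths = 0;
      int lastCapacity = numbers.Capacity;

      for ( int i = 0; i < count; i++ )
      {
         numbers.Add( i );

         // Capacity changes only when Add found no free space
         if ( numbers.Capacity != lastCapacity )
         {
            growths++;
            lastCapacity = numbers.Capacity;
            Console.WriteLine( "Count: {0}, Capacity: {1}",
               numbers.Count, numbers.Capacity );
         } // end if
      } // end for

      return growths;
   } // end method CountGrowths

   public static void Main()
   {
      Console.WriteLine( "Growths: {0}", CountGrowths( 100 ) );
   } // end Main
} // end class CapacityDemo
```

The List grows far fewer times than the number of Add calls. If you know the approximate number of elements in advance, you can pass an initial capacity to the constructor, as in new List< int >( 100 ), to avoid the intermediate reallocations.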

Querying a Generic Collection Using LINQ

You can use LINQ to Objects to query Lists just as you query arrays. In Fig. 9.7, the strings in a List are converted to uppercase, and the query selects those that begin with “R”.

Fig. 9.7. LINQ to Objects using a List<string>.

 1   // Fig. 9.7: LINQWithListCollection.cs
 2   // LINQ to Objects using a List< string >.
 3   using System;
 4   using System.Linq;
 5   using System.Collections.Generic;
 6
 7   public class LINQWithListCollection
 8   {
 9      public static void Main( string[] args )
10      {
11         // populate a List of strings
12         List< string > items = new List< string >();
13         items.Add( "aQua" ); // add "aQua" to the end of the List
14         items.Add( "RusT" ); // add "RusT" to the end of the List
15         items.Add( "yElLow" ); // add "yElLow" to the end of the List
16         items.Add( "rEd" ); // add "rEd" to the end of the List
17
18         // convert all strings to uppercase; select those starting with "R"
19         var startsWithR =
20            from item in items
21            let uppercaseString = item.ToUpper()   
22            where uppercaseString.StartsWith( "R" )
23            orderby uppercaseString
24            select uppercaseString;
25
26         // display query results
27         foreach ( var item in startsWithR )
28            Console.Write( "{0} ", item );
29
30         Console.WriteLine(); // output end of line
31
32         items.Add( "rUbY" ); // add "rUbY" to the end of the List
33         items.Add( "SaFfRon" ); // add "SaFfRon" to the end of the List
34
35         // display updated query results
36         foreach ( var item in startsWithR )
37            Console.Write( "{0} ", item );
38
39         Console.WriteLine(); // output end of line
40      } // end Main
41   } // end class LINQWithListCollection

RED RUST
RED RUBY RUST

Line 21 uses LINQ’s let clause to create a new range variable. This is useful if you need to store a temporary result for use later in the LINQ query. Typically, let declares a new range variable to which you assign the result of an expression that operates on the query’s original range variable. In this case, we use string method ToUpper to convert each item to uppercase, then store the result in the new range variable uppercaseString. We then use uppercaseString in the where, orderby and select clauses. The where clause (line 22) uses string method StartsWith to determine whether uppercaseString starts with the string "R". Method StartsWith performs a case-sensitive comparison to determine whether a string starts with the string received as an argument. If uppercaseString starts with "R", method StartsWith returns true, and the element is included in the query results. More powerful string matching can be done using the regular-expression capabilities introduced in Chapter 16, Strings and Characters.

The query is created only once (lines 19–24), yet iterating over the results (lines 27–28 and 36–37) gives two different lists of colors. This demonstrates LINQ’s deferred execution: the query executes only when you access the results (for example, by iterating over them or calling the Count method), not when you define the query. This allows you to create a query once and execute it many times. Any changes to the data source are reflected in the results each time the query executes.

There may be times when you do not want this behavior, and want to retrieve a collection of the results immediately. LINQ provides extension methods ToArray and ToList for this purpose. These methods execute the query on which they’re called and give you the results as an array or List<T>, respectively. These methods can also improve efficiency if you’ll be iterating over the results multiple times, as you execute the query only once.
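
As a rough illustration of the difference, the sketch below reuses the query from Fig. 9.7 and calls ToList before the data source changes. The class name SnapshotDemo and the variable names query and snapshot are our own. The List returned by ToList keeps the results as they were when it was created, whereas the query itself sees the new element the next time it executes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotDemo
{
   // Returns { deferred query count, snapshot count } after the source changes.
   public static int[] Run()
   {
      var items = new List< string > { "aQua", "RusT", "yElLow", "rEd" };

      var query =
         from item in items
         let uppercaseString = item.ToUpper()
         where uppercaseString.StartsWith( "R" )
         orderby uppercaseString
         select uppercaseString;

      List< string > snapshot = query.ToList(); // executes the query now

      items.Add( "rUbY" ); // change the data source afterward

      // query.Count() executes the query again; snapshot is unaffected
      return new[] { query.Count(), snapshot.Count };
   } // end method Run

   public static void Main()
   {
      int[] counts = Run();
      Console.WriteLine( "Deferred: {0}, Snapshot: {1}",
         counts[ 0 ], counts[ 1 ] );
   } // end Main
} // end class SnapshotDemo
```

Here the deferred query reports three matches (RED, RUBY and RUST), while the snapshot still holds the two that existed when ToList was called.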

C# has a feature called collection initializers, which provide a convenient syntax (similar to array initializers) for initializing a collection. For example, lines 12–16 of Fig. 9.7 could be replaced with the following statement:

List< string > items =
   new List< string > { "aQua", "RusT", "yElLow", "rEd" };

Wrap-Up

This chapter introduced LINQ (Language Integrated Query), a powerful feature for querying data. We showed how to filter an array or collection using LINQ’s where clause, and how to sort the query results using the orderby clause. We used the select clause to select specific properties of an object, and the let clause to introduce a new range variable to make writing queries more convenient. The StartsWith method of class string was used to filter strings starting with a specified character or series of characters. We used several LINQ extension methods to perform operations not provided by the query syntax—the Distinct method to remove duplicates from the results, the Any method to determine if the results contain any items, and the First method to retrieve the first element in the results.

We introduced the List<T> generic collection, which provides all the functionality of arrays, along with other useful capabilities such as dynamic resizing. We used method Add to append new items to the end of the List, method Insert to insert new items into specified locations in the List, method Remove to remove the first occurrence of a specified item, method RemoveAt to remove an item at a specified index and method Contains to determine if an item was in the List. We used property Count to get the number of items in the List, and property Capacity to determine the size the List can grow to without reallocating the internal array. In Chapter 10 we take a deeper look at classes and objects.

Deitel LINQ Resource Center

We use more advanced features of LINQ in later chapters. We’ve also created a LINQ Resource Center (www.deitel.com/LINQ/) that contains many links to additional information, including blogs by Microsoft LINQ team members, books, sample chapters, FAQs, tutorials, videos, webcasts and more. We encourage you to browse the LINQ Resource Center to learn more about this powerful technology.

Summary

Section 9.1 Introduction

  • .NET’s collection classes provide reusable data structures that are reliable, powerful and efficient.

  • Lists automatically increase their size to accommodate additional elements.

  • Large amounts of data are often stored in a database—an organized collection of data. Today’s most popular database systems are relational databases. SQL is the international standard language used almost universally with relational databases to perform queries (i.e., to request information that satisfies given criteria).

  • LINQ allows you to write query expressions (similar to SQL queries) that retrieve information from a wide variety of data sources. You can query arrays and Lists, selecting elements that satisfy a set of conditions—this is known as filtering.

  • A LINQ provider is a set of classes that implement LINQ operations and enable programs to interact with data sources to perform tasks such as sorting, grouping and filtering elements.

Section 9.2 Querying an Array of int Values Using LINQ

  • Repetition statements focus on the process of iterating through elements and checking whether they satisfy the desired criteria. LINQ specifies the conditions that selected elements must satisfy, not the steps necessary to get the results.

  • The System.Linq namespace contains the classes for LINQ to Objects.

  • A from clause specifies a range variable and the data source to query. The range variable represents each item in the data source (one at a time), much like the control variable in a foreach statement.

  • If the condition in the where clause evaluates to true for an element, it’s included in the results.

  • The select clause determines what value appears in the results.

  • A C# interface describes a set of methods and properties that can be used to interact with an object.

  • The IEnumerable<T> interface describes the functionality of any object that’s capable of being iterated over and thus offers methods to access each element in some order.

  • A class that implements an interface must define each method in the interface.

  • Arrays and collections implement the IEnumerable<T> interface.

  • A foreach statement can iterate over any object that implements the IEnumerable<T> interface.

  • A LINQ query returns an object that implements the IEnumerable<T> interface.

  • The orderby clause sorts query results in ascending order by default. Results can also be sorted in descending order using the descending modifier.

  • C# provides implicitly typed local variables, which enable the compiler to infer a local variable’s type based on the variable’s initializer.

  • To distinguish such an initialization from a simple assignment statement, the var keyword is used in place of the variable’s type.

  • You can use local type inference with control variables in the header of a for or foreach statement.

  • Implicitly typed local variables can be used to initialize arrays without explicitly giving their type. To do so, use new[] to specify that the variable is an array.
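
The points summarized for Section 9.2 can be sketched together in one short program. The class name and the sample values are assumptions chosen for illustration; they are not taken from the chapter's figures.

```csharp
using System;
using System.Linq;

public static class ArrayQueryDemo
{
   // Filters values greater than 4 and sorts them in descending order.
   public static int[] Run()
   {
      // new[] lets the compiler infer that values is an int[]
      var values = new[] { 2, 9, 5, 0, 3, 7, 1, 4, 8, 5 };

      var filtered =
         from value in values       // value is the range variable
         where value > 4            // keep elements that satisfy the condition
         orderby value descending   // ascending is the default
         select value;              // the value that appears in the results

      return filtered.ToArray();
   } // end method Run

   public static void Main()
   {
      // var also works for the control variable of a foreach statement
      foreach ( var element in Run() )
         Console.Write( "{0} ", element );

      Console.WriteLine();
   } // end Main
} // end class ArrayQueryDemo
```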

Section 9.3 Querying an Array of Employee Objects Using LINQ

  • LINQ can be used with collections of most data types.

  • Any boolean expression can be used in a where clause.

  • An orderby clause can sort the results according to multiple properties specified in a comma-separated list.

  • Method Any returns true if there’s at least one element in the result; otherwise, it returns false.

  • The First method returns the first element in the query result. You should check that the query result is not empty before calling First.

  • The Count method returns the number of elements in the query result.

  • The Distinct method removes duplicate values from query results.

  • You can select any number of properties in a select clause by specifying them in a comma-separated list in braces after the new keyword. The compiler automatically creates a new class having these properties—called an anonymous type.
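
The following sketch exercises the extension methods and the anonymous-type syntax summarized above. The Person class and its sample data are hypothetical stand-ins for illustration only; they are not the Employee class used in Section 9.3.

```csharp
using System;
using System.Linq;

public class Person // hypothetical type for illustration
{
   public string FirstName { get; set; }
   public string LastName { get; set; }
} // end class Person

public static class QueryMethodsDemo
{
   // Builds a one-line report using Any, First, Distinct and Count.
   public static string Run()
   {
      var people = new[]
      {
         new Person { FirstName = "Ann", LastName = "Brown" },
         new Person { FirstName = "Bob", LastName = "Brown" },
         new Person { FirstName = "Cat", LastName = "Adams" }
      };

      var lastNames =
         from person in people
         select person.LastName;

      // sort by two properties; select an anonymous type with two properties
      var names =
         from person in people
         orderby person.LastName, person.FirstName
         select new { person.FirstName, Last = person.LastName };

      string firstName = "(none)";

      if ( names.Any() ) // check for results before calling First
      {
         var first = names.First();
         firstName = first.FirstName + " " + first.Last;
      } // end if

      return string.Format( "{0}; {1} distinct of {2}", firstName,
         lastNames.Distinct().Count(), lastNames.Count() );
   } // end method Run

   public static void Main()
   {
      Console.WriteLine( Run() );
   } // end Main
} // end class QueryMethodsDemo
```

With this sample data, the report shows that the first person in sorted order is Cat Adams and that two of the three last names are distinct.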

Section 9.4 Introduction to Collections

  • The .NET collection classes provide efficient methods that organize, store and retrieve data without requiring knowledge of how the data is being stored.

  • Class List<T> is similar to an array but provides richer functionality, such as dynamic resizing.

  • The Add method appends an element to the end of a List.

  • The Insert method inserts a new element at a specified position in the List.

  • The Count property returns the number of elements currently in a List.

  • Lists can be indexed like arrays by placing the index in square brackets after the List object’s name.

  • The Remove method is used to remove the first element with a specific value.

  • The RemoveAt method removes the element at the specified index.

  • The Contains method returns true if the element is found in the List, and false otherwise.

  • The Capacity property indicates how many items a List can hold without growing.

Section 9.5 Querying a Generic Collection Using LINQ

  • LINQ to Objects can query Lists.

  • LINQ’s let clause creates a new range variable. This is useful if you need to store a temporary result for use later in the LINQ query.

  • The StartsWith method of the string class determines whether a string starts with the string passed to it as an argument.

  • A LINQ query uses deferred execution—it executes only when you access the results, not when you create the query.

Terminology

Self-Review Exercises

9.1

Fill in the blanks in each of the following statements:

  1. Use the ________ property of the List class to find the number of elements in the List.

  2. The LINQ ________ clause is used for filtering.

  3. ________ are classes specifically designed to store groups of objects and provide methods that organize, store and retrieve those objects.

  4. To add an element to the end of a List, use the ________ method.

  5. To get only unique results from a LINQ query, use the ________ method.

9.2

State whether each of the following is true or false. If false, explain why.

  1. The orderby clause in a LINQ query can sort only in ascending order.

  2. LINQ queries can be used on both arrays and collections.

  3. The Remove method of the List class removes an element at a specific index.

Answers to Self-Review Exercises

9.1

  1. Count.

  2. where.

  3. Collections.

  4. Add.

  5. Distinct.

9.2

  1. False. The descending modifier is used to make orderby sort in descending order.

  2. True.

  3. False. Remove removes the first element equal to its argument. RemoveAt removes the element at a specific index.

Exercises

9.3

(Querying an Array of Invoice Objects) Use the class Invoice provided in the ex09_03 folder with this chapter’s examples to create an array of Invoice objects. Use the sample data shown in Fig. 9.8. Class Invoice includes four properties: a PartNumber (type int), a PartDescription (type string), a Quantity of the item being purchased (type int) and a Price (type decimal). Perform the following queries on the array of Invoice objects and display the results:

  1. Use LINQ to sort the Invoice objects by PartDescription.

  2. Use LINQ to sort the Invoice objects by Price.

  3. Use LINQ to select the PartDescription and Quantity and sort the results by Quantity.

  4. Use LINQ to select from each Invoice the PartDescription and the value of the Invoice (i.e., Quantity * Price). Name the calculated column InvoiceTotal. Order the results by Invoice value. [Hint: Use let to store the result of Quantity * Price in a new range variable total.]

  5. Using the results of the LINQ query in Part 4, select the InvoiceTotals in the range $200 to $500.

Fig. 9.8. Sample data for Exercise 9.3.

Part number   Part description   Quantity   Price
83            Electric sander    7          57.98
24            Power saw          18         99.99
7             Sledge hammer      11         21.50
77            Hammer             76         11.99
39            Lawn mower         3          79.50
68            Screwdriver        106        6.99
56            Jig saw            21         11.00
3             Wrench             34         7.50

9.4

(Duplicate Word Removal) Write a console application that inputs a sentence from the user (assume no punctuation), then determines and displays the nonduplicate words in alphabetical order. Treat uppercase and lowercase letters the same. [Hint: You can use string method Split with no arguments, as in sentence.Split(), to break a sentence into an array of strings containing the individual words. By default, Split uses spaces as delimiters. Use string method ToLower in the select and orderby clauses of your LINQ query to obtain the lowercase version of each word.]

9.5

(Sorting Letters and Removing Duplicates) Write a console application that inserts 30 random letters into a List<char>. Perform the following queries on the List and display your results: [Hint: Strings can be indexed like arrays to access a character at a specific index.]

  1. Use LINQ to sort the List in ascending order.

  2. Use LINQ to sort the List in descending order.

  3. Display the List in ascending order with duplicates removed.
