Chapter 5. Standard Query Operators

Goals of this chapter:

• Introduce the standard query operators not covered so far.

• Show examples of each operator to help you use them in real-world applications.

Up to this point, we have explored the main operators that filter, project, order, group, and join. This chapter introduces the remaining operators and demonstrates how they are used. By the end of this chapter (and Chapter 6, “Working with Set Data,” which details the set-based operators), you will have seen all of the standard query operators.

The Built-In Operators

Microsoft .NET Framework 4 has 52 built-in standard query operators. These operators are in the System.Linq namespace and are made available in each class file by adding the following using directive:
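The directive in question is `using System.Linq;`. As a minimal sketch (the class and method names here are hypothetical, chosen only for illustration), the following shows that the extension methods Where and ToArray become available on an ordinary array once the namespace is imported:

```csharp
using System;
using System.Linq; // brings the standard query operators into scope

public static class UsingLinqDemo
{
    // Where and ToArray are extension methods defined in System.Linq.Enumerable;
    // without the using directive above, this method would not compile.
    public static int[] EvenNumbers(int[] source)
    {
        return source.Where(n => n % 2 == 0).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(EvenNumbers(new[] { 1, 2, 3, 4 }).Length); // prints 2
    }
}
```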

Many of the operators were discussed in Chapters 3 and 4, and the set-based operators are covered in Chapter 6; Table 5-1 summarizes the operators introduced in other chapters. Those operators form the basis of most LINQ queries, and the remaining operators introduced in this chapter build on their capabilities to make more complex queries possible. Table 5-2 lists the standard query operators discussed in this chapter.

Table 5-1. LINQ Standard Query Operators Introduced in Other Chapters of This Book

Table 5-2. Standard Query Operators in the .NET Framework 4 Release Discussed in This Chapter

Aggregation Operators—Working with Numbers

LINQ’s aggregation operators enumerate a sequence of values (normally numeric, but not necessarily), perform some operation for each element, and ultimately return a result (again normally numeric, but not necessarily). In their purest form, Sum, Average, Min, and Max can all be built using the Aggregate operator; however, the specifically named operators make queries cleaner by spelling out exactly which operation is being performed. Each of these operators is covered in detail in the following sections, starting with the Aggregate operator.

Aggregate Operator

The Aggregate operator performs a folding pattern on each element using a given lambda expression and passing the accumulated result to the next element when processed. This operator is powerful, both in building day-to-day queries to process an entire sequence and when building other operators that carry out numeric operations on a sequence (the Aggregate operator is used as the basis for a custom operator to calculate the standard deviation and variance, as discussed in Chapter 9, “Parallel LINQ to Objects”).

The Aggregate operator has three overloaded method signatures. In addition to the aggregation function, these overloads allow you to specify a seed value and, together with a seed, a result selector function. The method signatures available are:

To understand how the Aggregate operator works, Listing 5-1 shows an aggregation operation on an integer array. The query computes the sum of all integers in the array and then adds 100 to the final result. The Console output from this example is shown in Output 5-1. The Aggregate operator iterates the three values in the nums array. At each step, it takes the running total in the accumulator, adds the current element, and stores the result back in the accumulator. At the completion of the loop, the accumulator value is 6, and the result selector function adds an additional 100. The values of the internal variables at the start of each iteration are listed in Table 5-3.

Table 5-3. Step-by-Step Aggregation Process

Listing 5-1. Aggregate operator example—see Output 5-1

Output 5-1
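A minimal sketch of this kind of aggregation follows. It assumes the nums array holds the values 1, 2, and 3 (consistent with an accumulator total of 6); the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class AggregateDemo
{
    public static int SumPlusHundred(int[] nums)
    {
        return nums.Aggregate(
            0,                               // seed: the starting accumulator value
            (acc, element) => acc + element, // fold: add each element to the running total
            acc => acc + 100);               // result selector: applied once, after the loop
    }

    public static void Main()
    {
        Console.WriteLine(SumPlusHundred(new[] { 1, 2, 3 })); // prints 106
    }
}
```

The accumulator takes the values 0, 1, 3, and finally 6 before the result selector runs.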

Simple aggregations are easy to understand. The seed and accumulator don’t have to be numeric types; they can be any type, including an array. Calculating an average, for example, needs two accumulated values (a running sum and a count) to produce a result in a single iteration of the source sequence. The general technique is to keep multiple values in an array and update them on each accumulation cycle. For example, to calculate the average of a sequence in a single pass, the code in Listing 5-2 can be used.

Listing 5-2. Calculating the average of an integer array; this example demonstrates how to use the Aggregate operator with more than a single accumulator value
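One way to sketch the array-as-accumulator technique is shown below. The names are hypothetical, and returning 0 for an empty sequence is an assumption made for this sketch (the built-in Average operator throws an exception in that case).

```csharp
using System;
using System.Linq;

public static class SinglePassAverage
{
    public static double Average(int[] nums)
    {
        return nums.Aggregate(
            new long[] { 0, 0 },  // acc[0] = running sum, acc[1] = element count
            (acc, n) => { acc[0] += n; acc[1]++; return acc; },
            // Assumption for this sketch: an empty sequence averages to 0.
            acc => acc[1] == 0 ? 0.0 : (double)acc[0] / acc[1]);
    }

    public static void Main()
    {
        Console.WriteLine(Average(new[] { 1, 2, 3, 4 }) == 2.5); // prints True
    }
}
```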

Average, Max, Min, and Sum Operators

Average, Min, Max, and Sum do the expected operation given their names. Used in their simplest form, they return their numeric results when used on any numeric sequence of data as the example in Listing 5-3 shows. The Console output from this example is shown in Output 5-2.

Listing 5-3. Average, Min, Max and Sum operator examples—see Output 5-2

Output 5-2

The Aggregate, Min, and Max operators don’t only work on numeric types; they can be applied to any type. For the Min and Max operators, elements are compared using the default comparer for the element type (its IComparable<T> or IComparable implementation), and the lowest or highest element is returned, respectively. For example, the following code finds the lowest (‘a’) and highest (‘x’) character in a character array:
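A short sketch of such a query is shown here; the array contents and the class name are assumptions chosen so that 'a' and 'x' are the extremes.

```csharp
using System;
using System.Linq;

public static class CharMinMaxDemo
{
    // The generic Min and Max overloads rely on the element type's default comparer.
    public static char Lowest(char[] letters) { return letters.Min(); }
    public static char Highest(char[] letters) { return letters.Max(); }

    public static void Main()
    {
        char[] letters = { 'm', 'x', 'a', 'c' };
        Console.WriteLine("{0} {1}", Lowest(letters), Highest(letters)); // prints "a x"
    }
}
```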

Table 5-4 lists the various result type and element type overloads for the Average operator. Table 5-5 lists the overloads for Max, Min, and Sum. All of the built-in numeric types and their nullable counterparts are catered for.

Table 5-4. Average Operator Overload Element and Return Types

Table 5-5. Max / Min / Sum Operator Overload Element and Return Types

Nullable Type Shorthand

Starting with .NET Framework 2.0, it was possible to specify a value type variable declaration allowing a null value, a feature called nullable types. Earlier versions of .NET didn’t support value types being null. To make it easier to define a value type as nullable, Microsoft added a shorthand way of declaring them, simply appending a question mark to the type. The following two lines both declare a nullable type of integer; the second (i2) uses the shorthand syntax:
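The two declaration forms can be sketched as follows (the variable names i1 and i2 follow the text; the class and method names are hypothetical). The second method shows one of the nullable Sum overloads, which skips null elements.

```csharp
using System;
using System.Linq;

public static class NullableDemo
{
    public static bool SameType()
    {
        Nullable<int> i1 = null; // long form
        int? i2 = null;          // shorthand for exactly the same type
        return !i1.HasValue && !i2.HasValue && typeof(int?) == typeof(Nullable<int>);
    }

    // Sum over int? elements ignores nulls and returns an int? result.
    public static int? SumIgnoringNulls(int?[] values)
    {
        return values.Sum();
    }

    public static void Main()
    {
        Console.WriteLine(SameType());                                   // prints True
        Console.WriteLine(SumIgnoringNulls(new int?[] { 1, null, 3 })); // prints 4
    }
}
```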

For each operator, in addition to the various overload types for different numeric element inputs, there is an overload that works on the sequence elements themselves and an overload that takes a selector function to specify the values to be aggregated. The signatures for these operators follow a repetitive pattern, and the following example shows the Sum operator that returns a decimal value:

Aggregate operations using selector functions are often useful in queries that group data. Listing 5-4 demonstrates a fairly complex query that summarizes fictitious call log data. In this case, the query groups all incoming calls from the same number and then aggregates the total number of calls, call time, and average call time. This query generates the Console output shown in Output 5-3.

Listing 5-4. Aggregate operators are useful for generating summary data in a query that groups data—see Output 5-3

Output 5-3
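As a rough sketch of this pattern, the following groups call records by number and applies Count, Sum, and Average with selector functions to each group. The CallRecord type, its properties, and the sample data are all hypothetical stand-ins, not the call log classes used in the book's listing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class CallRecord // hypothetical stand-in for a call log entry
{
    public string Number { get; set; }
    public int Minutes { get; set; }
}

public static class CallSummaryDemo
{
    public static List<string> Summarize(IEnumerable<CallRecord> calls)
    {
        var query = from call in calls
                    group call by call.Number into g
                    orderby g.Key
                    select string.Format("{0}: {1} calls, {2} min total, {3} min average",
                        g.Key,
                        g.Count(),                  // number of calls in the group
                        g.Sum(c => c.Minutes),      // selector picks the value to total
                        g.Average(c => c.Minutes)); // selector picks the value to average
        return query.ToList();
    }

    public static void Main()
    {
        var calls = new[]
        {
            new CallRecord { Number = "555-0100", Minutes = 10 },
            new CallRecord { Number = "555-0199", Minutes = 5 },
            new CallRecord { Number = "555-0100", Minutes = 20 }
        };
        foreach (string line in Summarize(calls)) Console.WriteLine(line);
    }
}
```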

Count and LongCount Operators

Count and LongCount return the number of elements in a sequence and only differ in their return types, with Count returning an int type result and LongCount returning a long type result.

There are two overloads for each operator. The first takes no arguments and returns the total number of elements in a sequence, and the second overload takes a predicate function and returns the count of elements that pass the predicate’s test. The method signatures available are:

In their most basic use, the Count and LongCount operators return the number of elements in a sequence. Although the built-in collection classes rarely support more than int.MaxValue elements, sources streamed from other places can exceed this limit, and LongCount should be used for them. Listing 5-5 demonstrates how to use the Count and LongCount operators. The Console output for this example is shown in Output 5-4.

Listing 5-5. When the number of elements may exceed int.MaxValue, use the LongCount operator instead of Count—see Output 5-4

Output 5-4

The second overload of Count and LongCount that takes a filter predicate as an argument allows you to count the number of elements that pass that function (return the Boolean value of true). For example, to count the number of integers divisible by ten in an integer array, the following code can be used (in case you are interested, the result is 214,748,364):

The Count operator is optimized for sequences that implement the ICollection<T> interface. Collections implementing this interface expose a Count property of their own and can usually return their count without iterating the underlying collection. For these collections, Count returns source.Count as the result. This optimization is not possible when the overload specifying a predicate is used; a full loop of the sequence is unavoidable to apply the predicate function.
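The two overloads can be sketched on a small scale as follows. The 1 to 100 range is an assumption chosen to keep the example fast, and the names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountDemo
{
    // The predicate overload must visit every element to apply the test.
    public static int DivisibleByTen(IEnumerable<int> source)
    {
        return source.Count(n => n % 10 == 0);
    }

    public static void Main()
    {
        List<int> nums = Enumerable.Range(1, 100).ToList();
        Console.WriteLine(nums.Count());         // prints 100 (List<int> implements ICollection<int>)
        Console.WriteLine(nums.LongCount());     // prints 100, returned as a long
        Console.WriteLine(DivisibleByTen(nums)); // prints 10
    }
}
```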

Conversion Operators—Changing Types

LINQ to Objects has many operators that facilitate casting elements in a sequence and the sequence itself. This is often necessary to access specific instance methods that occur in one collection type from another or to access an instance property of an element.

AsEnumerable Operator

The AsEnumerable operator changes the compile-time type of the source sequence, casting a sequence or collection back to System.Collections.Generic.IEnumerable<T>. This allows control over which LINQ provider will be called, and it is used extensively when choosing between LINQ providers, most commonly at this time when choosing to execute part of a LINQ to SQL query in memory rather than at the database. This technique is demonstrated in Chapter 9, where it is used to segment queries into sequential and parallel implementations.

AsEnumerable has a single overload with the following method signature:

To demonstrate this operator, the following sample code creates a List<int> called list and then, using the AsEnumerable operator, turns this collection into an IEnumerable<int>. Figure 5-1 shows the IntelliSense pop-up in Visual Studio 2010 for the list instance, where all instance and extension methods are available. Figure 5-2 shows that once the AsEnumerable operator has been applied, the collection is typed as IEnumerable<T>, and only the extension methods are available (the Count instance property of List<T> can no longer be seen unless the sequence is cast back to an ICollection<int> or List<int>).

Figure 5-1. Visual Studio 2010 Intellisense for List<T>. Notice the Count instance property in addition to the extension method Count().

Figure 5-2. Visual Studio 2010 Intellisense for IEnumerable<T>, created by calling the AsEnumerable() operator. Notice the Count instance property is no longer accessible.
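The effect can be sketched in code as well. In this hypothetical example, the variable keeps referring to the same List<int> instance; only the compile-time type changes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AsEnumerableDemo
{
    public static bool SameInstance(List<int> list)
    {
        IEnumerable<int> sequence = list.AsEnumerable();
        // sequence.Count (the property) would not compile here;
        // only the Count() extension method is visible on IEnumerable<int>.
        return object.ReferenceEquals(list, sequence) && sequence.Count() == list.Count;
    }

    public static void Main()
    {
        Console.WriteLine(SameInstance(new List<int> { 1, 2, 3 })); // prints True
    }
}
```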

Casting a fuller-featured collection type back to a vanilla IEnumerable<T> can be useful when testing how custom operators handle nonindexed collections. For example, when testing an operator, begin with a collection in a List<T> (which supports access to elements by index) to prove the IList<T> indexed path, and then call AsEnumerable() on the collection to write tests against the vanilla IEnumerable<T> type. Be aware that AsEnumerable changes only the compile-time type: an operator that inspects the runtime type of its source (for example, with an as IList<T> check) still sees the List<T>, so fully exercising the nonindexed path requires a source that does not implement IList<T>.

Cast Operator

The Cast operator yields all elements in a nongeneric IEnumerable collection as a given type. This is mainly to cater to the .NET 1.0 style of collections, which pre-date the generic type features of .NET Framework 2.0, such as the ArrayList collection, which holds references as Object types. If the type of the elements cannot be cast to the new type, a System.InvalidCastException is thrown.

Cast has a single overload with the following method signature:

To demonstrate using the Cast<T> operator, the following code initializes a nongeneric ArrayList collection (introduced in .NET Framework 1.0). This collection only has access to a few operators (namely Cast, OfType), and to gain access to all the LINQ standard query operators, you need to use the Cast operator as shown in Listing 5-6.

Listing 5-6. Example LINQ query over the .NET 1.0 ArrayList type, which doesn’t implement IEnumerable<T>
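A minimal sketch of querying an ArrayList through Cast follows; the names and sample values are hypothetical. The second method illustrates the exception raised when an element is not of the requested type.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class CastDemo
{
    // Cast<int>() turns the nongeneric ArrayList into an IEnumerable<int>,
    // after which operators such as Where become available.
    public static List<int> GreaterThanOne(ArrayList items)
    {
        return items.Cast<int>().Where(n => n > 1).ToList();
    }

    public static bool ThrowsOnMixedTypes()
    {
        var mixed = new ArrayList { 1, "two", 3 };
        try
        {
            mixed.Cast<int>().ToList(); // the string element cannot be cast to int
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(GreaterThanOne(new ArrayList { 1, 2, 3 }).Count); // prints 2
        Console.WriteLine(ThrowsOnMixedTypes());                            // prints True
    }
}
```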

OfType Operator

The OfType operator filters the elements of an IEnumerable collection, returning an IEnumerable<T> with only elements of the type specified. This is similar to the Cast operator; however, where an element cannot be cast to the specified type, the OfType operator simply omits that element, whereas the Cast operator throws an exception.

OfType has a single overload with the following method signature:

The OfType operator will simply skip elements that can’t be safely cast to the specified type. To demonstrate how to use the OfType operator, Listing 5-7 shows how to implement a type-safe query over an ArrayList collection. The collection type ArrayList implements IEnumerable but not IEnumerable<T>, which precludes it from using most of the LINQ standard query operators. The ArrayList collection holds references to System.Object types, and since all types inherit from System.Object, type safety of the elements cannot be assumed (as in this example, elements can be of mixed types). Generics, introduced in .NET 2.0, fixed this situation, and newer code almost always uses a generic collection type.

Listing 5-7. The OfType operator allows working safely with nongeneric collection types like ArrayList
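The skipping behavior might be sketched like this, using a hypothetical ArrayList of mixed element types:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class OfTypeDemo
{
    // Elements that are not ints (the string and the double below) are silently skipped.
    public static List<int> OnlyInts(ArrayList items)
    {
        return items.OfType<int>().ToList();
    }

    public static void Main()
    {
        var mixed = new ArrayList { 1, "two", 3, 4.0 };
        Console.WriteLine(OnlyInts(mixed).Count); // prints 2
    }
}
```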

A more day-to-day use of the OfType operator is for filtering collections to isolate elements based on specific type. This is common in many types of applications, for example in drawing programs where each element in a collection can be of different “shape” types, which inherit from a base, as the following code demonstrates:

The OfType operator allows you to write queries over a collection holding all shapes and isolate based on the subtypes, Rectangle or Circle. Listing 5-8 demonstrates how to restrict collection elements to a certain type, with this example returning the number of Rectangles in the source collection.

Listing 5-8. The OfType operator can be used to filter collection elements based on their type

ToArray Operator

The ToArray operator returns an array (T[]) of a given type from any IEnumerable<T> sequence. The most common reason to use this operator is to force the immediate execution of a query in order to capture the resulting data as a snapshot (executing a query that would otherwise wait to be enumerated because of deferred execution).

ToArray has a single overload with the following method signature:

To demonstrate the use of the ToArray operator, Listing 5-9 executes a query over an IEnumerable<int> sequence and captures the results in an int[] array.

Listing 5-9. The ToArray operator enumerates an IEnumerable<T> sequence and returns the results in an array
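The snapshot behavior can be sketched as follows (names and values are hypothetical). The array captured by ToArray does not change when the source collection is modified afterward, whereas the deferred query would pick up the change.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ToArrayDemo
{
    public static int[] SnapshotThenModify(List<int> source)
    {
        IEnumerable<int> query = source.Where(n => n > 1); // deferred: nothing runs yet
        int[] snapshot = query.ToArray();                  // the query executes here
        source.Add(99);                                    // later change to the source
        // Enumerating query again would now include 99; snapshot is unaffected.
        return snapshot;
    }

    public static void Main()
    {
        Console.WriteLine(SnapshotThenModify(new List<int> { 1, 2, 3 }).Length); // prints 2
    }
}
```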

ToDictionary Operator

The ToDictionary operator converts an IEnumerable<TValue> sequence into a Dictionary<TKey,TValue> collection. The System.Collections.Generic.Dictionary<TKey,TValue> collection type keeps a list of key-value pairs and demands that each key is unique. A System.ArgumentException is thrown if a duplicate key insert is attempted during the conversion process. A Dictionary collection allows efficient access to elements by key, and converting a sequence to a dictionary is a way of improving lookup performance if code frequently needs to find elements by key.

There are four method overloads of the ToDictionary operator. The simplest overload takes only a delegate for the key selection function, and the most complex takes delegates for key selection and element selection plus an IEqualityComparer<TKey> that is used for determining if key values are equal. When the element selector function is not passed, the source element itself is used as the value, and when an IEqualityComparer is not passed, EqualityComparer<TKey>.Default is used. The method signatures available are:

To demonstrate the use of the ToDictionary operator, Listing 5-10 creates a Dictionary of sample contacts based on the key of last name concatenated with first name, allowing efficient lookup of a contact by name. The sample demonstrates this by looking up a contact from the resulting Dictionary by key value.

Listing 5-10. The ToDictionary operator creates a dictionary of elements in a sequence

Note

One confusing aspect of using the ToDictionary operator is the reversal of the generic type arguments between the operator (element type and then key type) and the Dictionary type (key type and then element type). You will notice in the example shown in Listing 5-10 that the Dictionary is defined as Dictionary<string, Contact> and the ToDictionary operator as ToDictionary<Contact, string>. The ordering makes sense from the method declaration perspective (the source type always comes first, followed by the return type), but it needs to be remembered because it is a common stumbling block when learning to use this operator.

The ToDictionary operator isn’t limited to projecting the sequence elements as their own type. Passing in an element selection function allows the value held in the Dictionary against a key value to be any type. The example shown in Listing 5-11 projects the elements to an anonymous type and uses case-insensitive string comparison to ensure that keys are absolutely unique (not just different by character casing), using the built-in static string comparison instances (creating custom IEqualityComparers is covered in Chapter 4 in the “Specifying Your Own Key Comparison Function” section and in Chapter 6 under “Custom EqualityComparers When Using LINQ Set Operators”).

Listing 5-11. The ToDictionary operator also accepts a custom equality comparer and element projection expression
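The key-only overload and the overload with an element selector and comparer can be sketched together as follows. The Contact class, its properties, and the sample data are hypothetical, and the second method projects each value to a string rather than to an anonymous type so that it can be returned from a method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Contact // hypothetical sample type
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Phone { get; set; }
}

public static class ToDictionaryDemo
{
    // ToDictionary<Contact, string> produces a Dictionary<string, Contact>.
    public static Dictionary<string, Contact> IndexByName(IEnumerable<Contact> contacts)
    {
        return contacts.ToDictionary(c => c.LastName + c.FirstName);
    }

    public static Dictionary<string, string> PhoneByName(IEnumerable<Contact> contacts)
    {
        return contacts.ToDictionary(
            c => c.LastName + c.FirstName,     // key selector
            c => c.Phone,                      // element selector
            StringComparer.OrdinalIgnoreCase); // keys differing only by case count as duplicates
    }

    public static void Main()
    {
        var contacts = new[]
        {
            new Contact { FirstName = "Ann", LastName = "Lee", Phone = "555-0100" },
            new Contact { FirstName = "Bob", LastName = "Ray", Phone = "555-0199" }
        };
        Console.WriteLine(IndexByName(contacts)["LeeAnn"].Phone); // prints 555-0100
        Console.WriteLine(PhoneByName(contacts)["RAYBOB"]);       // prints 555-0199
    }
}
```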

ToList Operator

The ToList operator returns a System.Collections.Generic.List<T> of a given type from any IEnumerable<T> sequence. The most common reason to use this operator is to force the immediate execution of a query in order to capture the resulting data as a snapshot, executing the query by enumerating over all elements.

ToList has a single overload with the following method signature:

To demonstrate the ToList operator, Listing 5-12 executes a query over an IEnumerable<int> sequence, captures the results in a List<int>, and then uses the ForEach method on that type to write the results to the Console window.

Listing 5-12. The ToList operator enumerates an IEnumerable<T> sequence and returns the results in a List<T>

ToLookup Operator

The ToLookup operator was first covered in Chapter 4, “Grouping and Joining Data,” where it was used to facilitate very efficient one-to-many joins between two sequences. ToLookup creates subsequences based on elements that share a common key value. ToLookup is similar in usage and behavior to ToDictionary, the difference being how they handle duplicate elements with the same key value. ToLookup allows multiple elements for a given key value, whereas ToDictionary throws an exception if this ever occurs.

The method signatures available for the ToLookup operator are:

To demonstrate the ToLookup operator, the example shown in Listing 5-13 builds a look-up list of all calls made from or to the same phone number (calls_lookup). The look-up groupings are then used in a second query to project a sequence of calls against each contact. The resulting Console output (shortened to the first three records) is shown in Output 5-5.

Listing 5-13. Sample usage of the ToLookup operator to achieve a one-to-many outer join—see Output 5-5

Output 5-5
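A reduced sketch of ToLookup is shown below. The PhoneCall type and sample data are hypothetical. Unlike a Dictionary, the lookup accepts several elements under one key, and indexing a key that is not present yields an empty sequence rather than throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class PhoneCall // hypothetical sample type
{
    public string Number { get; set; }
    public int Minutes { get; set; }
}

public static class ToLookupDemo
{
    public static ILookup<string, int> MinutesByNumber(IEnumerable<PhoneCall> calls)
    {
        return calls.ToLookup(c => c.Number, c => c.Minutes);
    }

    public static void Main()
    {
        var calls = new[]
        {
            new PhoneCall { Number = "555-0100", Minutes = 10 },
            new PhoneCall { Number = "555-0100", Minutes = 20 },
            new PhoneCall { Number = "555-0199", Minutes = 5 }
        };
        ILookup<string, int> lookup = MinutesByNumber(calls);
        Console.WriteLine(lookup["555-0100"].Sum());   // prints 30
        Console.WriteLine(lookup["555-0000"].Count()); // prints 0 (missing key, empty sequence)
    }
}
```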

The result of a ToLookup operator is a System.Linq.ILookup<TKey, TElement>. The definition for the ILookup<TKey, TElement> interface and the IGrouping<TKey, TElement> interface are shown in Listing 5-14.

Listing 5-14. Definition for the ILookup and IGrouping interfaces

Element Operators

DefaultIfEmpty, ElementAt, ElementAtOrDefault, First, FirstOrDefault, Last, LastOrDefault, Single, and SingleOrDefault are operators that return an individual element from a sequence, or a default value if no matching element exists.

The OrDefault variation of each of these operators (and DefaultIfEmpty) neatly handles the situation where an element cannot be found that satisfies the criteria, returning a default(T) result in these cases. The variations that don’t end with OrDefault throw an exception if no element satisfies the criteria.

DefaultIfEmpty Operator

The DefaultIfEmpty operator allows graceful handling of an empty sequence (one with no elements). There are two method overloads; the first handles an empty sequence by returning a single element of default(TSource) within an IEnumerable<T>. The second overload takes an argument specifying a default value to return if the source sequence is empty. In both cases, the default value is only returned if the source sequence is empty. The method signatures available are:

To demonstrate the use of DefaultIfEmpty, Listing 5-15 defines two arrays, one empty (empty) and one with a sequence of data (nums). DefaultIfEmpty is called on these arrays to demonstrate the behavior. The Console output from this example is shown in Output 5-6.

Listing 5-15. The DefaultIfEmpty operator allows safe handling of potentially empty source sequences—see Output 5-6

Output 5-6
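Both overloads can be sketched as follows, assuming arrays named empty and nums as in the text (the element values and the class name are hypothetical):

```csharp
using System;
using System.Linq;

public static class DefaultIfEmptyDemo
{
    public static int[] WithDefault(int[] source)
    {
        return source.DefaultIfEmpty().ToArray(); // default(int) is 0
    }

    public static int[] WithFallback(int[] source, int fallback)
    {
        return source.DefaultIfEmpty(fallback).ToArray();
    }

    public static void Main()
    {
        int[] empty = new int[0];
        int[] nums = { 1, 2, 3 };
        Console.WriteLine(WithDefault(empty)[0]);       // prints 0
        Console.WriteLine(WithFallback(empty, -1)[0]);  // prints -1
        Console.WriteLine(WithFallback(nums, -1).Length); // prints 3 (source returned unchanged)
    }
}
```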

ElementAt and ElementAtOrDefault Operators

The ElementAt operator returns the element at a given zero-based index position in a sequence. An ArgumentOutOfRangeException is thrown if the index position is less than zero or beyond the end of the sequence. To avoid this exception being thrown, use the ElementAtOrDefault operator instead, and a default(T) instance will be returned when the index is out of range.

The method signatures available for the ElementAt and ElementAtOrDefault operators are:

To demonstrate the ElementAt and ElementAtOrDefault operators, Listing 5-16 calls these operators on a simple array of integers. To avoid an exception being thrown where an index is out of bounds, as for the sample query called error, the ElementAtOrDefault operator is used instead. The Console output from this example is shown in Output 5-7.

Listing 5-16. The ElementAt operator allows elements to be accessed by zero-based index position—see Output 5-7

Output 5-7

First and FirstOrDefault Operators

The First and FirstOrDefault operators return the first element in a sequence. There are two overloads for each operator. The first overload takes no arguments and returns the first element from the sequence; the second takes a predicate argument and returns the first element that satisfies that predicate. If no elements are in the sequence or pass the predicate function, a System.InvalidOperationException is thrown, or if using the FirstOrDefault operator, an instance of default(T) is returned.

The method signatures available for the First and FirstOrDefault operators are:

To demonstrate the use of the First and FirstOrDefault operators, Listing 5-17 calls these operators with a simple array of integers and an empty array. To avoid an exception being thrown when there is no first element or no first element that passes the predicate function, the FirstOrDefault operator is used instead. The Console output from this example is shown in Output 5-8.

Listing 5-17. The First operator returns the first element in a sequence—see Output 5-8

Output 5-8
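The difference between the two operators can be sketched like this (hypothetical names and data):

```csharp
using System;
using System.Linq;

public static class FirstDemo
{
    // Returns default(int), which is 0, when no element passes the predicate.
    public static int FirstEvenOrDefault(int[] source)
    {
        return source.FirstOrDefault(n => n % 2 == 0);
    }

    public static bool FirstThrowsOnEmpty()
    {
        try
        {
            new int[0].First(); // no first element exists
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(FirstEvenOrDefault(new[] { 1, 4, 6 })); // prints 4
        Console.WriteLine(FirstEvenOrDefault(new[] { 1, 3, 5 })); // prints 0
        Console.WriteLine(FirstThrowsOnEmpty());                  // prints True
    }
}
```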

Last and LastOrDefault Operators

The Last and LastOrDefault operators return the last element in a sequence. There are two overloads for each operator. The first overload takes no arguments and returns the last element from the sequence; the second takes a predicate argument and returns the last element that satisfies that predicate. If no elements are in the sequence or pass the predicate function, a System.InvalidOperationException is thrown, or if using the LastOrDefault operator, an instance of default(T) is returned.

The method signatures available for the Last and LastOrDefault operators are:

To demonstrate the use of the Last and LastOrDefault operators, Listing 5-18 calls these operators with a simple array of integers and an empty array. To avoid an exception being thrown when there is no last element or no last element that passes the predicate function, the LastOrDefault operator is used instead. The Console output from this example is shown in Output 5-9.

Listing 5-18. The Last operator returns the last element in a sequence—see Output 5-9

Output 5-9

Single and SingleOrDefault Operators

The Single and SingleOrDefault operators return the single element from a sequence or the single element that passes a predicate function. The Single operator throws a System.InvalidOperationException if there are zero elements or more than one element in the sequence. The SingleOrDefault operator returns an instance of default(T) if there are zero elements in a sequence but still throws a System.InvalidOperationException if there is more than one element in the sequence or more than one element that passes the predicate.

If there are situations when more than one element might pass a filter or query and you want to avoid the chance of an exception being raised (considering the implications of just choosing one of the elements over another), consider using the First or Last operators instead.

The method signatures available for the Single and SingleOrDefault operators are:

To demonstrate the Single and SingleOrDefault operators, Listing 5-19 calls these operators with a simple array of integers, an array with a single element, and an empty array. To avoid an exception being thrown when there is no element or no element that passes the predicate function, the SingleOrDefault operator is used instead. If there is ever more than a single element in the sequence or that passes the predicate function, a System.InvalidOperationException is thrown in all cases. The Console output from this example is shown in Output 5-10.

Listing 5-19. The Single operator returns a single element from a sequence—see Output 5-10

Output 5-10
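The three cases (no element, exactly one, and more than one) can be sketched as follows; the names and the message text are hypothetical.

```csharp
using System;
using System.Linq;

public static class SingleDemo
{
    public static string Describe(int[] source)
    {
        try
        {
            // Empty sequence: default(int). One element: that element.
            return source.SingleOrDefault().ToString();
        }
        catch (InvalidOperationException)
        {
            // More than one element always throws, even for SingleOrDefault.
            return "more than one element";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new int[0]));     // prints 0
        Console.WriteLine(Describe(new[] { 7 }));    // prints 7
        Console.WriteLine(Describe(new[] { 1, 2 })); // prints more than one element
    }
}
```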

Equality Operator—SequenceEqual

There is a single operator that can be used for testing sequence equality—the SequenceEqual operator.

SequenceEqual Operator

The SequenceEqual operator compares one sequence with another and only returns true if the sequences are equal in element values, element order, and element count. There are two method signatures for SequenceEqual, one that takes the sequence to compare to as an argument, and one that also allows a custom IEqualityComparer to be passed to control how element equality testing is carried out. When no IEqualityComparer is passed in, the default equality comparer for the source element type is used.

The method signatures available for the SequenceEqual operator are:

To demonstrate the use of the SequenceEqual operator, Listing 5-20 calls this operator to compare three string array sequences. The Console output from this example is shown in Output 5-11.

Listing 5-20. The SequenceEqual operator compares two sequences for equal element values—see Output 5-11

Output 5-11
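A compact sketch of both overloads follows, using hypothetical string arrays. Order matters for equality, and a comparer such as StringComparer.OrdinalIgnoreCase can relax how elements are compared.

```csharp
using System;
using System.Linq;

public static class SequenceEqualDemo
{
    public static bool SameIgnoringCase(string[] first, string[] second)
    {
        return first.SequenceEqual(second, StringComparer.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        string[] a = { "a", "b", "c" };
        string[] b = { "a", "b", "c" };
        string[] c = { "c", "b", "a" };
        Console.WriteLine(a.SequenceEqual(b)); // prints True
        Console.WriteLine(a.SequenceEqual(c)); // prints False (same values, different order)
        Console.WriteLine(SameIgnoringCase(a, new[] { "A", "B", "C" })); // prints True
    }
}
```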

Generation Operators—Generating Sequences of Data

Generating sequences of data to combine with LINQ to Objects queries is often necessary when building unit tests and when working with indexable data sources, such as when iterating through the x and y coordinates of a bitmap image. The generation methods aren’t extension methods; they are simply static method calls in the System.Linq.Enumerable class and are included with the LINQ operators because they fulfill the important task of creating sequences of data without using a for-loop construct.

Empty Operator

The Empty operator doesn’t appear very useful at first sight. So why return an empty IEnumerable<T> sequence? The main use for this operator is for testing the empty sequence error handling behavior of other operators.

An empty sequence is not a collection type initialized to null; it is an array of zero elements of the specified type. Pseudo-code would be longer than the implementation of this method, and the following code example reproduces the Empty operator:

The most common use I’ve found for this operator is for unit testing other operators. The following is an example of testing that an operator throws the correct exception when called on an empty sequence of int types. (This sample uses NUnit, but any testing framework has similar capabilities. The RandomElement operator is built as an example in Chapter 7, “Extending LINQ to Objects.”)

Range Operator

The Range operator builds a sequence of integers, starting from a given integer and incrementing by one, for a given number of elements. It has the following method signature:

This is useful for quickly generating a sequence of numbers. The following example generates and writes the integer numbers 1900 through to 1904 to the Console window (1900 1901 1902 1903 1904):
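A sketch of such a call is shown below (the class and method names are hypothetical). Note that the second argument to Range is the number of elements to generate, not the final value.

```csharp
using System;
using System.Linq;

public static class RangeDemo
{
    public static int[] Years(int firstYear, int count)
    {
        return Enumerable.Range(firstYear, count).ToArray();
    }

    public static void Main()
    {
        foreach (int year in Years(1900, 5))
        {
            Console.Write(year + " "); // writes 1900 1901 1902 1903 1904
        }
        Console.WriteLine();
    }
}
```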

Sequences of numbers can be useful for data-binding to controls. One example of this is binding a list of years to a combo-box control on a form. This can be achieved by assigning the DataSource property of any bindable control to a List<int> generated by Range and, in this particular case, reversing the order (so the most recent years are at the top):

Listing 5-21. The Range operator generates numeric sequences, in this case all years from 1900 to 2010—Form output is shown in Figure 5-3

Figure 5-3. Using the Range operator to populate years in a ComboBox control.

This operator shows its true practical potential when generating sequences within queries to target indexable locations. This is often the case when working with data in a bitmap. The example shown in Listing 5-22 demonstrates how to address every x and y location in a bitmap loaded from a file and return the luminance of each pixel, as shown in Output 5-12.

Listing 5-22. Using the Range operator to address every pixel in a bitmap image—see Output 5-12

Output 5-12
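The coordinate-addressing pattern can be sketched without a bitmap by substituting a two-dimensional int array for the image. That substitution, the threshold parameter, and all names below are assumptions made for illustration; they are not the bitmap code from the listing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GridScanDemo
{
    // Visits every (x, y) location row by row and returns the coordinates
    // whose stored value is at or above the threshold.
    public static List<int[]> BrightCells(int[,] grid, int threshold)
    {
        var query = from y in Enumerable.Range(0, grid.GetLength(0))
                    from x in Enumerable.Range(0, grid.GetLength(1))
                    where grid[y, x] >= threshold
                    select new[] { x, y };
        return query.ToList();
    }

    public static void Main()
    {
        int[,] grid = { { 0, 200 }, { 255, 10 } };
        foreach (int[] cell in BrightCells(grid, 128))
        {
            Console.WriteLine("x={0}, y={1}", cell[0], cell[1]);
        }
    }
}
```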

Repeat Operator

The Repeat operator replicates a given data value any number of times you specify. The simplest example of using the Repeat operator is to initialize an array. The following code creates an array of integer values, with each element in the array initialized to -1:

The Repeat operator isn’t limited to integer values; any type can be used. For instance, the same kind of initialization for an array of three string values, each set to “No data”, is:

Value types work as expected, but caution must be used when the repeated value is a reference type. It is obvious once pointed out, but the first example in Listing 5-23 (b1) initializes a bitmap array with all elements pointing to the same single bitmap instance. It is likely that the intention of this code is to create five separate bitmaps, each five pixels by five pixels. The way to achieve this with the Repeat operator is to use it to drive a loop of five Select projections, as in the second example in Listing 5-23 (b2).

Listing 5-23. Be careful when using the Repeat operator for reference types
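The value-type case and the reference-type pitfall can be sketched as follows. List<int> stands in for the bitmap type here (an assumption, to avoid a dependency on System.Drawing), and the names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RepeatDemo
{
    public static int[] FilledWith(int value, int count)
    {
        return Enumerable.Repeat(value, count).ToArray();
    }

    public static List<int>[] SharedInstance(int count)
    {
        // new List<int>() is evaluated once; every slot refers to that one list.
        return Enumerable.Repeat(new List<int>(), count).ToArray();
    }

    public static List<int>[] SeparateInstances(int count)
    {
        // Repeat only drives the loop; Select creates a fresh list per element.
        return Enumerable.Repeat(0, count).Select(i => new List<int>()).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(FilledWith(-1, 5).Length); // prints 5
        var shared = SharedInstance(3);
        shared[0].Add(42);
        Console.WriteLine(shared[2].Count);   // prints 1 (all slots share one list)
        var separate = SeparateInstances(3);
        separate[0].Add(42);
        Console.WriteLine(separate[2].Count); // prints 0 (independent lists)
    }
}
```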

Merging Operators

Zip, a single merging operator, was added into .NET Framework 4.

Zip Operator

The Zip operator merges the corresponding elements of two sequences using a specified selector function. The selector function takes an element from the first sequence and the corresponding element from the second sequence and projects a result using the function supplied. This process continues until one of the sequences (or both) has no more elements.

Zip has a single overload with the following method signature:

To demonstrate the use of the Zip operator, Listing 5-24 calls this operator to merge a string array with an integer array. The first element in both sequences is passed to the result selector function, then the second elements, and then the third elements. Even though the string array has two more elements, they are skipped because the integer array has only three elements. The Console output from this example is shown in Output 5-13.

Listing 5-24. The Zip operator merges (combines) two sequences—see Output 5-13
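A minimal sketch of such a merge follows, assuming a five-element string array and a three-element integer array. The array contents and the names ZipSketch, letters, and numbers are illustrative choices.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipSketch
{
    // Signature of the operator being called:
    // public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
    //     this IEnumerable<TFirst> first,
    //     IEnumerable<TSecond> second,
    //     Func<TFirst, TSecond, TResult> resultSelector);
    public static List<string> Merge()
    {
        string[] letters = { "A", "B", "C", "D", "E" }; // five elements
        int[] numbers = { 1, 2, 3 };                    // three elements

        // Pairs A with 1, B with 2, and C with 3. D and E have no partner
        // in the shorter sequence, so they are dropped.
        return letters.Zip(numbers, (s, i) => s + i).ToList();
    }

    public static void Main()
    {
        foreach (string item in Merge())
        {
            Console.WriteLine(item); // A1, then B2, then C3
        }
    }
}
```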

Output 5-13

Partitioning Operators—Skipping and Taking Elements

Paging data involves restricting the data returned to segments of the entire sequence. For instance, websites commonly return search results a fixed number at a time, which improves performance: rather than transferring all one million items, they return the first ten, then the next ten when asked, and so on. The Skip and Take operators form the basis for achieving this using LINQ queries. SkipWhile and TakeWhile let a predicate control the segmentation of data, skipping elements until the predicate fails or taking elements until the predicate fails.

Skip and Take Operators

Skip and Take do as their names suggest: Skip jumps over the given number of elements in a sequence (or skips the entire sequence if it has fewer elements than the number to skip), and Take returns the given number of elements (or as many elements as remain before the end of the sequence is reached).

The method signatures available for the Skip and Take operators are:

Although the operators are independent, they are often combined to achieve paging a result sequence. The pattern generally used for paging is shown in Listing 5-25, which demonstrates retrieving elements 21 to 30 (the third page), where each page is ten records long (normally the page number is passed in as a variable). The Console output from this example is shown in Output 5-14.

Listing 5-25. The Skip and Take operators combine to return paged segments of a sequence (see Output 5-14)
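The paging pattern can be sketched as follows, assuming a source sequence of the integers 1 to 100. The source data and the names PagingSketch, pageNumber, and pageSize are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingSketch
{
    public static int[] ThirdPage()
    {
        int pageNumber = 3;  // normally passed in as a variable
        int pageSize = 10;

        // Assumed source data: the integers 1 to 100.
        IEnumerable<int> source = Enumerable.Range(1, 100);

        // Skip the two pages before page 3, then take one page.
        return source
            .Skip((pageNumber - 1) * pageSize)
            .Take(pageSize)
            .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ThirdPage())); // 21 through 30
    }
}
```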

Output 5-14

The Take operator also allows the topmost (first) records to be returned, similar to SQL’s SELECT TOP(n) statement. This is useful to protect against extreme result set sizes or to limit the result set to a manageable quantity during development and testing. If the page size is retrieved from user input (like the query string in a URL), then this input should be checked and limited to avoid denial of service attacks by someone setting the page size to an unrealistic extreme. Listing 5-26 shows a custom extension method operator that has safeguards to avoid paging exploitation through malicious input. In this case, the operator limits page size to 100 elements and handles the translation of page number and page size into Skip and Take operations, centralizing repetitive code like that shown in Listing 5-25 in one place. When this operator is called for the third page with a page size of ten, it returns the same results shown in Output 5-14.

Listing 5-26. Custom paging extension method with safeguards from extreme inputs taken from untrusted sources
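One way such an operator might look is sketched below. The method name Page, the clamping of page numbers below one to the first page, and the overflow guard are assumptions; only the 100-element cap and the translation into Skip and Take come from the text above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingExtensions
{
    // Upper bound on page size, as described in the text.
    private const int MaxPageSize = 100;

    // Hypothetical operator name and parameter handling.
    public static IEnumerable<T> Page<T>(
        this IEnumerable<T> source, int pageNumber, int pageSize)
    {
        if (source == null) throw new ArgumentNullException("source");

        // Treat page numbers below 1 as the first page (an assumption).
        int safePage = Math.Max(pageNumber, 1);

        // Clamp the page size into the range 1..MaxPageSize.
        int safeSize = Math.Min(Math.Max(pageSize, 1), MaxPageSize);

        // Compute the offset in 64-bit arithmetic so that a huge page
        // number cannot overflow into a negative skip count.
        long offset = (long)(safePage - 1) * safeSize;
        int toSkip = (int)Math.Min(offset, int.MaxValue);

        return source.Skip(toSkip).Take(safeSize);
    }

    public static void Main()
    {
        var page = Enumerable.Range(1, 100).Page(3, 10);
        Console.WriteLine(string.Join(", ", page)); // 21 through 30
    }
}
```

Because the clamping happens inside the operator, callers can pass untrusted values straight through, for example `source.Page(pageFromQueryString, sizeFromQueryString)`, where both variable names are hypothetical.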

SkipWhile and TakeWhile Operators

The SkipWhile and TakeWhile operators skip or return elements from a sequence while a predicate function passes (returns true). The first element that doesn’t pass the predicate function ends the evaluation.

There are two overloads for each operator—one that takes a predicate to determine if an element should be skipped, and the second that takes the predicate but also provides an element index position that can be used in the predicate function for any given purpose. The method signatures available are:

A simple example of where these operators are useful is when parsing text files from the beginning. The example shown in Listing 5-27 skips any lines at the start of a file (in this case, this data is in string form for simplicity) and then returns all lines until an empty line is found. All lines after this blank line are ignored. The Console output from this example is shown in Output 5-15.

Listing 5-27. The SkipWhile and TakeWhile operators skip or return elements while their predicate argument returns true—see Output 5-15
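A minimal sketch of this kind of parsing follows. It assumes the leading lines to skip are comment lines starting with the # character; that convention, the sample text, and the name HeaderSkipSketch are illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HeaderSkipSketch
{
    public static List<string> Body()
    {
        // Assumed file content, held in a string for simplicity.
        string text = "# header\n# more header\nfirst\nsecond\n\nignored";
        string[] lines = text.Split('\n');

        return lines
            // Skip leading lines while they start with '#'.
            .SkipWhile(line => line.StartsWith("#", StringComparison.Ordinal))
            // Then take lines until the first empty line.
            .TakeWhile(line => line.Length > 0)
            .ToList();
    }

    public static void Main()
    {
        foreach (string line in Body())
        {
            Console.WriteLine(line); // first, then second
        }
    }
}
```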

Output 5-15

Quantifier Operators—All, Any, and Contains

The Quantifier operators (All, Any, and Contains) return Boolean result values based on the presence or absence of certain data within a sequence. These operators could all be built by creating queries that filter the data using a Where clause and comparing the resulting sequence element count to see if it is greater than zero. But these operators are optimized for their purpose and return a result at the first possible point, avoiding iterating an entire sequence (often referred to as short-circuit evaluation).

All Operator

The All operator returns the Boolean result of true if all elements in a sequence pass the predicate function supplied.

All has a single overload with the following method signature:

To demonstrate the All operator, Listing 5-28 tests three arrays of integer values and returns the value of true if all elements are even (are divisible by two with no remainder). The Console output from this example is shown in Output 5-16.

Listing 5-28. The All operator returns true if all objects pass the predicate—see Output 5-16
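A minimal sketch of the even-number test follows. The three arrays are illustrative values, and the name AllSketch is hypothetical. The third array is empty to show that All returns true when there are no elements to fail the predicate.

```csharp
using System;
using System.Linq;

public static class AllSketch
{
    // True when every element is divisible by two with no remainder.
    public static bool AllEven(int[] values)
    {
        return values.All(n => n % 2 == 0);
    }

    public static void Main()
    {
        Console.WriteLine(AllEven(new[] { 2, 4, 6 })); // True
        Console.WriteLine(AllEven(new[] { 2, 3, 6 })); // False
        Console.WriteLine(AllEven(new int[0]));        // True (no element fails)
    }
}
```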

Output 5-16

The predicate can be as complex as necessary and can contain any number of logical ands (&&) and ors (||), but it must return a Boolean value. Listing 5-29 demonstrates querying to determine if all contacts (from the same sample data introduced in Chapter 2, “Introducing LINQ to Objects,” Table 2-1) are over the age of 21 and that each contact has a phone number and an email address. The Console output from this example is shown in Output 5-17.

Listing 5-29. Example of using the All operator over a sample set of contact data—see Output 5-17
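A compound predicate of this kind can be sketched as follows. The Contact class, its Age, Phone, and Email properties, and the sample records are hypothetical stand-ins and are not the Chapter 2 sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical contact record used only for this sketch.
public class Contact
{
    public string Name { get; set; }
    public int Age { get; set; }
    public string Phone { get; set; }
    public string Email { get; set; }
}

public static class AllContactsSketch
{
    // True only if every contact is over 21 and has both a phone and an email.
    public static bool AllReachableAdults(IEnumerable<Contact> contacts)
    {
        return contacts.All(c =>
            c.Age > 21 &&
            !string.IsNullOrEmpty(c.Phone) &&
            !string.IsNullOrEmpty(c.Email));
    }

    public static List<Contact> SampleContacts()
    {
        return new List<Contact>
        {
            new Contact { Name = "Ann", Age = 34, Phone = "555-0100", Email = "ann@example.com" },
            new Contact { Name = "Bob", Age = 45, Phone = "555-0101", Email = null }
        };
    }

    public static void Main()
    {
        // Bob has no email address, so the test fails for the whole set.
        Console.WriteLine(AllReachableAdults(SampleContacts())); // False
    }
}
```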

Output 5-17

Any Operator

The Any operator returns the Boolean result of true if there are any elements in a sequence or if there are any elements that pass a predicate function in a sequence.

The Any operator has two overloads. The first overload takes no arguments and tests the sequence for at least one element, while the second takes a predicate function and tests that at least one element passes that predicate’s logic. The method signatures are:

To demonstrate the Any operator, Listing 5-30 tests three sequences—one that is empty (has no elements), one that has a single element, and a third that has many elements. Calling the Any operation on these sequences returns a Boolean result. The Console output from this example is shown in Output 5-18.

Listing 5-30. The Any operator returns true if there are any elements in a sequence—see Output 5-18
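A minimal sketch of the three cases follows; the array contents and the name AnySketch are illustrative.

```csharp
using System;
using System.Linq;

public static class AnySketch
{
    public static bool HasElements(int[] values)
    {
        // True as soon as one element is found.
        return values.Any();
    }

    public static void Main()
    {
        Console.WriteLine(HasElements(new int[0]));         // False
        Console.WriteLine(HasElements(new[] { 1 }));        // True
        Console.WriteLine(HasElements(new[] { 1, 2, 3 }));  // True
    }
}
```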

Output 5-18

The Any operator is the most efficient way to test if any elements are in a sequence, and although the following code is equivalent, it is highly discouraged because it potentially requires iterating the entire sequence if the source doesn’t implement ICollection<T>, which has a Count property. This is only shown here to explain the basic logic of Any:
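The cost difference can be sketched with an iterator that counts how many elements it produces. The Numbers iterator, the Visited counter, and the VisitedBy helper are hypothetical names introduced for this illustration. The source is deliberately not an ICollection<T>, so Count has to enumerate it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusCountSketch
{
    // Number of elements the iterator has produced since the last reset.
    public static int Visited;

    // Lazily yields 1..1000 and records each element it produces.
    public static IEnumerable<int> Numbers()
    {
        for (int i = 1; i <= 1000; i++)
        {
            Visited++;
            yield return i;
        }
    }

    // Runs a test against a fresh sequence and reports how many elements it visited.
    public static int VisitedBy(Func<IEnumerable<int>, bool> test)
    {
        Visited = 0;
        test(Numbers());
        return Visited;
    }

    public static void Main()
    {
        Console.WriteLine(VisitedBy(s => s.Any()));       // 1
        Console.WriteLine(VisitedBy(s => s.Count() > 0)); // 1000 (discouraged form)
    }
}
```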

The second overload of Any that takes a predicate function is equally easy to use. It takes a predicate function and returns the Boolean result of true if any element in the sequence passes (returns true). The example shown in Listing 5-31 looks for certain strings in an array of strings to determine if there are any cats or fish-like animals (albeit in a simplistic fashion). The Console output from this example is shown in Output 5-19.

Listing 5-31. The Any operator returns true if there are any elements in a sequence that pass the predicate function—see Output 5-19
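A minimal sketch of such a search follows, using simple substring matching. The animal names and the name AnyPredicateSketch are illustrative.

```csharp
using System;
using System.Linq;

public static class AnyPredicateSketch
{
    // Illustrative data only.
    public static readonly string[] Animals = { "dog", "wildcat", "jellyfish", "parrot" };

    // True if at least one animal name contains the given fragment.
    public static bool AnyContaining(string fragment)
    {
        return Animals.Any(a => a.Contains(fragment));
    }

    public static void Main()
    {
        Console.WriteLine(AnyContaining("cat"));  // True  (wildcat)
        Console.WriteLine(AnyContaining("fish")); // True  (jellyfish)
        Console.WriteLine(AnyContaining("bird")); // False
    }
}
```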

Output 5-19

Again, Any is the most efficient way of determining if there are any elements in a sequence that match a predicate function; however, to demonstrate the basic logic this operator employs, the following is functionally equivalent code that is highly discouraged due to its potential performance impact:
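The logic can be sketched in two ways: the discouraged Where and Count combination, and a hand-written loop that returns at the first match. Both methods are illustrations of the idea and are not the framework’s actual source; the names AnyLogicSketch, ViaWhereCount, and ViaLoop are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyLogicSketch
{
    // Discouraged: filters and counts every match before comparing to zero.
    public static bool ViaWhereCount<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        return source.Where(predicate).Count() > 0;
    }

    // Short-circuit form: stops at the first element that passes.
    public static bool ViaLoop<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item))
            {
                return true;
            }
        }
        return false;
    }

    public static void Main()
    {
        int[] values = { 1, 3, 4, 7 };
        Console.WriteLine(ViaWhereCount(values, n => n % 2 == 0)); // True
        Console.WriteLine(ViaLoop(values, n => n % 2 == 0));       // True
        Console.WriteLine(values.Any(n => n % 2 == 0));            // True
    }
}
```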

The predicate employed by Any can be as complex as required. Listing 5-32 demonstrates how to use Any on a set of contact records to determine if there are any people under the age of 21 and a second test to see if there are any people without an email address or a phone number (from the same sample data introduced in Chapter 2, Table 2-1). The Console output from this example is shown in Output 5-20.

Listing 5-32. Example of using the Any operator over a sample set of contact data—see Output 5-20

Output 5-20

Contains Operator

The Contains operator returns the Boolean value of true if the sequence contains an element equal to the test argument passed into this operator.

There are two overloads for this operator—one that takes a single argument of the test value and the second that takes the test value and an IEqualityComparer<T> instance that is employed for equality testing. The method signatures for the Contains operator are:

To demonstrate the Contains operator, Listing 5-33 looks to see if various versions of the name Peter are contained in the elements of a string array. The test b1 will be false because the default comparison of a string is case sensitive. Test b2 will be true because it is an exact string match, and test b3 will be true even though the string case is different because a custom IEqualityComparer<T> was used to determine string equality. (Creating custom IEqualityComparers is covered in Chapter 4 in the “Specifying Your Own Key Comparison Function” section and in Chapter 6 under “Custom EqualityComparers When Using LINQ Set Operators.”) The Console output from this example is shown in Output 5-21.

Listing 5-33. The Contains operator returns true if the test element is in the sequence—see Output 5-21
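The three tests can be sketched as follows. In place of a custom IEqualityComparer<T>, this sketch uses the built-in StringComparer.OrdinalIgnoreCase, which implements IEqualityComparer<string>; the array contents and the name ContainsSketch are illustrative.

```csharp
using System;
using System.Linq;

public static class ContainsSketch
{
    // Illustrative data only.
    public static readonly string[] Names = { "peter", "paul", "mary" };

    public static void Main()
    {
        // b1: default string equality is case sensitive, so "Peter" is not found.
        bool b1 = Names.Contains("Peter");

        // b2: exact match.
        bool b2 = Names.Contains("peter");

        // b3: a case-insensitive comparer makes "PETER" equal to "peter".
        bool b3 = Names.Contains("PETER", StringComparer.OrdinalIgnoreCase);

        Console.WriteLine(b1); // False
        Console.WriteLine(b2); // True
        Console.WriteLine(b3); // True
    }
}
```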

Output 5-21

Some collection types that implement IEnumerable<T> and are in scope for the Contains extension method implement their own instance method called Contains (List<T> for instance). This doesn’t cause any problems in most cases, but if you explicitly want to call the extension Contains, specify the generic type. Listing 5-34 demonstrates how to explicitly control what Contains gets called (instance method or extension method) when using a collection of type List<T>. The Console output from this example is shown in Output 5-22.

Listing 5-34. How to explicitly specify which Contains you want to call when a collection has an instance method called Contains—see Output 5-22
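A minimal sketch of the three call forms on a List<string> follows; the list contents and the name ContainsChoiceSketch are illustrative. All three return the same result here, and the difference lies only in which method the compiler binds to.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContainsChoiceSketch
{
    public static bool[] Results()
    {
        List<string> names = new List<string> { "peter", "paul", "mary" };

        // Binds to the List<T>.Contains instance method.
        bool viaInstance = names.Contains("peter");

        // Supplying the generic type argument binds to the LINQ extension method.
        bool viaExtension = names.Contains<string>("peter");

        // The same extension method, called with static method syntax.
        bool viaStatic = Enumerable.Contains(names, "peter");

        return new[] { viaInstance, viaExtension, viaStatic };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Results())); // True, True, True
    }
}
```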

Output 5-22

Summary

This chapter introduced many of the remaining standard query operators that are available to you in .NET Framework 4. The next chapter covers the last few operators that specifically apply set-based functions over data sources.
