Goals of this chapter:
• Introduce the standard query operators not covered so far.
• Show examples for each operator to help with real-world applications.
Up to this point, we have explored the main operators that filter, project, order, group, and join. This chapter introduces the remaining operators and demonstrates how they are used. By the end of this chapter (and Chapter 6, “Working with Set Data,” which details the set-based operators), you will have seen all of the standard query operators.
Microsoft .NET Framework 4 has 52 built-in standard query operators. These operators are in the System.Linq namespace and are made available in each class file by adding the following using clause:
using System.Linq;
Many of the operators were discussed in Chapters 3 and 4. This chapter covers the remaining operators, except set-based operators, which are covered in Chapter 6 (see Table 5-1 for a summary). These operators form the basis of most LINQ queries, and the remaining operators introduced in this chapter build on these capabilities to make more complex queries possible. Table 5-2 lists the standard query operators discussed in this chapter.
LINQ’s aggregation operators enumerate a sequence of values (normally numeric, but not mandatory), perform some operation for each element, and ultimately return a result (normally numeric, but not mandatory). In purest form, Sum, Average, Min, and Max can all be built using the Aggregate operator; however, the specifically named operators make queries cleaner in syntax by spelling out exactly the operation being performed. Each of these operators is covered in detail in the following sections, starting with the Aggregate operator.
The Aggregate
operator performs a folding pattern on each element using a given lambda expression and passing the accumulated result to the next element when processed. This operator is powerful, both in building day-to-day queries to process an entire sequence and when building other operators that carry out numeric operations on a sequence (the Aggregate
operator is used as the basis for a custom operator to calculate the standard deviation and variance, as discussed in Chapter 9, “Parallel LINQ to Objects”).
The Aggregate
operator has three overloaded method signatures. These overloads allow you to specify a seed value and/or a selection function in addition to the aggregation function. The method signatures available are:
To understand how the Aggregate operator works, Listing 5-1 shows an aggregation operation on an integer array. The query computes the sum of all integers in the array and then adds 100 to the final result. The Console output from this example is shown in Output 5-1. The Aggregate operator iterates the three values in the nums array. At each step, it takes the running total in the accumulator, adds the current element, and stores the result back in the accumulator. At the completion of the loop, the accumulator value is 6, and the select function adds an additional 100. The running watch on the internal variables at the start of each loop is listed in Table 5-3.
Listing 5-1. Aggregate
operator example—see Output 5-1
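A minimal sketch of the kind of query described above, assuming a three-element array holding 1, 2, and 3 (the values and the SumPlus100 name are illustrative assumptions, chosen so the accumulator ends at 6):

```csharp
using System;
using System.Linq;

public static class AggregateSketch
{
    // Sums the elements, then adds 100 in the result selector.
    public static int SumPlus100(int[] nums)
    {
        return nums.Aggregate(
            0,                   // seed: the accumulator starts at zero
            (acc, i) => acc + i, // add each element to the running total
            acc => acc + 100);   // result selector runs once, after the loop
    }

    public static void Main()
    {
        var nums = new int[] { 1, 2, 3 };    // assumed sample values
        Console.WriteLine(SumPlus100(nums)); // prints 106
    }
}
```

This uses the overload that takes a seed, an accumulator function, and a result selector.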
Output 5-1
Simple aggregations are easy to understand. The seed and accumulator don’t have to be numeric types; they can be any type, including an array. Calculating the average, for example, needs two accumulated values (the running total and the element count) to produce a result in a single iteration of the source sequence. The general technique is to keep multiple values in an array and update them on each accumulation cycle. For example, to calculate the average of a sequence in a single pass, the code in Listing 5-2 can be used.
Listing 5-2. Calculating the average of an integer array; this example demonstrates how to use the Aggregate
operator with more than a single accumulator value
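One possible shape for such a query is sketched below. The two-element int array used as the accumulator, and the Average method name, are assumptions made for illustration:

```csharp
using System;
using System.Linq;

public static class AverageSketch
{
    // Single-pass average: the accumulator carries two values at once.
    public static double Average(int[] nums)
    {
        return nums.Aggregate(
            new int[2], // acc[0] = running total, acc[1] = element count
            (acc, i) => { acc[0] += i; acc[1] += 1; return acc; },
            acc => (double)acc[0] / acc[1]); // divide once, at the end
    }

    public static void Main()
    {
        // 10 / 4 = 2.5
        Console.WriteLine(Average(new int[] { 1, 2, 3, 4 }));
    }
}
```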
Average, Min, Max, and Sum perform the operations their names suggest. Used in their simplest form, they return a numeric result when applied to any numeric sequence, as the example in Listing 5-3 shows. The Console output from this example is shown in Output 5-2.
Listing 5-3. Average, Min
, Max
and Sum
operator examples—see Output 5-2
Output 5-2
The Aggregate, Min, and Max operators don’t only work on numeric types. Aggregate can be applied to elements of any type, and Min and Max can be applied to any type that supports comparison. For the Min and Max operators, elements are compared using the default comparer for the element type (Comparer<T>.Default, which relies on the type’s IComparable<T> or IComparable implementation), and either the lowest or highest element is returned, respectively. For example, the following code finds the lowest (‘a’) and highest (‘x’) character in a character array:
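A short sketch of Min and Max over characters; the sample array and the helper names are assumptions, chosen so that ‘a’ is the lowest and ‘x’ the highest element:

```csharp
using System;
using System.Linq;

public static class MinMaxSketch
{
    // Both calls resolve to the generic overloads, which compare chars
    // using the default comparer for char.
    public static char Lowest(char[] chars) { return chars.Min(); }
    public static char Highest(char[] chars) { return chars.Max(); }

    public static void Main()
    {
        var chars = new char[] { 'd', 'x', 'a', 'm' }; // assumed sample data
        Console.WriteLine(Lowest(chars));  // prints a
        Console.WriteLine(Highest(chars)); // prints x
    }
}
```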
Table 5-4 lists the various result type and element type overloads for the Average
operator. Table 5-5 lists the overloads for Max
, Min
, and Sum
. All of the built-in numeric types and their nullable counterparts are catered for.
For each operator, in addition to the various overload types for different numeric element inputs, there is an overload that works on the sequence elements themselves and an overload that takes a selector function to specify the values to be aggregated. The signatures for these operators follow a repetitive pattern, and the following example shows the Sum
operator that returns a decimal value:
Aggregate operations using selector functions are often useful in queries that group data. Listing 5-4 demonstrates a fairly complex query that summarizes fictitious call log data. In this case, the query groups all incoming calls from the same number and then aggregates the total number of calls, call time, and average call time. This query generates the Console output shown in Output 5-3.
Listing 5-4. Aggregate
operators are useful for generating summary data in a query that groups data—see Output 5-3
Output 5-3
Count and LongCount return the number of elements in a sequence and differ only in their return types, with Count returning an int result and LongCount returning a long result.
There are two overloads for each operator. The first takes no arguments and returns the total number of elements in a sequence, and the second overload takes a predicate function and returns the count of elements that pass the predicate’s test. The method signatures available are:
In their most basic use, the Count and LongCount operators return the number of elements in a sequence. Although the built-in collection classes rarely support more than int.MaxValue elements, sources streamed from other places can exceed this limit, and in those cases LongCount should be used. Listing 5-5 demonstrates how to use the Count and LongCount operators. The Console output for this example is shown in Output 5-4.
Listing 5-5. When the number of elements may exceed int.MaxValue, use the LongCount operator instead of Count (see Output 5-4)
Output 5-4
The second overload of Count and LongCount, which takes a filter predicate as an argument, allows you to count the number of elements that pass that function (return the Boolean value of true). For example, to count the number of integers divisible by ten in an integer array, the following code can be used (in case you are interested, the result is 214,748,364):
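A hedged sketch of the predicate overload follows. It assumes the source is the integers from 1 up to a limit, which is consistent with the count quoted above when the limit is int.MaxValue; the method name and the upTo parameter are hypothetical:

```csharp
using System;
using System.Linq;

public static class CountSketch
{
    // Counts the values in 1..upTo that divide evenly by ten.
    public static int CountDivisibleByTen(int upTo)
    {
        return Enumerable.Range(1, upTo).Count(i => i % 10 == 0);
    }

    public static void Main()
    {
        Console.WriteLine(CountDivisibleByTen(1000)); // prints 100

        // Passing int.MaxValue gives 214,748,364, but that call
        // enumerates roughly 2.1 billion values and takes a while.
    }
}
```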
The Count operator is optimized for sequences that implement the ICollection<T> interface. Collections implementing this interface expose a Count property of their own and can often return their counts without iterating the underlying collection. For these collections, Count returns source.Count as the result. This optimization is not possible when the overload specifying a predicate is used; a full loop of the sequence is unavoidable to apply the predicate function.
LINQ to Objects has many operators that facilitate casting elements in a sequence and the sequence itself. This is often necessary to access specific instance methods that occur in one collection type from another or to access an instance property of an element.
The AsEnumerable operator changes the compile-time type of the source sequence, casting a sequence or collection back to System.Collections.Generic.IEnumerable<T>. This allows control over which LINQ provider is called and is used extensively when choosing between LINQ providers, most commonly at this time when choosing to execute part of a LINQ to SQL query in memory rather than at the database. This technique is demonstrated in Chapter 9, where it is used to segment queries into sequential and parallel implementations.
AsEnumerable
has a single overload with the following method signature:
To demonstrate this operator, the following sample code creates a List<int> called list and then, using the AsEnumerable operator, turns this collection into an IEnumerable<int>. Figure 5-1 shows the IntelliSense pop-up in Visual Studio 2010. All instance and extension methods are available on the list instance. Figure 5-2 shows that once this collection has an AsEnumerable operation applied, it is cast to an IEnumerable<T> type, and only the extension methods are available (the Count instance property of List<T> can no longer be seen unless the sequence is cast back into an ICollection<int> or List<int>).
Casting a fuller-featured collection type back to a vanilla IEnumerable<T> can be useful when testing how custom operators handle nonindexed collections. For example, when testing an operator, begin by testing with a List<T> (which supports access to elements by index) to prove the IList<T> indexable path, and then call AsEnumerable() on the collection to write tests that exercise the vanilla IEnumerable<T> path, which cannot benefit from element access via index. Note that AsEnumerable() changes only the compile-time type, so this selects the IEnumerable<T> path only when the operator chooses its path through overload resolution; an operator that checks the runtime type (for example, with source as IList<T>) still sees the list.
The Cast
operator yields all elements in a nongeneric IEnumerable
collection as a given type. This is mainly to cater to the .NET 1.0 style of collections, which pre-date the generic type features of .NET Framework 2.0, such as the ArrayList
collection, which holds references as Object
types. If the type of the elements cannot be cast to the new type, a System.InvalidCastException
is thrown.
Cast
has a single overload with the following method signature:
To demonstrate using the Cast<T> operator, the following code initializes a nongeneric ArrayList collection (introduced in .NET Framework 1.0). This collection only has access to a few operators (namely Cast and OfType), and to gain access to all the LINQ standard query operators, you need to use the Cast operator as shown in Listing 5-6.
Listing 5-6. Example LINQ query over the .NET 1.0 ArrayList
type, which doesn’t implement IEnumerable<T>
The OfType operator filters the elements of an IEnumerable collection, returning an IEnumerable<T> with only elements of the type specified. This is similar to the Cast operator; however, where elements cannot be cast to the specified type, the OfType operator simply omits those elements, whereas the Cast operator throws an exception.
OfType
has a single overload with the following method signature:
The OfType operator simply skips elements that can’t be safely cast to the specified type. To demonstrate how to use the OfType operator, Listing 5-7 shows how to implement a type-safe query over an ArrayList collection. The collection type ArrayList implements IEnumerable but not IEnumerable<T>, which precludes it from using most of the LINQ standard query operators. The ArrayList collection holds references to System.Object types, and since all types inherit from System.Object, type safety of the elements cannot be assumed (as in this example, elements can be of mixed types). Generics, introduced in .NET 2.0, fixed this situation, and newer code is unlikely to use anything other than a generic collection type.
Listing 5-7. The OfType
operator allows working safely with nongeneric collection types like ArrayList
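The contrast between OfType and Cast can be sketched as follows, assuming a small ArrayList of mixed element types (the contents and the helper names are illustrative):

```csharp
using System;
using System.Collections;
using System.Linq;

public static class OfTypeSketch
{
    // OfType keeps only the boxed ints and silently skips everything else.
    public static int[] IntsOnly(ArrayList list)
    {
        return list.OfType<int>().ToArray();
    }

    // Cast fails on the first element that is not an int.
    public static bool CastThrows(ArrayList list)
    {
        try { list.Cast<int>().ToArray(); return false; }
        catch (InvalidCastException) { return true; }
    }

    public static void Main()
    {
        var mixed = new ArrayList { 1, "two", 3, 4.0 }; // assumed sample data
        Console.WriteLine(string.Join(", ", IntsOnly(mixed))); // prints 1, 3
        Console.WriteLine(CastThrows(mixed));                  // prints True
    }
}
```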
A more day-to-day use of the OfType
operator is for filtering collections to isolate elements based on specific type. This is common in many types of applications, for example in drawing programs where each element in a collection can be of different “shape” types, which inherit from a base, as the following code demonstrates:
The OfType
operator allows you to write queries over a collection holding all shapes and isolate based on the subtypes, Rectangle
or Circle
. Listing 5-8 demonstrates how to restrict collection elements to a certain type, with this example returning the number of Rectangles
in the source collection.
Listing 5-8. The OfType
operator can be used to filter collection elements based on their type
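A minimal sketch of this filtering pattern is shown below. The Shape, Rectangle, and Circle classes are empty placeholders assumed for illustration, not the original class definitions:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape hierarchy.
public abstract class Shape { }
public class Rectangle : Shape { }
public class Circle : Shape { }

public static class ShapeSketch
{
    // Isolates the Rectangle elements and counts them.
    public static int CountRectangles(IEnumerable<Shape> shapes)
    {
        return shapes.OfType<Rectangle>().Count();
    }

    public static void Main()
    {
        var shapes = new List<Shape>
        {
            new Rectangle(), new Circle(), new Rectangle()
        };
        Console.WriteLine(CountRectangles(shapes)); // prints 2
    }
}
```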
The ToArray operator returns an array (T[]) of a given type from any IEnumerable<T> sequence. The most common reason to use this operator is to force the immediate execution of a query in order to capture the resulting data as a snapshot (forcing the execution of the query that is waiting to be enumerated because of deferred execution).
ToArray
has a single overload with the following method signature:
To demonstrate the use of the ToArray
operator, Listing 5-9 executes a query over an IEnumerable<int>
sequence and captures the results in an int[]
array.
Listing 5-9. The ToArray
operator enumerates an IEnumerable<T>
sequence and returns the results in an array
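The snapshot behavior can be illustrated with a small sketch. The source list and the even-number filter are assumptions; the point is that the array does not change when the source later does, while the deferred query does:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ToArraySketch
{
    // Returns { snapshot length, live query count } after the source changes.
    public static int[] SnapshotVersusQuery()
    {
        var source = new List<int> { 1, 2, 3, 4 };
        IEnumerable<int> query = source.Where(i => i % 2 == 0); // deferred
        int[] snapshot = query.ToArray(); // executes now: 2 and 4

        source.Add(6); // seen by the query, not by the snapshot

        return new int[] { snapshot.Length, query.Count() };
    }

    public static void Main()
    {
        int[] result = SnapshotVersusQuery();
        Console.WriteLine(result[0]); // prints 2
        Console.WriteLine(result[1]); // prints 3
    }
}
```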
The ToDictionary operator converts an IEnumerable<TValue> sequence into a Dictionary<TKey,TValue> collection. The System.Collections.Generic.Dictionary<TKey,TValue> collection type keeps a list of key-value pairs and demands that each key value is unique. A System.ArgumentException is thrown if a duplicate key insert is attempted during the conversion process. A Dictionary collection allows efficient access to elements based on key value, and converting a sequence to a dictionary is a way of improving lookup performance if code frequently needs to find elements by key value.
There are four method overloads of the ToDictionary operator. The simplest overload takes only a delegate for the key selection function, and the most complex takes a delegate for key selection, a delegate for element selection, and an IEqualityComparer that is used for determining if key values are equal. When the element selector function is not passed, the source element itself is used as the value, and when an IEqualityComparer is not passed, EqualityComparer<TKey>.Default is used. The method signatures available are:
To demonstrate the use of the ToDictionary
operator, Listing 5-10 creates a Dictionary of sample contacts based on the key of last name concatenated with first name, allowing efficient lookup of a contact by name. The sample demonstrates this by looking up a contact from the resulting Dictionary
by key value.
Listing 5-10. The ToDictionary
operator creates a dictionary of elements in a sequence
One confusing aspect of using the ToDictionary operator is the reversal of the generic type arguments between the operator (element type and then key type) and the Dictionary type (key type and then element type). You will notice in the example shown in Listing 5-10 that the Dictionary is defined as Dictionary<string, Contact> and the ToDictionary operator as ToDictionary<Contact, string>. This makes sense from the method declaration perspective (source type always comes first and then the return type); it just needs to be remembered.
The ToDictionary
operator isn’t limited to projecting the sequence elements as their own type. Passing in an element selection function allows the value held in the Dictionary against a key value to be any type. The example shown in Listing 5-11 projects the elements to an anonymous type and uses case-insensitive string comparison to ensure that keys are absolutely unique (not just different by character casing), using the built-in static string comparison instances (creating custom IEqualityComparers
is covered in Chapter 4 in the “Specifying Your Own Key Comparison Function” section and in Chapter 6 under “Custom EqualityComparers When Using LINQ Set Operators”).
Listing 5-11. The ToDictionary
operator also accepts a custom equality comparer and element projection expression
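Both forms of ToDictionary discussed in this section can be sketched as follows. The Contact class and its properties are hypothetical, and the second method projects to a string rather than an anonymous type so that the return type can be named:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical contact type.
public class Contact
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Phone { get; set; }
}

public static class ToDictionarySketch
{
    // Key selector only: the values are the Contact elements themselves.
    public static Dictionary<string, Contact> ByName(IEnumerable<Contact> contacts)
    {
        return contacts.ToDictionary(c => c.LastName + c.FirstName);
    }

    // Key selector, element selector, and a case-insensitive key comparer.
    public static Dictionary<string, string> PhoneByName(IEnumerable<Contact> contacts)
    {
        return contacts.ToDictionary(
            c => c.LastName + c.FirstName,
            c => c.Phone,
            StringComparer.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        var contacts = new List<Contact>
        {
            new Contact { FirstName = "Ann", LastName = "Smith", Phone = "555-0100" },
            new Contact { FirstName = "Bob", LastName = "Jones", Phone = "555-0101" }
        };
        Console.WriteLine(ByName(contacts)["SmithAnn"].Phone);  // prints 555-0100
        Console.WriteLine(PhoneByName(contacts)["jonesbob"]);   // prints 555-0101
    }
}
```

With the case-insensitive comparer, two contacts whose keys differ only by casing would be treated as duplicates and cause an ArgumentException.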
The ToList
operator returns a System.Collections.Generic.List<T>
of a given type from any IEnumerable<T>
sequence. The most common reason to use this operator is to force the immediate execution of a query in order to capture the resulting data as a snapshot, executing the query by enumerating over all elements.
ToList
has a single overload with the following method signature:
To demonstrate the ToList operator, Listing 5-12 executes a query over an IEnumerable<int> sequence, captures the results in a List<int>, and then uses the ForEach method on that type to write the results to the Console window.
Listing 5-12. The ToList
operator enumerates an IEnumerable<T>
sequence and returns the results in a List<T>
The ToLookup
operator was first covered in Chapter 4, “Grouping and Joining Data,” where it was used to facilitate very efficient one-to-many joins between two sequences. ToLookup
creates subsequences based on elements that share a common key value. ToLookup
is similar in usage and behavior to ToDictionary
, the difference being how they handle duplicate elements with the same key value. ToLookup
allows multiple elements for a given key value, whereas ToDictionary
throws an exception if this ever occurs.
The method signatures available for the ToLookup
operator are:
To demonstrate the ToLookup operator, the example shown in Listing 5-13 builds a look-up list of all calls made from or to the same phone number (calls_lookup). The look-up groupings are then used in a second query to project a sequence of calls against each contact. The resulting Console output (shortened to the first three records) is shown in Output 5-5.
Listing 5-13. Sample usage of the ToLookup
operator to achieve a one-to-many outer join—see Output 5-5
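A reduced sketch of the lookup half of this technique is shown below. The Call class, its properties, and the sample data are assumptions; the sketch shows that several elements can share a key and that a missing key yields an empty sequence, which is what makes the outer-join style of query possible:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical call-log record.
public class Call
{
    public string Number { get; set; }
    public int Duration { get; set; }
}

public static class ToLookupSketch
{
    // Number of calls recorded against a phone number (zero if none).
    public static int CallCount(IEnumerable<Call> calls, string number)
    {
        ILookup<string, Call> lookup = calls.ToLookup(c => c.Number);
        return lookup[number].Count(); // a missing key gives an empty sequence
    }

    public static void Main()
    {
        var calls = new List<Call>
        {
            new Call { Number = "555-0100", Duration = 2 },
            new Call { Number = "555-0100", Duration = 15 },
            new Call { Number = "555-0101", Duration = 1 }
        };
        Console.WriteLine(CallCount(calls, "555-0100")); // prints 2
        Console.WriteLine(CallCount(calls, "555-0199")); // prints 0
    }
}
```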
Output 5-5
The result of a ToLookup operator is a System.Linq.ILookup<TKey, TElement>. The definitions for the ILookup<TKey, TElement> interface and the IGrouping<TKey, TElement> interface are shown in Listing 5-14.
Listing 5-14. Definition for the ILookup
and IGrouping
interfaces
DefaultIfEmpty, ElementAt, ElementAtOrDefault, First, FirstOrDefault, Last, LastOrDefault, Single, and SingleOrDefault are all operators that return an individual element from a sequence, or a default value if no matching element exists.
The OrDefault
variation of each of these operators (and DefaultIfEmpty
) neatly handles the situation where an element cannot be found that satisfies the criteria, returning a default(T)
result in these cases. The variations that don’t end with OrDefault
throw an exception if no element satisfies the criteria.
The DefaultIfEmpty
operator allows the graceful handling of when a sequence is empty (has no elements). There are two method overloads; the first handles an empty sequence by returning a single element of default(TSource)
within an IEnumerable<T>
. The second overload takes an argument specifying a default value to return if the source sequence is empty. In both cases, the default value is only returned if the source sequence is empty. The method signatures available are:
To demonstrate the use of DefaultIfEmpty
, Listing 5-15 defines two arrays, one empty (empty
) and one with a sequence of data (nums
). DefaultIfEmpty
is called on these arrays to demonstrate the behavior. The Console output from this example is shown in Output 5-6.
Listing 5-15. The DefaultIfEmpty
operator allows safe handling of potentially empty source sequences—see Output 5-6
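The behavior of both overloads can be sketched as follows, using an empty array and a three-element array (the values and the -1 default are illustrative assumptions):

```csharp
using System;
using System.Linq;

public static class DefaultIfEmptySketch
{
    // Applies DefaultIfEmpty with an explicit fallback value.
    public static int[] WithFallback(int[] source, int fallback)
    {
        return source.DefaultIfEmpty(fallback).ToArray();
    }

    public static void Main()
    {
        var empty = new int[0];
        var nums = new int[] { 1, 2, 3 };

        // No argument: an empty source yields a single default(int).
        Console.WriteLine(string.Join(", ", empty.DefaultIfEmpty())); // prints 0
        // Explicit default: used only because the source is empty.
        Console.WriteLine(string.Join(", ", WithFallback(empty, -1))); // prints -1
        // Non-empty source: passed through unchanged.
        Console.WriteLine(string.Join(", ", WithFallback(nums, -1))); // prints 1, 2, 3
    }
}
```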
Output 5-6
The ElementAt
operator returns the element at a given zero-based index position in a sequence. An ArgumentOutOfRangeException
is thrown if the index position is less than zero or beyond the end of the sequence. To avoid this exception being thrown, use the ElementAtOrDefault
operator instead, and a default(T)
instance will be returned when the index is out of range.
The method signatures available for the ElementAt
and ElementAtOrDefault
operators are:
To demonstrate the ElementAt
and ElementAtOrDefault
operators, Listing 5-16 calls these operators on a simple array of integers. To avoid an exception being thrown where an index is out of bounds, as for the sample query called error
, the ElementAtOrDefault
operator is used instead. The Console output from this example is shown in Output 5-7.
Listing 5-16. The ElementAt
operator allows elements to be accessed by zero-based index position—see Output 5-7
Output 5-7
The First
and FirstOrDefault
operators return the first element in a sequence. There are two overloads for each operator. The first overload takes no arguments and returns the first element from the sequence; the second takes a predicate argument and returns the first element that satisfies that predicate. If no elements are in the sequence or pass the predicate function, a System.InvalidOperationException
is thrown, or if using the FirstOrDefault
operator, an instance of default(T)
is returned.
The method signatures available for the First
and FirstOrDefault
operators are:
To demonstrate the use of the First
and FirstOrDefault
operators, Listing 5-17 calls these operators with a simple array of integers and an empty array. To avoid an exception being thrown when there is no first element or no first element that passes the predicate function, the FirstOrDefault
operator is used instead. The Console output from this example is shown in Output 5-8.
Listing 5-17. The First
operator returns the first element in a sequence—see Output 5-8
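A short sketch of these behaviors, assuming a three-element integer array and an empty array (the ThrowsOnEmpty helper is hypothetical and exists only to surface the exception as a Boolean):

```csharp
using System;
using System.Linq;

public static class FirstSketch
{
    // True when First() throws because there is no first element.
    public static bool ThrowsOnEmpty(int[] source)
    {
        try { source.First(); return false; }
        catch (InvalidOperationException) { return true; }
    }

    public static void Main()
    {
        var nums = new int[] { 1, 2, 3 };
        var empty = new int[0];

        Console.WriteLine(nums.First());                    // prints 1
        Console.WriteLine(nums.First(i => i > 1));          // prints 2
        Console.WriteLine(empty.FirstOrDefault());          // prints 0, default(int)
        Console.WriteLine(nums.FirstOrDefault(i => i > 9)); // prints 0, no match
        Console.WriteLine(ThrowsOnEmpty(empty));            // prints True
    }
}
```

Last and LastOrDefault follow the same pattern from the other end of the sequence.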
Output 5-8
The Last
and LastOrDefault
operators return the last element in a sequence. There are two overloads for each operator. The first overload takes no arguments and returns the last element from the sequence; the second takes a predicate argument and returns the last element that satisfies that predicate. If no elements are in the sequence or pass the predicate function, a System.InvalidOperationException
is thrown, or if using the LastOrDefault
operator, an instance of default(T)
is returned.
The method signatures available for the Last
and LastOrDefault
operators are:
To demonstrate the use of the Last
and LastOrDefault
operators, Listing 5-18 calls these operators with a simple array of integers and an empty array. To avoid an exception being thrown when there is no last element or no last element that passes the predicate function, the LastOrDefault
operator is used instead. The Console output from this example is shown in Output 5-9.
Listing 5-18. The Last
operator returns the last element in a sequence—see Output 5-9
Output 5-9
The Single and SingleOrDefault operators return the single element from a sequence or the single element that passes a predicate function. The Single operator throws a System.InvalidOperationException if there are zero elements or more than one element in the sequence. The SingleOrDefault operator returns an instance of default(T) if there are zero elements in a sequence and still throws a System.InvalidOperationException if there is more than one element in the sequence or more than one element that passes the predicate.
If there are situations when more than one element might pass a filter or query and you want to avoid the chance of an exception being raised (considering the implications of just choosing one of the elements over another), consider using the First
or Last
operators instead.
The method signatures available for the Single
and SingleOrDefault
operators are:
To demonstrate the Single
and SingleOrDefault
operators, Listing 5-19 calls these operators with a simple array of integers, an array with a single element, and an empty array. To avoid an exception being thrown when there is no element or no element that passes the predicate function, the SingleOrDefault
operator is used instead. If there is ever more than a single element in the sequence or that passes the predicate function, a System.InvalidOperationException
is thrown in all cases. The Console output from this example is shown in Output 5-10.
Listing 5-19. The Single
operator returns a single element from a sequence—see Output 5-10
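The three cases can be sketched as follows, assuming a three-element array, a one-element array, and an empty array (the OrDefaultThrows helper is hypothetical):

```csharp
using System;
using System.Linq;

public static class SingleSketch
{
    // True when SingleOrDefault() throws, which happens only when the
    // source holds more than one element.
    public static bool OrDefaultThrows(int[] source)
    {
        try { source.SingleOrDefault(); return false; }
        catch (InvalidOperationException) { return true; }
    }

    public static void Main()
    {
        var nums = new int[] { 1, 2, 3 };
        var single = new int[] { 5 };
        var empty = new int[0];

        Console.WriteLine(single.Single());          // prints 5
        Console.WriteLine(nums.Single(i => i == 2)); // prints 2
        Console.WriteLine(empty.SingleOrDefault());  // prints 0, default(int)
        Console.WriteLine(OrDefaultThrows(nums));    // prints True
    }
}
```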
Output 5-10
There is a single operator that can be used for testing sequence equality—the SequenceEqual
operator.
The SequenceEqual
operator compares one sequence with another and only returns true if the sequences are equal in element values, element order, and element count. There are two method signatures for SequenceEqual
, one that takes the sequence to compare to as an argument, and one that also allows a custom IEqualityComparer
to be passed to control how element equality testing is carried out. When no IEqualityComparer
is passed in, the default equality comparer for the source element type is used.
The method signatures available for the SequenceEqual
operator are:
To demonstrate the use of the SequenceEqual
operator, Listing 5-20 calls this operator to compare three string array sequences. The Console output from this example is shown in Output 5-11.
Listing 5-20. The SequenceEqual
operator compares two sequences for equal element values—see Output 5-11
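A sketch with three string arrays follows; the contents are assumptions chosen to show both the default comparison and a comparison with a custom IEqualityComparer (here the built-in StringComparer.OrdinalIgnoreCase):

```csharp
using System;
using System.Linq;

public static class SequenceEqualSketch
{
    // Compares element by element, ignoring character casing.
    public static bool SameIgnoringCase(string[] first, string[] second)
    {
        return first.SequenceEqual(second, StringComparer.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        var a = new string[] { "a", "b", "c" };
        var b = new string[] { "a", "b", "c" };
        var c = new string[] { "A", "B", "C" };

        Console.WriteLine(a.SequenceEqual(b));     // prints True
        Console.WriteLine(a.SequenceEqual(c));     // prints False, casing differs
        Console.WriteLine(SameIgnoringCase(a, c)); // prints True
    }
}
```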
Output 5-11
Generating sequences of data to combine with LINQ to Objects queries is often necessary when building unit tests and when working with indexable data sources, such as when iterating through the x and y coordinates of a bitmap image. The generation methods aren’t extension methods; they are simply static methods in the System.Linq.Enumerable class and are included with the LINQ operators because they fulfill the important task of creating sequences of data without using a for-loop construct.
The Empty
operator doesn’t appear very useful at first sight. So why return an empty IEnumerable<T>
sequence? The main use for this operator is for testing the empty sequence error handling behavior of other operators.
An empty sequence is not a collection type initialized to null
; it is an array of zero elements of the specified type. Pseudo-code would be longer than the implementation of this method, and the following code example reproduces the Empty
operator:
The most common use I’ve found for this operator is unit testing other operators. The following is an example of testing that an operator throws the correct exception when called on an empty sequence of int types. (This sample uses NUnit, but any testing framework has similar capabilities. The RandomElement operator is built as an example in Chapter 7, “Extending LINQ to Objects.”)
The Range operator builds a sequence of integers, starting from a given integer and incrementing by one for a given number of elements. It has the following method signature:
This is useful for quickly generating a sequence of numbers. The following example generates and writes the integer numbers 1900 through to 1904 to the Console window (1900 1901 1902 1903 1904):
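A minimal sketch of that call (the Years helper name is an assumption); note that the second argument is the number of elements to generate, not the final value:

```csharp
using System;
using System.Linq;

public static class RangeSketch
{
    // Generates 'count' consecutive integers beginning at 'start'.
    public static int[] Years(int start, int count)
    {
        return Enumerable.Range(start, count).ToArray();
    }

    public static void Main()
    {
        // prints 1900 1901 1902 1903 1904
        Console.WriteLine(string.Join(" ", Years(1900, 5)));
    }
}
```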
Sequences of numbers can be useful for data-binding to controls. One example of this is binding a list of years to a combo-box control on a form. This can be achieved by assigning the DataSource property of any bindable control to a List<int> generated by Range and, in this particular case, reversing the order (so the most recent years are at the top):
Listing 5-21. The Range
operator generates numeric sequences, in this case all years from 1900 to 2010—Form
output is shown in Figure 5-3
This operator shows its true practical potential when generating sequences within queries to target indexable locations. This is often the case when working with data in a bitmap. The example shown in Listing 5-22 demonstrates how to address every x and y location in a bitmap loaded from a file, and return the luminance of each pixel as shown in Output 5-12.
Listing 5-22. Using the Range
operator to address every pixel in a bitmap image—see Output 5-12
Output 5-12
The Repeat
operator replicates a given data value any number of times you specify. The simplest example of using the Repeat
operator is to initialize an array. The following code creates an array of integer values, with each element in the array initialized to -1:
The Repeat operator isn’t limited to integer values; any type can be used. For instance, the same kind of initialization sets a three-element array of string values to “No data”:
Value types work as expected, but caution must be used when the repeating value argument is a reference type. It is obvious once pointed out, but the first example in Listing 5-23 (b1) initializes a bitmap array with all elements pointing to the same single bitmap instance. It is likely that the intention of this code is to create five separate bitmaps, each five pixels by five pixels. The way to achieve this with the LINQ Repeat operator is to use it to drive a loop of five Select projections, as in the second example in Listing 5-23 (b2).
Listing 5-23. Be careful when using the Repeat
operator for reference types
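The pitfall can be sketched without a dependency on System.Drawing by substituting StringBuilder for the bitmap type. That substitution, and the helper names, are assumptions made so the sketch is self-contained; the behavior is the same for any reference type:

```csharp
using System;
using System.Linq;
using System.Text;

public static class RepeatSketch
{
    // Value type: every element is an independent copy of the value.
    public static int[] Filled(int value, int count)
    {
        return Enumerable.Repeat(value, count).ToArray();
    }

    // Reference type: one instance is created, then repeated 'count' times.
    public static StringBuilder[] Shared(int count)
    {
        return Enumerable.Repeat(new StringBuilder(), count).ToArray();
    }

    // Repeat drives the loop; Select creates a new instance per element.
    public static StringBuilder[] Separate(int count)
    {
        return Enumerable.Repeat(0, count)
                         .Select(i => new StringBuilder())
                         .ToArray();
    }

    public static void Main()
    {
        var b1 = Shared(5);
        var b2 = Separate(5);
        Console.WriteLine(object.ReferenceEquals(b1[0], b1[4])); // prints True
        Console.WriteLine(object.ReferenceEquals(b2[0], b2[4])); // prints False
        Console.WriteLine(Filled(-1, 3)[2]);                     // prints -1
    }
}
```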
Zip
, a single merging operator, was added into .NET Framework 4.
The Zip
operator merges the corresponding elements of two sequences using a specified selector function. The selector function takes an element from the first sequence and the corresponding element from the second sequence and projects a result using the function supplied. This process continues until one of the sequences (or both) has no more elements.
Zip
has a single overload with the following method signature:
To demonstrate the use of the Zip
operator, Listing 5-24 calls this operator to merge a string array with an integer array. The first element in both sequences is passed to the result selector function, then the second elements, and then the third elements. Even though the string array has two more elements, they are skipped because the integer array has only three elements. The Console output from this example is shown in Output 5-13.
Listing 5-24. The Zip
operator merges (combines) two sequences—see Output 5-13
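A sketch of the merge described above, assuming a five-element string array and a three-element integer array (the contents and the Merge helper are illustrative):

```csharp
using System;
using System.Linq;

public static class ZipSketch
{
    // Pairs elements by position and joins each pair into one string.
    public static string[] Merge(string[] letters, int[] nums)
    {
        return letters.Zip(nums, (s, i) => s + i).ToArray();
    }

    public static void Main()
    {
        var letters = new string[] { "a", "b", "c", "d", "e" };
        var nums = new int[] { 1, 2, 3 };

        // "d" and "e" have no partner in nums, so they are dropped.
        Console.WriteLine(string.Join(", ", Merge(letters, nums))); // prints a1, b2, c3
    }
}
```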
Output 5-13
Paging data involves restricting the amount of data returned to segments of the entire sequence. For instance, it is common for websites to return the results of a search in groups of a certain increment to improve performance by not transferring all one million items, just the first ten, then the next ten when asked, and so on. The Skip
and Take
operators form the basis for achieving this using LINQ queries. SkipWhile
and TakeWhile
allow conditional statements to control the segmentation of data, skipping data until a predicate fails or taking data until a predicate fails.
Skip and Take do as their names suggest; Skip jumps over the given number of elements in a sequence (or skips the entire sequence if the count of elements is less than the number to skip), and Take returns the given number of elements (or as many elements as possible until the end of the sequence is reached).
The method signatures available for the Skip
and Take
operators are:
Although the operators are independent, they are often combined to achieve paging a result sequence. The pattern generally used for paging is shown in Listing 5-25, which demonstrates retrieving elements 21 to 30 (the third page), where each page is ten records long (normally the page number is passed in as a variable). The Console output from this example is shown in Output 5-14.
Listing 5-25. The Skip and Take operators combine to return paged segments of a sequence (see Output 5-14)
Output 5-14
The Take operator also allows the topmost (first) records to be returned, similar to SQL’s SELECT TOP(n) statement. This can protect against extreme result set sizes or limit the result set during development and testing to a manageable quantity. If the page size is retrieved from user input (like the query string in a URL), this input should be checked and limited to avoid denial-of-service attacks by someone setting the page size to an unrealistic extreme. Listing 5-26 shows a custom extension method operator that has safeguards to avoid paging exploitation through malicious input. In this case, the operator limits page size to 100 elements and handles the translation of page number and page size into Skip and Take operations, centralizing repetitive code like that shown in Listing 5-25 in one place. When this operator is called with the following line, results identical to those shown in Output 5-14 are returned.
Listing 5-26. Custom paging extension method with safeguards against extreme inputs taken from untrusted sources
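One way such an operator might look is sketched below. The name Page, its parameters, and the choice to clamp bad input rather than throw are all assumptions made for illustration; only the 100-element page size limit comes from the text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingExtensions
{
    // Upper bound on page size, as described in the text.
    private const int MaxPageSize = 100;

    // Hypothetical operator name and parameter list.
    public static IEnumerable<T> Page<T>(
        this IEnumerable<T> source, int page, int pageSize)
    {
        if (source == null) throw new ArgumentNullException("source");

        // Assumption: clamp out-of-range input instead of throwing.
        if (page < 1) page = 1;
        if (pageSize < 1) pageSize = 1;
        if (pageSize > MaxPageSize) pageSize = MaxPageSize;

        // Compute the offset as a long so that a huge page number
        // cannot overflow into a negative Skip count.
        long offset = (long)(page - 1) * pageSize;
        int skip = offset > int.MaxValue ? int.MaxValue : (int)offset;

        return source.Skip(skip).Take(pageSize);
    }
}

public static class PagingDemo
{
    public static void Main()
    {
        // Assumed sample data: the integers 1 to 100.
        var q = Enumerable.Range(1, 100).Page(3, 10);

        Console.WriteLine(string.Join(" ", q));
        // 21 22 23 24 25 26 27 28 29 30
    }
}
```

Clamping keeps a malformed request cheap to serve; an application that prefers to reject bad input could throw ArgumentOutOfRangeException at the same points instead.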
The SkipWhile and TakeWhile operators skip or return elements from a sequence while a predicate function passes (returns true). Evaluation ends at the first element that fails the predicate: SkipWhile returns that element and everything after it without further testing, and TakeWhile stops without returning it.
Each operator has two overloads: one takes a predicate that determines whether an element should be skipped or taken, and the second takes a predicate that also receives the element's zero-based index position, which the predicate can use for any purpose. The method signatures available are:
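The declarations on System.Linq.Enumerable are restated as comments in the sketch below, followed by a small demonstration of each overload. The class name, helper, and sample array are illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhileOperatorsSketch
{
    // Declarations on System.Linq.Enumerable (.NET Framework 4):
    //   public static IEnumerable<TSource> SkipWhile<TSource>(
    //       this IEnumerable<TSource> source, Func<TSource, bool> predicate);
    //   public static IEnumerable<TSource> SkipWhile<TSource>(
    //       this IEnumerable<TSource> source, Func<TSource, int, bool> predicate);
    //   public static IEnumerable<TSource> TakeWhile<TSource>(
    //       this IEnumerable<TSource> source, Func<TSource, bool> predicate);
    //   public static IEnumerable<TSource> TakeWhile<TSource>(
    //       this IEnumerable<TSource> source, Func<TSource, int, bool> predicate);

    // Hypothetical helper: formats a sequence as a comma-separated string.
    public static string Show(IEnumerable<int> items)
    {
        return string.Join(",", items);
    }

    public static void Main()
    {
        int[] nums = { 5, 4, 3, 2, 1, 9 };

        // Element-only predicate. The trailing 9 is not re-tested by
        // SkipWhile once the predicate has failed at the value 2.
        Console.WriteLine(Show(nums.SkipWhile(n => n > 2)));   // 2,1,9
        Console.WriteLine(Show(nums.TakeWhile(n => n > 2)));   // 5,4,3

        // Predicate that also receives the zero-based index.
        Console.WriteLine(Show(nums.SkipWhile((n, i) => i < 4)));   // 1,9
        Console.WriteLine(Show(nums.TakeWhile((n, i) => i < 2)));   // 5,4
    }
}
```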
A simple example of where these operators are useful is when parsing text files from the beginning. The example shown in Listing 5-27 skips any lines at the start of a file (in this case, this data is in string form for simplicity) and then returns all lines until an empty line is found. All lines after this blank line are ignored. The Console output from this example is shown in Output 5-15.
Listing 5-27. The SkipWhile and TakeWhile operators skip or return elements while their predicate argument returns true (see Output 5-15)
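A minimal sketch of this kind of parsing follows. It assumes that the lines to skip at the start are comment lines beginning with the # character; that rule, the sample text, and all names are assumptions for illustration rather than the book's original listing.

```csharp
using System;
using System.Linq;

public static class TextParsingSketch
{
    // Skips leading comment lines, then returns lines up to the first blank one.
    public static string[] ReadFirstBlock(string text)
    {
        return text
            .Split('\n')
            // Assumption: leading lines to skip start with '#'.
            .SkipWhile(line => line.StartsWith("#", StringComparison.Ordinal))
            // Stop at the first empty (or whitespace-only) line.
            .TakeWhile(line => line.Trim().Length > 0)
            .ToArray();
    }

    public static void Main()
    {
        string sample =
            "# comment line 1\n" +
            "# comment line 2\n" +
            "Data line 1\n" +
            "Data line 2\n" +
            "\n" +
            "This line is ignored\n";

        foreach (string line in ReadFirstBlock(sample))
            Console.WriteLine(line);
        // Data line 1
        // Data line 2
    }
}
```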
Output 5-15
The quantifier operators All, Any, and Contains return a Boolean result based on the presence or absence of certain data within a sequence. Each could be built from a query that filters the data with a Where clause and then compares the count of the resulting elements to zero. The dedicated operators, however, are optimized for their purpose and return a result at the first possible point rather than iterating the entire sequence (often referred to as short-circuit evaluation).
The All operator returns the Boolean result of true if all elements in a sequence pass the predicate function supplied. It also returns true for an empty sequence, because no element fails the predicate.
All has a single overload with the following method signature:
To demonstrate the All operator, Listing 5-28 tests three arrays of integer values and returns true if all elements are even (divisible by two with no remainder). The Console output from this example is shown in Output 5-16.
Listing 5-28. The All operator returns true if all objects pass the predicate (see Output 5-16)
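The sketch below restates the declaration of All as a comment and runs the even-number test over three assumed arrays, plus an empty one. The array contents and names are illustrative, not the book's original listing.

```csharp
using System;
using System.Linq;

public static class AllSketch
{
    // Declaration on System.Linq.Enumerable (.NET Framework 4):
    //   public static bool All<TSource>(
    //       this IEnumerable<TSource> source, Func<TSource, bool> predicate);

    // True when every value is divisible by two with no remainder.
    public static bool AllEven(int[] values)
    {
        return values.All(i => i % 2 == 0);
    }

    public static void Main()
    {
        int[] evens = { 2, 4, 6, 8, 10 };
        int[] odds = { 1, 3, 5, 7, 9 };
        int[] mixed = { 1, 2, 3, 4, 5 };

        Console.WriteLine(AllEven(evens));        // True
        Console.WriteLine(AllEven(odds));         // False
        Console.WriteLine(AllEven(mixed));        // False
        Console.WriteLine(AllEven(new int[0]));   // True (no element fails)
    }
}
```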
Output 5-16
The predicate can be as complex as necessary; it can contain any number of logical ands (&&) and ors (||), but it must return a Boolean value. Listing 5-29 demonstrates querying to determine whether all contacts (from the same sample data introduced in Chapter 2, “Introducing LINQ to Objects,” Table 2-1) are over the age of 21 and whether each contact has a phone number and an email address. The Console output from this example is shown in Output 5-17.
Listing 5-29. Example of using the All operator over a sample set of contact data (see Output 5-17)
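A compound predicate of this kind might be sketched as follows. The Contact class, its properties, and the three records are hypothetical stand-ins invented for this example; they are not the sample data from Table 2-1.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical contact type; the book's sample data type may differ.
public class Contact
{
    public string Name { get; set; }
    public int Age { get; set; }
    public string Phone { get; set; }
    public string Email { get; set; }
}

public static class AllContactsSketch
{
    // True only if every contact is over 21 and has a phone and an email.
    public static bool AllOver21WithDetails(IEnumerable<Contact> contacts)
    {
        return contacts.All(c =>
            c.Age > 21 &&
            !string.IsNullOrEmpty(c.Phone) &&
            !string.IsNullOrEmpty(c.Email));
    }

    public static void Main()
    {
        // Made-up records for illustration only.
        var contacts = new List<Contact>
        {
            new Contact { Name = "Ann", Age = 34, Phone = "555-0100", Email = "ann@example.com" },
            new Contact { Name = "Bob", Age = 45, Phone = "555-0101", Email = "bob@example.com" },
            new Contact { Name = "Cy", Age = 19, Phone = "555-0102", Email = null }
        };

        // Cy is under 21, so the test fails at that element.
        Console.WriteLine(AllOver21WithDetails(contacts));           // False
        Console.WriteLine(AllOver21WithDetails(contacts.Take(2)));   // True
    }
}
```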
Output 5-17
The Any operator returns the Boolean result of true if a sequence contains any elements or, when a predicate function is supplied, if any element in the sequence passes that predicate.
The Any operator has two overloads. The first takes no arguments and tests the sequence for at least one element; the second takes a predicate function and tests that at least one element passes that predicate's logic. The method signatures are:
To demonstrate the Any operator, Listing 5-30 tests three sequences: one that is empty (has no elements), one that has a single element, and a third that has many elements. Calling the Any operator on these sequences returns a Boolean result. The Console output from this example is shown in Output 5-18.
Listing 5-30. The Any operator returns true if there are any elements in a sequence (see Output 5-18)
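The sketch below restates both declarations of Any as comments and applies the parameterless overload to an empty, a single-element, and a multi-element sequence. The sequences and names are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnySketch
{
    // Declarations on System.Linq.Enumerable (.NET Framework 4):
    //   public static bool Any<TSource>(this IEnumerable<TSource> source);
    //   public static bool Any<TSource>(
    //       this IEnumerable<TSource> source, Func<TSource, bool> predicate);

    // True when the sequence has at least one element.
    public static bool HasElements(IEnumerable<int> source)
    {
        return source.Any();
    }

    public static void Main()
    {
        IEnumerable<int> empty = Enumerable.Empty<int>();
        IEnumerable<int> single = new[] { 1 };
        IEnumerable<int> many = Enumerable.Range(1, 5);

        Console.WriteLine(HasElements(empty));    // False
        Console.WriteLine(HasElements(single));   // True
        Console.WriteLine(HasElements(many));     // True
    }
}
```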
Output 5-18
The Any operator is the most efficient way to test whether a sequence has any elements. The following code is equivalent, but it is highly discouraged: if the source doesn’t implement ICollection<T> (which has a Count property), Count() must iterate the entire sequence. It is shown here only to explain the basic logic of Any:
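A sketch of that discouraged equivalent, next to Any for comparison. The Pulled counter and the Source iterator are hypothetical instrumentation added here to make the difference in work visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusCountSketch
{
    // Hypothetical counter: how many elements the source has produced.
    public static int Pulled;

    // A lazy source that does not implement ICollection<T>.
    public static IEnumerable<int> Source()
    {
        for (int i = 1; i <= 1000; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    // Discouraged: Count() has to walk every element of this source.
    public static bool HasElementsSlow(IEnumerable<int> source)
    {
        return source.Count() > 0;
    }

    public static void Main()
    {
        Pulled = 0;
        bool slow = HasElementsSlow(Source());
        Console.WriteLine(slow + " " + Pulled);   // True 1000

        Pulled = 0;
        bool fast = Source().Any();
        Console.WriteLine(fast + " " + Pulled);   // True 1
    }
}
```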
The second overload of Any is equally easy to use. It takes a predicate function and returns the Boolean result of true if any element in the sequence passes it (that is, the predicate returns true). The example shown in Listing 5-31 looks for certain strings in an array of strings to determine if there are any cats or fish-like animals (albeit in a simplistic fashion). The Console output from this example is shown in Output 5-19.
Listing 5-31. The Any operator returns true if there are any elements in a sequence that pass the predicate function (see Output 5-19)
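A minimal sketch of a predicate-based Any test over an array of animal names. The array contents and names are assumptions for illustration, and the substring test is as simplistic as the text warns.

```csharp
using System;
using System.Linq;

public static class AnyPredicateSketch
{
    // Assumed sample data, not the book's original array.
    public static readonly string[] Animals =
    {
        "Koala", "Kangaroo", "Spider", "Wombat", "Snake",
        "Emu", "Shark", "Sting-Ray", "Jellyfish"
    };

    // True if any name contains the given text (case-sensitive).
    public static bool AnyNameContains(string text)
    {
        return Animals.Any(a => a.Contains(text));
    }

    public static void Main()
    {
        Console.WriteLine(AnyNameContains("fish"));   // True (Jellyfish)
        Console.WriteLine(AnyNameContains("cat"));    // False
    }
}
```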
Output 5-19
Again, Any is the most efficient way to determine whether any elements in a sequence match a predicate function. To demonstrate the basic logic this operator employs, the following functionally equivalent code is shown; it is highly discouraged because of its potential performance impact:
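A sketch of that discouraged form, compared with Any. The calls counter is hypothetical instrumentation that shows how many times the predicate runs in each case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusWhereCountSketch
{
    // Discouraged: filters and counts every match before comparing to zero.
    public static bool AnySlow<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        return source.Where(predicate).Count() > 0;
    }

    public static void Main()
    {
        int[] nums = { 1, 2, 3, 4, 5 };
        int calls = 0;

        // Predicate instrumented to count its own invocations.
        Func<int, bool> isEven = n => { calls++; return n % 2 == 0; };

        bool slow = AnySlow(nums, isEven);
        Console.WriteLine(slow + " after " + calls + " predicate calls");
        // True after 5 predicate calls

        calls = 0;
        bool fast = nums.Any(isEven);   // stops at the first even number
        Console.WriteLine(fast + " after " + calls + " predicate calls");
        // True after 2 predicate calls
    }
}
```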
The predicate employed by Any can be as complex as required. Listing 5-32 demonstrates how to use Any on a set of contact records to determine if there are any people under the age of 21, with a second test to see if there are any people without an email address or a phone number (from the same sample data introduced in Chapter 2, Table 2-1). The Console output from this example is shown in Output 5-20.
Listing 5-32. Example of using the Any operator over a sample set of contact data (see Output 5-20)
Output 5-20
The Contains operator returns the Boolean value of true if the sequence contains an element equal to the test argument passed into the operator.
This operator has two overloads: one takes a single argument, the test value, and the second takes the test value and an IEqualityComparer<T> instance that is used for equality testing. The method signatures for the Contains operator are:
To demonstrate the Contains operator, Listing 5-33 looks to see if various versions of the name Peter are contained in the elements of a string array. The test b1 will be false because the default comparison of a string is case sensitive. Test b2 will be true because it is an exact string match, and test b3 will be true even though the string case is different, because a custom IEqualityComparer<T> was used to determine string equality. (Creating custom IEqualityComparer<T> types is covered in Chapter 4 in the “Specifying Your Own Key Comparison Function” section and in Chapter 6 under “Custom EqualityComparers When Using LINQ Set Operators.”) The Console output from this example is shown in Output 5-21.
Listing 5-33. The Contains operator returns true if the test element is in the sequence (see Output 5-21)
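The sketch below restates both declarations of Contains as comments and runs the three tests described above. The array contents are assumed, and the built-in StringComparer.OrdinalIgnoreCase stands in for the custom comparer the text mentions.

```csharp
using System;
using System.Linq;

public static class ContainsSketch
{
    // Declarations on System.Linq.Enumerable (.NET Framework 4):
    //   public static bool Contains<TSource>(
    //       this IEnumerable<TSource> source, TSource value);
    //   public static bool Contains<TSource>(
    //       this IEnumerable<TSource> source, TSource value,
    //       IEqualityComparer<TSource> comparer);

    // Assumed sample data.
    public static readonly string[] Names = { "peter", "paul", "mary" };

    public static void Main()
    {
        // Default string equality is case sensitive.
        bool b1 = Names.Contains("PETER");

        // Exact match.
        bool b2 = Names.Contains("peter");

        // Stand-in for a custom comparer: a built-in case-insensitive one.
        bool b3 = Names.Contains("PETER", StringComparer.OrdinalIgnoreCase);

        Console.WriteLine(b1);   // False
        Console.WriteLine(b2);   // True
        Console.WriteLine(b3);   // True
    }
}
```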
Output 5-21
Some collection types that implement IEnumerable<T>, and so are in scope for the Contains extension method, also implement their own instance method called Contains (List<T>, for instance). This doesn’t cause problems in most cases, because the compiler prefers the instance method; but if you explicitly want to call the extension Contains, specify the generic type argument. Listing 5-34 demonstrates how to explicitly control which Contains gets called (instance method or extension method) when using a collection of type List<T>. The Console output from this example is shown in Output 5-22.
Listing 5-34. How to explicitly specify which Contains you want to call when a collection has an instance method called Contains (see Output 5-22)
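A minimal sketch of the three call forms on a List<string>. The list contents and names are assumptions for illustration, not the book's original listing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContainsOnListSketch
{
    public static void Main()
    {
        // Assumed sample data.
        List<string> names = new List<string> { "peter", "paul", "mary" };

        // No type argument: binds to the List<T>.Contains instance method.
        bool viaInstance = names.Contains("peter");

        // Explicit type argument: the instance method is not generic,
        // so this binds to the Enumerable.Contains extension method.
        bool viaExtension = names.Contains<string>("peter");

        // Calling the extension method as an ordinary static method.
        bool viaStatic = Enumerable.Contains(names, "peter");

        Console.WriteLine(viaInstance);    // True
        Console.WriteLine(viaExtension);   // True
        Console.WriteLine(viaStatic);      // True
    }
}
```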
Output 5-22
This chapter introduced many of the remaining standard query operators that are available to you in .NET Framework 4. The next chapter covers the last few operators that specifically apply set-based functions over data sources.