Chapter 2. C# and VB.NET language enhancements

This chapter covers:

  • Key C# 3.0 and VB.NET 9.0 language features for LINQ
  • Implicitly typed local variables
  • Object initializers
  • Lambda expressions
  • Extension methods
  • Anonymous types

In chapter 1, we reviewed the motivation behind LINQ and introduced some code to give you an idea of what to expect. In this chapter, we’ll present the language extensions that make LINQ possible and allow queries to blend into programming languages.

LINQ extends C# and VB.NET with new constructs. We find it important that you discover these language features before we get back to LINQ content. This chapter is a stepping stone that explains how the C# and VB.NET languages have been enriched to make LINQ possible. Please note that the full-fledged features we present here can be used in contexts other than just LINQ.

We won’t go into advanced details about each feature, because we don’t want to lose our focus on LINQ for too long. You’ll be able to see all these features in action throughout this book, so you should grow accustomed to them as you read.

In chapter 3, we’ll focus on LINQ-specific concepts such as expression trees and query operators. You’ll then see how the features presented in this chapter are used by LINQ.

2.1. Discovering the new language enhancements

.NET 2.0 laid the groundwork for a lot of what LINQ needs to work. Indeed, it introduced a number of important language and framework enhancements. For example, .NET now supports generic types, and in order to achieve the deep data integration that LINQ targets, you need types that can be parameterized—otherwise the type system isn’t rich enough.

C# 2.0 also added anonymous methods and iterators. These features serve as cornerstones for the new level of integration between data and programming languages.

We expect readers of this book to know the basics about the features offered by .NET 2.0. We’ll provide you with a refresher on anonymous methods in section 2.4 when we present lambda expressions, and we’ll review iterators in chapter 3.

More features were required, though, for LINQ to expose query syntaxes natively to languages such as C# and VB.NET. C# 3.0 and VB.NET 9.0 (also known as VB 2008) build on generics, anonymous methods, and iterators as key components of the LINQ facility.

These features include

  • Implicitly typed local variables, which permit the types of local variables to be inferred from the expressions used to initialize them.
  • Object initializers, which ease construction and initialization of objects.
  • Lambda expressions, an evolution of anonymous methods that provides improved type inference and conversion to both delegate types and expression trees, which we’ll discuss in the next chapter.
  • Extension methods, which make it possible to extend existing types and constructed types with additional methods. With extension methods, types aren’t extended but look as if they were.
  • Anonymous types, which are types automatically inferred and created from object initializers.

Instead of merely listing these new language features and detailing them one by one, let’s discover them in the context of an ongoing example. This will help us clearly see how they can help us in our everyday coding.

We’ll start with the simplest code possible, using only .NET 2.0 constructs, and then we’ll improve it by progressively introducing the new language features. Each refactoring step will address one specific problem or syntax feature. First, let’s get acquainted with our simple example: an application that outputs a list of running processes.

2.1.1. Generating a list of running processes

Let’s say we want to get a list of the processes running on our computer. This can be done easily thanks to the System.Diagnostics.Process.GetProcesses API.

 

Note

We use the GetProcesses method in this example because it returns a collection of results (an array of Process objects) that is likely to be different each time the method is called. This makes our example more realistic than one that would be based on a static list of items.

 

Listing 2.1 shows sample C# 2.0 code that achieves our simple goal.

Listing 2.1. Sample .NET 2.0 code for listing processes (DotNet2.csproj)
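The body of listing 2.1 isn’t reproduced in this text. The following is a minimal sketch consistent with the surrounding description, not the original listing. It uses only C# 2.0 constructs. Because ObjectDumper ships with the book’s samples rather than with .NET, a plain Console.WriteLine loop stands in for ObjectDumper.Write here; the DotNet2Sketch class and the GetProcessNames helper are names we assume for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class DotNet2Sketch
{
  // Collects the names of the running processes in a generic list.
  public static List<string> GetProcessNames()
  {
    List<string> processes = new List<string>();
    foreach (Process process in Process.GetProcesses())
    {
      processes.Add(process.ProcessName);
    }
    return processes;
  }

  static void Main()
  {
    // Stand-in for ObjectDumper.Write(processes).
    foreach (string name in GetProcessNames())
    {
      Console.WriteLine(name);
    }
  }
}
```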

Our processes variable points to a list of strings. The type we use is based on the generic type List<T>. Generics are a major addition to .NET that first appeared in .NET 2.0. They allow us to maximize code reuse, type safety, and performance. The most common use of generics is to create strongly typed collection classes, just like we’re doing here. As you’ll notice, LINQ makes extensive use of generics.

In the listing, we use a class named ObjectDumper to display the results. ObjectDumper is a utility class provided by Microsoft as part of the LINQ code samples. We’ll reuse ObjectDumper in many code samples throughout this book. (The complete source code for the samples is available for download at http://LinqInAction.net.) ObjectDumper can be used to dump an object graph in memory to the console. It’s particularly useful for debugging purposes; we’ll use it here to display the result of our processing.

This first version of the code is nothing more than a foreach loop that adds process names to a list, so a call to Console.WriteLine on each item would be enough. However, in the coming examples, we’ll have more complex results to display. ObjectDumper will then save us some code by doing the display work for us.

Here is some sample output produced by listing 2.1:

firefox
Skype
WINWORD
devenv
winamp
Reflector

This example is very simple. Soon, we’ll want to be able to filter this list, sort it, or perform other operations, such as grouping or projections.

Let’s improve our example a bit. For a start, what if we’d like more information about the process than just its name?

2.1.2. Grouping results into a class

Let’s say we’d like the list to contain the ID, name, and memory consumption of each process. For instance:

Id=2300      Name=firefox      Memory=78512128
Id=2636      Name=Skype        Memory=23478272
Id=2884      Name=WINWORD      Memory=78442496
Id=2616      Name=devenv       Memory=54296576
Id=1824      Name=winamp       Memory=29188096
Id=2940      Name=Reflector    Memory=83857408

This requires creating a class or structure to group the information we’d like to retain about a process. Listing 2.2 shows the code with a new class named ProcessData, shown in bold.

 

Note

Here we use public fields in the ProcessData class for the sake of simplicity, but properties and private fields would be better. Read on and in a few pages you’ll discover how to easily use properties instead thanks to C# 3.0.

 

Listing 2.2. Improved .NET 2.0 code for listing processes (DotNet2Improved.csproj)
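The body of listing 2.2 isn’t reproduced in this text either. The sketch below is one way the code could look, assuming the ProcessData class exposes the Id, Name, and Memory public fields mentioned in the note above. It isn’t the original listing: the field types, the DotNet2ImprovedSketch class, and the GetProcessData helper are assumptions, and Console.WriteLine again stands in for ObjectDumper.Write.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

class ProcessData
{
  // Public fields for the sake of simplicity (types are assumed).
  public int Id;
  public long Memory;
  public string Name;
}

static class DotNet2ImprovedSketch
{
  // Builds one ProcessData object per running process.
  public static List<ProcessData> GetProcessData()
  {
    List<ProcessData> processes = new List<ProcessData>();
    foreach (Process process in Process.GetProcesses())
    {
      ProcessData data = new ProcessData();
      data.Id = process.Id;
      data.Name = process.ProcessName;
      data.Memory = process.WorkingSet64;
      processes.Add(data);
    }
    return processes;
  }

  static void Main()
  {
    // Stand-in for ObjectDumper.Write(processes).
    foreach (ProcessData data in GetProcessData())
    {
      Console.WriteLine("Id={0}\tName={1}\tMemory={2}",
        data.Id, data.Name, data.Memory);
    }
  }
}
```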

Although our code produces the output we want, it has some duplicate information in it. The type of our objects is specified twice: once for the declaration of the variables and once more for calling the constructor:

List<ProcessData> processes = new List<ProcessData>();
...
ProcessData data = new ProcessData();

New keywords will allow us to make our code shorter and avoid duplication, as you’ll see next.

2.2. Implicitly typed local variables

var i = 5;

C# 3.0 offers a new keyword that allows us to declare a local variable without having to specify its type explicitly: var. When the var keyword is used to declare a local variable, the compiler infers the type of this variable from the expression used to initialize it.

Let’s review the syntax proposed by this new keyword, and then we’ll revise our example with it.

2.2.1. Syntax

The var keyword is easy to use. It should be followed by the name of the local variable and then by an initializer expression. For example, the following two code snippets are equivalent. They produce the exact same Intermediate Language (IL) code once compiled.

Here is some code with implicitly typed variables:

var i = 12;
var s = "Hello";
var d = 1.0;
var numbers = new[] {1, 2, 3};
var process = new ProcessData();
var processes =
  new Dictionary<int, ProcessData>();

And here is equivalent code with the traditional syntax:

int i = 12;
string s = "Hello";
double d = 1.0;
int[] numbers = new int[] {1, 2, 3};
ProcessData process = new ProcessData();
Dictionary<int, ProcessData> processes =
  new Dictionary<int, ProcessData>();
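To check that var really yields the same static types as the traditional syntax, we can ask the compiler which type it inferred. The sketch below does this with a small generic helper; StaticTypeOf and the VarInference class are names we introduce for illustration, not part of .NET. Because T is bound at compile time, typeof(T) reports the type the compiler chose for the variable, not the runtime type of the value.

```csharp
using System;
using System.Collections.Generic;

static class VarInference
{
  // Returns the compile-time type inferred for the argument.
  public static Type StaticTypeOf<T>(T value)
  {
    return typeof(T);
  }

  static void Main()
  {
    var i = 12;
    var d = 1.0;
    var numbers = new[] {1, 2, 3};
    var lookup = new Dictionary<int, string>();

    Console.WriteLine(StaticTypeOf(i));        // System.Int32
    Console.WriteLine(StaticTypeOf(d));        // System.Double
    Console.WriteLine(StaticTypeOf(numbers));  // System.Int32[]
    Console.WriteLine(StaticTypeOf(lookup));   // the closed Dictionary type
  }
}
```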

Implicitly typed local variables can also be used in VB.NET, thanks to the Dim keyword. For example, here is the Dim keyword with implicitly typed variables:

Dim processes =
  New List(Of ProcessData)()

And here it is with the traditional syntax:

Dim processes As List(Of ProcessData) =
  New List(Of ProcessData)()

This looks like variants in VB, but the new syntax and variants aren’t the same. Implicitly typed local variables are strongly typed. For example, the following VB.NET code isn’t valid and will return an error stating that conversion from type String to type Integer isn’t valid:

Dim someVariable = 12
someVariable = "Some string"

In the first line, someVariable is an Integer. The second line throws the error.

In comparison, the following code that uses a variant is valid:

Dim someVariable as Variant = 12
someVariable = "Some String"

2.2.2. Improving our example using implicitly typed local variables

Listing 2.3 shows how we could improve our DisplayProcesses method thanks to the var keyword. New code is shown in bold.

Listing 2.3. Our DisplayProcesses method using the var keyword (UsingVar.csproj)
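The body of listing 2.3 isn’t reproduced in this text. The following sketch shows what the method could look like with var and auto-implemented properties, based on the description in this section. It isn’t the original listing: the property types, the UsingVarSketch class, and the GetProcessData helper are assumptions, and Console.WriteLine stands in for ObjectDumper.Write.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

class ProcessData
{
  // Auto-implemented properties: the compiler generates the backing fields.
  public int Id { get; set; }
  public long Memory { get; set; }
  public string Name { get; set; }
}

static class UsingVarSketch
{
  public static List<ProcessData> GetProcessData()
  {
    // processes, process, and data are all strongly typed;
    // the compiler infers each type from its initializer.
    var processes = new List<ProcessData>();
    foreach (var process in Process.GetProcesses())
    {
      var data = new ProcessData();
      data.Id = process.Id;
      data.Name = process.ProcessName;
      data.Memory = process.WorkingSet64;
      processes.Add(data);
    }
    return processes;
  }

  static void Main()
  {
    // Stand-in for ObjectDumper.Write(processes).
    foreach (var data in GetProcessData())
    {
      Console.WriteLine("Id={0}\tName={1}\tMemory={2}",
        data.Id, data.Name, data.Memory);
    }
  }
}
```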

 

Note

This time, we use auto-implemented properties to define the ProcessData class. This is a new feature of the C# 3.0 compiler, which generates a hidden private backing field for each property. Using this new syntax, we no longer need to declare the private fields explicitly or write repetitive property accessors.

 

Listing 2.3 does exactly the same thing as listing 2.2. It may not look like it at first, but the processes, process, and data variables are still strongly typed!

With implicitly typed local variables, we no longer have to write the types of local variables twice. The compiler infers the types automatically. This means that even though we use a simplified syntax, we still get all the benefits of strong types, such as compile-time validation and IntelliSense.

Notice that we can use the same var keyword in foreach to avoid writing the type of the iteration variable.

As you can see, the var and Dim keywords can be used extensively to write shorter code. In some cases, they’re required to use LINQ features. However, if you like to have the local variable declarations grouped at the top of method bodies instead of scattered all over the code statements, you’ll use var and Dim thoughtfully.

Let’s improve our example a bit more. Initializing a new ProcessData object requires lengthy code. It’s time to introduce a new improvement to fix this.

2.3. Object and collection initializers

new Point {X = 1, Y = 2}

We continue our journey through the new C# and VB.NET features with object and collection initializers. They will be useful when you start to write query expressions in the next chapter.

We’ll start this section with an introduction to object and collection initializers. We’ll then update our running example to use an object initializer.

2.3.1. The need for object initializers

Object initializers allow us to specify values for one or more fields or properties of an object in one statement. They allow declarative initializations for all kinds of objects.

 

Note

This is possible only for accessible fields and properties. The expression after the equals sign is processed the same way as an assignment to the field or property.

 

Until now, we have been able to initialize objects of primitive or array types, as follows:

int i = 12;
string s = "abc";
string[] names = new string[] {"LINQ", "In", "Action"};

It wasn’t possible to use a simple instruction to initialize other objects, though. We had to use code like this:

ProcessData data = new ProcessData();
data.Id = 123;
data.Name = "MyProcess";
data.Memory = 123456;

Starting with C# 3.0 and VB.NET 9.0, we can initialize all objects using an initializer approach.

In C#

var data = new ProcessData {Id = 123, Name = "MyProcess",
                            Memory = 123456};

In VB.NET

Dim data = New ProcessData With {.Id = 123, .Name = "MyProcess", _
                                 .Memory = 123456}

The pieces of code with and without object initializers produce the same IL code. Object initializers simply offer a shortcut.

In cases where a constructor is required or useful, it’s still possible to use object initializers. In the following example, we use a constructor in combination with an object initializer:

throw new Exception("message") { Source = "LINQ in Action" };

Here, we initialize two properties in one line of code: Message (through the constructor) and Source (through an object initializer). Without the new syntax, we would have to declare a temporary variable like this:

var exception = new Exception("message");
exception.Source = "LINQ in Action";
throw exception;

2.3.2. Collection initializers

Another kind of initializer has been added: the collection initializer. This new syntax allows us to initialize different types of collections, provided they implement System.Collections.IEnumerable and provide suitable Add methods.

Here’s an example:

var digits = new List<int> {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

This line of code is equivalent to the following code, which is generated by the compiler transparently:

List<int> digits = new List<int>();
digits.Add(0);
digits.Add(1);
digits.Add(2);
...
digits.Add(9);

Object and collection initializers are particularly useful when used together in the same piece of code. The following two equivalent code blocks show how initializers allow us to write shorter code. Here is the code with object and collection initializers:

var processes = new List<ProcessData> {
  new ProcessData {Id=123, Name="devenv"},
  new ProcessData {Id=456, Name="firefox"}
};

Here is the same code without initializers. Note that it’s much longer:

ProcessData tmp;
var processes = new List<ProcessData>();
tmp = new ProcessData();
tmp.Id = 123;
tmp.Name = "devenv";
processes.Add(tmp);
tmp = new ProcessData();
tmp.Id = 456;
tmp.Name = "firefox";
processes.Add(tmp);

We can initialize collections represented by a class that implements the IEnumerable interface and provides an Add method. We can use syntax of the form {x, y, z} to describe arguments that match the Add method’s signature if there is more than one argument. This enables us to initialize many preexisting collection classes in the framework and third-party libraries.

This generalization allows us to initialize a dictionary with the following syntax, for example:

new Dictionary<int, string> {{1, "one"}, {2, "two"}, {3, "three"}}
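The same rule applies to our own types. The sketch below defines a hypothetical ProcessNames collection (a name we made up for this example) that implements IEnumerable and exposes a two-argument Add method. That’s enough for the compiler to accept the nested {x, y} form, just as it does for Dictionary<TKey, TValue>.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection: a collection initializer only needs
// IEnumerable plus an accessible Add method.
class ProcessNames : IEnumerable
{
  private readonly List<string> entries = new List<string>();

  // Two-argument Add, matched by each {id, name} element of the initializer.
  public void Add(int id, string name)
  {
    entries.Add(id + ":" + name);
  }

  public int Count { get { return entries.Count; } }

  public IEnumerator GetEnumerator()
  {
    return entries.GetEnumerator();
  }
}

static class CollectionInitializerSketch
{
  public static ProcessNames Build()
  {
    // The compiler turns each {id, name} pair into a call to Add(id, name).
    return new ProcessNames { {123, "devenv"}, {456, "firefox"} };
  }

  static void Main()
  {
    foreach (string entry in Build())
      Console.WriteLine(entry);   // 123:devenv, then 456:firefox
  }
}
```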

2.3.3. Improving our example using an object initializer

As you can see in the following code snippet, we have to write several lines of code and use a temporary variable in order to create a ProcessData object:

ProcessData data = new ProcessData();
data.Id = process.Id;
data.Name = process.ProcessName;
data.Memory = process.WorkingSet64;
processes.Add(data);

We could add a constructor to our ProcessData class to be able to initialize an object of this type in just one statement. This would allow us to write listing 2.4.

Listing 2.4. DisplayProcesses method using a constructor for ProcessData
static void DisplayProcesses()
{
  var processes = new List<ProcessData>();
  foreach (var process in Process.GetProcesses())
  {
    processes.Add( new ProcessData(process.Id,
      process.ProcessName, process.WorkingSet64) );
  }
  ObjectDumper.Write(processes);
}

Adding a constructor requires adding code to the ProcessData type. In addition, the constructor we add may not be suitable for every future use of this class. An alternative solution is to adapt our code to use the new object initializer syntax, as in listing 2.5.

Listing 2.5. DisplayProcesses method using an object initializer (ObjectInitializer.csproj)
static void DisplayProcesses()
{
  var processes = new List<ProcessData>();
  foreach (var process in Process.GetProcesses())
  {
    processes.Add( new ProcessData { Id=process.Id,
      Name=process.ProcessName, Memory=process.WorkingSet64 } );
  }
  ObjectDumper.Write(processes);
}

Although the two syntaxes are similar, the latter doesn’t require us to add a constructor!

We can see several advantages to the object initializer notation:

  • We can initialize an object within just one instruction.
  • We don’t need to provide a constructor to be able to initialize simple objects.
  • We don’t need several constructors to initialize different properties of objects.

This doesn’t mean that object initializers are an alternative to writing good constructors. Object initializers and constructors are language features that complement each other. You should still define the appropriate set of constructors for your types. Constructors help prevent the creation of objects that aren’t completely initialized and define the correct initialization order for an object’s members.

After these syntactic improvements, let’s add new functionality to our example. We’ll do this with the help of lambda expressions.

2.4. Lambda expressions

address => address.City == "Paris"

As a part of our tour of the new language features that are enablers for LINQ, we’ll now introduce lambda expressions, which come from the world of the lambda calculus. Many functional programming languages such as Lisp use lambda notations to define functions. In addition to allowing the expression of LINQ queries, the introduction of lambda expressions into C# and VB.NET can be seen as a step toward functional languages.

 

Lambda calculus

In mathematical logic and computer science, the lambda calculus (λ-calculus) is a formal system designed to investigate function definition, function application, and recursion. It was introduced by Alonzo Church in the 1930s. Lambda calculus has greatly influenced functional programming languages, such as Lisp, ML, and Haskell. (Source: Wikipedia.)

 

Let’s get back to our example. Suppose we want to improve it by adding filtering capabilities. In order to do this, we can use delegates, which allow us to pass one method as a parameter to another, for example.

We’ll start with a refresher on delegates and anonymous methods before using lambda expressions.

2.4.1. A refresher on delegates

Let’s build on the code of our DisplayProcesses method as we left it in listing 2.5. Here, we’ve added a hard-coded filtering condition, as you can see in listing 2.6.

Listing 2.6. DisplayProcesses method with a hard-coded filtering condition
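The body of listing 2.6 isn’t reproduced in this text. The following sketch shows one way the hard-coded condition could be added to the code of listing 2.5. It isn’t the original listing: we assume the same >= 20*1024*1024 test that the Filter method uses later in this section, and the HardCodedFilterSketch class and GetLargeProcesses helper are names introduced for illustration. Console.WriteLine stands in for ObjectDumper.Write.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

class ProcessData
{
  public int Id { get; set; }
  public long Memory { get; set; }
  public string Name { get; set; }
}

static class HardCodedFilterSketch
{
  public static List<ProcessData> GetLargeProcesses()
  {
    var processes = new List<ProcessData>();
    foreach (var process in Process.GetProcesses())
    {
      // Hard-coded filter: keep processes with a working set of 20 MB or more.
      if (process.WorkingSet64 >= 20*1024*1024)
      {
        processes.Add(new ProcessData { Id=process.Id,
          Name=process.ProcessName, Memory=process.WorkingSet64 });
      }
    }
    return processes;
  }

  static void Main()
  {
    // Stand-in for ObjectDumper.Write(processes).
    foreach (var data in GetLargeProcesses())
    {
      Console.WriteLine("Id={0}\tName={1}\tMemory={2}",
        data.Id, data.Name, data.Memory);
    }
  }
}
```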

WorkingSet64 is the amount of physical memory allocated for the associated process. Here we search for processes with 20 megabytes or more of allocated memory.

In order to make our code more generic, we’ll try to provide the filter information as a parameter of our method instead of keeping it hard-coded. In C# 2.0 and earlier, this was possible thanks to delegates. A delegate is a type that can store a pointer to a method.

Our filtering method should take a Process object as an argument and return a Boolean value to indicate whether a process matches some criteria. Here is how to declare such a delegate:

delegate Boolean FilterDelegate(Process process);

Instead of creating our own delegate type, we can also use what .NET 2.0 provides: the Predicate<T> type. Here is how this type is defined:

delegate Boolean Predicate<T>(T obj);

The Predicate<T> delegate type represents a method that returns true or false, based on its input. This type is generic, so we need to specify that it will work on Process objects. The exact delegate type we’ll use is Predicate<Process>.

Listing 2.7 shows our DisplayProcesses method adapted to take a predicate as a parameter.

Listing 2.7. DisplayProcesses method that uses a delegate for filtering
static void DisplayProcesses(Predicate<Process> match)
{
  var processes = new List<ProcessData>();
  foreach (var process in Process.GetProcesses())
  {
    if (match(process))
    {
      processes.Add(new ProcessData { Id=process.Id,
        Name=process.ProcessName, Memory=process.WorkingSet64 });
    }
  }
  ObjectDumper.Write(processes);
}

With the DisplayProcesses method updated as in the listing, it’s now possible to pass any “filter” to it. In our case, the filtering method contains our condition and returns true if the criterion is matched:

static Boolean Filter(Process process)
{
  return process.WorkingSet64 >= 20*1024*1024;
}

To use this method, we provide it as an argument to the DisplayProcesses method, as in listing 2.8.

Listing 2.8. Calling the DisplayProcesses method using a standard delegate
DisplayProcesses(Filter);

2.4.2. Anonymous methods

Delegates existed in C# 1.0, but C# 2.0 was improved to allow working with delegates through anonymous methods. Anonymous methods allow you to write shorter code and avoid the need for explicitly named methods.

Thanks to anonymous methods, we don’t need to declare a method like Filter. We can directly pass the code to DisplayProcesses, as in listing 2.9.

Listing 2.9. Calling the DisplayProcesses method using an anonymous method
DisplayProcesses( delegate (Process process)
  { return process.WorkingSet64 >= 20*1024*1024; }  );

 

Note

VB.NET doesn’t offer support for anonymous methods.

 

Those who have dealt with C++’s Standard Template Library (STL) may compare anonymous methods to functors. Similarly to functors, anonymous methods can be used to elegantly tweak a collection with a single line of code.

.NET 2.0 introduced a set of methods in System.Collections.Generic.List<T> and System.Array that are designed especially to be used with anonymous methods. These methods include ForEach, Find, and FindAll. They can operate on a list or an array with relatively little code.

For example, here is how the Find method can be used with an anonymous method to find a specific process (this snippet assumes that processes is a List<Process>):

var visualStudio = processes.Find(delegate (Process process)
  { return process.ProcessName == "devenv"; } );
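FindAll, ForEach, and the static Array.Find method follow the same pattern. The sketch below applies them to a plain list of names so that it runs without any process data. The AnonymousMethodSketch class, the NamesStartingWith helper, and the sample names are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

static class AnonymousMethodSketch
{
  public static List<string> NamesStartingWith(List<string> names, string prefix)
  {
    // FindAll expects a Predicate<string>; the anonymous method captures prefix.
    return names.FindAll(delegate (string name)
      { return name.StartsWith(prefix, StringComparison.Ordinal); });
  }

  static void Main()
  {
    var names = new List<string> { "firefox", "devenv", "winamp", "WINWORD" };

    // ForEach expects an Action<string>.
    NamesStartingWith(names, "win").ForEach(
      delegate (string name) { Console.WriteLine(name); });   // winamp

    // Array.Find is the static counterpart for arrays.
    string first = Array.Find(names.ToArray(),
      delegate (string name) { return name.Length > 6; });
    Console.WriteLine(first);                                  // firefox
  }
}
```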

2.4.3. Introducing lambda expressions

Instead of using an anonymous method, like in listing 2.9, starting with C# 3.0 we can use a lambda expression.

Listing 2.10 is strictly equivalent to listing 2.9.

Listing 2.10. Calling the DisplayProcesses method using a lambda expression (LambdaExpressions.csproj)
DisplayProcesses(process => process.WorkingSet64 >= 20*1024*1024);

Notice how the code is simplified when using a lambda expression. This lambda expression reads like this: “Given a process, return true if the process consumes 20 megabytes of memory or more.”

As you can see, in the case of lambda expressions, we don’t need to provide the type of the parameter. Again, this was duplicated information in the previous code: The new C# compiler is able to deduce the type of the parameters from the method signature.

Comparing lambda expressions with anonymous methods

C# 2.0 introduced anonymous methods, which allow code blocks to be written “inline” where delegate values are expected. The anonymous method syntax is verbose and imperative in nature. In contrast, lambda expressions provide a more concise syntax, providing much of the expressive power of functional programming languages.

Lambda expressions can be considered as a functional superset of anonymous methods, providing the following additional functionality:

  • Lambda expressions can infer parameter types, allowing you to omit them.
  • Lambda expressions can use both statement blocks and expressions as bodies, allowing for a terser syntax than anonymous methods, whose bodies can only be statement blocks.
  • Lambda expressions can participate in type argument inference and method overload resolution when passed in as arguments. Note: anonymous methods can also participate in type argument inference (inferred return types).
  • Lambda expressions with an expression body can be converted into expression trees. (We’ll introduce expression trees in the next chapter.)
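To make the comparison concrete, the sketch below writes the same 20-megabyte test four ways: as a named method, as an anonymous method, as a lambda converted to a delegate, and as a lambda converted to an expression tree. The LambdaFormsSketch class and its member names are assumptions for this example. The tests take a plain long instead of a Process so that the sample runs on its own.

```csharp
using System;
using System.Linq.Expressions;

static class LambdaFormsSketch
{
  const long Threshold = 20*1024*1024;

  // 1. Named method, converted to a delegate through a method group.
  static bool IsLarge(long memory) { return memory >= Threshold; }
  public static readonly Predicate<long> Named = IsLarge;

  // 2. Anonymous method: explicit parameter type, statement block only.
  public static readonly Predicate<long> Anonymous =
    delegate (long memory) { return memory >= Threshold; };

  // 3. Lambda expression: the parameter type is inferred from Predicate<long>.
  public static readonly Predicate<long> Lambda =
    memory => memory >= Threshold;

  // 4. The same lambda text converted to an expression tree (data, not code).
  public static readonly Expression<Func<long, bool>> Tree =
    memory => memory >= Threshold;

  static void Main()
  {
    long sample = 32*1024*1024;
    Console.WriteLine(Named(sample));           // True
    Console.WriteLine(Anonymous(sample));       // True
    Console.WriteLine(Lambda(sample));          // True
    // An expression tree must be compiled before it can be invoked.
    Console.WriteLine(Tree.Compile()(sample));  // True
    Console.WriteLine(Tree.Body.NodeType);      // GreaterThanOrEqual
  }
}
```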

Lambda expressions introduce new syntaxes in C# and VB.NET. In the next section, we’ll look at the structure of lambda expressions and review some samples so you can grow accustomed to them.

How to express lambda expressions

In C#, a lambda expression is written as a parameter list, followed by the => token, followed by an expression or a statement block, as shown in figure 2.1.

Figure 2.1. Structure of a lambda expression in C#

 

Note

The => token always follows the parameter list. It should not be confused with comparison operators such as <= and >=.

 

The lambda operator can be read as “goes to.” The left side of the operator specifies the input parameters (if any), and the right side holds the expression or statement block to be evaluated.

There are two kinds of lambda expressions. A lambda expression with an expression on the right side is called an expression lambda. The second kind is a statement lambda, which looks similar to an expression lambda except that its right part consists of any number of statements enclosed in curly braces.

To give you a better idea of what lambda expressions look like in C#, see listing 2.11 for some examples.

Listing 2.11. Sample lambda expressions in C#

x => x + 1                       // Implicitly typed, expression body
x => { return x + 1; }           // Implicitly typed, statement body
(int x) => x + 1                 // Explicitly typed, expression body
(int x) => { return x + 1; }     // Explicitly typed, statement body
(x, y) => x * y                  // Multiple parameters
() => 1                          // No parameters, expression body
() => Console.WriteLine()        // No parameters, expression body that returns no value
customer => customer.Name
person => person.City == "Paris"
(person, minAge) => person.Age >= minAge

 

Note

The parameters of a lambda expression can be explicitly or implicitly typed.

 

In VB.NET, lambda expressions are written differently. They start with the Function keyword, as shown in figure 2.2:

Figure 2.2. Structure of a lambda expression in VB.NET

 

Note

VB.NET 9.0 doesn’t support statement lambdas.

 

Listing 2.12 shows the sample expressions we provided for C#, but in VB.NET this time.

Listing 2.12. Sample lambda expressions in VB.NET

Function(x) x + 1                ' Implicitly typed
Function(x As Integer) x + 1     ' Explicitly typed
Function(x, y) x * y             ' Multiple parameters
Function() 1                     ' No parameters
Function(customer) customer.Name
Function(person) person.City = "Paris"
Function(person, minAge) person.Age >= minAge

As you saw in the example, lambda expressions are compatible with delegates. To give you a feel for lambda expressions as delegates, we’ll use some delegate types.

The System.Action<T>, System.Converter<TInput, TOutput>, and System.Predicate<T> generic delegate types were introduced by .NET 2.0:

delegate void Action<T>(T obj);
delegate TOutput Converter<TInput, TOutput>(TInput input);
delegate Boolean Predicate<T>(T obj);

Another interesting delegate type from previous versions of .NET is MethodInvoker. This type represents any method that takes no parameters and returns no results:

delegate void MethodInvoker();

We regret that MethodInvoker has been declared in the System.Windows.Forms namespace even though it can be useful outside Windows Forms applications. This has been addressed in .NET 3.5. A new version of the Action delegate type that takes no parameter is added to the System namespace by the new System.Core.dll assembly:

delegate void Action();

 

Note

The System.Core.dll assembly comes with .NET 3.5. We’ll describe its content and the content of the other LINQ assemblies in chapter 3.

 

A whole set of additional delegate types is added to the System namespace by the System.Core.dll assembly:

delegate void Action<T1, T2>(T1 arg1, T2 arg2);
delegate void Action<T1, T2, T3>(T1 arg1, T2 arg2, T3 arg3);
delegate void Action<T1, T2, T3, T4>(T1 arg1, T2 arg2,
  T3 arg3, T4 arg4);
delegate TResult Func<TResult>();
delegate TResult Func<T, TResult>(T arg);
delegate TResult Func<T1, T2, TResult>(T1 arg1, T2 arg2);
delegate TResult Func<T1, T2, T3, TResult>(T1 arg1, T2 arg2, T3 arg3);
delegate TResult Func<T1, T2, T3, T4, TResult>(T1 arg1, T2 arg2,
  T3 arg3, T4 arg4);

A lambda expression is compatible with a delegate if the following rules are respected:

  • The lambda must contain the same number of parameters as the delegate type.
  • Each input parameter in the lambda must be implicitly convertible to its corresponding delegate parameter.
  • The return value of the lambda (if any) must be implicitly convertible to the delegate’s return type.

To give you a good overview of the various possible combinations, we have prepared a set of sample lambda expressions declared as delegates. These samples demonstrate the compatibility between the delegate types we have just introduced and some lambda expressions. Listings 2.13 and 2.14 contain the samples, which include lambda expressions and delegates with and without parameters, both with and without result, as well as expression lambdas and statement lambdas.

Listing 2.13. Sample lambda expressions declared as delegates in C# (LambdaExpressions.csproj)

Annotations, in listing order: No parameter; Implicitly typed string parameter; Explicitly typed string parameter; Two implicitly typed parameters; Equivalent but not compatible; Same lambda expression but different delegate types; Statement body and explicitly typed parameters.
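The code of listing 2.13 isn’t reproduced in this text. The sketch below gives one declaration for each of the listing’s annotations, using the delegate types introduced above. It’s an illustration under our own assumptions, not the original listing: the LambdaDelegateSketch class, the field names, and the lambda bodies are made up for this example.

```csharp
using System;

static class LambdaDelegateSketch
{
  // No parameter
  public static readonly Func<DateTime> GetDateTime = () => DateTime.Now;

  // Implicitly typed string parameter
  public static readonly Action<string> PrintImplicit =
    s => Console.WriteLine(s);

  // Explicitly typed string parameter
  public static readonly Action<string> PrintExplicit =
    (string s) => Console.WriteLine(s);

  // Two implicitly typed parameters
  public static readonly Func<int, int, int> SumInts = (x, y) => x + y;

  // Equivalent but not compatible: both take an int and return a bool,
  // yet neither delegate can be assigned to a variable of the other type.
  public static readonly Predicate<int> EqualsOne1 = x => x == 1;
  public static readonly Func<int, bool> EqualsOne2 = x => x == 1;

  // Same lambda expression but different delegate types
  public static readonly Func<int, int> IncInt = x => x + 1;
  public static readonly Func<int, double> IncIntAsDouble = x => x + 1;

  // Statement body and explicitly typed parameters
  public static readonly Func<int, int, int> Comparer = (int x, int y) =>
  {
    if (x > y) return 1;
    if (x < y) return -1;
    return 0;
  };

  static void Main()
  {
    Console.WriteLine(GetDateTime());
    PrintImplicit("implicit");
    PrintExplicit("explicit");
    Console.WriteLine(SumInts(2, 3));      // 5
    Console.WriteLine(EqualsOne1(1));      // True
    Console.WriteLine(EqualsOne2(2));      // False
    Console.WriteLine(IncInt(1));          // 2
    Console.WriteLine(IncIntAsDouble(1));  // 2
    Console.WriteLine(Comparer(1, 2));     // -1
  }
}
```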

Listing 2.14 shows similar lambda expressions declared as delegates in VB.

Listing 2.14. Sample lambda expressions declared as delegates in VB.NET (LambdaExpressions.vbproj)

Annotations, in listing order: No parameter; Implicitly typed string parameter; Explicitly typed string parameter; Two implicitly typed parameters; Equivalent but not compatible; Same lambda expression but different delegate types.

The statement lambda isn’t reproduced in VB in the listing because VB.NET doesn’t support this kind of lambda expression. Furthermore, we use Func(Of String, String) instead of Action(Of String) because the latter would require a statement lambda.

Let’s continue improving our example. This time, we’ll work on the list of processes.

2.5. Extension methods

static void Dump(this object o);

The next topic we’d like to cover is extension methods. You’ll see how this new language feature allows you to add methods to a type after it has been defined. You’ll also see how extension methods compare to static methods and instance methods.

We’ll start by creating a sample extension method, before going through more examples and using some predefined extension methods. Before jumping onto the next subject, we’ll give you some warnings and show you the limitations of extension methods.

2.5.1. Creating a sample extension method

In our continuing effort to improve our example that displays information about the running processes, let’s say we want to compute the total memory used by a list of processes. We could define a standard static method that accepts an enumeration of ProcessData objects as a parameter. This method would loop on the processes and sum the memory used by each process.

For an example, see listing 2.15.

Listing 2.15. The TotalMemory helper method coded as standard static method
static Int64 TotalMemory(IEnumerable<ProcessData> processes)
{
  Int64 result = 0;

  foreach (var process in processes)
    result += process.Memory;

  return result;
}

We could then use this method this way:

Console.WriteLine("Total memory: {0} MB",
  TotalMemory(processes)/1024/1024);

One thing we can do to improve our code is convert our static method into an extension method. This new language feature makes it possible to treat existing types as if they were extended with additional methods.

Declaring extension methods in C#

In order to transform our method into an extension method, all we have to do is add the this keyword to the first parameter, as shown in listing 2.16.

Listing 2.16. The TotalMemory helper method declared as an extension method (ExtensionMethods.csproj)

static Int64 TotalMemory(this IEnumerable<ProcessData> processes)   
{
  Int64 result = 0;

  foreach (var process in processes)
    result += process.Memory;

  return result;
}

If we examine this new version of the method, it still looks like any run-of-the-mill helper routine, with the notable exception of the first parameter being decorated with the this keyword.

The this keyword instructs the compiler to treat the method as an extension method. It indicates that this is a method that extends objects of type IEnumerable<ProcessData>.

 

Note

In C#, extension methods must be declared in a non-nested, non-generic static class. In addition, an extension method can take any number of parameters, but the first parameter must be of the type that is extended and preceded by the keyword this.

 

We can now use the TotalMemory method as if it were an instance method defined on the type of our processes object. Here is the syntax it allows:

Console.WriteLine("Total memory: {0} MB",
  processes.TotalMemory()/1024/1024);

See how we have extended, in appearance at least, the IEnumerable<ProcessData> type with a new method. The type remains unchanged. The compiler converts the code to a static method call, comparable to what we used in listing 2.15.
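To make this equivalence concrete, here is a minimal sketch. The ProcessData class is reduced to a stand-in with only a Memory property, and ProcessExtensions is a class name chosen for illustration; neither is taken from the sample projects.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the chapter's ProcessData class (assumption: only the
// Memory property matters for this example).
class ProcessData
{
  public Int64 Memory { get; set; }
}

// Hypothetical container class for the extension method.
static class ProcessExtensions
{
  public static Int64 TotalMemory(this IEnumerable<ProcessData> processes)
  {
    Int64 result = 0;
    foreach (var process in processes)
      result += process.Memory;
    return result;
  }
}

static class Program
{
  static void Main()
  {
    var processes = new List<ProcessData> {
      new ProcessData { Memory = 10 },
      new ProcessData { Memory = 32 } };

    // Extension-method syntax...
    Int64 viaExtension = processes.TotalMemory();
    // ...and the plain static call it is compiled into.
    Int64 viaStatic = ProcessExtensions.TotalMemory(processes);

    Console.WriteLine(viaExtension == viaStatic);  // prints True
  }
}
```

Both calls run the same method; only the call syntax differs.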

It may not appear that using an extension method makes a big difference, but it helps when writing code because our TotalMemory method is now listed by IntelliSense for the types supported by this method, as shown in figure 2.3.

Figure 2.3. IntelliSense displays extension methods with a specific icon in addition to instance methods.

Notice how a specific icon with a blue arrow is used for extension methods. The figure shows the ToList and ToLookup standard query operators (more on these in section 2.5.2), as well as our TotalMemory extension method. Now, when writing code, we clearly see that we can get a total of the memory used by the processes contained in an enumeration of ProcessData objects. Extension methods are more easily discoverable through IntelliSense than classic static helper methods are.

Another advantage of extension methods is that they make it much easier to chain operations together. Let’s consider that we want to do the following:

  1. Filter out some processes from a collection of ProcessData objects using a helper method.
  2. Compute the total memory consumption of the processes using TotalMemory.
  3. Convert the memory consumption into megabytes using another helper method.

We would end up writing code that looks like this with classical helper methods:

BytesToMegaBytes(TotalMemory(FilterOutSomeProcesses(processes)));

One problem with this kind of code is that the operations are specified in the opposite of the order in which they are executed. This makes the code both harder to write and more difficult to understand.

In comparison, if the three fictitious helper methods were defined as extension methods, we could write:

processes
  .FilterOutSomeProcesses()
  .TotalMemory()
  .BytesToMegaBytes();

In this latter version, the operations are specified in the same order they execute in. This is much easier to read, don’t you think?
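The three helper methods are fictitious, so the following sketch has to invent their bodies. We assume a filter that keeps processes using at least 20 MB and a conversion based on integer division; the Helpers class name and the reduced ProcessData class are also assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the chapter's ProcessData class.
class ProcessData
{
  public Int64 Memory { get; set; }
}

static class Helpers
{
  // Hypothetical filter: keeps processes that use at least 20 MB.
  public static IEnumerable<ProcessData> FilterOutSomeProcesses(
    this IEnumerable<ProcessData> processes)
  {
    foreach (var process in processes)
      if (process.Memory >= 20 * 1024 * 1024)
        yield return process;
  }

  public static Int64 TotalMemory(this IEnumerable<ProcessData> processes)
  {
    Int64 result = 0;
    foreach (var process in processes)
      result += process.Memory;
    return result;
  }

  // Integer division, like the /1024/1024 expressions used in this chapter.
  public static Int64 BytesToMegaBytes(this Int64 bytes)
  {
    return bytes / 1024 / 1024;
  }
}

static class Program
{
  static void Main()
  {
    var processes = new List<ProcessData> {
      new ProcessData { Memory = 30L * 1024 * 1024 },
      new ProcessData { Memory = 5L * 1024 * 1024 },
      new ProcessData { Memory = 50L * 1024 * 1024 } };

    // Nested static calls read inside out.
    Int64 nested = Helpers.BytesToMegaBytes(
      Helpers.TotalMemory(Helpers.FilterOutSomeProcesses(processes)));

    // Chained extension calls read in execution order.
    Int64 chained = processes
      .FilterOutSomeProcesses()
      .TotalMemory()
      .BytesToMegaBytes();

    Console.WriteLine("{0} {1}", nested, chained);  // prints 80 80
  }
}
```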

 

Note

Notice in the code sample that we insert line breaks and whitespace between method calls. We’ll do this often in our code samples in order to improve code readability. This isn’t new in C# 3.0; all versions of C# support it.

 

You’ll see more examples of chaining constructs in the next sections. As you’ll see in the next chapter, this is a key feature for writing LINQ queries. For the moment, let’s see how to declare extension methods in VB.NET.

Declaring extension methods in VB.NET

In VB.NET, extension methods are shared methods decorated with a custom attribute (System.Runtime.CompilerServices.ExtensionAttribute) that allows them to be invoked with instance-method syntax. (An extension method can be a Sub procedure or a Function procedure.) This attribute is provided by the new System.Core.dll assembly.

 

Note

In VB.NET, extension methods must be declared in a module.

 

The first parameter in a VB.NET extension method definition specifies which data type the method extends. When the method is run, the first parameter is bound to the instance of the data type against which the method is applied.

Listing 2.17 shows how we would declare our TotalMemory extension method in VB.NET.

Listing 2.17. Sample extension method in VB.NET (ExtensionMethods.vbproj)

<System.Runtime.CompilerServices.Extension()> _
Public Function TotalMemory( _
  ByVal processes As IEnumerable(Of ProcessData)) _
  As Int64
  Dim result As Int64 = 0
  For Each process In processes
    result += process.Memory
  Next
  Return result
End Function

 

Note

Extension members of other kinds, such as properties, events, and operators, are being considered by Microsoft for the future but are currently not supported in C# 3.0 and VB.NET 9.0.

 

To give you a better idea of what can be done with extension methods and why they are useful, we’ll now use some standard extension methods provided with LINQ.

2.5.2. More examples using LINQ’s standard query operators

LINQ comes with a set of extension methods you can use like any other extension method. We’ll use some of them to show you more extension methods in action and give you a preview of the standard query operators, which we’ll cover in the next chapter.

OrderByDescending

Let’s say that we’d like to sort the list of processes by their memory consumption, memory hogs first. We can use the OrderByDescending extension method defined in the System.Linq.Enumerable class. Extension methods are imported through using namespace directives. For example, to use the extension methods defined in the Enumerable class, we need to add the following line of code to the top of our code file if it’s not already there:

using System.Linq;

 

Note

Your project also needs a reference to System.Core.dll, but this is added by default for new projects.

 

We’re now able to call OrderByDescending as follows to sort our processes:

ObjectDumper.Write(
  processes.OrderByDescending(process => process.Memory));

You can see that we provide the extension method with a lambda expression to decide how the sort operation will be performed. Here we indicate that we want to compare the processes based on their memory consumption.

It’s important to note that type inference is used automatically to simplify the code. Although OrderByDescending is defined as a generic method, we don’t need to explicitly indicate the types we’re dealing with. The C# compiler deduces from the method call that OrderByDescending works here on ProcessData objects, sorts them by a key of type Int64, and returns an ordered enumeration of ProcessData objects.

When a generic method is called without specifying type arguments, a type inference process attempts to infer type arguments for the call. The presence of type inference allows a more convenient syntax to be used for calling a generic method, and allows the programmer to avoid specifying redundant type information.

Here is how OrderByDescending is defined:

public static IOrderedEnumerable<TSource>
  OrderByDescending<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector)

Here is how we would have to use it if type inference weren’t occurring:

processes.OrderByDescending<ProcessData, Int64>(
  (ProcessData process) => process.Memory);

The code would be more difficult to read without type inference because we’d have to specify types everywhere in LINQ queries.
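As a quick check, the following sketch calls OrderByDescending both ways on the same data. The reduced ProcessData class and the sample values are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the chapter's ProcessData class.
class ProcessData
{
  public string Name { get; set; }
  public Int64 Memory { get; set; }
}

static class Program
{
  static void Main()
  {
    var processes = new List<ProcessData> {
      new ProcessData { Name = "small", Memory = 100 },
      new ProcessData { Name = "large", Memory = 900 },
      new ProcessData { Name = "medium", Memory = 500 } };

    // Type arguments inferred: TSource = ProcessData, TKey = Int64.
    var inferred = processes.OrderByDescending(process => process.Memory);

    // The same call with every type spelled out.
    IEnumerable<ProcessData> spelledOut =
      processes.OrderByDescending<ProcessData, Int64>(
        (ProcessData process) => process.Memory);

    Console.WriteLine(inferred.First().Name);               // prints large
    Console.WriteLine(inferred.SequenceEqual(spelledOut));  // prints True
  }
}
```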

Let’s now look at other query operators.

Take

If we’re interested only in the two processes that consume the most memory, we can use the Take extension method:

ObjectDumper.Write(
  processes
    .OrderByDescending(process => process.Memory)
    .Take(2));

The Take method returns the first n elements in an enumeration. Here we want two elements.

Sum

If we want to sum the amount of memory used by the two processes, we can use another standard extension method: Sum. The Sum method can be used in place of the extension method we created, TotalMemory. Here is how to use it:

ObjectDumper.Write(
  processes
    .OrderByDescending(process => process.Memory)
    .Take(2)
    .Sum(process => process.Memory)/1024/1024);
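
With fixed values, it’s easy to check by hand what these operators return. In the following sketch, the MemoryStats class, the reduced ProcessData class, and the sample values are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the chapter's ProcessData class.
class ProcessData
{
  public Int64 Memory { get; set; }
}

// Hypothetical helper class wrapping the two queries.
static class MemoryStats
{
  // Same result as our TotalMemory method, using the standard Sum operator.
  public static Int64 Total(IEnumerable<ProcessData> processes)
  {
    return processes.Sum(process => process.Memory);
  }

  // Memory used by the two largest processes.
  public static Int64 TopTwo(IEnumerable<ProcessData> processes)
  {
    return processes
      .OrderByDescending(process => process.Memory)
      .Take(2)
      .Sum(process => process.Memory);
  }
}

static class Program
{
  static void Main()
  {
    var processes = new List<ProcessData> {
      new ProcessData { Memory = 100 },
      new ProcessData { Memory = 300 },
      new ProcessData { Memory = 200 } };

    Console.WriteLine(MemoryStats.Total(processes));   // prints 600
    Console.WriteLine(MemoryStats.TopTwo(processes));  // prints 500
  }
}
```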

2.5.3. Extension methods in action in our example

Listing 2.18 shows what our DisplayProcesses method looks like after all the additions we made.

Listing 2.18. The DisplayProcesses method with extension methods (ExtensionMethods.csproj)
static void DisplayProcesses(Func<Process, Boolean> match)
{
  var processes = new List<ProcessData>();
  foreach (var process in Process.GetProcesses())
  {
    if (match(process))
    {
      processes.Add(new ProcessData { Id=process.Id,
        Name=process.ProcessName, Memory=process.WorkingSet64 });
    }
  }

  Console.WriteLine("Total memory: {0} MB",
    processes.TotalMemory()/1024/1024);

  var top2Memory =
    processes
      .OrderByDescending(process => process.Memory)
      .Take(2)
      .Sum(process => process.Memory)/1024/1024;
  Console.WriteLine(
    "Memory consumed by the two most hungry processes: {0} MB",
    top2Memory);

  ObjectDumper.Write(processes);
}

You can see how extension methods are especially useful when you combine them. Without extension methods, we would have to write code that is more difficult to comprehend. For example, compare the following code snippets that use the same methods.

Note these methods used as classic static methods:

var top2Memory =
  Enumerable.Sum(
    Enumerable.Take(
      Enumerable.OrderByDescending(processes,
        process => process.Memory),
      2),
    process => process.Memory)/1024/1024;

Compare that to these methods used as extension methods:

var top2Memory =
  processes
    .OrderByDescending(process => process.Memory)
    .Take(2)
    .Sum(process => process.Memory)/1024/1024;

As you can see, extension methods facilitate a chaining pattern because they can be strung together using dot notation. This looks like a pipeline and could be compared to Unix pipes. This is important for working with query operators, which we’ll cover in chapter 3.

 

Pipelines

In Unix-like computer operating systems, a pipeline is a set of processes chained by their standard streams, so that the output of each process (stdout) feeds directly as input (stdin) of the next one. Example: who | grep "joe" | sort.

 

Notice how much easier it is to follow the latter code. The processing steps are clearly expressed: We want to order the processes by memory, then keep the first two, and then sum their memory consumption. With the first code, it’s not that obvious, because what happens first is nested in method calls.

2.5.4. Warnings

Let’s review some limitations of extension methods before returning to our example application.

An important question arises when encountering extension methods: What if an extension method conflicts with an instance method? It’s important to understand how the resolution of extension methods works.

Extension methods are less “discoverable” than instance methods, which means that they are always lower priority. An extension method can’t hide an instance method.

Let’s consider listing 2.19.

Listing 2.19. Sample code for demonstrating extension methods’ discoverability

This code produces the following results:

Extensions.Method1
Extensions.Method1
Class3.Method1
Class4.Method1

You can see that as soon as an instance method exists with matching parameter types, it gets executed. The extension method is called only when no method with the same signature exists.
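The body of listing 2.19 isn’t reproduced in this text. The following sketch is one set of declarations consistent with the results shown above and with the VB.NET behavior described in the warning below. The parameter types of each Method1 are our assumption, and the methods return their names instead of printing them so that the calls can be checked individually.

```csharp
using System;

class Class1 { }

class Class2
{
  // An int argument can't be implicitly converted to string in C#.
  public string Method1(string s) { return "Class2.Method1"; }
}

class Class3
{
  // An int argument is accepted here through boxing.
  public string Method1(object o) { return "Class3.Method1"; }
}

class Class4
{
  public string Method1(int i) { return "Class4.Method1"; }
}

static class Extensions
{
  // Extends every type, because every type derives from object.
  public static string Method1(this object o, int i)
  {
    return "Extensions.Method1";
  }
}

static class Program
{
  static void Main()
  {
    Console.WriteLine(new Class1().Method1(12));  // prints Extensions.Method1
    Console.WriteLine(new Class2().Method1(12));  // prints Extensions.Method1
    Console.WriteLine(new Class3().Method1(12));  // prints Class3.Method1
    Console.WriteLine(new Class4().Method1(12));  // prints Class4.Method1
  }
}
```

Class1 has no Method1 at all, and the Method1 of Class2 can’t accept an integer, so the compiler falls back to the extension method in both cases.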

 

Warning

In VB.NET, the behavior is a bit different. With code similar to listing 2.19, the results are as follows if Option Strict is Off:

Extensions.Method1
Class2.Method1
Class3.Method1
Class4.Method1

As you can see, the VB.NET compiler gives higher priority to instance methods by converting parameters if needed. Here, the integer we pass to Method1 is converted automatically to a string in order to call the method of Class2.

If Option Strict is On, the following compilation error happens: "Option Strict On disallows implicit conversions from 'Integer' to 'String'". In such a case, a classic shared method call can be used, such as Method1(New Class2(), 12).

See the sample ExtensionMethodsDiscoverability.vbproj project to experiment with this.

 

Extension methods are more limited in functionality than instance methods. They can’t access non-public members, for example. Also, using extension methods intensively can negatively affect the readability of your code if it’s not clear that an extension method is used. For those reasons, we recommend you use extension methods sparingly and only in situations where instance methods aren’t feasible. We’ll use and create extension methods in combination with LINQ, but that’s a story for later.

With all these new features, we have greatly improved our code. But wait a minute: We can do better than that! Don’t you think it would be a big improvement if we could get rid of the ProcessData class? As it stands, it’s a temporary class with no real value, and it accounts for several lines of code. Getting rid of all the extra code would be perfect. This is just what anonymous types will allow us to do!

2.6. Anonymous types

var contact = new { Name = "Bob", Age = 8 };

We’re approaching the end of this chapter. But we still have one language enhancement to introduce before we can focus again on LINQ in the next chapter, in which you’ll be able to employ everything you learned in this chapter.

Using a syntax similar to that of object initializers, we can create anonymous types. They are usually used to group data into an object without first declaring a new class.

We’ll start this section by demonstrating how to use anonymous types in our example. We’ll then show you how anonymous types are real types, and point out some of their limitations.

2.6.1. Using anonymous types to group data into an object

Let’s say we want to collect the results of our processing together. We want to group information into an object. Having to declare a specific type just for this would be a pain.

Here is how we can use an anonymous type in C#:

var results = new {
  TotalMemory = processes.TotalMemory()/1024/1024,
  Top2Memory = top2Memory,
  Processes = processes };

 

Note

To output content of the Processes property, which is created as part of our new object, we should instruct ObjectDumper to process the data one level deeper. In order to do this, call ObjectDumper.Write(results, 1) instead of ObjectDumper.Write(results).

 

The syntax for anonymous types in VB.NET is similar:

Dim results = New With { _
  .TotalMemory = processes.TotalMemory()/1024/1024, _
  .Top2Memory = top2Memory, _
  .Processes = processes }

 

Note

Objects declared using an anonymous type can be used only with the var or Dim keywords. This is because an anonymous type doesn’t have a name we could use in our code!

 

2.6.2. Types without names, but types nonetheless

Anonymous types are types without names,[1] but types anyway. This means that a real type is created by the compiler. Our results variable points to an instance of a class that is created automatically based on our code. This class has three properties: TotalMemory, Top2Memory, and Processes. The types of the properties are deduced from the initializers.

1 Without names we can use, at least.

Figure 2.4 shows what the anonymous type that is created for us looks like in the produced assembly.

Figure 2.4. Sample anonymous type produced by the compiler, as displayed by .NET Reflector

The figure is a screenshot of .NET Reflector displaying the decompiled code of an anonymous type generated for the code we wrote in the previous section. (.NET Reflector is a free tool we highly recommend, available at http://aisto.com/roeder/dotnet.)

Be aware that compilers consider two anonymous types that are specified within the same program with properties of the same names and types in the same order to be the same type. For example, if we write the following two lines of code, only one type is created by the compiler:

var v1 = new { Person = "Suzie", Age = 32, CanCode = true };
var v2 = new { Person = "Barney", Age = 29, CanCode = false };

After this code snippet is executed, the two variables v1 and v2 contain two different instances of the same class.

If we add a third line like the following one, a different type is created for v3 because the order of the properties is different:

var v3 = new { Age = 17, Person = "Bill", CanCode = false };
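
We can check this at run time by comparing the types of the three objects. This is a small sketch of our own, not part of the sample projects.

```csharp
using System;

static class Program
{
  static void Main()
  {
    var v1 = new { Person = "Suzie", Age = 32, CanCode = true };
    var v2 = new { Person = "Barney", Age = 29, CanCode = false };
    var v3 = new { Age = 17, Person = "Bill", CanCode = false };

    // Same property names and types in the same order: one shared type.
    Console.WriteLine(v1.GetType() == v2.GetType());  // prints True

    // Same properties in a different order: a distinct type.
    Console.WriteLine(v1.GetType() == v3.GetType());  // prints False
  }
}
```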

2.6.3. Improving our example using anonymous types

That’s all well and good, but we said that we could get rid of the ProcessData object, and we haven’t done so. Let’s get back to what we wanted to do. Listing 2.20 shows a version of our DisplayProcesses method that uses an anonymous type instead of the ProcessData class:

Listing 2.20. The DisplayProcesses method with an anonymous type (AnonymousTypes.csproj)
static void DisplayProcesses(Func<Process, Boolean> match)
{
  var processes = new List<Object>();
  foreach (var process in Process.GetProcesses())
  {
    if (match(process))
    {
      processes.Add( new {
        process.Id, Name=process.ProcessName,
        Memory=process.WorkingSet64 } );
    }
  }

  ObjectDumper.Write(processes);
}

 

Note

If a name isn’t specified for a property, and the expression is a simple name or a member access, the resulting property takes the name of the original member. Here we don’t provide a name for the first member, so it will be named Id.

For the sake of clarity, you may consider explicitly naming the members even if it isn’t required.

 

The great advantage of using such code is that we don’t need to declare our ProcessData class. This makes anonymous types a great tool for quick and simple temporary results. We don’t have to declare classes to hold temporary results anymore—thanks to anonymous types.

Still, anonymous types suffer from a number of limitations.

2.6.4. Limitations

A problem with our new code is that now that we have removed the ProcessData class, we can’t use our TotalMemory method any longer because it’s defined to work with ProcessData objects. As soon as we use anonymous types, we lose the ability to work with our objects in a strongly typed manner outside of the method where they are defined. This means that we can pass an instance of an anonymous type to a method only if the method expects an Object as parameter, but not if it expects a more precise type. Reflection is the only way to work with an anonymous type outside of the method where it’s created.
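To show what the reflection route looks like, here is a sketch of a TotalMemory variant that accepts plain objects. The ReflectionHelpers class is hypothetical, and the method assumes every object exposes a readable Memory property of type Int64. Nothing checks this at compile time, so a missing property fails at run time.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; not part of the sample projects.
static class ReflectionHelpers
{
  // Sums the Memory property of each object, looked up by name.
  public static Int64 TotalMemory(IEnumerable<object> items)
  {
    Int64 result = 0;
    foreach (var item in items)
    {
      var property = item.GetType().GetProperty("Memory");
      result += (Int64)property.GetValue(item, null);
    }
    return result;
  }
}

static class Program
{
  static void Main()
  {
    var processes = new List<object> {
      new { Id = 1, Name = "a", Memory = 10L },
      new { Id = 2, Name = "b", Memory = 32L } };

    Console.WriteLine(ReflectionHelpers.TotalMemory(processes));  // prints 42
  }
}
```

This works, but it gives up IntelliSense and compile-time checking, which is why a named class remains the better choice when data has to cross method boundaries.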

Likewise, anonymous types can’t be used as method results, unless the method’s return type is Object. This is why anonymous types should be used only for temporary data and can’t be used like normal types in method signatures.

Well, that’s not entirely true. We can use anonymous types as method results from generic methods. Let’s consider the following method:

public static TResult ReturnAGeneric<TResult>(
  Func<TResult> creator)
{
  return creator();
}

The return type of the ReturnAGeneric method is generic. If we call it without explicitly specifying a type for the TResult type argument, it’s inferred automatically from the argument we pass for the creator parameter. Now, let’s consider the following line of code that invokes ReturnAGeneric:

var obj = ReturnAGeneric(
  () => new {Time = DateTime.Now, AString = "abc"});

Because the creator function provided as an argument returns an instance of an anonymous type, ReturnAGeneric returns that instance. However, ReturnAGeneric isn’t defined to return an Object, but a generic type. This is why the obj variable is strongly typed. This means it has a Time property of type DateTime and an AString property of type String.

Our ReturnAGeneric method is pretty much useless. But as you’ll be able to see with the standard query operators, LINQ uses this extensively in a more useful way.
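As a small preview, the following sketch uses the standard Select operator, which is generic in its result type in the same way as ReturnAGeneric. The sample data is our own.

```csharp
using System;
using System.Linq;

static class Program
{
  static void Main()
  {
    var names = new[] { "Bob", "Suzie" };

    // Select infers its result type from the lambda, so the anonymous
    // type flows through and each item stays strongly typed.
    var items = names.Select(name => new { Name = name, name.Length });

    foreach (var item in items)
      Console.WriteLine("{0}: {1}", item.Name, item.Length);
    // prints:
    // Bob: 3
    // Suzie: 5
  }
}
```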

There is one more thing to keep in mind about anonymous types. In C#, instances of anonymous types are immutable. This means that once you create an anonymous type instance, its field and property values are fixed forever. If you look at the sample anonymous type the compiler creates in figure 2.4, you can see that properties have getters but no setters. The only way to assign values to the properties and their underlying fields is through the constructor of the class. When you use the syntax to initialize an instance of an anonymous type, the constructor of that type is invoked automatically and the values are set once and for all.

Because they are immutable, instances of anonymous types have stable hash codes. If an object can’t be altered, then its hash code will never change either (unless the hash code of one of its fields isn’t stable). This is useful for hash tables and data-binding scenarios, for example.
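In C#, the generated class also overrides Equals and GetHashCode so that they are based on the property values. The following sketch of our own shows two separate instances with equal values being treated as equal:

```csharp
using System;

static class Program
{
  static void Main()
  {
    var a = new { Id = 123, Name = "Fabrice" };
    var b = new { Id = 123, Name = "Fabrice" };

    // Two distinct objects...
    Console.WriteLine(ReferenceEquals(a, b));               // prints False
    // ...that compare equal and share a hash code.
    Console.WriteLine(a.Equals(b));                         // prints True
    Console.WriteLine(a.GetHashCode() == b.GetHashCode());  // prints True
  }
}
```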

You may wonder why anonymous types in C# are designed to be immutable. What may appear to be a limitation is in fact a feature. It enables value-based programming, which is used in functional languages to avoid side effects. Objects that never change allow concurrent access to work much better. This will be useful to enable PLINQ (Parallel LINQ), a project Microsoft has started to introduce concurrency in LINQ queries. You’ll learn more about PLINQ in chapter 13. Immutable anonymous types take .NET one step closer to a more functional programming world where we can use snapshots of state and side-effect-free code.

Keyed anonymous types

We wrote that anonymous types are immutable in C#. The behavior is different in VB.NET. By default, instances of anonymous types are mutable in VB.NET. But we can specify a Key modifier on the properties of an anonymous type, as shown in listing 2.21.

Listing 2.21. Testing keyed anonymous types (AnonymousTypes.csproj)
Dim v1 = New With {Key .Id = 123, .Name = "Fabrice"}
Dim v2 = New With {Key .Id = 123, .Name = "Céline"}
Dim v3 = New With {Key .Id = 456, .Name = "Fabrice"}
Console.WriteLine(v1.Equals(v2))
Console.WriteLine(v1.Equals(v3))

The Key modifier does two things: It makes the property on which it’s applied read-only (keys have to be stable), and it causes the GetHashCode method to be overridden by the anonymous type so it calls GetHashCode on the key properties. You can have as many key properties as you like.

A consequence of using Key is that it affects the comparison of objects. For example, in the listing, v1.Equals(v2) returns True because the keys of v1 and v2 are equal. In contrast, v1.Equals(v3) returns False.

2.7. Summary

In this chapter, we have covered several language extensions provided by C# 3.0 and VB.NET 9.0:

  • Implicitly typed local variables
  • Object and collection initializers
  • Lambda expressions
  • Extension methods
  • Anonymous types

All these new features are cornerstones for LINQ, but they are integral parts of the C# and VB.NET languages and can be used separately. They represent a move by Microsoft to bring some of the benefits that exist with dynamic and functional languages to .NET developers.

 

Feature notes

We also used auto-implemented properties in this chapter, but this new feature exists only for C# and isn’t required for LINQ to exist. If you want to learn more about the new C# features and C# in general, we suggest you read another book from Manning: C# in Depth.

VB.NET 9.0 introduces more language features, but they aren’t related to LINQ, and we won’t cover them in this book. This includes If as a ternary operator similar to C#’s ?: operator and as a replacement for IIf. Other VB improvements include relaxed delegates and improved generic type inferencing.

It’s interesting to note that Visual Studio 2008 lets us write code that uses C# 3.0 or VB.NET 9.0 features but target .NET 2.0. In other words, we can run code that uses what we introduced in this chapter on .NET 2.0 without needing .NET 3.0 or 3.5 installed on the client or host machine, because all the features are provided by the compiler and don’t require runtime or library support. One notable exception is extension methods, which require the System.Runtime.CompilerServices.ExtensionAttribute class; but we can introduce it ourselves or deliver the System.Core assembly that contains it with our .NET 2.0 program.

 

To sum up what we have introduced in this chapter, listing 2.22 shows the complete source code of the example we built. You can see all the new language features in action, as highlighted in the annotations.

Listing 2.22. Complete code demonstrating the new language features (CompleteCode.csproj)

After this necessary digression, in the next chapter you’ll see how all the language enhancements you have just discovered are used by LINQ to integrate queries into C# and VB.NET.
