CHAPTER 17

Generic Types

It is sometimes useful to separate the implementation of a class—the members and methods that it exposes—from the type it is using. A list of items, for example, behaves in the same way whether it is a list of Decimal items or a list of Employee items.

A generic type is used to create such an implementation. The word generic refers to the implementation being written against a placeholder type rather than a specific one.

A List of Integers

Consider the following class that stores integer values:

public class IntList
{
    int m_count = 0;
    int[] m_values;
    public IntList(int capacity)
    {
       m_values = new int[capacity];
    }
    public void Add(int value)
    {
       m_values[m_count] = value;
       m_count++;
    }
    public int this[int index]
    {
       get { return m_values[index];}
       set { m_values[index] = value; }
    }
    public int Count { get { return m_count; } }
}

This class deals with only int values, so if you want to store a list of other types, you need to create a separate list class for each type of data (ShortList, FloatList, and so on). That doesn’t make much sense, so you can look for alternatives.

One alternative is to note that all types in C# derive from the object base class, and if you make a list that can store objects (ObjectList), you can use it with any type. This approach works and is in fact the approach taken with C# 1.0, but it has a few disadvantages.

  • It’s not typesafe at compile time; you can add a string and an Employee to an ObjectList instance, and everything works fine. When you access an item in the list, you have to specify which type you are expecting, and you will get an exception if the type isn’t the one you expect.
  • Any value types that you insert into the list have to be boxed into object instances to be added and unboxed when you pull them out again.
  • The resulting code is ugly.

It works, but it’s not really what you want.
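The drawbacks in the list above can be seen in a small sketch. The ObjectList class here is a hypothetical illustration that mirrors IntList but stores object; it is not a library type, and the demo class name is made up.

```csharp
using System;

// Hypothetical object-based list: IntList with int replaced by object.
public class ObjectList
{
    int m_count = 0;
    object[] m_values;
    public ObjectList(int capacity)
    {
        m_values = new object[capacity];
    }
    public void Add(object value)
    {
        m_values[m_count] = value;
        m_count++;
    }
    public object this[int index]
    {
        get { return m_values[index]; }
        set { m_values[index] = value; }
    }
    public int Count { get { return m_count; } }
}

public static class ObjectListDemo
{
    public static void Main()
    {
        ObjectList list = new ObjectList(2);
        list.Add(42);         // the int is boxed into an object
        list.Add("hello");    // a string in the same list compiles fine

        int first = (int)list[0];   // cast required; unboxes the value
        Console.WriteLine(first);

        try
        {
            int second = (int)list[1];   // compiles, but fails at runtime
            Console.WriteLine(second);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("Item 1 is not an int");
        }
    }
}
```

The mistake in the second cast is found only when the code runs, which is the first disadvantage in the list.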

WHY WERE GENERICS MISSING IN C# 1.0?

The preceding approach is exactly the approach that was used in C# 1.0.

The answer is pretty simple; because of the way that generics are implemented (more on that in the near future), they required a considerable amount of work, both in the C# language and the .NET Runtime. As a work item, it just didn’t fit in the schedule, and it was decided that it was better to ship a version of the .NET stack (C#, VB, libraries, and so on) that didn’t have generics than to wait for generics to be done. Given the amount of time it took to get C# 2.0 out the door, this seems to have been an excellent choice.

A close examination of the IntList class shows that there isn’t anything special about the fact that the class stores integer values; the class code would be identical if it stored floats. What you need is a way to generate different implementations for each specific type from one standard implementation. You can start by modifying the class so that all of the instances of int are replaced with a placeholder.

class MyList
{
    int m_count = 0;
    T[] m_values;
    public MyList(int capacity)
    {
       m_values = new T[capacity];
    }
    public void Add(T value)
    {
       m_values[m_count] = value;
       m_count++;
    }
    public T this[int index]
    {
       get { return m_values[index]; }
       set { m_values[index] = value; }
    }
    public int Count { get { return m_count; } }
}

This placeholder is known as a type parameter.1 Now, you just need a way to replace those instances of T with the real type you want. That’s a bit complicated; the code doesn’t show you what placeholder to replace, and it’s valid to have a type named T. You need a way to tell which identifiers in the code are type parameters. You’ll do this by adding a decoration to the class name.

class MyList <T>

Now it is simple to find the T in the class name, and when somebody writes this:

MyList <int>

you know that you can create the class you want by just substituting all instances of T with int and compiling the resulting code. The type MyList <T> is known as a generic type, while the use of MyList <int> is known as a constructed type. The use of int is known as a type argument.2
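Putting the pieces together, a minimal sketch of the complete generic class and two constructed types might look like this. The demo class and the values stored are made up for illustration.

```csharp
using System;

class MyList<T>
{
    int m_count = 0;
    T[] m_values;
    public MyList(int capacity)
    {
        m_values = new T[capacity];
    }
    public void Add(T value)
    {
        m_values[m_count] = value;
        m_count++;
    }
    public T this[int index]
    {
        get { return m_values[index]; }
        set { m_values[index] = value; }
    }
    public int Count { get { return m_count; } }
}

static class MyListDemo
{
    static void Main()
    {
        // Constructed type MyList<int>: the type argument is int.
        MyList<int> numbers = new MyList<int>(2);
        numbers.Add(10);
        numbers.Add(20);
        int sum = numbers[0] + numbers[1];   // no cast and no boxing
        Console.WriteLine(sum);              // prints 30

        // Constructed type MyList<string>, from the same generic type.
        MyList<string> names = new MyList<string>(1);
        names.Add("Ada");
        // names.Add(5);   // would not compile: 5 is not a string
        Console.WriteLine(names[0]);
    }
}
```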

There are two different ways in which the transformation from generic type to constructed type can be architected.

The first is to do it in a single step. When compiling code with a use of a generic type:

MyList <int>

you can find the definition of MyList <T>, do the substitution, and compile the resulting code. This is the approach that C++ templates use; templates are purely a compiler feature, and in this example, the MyList <int> class is what gets compiled.

C# and .NET use a two-step approach. In the first step, the generic type (in this case, MyList <T>) is compiled, just like any nongeneric type. When you want to create the constructed type MyList <int>, the compiled definition of MyList <T> is referenced, and the constructed type is created from that.

C++ TEMPLATES VS. C# GENERICS

When the .NET teams were designing generics, there were two important requirements.

  • A generic type written in one language has to be consumable in any other .NET language3 that supports generics.
  • Generic types must work as expected when accessed at runtime. Among other things, that means being able to tell that a type is a generic type and being able to construct instances of generic types from their names.

Neither of these would make sense in the C++ world, since C++ doesn’t interoperate with other languages and doesn’t work in a managed environment.4
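The second requirement can be illustrated with reflection. This is a sketch, not a description of how the runtime works internally; it uses List<T> from the base class library and assumes that its runtime name can be resolved without an assembly qualifier, which holds for types in the core library.

```csharp
using System;
using System.Collections.Generic;

static class RuntimeGenericsDemo
{
    static void Main()
    {
        // Look up the generic type List<T> by its runtime name.
        // The `1 suffix records that the type has one type parameter.
        Type open = Type.GetType("System.Collections.Generic.List`1");
        Console.WriteLine(open.IsGenericTypeDefinition);

        // Build the constructed type List<int> at runtime and create an instance.
        Type constructed = open.MakeGenericType(typeof(int));
        object instance = Activator.CreateInstance(constructed);

        Console.WriteLine(constructed.IsGenericType);
        Console.WriteLine(instance is List<int>);
    }
}
```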

Supporting generics through the two-step approach has one big disadvantage. When compiling a type such as MyList <T>, the C# compiler does not know what type will ultimately be used instead of T, and therefore it can generate code based only on what it does know.

In many cases, if you have questions about why generics look the way they do in C# or why they can’t do something that C++ can do, it will help to ask yourself what the compiler knows at the time the generic type is compiled.

Constraints

As described in the previous section, generic code is quite limited in what it can do with a value of type T. Returning to the example, perhaps you want to create the MyConstructedList <T> class, which will initialize each element when the class is created. In this class, you write the following constructor:

public MyConstructedList(int capacity)
{
    m_values = new T[capacity];
    for (int i = 0; i < capacity; i++)
    {
       m_values[i] = new T();
    }
}

That doesn’t compile. The type T could be any type in .NET, which means that all the compiler knows is that type T can do what type object can. It does not know that it has a parameterless constructor, so trying to write this:

new T();

is illegal. What is needed is a way to specify that MyConstructedList <T> can be used for only those types that have such a constructor. This is done by introducing a constraint5 on the declaration of the generic type.

class MyConstructedList <T> where T : new()

At this point, the compiler knows that a parameterless constructor will always be there for type T.
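A minimal sketch of the constrained class in use follows. The Counter class is a hypothetical element type invented for the example, and the indexer is added only so the result can be inspected.

```csharp
using System;

// Hypothetical element type with a parameterless constructor.
class Counter
{
    public int Value = 1;
}

class MyConstructedList<T> where T : new()
{
    T[] m_values;
    public MyConstructedList(int capacity)
    {
        m_values = new T[capacity];
        for (int i = 0; i < capacity; i++)
        {
            m_values[i] = new T();   // legal because of the new() constraint
        }
    }
    public T this[int index]
    {
        get { return m_values[index]; }
    }
}

static class ConstructedListDemo
{
    static void Main()
    {
        MyConstructedList<Counter> counters = new MyConstructedList<Counter>(3);
        Console.WriteLine(counters[2].Value);   // prints 1

        // MyConstructedList<string> would not compile:
        // string has no public parameterless constructor.
    }
}
```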

Interface Constraints

You now want to extend your list so that it is sortable. To do so, you’re going to have to write code that compares two values.

if (m_values[x].CompareTo(m_values[y]) > 0)

This is, of course, illegal, because the compiler doesn’t know if there is a way to compare two T values. You can address this by adding an interface constraint.

class MySortedList <T> where T: IComparable

The compiler will now require that T implements the IComparable interface. Since a class can implement more than one interface, a generic class can specify more than one interface constraint.
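As a sketch of how the comparison fits into a sortable list, the following class uses a simple exchange sort. The sort algorithm and the Sort() method name are assumptions chosen for brevity; the text does not say which algorithm is intended.

```csharp
using System;

class MySortedList<T> where T : IComparable
{
    int m_count = 0;
    T[] m_values;
    public MySortedList(int capacity)
    {
        m_values = new T[capacity];
    }
    public void Add(T value)
    {
        m_values[m_count] = value;
        m_count++;
    }
    public T this[int index]
    {
        get { return m_values[index]; }
    }
    public int Count { get { return m_count; } }

    // Simple exchange sort into ascending order; chosen for brevity, not speed.
    public void Sort()
    {
        for (int x = 0; x < m_count - 1; x++)
        {
            for (int y = x + 1; y < m_count; y++)
            {
                // CompareTo() is available because of the IComparable constraint.
                if (m_values[x].CompareTo(m_values[y]) > 0)
                {
                    T temp = m_values[x];
                    m_values[x] = m_values[y];
                    m_values[y] = temp;
                }
            }
        }
    }
}

static class SortedListDemo
{
    static void Main()
    {
        MySortedList<int> list = new MySortedList<int>(3);
        list.Add(3);
        list.Add(1);
        list.Add(2);
        list.Sort();
        Console.WriteLine("{0} {1} {2}", list[0], list[1], list[2]);   // prints 1 2 3
    }
}
```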

Base Class Constraints

It is also possible to specify that a type parameter be a specific class or a class derived from that class.

class Processor <T> where T: Employee

Any instance method that is defined on Employee can now be used through T.
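A short sketch of a base class constraint in use is shown here. The members of Employee, the Manager class, and the Process() method are all hypothetical; the text names only Employee and Processor <T>.

```csharp
using System;

// Hypothetical class hierarchy for illustration.
class Employee
{
    public string Name;
    public Employee(string name) { Name = name; }
    public virtual string Describe() { return "Employee " + Name; }
}
class Manager : Employee
{
    public Manager(string name) : base(name) { }
    public override string Describe() { return "Manager " + Name; }
}

class Processor<T> where T : Employee
{
    public string Process(T item)
    {
        // Describe() can be called because T is known to be an Employee.
        return item.Describe();
    }
}

static class ProcessorDemo
{
    static void Main()
    {
        Processor<Manager> processor = new Processor<Manager>();
        Console.WriteLine(processor.Process(new Manager("Kim")));   // prints Manager Kim

        // Processor<string> would not compile: string does not derive from Employee.
    }
}
```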

Note  I’m not a big fan of base class constraints. The point of creating a generic class is to write code that is generic, and tying that to a specific class seems to make things less generic. I think that constraints on interfaces are generally a better idea.

Class and Struct Constraints

If you want to constrain your class so that the type parameter is only a class or only a struct, a class or struct constraint can be used.

class Processor <T> where T: class
class Executor <T> where T: struct
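One way to see what each constraint permits is the sketch below. The fields Current and Pending are hypothetical members added for illustration.

```csharp
using System;

class Processor<T> where T : class
{
    // Assigning null is allowed because T is known to be a reference type.
    public T Current = null;
}

class Executor<T> where T : struct
{
    // T? (Nullable<T>) requires T to be a non-nullable value type.
    public T? Pending;
}

static class ClassStructDemo
{
    static void Main()
    {
        Processor<string> processor = new Processor<string>();
        Console.WriteLine(processor.Current == null);   // prints True

        Executor<int> executor = new Executor<int>();
        Console.WriteLine(executor.Pending.HasValue);   // prints False
        executor.Pending = 5;
        Console.WriteLine(executor.Pending.Value);      // prints 5

        // Processor<int> and Executor<string> would not compile.
    }
}
```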

Multiple Constraints

It is possible to put multiple constraints on a single type parameter or to add constraints to more than one type parameter.

class Storer <T, U>
    where T: IComparable, IEnumerable
    where U: class

The constraints for a given type parameter are listed in a comma-separated list, and each type parameter has a separate where clause.
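The following sketch gives Storer <T, U> a few hypothetical methods, one for each capability that the constraints provide. It uses string as the argument for T because string implements both IComparable and IEnumerable.

```csharp
using System;
using System.Collections;

class Storer<T, U>
    where T : IComparable, IEnumerable
    where U : class
{
    public int CountItems(T value)
    {
        int count = 0;
        foreach (object item in value)   // available through IEnumerable
        {
            count++;
        }
        return count;
    }
    public bool IsLarger(T left, T right)
    {
        return left.CompareTo(right) > 0;   // available through IComparable
    }
    public bool IsMissing(U tag)
    {
        return tag == null;   // U is known to be a reference type
    }
}

static class StorerDemo
{
    static void Main()
    {
        Storer<string, object> storer = new Storer<string, object>();
        Console.WriteLine(storer.CountItems("abc"));   // prints 3
        Console.WriteLine(storer.IsLarger("b", "a"));
        Console.WriteLine(storer.IsMissing(null));     // prints True
    }
}
```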

The Default Value of a Type

It is sometimes necessary to write code that initializes a variable. If the generic type is unconstrained, the type argument could be either a struct or a class, and you therefore need a way to do the appropriate thing. You can write the following:

value = default(T);

which will set the value to null if the generic type is a class and zero it out if the type is a struct.
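A small sketch shows the result for a class, a built-in value type, and a struct. The Holder <T> class is hypothetical.

```csharp
using System;

// Hypothetical unconstrained generic class that can reset its contents.
class Holder<T>
{
    T m_value;
    public T Value
    {
        get { return m_value; }
        set { m_value = value; }
    }
    public void Reset()
    {
        m_value = default(T);   // null for classes, zeroed for structs
    }
}

static class DefaultDemo
{
    static void Main()
    {
        Holder<string> text = new Holder<string>();
        text.Value = "abc";
        text.Reset();
        Console.WriteLine(text.Value == null);                 // prints True

        Holder<int> number = new Holder<int>();
        number.Value = 42;
        number.Reset();
        Console.WriteLine(number.Value);                       // prints 0

        Holder<DateTime> date = new Holder<DateTime>();
        date.Value = new DateTime(2000, 1, 1);
        date.Reset();
        Console.WriteLine(date.Value == DateTime.MinValue);    // prints True
    }
}
```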

Generic Interfaces and Inheritance

Just as classes can be generic, interfaces can be generic as well. Here’s an example:

interface IMyList <T>
{
    void Add(T value);
}

Specifying the generic interface imposes a requirement that classes that implement the interface contain an appropriate method. For example, a generic class would be a match.

class MyList <T>: IMyList <T>
{
    public void Add(T value) {...}
}

You can also match with a nongeneric class.

class NewIntList : IMyList <int>
{
    public void Add(int value) { }
}

A class can also derive from a constructed type and list the constructed interface at the same time:

class NewIntList : MyList <int>, IMyList <int> {}
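The benefit of the interface is that calling code can work with any implementation. In this sketch the bodies of both classes are assumptions (the original elides them), and the Fill() helper is hypothetical.

```csharp
using System;
using System.Collections.Generic;

interface IMyList<T>
{
    void Add(T value);
}

// Generic implementation; storing items in a List<T> is an assumption for brevity.
class MyList<T> : IMyList<T>
{
    List<T> m_items = new List<T>();
    public void Add(T value) { m_items.Add(value); }
    public int Count { get { return m_items.Count; } }
}

// Nongeneric implementation of the constructed interface IMyList<int>.
class NewIntList : IMyList<int>
{
    public int Total;
    public void Add(int value) { Total += value; }
}

static class InterfaceDemo
{
    // Works with any class that implements IMyList<int>.
    public static void Fill(IMyList<int> list)
    {
        list.Add(1);
        list.Add(2);
    }

    static void Main()
    {
        MyList<int> generic = new MyList<int>();
        NewIntList nongeneric = new NewIntList();
        Fill(generic);
        Fill(nongeneric);
        Console.WriteLine(generic.Count);      // prints 2
        Console.WriteLine(nongeneric.Total);   // prints 3
    }
}
```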

Generic Methods

Generic methods are used when the thing you want to make generic is an algorithm rather than a class. Consider the following simple method in the Shuffler class:

public static List <string> Shuffle(List <string> list1, List <string> list2)
{
    List <string> shuffled = new List <string> ();
    for (int i = 0; i < list1.Count; i++)
    {
       shuffled.Add(list1[i]);
       shuffled.Add(list2[i]);
    }
    return shuffled;
}

This method is called as follows:

List <string> shuffledList = Shuffler.Shuffle(list1, list2);

The method that is used to perform the shuffle is not dependent on the type being string, so it can easily be made generic by replacing all the instances of string with T.

public static List <T> Shuffle <T> (List <T> list1, List <T> list2)
{
    List <T> shuffled = new List <T> ();
    for (int i = 0; i < list1.Count; i++)
    {
       shuffled.Add(list1[i]);
       shuffled.Add(list2[i]);
    }
    return shuffled;
}

This method is called as follows:

List <string> shuffledList = Shuffler.Shuffle <string> (list1, list2);

The use of <string> tells the compiler what type to use to replace T in the generic method. If the generic type parameter (T in this case) is used in the arguments, the compiler is able to infer the generic type argument, and the call can be simplified to the following:

List <string> shuffledList = Shuffler.Shuffle(list1, list2);

The first parameter of the Shuffle() method is a List <T>, and you are passing a List <string>, so T must be string in this call.
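A runnable sketch of the generic method with both call styles follows. The sample data is made up, and like the original method it assumes that both lists have the same number of items.

```csharp
using System;
using System.Collections.Generic;

public static class Shuffler
{
    // Interleaves two lists; assumes list2 has at least as many items as list1.
    public static List<T> Shuffle<T>(List<T> list1, List<T> list2)
    {
        List<T> shuffled = new List<T>();
        for (int i = 0; i < list1.Count; i++)
        {
            shuffled.Add(list1[i]);
            shuffled.Add(list2[i]);
        }
        return shuffled;
    }
}

static class ShufflerDemo
{
    static void Main()
    {
        List<string> list1 = new List<string> { "a", "b" };
        List<string> list2 = new List<string> { "x", "y" };

        // T is inferred as string from the arguments.
        List<string> shuffledList = Shuffler.Shuffle(list1, list2);
        Console.WriteLine(string.Join(",", shuffledList));   // prints a,x,b,y

        // The type argument can also be given explicitly.
        List<int> numbers = Shuffler.Shuffle<int>(
            new List<int> { 1, 2 }, new List<int> { 10, 20 });
        Console.WriteLine(numbers.Count);                    // prints 4
    }
}
```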

Generic Delegates

For an introduction to delegates, see Chapter 22.

Generic delegates can be declared in a way similar to generic methods. In a generic class, the generic type parameter can be used in the declaration of a delegate.

public class Stack <T> 
{
    public delegate void ItemAdded(T newItem);
}

A delegate can also be declared with its own type parameters. For example, the base class library contains the following delegate:

public delegate void EventHandler <TEventArgs> (object sender, TEventArgs e)
                                              where TEventArgs : EventArgs

This delegate requires that the second argument be an instance of a class derived from the EventArgs class. It is now simple to declare events that follow the .NET convention without having to define your own type-specific delegate.

public event EventHandler <StackChangeEventArgs> StackChanged;
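A sketch of such an event being raised and handled is shown here. Only the names StackChanged and StackChangeEventArgs come from the text; the NewCount field, the Push() method, and the NotifyingStack <T> class name are hypothetical (the class is renamed to avoid confusion with the library Stack <T>).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event data class.
public class StackChangeEventArgs : EventArgs
{
    public int NewCount;
    public StackChangeEventArgs(int newCount) { NewCount = newCount; }
}

// Hypothetical stack that reports each change through a generic delegate.
public class NotifyingStack<T>
{
    List<T> m_items = new List<T>();

    public event EventHandler<StackChangeEventArgs> StackChanged;

    public void Push(T item)
    {
        m_items.Add(item);
        // Copy the delegate so the null check and the call use the same value.
        EventHandler<StackChangeEventArgs> handler = StackChanged;
        if (handler != null)
        {
            handler(this, new StackChangeEventArgs(m_items.Count));
        }
    }
}

static class StackEventDemo
{
    static void Main()
    {
        NotifyingStack<string> stack = new NotifyingStack<string>();
        stack.StackChanged += (sender, e) => Console.WriteLine("Count is now " + e.NewCount);
        stack.Push("first");    // prints Count is now 1
        stack.Push("second");   // prints Count is now 2
    }
}
```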

Covariance and Contravariance

Covariance and contravariance are big terms that describe how conversions are performed between types.6

Consider the following:

class Auto
{
}
class Sedan: Auto
{
}
void ReferenceCovariance()
{
    Sedan dodgeDart = new Sedan();
    Auto currentCar = dodgeDart;
}

This works exactly as you would expect; because Sedan is derived from Auto, every Sedan is an Auto, and therefore you can safely make this assignment.

When you extend this to arrays of reference types, it gets more interesting.

void ArrayCovariance()
{
    Sedan[] sedans = new Sedan[1];
    sedans[0] = new Sedan();
    Auto[] autos = sedans;
    autos[0] = new Roadster();
}

It is useful to be able to assign an array of Sedan instances to an array of Auto instances; this allows you to write methods that take an array of Auto instances as a parameter. Unfortunately, it isn’t typesafe; the last statement in the method assigns a Roadster instance (assume that Roadster is another class derived from Auto) to the autos array. That would be fine if the autos array were actually of type Auto[], but it is in fact of type Sedan[], and the assignment fails at runtime.7
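The runtime failure can be observed directly. In this sketch, Roadster is assumed to be a second class derived from Auto, since the earlier listing does not define it.

```csharp
using System;

class Auto { }
class Sedan : Auto { }
class Roadster : Auto { }   // assumed: another class derived from Auto

static class ArrayCovarianceDemo
{
    static void Main()
    {
        Sedan[] sedans = new Sedan[1];
        sedans[0] = new Sedan();
        Auto[] autos = sedans;   // allowed by array covariance

        try
        {
            // Compiles, but the runtime checks the real element type (Sedan).
            autos[0] = new Roadster();
        }
        catch (ArrayTypeMismatchException)
        {
            Console.WriteLine("Cannot store a Roadster in a Sedan[]");
        }
    }
}
```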

This behavior is a bit unfortunate. It would be nice if generics provided a better solution. Consider the following example:

interface IFirstItem <T>
{
    T GetFirstItem();
}
class MyFirstList <T> : List <T>, IFirstItem <T>
{
    public MyFirstList () { }
    public T GetFirstItem()
    {
       return this[0];
    }
}

Here you define an interface named IFirstItem <T> and a list class that implements it. You then write some code to use it.

void TestService()
{
    MyFirstList <Sedan> sedans = new MyFirstList <Sedan> ();
    sedans.Add(new Sedan());
    PerformService(sedans);
}
void PerformService(IFirstItem <Auto> autos)
{
}

You are passing an IFirstItem <Sedan> to a function that takes an IFirstItem <Auto>, and that’s not allowed. The compiler is worried that PerformService() will lose the fact that the Auto is really a Sedan and try to do something that will generate an exception. It’s the same situation you had with the array.

If you examine the IFirstItem <T> interface, you will realize that there is no issue; the only thing that it does is pull an instance of type T out, and there is no way for that to cause an issue. What you need is a way to tell the compiler that the type parameter T is used only as output.

You can do that through the following:

interface IFirstItem <out T>
{
    T GetFirstItem();
}

The code now works. This is an example of generic covariance; the compiler now knows that it is safe to convert from the type of T to a less-derived type, so it allows you to do the conversion.
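Assembled into one program, the covariant version might look like the following sketch. The Describe() method and the return value of PerformService() are hypothetical additions that make the result visible.

```csharp
using System;
using System.Collections.Generic;

class Auto
{
    public virtual string Describe() { return "Auto"; }
}
class Sedan : Auto
{
    public override string Describe() { return "Sedan"; }
}

// "out" marks T as covariant: T appears only in output positions.
interface IFirstItem<out T>
{
    T GetFirstItem();
}

class MyFirstList<T> : List<T>, IFirstItem<T>
{
    public T GetFirstItem()
    {
        return this[0];
    }
}

static class CovarianceDemo
{
    public static string PerformService(IFirstItem<Auto> autos)
    {
        return autos.GetFirstItem().Describe();
    }

    static void Main()
    {
        MyFirstList<Sedan> sedans = new MyFirstList<Sedan>();
        sedans.Add(new Sedan());

        // IFirstItem<Sedan> converts to IFirstItem<Auto> because T is covariant.
        Console.WriteLine(PerformService(sedans));   // prints Sedan
    }
}
```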

You now try to extend the interface by adding an additional method.

interface IFirstItem <out T>
{
    T GetFirstItem();
    void NotLegal(T parameter);
}

This generates an error.8 You said that you are going to use the generic parameter T only for output, but the NotLegal() method uses it for input.

Contravariance

Contravariance applies in a different case. Consider the following:

interface IEqual <in T>
{
    bool IsEqual(T x, T y);
}
class Comparer : IEqual <object>
{
    public bool IsEqual(object x, object y)
    {
       return true;
    }
}
class GenericContravariance
{
    void Example()
    {
       Comparer comparer = new Comparer();
       TestEquality(comparer);
    }
    void TestEquality(IEqual <Auto> equalizer)
    {
    }
}

In this case, instances of type T flow only into the interface and are never visible outside of the interface. That allows you to do something that seems a bit surprising; you can pass an IEqual <object> for use as an IEqual <Auto>. That just seems wrong. However, if you look a bit closer, you will figure out that if you have an IEqual <Auto>, you will want to use it in code such as this:

Auto auto1 = ...;
Auto auto2 = ...;
bool equal = equalizer.IsEqual(auto1, auto2);

In that situation, it is perfectly safe to use an IEqual <object>, since you can safely convert the Auto arguments into object arguments. You indicate this situation by adding the in keyword to the type parameter.
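A runnable sketch of the same idea follows. It departs from the listing above in two labeled ways: IsEqual() uses object.Equals() instead of always returning true, and TestEquality() takes the two Auto instances as extra parameters so that the result can be shown.

```csharp
using System;

class Auto { }

// "in" marks T as contravariant: T appears only in input positions.
interface IEqual<in T>
{
    bool IsEqual(T x, T y);
}

class Comparer : IEqual<object>
{
    public bool IsEqual(object x, object y)
    {
        // Assumption for the demo; Auto does not override Equals(),
        // so this is reference equality.
        return object.Equals(x, y);
    }
}

static class ContravarianceDemo
{
    public static bool TestEquality(IEqual<Auto> equalizer, Auto auto1, Auto auto2)
    {
        // The Auto arguments are safely passed as object arguments.
        return equalizer.IsEqual(auto1, auto2);
    }

    static void Main()
    {
        Comparer comparer = new Comparer();
        Auto car = new Auto();

        // IEqual<object> converts to IEqual<Auto> because T is contravariant.
        Console.WriteLine(TestEquality(comparer, car, car));          // prints True
        Console.WriteLine(TestEquality(comparer, car, new Auto()));   // prints False
    }
}
```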

Generics and Efficiency

As you learned earlier in this chapter, the runtime will replace all generic type parameters with their appropriate argument types when constructing instances of those types. Such an implementation could result in a considerable amount of memory use, with separate implementations for List <string>, List <Employee>, and all other uses of List <T>.

The .NET Runtime avoids this cost for reference types. Variables of type string and Employee are the same size (each holds a reference), so the generated code is identical for all reference types except for the type of the arguments. The runtime therefore generates that code only once and shares it.

Value types are of differing sizes, and the runtime therefore generates a different implementation for each use of a value type as a generic type argument.

Generic Naming Guidelines

Generic type names show up in two places.

  • In the declaration of the generic type and therefore any time a developer is writing code using the generic type
  • In the implementation of the generic type

It is helpful to choose generic type parameter names that aid in the understanding of both of these cases. I suggest the following guidelines when naming generic type parameters:

  • If there is a single type parameter that can be any type, name it T.
  • If a single type parameter has a nongeneric meaning, include that meaning in the name, and name it something like TEntity or TComparable. This will make it much more understandable for the user of the generic type.
  • If there are multiple type parameters, give them useful names, such as in Dictionary <TKey, TValue>.9

Beyond that, consider whether a particular name improves readability.

1 Technically, a generic type parameter.

2 This is symmetrical with how methods work; methods define parameters and are called with arguments.

3 There are the “big 3” Microsoft .NET languages (Visual Basic .NET, C#, and C++), but there are also numerous less common languages, and giving them access to generic types was an important goal.

4 I’m talking about standard C++, not the Microsoft version that includes .NET support.

5 This use of the term constraint is a bit odd; you add a constraint so that your generic code can do more than it could before, and generally, the tighter the constraint, the more you can do. If you look at it from the perspective of the generic type argument, it makes more sense.

6 The information in this section may make your head hurt. This is perfectly normal; it makes my head hurt as well.

7 The reason for this behavior is a bit complex. If you want all the details, Eric Lippert has an excellent series of blog posts on covariance and contravariance in C#.

8 Invalid variance: The type parameter T must be contravariantly valid on IFirstItem <T>.NotLegal(T). T is covariant.

9 There has been considerable discussion about the best naming convention for this case. The one I have given is consistent with the .NET Framework Design Guidelines, but the T prefix does seem out of place at times.
