Chapter 2. Types and Exceptions

Before you begin drilling down into the Microsoft .NET Framework class library (FCL) and the various programming models that it supports, it’s helpful to understand what the FCL is made of. The FCL is a library of “types,” which is a generic way of referring to classes, structs, interfaces, enumerations, and delegates. This chapter defines these terms and will make Chapter 3 more meaningful to developers who are new to the .NET Framework. This chapter also introduces some potential pitfalls related to types, including common errors that arise when using types that encapsulate file handles and other resources that aren’t managed by the garbage collector.

Understanding the .NET Framework’s type system and the differences between the various kinds of data types that it supports is important, but so is understanding how types are loaded, versioned, and deployed. Types are packaged in assemblies. The FCL is a set of many different assemblies, each one of them shared so that any application can use it. Applications, too, are deployed as assemblies. You already know that an assembly is a group of one or more files. You even created a single-file assembly (Hello.exe) in Chapter 1. What you don’t know—yet—is how to create assemblies of your own that contain types that can be used by other programs. You’ll remedy that in this chapter by building and deploying a multifile, multilanguage assembly, and then building a client that dynamically links to it. In addition to gaining valuable insight into how the FCL works, you’ll see what it takes to build class libraries of your own and learn how to use the assembly-based versioning mechanism in the common language runtime (CLR) to avoid DLL Hell—the term used to describe what happens when a fix made to a DLL for the benefit of one application breaks another application (or perhaps breaks the very application that the fix was intended to help).

Finally, you’ll learn about exception handling. Applications that use the .NET Framework employ a C++-like try/catch mechanism to achieve robustness without having to check the return value from each and every method call. In fact, checking return values does little good because the CLR and FCL don’t flag errors by returning error codes; they throw exceptions. A working knowledge of how exception handling works and how languages such as C# expose the CLR’s exception handling mechanism is essential to becoming a proficient .NET Framework programmer.

.NET Framework Data Types

The C in FCL stands for “class,” but the FCL isn’t strictly a class library; it’s a library of types. Types can mean any of the following:

  • Classes

  • Structs

  • Interfaces

  • Enumerations

  • Delegates

Understanding what a type is and how one type differs from another is crucial to understanding the FCL. The information in the next several sections will not only enrich your understanding of the FCL, but also help you when the time comes to build data types of your own.

Classes

A class in the .NET Framework is similar to a class in C++: a bundle of code and data that is instantiated to form objects. Classes in traditional object-oriented programming languages such as C++ contain member variables and member functions. Framework classes are richer and can contain the following members:

  • Fields, which are analogous to member variables in C++

  • Methods, which are analogous to member functions in C++

  • Properties, which expose data in the same way fields do but are in fact implemented using accessor (get and set) methods

  • Events, which define the notifications a class is capable of firing

Here, in C#, is a class that implements a Rectangle data type:

class Rectangle
{
    // Fields
    protected int width = 1;
    protected int height = 1;

    // Properties
    public int Width
    {
        get { return width; }
        set
        {
            if (value > 0)
                width = value;
            else
                throw new ArgumentOutOfRangeException (
                    "value", "Width must be 1 or higher");
        }
    }

    public int Height
    {
        get { return height; }
        set
        {
            if (value > 0)
                height = value;
            else
                throw new ArgumentOutOfRangeException (
                    "value", "Height must be 1 or higher");
        }
    }

    public int Area
    {
        get { return width * height; }
    }

    // Methods (constructors)
    public Rectangle () {}
    public Rectangle (int cx, int cy)
    {
        Width = cx;
        Height = cy;
    }
}

Rectangle has seven class members: two fields, three properties, and two methods, which both happen to be constructors—special methods that are called each time an instance of the class is created. The fields are protected, which means that only Rectangle and Rectangle derivatives can access them. To read or write a Rectangle object’s width and height, a client must use the Width and Height properties. Notice that these properties’ set accessors throw an exception if an illegal value is entered, a protection that couldn’t be afforded had Rectangle’s width and height been exposed through publicly declared fields. Area is a read-only property because it lacks a set accessor. A compiler will flag attempts to write to the Area property with compilation errors.

Many languages that target the .NET Framework feature a new operator for instantiating objects. The following statements create instances of Rectangle in C#:

Rectangle rect = new Rectangle ();     // Use first constructor
Rectangle rect = new Rectangle (3, 4); // Use second constructor

Once the object is created, it might be used like this:

rect.Width *= 2;      // Double the rectangle’s width
int area = rect.Area; // Get the rectangle’s new area
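To see the set accessors' validation at work, here is a minimal, self-contained sketch. The Rectangle class is a trimmed-down copy of the one above, with Height left as a plain auto-free field-backed property to keep the listing short. RectangleDemo and its TrySetWidth helper are hypothetical names introduced only for this illustration, and the sketch uses the two-argument ArgumentOutOfRangeException constructor (parameter name, then message).

```csharp
using System;

class Rectangle
{
    protected int width = 1;
    protected int height = 1;

    public int Width
    {
        get { return width; }
        set
        {
            if (value > 0)
                width = value;
            else
                throw new ArgumentOutOfRangeException (
                    "value", "Width must be 1 or higher");
        }
    }

    public int Height
    {
        get { return height; }
        set { height = value; }   // Validation omitted for brevity
    }

    public int Area
    {
        get { return width * height; }
    }

    public Rectangle () {}
    public Rectangle (int cx, int cy)
    {
        Width = cx;
        Height = cy;
    }
}

class RectangleDemo
{
    // Hypothetical helper: returns true if the new width was accepted
    public static bool TrySetWidth (Rectangle rect, int newWidth)
    {
        try {
            rect.Width = newWidth;
            return true;
        }
        catch (ArgumentOutOfRangeException) {
            return false;   // The set accessor rejected the value
        }
    }

    static void Main ()
    {
        Rectangle rect = new Rectangle (3, 4);
        rect.Width *= 2;                            // Width is now 6
        Console.WriteLine (rect.Area);              // 24
        Console.WriteLine (TrySetWidth (rect, 0));  // False
        Console.WriteLine (rect.Width);             // 6 (unchanged)
        // rect.Area = 10;  // Would not compile: Area has no set accessor
    }
}
```

Because the rejected assignment throws before width is modified, the object is never left holding an illegal value.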

Significantly, neither C# nor any other .NET programming language has a delete operator. You create objects, but the garbage collector deletes them.

In C#, classes define reference types, which are allocated on the garbage-collected heap (often called the managed heap because it’s managed by the garbage collector) and accessed through references that abstract underlying pointers. The counterpart to the reference type is the value type, which you’ll learn about in the next section. Most of the time you don’t have to be concerned about the differences between the two, but occasionally the differences become very important and can introduce subtle bugs if not accounted for. See the section “Boxing and Unboxing” later in this chapter for details.

All classes inherit a virtual method named Finalize from System.Object, which is the ultimate root class for all data types. Finalize is called just before an object is destroyed by the garbage collector. The garbage collector frees the object’s memory, but classes that wrap file handles, window handles, and other unmanaged resources (“unmanaged” because they’re not freed by the garbage collector) must override Finalize and use it to free those resources. This, too, has some important implications for developers. I’ll say more later in this chapter in the section entitled “Nondeterministic Destruction.”

Incidentally, classes can derive from at most one other class, but they can derive from one class and any number of interfaces. When you read the documentation for FCL classes, don’t be surprised if you occasionally see long lists of base “classes,” which really aren’t classes at all, but interfaces. Also be aware that if you don’t specify a base class when declaring a class, your class derives implicitly from System.Object. Consequently, you can call ToString and other System.Object methods on any object.

Structs

Classes are intended to represent complex data types. Because class instances are allocated on the managed heap, some overhead is associated with creating and destroying them. Some types, however, are “simple” types that would benefit from being created on the stack, which lives outside the purview of the garbage collector and offers a high-performance alternative to the managed heap. Bytes and integers are examples of simple data types.

That’s why the .NET Framework supports value types as well as reference types. In C#, value types are defined with the struct keyword. Value types generally impose less overhead than reference types because a value type declared as a local variable is allocated on the stack, not the heap. Bytes, integers, and most of the other “primitive” data types that the CLR supports are value types.

Here’s an example of a simple value type:

struct Point
{
    public int x;
    public int y;
    public Point (int x, int y)
    {
        this.x = x;
        this.y = y;
    }
}

Point stores x and y coordinates in fields exposed directly to clients. It also defines a constructor that can be used to instantiate and initialize a Point in one operation. A Point can be instantiated in any of the following ways:

Point point = new Point (3, 4); // x==3, y==4
Point point = new Point ();     // x==0, y==0
Point point;                    // x==0, y==0

Note that even though the first two statements appear to create a Point object on the heap, in reality the object is created on the stack. If you come from a C++ heritage, get over the notion that new always allocates memory on the heap. Also, despite the fact that the third statement creates a Point object whose fields hold zeros, C# considers the Point to be uninitialized and won’t let you use it until you explicitly assign values to x and y.
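The three forms of instantiation can be compared side by side in a short program. This is only a sketch: PointDemo is a hypothetical class name, and the commented-out line shows the kind of statement the compiler rejects until every field has been assigned.

```csharp
using System;

struct Point
{
    public int x;
    public int y;
    public Point (int x, int y)
    {
        this.x = x;
        this.y = y;
    }
}

class PointDemo
{
    static void Main ()
    {
        Point a = new Point (3, 4); // Initialized by the constructor
        Point b = new Point ();     // Fields zeroed
        Point c;                    // Declared but not yet usable
        // Console.WriteLine (c.x); // Would not compile: c.x is unassigned
        c.x = 5;
        c.y = 6;                    // Now every field is assigned

        Console.WriteLine ("({0}, {1})", a.x, a.y);   // (3, 4)
        Console.WriteLine ("({0}, {1})", b.x, b.y);   // (0, 0)
        Console.WriteLine ("({0}, {1})", c.x, c.y);   // (5, 6)
    }
}
```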

Value types are subject to some restrictions that reference types are not. Value types can’t derive from other types, although they implicitly derive from System.ValueType and can (and often do) derive from interfaces. They also shouldn’t wrap unmanaged resources such as file handles because value types have no way to release those resources when they’re destroyed. Even though value types inherit a Finalize method from System.Object, Finalize is never called because the garbage collector ignores objects created on the stack.

Interfaces

An interface is a group of zero or more abstract methods—methods that have no default implementation but that are to be implemented in a class or struct. Interfaces can also include properties and events, although methods are far more common.

An interface defines a contract between a type and users of that type. For example, many of the classes in the System.Collections namespace derive from an interface named IEnumerable. IEnumerable defines a single method, GetEnumerator, which returns an enumerator object for iterating over the items in a collection. It’s because the FCL’s collection classes implement IEnumerable that C#’s foreach keyword can be used with them. At run time, the code generated from foreach uses IEnumerable’s GetEnumerator method to iterate over the collection’s contents.
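The relationship between foreach and IEnumerable can be sketched by writing the same loop twice. The second method below is a simplified approximation of what the compiler generates (the real expansion also takes care of disposing the enumerator when appropriate). ForEachDemo and its method names are hypothetical.

```csharp
using System;
using System.Collections;

class ForEachDemo
{
    // Sums the items using foreach
    public static int SumWithForEach (IEnumerable items)
    {
        int sum = 0;
        foreach (int item in items)
            sum += item;
        return sum;
    }

    // Roughly what the compiler produces for the loop above
    public static int SumWithEnumerator (IEnumerable items)
    {
        int sum = 0;
        IEnumerator e = items.GetEnumerator ();
        while (e.MoveNext ())
            sum += (int) e.Current;   // Current returns object, so unbox
        return sum;
    }

    static void Main ()
    {
        ArrayList list = new ArrayList ();
        list.Add (1);
        list.Add (2);
        list.Add (3);
        Console.WriteLine (SumWithForEach (list));     // 6
        Console.WriteLine (SumWithEnumerator (list));  // 6
    }
}
```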

Interfaces are defined with C#’s interface keyword:

interface ISecret
{
    void Encrypt (byte[] inbuf, out byte[] outbuf, Key key);
    void Unencrypt (byte[] inbuf, out byte[] outbuf, Key key);
}

A class or struct that wants to implement an interface simply derives from it and provides concrete implementations of its methods:

class Message : ISecret
{
    public void Encrypt (byte[] inbuf, out byte[] outbuf, Key key)
    {
      ...
    }

    public void Unencrypt (byte[] inbuf, out byte[] outbuf, Key key)
    {
      ...
    }
}

In C#, the is keyword can be used to determine whether an object implements a given interface. If msg is an object that implements ISecret, then in this example, is returns true; otherwise, it returns false:

if (msg is ISecret) {
    ISecret secret = (ISecret) msg;
    secret.Encrypt (...);
}

The related as operator can be used to test an object for an interface and cast it to the interface type with a single statement.
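Here is one way the as operator might be used. The sketch is self-contained, so it supplies its own stand-ins: Key is an empty placeholder class (the real Key type isn't shown in this chapter), the XOR "cipher" inside Message is a toy, and AsDemo and TryEncrypt are hypothetical names.

```csharp
using System;

class Key {}   // Placeholder for the Key type assumed by ISecret

interface ISecret
{
    void Encrypt (byte[] inbuf, out byte[] outbuf, Key key);
    void Unencrypt (byte[] inbuf, out byte[] outbuf, Key key);
}

class Message : ISecret
{
    // Toy transformation for illustration only: XOR each byte with 0xFF
    public void Encrypt (byte[] inbuf, out byte[] outbuf, Key key)
    {
        outbuf = new byte[inbuf.Length];
        for (int i = 0; i < inbuf.Length; i++)
            outbuf[i] = (byte) (inbuf[i] ^ 0xFF);
    }

    public void Unencrypt (byte[] inbuf, out byte[] outbuf, Key key)
    {
        Encrypt (inbuf, out outbuf, key);   // XOR is its own inverse
    }
}

class AsDemo
{
    // Returns true if obj implements ISecret and the data was encrypted
    public static bool TryEncrypt (object obj, byte[] data, out byte[] result)
    {
        ISecret secret = obj as ISecret;   // null if obj isn't an ISecret
        if (secret == null) {
            result = null;
            return false;
        }
        secret.Encrypt (data, out result, new Key ());
        return true;
    }

    static void Main ()
    {
        byte[] data = { 1, 2, 3 };
        byte[] output;
        Console.WriteLine (TryEncrypt (new Message (), data, out output)); // True
        Console.WriteLine (TryEncrypt ("plain text", data, out output));   // False
    }
}
```

Unlike a cast, as never throws when the object doesn't support the interface; it simply yields null, which is why the result must be checked before it's used.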

Enumerations

Enumerations in the .NET Framework are similar to enumerations in C++. They’re types that consist of a set of named constants, and in C# they’re defined with the enum keyword. Here’s a simple enumerated type named Color:

enum Color
{
    Red,
    Green,
    Blue
}

With Color thus defined, colors can be represented this way:

Color.Red    // Red
Color.Green  // Green
Color.Blue   // Blue

Many FCL classes use enumerated types as method parameters. For example, if you use the Regex class to parse text and want the parsing to be case-insensitive, you don’t pass a numeric value to Regex’s constructor; you pass a member of an enumerated type named RegexOptions:

Regex regex = new Regex (exp, RegexOptions.IgnoreCase);

Using words rather than numbers makes your code more readable. Nevertheless, because an enumerated type’s members are assigned numeric values (by default, 0 for the first member, 1 for the second, and so on), you can use a number in place of a member name if you prefer. In C#, doing so requires an explicit cast, as in (Color) 1.

The enum keyword isn’t simply a compiler keyword; it creates a bona fide type that implicitly derives from System.Enum. System.Enum defines methods that you can use to do some interesting things with enumerated types. For example, you can call GetNames on an enumerated type to enumerate the names of all its members. Try that in unmanaged C++!
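A short sketch shows both ideas: enumerating member names with the static Enum.GetNames method, and converting between members and their underlying numeric values with casts. EnumDemo is a hypothetical class name.

```csharp
using System;

enum Color
{
    Red,
    Green,
    Blue
}

class EnumDemo
{
    static void Main ()
    {
        // List the names of all members of Color
        foreach (string name in Enum.GetNames (typeof (Color)))
            Console.WriteLine (name);   // Red, Green, Blue (one per line)

        // Convert between members and their numeric values
        int n = (int) Color.Blue;
        Color c = (Color) 1;
        Console.WriteLine (n);          // 2
        Console.WriteLine (c);          // Green
    }
}
```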

Delegates

Newcomers to the .NET Framework often find delegates confusing. A delegate is a type-safe wrapper around a callback function. It’s rather simple to write an unmanaged C++ application that crashes when it performs a callback. It’s impossible to write a managed application that does the same, thanks to delegates.

Delegates are most commonly used to define the signatures of callback methods that are used to respond to events. For example, the FCL’s Timer class (a member of the System.Timers namespace) defines an event named Elapsed that fires whenever a preprogrammed timer interval elapses. Applications that want to respond to Elapsed events pass a Timer object a reference to the method they want called when an Elapsed event fires. The “reference” that they pass isn’t a raw memory address but rather an instance of a delegate that wraps the method’s memory address. The System.Timers namespace defines a delegate named ElapsedEventHandler for precisely that purpose.

If you could steal a look at the Timer class’s source code, you’d see something like this:

public delegate void ElapsedEventHandler (Object sender, ElapsedEventArgs e);

public class Timer
{
    public event ElapsedEventHandler Elapsed;
      .
      .
      .
}

Here’s how Timer fires an Elapsed event:

if (Elapsed != null) // Make sure somebody’s listening
    Elapsed (this, new ElapsedEventArgs (...)); // Fire!

And here’s how a client might use a Timer object to call a method named UpdateData every 60 seconds:

Timer timer = new Timer (60000);
timer.Elapsed += new ElapsedEventHandler (UpdateData);
  .
  .
  .
void UpdateData (Object sender, ElapsedEventArgs e)
{
    // Callback received!
}

As you can see, UpdateData conforms to the signature specified by the delegate. To register to receive Elapsed events, the client creates a new instance of ElapsedEventHandler that wraps UpdateData (note the reference to UpdateData passed to ElapsedEventHandler’s constructor) and wires it to timer’s Elapsed event using the += operator. This paradigm is used over and over in .NET Framework applications. Events and delegates are an important feature of the type system.

In practice, it’s instructive to know more about what happens under the hood when a compiler encounters a delegate definition. Suppose the C# compiler encounters code such as this:

public delegate void ElapsedEventHandler (Object sender, ElapsedEventArgs e);

It responds by generating a class that derives from System.MulticastDelegate. The delegate keyword is simply an alias for something that in this case looks like this:

public class ElapsedEventHandler : MulticastDelegate
{
    public ElapsedEventHandler (object target, int method)
    {
      ...
    }

    public virtual void Invoke (object sender, ElapsedEventArgs e)
    {
      ...
    }
  ...
}

The derived class inherits several important members from MulticastDelegate, including private fields that identify the method that the delegate wraps and the object instance that implements the method (assuming the method is an instance method rather than a static method). The compiler adds an Invoke method that calls the method that the delegate wraps. C# hides the Invoke method and lets you invoke a callback method simply by using a delegate’s instance name as if it were a method name.
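The pieces can be seen working together in a small program that doesn't depend on a real timer. Everything here is hypothetical and modeled on the Timer example: TickEventHandler stands in for ElapsedEventHandler, and Ticker stands in for Timer. The explicit call to Invoke assumes a compiler that permits it (C# 2.0 and later do); with the shorthand syntax the result is the same.

```csharp
using System;

// Hypothetical delegate modeled on ElapsedEventHandler
public delegate void TickEventHandler (object sender, EventArgs e);

// Hypothetical event source modeled on Timer
public class Ticker
{
    public event TickEventHandler Tick;

    public void FireTick ()
    {
        if (Tick != null)                  // Make sure somebody's listening
            Tick (this, EventArgs.Empty);  // Fire!
    }
}

class DelegateDemo
{
    static int count = 0;

    static void OnTick (object sender, EventArgs e)
    {
        count++;   // Callback received!
    }

    static void Main ()
    {
        Ticker ticker = new Ticker ();
        ticker.Tick += new TickEventHandler (OnTick);
        ticker.FireTick ();

        // A delegate instance can also be called directly
        TickEventHandler handler = new TickEventHandler (OnTick);
        handler (null, EventArgs.Empty);          // C# shorthand
        handler.Invoke (null, EventArgs.Empty);   // The method C# hides

        Console.WriteLine (count);                  // 3
        Console.WriteLine (handler.Method.Name);    // OnTick
        Console.WriteLine (handler.Target == null); // True (static method)
    }
}
```

The Method and Target properties inherited from the delegate base classes expose the wrapped method and the object instance it belongs to; Target is null here because OnTick is static.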

Boxing and Unboxing

The architects of the .NET Framework could have made every type a reference type, but they chose to support value types as well to avoid imposing undue overhead on the use of integers and other primitive data types. But there’s a downside to a type system with a split personality. To pass a value type to a method that expects a reference type, you must convert the value type to a reference type. You can’t convert a value type to a reference type per se, but you can box the value type. Boxing creates a copy of a value type on the managed heap. The opposite of boxing is unboxing, which, in C#, copies the boxed value from the managed heap back into a value-type variable. Common intermediate language (CIL) has instructions for performing boxing and unboxing.

Some compilers, the C# and Visual Basic .NET compilers among them, attempt to provide a unified view of the type system by hiding boxing and unboxing under the hood. The following code wouldn’t work without boxing because it stores an int in a Hashtable object, and Hashtable objects store references exclusively:

Hashtable table = new Hashtable (); // Create a Hashtable
table.Add ("First", 1);             // Add 1 keyed by "First"

Here’s the CIL emitted by the C# compiler:

newobj     instance void
           [mscorlib]System.Collections.Hashtable::.ctor()
stloc.0
ldloc.0
ldstr      "First"
ldc.i4.1
box        [mscorlib]System.Int32
callvirt   instance void
           [mscorlib]System.Collections.Hashtable::Add(object,
                                                       object)

Notice the box instruction that converts the integer value 1 to a boxed value type. The compiler emitted this instruction so that you wouldn’t have to think about reference types and value types. The string used to key the Hashtable entry (“First”) doesn’t have to be boxed because it’s an instance of System.String, and System.String is a reference type.

Many compilers are happy to box values without being asked to. For example, the following C# code compiles just fine:

int val = 1;      // Declare an instance of a value type
object obj = val; // Box it

But in C#, unboxing a boxed value requires an explicit cast:

int val = 1;
object obj = val;
int val2 = obj;       // This won’t compile
int val3 = (int) obj; // This will

You lose a bit of performance when you box or unbox a value, but in the vast majority of applications, such losses are more than offset by the added efficiency of storing simple data types on the stack rather than in the garbage-collected heap.
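Two consequences of boxing are easy to overlook: the boxed object is an independent copy, and an unboxing cast must name the same value type that was boxed. The following sketch demonstrates both. BoxingDemo and CanUnboxAsLong are hypothetical names.

```csharp
using System;

class BoxingDemo
{
    // Hypothetical helper: reports whether obj can be unboxed as a long
    public static bool CanUnboxAsLong (object obj)
    {
        try {
            long wide = (long) obj;   // Throws unless obj is a boxed long
            Console.WriteLine (wide);
            return true;
        }
        catch (InvalidCastException) {
            return false;
        }
    }

    static void Main ()
    {
        int val = 1;
        object obj = val;   // Box: copies the value 1 to the managed heap
        val = 2;            // Changes the original, not the boxed copy

        Console.WriteLine (val);                  // 2
        Console.WriteLine ((int) obj);            // 1
        Console.WriteLine (CanUnboxAsLong (obj)); // False: obj holds an int
    }
}
```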

Reference Types vs. Value Types

Thanks to boxing and unboxing, the dichotomy between value types and reference types is mostly transparent to the programmer. Sometimes, however, you must know which type you’re dealing with; otherwise, subtle differences between the two can impact your application’s behavior in ways that you might not expect.

Here’s an example. The following code defines a simple reference type (class) named Point. It also declares two Point references, p1 and p2. The reference p1 is initialized with a reference to a new Point object, and p2 is initialized by setting it equal to p1. Because p1 and p2 are little more than pointers in disguise, setting one equal to the other does not make a copy of the Point object; it merely copies an address. Therefore, modifying one Point affects both:

class Point
{
    public int x;
    public int y;
}
  .
  .
  .
Point p1 = new Point ();
p1.x = 1;
p1.y = 2;
Point p2 = p1; // Copies the underlying pointer
p2.x = 3;
p2.y = 4;

Console.WriteLine ("p1 = ({0}, {1})", p1.x, p1.y); // Writes "p1 = (3, 4)"
Console.WriteLine ("p2 = ({0}, {1})", p2.x, p2.y); // Writes "p2 = (3, 4)"

The next code fragment is identical to the first, save for the fact that Point is now a value type (struct). But because setting one value type equal to another creates a copy of the latter, the results are quite different. Changes made to one Point no longer affect the other:

struct Point
{
    public int x;
    public int y;
}
  .
  .
  .
Point p1 = new Point ();
p1.x = 1;
p1.y = 2;
Point p2 = p1; // Makes a new copy of the object on the stack
p2.x = 3;
p2.y = 4;
Console.WriteLine ("p1 = ({0}, {1})", p1.x, p1.y); // Writes "p1 = (1, 2)"
Console.WriteLine ("p2 = ({0}, {1})", p2.x, p2.y); // Writes "p2 = (3, 4)"

Sometimes differences between reference types and value types are even more insidious. For example, if Point is a value type, the following code is perfectly legal:

Point p;
p.x = 3;
p.y = 4;

But if Point is a reference type, the very same instruction sequence won’t even compile. Why? Because the statement

Point p;

declares an instance of a value type but only a reference to a reference type. A reference is like a pointer—it’s useless until it’s initialized, as in the following:

Point p = new Point ();

Programmers with C++ experience are especially vulnerable to this error because they see a statement that declares a reference and automatically assume that an object is being created on the stack.

The FCL contains a mixture of value types and reference types. Clearly, it’s sometimes important to know which type you’re dealing with. How do you know whether a particular FCL type is a value type or a reference type? Simple. If the documentation says it’s a class (as in “String Class”), it’s a reference type. If the documentation says it’s a structure (for example, “DateTime Structure”), it’s a value type. Be aware of the difference, and you’ll avoid frustrating hours spent in the debugger trying to figure out why code that looks perfectly good produces unpredictable results.
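If the documentation isn't at hand, the same question can be answered in code. This sketch relies on the IsValueType property of System.Type. TypeKindDemo is a hypothetical class name.

```csharp
using System;

class TypeKindDemo
{
    static void Main ()
    {
        Console.WriteLine (typeof (string).IsValueType);    // False (class)
        Console.WriteLine (typeof (DateTime).IsValueType);  // True  (struct)
        Console.WriteLine (typeof (int).IsValueType);       // True  (struct)
        Console.WriteLine (typeof (DayOfWeek).IsValueType); // True  (enum)
    }
}
```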

Nondeterministic Destruction

In traditional environments, objects are created and destroyed at precise, deterministic points in time. As an example, consider the following class written in unmanaged C++:

class File
{
protected:
    int Handle; // File handle

public:
    File (const char* name)
    {
        // TODO: Open the file and copy the handle to Handle
    }

    ~File ()
    {
        // TODO: Close the file handle
    }
};

When you instantiate this class, the class constructor is called:

File* pFile = new File ("Readme.txt");

And when you delete the object, its destructor is called:

delete pFile;

If you create the object on the stack instead of the heap, destruction is still deterministic because the class destructor is called the moment the object goes out of scope.

Destruction works differently in the .NET Framework. Remember, you create objects, but you never delete them; the garbage collector deletes them for you. But therein lies a problem. Suppose you write a File class in C#:

class File
{
    protected IntPtr Handle = IntPtr.Zero;

    public File (string name)
    {
        // TODO: Open the file and copy the handle to Handle
    }

    ~File ()
    {
        // TODO: Close the file handle
    }
}

Then you create a class instance like this:

File file = new File ("Readme.txt");

Now ask yourself a question: when does the file handle get closed?

The short answer is that the handle gets closed when the object is destroyed. But when is the object destroyed? When the garbage collector destroys it. When does the garbage collector destroy it? Ah—there’s the key question. You don’t know. You can’t know because the garbage collector decides on its own when to run, and until the garbage collector runs, the object isn’t destroyed and its destructor isn’t called. That’s called nondeterministic destruction, or NDD. Technically, there’s no such thing as a destructor in managed code. When you write something that looks like a destructor in C#, the compiler actually overrides the Finalize method that your class inherits from System.Object. C# simplifies the syntax by letting you write something that looks like a destructor, but that arguably makes matters worse because it implies that it is a destructor, and to unknowing developers, destructors imply deterministic destruction.

Deterministic destruction doesn’t exist in framework applications unless your code does something really ugly, like this:

GC.Collect ();

GC is a class in the System namespace that provides a programmatic interface to the garbage collector. Collect is a static method that forces a collection. Garbage collecting impedes performance, so now that you know that this method exists, forget about it. The last thing you want to do is write code that simulates deterministic destruction by calling the garbage collector periodically.

NDD is a big deal because failure to account for it can lead to all sorts of run-time errors in your applications. Suppose someone uses your File class to open a file. Later on that person uses it to open the same file again. Depending on how the file was opened the first time, it might not open again because the handle is still open if the garbage collector hasn’t run.

File handles aren’t the only problem. Take bitmaps, for instance. The FCL features a handy little class named Bitmap (it’s in the System.Drawing namespace) that encapsulates bitmapped images and understands a wide variety of image file formats. When you create a Bitmap object on a Windows machine, the Bitmap object calls down to the Windows GDI, creates a GDI bitmap, and stores the GDI bitmap handle in a field. But guess what? Until the garbage collector runs and the Bitmap object’s Finalize method is called, the GDI bitmap remains open. Large GDI bitmaps consume lots of memory, so it’s entirely conceivable that after the application has run for a while, it’ll start throwing exceptions every time it tries to create a bitmap because of insufficient memory. End users won’t appreciate an image viewer utility (like the one you’ll build in Chapter 4) that has to be restarted every few minutes.

So what do you do about NDD? Here are two rules for avoiding the NDD blues. The first rule is for programmers who use (rather than write) classes that encapsulate file handles and other unmanaged resources. Most such classes implement a method named Close or Dispose that releases resources that require deterministic closure. If you use classes that wrap unmanaged resources, call Close or Dispose on them the moment you’re finished using them. Assuming File implements a Close method that closes the encapsulated file handle, here’s the right way to use the File class:

File file = new File ("Readme.txt");
  .
  .
  .
// Finished using the file, so close it
file.Close ();
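For classes that implement IDisposable, C# also offers the using statement, which calls Dispose automatically when control leaves the block, even if an exception is thrown. The sketch below compares it with the equivalent try/finally code. Resource is a hypothetical stand-in for a class that wraps an unmanaged resource, and its OpenCount field exists only so that the effect of Dispose is visible.

```csharp
using System;

// Hypothetical stand-in for a class that wraps an unmanaged resource
class Resource : IDisposable
{
    public static int OpenCount = 0;   // Number of undisposed instances

    public Resource ()
    {
        OpenCount++;
    }

    public void Dispose ()
    {
        OpenCount--;
    }
}

class UsingDemo
{
    static void Main ()
    {
        using (Resource res = new Resource ()) {
            Console.WriteLine (Resource.OpenCount);  // 1
        }   // res.Dispose () is called here
        Console.WriteLine (Resource.OpenCount);      // 0

        // Roughly equivalent code without the using statement
        Resource res2 = new Resource ();
        try {
            Console.WriteLine (Resource.OpenCount);  // 1
        }
        finally {
            res2.Dispose ();
        }
        Console.WriteLine (Resource.OpenCount);      // 0
    }
}
```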

The second rule, which is actually a set of rules, applies to developers who write classes that wrap unmanaged resources. Here’s a summary:

  • Implement a protected Dispose method (hereafter referred to as the “protected Dispose”) that takes a Boolean as a parameter. In this method, free any unmanaged resources (such as file handles) that the class encapsulates. If the parameter passed to the protected Dispose is true, also call Close or Dispose (the public Dispose inherited from IDisposable) on any class members (fields) that wrap unmanaged resources.

  • Implement the .NET Framework’s IDisposable interface, which contains a single method named Dispose that takes no parameters. Implement this version of Dispose (the “public Dispose”) by calling GC.SuppressFinalize to prevent the garbage collector from calling Finalize, and then calling the protected Dispose and passing in true.

  • Override Finalize. Finalize is called by the garbage collector when an object is “finalized”—that is, when an object is destroyed. In Finalize, call the protected Dispose and pass in false. The false parameter is important because it prevents the protected Dispose from attempting to call Close or the public Dispose on any encapsulated class members, which may already have been finalized if a garbage collection is in progress.

  • If it makes sense semantically (for example, if the resource that the class encapsulates can be closed in the manner of a file handle), implement a Close method that calls the public Dispose.

Based on these principles, here’s the right way to implement a File class:

class File : IDisposable
{
    protected IntPtr Handle = IntPtr.Zero;

    public File (string name)
    {
        // TODO: Open the file and copy the handle to Handle.
    }

    ~File ()
    {
        Dispose (false);
    }

    public void Dispose ()
    {
        GC.SuppressFinalize (this);
        Dispose (true);
    }

    protected virtual void Dispose (bool disposing)
    {
        // TODO: Close the file handle.
        if (disposing) {
            // TODO: If the class has members that wrap
            // unmanaged resources, call Close or Dispose on
            // them here.
        }
    }

    public void Close ()
    {
        Dispose ();
    }
}

Note that the “destructor”—actually, the Finalize method—now calls the protected Dispose with a false parameter, and that the public Dispose calls the protected Dispose and passes in true. The call to GC.SuppressFinalize is both a performance optimization and a measure to prevent the handle from being closed twice. Because the object has already closed the file handle, there’s no need for the garbage collector to call its Finalize method. It’s still important to override the Finalize method to ensure proper disposal if Close or Dispose isn’t called.
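To watch the pattern run, here is a hedged variation in which a Boolean field stands in for the file handle. TrackedFile, IsOpen, and CloseCalls are hypothetical names, and the early return in the protected Dispose is an addition not present in the listing above; it is one way to make repeated calls to Close or Dispose harmless.

```csharp
using System;

// Hypothetical variation on File; a bool stands in for the file handle
class TrackedFile : IDisposable
{
    protected bool isOpen = true;
    public static int CloseCalls = 0;   // Times a "handle" was closed

    public bool IsOpen
    {
        get { return isOpen; }
    }

    ~TrackedFile ()
    {
        Dispose (false);
    }

    public void Dispose ()
    {
        GC.SuppressFinalize (this);
        Dispose (true);
    }

    protected virtual void Dispose (bool disposing)
    {
        if (!isOpen)
            return;        // Already closed, so do nothing
        isOpen = false;    // "Close the file handle"
        CloseCalls++;
    }

    public void Close ()
    {
        Dispose ();
    }
}

class DisposeDemo
{
    static void Main ()
    {
        TrackedFile file = new TrackedFile ();
        file.Close ();
        file.Dispose ();   // Second call is a no-op
        Console.WriteLine (file.IsOpen);             // False
        Console.WriteLine (TrackedFile.CloseCalls);  // 1
    }
}
```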

Dynamic Linking

When you ran the C# compiler in Chapter 1, you created an assembly named Hello.exe. Hello.exe is as simple as an assembly can be: it contains but one file and it lacks a strong name, meaning that the common language runtime performs no version checking when loading it.

Weakly named, single-file assemblies are fine for the majority of applications. But occasionally developers need more. For example, you might want to write a library of routines that other applications can link to, similar to a dynamic link library (DLL) in Windows. If you do, you’ll need to know more about assemblies. Perhaps you’d like to write a library for the private use of your application. Or maybe you’ve heard that Microsoft .NET solves the infamous DLL Hell problem and you’d like to know how. The next several sections walk you, tutorial-style, through the process of creating, deploying, and dynamically linking to a multifile assembly. During the journey, you’ll see firsthand how such assemblies are produced and what they mean to the design and operation of managed applications. And just to prove that the framework is language-agnostic, you’ll write half of the assembly in C# and half in Visual Basic .NET.

Creating a Multifile Assembly

The assembly that you’re about to create contains two classes: one named SimpleMath, written in Visual Basic .NET, and another named ComplexMath, written in C#. SimpleMath has two methods: Add and Subtract. ComplexMath has one method, Square, which takes an input value and returns the square of that value.

Physically, the assembly consists of three files: Simple.netmodule, which holds the SimpleMath class; Complex.netmodule, which holds ComplexMath; and Math.dll, which houses the assembly manifest. Because the managed modules containing SimpleMath and ComplexMath belong to the same assembly, clients neither know nor care about the assembly’s physical makeup. They simply see one entity—the assembly—that contains the types they’re interested in.

Here’s how to create the assembly:

  1. Create a new text file named Complex.cs and enter the source code shown in Example 2-1.

  2. Compile Complex.cs into a managed module with the command

    csc /target:module complex.cs

    The /target switch tells the C# compiler to generate a managed module that is neither an EXE nor a DLL. Such a module can’t be used by itself, but it can be used if it’s added to an assembly. Because you didn’t specify a file name with a /out switch, the compiler names the output file Complex.netmodule.

  3. In the same directory, create a new text file named Simple.vb. Type in the source code shown in Example 2-2.

  4. Compile Simple.vb with the following command:

    vbc /target:module simple.vb

    This command produces a managed module named Simple.netmodule, which makes up the Visual Basic .NET half of the assembly’s code.

  5. Create an assembly that binds the two managed modules together by running the SDK’s AL (Assembly Linker) utility as follows:

    al /target:library /out:Math.dll simple.netmodule complex.netmodule

    The resulting file—Math.dll—contains the assembly manifest. Inside the manifest is information identifying Simple.netmodule and Complex.netmodule as members of the assembly. Also encoded in the assembly manifest is the assembly’s name: Math.

    Example 2-1. The ComplexMath class.

    Complex.cs

    using System;
    
    public class ComplexMath
    {
        public int Square (int a)
        {
            return a * a;
        }
    }

    Example 2-2. The SimpleMath class.

    Simple.vb

    Imports System
    
    Public Class SimpleMath
        Function Add (a As Integer, b As Integer) As Integer
            Return a + b
        End Function
    
        Function Subtract (a As Integer, b As Integer) As Integer
            Return a - b
        End Function
    End Class

You just created the .NET Framework equivalent of a DLL. Now let’s write a client to test it with.

Dynamically Linking to an Assembly

Follow these simple steps to create a console client for the Math assembly:

  1. In the same directory that Math.dll, Simple.netmodule, and Complex.netmodule reside in, create a new text file named MathDemo.cs. Then enter the code shown in Example 2-3.

  2. Compile MathDemo.cs with the following command:

    csc /target:exe /reference:math.dll mathdemo.cs

    The compiler creates an EXE named MathDemo.exe. The /reference switch tells the compiler that MathDemo.cs uses types defined in the assembly whose manifest is stored in Math.dll. Without this switch, the compiler would complain that the types are undefined.

Notice that in step 2, you did not have to include a /reference switch pointing to Simple.netmodule or Complex.netmodule, even though that’s where SimpleMath and ComplexMath are defined. Why? Because both modules are part of the assembly whose manifest is found in Math.dll.

Example 2-3. Client for the Math assembly.

MathDemo.cs

using System;

class MyApp
{
    static void Main ()
    {
        SimpleMath simple = new SimpleMath ();
        int sum = simple.Add (2, 2);
        Console.WriteLine ("2 + 2 = {0}", sum);

        ComplexMath complex = new ComplexMath ();
        int square = complex.Square (3);
        Console.WriteLine ("3 squared = {0}", square);
    }
}

Now that you have a client ready, it’s time to test CLR-style dynamic linking. Here’s a script to serve as a guide:

  1. In a command prompt window, run MathDemo.exe. You should see the output shown in Figure 2-4, which proves that MathDemo.exe successfully loaded and used SimpleMath and ComplexMath.

  2. Temporarily rename Complex.netmodule to something like Complex.foo.

  3. Run MathDemo again. A dialog box appears informing you that a FileNotFoundException occurred. The exception was generated by the CLR when it was unable to find the module containing ComplexMath. Click the No button to acknowledge the error and dismiss the dialog box.

  4. Restore Complex.netmodule’s original name and run MathDemo again to verify that it works.

  5. Modify MathDemo.cs by commenting out the final three statements—the ones that use ComplexMath. Then rebuild MathDemo.exe by repeating the command you used to build it the first time.

  6. Run MathDemo. This time, the only output you should see is “2 + 2 = 4.”

  7. Temporarily rename Complex.netmodule again. Then run MathDemo.exe. No exception occurs this time because MathDemo.exe doesn’t attempt to instantiate ComplexMath. The CLR doesn’t load modules that it doesn’t need to. Had this code been deployed on the Internet, the CLR wouldn’t have attempted to download Complex.netmodule, either.

  8. Restore Complex.netmodule’s name, uncomment the statements that you commented out in step 5, and rebuild MathDemo.exe one more time.

    Figure 2-4. MathDemo output.
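
The FileNotFoundException you saw in step 3 can also be caught in code rather than left to surface in a dialog box. The following is a minimal sketch that is not part of the MathDemo sample: the LoadProbe class and the assembly name NoSuchAssembly are made up for illustration, and the name is assumed not to match any assembly on your machine.

```csharp
using System;
using System.IO;
using System.Reflection;

class LoadProbe
{
    // Returns true if the CLR can locate and load the named assembly.
    public static bool CanLoad (string assemblyName)
    {
        try {
            Assembly.Load (assemblyName);
            return true;
        }
        catch (FileNotFoundException e) {
            // FileName identifies what the CLR was unable to find.
            Console.WriteLine ("Could not find {0}", e.FileName);
            return false;
        }
    }

    static void Main ()
    {
        // "NoSuchAssembly" is a made-up name used to force the failure.
        Console.WriteLine (CanLoad ("NoSuchAssembly"));
    }
}
```

Because the CLR loads code on demand, the exception is raised at the point where the missing code is first needed, not when the application starts.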

You’ve now seen firsthand how dynamic linking works in the .NET Framework and demonstrated that the CLR loads only the parts of an assembly that it has to. But what if you wanted to install the assembly in a subdirectory of the application directory? Here’s how to deploy the assembly in a subdirectory named bin:

  1. Create a bin subdirectory in the application directory (the directory where MathDemo.exe is stored).

  2. Move Math.dll, Simple.netmodule, and Complex.netmodule to the bin directory. Run MathDemo.exe again. The CLR throws a FileNotFoundException because it can’t find the assembly in the application directory.

  3. Create a new text file named MathDemo.exe.config in the application directory, and then enter the statements shown in Example 2-5. MathDemo.exe.config is an XML application configuration file containing configuration data used by the CLR. The probing element tells the CLR to look in the bin subdirectory for assemblies containing types referenced by MathDemo.exe. You can include multiple subdirectory names by separating them with semicolons.

  4. Run MathDemo again and verify that it works even though the assembly is now stored in the bin directory.

These exercises demonstrate how assemblies containing types used by other applications are typically deployed. Most assemblies are private to a particular application, so they’re deployed in the same directory as the application that they serve or in a subdirectory. This model is consistent with the .NET Framework’s goal of “XCOPY installs,” which is synonymous with simplified install and uninstall procedures. Because MathDemo.exe doesn’t rely on any resources outside its own directory tree, removing it from the system is as simple as deleting the application directory and its contents.

Example 2-5. MathDemo.exe’s application configuration file.

MathDemo.exe.config

<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <probing privatePath="bin" />
    </assemblyBinding>
  </runtime>
</configuration>

Versioning an Assembly

If you were to modify Simple.vb or Complex.cs right now and inadvertently introduce an error, the CLR would be happy to load the buggy assembly the next time you run MathDemo.exe. Why? Because the assembly lacks a strong name. The CLR’s versioning mechanism doesn’t work with weakly named assemblies. If you want to take advantage of CLR versioning, you must assign the assembly a strong name. Strong naming is the key to avoiding DLL Hell.

Use the following procedure to create a strongly named assembly containing the SimpleMath and ComplexMath classes:

  1. Go to the bin subdirectory and run the SDK’s SN (Strong Name) utility. The following command generates a “key file” named Keyfile.snk containing public and private keys that can be used for strong naming:

    sn /k Keyfile.snk

  2. Use AL to create a strongly named assembly that uses the keys found in Keyfile.snk:

    al /keyfile:keyfile.snk /target:library /out:Math.dll /version:1.0.0.0 simple.netmodule complex.netmodule

    The /keyfile switch identifies the key file. The /version switch specifies the version number written to the assembly’s manifest. The four values in the version number, from left to right, are the major version number, the minor version number, the build number, and the revision number.

  3. Go to MathDemo.exe’s application directory and rebuild MathDemo.cs using the following command:

    csc /target:exe /reference:bin\math.dll mathdemo.cs

    This time, MathDemo.exe is bound to the strongly named Math assembly. Moreover, the new build of MathDemo.exe contains metadata noting what version of the assembly it was compiled against.

  4. Verify that MathDemo.exe works as before by running it.
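
The four-part version number described in step 2 maps directly onto the FCL's System.Version class. As a small sketch (the VersionParts class and its Describe method are names invented here, not part of the sample), you can split a version string into its parts like this:

```csharp
using System;

class VersionParts
{
    // Breaks a version string into major, minor, build, and revision.
    public static string Describe (string text)
    {
        Version v = new Version (text);
        return String.Format ("Major={0} Minor={1} Build={2} Revision={3}",
            v.Major, v.Minor, v.Build, v.Revision);
    }

    static void Main ()
    {
        // Same four-part format that AL's /version switch accepts.
        Console.WriteLine (Describe ("1.1.0.0"));

        // Versions compare part by part, from left to right.
        Version older = new Version ("1.0.0.0");
        Version newer = new Version ("1.1.0.0");
        Console.WriteLine (newer.CompareTo (older) > 0);
    }
}
```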

So far, so good. You’ve created a version of MathDemo.exe that is strongly bound to version 1.0.0.0 of a private assembly whose manifest is stored in Math.dll. Now use the following exercises to explore the ramifications:

  1. Execute the following command in the bin directory to increment the assembly’s version number from 1.0.0.0 to 1.1.0.0:

    al /keyfile:keyfile.snk /target:library /out:Math.dll /version:1.1.0.0 simple.netmodule complex.netmodule

  2. Run MathDemo.exe. Because MathDemo.exe was compiled against version 1.0.0.0 of the assembly, the CLR throws a FileLoadException.

  3. Restore the assembly’s version number to 1.0.0.0 with the following command:

    al /keyfile:keyfile.snk /target:library /out:Math.dll /version:1.0.0.0 simple.netmodule complex.netmodule

  4. Open Complex.cs and change the statement

    return a * a;

    to read

    return a + a;

    Clearly this is a buggy implementation because the Square method now doubles a rather than squaring it. But the version number has been reset to 1.0.0.0—the one MathDemo.exe was compiled against. What will the CLR do when you rebuild Complex.netmodule and run MathDemo again?

  5. Rebuild Complex.netmodule with the command

    csc /target:module /out:bin\Complex.netmodule complex.cs

    Run MathDemo.exe. Once again, the CLR throws an exception. Even though the version number is valid, the CLR knows that Complex.netmodule has changed because Math.dll’s manifest contains a cryptographic hash of each of the files in the assembly. When you modified Complex.netmodule, you modified the value it hashes to as well. Before loading Complex.netmodule, the CLR rehashed the file and compared the resulting hash to the hash stored in the assembly manifest. Upon seeing that the two hashes didn’t match, the CLR threw an exception.
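
The idea behind the hash check can be sketched in a few lines. This is an illustration only, not the CLR's actual loader code: it assumes SHA-1 as the hash algorithm and hashes two short strings that stand in for the contents of Complex.netmodule before and after the edit. The HashCheck class is a made-up name.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

class HashCheck
{
    // Hashes a block of bytes and returns the hash as a hex string.
    public static string Hash (byte[] contents)
    {
        SHA1 sha = SHA1.Create ();
        return BitConverter.ToString (sha.ComputeHash (contents));
    }

    static void Main ()
    {
        // Stand-ins for the module's contents before and after the edit.
        byte[] original = Encoding.ASCII.GetBytes ("return a * a;");
        byte[] modified = Encoding.ASCII.GetBytes ("return a + a;");

        // Plays the role of the hash recorded in the manifest.
        string recorded = Hash (original);

        Console.WriteLine (Hash (original) == recorded);   // Unchanged file
        Console.WriteLine (Hash (modified) == recorded);   // Changed file
    }
}
```

Changing even one character of the input produces a different hash, which is how a modified module is detected even when the version number is unchanged.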

Now suppose circumstances were reversed and that version 1.0.0.0 contained the buggy Square method. In that case, you’d want MathDemo.exe to use version 1.1.0.0. You have two options. The first is to recompile MathDemo.exe against version 1.1.0.0 of the assembly. The second is to use a binding redirect to tell the CLR to load version 1.1.0.0 of the assembly when MathDemo asks for version 1.0.0.0. A binding redirect is enacted by modifying MathDemo.exe.config as follows:

<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="Math"
          publicKeyToken="cd16a90001d313af" />
        <bindingRedirect oldVersion="1.0.0.0" newVersion="1.1.0.0" />
      </dependentAssembly>
      <probing privatePath="bin" />
    </assemblyBinding>
  </runtime>
</configuration>

The new dependentAssembly element and its subelements instruct the CLR to resolve requests for Math version 1.0.0.0 by loading version 1.1.0.0 instead. The publicKeyToken attribute is a tokenized representation (specifically, a 64-bit hash) of the public key encoded in Math’s assembly manifest; it was obtained by running SN with a /T switch against Math.dll:

sn /T math.dll

Your assembly’s public key token will be different from mine, so if you try this out on your code, be sure to plug your assembly’s public key token into MathDemo.exe.config’s publicKeyToken attribute.
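
If you'd rather read a public key token from code than from SN's output, reflection can supply it. The sketch below is an assumption-laden illustration: the TokenDump class is a made-up name, and it inspects the core library that defines System.Object only because that assembly is always loaded and is strongly named. Pass it any other loaded assembly to see that assembly's token.

```csharp
using System;
using System.Reflection;

class TokenDump
{
    // Formats an assembly's public key token as lowercase hex.
    // Returns an empty string if the assembly has no token.
    public static string GetToken (Assembly assembly)
    {
        byte[] token = assembly.GetName ().GetPublicKeyToken ();
        if (token == null)
            return "";
        return BitConverter.ToString (token).Replace ("-", "").ToLower ();
    }

    static void Main ()
    {
        // typeof (object).Assembly is the assembly that defines System.Object.
        Console.WriteLine (GetToken (typeof (object).Assembly));
    }
}
```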

Now you can have your cake and eat it too. The CLR enacts a strong versioning policy to prevent incorrect versions of the assembly from being loaded, but if you want to load another version, a simple configuration change makes it possible.

Sharing an Assembly: The Global Assembly Cache

Suppose that you build the Math assembly with the intention of letting any application, not just MathDemo.exe, use it. If that’s your goal, you need to install the assembly where any application can find it. That location is the global assembly cache (GAC), which is a repository for shared assemblies. The FCL itself consists of several shared assemblies. Only strongly named assemblies can be installed in the GAC. When the CLR attempts to load a strongly named assembly, it looks in the GAC even before it looks in the local application directory.

The .NET Framework SDK includes a utility named GacUtil that makes it easy to install and uninstall shared assemblies. To demonstrate, do this:

  1. Create a directory named Shared somewhere on your hard disk. (There’s nothing magic about the directory name; call it something else if you like.) Move the files in MathDemo.exe’s bin directory to the Shared directory. Then delete the bin directory.

  2. Go to the Shared directory and install the Math assembly in the GAC by executing the following command:

    gacutil /i math.dll

  3. Run MathDemo.exe. It should run fine, even though the assembly that it relies on is no longer in a subdirectory of the application directory.

  4. Remove the assembly from the GAC by executing this command:

    gacutil /u math

  5. Run MathDemo.exe again. This time, the CLR throws an exception because it can’t find the Math assembly in the GAC or in a local directory.

That’s shared assemblies in a nutshell. They must be strongly named, and the act of installing them in the GAC makes them shared assemblies. The downside to deploying shared assemblies is that doing so violates the spirit of XCOPY installs. Installing a shared assembly on an end user’s machine requires version 2 or later of the Windows Installer or a third-party installation program that is GAC-aware because GacUtil comes with the .NET Framework SDK and is not likely to be present on a nondeveloper’s PC. Uninstalling a shared assembly requires removing it from the GAC; simply deleting files won’t do the trick.

Applying Strong Names Using Attributes

The SDK’s AL utility is one way to create strongly named assemblies, but it’s not the only way, nor is it the most convenient. An easier way to produce a strongly named assembly is to attribute your code. Here’s a modified version of Complex.cs that compiles to a strongly named single-file assembly:

using System;
using System.Reflection;
[assembly:AssemblyKeyFile ("Keyfile.snk")]
[assembly:AssemblyVersion ("1.0.0.0")]

public class ComplexMath
{
    public int Square (int a)
    {
        return a * a;
    }
}

And here’s how Simple.vb would look if it, too, were modified to build a strongly named assembly:

Imports System
Imports System.Reflection
<Assembly:AssemblyKeyFile ("Keyfile.snk")>
<Assembly:AssemblyVersion ("1.0.0.0")>

Public Class SimpleMath
    Function Add (a As Integer, b As Integer) As Integer
        Return a + b
    End Function

    Function Subtract (a As Integer, b As Integer) As Integer
        Return a - b
    End Function
End Class

AssemblyKeyFile and AssemblyVersion are attributes. Physically, they map to the AssemblyKeyFileAttribute and AssemblyVersionAttribute classes defined in the FCL’s System.Reflection namespace. Attributes are mechanisms for declaratively adding information to a module’s metadata. These particular attributes create a strongly named assembly by signing the assembly and specifying a version number.

Delayed Signing

Unless an assembly is strongly named, it can’t be installed in the GAC and its version number can’t be used to bind clients to a particular version of the assembly. Strongly naming an assembly is often referred to as “signing” the assembly because the crux of strong naming is adding a digital signature generated from the assembly manifest and the publisher’s private key. And therein lies a problem. In large corporations, private keys are often locked away in vaults or hardware devices where only a privileged few can access them. If you’re a rank-and-file programmer developing a strongly named assembly and you don’t have access to your company’s private key (which is exactly the situation that Microsoft developers find themselves in), how can you fully test the assembly if you can’t install it in the GAC or use its version number to do strong versioning?

The answer is delayed signing. Delayed signing embeds the publisher’s public key (which is available to everyone) in the assembly and reserves space for a digital signature to be added later. The presence of the public key allows the assembly to be installed in the GAC. It also enables clients to build into their metadata information denoting the specific version of the assembly that they were compiled against. The lack of a digital signature means the assembly is no longer tamperproof, but you can fix that by signing the assembly with the publisher’s private key before the assembly ships.

How does delayed signing work? If Public.snk holds the publisher’s public key, the following command creates and delay-signs a Math assembly (note the /delaysign switch):

al /keyfile:public.snk /delaysign /target:library /out:Math.dll /version:1.1.0.0 simple.netmodule complex.netmodule

You can also delay-sign using attributes:

[assembly:AssemblyKeyFile ("Public.snk")]
[assembly:AssemblyVersion ("1.0.0.0")]
[assembly:AssemblyDelaySign (true)]

In either event, the resultant assembly contains the publisher’s public key but lacks the signature generated with the help of the private key. To sign the assembly before releasing it, have someone who has access to the publisher’s private key do this:

sn /R Math.dll keyfile.snk

Using this statement assumes that Keyfile.snk holds the publisher’s public and private keys.

One trap to watch for regarding delayed signing is that neither the /delaysign switch nor the AssemblyDelaySign attribute in and of itself enables the assembly to be installed in the GAC or strongly versioned. To enable both, run the SN utility against the assembly with a /Vr switch to enable verification skipping:

sn /Vr Math.dll

After signing the assembly with the publisher’s private key, disable verification skipping by running SN with a /Vu switch:

sn /Vu Math.dll

Verification skipping enables an assembly to be loaded without verifying that it hasn’t been tampered with. After all, verification can’t be performed if the assembly lacks the digital signature used for verification. Verification skipping doesn’t have to be enabled every time the assembly is built. Enabling it once is sufficient to enable verification skipping until it is explicitly disabled again by running SN with a /Vu switch.

Exception Handling

When something goes wrong during the execution of an application, the .NET Framework responds by throwing an exception. Some exceptions are thrown by the CLR. For example, if an application attempts to cast an object to a type that it’s not, the CLR throws an exception. Others are thrown by the FCL, for example when an application attempts to open a nonexistent file. The types of exceptions that the .NET Framework throws are legion, so an application that targets the framework had better be prepared to handle them.

The beauty of exceptions in the world of managed code is that they’re an intrinsic part of the .NET Framework. In the past, languages (and even individual language compilers) have used proprietary means to throw and handle exceptions. You couldn’t throw an exception in Visual Basic and catch it in C++. You couldn’t even throw an exception in a function compiled with one C++ compiler and catch it in a function compiled with another. Not so in a managed application. The CLR defines how exceptions are thrown and how they’re handled. You can throw an exception in any language and catch it in any other. You can even throw exceptions across machines. And to top it off, languages such as C# and Visual Basic .NET make exception handling extraordinarily easy.

Catching Exceptions

C# uses four keywords to expose the CLR’s exception handling mechanism: try, catch, finally, and throw. The general idea is to enclose code that might throw an exception in a try block and to include exception handlers in a catch block. Here’s an example:

try {
    Hashtable table = new Hashtable ();
    table.Add ("First", 1);
    string entry = (string) table["First"]; // Retrieve 1 and cast it
}
catch (InvalidCastException e) {
    Console.WriteLine (e.Message);
}

An integer is not a string, so attempting to cast it to one will generate an InvalidCastException. That will activate the InvalidCastException handler, which in this example writes the message encapsulated in the exception object to the console. To write a more generic catch handler that catches any exception thrown by the framework, specify Exception as the exception type:

catch (Exception e) {
    ...
}

And to respond differently to different types of exceptions, simply include a catch handler for each type you’re interested in:

try {
    ...
}
catch (InvalidCastException e) {
    ...
}
catch (FileNotFoundException e) {
    ...
}
catch (Exception e) {
    ...
}

The CLR examines the catch handlers in the order in which they appear and runs the first one whose type matches the exception thrown, which is why handlers for specific exception types precede the generic Exception handler. In this example, an InvalidCastException or FileNotFoundException vectors execution to one of the first two catch handlers. Any other FCL exception type will activate the final handler. Notice that you don’t have to dispose of the Exception objects you catch because the garbage collector disposes of them for you.
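
To watch a thrown exception being routed to a handler, try a sketch like the following. The CatchOrder class and its Classify method are made-up names; the method simply throws whatever exception it is handed and reports which catch block ran.

```csharp
using System;
using System.IO;

class CatchOrder
{
    // Throws the supplied exception and reports which handler caught it.
    public static string Classify (Exception toThrow)
    {
        try {
            throw toThrow;
        }
        catch (InvalidCastException) {
            return "cast handler";
        }
        catch (FileNotFoundException) {
            return "file handler";
        }
        catch (Exception) {
            return "generic handler";
        }
    }

    static void Main ()
    {
        Console.WriteLine (Classify (new InvalidCastException ()));
        Console.WriteLine (Classify (new FileNotFoundException ()));
        Console.WriteLine (Classify (new DivideByZeroException ()));
    }
}
```

The third call lands in the generic handler because DivideByZeroException matches neither of the first two types. If you moved the Exception handler to the top, the C# compiler would reject the code because the later handlers could never be reached.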

All of the exception types defined in the FCL derive directly or indirectly from System.Exception, which defines a base set of properties common to FCL exception types. Thanks to System.Exception, for example, all FCL exception classes contain a Message property, which holds an error message describing what went wrong, and a StackTrace property, which details the call chain leading up to the exception. Derivative classes frequently add properties of their own. For instance, FileNotFoundException includes a FileName property that reveals what file caused the exception.
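
Here is a brief sketch of those properties in use. It assumes that a file named NoSuchFile.txt does not exist in the current directory; the ExceptionInfo class and the file name are invented for the example.

```csharp
using System;
using System.IO;

class ExceptionInfo
{
    // Tries to open a file and returns the file name reported by the
    // exception, or null if the file opened successfully.
    public static string MissingFileName (string path)
    {
        try {
            StreamReader reader = File.OpenText (path);
            reader.Close ();
            return null;
        }
        catch (FileNotFoundException e) {
            Console.WriteLine (e.Message);      // Inherited from System.Exception
            Console.WriteLine (e.StackTrace);   // Inherited from System.Exception
            return e.FileName;                  // Added by FileNotFoundException
        }
    }

    static void Main ()
    {
        // The file name is made up and is assumed not to exist.
        Console.WriteLine (MissingFileName ("NoSuchFile.txt"));
    }
}
```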

The FCL defines dozens of different exception classes. They’re not defined in any one namespace but are spread throughout the FCL’s roughly 100 namespaces. To help you get a handle on the different types of exceptions you’re liable to encounter, the following table lists some of the most common exception types.

Table 2-1. Common FCL Exception Classes

Class                          Thrown When
ArgumentNullException          A null reference is illicitly passed as an argument
ArgumentOutOfRangeException    An argument is invalid (out of range)
DivideByZeroException          An attempt is made to divide by 0
IndexOutOfRangeException       An invalid array index is used
InvalidCastException           A type is cast to a type it’s not
NullReferenceException         A null reference is dereferenced
OutOfMemoryException           A memory allocation fails because of a lack of memory
WebException                   An error occurs during an HTTP request

As in C++, exception handlers can be nested. If method A calls method B and method B throws an exception, the exception handler in method B is called provided a suitable handler exists. If method B lacks a handler for the type of exception that was thrown, the CLR looks for one in method A. If A too lacks a matching exception handler, the exception bubbles upward to the method that called A, then to the method that called the method that called A, and so on.
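
The A-calls-B scenario can be sketched as follows. The Bubble class is a made-up name, and the out-of-range array index is just a convenient way to make method B throw.

```csharp
using System;

class Bubble
{
    // Method B throws but has no handler of its own.
    static void B ()
    {
        int[] values = new int[2];
        values[5] = 1;   // Throws IndexOutOfRangeException
    }

    // Method A calls B and supplies the handler that B lacks.
    public static string A ()
    {
        try {
            B ();
            return "no exception";
        }
        catch (IndexOutOfRangeException) {
            return "caught in A";
        }
    }

    static void Main ()
    {
        Console.WriteLine (A ());
    }
}
```

Remove the try and catch from A, and the exception continues upward to Main and, from there, out of the application.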

What does the .NET Framework do with unhandled exceptions? It depends on the application type. When a console application suffers an uncaught exception, the framework terminates the application and writes an error message to the console window. If the application is a Windows Forms application, the framework alerts the user with a message box. For a Web Forms application, the framework displays an error page. Generally speaking, it’s far preferable to anticipate exceptions and handle them gracefully than allow your users to witness an unhandled exception.

Guaranteeing Execution

Code in a finally block is guaranteed to execute, whether an exception is thrown or not. The finally keyword really comes in handy when you’re dealing with those pesky classes that wrap file handles and other unmanaged resources. If you write code like

FileStream file = new FileStream ("Readme.txt", FileMode.Open);
  .
  .
  .
file.Close ();

you’ve left a file open if an exception occurs after the file is opened but before Close is called. But if you structure your code this way, you’re safe:

FileStream file = null;
try {
    file = new FileStream ("Readme.txt", FileMode.Open);
      .
      .
      .
}
catch (FileNotFoundException e) {
    Console.WriteLine (e.Message);
}
finally {
    if (file != null)
        file.Close ();
}

Now Close is called regardless of whether an exception is thrown.

Be aware that try blocks accompanied by finally blocks do not have to have catch blocks. In the previous example, suppose you want to make sure the file is closed, but you don’t really care to handle the exception yourself; you’d rather leave that to a method higher up the call stack. Here’s how to go about it:

FileStream file = null;
try {
    file = new FileStream ("Readme.txt", FileMode.Open);
      .
      .
      .
}
finally {
    if (file != null)
        file.Close ();
}

This code is perfectly legitimate and in fact demonstrates the proper way to respond to an exception that is best handled by the caller rather than the callee. Class library authors in particular should be diligent about not “eating” exceptions that callers should be aware of.
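
To convince yourself that a finally block runs even when the exception is handled elsewhere, consider this sketch. The FinallyDemo class is a made-up name, and the exception is thrown deliberately to simulate a failure; a flag stands in for the cleanup that Close would perform.

```csharp
using System;

class FinallyDemo
{
    public static bool CleanedUp = false;

    // No catch block here: the exception passes through to the caller,
    // but the finally block still runs on the way out.
    static void DoWork ()
    {
        try {
            throw new InvalidOperationException ("Simulated failure");
        }
        finally {
            CleanedUp = true;
        }
    }

    // The caller, not the callee, decides how to handle the exception.
    public static string Run ()
    {
        try {
            DoWork ();
            return "no exception";
        }
        catch (InvalidOperationException e) {
            return "caller caught: " + e.Message;
        }
    }

    static void Main ()
    {
        Console.WriteLine (Run ());
        Console.WriteLine (CleanedUp);
    }
}
```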

Throwing Exceptions

Applications can throw exceptions as well as catch them. Look again at the Width and Height properties in the Rectangle class presented earlier in this chapter. If a user of that class passes in an invalid Width or Height value, the set accessor throws an exception. You can also rethrow exceptions thrown to you by using the throw keyword with no arguments.
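
Here's a sketch of both techniques. The Rect class below is a simplified, hypothetical stand-in written for this example, not the Rectangle class itself, and the rule that a width must be positive is an assumption. The SetWidth method shows throw used with no arguments to rethrow an exception after noting it.

```csharp
using System;

// Simplified, hypothetical stand-in for a class with a validated property.
class Rect
{
    int width;

    public int Width
    {
        get { return width; }
        set {
            if (value <= 0)
                throw new ArgumentOutOfRangeException ("value",
                    "Width must be positive");
            width = value;
        }
    }
}

class ThrowDemo
{
    // Notes the failure, then rethrows the same exception to the caller.
    static void SetWidth (Rect rect, int width)
    {
        try {
            rect.Width = width;
        }
        catch (ArgumentOutOfRangeException) {
            Console.WriteLine ("Rejected width {0}", width);
            throw;   // throw with no arguments rethrows
        }
    }

    public static bool TrySetWidth (int width)
    {
        try {
            SetWidth (new Rect (), width);
            return true;
        }
        catch (ArgumentOutOfRangeException) {
            return false;
        }
    }

    static void Main ()
    {
        Console.WriteLine (TrySetWidth (10));
        Console.WriteLine (TrySetWidth (-1));
    }
}
```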

You can use throw to throw exception types defined in the FCL, and you can use it to throw custom exception types that you define. Although it’s perfectly legal to derive custom exception types from System.Exception (and even to declare exception classes that derive directly from System.Object), developers are encouraged to derive from System.ApplicationException instead, primarily because doing so enables applications to distinguish between exceptions thrown by the framework and exceptions thrown by user code.

That’s the theory, anyway. The reality is that the FCL derives some of its own exception classes from ApplicationException, meaning that having ApplicationException as a base type is not a reliable indicator that the exception wasn’t thrown by the framework. Don’t believe any documentation that says otherwise.
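
Defining a custom exception type takes only a few lines. In the sketch below, NegativeInputException and the CustomThrow class are names invented for illustration. The exception derives from System.Exception; deriving from ApplicationException instead would be a one-word change to the base class.

```csharp
using System;

// Hypothetical custom exception type; the name is invented for this sketch.
class NegativeInputException : Exception
{
    public NegativeInputException (string message) : base (message) {}
}

class CustomThrow
{
    // Returns the integer square root, rejecting negative input.
    public static int Root (int value)
    {
        if (value < 0)
            throw new NegativeInputException ("Input must not be negative");
        return (int) Math.Sqrt (value);
    }

    // Catches the custom type by name, just like an FCL exception type.
    public static string Describe (int value)
    {
        try {
            return Root (value).ToString ();
        }
        catch (NegativeInputException e) {
            return e.Message;
        }
    }

    static void Main ()
    {
        Console.WriteLine (Describe (9));
        Console.WriteLine (Describe (-1));
    }
}
```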

Next Up: The .NET Framework Class Library

The information presented in this chapter sets the stage for Chapter 3, which introduces the all-important .NET Framework class library. Now when you encounter the term “class” or “struct,” you’ll know precisely what it means. When you use a class that has a Close or Dispose method, you’ll realize that it probably wraps an unmanaged resource that shouldn’t wait to be freed until the garbage collector runs. You’ll understand how code that uses types defined in the FCL dynamically links to FCL assemblies. And you’ll know how to respond gracefully when the FCL throws an exception.

Without further delay, therefore, let’s peel the curtain away from the .NET Framework class library and learn how to use it to write great applications.
