10. Well-Formed Types

The previous chapters covered most of the constructs for defining classes and structs. However, several details remain to round out the type definition with fit-and-finish-type functionality. This chapter explains how to put the final touches on a type declaration.

[Figure: overview of the topics that make up well-formed types]

Overriding object Members

Chapter 6 discussed how all classes and structs derive from object. In addition, it reviewed each method available on object and discussed how some of them are virtual. This section discusses the details concerning overriding the virtual methods.

Overriding ToString()

By default, calling ToString() on any object will return the fully qualified name of the class. Calling ToString() on a System.IO.FileStream object will return the string System.IO.FileStream, for example. For some classes, however, ToString() can be more meaningful. On string, for example, ToString() returns the string value itself. Similarly, returning a Contact’s name would make more sense. Listing 10.1 overrides ToString() to return a string representation of Coordinate.

Listing 10.1: Overriding ToString()

public struct Coordinate
{
  public Coordinate(Longitude longitude, Latitude latitude)
  {
      Longitude = longitude;
      Latitude = latitude;
  }

  public Longitude Longitude { get; }
  public Latitude Latitude { get; }

  public override string ToString() =>                                    
      $"{ Longitude } { Latitude }";                                      

  // ...
}

Write methods such as Console.WriteLine() and System.Diagnostics.Trace.Write() call an object’s ToString() method (see footnote 1), so overriding the method often outputs more meaningful information than the default implementation. Consider overriding the ToString() method whenever relevant diagnostic information can be provided from the output, specifically when the target audience is developers, since the default object.ToString() output is a type name and is not end-user friendly. Regardless, avoid returning an empty string or null, as the lack of output will be very confusing. ToString() is useful for debugging from within a developer IDE or writing to a log file. For this reason, you should keep the strings relatively short (one screen width) so that they are not cut off. However, the lack of localization and other advanced formatting features makes this approach less suitable for general end-user text display.

1. Unless there is an implicit cast operator, as described in Advanced Topic: Cast Operator.

Overriding GetHashCode()

Overriding GetHashCode() is more complex than overriding ToString(). Even so, you should override GetHashCode() whenever you override Equals(); the compiler issues a warning if you don’t. Overriding GetHashCode() is also good practice when the type is used as a key into a hash table collection (e.g., System.Collections.Hashtable and System.Collections.Generic.Dictionary).

The purpose of the hash code is to efficiently balance a hash table by generating a number that corresponds to the value of an object. Here are some implementation principles for a good GetHashCode() implementation:

  • Required: Equal objects must have equal hash codes (if a.Equals(b), then a.GetHashCode() == b.GetHashCode()).

  • Required: GetHashCode() should return the same value over the life of a particular object, even if the object’s data changes. In many cases, you should cache the method return to enforce this constraint. However, when caching the value, be sure not to use the hash code when checking equality; if you do, two objects that are now identical, one of which cached its hash code before its identity properties changed, will not compare correctly.

  • Required: GetHashCode() should not throw any exceptions; GetHashCode() must always successfully return a value.

  • Performance: Hash codes should be unique whenever possible. However, since hash codes return only an int, there inevitably will be an overlap in hash codes for objects that have potentially more values than an int can hold, which is virtually all types. (An obvious example is long, since there are more possible long values than an int could uniquely identify.)

  • Performance: The possible hash code values should be distributed evenly over the range of an int. For example, creating a hash that doesn’t consider the distribution of a string in Latin-based languages primarily centered on the initial 128 ASCII characters would result in a very uneven distribution of string values and would not be a strong GetHashCode() algorithm.

  • Performance: GetHashCode() should be optimized for performance. GetHashCode() is generally used in Equals() implementations to short-circuit a full equals comparison if the hash codes are different. As a result, it is frequently called when the type is used as a key type in dictionary collections.

  • Performance: Small differences between two objects should result in large differences between hash code values—ideally, a 1-bit difference in the object should result in approximately 16 bits of the hash code changing, on average. This helps ensure that the hash table remains balanced no matter how it is “bucketing” the hash values.

  • Security: It should be difficult for an attacker to craft an object that has a particular hash code. Such an attack seeks to flood a hash table with large amounts of data that all hash to the same value. The hash table implementation can then become inefficient, resulting in a denial-of-service attack.

These guidelines and rules are, of course, contradictory: It is very difficult to come up with a hash algorithm that is fast and meets all of these guidelines. As with any design problem, you’ll need to use a combination of good judgment and realistic performance measurements to come up with a good solution.
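As a rough illustration of the first required rule, the following sketch uses a hypothetical GridPoint struct (not one of the chapter's types) whose hash code is derived from exactly the fields that Equals() compares. The HashContractDemo helper is likewise an assumption added only to show the effect on a hash-based collection.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration; not part of the chapter's listings.
public readonly struct GridPoint : IEquatable<GridPoint>
{
    public GridPoint(int x, int y) { X = x; Y = y; }

    public int X { get; }
    public int Y { get; }

    public bool Equals(GridPoint other) => X == other.X && Y == other.Y;

    public override bool Equals(object? obj) =>
        obj is GridPoint other && Equals(other);

    // Equal points always produce equal hash codes because the hash
    // is derived from exactly the fields that Equals() compares.
    public override int GetHashCode() => HashCode.Combine(X, Y);
}

public static class HashContractDemo
{
    // Returns the number of entries stored after adding two
    // equal-valued points to a HashSet<T>.
    public static int CountDistinct()
    {
        var set = new HashSet<GridPoint>
        {
            new GridPoint(1, 2),
            new GridPoint(1, 2)   // equal value: treated as a duplicate
        };
        return set.Count;
    }
}
```

If GetHashCode() were left inconsistent with Equals(), the second point could land in a different bucket and the set might end up holding both values.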

Consider the GetHashCode() implementation for the Coordinate type shown in Listing 10.2.

Listing 10.2: Implementing GetHashCode()

public struct Coordinate
{
  public Coordinate(Longitude longitude, Latitude latitude)
  {
      Longitude = longitude;
      Latitude = latitude;
  }

  public Longitude Longitude { get; }
  public Latitude Latitude { get; }

  public override int GetHashCode() =>
      HashCode.Combine(
          Longitude.GetHashCode(), Latitude.GetHashCode());

  // ...
}

There are numerous well-established algorithms for the GetHashCode() implementation, each with satisfactory results in terms of the guidelines described (see http://bit.ly/39yP8lm). However, the easiest approach is to call System.HashCode’s Combine() method, specifying a GetHashCode() result from each of the identifying fields—the fields that produce your object’s uniqueness. (If the identifying fields are numbers, be wary of mistakenly using the fields themselves rather than their hash code values.) ValueTuple invokes HashCode.Combine(); thus, it may be easier to remember that you can adequately create a ValueTuple with the same identifying fields (not their hash codes) and invoke the resulting tuple’s GetHashCode() member.

Note that Coordinate does not cache the value of the hash code. Since each field in the hash code calculation is readonly, the value can’t change. However, implementations should cache the hash code if the calculated values could change or if a cached value could offer a significant performance advantage. If you decide to cache the hash code, do not use the hash code when checking equality. Doing so may cause an object with mutable identity properties to fail an equality check because the hash code was calculated before the identity properties changed.

Overriding Equals()

Overriding Equals() without overriding GetHashCode() results in a warning such as that shown in Output 10.1.

Output 10.1

warning CS0659: '<Class Name>' overrides Object.Equals(object? o) but
does not override Object.GetHashCode()

Generally, developers expect overriding Equals() to be trivial, but it includes a surprising number of subtleties that require careful thought and testing.

Object Identity versus Equal Object Values

Two references are identical if both refer to the same instance. object includes a static method called ReferenceEquals() that explicitly checks for this object identity (see Figure 10.1).

[Figure: memory representation illustrating object identity]

Figure 10.1: Identity

However, reference equality is not the only type of equality. Two object instances can also be considered equal if the values of some or all of their members are equal. Consider the comparison of two ProductSerialNumbers shown in Listing 10.3.

Listing 10.3: Overriding the Equality Operator

public sealed class ProductSerialNumber
{
  // ...
}
class Program
{
  static void Main()
  {
      ProductSerialNumber serialNumber1 =
          new ProductSerialNumber("PV", 1000, 09187234);
      ProductSerialNumber serialNumber2 = serialNumber1;
      ProductSerialNumber serialNumber3 =
          new ProductSerialNumber("PV", 1000, 09187234);

      // These serial numbers ARE the same object identity
      if(!ProductSerialNumber.ReferenceEquals(serialNumber1,
          serialNumber2))
      {
          throw new Exception(
              "serialNumber1 does NOT " +
              "reference equal serialNumber2");
      }
      // And, therefore, they are equal
      else if(!serialNumber1.Equals(serialNumber2))
      {
          throw new Exception(
              "serialNumber1 does NOT equal serialNumber2");
      }
      else
      {
          Console.WriteLine(
              "serialNumber1 reference equals serialNumber2");
          Console.WriteLine(
              "serialNumber1 equals serialNumber2");
      }

      // These serial numbers are NOT the same object identity
      if (ProductSerialNumber.ReferenceEquals(serialNumber1,
              serialNumber3))
      {
          throw new Exception(
              "serialNumber1 DOES reference " +
              "equal serialNumber3");
      }

      // But they are equal (assuming Equals() is overridden
      // and the != operator is overloaded)
      else if(!serialNumber1.Equals(serialNumber3) ||
          serialNumber1 != serialNumber3)
      {
          throw new Exception(
              "serialNumber1 does NOT equal serialNumber3");
      }

      Console.WriteLine( "serialNumber1 equals serialNumber3" );
  }
}

The results of Listing 10.3 appear in Output 10.2.

Output 10.2

serialNumber1 reference equals serialNumber2
serialNumber1 equals serialNumber3

As the last assertion demonstrates by its use of ReferenceEquals(), serialNumber1 and serialNumber3 are not the same reference. However, the code constructs them with the same values, and both are logically associated with the same physical product. If one instance was created from data in the database and another was created from manually entered data, you would expect the instances to be equal, so that the product would not be duplicated (reentered) in the database. Two identical references are obviously equal; however, two different objects could be equal but not reference equal. Such objects will not have identical object identities, but they may have key data that identifies them as being equal objects.

Only reference types can be reference equal, thereby supporting the concept of identity. Calling ReferenceEquals() on value types will always return false because value types are boxed when they are converted to object for the call. Even when the same variable is passed in both (value type) parameters to ReferenceEquals(), the result will still be false because the values are boxed independently. Listing 10.4 demonstrates this behavior: Because each argument is put into a “different box” in this example, they are never reference equal.

Note

Calling ReferenceEquals() on value types will always return false.

Listing 10.4: Value Types Never Reference Equal Themselves

public struct Coordinate
{
  public Coordinate(Longitude longitude, Latitude latitude)
  {
      Longitude = longitude;
      Latitude = latitude;
  }

  public Longitude Longitude { get; }
  public Latitude Latitude { get; }
  // ...
}
class Program
{
  public static void Main()
  {
      //...

      Coordinate coordinate1 =
          new Coordinate( new Longitude(48, 52),
                          new Latitude(-2, -20));

      // Value types will never be reference equal
      if ( Coordinate.ReferenceEquals(coordinate1,
          coordinate1) )
      {
          throw new Exception(
              "coordinate1 reference equals coordinate1");
      }

      Console.WriteLine(
          "coordinate1 does NOT reference equal itself" );
  }
}

In contrast to the definition of Coordinate as a reference type in Chapter 9, the definition going forward is that of a value type (struct) because the combination of Longitude and Latitude data is logically thought of as a value and its size is less than 16 bytes. (In Chapter 9, Coordinate aggregated Angle rather than Longitude and Latitude.) A contributing factor to declaring Coordinate as a value type is that it is a (complex) numeric value that has operations on it. In contrast, a reference type such as Employee is not a value that you manipulate numerically, but rather refers to an object in real life.

Implementing Equals()

To determine whether two objects are equal (i.e., if they have the same identifying data), you use an object’s Equals() method. The implementation of this virtual method on object uses ReferenceEquals() to evaluate equality. Since this implementation is often inadequate, it is sometimes necessary to override Equals() with a more appropriate implementation.

Note

The implementation of object.Equals(), the default implementation on all objects before overriding, relies on ReferenceEquals() alone.

For objects to equal one another, the expectation is that the identifying data within them will be equal. For ProductSerialNumbers, for example, the ProductSeries, Model, and Id must be the same; however, for an Employee object, perhaps comparing EmployeeIds would be sufficient to determine equality. To correct the object.Equals() implementation, it is necessary to override it. Value types, for example, override the Equals() implementation to instead use the fields that the type includes.

The steps for overriding Equals() are as follows:

  1. Check for null.

  2. Check for equivalent types.

  3. Invoke a typed helper method that can treat the operand as the compared type rather than an object (see the Equals(Coordinate obj) method in Listing 10.5).

  4. Possibly check for equivalent hash codes to short-circuit an extensive, field-by-field comparison. (Two objects that are equal cannot have different hash codes.)

  5. Check base.Equals().

  6. Compare each identifying field for equality.

  7. Override GetHashCode().

  8. Override the == and != operators (see the next section).

Listing 10.5 shows a sample Equals() implementation.

Listing 10.5: Overriding Equals()

public struct Longitude
{
  // ...
}
public struct Latitude
{
  // ...
}
public struct Coordinate: IEquatable<Coordinate>
{
  public Coordinate(Longitude longitude, Latitude latitude)
  {
      Longitude = longitude;
      Latitude = latitude;
  }
  public Longitude Longitude { get; }
  public Latitude Latitude { get; }

  public override bool Equals(object? obj)
  {
      // STEP 1: Check for null
      if (obj is null)
      {
          return false;
      }
      // STEP 2: Equivalent data types;
      // can be avoided if type is sealed
      if (GetType() != obj.GetType())
      {
          return false;
      }
      // STEP 3: Invoke the strongly typed helper version of Equals()
      return Equals((Coordinate)obj);
  }
  public bool Equals(Coordinate obj)
  {
      // STEP 1: Check for null if the parameter is a reference type
      // (not needed here because Coordinate is a struct)
      // if (ReferenceEquals(obj, null))
      // {
      //     return false;
      // }

      // STEP 4: Possibly check for equivalent hash codes
      // but not if the identity properties are mutable
      // and the hash code is cached.
      // if (GetHashCode() != obj.GetHashCode())
      // {
      //    return false;
      // }

      // STEP 5: Check base.Equals if base overrides Equals()
      if ( !base.Equals(obj) )
      {
          return false;
      }

      // STEP 6: Compare identifying fields for equality
      // using an overload of Equals on Longitude
      return ( (Longitude.Equals(obj.Longitude)) &&
          (Latitude.Equals(obj.Latitude)) );
  }

  // STEP 7: Override GetHashCode
  public override int GetHashCode() { /* ... */  }
}

In this implementation, the first two checks are relatively obvious. However, step 2 can be avoided if the type is sealed.

Steps 4 to 6 occur in an overload of Equals() that takes the Coordinate data type specifically. This way, a comparison of two Coordinates will avoid Equals(object? obj) and its GetType() check altogether.

Since GetHashCode() is not cached and is no more efficient than step 6, the GetHashCode() comparison is commented out. Regardless, because GetHashCode() does not necessarily return a unique value (it can only establish that operands are different), on its own it does not conclusively identify equal objects. Furthermore, you should not compare the hash code when identity values are mutable and the hash code is cached; if you do, a comparison of equal objects may return false.

If base.Equals() is not implemented, you could eliminate step 5. However, if base.Equals() was added later, you would be missing an important check. For this reason, you should consider adding it by default.

Like GetHashCode(), Equals() should never throw any exceptions. It is a valid choice to compare any object with any other object, and doing so should never result in an exception.
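For a reference type, the same steps might look like the following minimal sketch of ProductSerialNumber. The property names come from the identifying data mentioned earlier (ProductSeries, Model, and Id), but their types and the constructor shape are assumptions, since the chapter elides the class body. Step 5 is omitted here because the base class is object, whose Equals() only checks reference equality.

```csharp
using System;

// A sketch only: the property types and constructor below are
// assumptions based on the identifying data named in the text.
public sealed class ProductSerialNumber : IEquatable<ProductSerialNumber>
{
    public ProductSerialNumber(string productSeries, int model, long id)
    {
        ProductSeries = productSeries;
        Model = model;
        Id = id;
    }

    public string ProductSeries { get; }
    public int Model { get; }
    public long Id { get; }

    // Steps 1-3: a null or differently typed argument fails the
    // pattern match; the type is sealed, so no GetType() check is needed.
    public override bool Equals(object? obj) =>
        obj is ProductSerialNumber other && Equals(other);

    public bool Equals(ProductSerialNumber? other)
    {
        // Step 1 (reference type): check for null without using ==
        if (other is null)
        {
            return false;
        }

        // Step 6: compare each identifying field
        return ProductSeries == other.ProductSeries
            && Model == other.Model
            && Id == other.Id;
    }

    // Step 7: hash the same fields that Equals() compares
    public override int GetHashCode() =>
        HashCode.Combine(ProductSeries, Model, Id);
}
```

Neither method throws for any input, including null, which matches the expectation that comparing any two objects is always a valid operation.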

Overriding GetHashCode() and Equals() with Tuples

As shown in the previous two sections, the implementations of Equals() and GetHashCode() are fairly complex, yet the actual code is generally boilerplate. For Equals(), it’s necessary to compare all the contained identifying data structures while avoiding infinite recursion and null reference exceptions. For GetHashCode(), it’s necessary to combine the unique hash code of each of the non-null-contained identifying data structures in an exclusive OR operation. With C# 7.0 tuples, this turns out to be quite simple.

For Equals(Coordinate coordinate), you can group each of the identifying members into a tuple and compare them to the target argument of the same type:

public bool Equals(Coordinate coordinate) =>
  (Longitude, Latitude).Equals(
    (coordinate.Longitude, coordinate.Latitude));

(One might argue that this would be more readable if each identifying member were explicitly compared instead, but I leave that for the reader to arbitrate.) Internally, the tuple (System.ValueTuple<...>) uses EqualityComparer<T>, which relies on the type parameter’s implementation of IEquatable<T> (which contains only a single Equals(T other) member). Therefore, to correctly override Equals, you need to follow this guideline: DO implement IEquatable<T> when overriding Equals(). That way, your own custom data types will leverage your custom implementation of Equals() rather than Object.Equals().

Perhaps the more compelling of the two overrides is GetHashCode() and its use of the tuple. Rather than engaging in the complex gymnastics of an exclusive OR operation of the non-null identifying members, you can simply instantiate a tuple of all identifying members and return the GetHashCode() value for the tuple, like so:

public override int GetHashCode() =>
  (Radius, StartAngle, SweepAngle).GetHashCode();

Note that in C# 7.3, the tuple now implements == and !=, which it should have when it was first implemented—a topic we investigate next.
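Putting the tuple-based pieces together, a complete type might look roughly like the sketch below. The Angle struct and its integer members are hypothetical stand-ins chosen to keep the example self-contained; they are not the chapter's definitions.

```csharp
using System;

// Hypothetical type for illustration, with integer members so that
// tuple Equals() and tuple == agree for every possible value.
public readonly struct Angle : IEquatable<Angle>
{
    public Angle(int degrees, int minutes, int seconds)
    {
        Degrees = degrees;
        Minutes = minutes;
        Seconds = seconds;
    }

    public int Degrees { get; }
    public int Minutes { get; }
    public int Seconds { get; }

    // Group the identifying members into tuples and compare them.
    public bool Equals(Angle other) =>
        (Degrees, Minutes, Seconds).Equals(
            (other.Degrees, other.Minutes, other.Seconds));

    public override bool Equals(object? obj) =>
        obj is Angle other && Equals(other);

    // The tuple combines the hash codes of its elements.
    public override int GetHashCode() =>
        (Degrees, Minutes, Seconds).GetHashCode();

    // Since C# 7.3, tuples support == and !=, compared element by element.
    public static bool operator ==(Angle left, Angle right) =>
        (left.Degrees, left.Minutes, left.Seconds) ==
        (right.Degrees, right.Minutes, right.Seconds);

    public static bool operator !=(Angle left, Angle right) =>
        !(left == right);
}
```

One caveat when adapting this to floating-point members: tuple Equals() uses the element type's Equals() method, while tuple == uses the element type's == operator, and the two treat double.NaN differently.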

Operator Overloading

The preceding section looked at overriding Equals() and provided the guideline that the class should also implement == and !=. Implementing any operator is called operator overloading. This section describes how to perform such overloading not only for == and !=, but also for other supported operators.

For example, string provides a + operator that concatenates two strings. This is perhaps not surprising, because string is a predefined type, so it could possibly have special compiler support. However, C# provides for adding + operator support to a class or struct. In fact, all operators are supported except x.y, f(x), new, typeof, default, checked, unchecked, delegate, is, as, =, and =>. One particularly noteworthy operator that cannot be implemented is the assignment operator; there is no way to change the behavior of the = operator.

Before going through the exercise of implementing an operator overload, consider the fact that such operators are not discoverable through IntelliSense. Unless the intent is for a type to act like a primitive type (e.g., a numeric type), you should avoid overloading an operator.

Comparison Operators (==, !=, <, >, <=, >=)

Once Equals() is overridden, there is a possible inconsistency. That is, two objects could return true for Equals() but false for the == operator because == performs a reference equality check by default. To correct this flaw, it is important to overload the equals (==) and not equals (!=) operators as well.

For the most part, the implementation for these operators can delegate the logic to Equals(), or vice versa. However, for reference types, some initial null checks are required first (see Listing 10.6).

Listing 10.6: Implementing the == and != Operators

public sealed class ProductSerialNumber
{
  // ...

  public static bool operator ==(
      ProductSerialNumber leftHandSide,
      ProductSerialNumber rightHandSide)
  {

      // Check if leftHandSide is null
      // (operator == would be recursive)
      if(leftHandSide is null)
      {
          // Return true if rightHandSide is also null
          // and false otherwise
          return rightHandSide is null;
      }

      return leftHandSide.Equals(rightHandSide);
  }

  public static bool operator !=(
      ProductSerialNumber leftHandSide,
      ProductSerialNumber rightHandSide)
  {
      return !(leftHandSide == rightHandSide);
  }
}

Note that in this example, we use ProductSerialNumber rather than Coordinate to demonstrate the logic for a reference type, which has the added complexity of a null value.

You should avoid using the equality operator within an equality operator (leftHandSide == null). Doing so would recursively call back into the method, resulting in a loop that continues until the stack overflows. To avoid this problem, you can use is null (C# 7.0 or later) or ReferenceEquals() to check for null.
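The ReferenceEquals() alternative mentioned above can be sketched as follows. The Sku class is a hypothetical type introduced only for this example; the point is that neither null check invokes the overloaded == operator, so no recursion occurs.

```csharp
using System;

// Hypothetical reference type for illustration.
public sealed class Sku : IEquatable<Sku>
{
    public Sku(string code)
    {
        Code = code ?? throw new ArgumentNullException(nameof(code));
    }

    public string Code { get; }

    public bool Equals(Sku? other) =>
        !(other is null) && Code == other.Code;

    public override bool Equals(object? obj) => Equals(obj as Sku);

    public override int GetHashCode() => Code.GetHashCode();

    public static bool operator ==(Sku? left, Sku? right)
    {
        // ReferenceEquals() never calls the overloaded operator, so it
        // cannot recurse. It also covers the case where both are null.
        if (ReferenceEquals(left, right))
        {
            return true;
        }

        // "is null" is a pattern match, not a call to operator ==.
        if (left is null)
        {
            return false;
        }

        return left.Equals(right);
    }

    public static bool operator !=(Sku? left, Sku? right) =>
        !(left == right);
}
```

Writing left == null inside the operator body instead would bind to this same operator and call it again without end.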

Binary Operators (+, -, *, /, %, &, |, ^, <<, >>)

You can add an Arc to a Coordinate. However, the code so far provides no support for the addition operator. Instead, you need to define such a method, as Listing 10.7 demonstrates.

Listing 10.7: Adding an Operator

struct Arc
{
  public Arc(
      Longitude longitudeDifference,
      Latitude latitudeDifference)
  {
      LongitudeDifference = longitudeDifference;
      LatitudeDifference = latitudeDifference;
  }

  public Longitude LongitudeDifference { get; }
  public Latitude LatitudeDifference { get; }
}
struct Coordinate
{
  // ...
  public static Coordinate operator +(
      Coordinate source, Arc arc)
  {
      Coordinate result = new Coordinate(
          new Longitude(
              source.Longitude + arc.LongitudeDifference),
          new Latitude(
              source.Latitude + arc.LatitudeDifference));
      return result;
  }
}

The +, -, *, /, %, &, |, ^, <<, and >> operators are implemented as binary static methods, where at least one parameter is of the containing type. The method name is the operator symbol prefixed by the keyword operator. As shown in Listing 10.8, given the definition of the - and + binary operators, you can add and subtract an Arc to and from the coordinate. Note that Longitude and Latitude will also require implementations of the + operator because they are called by source.Longitude + arc.LongitudeDifference and source.Latitude + arc.LatitudeDifference.

Listing 10.8: Calling the - and + Binary Operators

public class Program
{
  public static void Main()
  {
      Coordinate coordinate1,coordinate2;
      coordinate1 = new Coordinate(
          new Longitude(48, 52), new Latitude(-2, -20));
      Arc arc = new Arc(new Longitude(3), new Latitude(1));

      coordinate2 = coordinate1 + arc;
      Console.WriteLine(coordinate2);

      coordinate2 = coordinate2 - arc;
      Console.WriteLine(coordinate2);

      coordinate2 += arc;
      Console.WriteLine(coordinate2);
  }
}

The results of Listing 10.8 appear in Output 10.3.

Output 10.3

51° 52' 0 E  -1° -20' 0 N
48° 52' 0 E  -2° -20' 0 N
51° 52' 0 E  -1° -20' 0 N

For Coordinate, you implement the - and + operators to return coordinate locations after adding/subtracting Arc. This allows you to string multiple operators and operands together, as in result = ((coordinate1 + arc1) + arc2) + arc3. Moreover, by supporting the same operators (+/-) on Arc (see Listing 10.9 later in this chapter), you could eliminate the parentheses. This approach works because the result of the first operand (arc1 + arc2) is another Arc, which you can then add to the next operand of type Arc or Coordinate.

In contrast, consider what would happen if you provided a - operator that had two Coordinates as parameters and returned a double corresponding to the distance between the two coordinates. Adding a double to a Coordinate is undefined, so you could not string together operators and operands. Caution is in order when defining operators that return a different type, because doing so is counterintuitive.

Combining Assignment with Binary Operators (+=, -=, *=, /=, %=, &=, …)

As previously mentioned, there is no support for overloading the assignment operator. However, assignment operators in combination with binary operators (+=, -=, *=, /=, %=, &=, |=, ^=, <<=, and >>=) are effectively overloaded when overloading the binary operator. Given the definition of a binary operator without the assignment, C# automatically allows for assignment in combination with the operator. Using the definition of Coordinate in Listing 10.7, therefore, you can have code such as

coordinate += arc;

which is equivalent to the following:

coordinate = coordinate + arc;
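The same behavior can be seen in isolation with a small sketch. Offset and CompoundAssignmentDemo are hypothetical names used only for this example; the struct declares the binary + operator and nothing else.

```csharp
using System;

// Hypothetical type for illustration.
public readonly struct Offset
{
    public Offset(int value) { Value = value; }

    public int Value { get; }

    // Only the binary + operator is declared; the compiler makes +=
    // available automatically.
    public static Offset operator +(Offset left, Offset right) =>
        new Offset(left.Value + right.Value);
}

public static class CompoundAssignmentDemo
{
    public static int Run()
    {
        Offset total = new Offset(3);
        total += new Offset(4);   // same as: total = total + new Offset(4);
        return total.Value;       // 7
    }
}
```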

Conditional Logical Operators (&&, ||)

Like assignment operators, conditional logical operators cannot be overloaded explicitly. However, because the logical operators & and | can be overloaded, and the conditional operators are defined in terms of the logical operators, it is effectively possible to overload conditional operators. x && y is processed as x & y unless x evaluates to false, in which case the result is x and y is never evaluated. Similarly, x || y is processed as x | y unless x evaluates to true. To enable support for evaluating a type to true or false (in an if statement, for example), it is necessary to overload the true/false unary operators.

Unary Operators (+, -, !, ~, ++, --, true, false)

Overloading unary operators is very similar to overloading binary operators, except that they take only one parameter, also of the containing type. Listing 10.9 overloads the + and - operators for Longitude and Latitude and then uses these operators when overloading the same operators in Arc.

Listing 10.9: Overloading the - and + Unary Operators

public struct Latitude
{
  // ...
  public static Latitude operator -(Latitude latitude)               
  {                                                                 
      return new Latitude(-latitude.DecimalDegrees);                 
  }                                                                  
  public static Latitude operator +(Latitude latitude)               
  {                                                                  
      return latitude;                                               
  }                                                                  
}
public struct Longitude
{
  // ...
  public static Longitude operator -(Longitude longitude)             
  {                                                                  
      return new Longitude(-longitude.DecimalDegrees);               
  }                                                                  
  public static Longitude operator +(Longitude longitude)            
  {                                                                  
      return longitude;                                              
  }                                                                  
}
public struct Arc
{
  // ...
  public static Arc operator -(Arc arc)
  {
      // Uses unary - operator defined on
      // Longitude and Latitude
      return new Arc(-arc.LongitudeDifference,
          -arc.LatitudeDifference);
  }
  public static Arc operator +(Arc arc)
  {
      return arc;
  }
}

Just as with numeric types, the + operator in this listing doesn’t have any effect and is provided for symmetry.

Overloading true and false is subject to the additional requirement that both must be overloaded—not just one of the two. The signatures are the same as with other operator overloads; however, the return must be a bool, as demonstrated in Listing 10.10.

Listing 10.10: Overloading the true and false Operators

public static bool operator false(IsValid item)
{
    // ...
}
public static bool operator true(IsValid item)
{
    // ...
}

You can use types with overloaded true and false operators in if, do, while, and for controlling expressions.
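To see how the true and false operators interact with the conditional logical operators described earlier, consider the following sketch. Validity and ConditionalOperatorDemo are hypothetical names; the Evaluations counter exists only to make the short-circuiting observable.

```csharp
using System;

// Hypothetical type for illustration.
public readonly struct Validity
{
    public Validity(bool ok) { Ok = ok; }

    public bool Ok { get; }

    // Both operators must be declared together.
    public static bool operator true(Validity v) => v.Ok;
    public static bool operator false(Validity v) => !v.Ok;

    // Declaring & and | (returning the containing type) is what makes
    // && and || available on Validity.
    public static Validity operator &(Validity a, Validity b) =>
        new Validity(a.Ok && b.Ok);
    public static Validity operator |(Validity a, Validity b) =>
        new Validity(a.Ok || b.Ok);
}

public static class ConditionalOperatorDemo
{
    public static int Evaluations;

    private static Validity Check(bool ok)
    {
        Evaluations++;
        return new Validity(ok);
    }

    public static string Run()
    {
        Evaluations = 0;

        // The left operand is false, so operator false returns true and
        // the right operand is never evaluated. The if statement then
        // applies operator true to the result.
        if (Check(false) && Check(true))
        {
            return "valid";
        }
        return "invalid";
    }
}
```

After Run() returns, Evaluations is 1 rather than 2, because the second Check() call was skipped.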

Conversion Operators

Currently, there is no support in Longitude, Latitude, and Coordinate for casting to an alternative type. For example, there is no way to cast a double into a Longitude or Latitude instance. Similarly, there is no support for assigning a Coordinate using a string. Fortunately, C# provides for the definition of methods specifically intended to handle the converting of one type to another. Furthermore, the method declaration allows for specifying whether the conversion is implicit or explicit.

Defining a conversion operator is similar in style to defining any other operator, except that the “operator” is the resultant type of the conversion. Additionally, the operator keyword follows a keyword that indicates whether the conversion is implicit or explicit (see Listing 10.11).

Listing 10.11: Providing an Implicit Conversion between Latitude and double

public struct Latitude
{
  // ...

  public Latitude(double decimalDegrees)
  {
      DecimalDegrees = Normalize(decimalDegrees);
  }

  public double DecimalDegrees { get; }

  // ...

  public static implicit operator double(Latitude latitude)
  {
      return latitude.DecimalDegrees;
  }
  public static implicit operator Latitude(double degrees)
  {
      return new Latitude(degrees);
  }

  // ...
}

With these conversion operators, you now can convert doubles implicitly to and from Latitude objects. Assuming similar conversions exist for Longitude, you can simplify the creation of a Coordinate object by specifying the decimal degrees portion of each coordinate portion (e.g., coordinate = new Coordinate(43, 172);).

Note

When implementing a conversion operator, either the return or the parameter must be of the enclosing type—in support of encapsulation. C# does not allow you to specify conversions outside the scope of the converted type.

Guidelines for Conversion Operators

The difference between defining an implicit and an explicit conversion operator centers on preventing an unintentional implicit conversion that results in undesirable behavior. Two considerations determine when a conversion should be explicit. First, conversion operators that throw exceptions should always be explicit. For example, it is highly likely that a string will not conform to the format that a conversion from string to Coordinate requires. Given the chance of a failed conversion, you should define the particular conversion operator as explicit, thereby requiring that you be intentional about the conversion and ensure that the format is correct or, alternatively, that you provide code to handle the possible exception. Frequently, the pattern for conversion is that one direction (string to Coordinate) is explicit and the reverse (Coordinate to string) is implicit.

A second consideration is that some conversions will be lossy. Converting from a float (4.2) to an int is entirely valid, assuming an awareness of the fact that the decimal portion of the float will be lost. Any conversions that will lose data and will not successfully convert back to the original type should be defined as explicit. If an explicit cast is unexpectedly lossy or invalid, consider throwing a System.InvalidCastException.
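
Both guidelines can be sketched on a pared-down Latitude. The two operators below are assumptions made for illustration rather than part of the chapter's listings: one parses a string and can therefore throw, and the other truncates to whole degrees and is therefore lossy. The ExplicitConversionDemo class is likewise hypothetical.

```csharp
using System;
using System.Globalization;

public readonly struct Latitude
{
    public Latitude(double decimalDegrees) => DecimalDegrees = decimalDegrees;
    public double DecimalDegrees { get; }

    // Parsing can throw FormatException, so the conversion is explicit.
    public static explicit operator Latitude(string text) =>
        new Latitude(double.Parse(text, CultureInfo.InvariantCulture));

    // Truncation discards the fractional degrees, so the conversion
    // is explicit here as well.
    public static explicit operator int(Latitude latitude) =>
        (int)latitude.DecimalDegrees;
}

// Hypothetical helper used only to show the required casts.
public static class ExplicitConversionDemo
{
    public static int WholeDegrees(string text)
    {
        // The casts make both the possible failure and the data loss
        // visible at the call site.
        Latitude latitude = (Latitude)text;
        return (int)latitude;
    }
}
```

Because both operators are explicit, a caller cannot trigger the parse or the truncation without writing a cast, which is the intent of the guidelines above.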

Referencing Other Assemblies

Instead of placing all code into one monolithic binary file, C# and the underlying CLI framework allow you to spread code across multiple assemblies. This approach enables you to reuse assemblies across multiple executables.

Frequently, the code we write could be useful to more than one program. Imagine, for example, using the Longitude, Latitude, and Coordinate classes from a mapping program and a digital photo geocoding program or writing a command-line parser class. Classes and sets of classes like these can be written once and then reused from many different programs. As such, they need to be grouped together into an assembly called a library or class library and written for the purposes of reuse rather than only within a single program.

To create a library rather than a console project, follow the same directions as provided in Chapter 1, with one exception: For Dotnet CLI, use Class Library or classlib for the template.

Similarly, with Visual Studio 2019, from the File->New Project… menu item (Ctrl+Shift+N), use the Search text box to find all Class Library templates, and then select Class Library (.NET Standard)—the Visual C# version, of course. Use GeoCoordinates for the project name.

Next, place the source code from Listing 10.9 into a separate file for each struct, name each file after the struct it contains, and build the project. Building the project will compile the C# code into an assembly (a GeoCoordinates.dll file) and place it into a subdirectory of .\bin\.

Referencing a Library

Given the library, we need to reference it from a program. For example, for a new console program using the Program class from Listing 10.8, we need to add a reference to the GeoCoordinates.dll assembly, identifying where the library is located and embedding metadata that uniquely identifies the library into the program. There are several ways to do this. First, you can reference the library project file (*.csproj), thus identifying which project contains the library source code and forming a dependency between the two projects. You can’t compile the program referencing the library until the library is compiled. This dependency causes the library to compile (if it isn’t compiled already) when the program compiles.

The second approach is to reference the assembly file itself. In other words, reference the compiled library (*.dll) rather than the project. This makes sense when the library is compiled separately from the program, such as by another team within your organization.

Third, you can reference a NuGet package, as described in the next section.

Note that it isn’t only console programs that can reference libraries and packages. In fact, any assembly can reference any other assembly. Frequently, one library will reference another library, creating a chain of dependencies.

Referencing a Project or Library with Dotnet CLI

In Chapter 1, we discussed creating a console program. Doing so created a program that included a Main method—the entry point at which the program will begin executing. To add a reference to the newly created assembly, we continue where we left off with an additional command for adding a reference:

dotnet add .\HelloWorld\HelloWorld.csproj package .\GeoCoordinates\bin\Debug\netcoreapp2.0\GeoCoordinates.dll

Following the package argument is the file path of the compiled assembly that the project references.

Rather than referencing the assembly, you can reference the project file. As already mentioned, this chains the projects together so that building the program will trigger the class library to compile first if it hasn’t compiled already. The advantage is that as the program compiles, it will automatically locate the compiled class library assembly—whether it be in the debug or release directory, for example. The command for referencing a project file is as follows:

dotnet add .\HelloWorld\HelloWorld.csproj reference .\GeoCoordinates\GeoCoordinates.csproj

If you have the source code for a class library and that source code changes frequently, consider referencing the class library using the class library project file rather than the compiled assembly.

Upon completion of either the project or the compiled assembly reference, your project can compile with the Program class source code found in Listing 10.8.

Referencing a Project or Library with Visual Studio 2019

In Chapter 1, we also discussed creating a console program with Visual Studio. This created a program that included a Main method. To add a reference to the GeoCoordinates assembly, click the Project->Add Reference… menu item. Next, from the Projects\Solution tab, select the GeoCoordinates project and click OK to confirm the reference.

Similarly, to add an assembly reference, follow the same process as before, clicking the Project->Add Reference… menu item. However, this time click the Browse… button and navigate to and select the GeoCoordinates.dll assembly.

As with Dotnet CLI, you can compile the program project with the Program class source code found in Listing 10.8.

NuGet Packaging

Starting with Visual Studio 2010, Microsoft introduced a library packaging system called NuGet. This system is intended to provide a means to easily share libraries across projects and between companies. Frequently, a library assembly is more than just a single compiled file. It might have configuration files, additional resources, and metadata associated with it. Unfortunately, before NuGet, there was no manifest that identified all the dependencies. Furthermore, there was no standard provider or package library for where the referenced assemblies could be found.

NuGet addresses both issues. Not only does NuGet include a manifest that identifies the author(s), companies, dependencies, and more, it also comes with a default package provider at NuGet.org where packages can be uploaded, updated, indexed, and then downloaded by projects that are looking to leverage them. With NuGet, you can reference a NuGet package (*.nupkg) and have it automatically installed from one of your preconfigured NuGet provider URLs.

The NuGet package is accompanied by a manifest (a *.nuspec file) that contains all the additional metadata included in the package. Additionally, it provides all the additional resources you may want—localization files, config files, content files, and so on. In the end, the NuGet package is an archive of all the individual resources combined into a single ZIP file—albeit with the .nupkg extension. If you rename the file with a *.zip extension, you can open and examine the file using any common compression utility.

NuGet References with Dotnet CLI

Adding a NuGet package to your project using Dotnet CLI requires executing a single command:

>dotnet add .\HelloWorld\HelloWorld.csproj package Microsoft.Extensions.Logging.Console

This command checks each of the registered NuGet package providers for the specified package and downloads it. (You can also trigger the download explicitly using the command dotnet restore.)

To create a local NuGet package, use the dotnet pack command. This command generates a GeoCoordinates.1.0.0.nupkg file, which you can reference using the add ... package command.

The digits following the assembly name correspond to the package version number. To specify the version number explicitly, edit the project file (*.csproj) and add a <Version>...</Version> child element to the PropertyGroup element.

NuGet References with Visual Studio 2019

If you followed the instructions laid out in Chapter 1, you already have a HelloWorld project. Starting with that project, you can add a NuGet package using Visual Studio 2019 as follows:

  1. Click the Project->Manage NuGet Packages… menu item (see Figure 10.2).

    Figure 10.2: The Project menu

  2. Select the Browse filter (generally the Installed filter is selected, so be sure to switch to Browse to add new package references), and then enter Microsoft.Extensions.Logging.Console into the Search (Ctrl+E) text box. Note that a partial name such as Logging.Console will also filter the list (see Figure 10.3).

    Figure 10.3: The Browse filter

  3. Click the Install button to install the package into the project.

Upon completion of these steps, it is possible to begin using the Microsoft.Extensions.Logging.Console library, along with any dependencies that it may have (which are automatically added in the process).

As with Dotnet CLI, you can use Visual Studio to build your own NuGet package using the Build->Pack <Project Name> menu item. Similarly, you can specify the package version number from the Package tab of the Project Properties.

Invoking a Referenced Package or Project

Once the package or project is referenced, you can begin using it as though all the source code were included in the project. Listing 10.12 shows, for example, how to use the Microsoft.Extensions.Logging library, and Output 10.4 shows the sample output.

Listing 10.12: Invoking a NuGet Package Reference

using Microsoft.Extensions.Logging;

public class Program
{
  public static void Main(string[] args)
  {
      using ILoggerFactory loggerFactory =
          LoggerFactory.Create(builder =>
              builder.AddConsole()/*.AddDebug()*/);

      ILogger logger = loggerFactory.CreateLogger(
          categoryName: "Console");

      logger.LogInformation($@"Hospital Emergency Codes: = '{
                string.Join("', '", args)}'");

      // ...
      logger.LogWarning("This is a test of the emergency...");

      // ...
  }
}

Output 10.4

>dotnet run -- black blue brown CBR orange purple red yellow
info: Console[0]
      Hospital Emergency Codes: = 'black', 'blue', 'brown', 'CBR',
'orange', 'purple', 'red', 'yellow'
warn: Console[0]
      This is a test of the emergency...

The Microsoft.Extensions.Logging.Console NuGet package is used to log data to the console. In this case, we log both an information message and a warning, and both messages appear in the console.

If you also referenced the Microsoft.Extensions.Logging.Debug library, you could add an .AddDebug() invocation after or before the AddConsole() invocation. The result would be that output similar to Output 10.4 would also appear in the debug output window of Visual Studio (select the Debug->Windows->Output menu) or Visual Studio Code (with the View->Debug Console menu).

The Microsoft.Extensions.Logging.Console NuGet package has three dependencies, including Microsoft.Extensions.Logging. Each of these is listed under the Dependencies\Packages node of the project in the Solution Explorer window of Visual Studio. When you add a NuGet package, all of its dependencies are added automatically.

Encapsulation of Types

Just as classes serve as an encapsulation boundary for behavior and data, so assemblies provide for similar boundaries among groups of types. Developers can break a system into assemblies and then share those assemblies with multiple applications or integrate them with assemblies provided by third parties.

public or internal Access Modifiers on Type Declarations

By default, a class or struct without any access modifier is defined as internal.2 The result is that the class is inaccessible from outside the assembly. Even if another assembly references the assembly containing the class, all internal classes within the referenced assemblies will be inaccessible.

2. Excluding nested types, which are private by default.

Just as private and protected provide levels of encapsulation to members within a class, so C# supports the use of access modifiers at the class level for control over the encapsulation of the classes within an assembly. The access modifiers available are public and internal. To expose a class outside the assembly, the class must be marked as public. Therefore, before compiling the GeoCoordinates.dll assembly, it is necessary to mark the type declarations as public (see Listing 10.13).

Listing 10.13: Making Types Available outside an Assembly

public struct Coordinate
{
  // ...
}
public struct Latitude
{
  // ...
}
public struct Longitude
{
  // ...
}
public struct Arc
{
  // ...
}

Similarly, declarations such as class and enum can be either public or internal.3 The internal access modifier is not limited to type declarations; that is, it is also available on type members. Consequently, you can designate a type as public but mark specific methods within the type as internal so that the members are available only from within the assembly. It is not possible for the members to have a greater accessibility than the type. If the class is declared as internal, public members on the type will be accessible only from within the assembly.

3. You can decorate nested classes with any access modifier available to other class members (e.g., private). However, outside the class scope, the only access modifiers that are available are public and internal.
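
The combination of type-level and member-level modifiers can be sketched as follows. The ArcFormatter and AccessibilityDemo types are hypothetical, and the Arc shown here is a simplified class rather than the struct from Listing 10.13. Because accessibility across assemblies cannot be shown within a single file, the sketch uses reflection to report what the compiler recorded for each declaration.

```csharp
using System;
using System.Reflection;

// Public type: visible to assemblies that reference this one.
public class Arc
{
    public double Degrees { get; set; }

    // Internal member on a public type: callable only from within
    // the declaring assembly.
    internal void Normalize() { Degrees %= 360; }
}

// No access modifier on a top-level type means internal.
class ArcFormatter
{
    // Declared public, but reachable only where the internal type is.
    public string Format(Arc arc) => $"{arc.Degrees} degrees";
}

// Hypothetical helper that inspects the declared accessibility.
public static class AccessibilityDemo
{
    public static bool IsVisibleOutsideAssembly(Type type) => type.IsVisible;

    public static bool IsInternalMethod(Type type, string name)
    {
        var method = type.GetMethod(
            name, BindingFlags.Instance | BindingFlags.NonPublic);
        // IsAssembly is true when the method's accessibility is internal.
        return method != null && method.IsAssembly;
    }
}
```

Under these assumptions, Arc is reported as visible outside the assembly, ArcFormatter is not, and Arc.Normalize() is reported as an internal method even though its containing type is public.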

The protected internal Type Modifier

Another type member access modifier is protected internal. Members with an accessibility modifier of protected internal will be accessible from all locations within the containing assembly and from classes that derive from the type, even if the derived class is not in the same assembly. The default member access modifier is private, so when you add an access modifier (other than public), the member becomes slightly more visible.

Note

Members with an accessibility modifier of protected internal will be accessible from all locations within the containing assembly and from classes that derive from the type, even if the derived class is not in the same assembly.
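
A minimal sketch of the two access paths follows; Shape, Circle, and Renderer are hypothetical names. In a single file both callers live in the same assembly, so the derived-class path is shown only for its syntax; it would continue to compile if Circle were moved to another assembly, whereas the Renderer call would not.

```csharp
using System.Reflection;

public class Shape
{
    // Accessible anywhere in this assembly and from derived classes
    // in any assembly.
    protected internal string Describe() => "shape";
}

public class Circle : Shape
{
    // Access through inheritance (the "protected" half).
    public string DescribeFromDerived() => Describe();
}

public static class Renderer
{
    // Access from an unrelated class in the same assembly
    // (the "internal" half).
    public static string DescribeFromSameAssembly(Shape shape) =>
        shape.Describe();

    // Reflection reports protected internal as "family or assembly".
    public static bool IsProtectedInternal()
    {
        var method = typeof(Shape).GetMethod(
            "Describe", BindingFlags.Instance | BindingFlags.NonPublic);
        return method != null && method.IsFamilyOrAssembly;
    }
}
```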

Defining Namespaces

As mentioned in Chapter 2, all data types are identified by the combination of their namespace and their name. However, in the CLR, there is no such thing as a “namespace.” The type’s name actually is the fully qualified type name, including the namespace. For the classes you defined earlier, there was no explicit namespace declaration. Classes such as these are automatically declared as members of the default global namespace. It is likely that such classes will experience a name collision, which occurs when you attempt to define two classes with the same name. Once you begin referencing other assemblies from third parties, the likelihood of a name collision increases even further.

More important, there are thousands of types in the CLI framework and multiple orders of magnitude more outside the framework. Finding the right type for a particular problem, therefore, could potentially be a significant challenge.

The resolution to both of these problems is to organize all the types, grouping them into logically related categories called namespaces. For example, classes outside the System namespace are generally placed into a namespace corresponding with the company, product name, or both. Classes from Addison-Wesley, for example, are placed into an Awl or AddisonWesley namespace, and classes from Microsoft (not System classes) are located in the Microsoft namespace. The second level of a namespace should be a stable product name that will not vary between versions. Stability, in fact, is key at all levels. Changing a namespace name is a version-incompatible change that should be avoided. For this reason, you should avoid using volatile names (organization hierarchy, fleeting brands, and so on) within a namespace name.

Namespaces should be labeled using PascalCase, but if your brand uses nontraditional casing, it is acceptable to use the brand casing. (Consistency is key, so if that will be problematic—with PascalCase or brand-based casing—favor the use of whichever convention will produce the greater consistency.) You use the namespace keyword to create a namespace and to assign a class to it, as shown in Listing 10.14.

Listing 10.14: Defining a Namespace

// Define the namespace AddisonWesley
namespace AddisonWesley                                                   
{                                                                         
  class Program
  {
      // ...
  }
}                                                                          
// End of AddisonWesley namespace declaration

All content between the namespace declaration’s curly braces will then belong within the specified namespace. In Listing 10.14, for example, Program is placed into the namespace AddisonWesley, making its full name AddisonWesley.Program.

NOTE

In the CLR, there is no such thing as a “namespace.” Rather, the type’s name is the fully qualified type name.

Like classes, namespaces support nesting. This provides for a hierarchical organization of classes. All the System classes relating to network APIs are in the namespace System.Net, for example, and those relating to the Web are in System.Web.

There are two ways to nest namespaces. The first approach is to nest them within one another (similar to classes), as demonstrated in Listing 10.15.

Listing 10.15: Nesting Namespaces within One Another

// Define the namespace AddisonWesley
namespace AddisonWesley
{
  // Define the namespace AddisonWesley.Michaelis                  
  namespace Michaelis                                              
  {                                                                
    // Define the namespace                                        
    // AddisonWesley.Michaelis.EssentialCSharp                     
    namespace EssentialCSharp                                      
    {                                                              
      // Declare the class                                         
      // AddisonWesley.Michaelis.EssentialCSharp.Program           
      class Program
      {
          // ...
      }
    }                                                              
  }                                                                
}
// End of AddisonWesley namespace declaration

Such a nesting will assign the Program class to the AddisonWesley.Michaelis.EssentialCSharp namespace.

The second approach is to use the full namespace in a single namespace declaration in which a period separates each identifier, as shown in Listing 10.16.

Listing 10.16: Nesting Namespaces Using a Period to Separate Each Identifier

// Define the namespace AddisonWesley.Michaelis.EssentialCSharp
namespace AddisonWesley.Michaelis.EssentialCSharp                       
{                                                                       
  class Program
  {
      // ...
  }
}
// End of AddisonWesley.Michaelis.EssentialCSharp namespace declaration

Regardless of whether a namespace declaration follows the pattern shown in Listing 10.15, that in Listing 10.16, or a combination of the two, the resultant CIL code will be identical. The same namespace may occur multiple times, in multiple files, and even across assemblies. For example, with the convention of one-to-one correlation between files and classes, you can define each class in its own file and surround it with the same namespace declaration.
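
The equivalence of the two styles can be checked with a short sketch. The NestedStyle, DottedStyle, and NamespaceDemo types are hypothetical names introduced for illustration; the file declares the same namespace twice, once in each style.

```csharp
using System;

namespace AddisonWesley
{
    namespace Michaelis
    {
        namespace EssentialCSharp
        {
            // Declared using nested namespace blocks.
            public class NestedStyle { }
        }
    }
}

// The same namespace, reopened using the dotted form.
namespace AddisonWesley.Michaelis.EssentialCSharp
{
    public class DottedStyle { }

    public static class NamespaceDemo
    {
        // Returns the fully qualified type name recorded in metadata.
        public static string FullNameOf(Type type) =>
            type.FullName ?? type.Name;
    }
}
```

Both types report the same Namespace value, and each FullName is simply the namespace and the type name joined by a period, which is consistent with the type's name being the fully qualified name.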

Given that namespaces are key for organizing types, it is frequently helpful to use the namespace for organizing all the class files. For this reason, it is a good idea to create a folder for each namespace, placing a class such as AddisonWesley.Fezzik.Services.RegistrationService into a folder hierarchy corresponding to the name.

When using Visual Studio projects, if the project name is AddisonWesley.Fezzik, you should create one subfolder called Services into which RegistrationService.cs is placed. You would then create another subfolder (Data, for example) into which you place classes relating to entities within the program (RealestateProperty, Buyer, and Seller, for example).

XML Comments

Chapter 1 introduced comments. However, you can use XML comments for more than just notes to other developers reviewing the source code. XML-based comments follow a practice popularized with Java. Although the C# compiler ignores all comments as far as the resultant executable goes, the developer can use command-line options to instruct the compiler 4 to extract the XML comments into a separate XML file. By taking advantage of the XML file generation, the developer can generate documentation of the API from the XML comments. In addition, C# editors can parse the XML comments in the code and display them to developers as distinct regions (e.g., as a different color from the rest of the code) or parse the XML comment data elements and display them to the developer.

4. The C# standard does not specify whether the C# compiler or a separate utility should take care of extracting the XML data. However, all mainstream C# compilers include the necessary functionality via a compile switch instead of within an additional utility.

Figure 10.4 demonstrates how an IDE can take advantage of XML comments to assist developers with a tip about the code they are trying to write. Such coding tips offer significant assistance in large programs, especially when multiple developers share code. For this to work, however, the developer must take the time to enter the XML comments within the code and then direct the compiler to create the XML file. The next section explains how to accomplish this.

Figure 10.4: XML comments as tips in Visual Studio IDE

Starting with Visual Studio 2019, you can also embed simple HTML into a comment, and it will be reflected in the tips. For example, surrounding console with <strong> and </strong> will cause the word “console” to display in bold in Figure 10.4.

Associating XML Comments with Programming Constructs

Consider the listing of the DataStorage class, as shown in Listing 10.17.

Listing 10.17: Commenting Code with XML Comments

[Listing 10.17 appeared as an image in the source and is not reproduced here.]

Listing 10.17 uses both XML-delimited comments that span multiple lines and single-line XML comments in which each line requires a separate three-forward-slash delimiter (///).

Given that XML comments are designed to document the API, they are intended for use only in association with C# declarations, such as the class or method shown in Listing 10.17. Any attempt to place an XML comment inline with the code, unassociated with a declaration, will result in a warning by the compiler. The compiler makes the association simply because the XML comment appears immediately before the declaration.

Although C# allows any XML tag to appear in comments, the C# standard explicitly defines a set of tags to be used. <seealso cref="System.IO.StreamWriter"/> is an example of using the seealso tag. This tag creates a link between the text and the System.IO.StreamWriter class.
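
As a hedged illustration of the two comment styles, the following sketch shows one way a DataStorage class could be annotated. The comment text is taken from the generated XML in Listing 10.18; the Employee class, the method bodies, and the file-naming scheme are assumptions made only so that the example compiles, and they should not be read as the book's implementation.

```csharp
using System.IO;

// Assumed minimal Employee type; not part of the original listing.
public class Employee
{
    public Employee(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    public string FirstName { get; }
    public string LastName { get; }
}

/// <summary>
/// DataStorage is used to persist and retrieve
/// employee data from the files.
/// </summary>
public static class DataStorage
{
    /// <summary>
    /// Save an employee object to a file
    /// named with the Employee name.
    /// </summary>
    /// <remarks>
    /// This method uses
    /// <seealso cref="System.IO.FileStream"/>
    /// in addition to
    /// <seealso cref="System.IO.StreamWriter"/>
    /// </remarks>
    /// <param name="employee">
    /// The employee to persist to a file</param>
    public static void Store(Employee employee)
    {
        using FileStream stream = new FileStream(
            GetFileName(employee.FirstName, employee.LastName),
            FileMode.Create);
        using StreamWriter writer = new StreamWriter(stream);
        writer.WriteLine(employee.FirstName);
        writer.WriteLine(employee.LastName);
    }

    /** <summary>
     * Loads up an employee object
     * </summary>
     * <param name="firstName">
     * The first name of the employee</param>
     * <param name="lastName">
     * The last name of the employee</param>
     * <returns>
     * The employee object corresponding to the names
     * </returns>
     */
    public static Employee Load(string firstName, string lastName)
    {
        using FileStream stream = new FileStream(
            GetFileName(firstName, lastName), FileMode.Open);
        using StreamReader reader = new StreamReader(stream);
        string first = reader.ReadLine() ?? firstName;
        string last = reader.ReadLine() ?? lastName;
        return new Employee(first, last);
    }

    // Assumed naming scheme: one file per employee in the temp directory.
    private static string GetFileName(string firstName, string lastName) =>
        Path.Combine(Path.GetTempPath(), $"{firstName}{lastName}.dat");
}
```

Store() is documented with single-line comments, each prefixed by ///, while Load() uses a delimited /** ... */ comment; both forms sit immediately before the declaration they describe.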

Generating an XML Documentation File

The compiler checks that the XML comments are well formed and issues a warning if they are not. To generate the XML file, add a DocumentationFile element to a PropertyGroup element in the project file:

<DocumentationFile>$(OutputPath)\$(TargetFramework)\$(AssemblyName).xml</DocumentationFile>

This element causes an XML file to be generated into the output directory during the build, using <assemblyname>.xml as the filename. Using the DataStorage class listed earlier and the DocumentationFile element shown here, the resultant XML file appears as shown in Listing 10.18.

Listing 10.18: Comments.xml

<?xml version="1.0"?>
<doc>
    <assembly>
        <name>DataStorage</name>
    </assembly>
    <members>
        <member name="T:DataStorage">
            <summary>
            DataStorage is used to persist and retrieve
            employee data from the files.
            </summary>
        </member>
        <member name="M:DataStorage.Store(Employee)">
            <summary>
            Save an employee object to a file
            named with the Employee name.
            </summary>
            <remarks>
            This method uses
            <seealso cref="T:System.IO.FileStream"/>
            in addition to
            <seealso cref="T:System.IO.StreamWriter"/>
            </remarks>
            <param name="employee">
            The employee to persist to a file</param>
            <date>January 1, 2000</date>
        </member>
        <member name="M:DataStorage.Load(
                 System.String,System.String)">
            <summary>
            Loads up an employee object
            </summary>
            <remarks>
            This method uses
            <seealso cref="T:System.IO.FileStream"/>
            in addition to
            <seealso cref="T:System.IO.StreamReader"/>
            </remarks>
            <param name="firstName">
            The first name of the employee</param>
            <param name="lastName">
            The last name of the employee</param>
            <returns>
            The employee object corresponding to the names
            </returns>
            <date>January 1, 2000</date>
        </member>
    </members>
</doc>

The resultant file includes only the amount of metadata that is necessary to associate an element back to its corresponding C# declaration. This is important because, in general, it is necessary to use the XML output in combination with the generated assembly to produce any meaningful documentation. Fortunately, tools such as the free GhostDoc5 and the open source project NDoc6 can generate documentation.

5. See http://submain.com/ to learn more about GhostDoc.

6. See http://ndoc.sourceforge.net to learn more about NDoc.

Garbage Collection

Garbage collection is obviously a core function of the runtime. Its purpose is to restore memory consumed by objects that are no longer referenced. The emphasis in this statement is on memory and references: The garbage collector is responsible only for restoring memory; it does not handle other resources such as database connections, handles (files, windows, etc.), network ports, and hardware devices such as serial ports. Also, the garbage collector determines what to clean up, based on whether any references remain. Implicitly, this means that the garbage collector works with reference objects and restores memory on the heap only. Additionally, it means that maintaining a reference to an object will delay the garbage collector from reusing the memory consumed by the object.

Weak References

All references discussed so far are strong references because they maintain an object’s accessibility and prevent the garbage collector from cleaning up the memory consumed by the object. The framework also supports the concept of weak references. Weak references do not prevent garbage collection on an object, but they do maintain a reference so that if the garbage collector does not clean up the object, it can be reused.

Weak references are designed for reference objects that are expensive to create, yet too expensive to keep around. Consider, for example, a large list of objects loaded from a database and displayed to the user. The loading of this list is potentially expensive, and once the user closes the list, it should be available for garbage collection. However, if the user requests the list multiple times, a second expensive load call will always be required. With weak references, it becomes possible to use code to check whether the list has been cleaned up, and if not, to re-reference the same list. In this way, weak references serve as a memory cache for objects. Objects within the cache are retrieved quickly, but if the garbage collector has recovered the memory of these objects, they will need to be re-created.

Once a reference object (or collection of objects) is recognized as worthy of potential weak reference consideration, it needs to be assigned to a System.WeakReference<T> instance (see Listing 10.19).

Listing 10.19: Using a Weak Reference

using System;

public static class ByteArrayDataSource
{
  static private byte[] LoadData()
  {
      // Imagine a much larger number
      byte[] data = new byte[1000];
      // Load data
      // ...
      return data;
  }

  static private WeakReference<byte[]>? Data { get; set; }
  static public byte[] GetData()
  {
      byte[]? target;
      if (Data is null)
      {
          target = LoadData();
          Data = new WeakReference<byte[]>(target);
          return target;
      }
      else if (Data.TryGetTarget(out target))
      {
          return target;
      }
      else
      {
          // Reload the data and assign it (creating a strong
          // reference) before setting WeakReference's Target
          // and returning it.
          target = LoadData();
          Data.SetTarget(target);
          return target;
      }
  }
}

// ...

Admittedly, this code uses generics, which aren’t discussed in this book until Chapter 12. However, you can safely ignore the <byte[]> text both when declaring the Data property and when assigning it. While there is a nongeneric version of WeakReference, there is little reason to consider it.7

7. Unless programming with .NET Framework 4.5 or earlier.

The bulk of the logic appears in the GetData() method. The purpose of this method is to always return an instance of the data—whether from the cache or by reloading it. GetData() begins by checking whether the Data property is null. If it is, the data is loaded and assigned to a local variable called target. This creates a reference to the data so that the garbage collector will not clear it. Next, we instantiate a WeakReference and pass a reference to the loaded data so that the WeakReference object has a handle to the data (its target); then, if requested, such an instance can be returned. Do not pass an instance that does not have a local reference to WeakReference, because it might get cleaned up before you have a chance to return it (i.e., do not call new WeakReference<byte[]>(LoadData())).

If the Data property already has an instance of WeakReference, then the code calls TryGetTarget() and, if there is an instance, assigns target, thus creating a reference so that the garbage collector will no longer clean up the data.

Lastly, if WeakReference’s TryGetTarget() method returns false, we load the data, assign the reference with a call to SetTarget(), and return the newly instantiated object.
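
The caller's side of this pattern can be sketched as follows. ByteArrayCache is a hypothetical, condensed variant of the GetData() logic described above, with a LoadCount counter added purely as an assumption so that reloads can be observed; WeakReferenceDemo is also hypothetical.

```csharp
using System;

public static class ByteArrayCache
{
    private static WeakReference<byte[]>? _data;

    // Assumption for illustration: counts how often data was (re)loaded.
    public static int LoadCount { get; private set; }

    private static byte[] LoadData()
    {
        LoadCount++;
        return new byte[1000];
    }

    public static byte[] GetData()
    {
        byte[]? target;
        if (_data != null && _data.TryGetTarget(out target))
        {
            // Cache hit: target is now a strong reference.
            return target;
        }

        // Hold a strong reference before handing the data to the
        // weak reference, so it cannot be collected in between.
        target = LoadData();
        if (_data is null)
        {
            _data = new WeakReference<byte[]>(target);
        }
        else
        {
            _data.SetTarget(target);
        }
        return target;
    }
}

public static class WeakReferenceDemo
{
    public static bool ReusesLiveInstance()
    {
        byte[] first = ByteArrayCache.GetData();
        byte[] second = ByteArrayCache.GetData();
        bool same = ReferenceEquals(first, second);
        // Keeps the first array strongly referenced up to this point.
        GC.KeepAlive(first);
        return same;
    }
}
```

While the caller still holds a strong reference to the first array, the second call returns the same instance without reloading. Once no strong references remain, whether a later call reuses or reloads the data depends on whether the garbage collector has run in the meantime.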

Resource Cleanup

Garbage collection is a key responsibility of the runtime. Nevertheless, it is important to recognize that the garbage collection process centers on the code’s memory utilization. It is not about the cleaning up of file handles, database connections, ports, or other limited resources.

Finalizers

Finalizers allow developers to write code that will clean up a class’s resources. Unlike constructors that are called explicitly using the new operator, finalizers cannot be called explicitly from within the code. There is no new equivalent such as a delete operator. Rather, the garbage collector is responsible for calling a finalizer on an object instance. Therefore, developers cannot determine at compile time exactly when the finalizer will execute. All they know is that the finalizer will run sometime between when an object was last used and generally when the application shuts down normally. The deliberate injection of incertitude with the word “Generally” highlights the fact that finalizers might not execute. This possibility is obvious when you consider that a process might terminate abnormally. For instance, events such as the computer being turned off or a forced termination of the process, such as when debugging the process, will prevent the finalizer from running. However, with .NET Core, even under normal circumstances, finalizers may not get processed before the application shuts down. As we shall see in the next section, it thus may be necessary to take additional action to register finalization activities with other mechanisms.

Note

You cannot determine at compile time exactly when the finalizer will execute.

The finalizer declaration is identical to the destructor syntax of C#’s predecessor—that is, C++. As shown in Listing 10.20, the finalizer declaration is prefixed with a tilde before the name of the class.

Listing 10.20: Defining a Finalizer

using System;
using System.IO;

public class TemporaryFileStream
{
  public TemporaryFileStream(string fileName)
  {
      File = new FileInfo(fileName);
      // For a preferable solution use FileOptions.DeleteOnClose.
      Stream = new FileStream(
          File.FullName, FileMode.OpenOrCreate,
          FileAccess.ReadWrite);
  }

  public TemporaryFileStream()
      : this(Path.GetTempFileName()) { }

  // Finalizer                                                      
  ~TemporaryFileStream()                                            
  {                                                                 
      try                                                           
      {                                                             
          Close();                                                  
      }                                                             
      catch(Exception exception)                                    
      {                                                             
          // Write event to logs or UI                              
          // ...                                                    
      }                                                             
  }                                                                 

  public FileStream? Stream { get; private set; }
  public FileInfo? File { get; private set; }

  public void Close()
  {
      Stream?.Dispose();
      try
      {
          File?.Delete();
      }
      catch(IOException exception)
      {
          Console.WriteLine(exception);
      }
      Stream = null;
      File = null;
  }
}

Finalizers do not allow any parameters to be passed, so they cannot be overloaded. Furthermore, finalizers cannot be called explicitly—that is, only the garbage collector can invoke a finalizer. Access modifiers on finalizers are therefore meaningless, and as such, they are not supported. Finalizers in base classes will be invoked automatically as part of an object finalization call.

Note

Finalizers cannot be called explicitly; only the garbage collector can invoke a finalizer.

Because the garbage collector handles all memory management, finalizers are not responsible for de-allocating memory. Rather, they are responsible for freeing up resources such as database connections and file handles—resources that require an explicit activity that the garbage collector doesn’t know about.

In the finalizer shown in Listing 10.20, we start by disposing of the FileStream. This step is optional because the FileStream has its own finalizer that provides the same functionality as Dispose(). The purpose of invoking Dispose() now is to ensure that it is cleaned up when TemporaryFileStream is finalized, since the latter is responsible for instantiating the former. Without the explicit invocation of Stream?.Dispose(), the garbage collector will clean it up independently from the TemporaryFileStream once the TemporaryFileStream object is garbage collected and releases its reference on the FileStream object. That said, if we didn’t need a finalizer for resource cleanup anyway, it would not make sense to define a finalizer just for invoking FileStream.Dispose(). In fact, limiting the need for a finalizer to only objects that need resource cleanup that the runtime isn’t already aware of (resources that don’t have finalizers) is an important guideline that significantly reduces the number of scenarios where it is necessary to implement a finalizer.

In Listing 10.20, the purpose of the finalizer is to delete the file (see footnote 8), an unmanaged resource in this case. Hence we have the call to File?.Delete(). Now, when the finalizers are executed, the file will get cleaned up.

8. Listing 10.20 is a somewhat contrived example because there is a FileOptions.DeleteOnClose option when instantiating the FileStream, which triggers the file’s deletion when the FileStream closes.

Finalizers execute on an unspecified thread, making their execution even less deterministic. This indeterminate nature makes an unhandled exception within a finalizer (outside of the debugger) likely to crash the application, and the source of this problem is difficult to diagnose because the circumstances that led to the exception are not clear. From the user’s perspective, the unhandled exception will be thrown relatively randomly and with little regard for any action the user was performing. For this reason, you should take care to avoid exceptions within finalizers. Instead, you should use defensive programming techniques such as checking for null (refer to the use of the null-conditional operator in Listing 10.20). In fact, it is advisable to catch all exceptions in the finalizer and report them via an alternative means (such as logging or via the user interface) rather than keeping them as unhandled exceptions. This guideline leads to the try/catch blocks in Listing 10.20: one in the finalizer around the call to Close(), and one inside Close() around the Delete() invocation.

One potential option for forcing pending finalizers to execute is to invoke System.GC.WaitForPendingFinalizers(). When this method is invoked, the current thread will be suspended until all finalizers for objects that are no longer referenced have executed.
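The following is a minimal sketch of that call, not a recommended production pattern. It pairs GC.WaitForPendingFinalizers() with a preceding GC.Collect(), since only objects the collector has already found to be unreachable are waiting for finalization. The Tracked class, its FinalizedCount counter, and FinalizerDemo are hypothetical names introduced purely for illustration, and the runtime still makes no guarantee that any particular object is collected at a given point.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical type whose only job is to make finalizer execution observable.
public class Tracked
{
    public static int FinalizedCount;

    ~Tracked()
    {
        // Finalizers run on a separate thread, so update the counter atomically.
        Interlocked.Increment(ref FinalizedCount);
    }
}

public static class FinalizerDemo
{
    // Allocate in a separate, non-inlined method so that no local variable
    // in the caller keeps the instance reachable.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateGarbage()
    {
        _ = new Tracked();
    }

    public static int Run()
    {
        CreateGarbage();

        // Ask the collector to find unreachable objects and queue their finalizers.
        GC.Collect();

        // Suspend the current thread until the queued finalizers have run.
        GC.WaitForPendingFinalizers();

        return Volatile.Read(ref FinalizedCount);
    }
}
```

Calling FinalizerDemo.Run() will typically report that the finalizer ran, but code should not depend on it; forcing collections like this is mainly useful in tests and diagnostics.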

Deterministic Finalization with the using Statement

The problem with finalizers on their own is that they don’t support deterministic finalization (the ability to know when a finalizer will run). Rather, finalizers serve the important role of being a backup mechanism for cleaning up resources if a developer using a class neglects to call the requisite cleanup code explicitly.

For example, consider the TemporaryFileStream, which includes not only a finalizer but also a Close() method. This class uses a file resource that could potentially consume a significant amount of disk space. The developer using TemporaryFileStream can explicitly call Close() to restore the disk space.

Providing a method for deterministic finalization is important because it eliminates a dependency on the indeterminate timing behavior of the finalizer. Even if the developer fails to call Close() explicitly, the finalizer will take care of the call. In such a case, the cleanup will simply run later than it would have if Close() had been called explicitly.

Because of the importance of deterministic finalization, the base class library includes a specific interface for the pattern and C# integrates the pattern into the language. The IDisposable interface defines the details of the pattern with a single method called Dispose(), which developers call on a resource class to “dispose” of the consumed resources. Listing 10.21 demonstrates the IDisposable interface and some code for calling it.

Listing 10.21: Resource Cleanup with IDisposable

using System;
using System.IO;

static class Program
{
  // ...
  static void Search()
  {
      TemporaryFileStream fileStream =
          new TemporaryFileStream();

      // Use temporary file stream
      // ...

      fileStream.Dispose();                                               

      // ...
  }
}
class TemporaryFileStream : IDisposable
{
  public TemporaryFileStream(string fileName)
  {
      File = new FileInfo(fileName);
      Stream = new FileStream(
          File.FullName, FileMode.OpenOrCreate,
          FileAccess.ReadWrite);
  }
  public TemporaryFileStream()
      : this(Path.GetTempFileName()) { }

  ~TemporaryFileStream()
  {
      Dispose(false);                                                      
  }

  public FileStream? Stream { get; private set; }
  public FileInfo? File { get; private set; }

  #region IDisposable Members                                               
  public void Dispose()                                                     
  {                                                                         
      Dispose(true);                                                        
                                                                            
      // Unregister from the finalization queue.                            
      System.GC.SuppressFinalize(this);                                     
  }                                                                         
  #endregion                                                                
  public void Dispose(bool disposing)                                       
  {                                                                         
      // Do not dispose of an owned managed object (one with a
      // finalizer) when called from the finalizer, because the
      // owned object's own finalizer will be (or already has
      // been) called during finalization queue processing.
      if (disposing)                                                        
      {                                                                     
          Stream?.Close();                                                  
      }                                                                     
      try                                                  
      {                                                                     
          File?.Delete();                                                   
      }                                                                     
      catch(IOException exception)
      {                                                                     
          Console.WriteLine(exception);                                     
      }                                                                     
      Stream = null;
      File = null;
  }                                                                         
}
Begin 8.0

From Program.Search(), there is an explicit call to Dispose() after using the TemporaryFileStream. Dispose() is the method responsible for cleaning up the resources (in this case, a file) that are not related to memory and therefore are not subject to implicit cleanup by the garbage collector. Nevertheless, the execution here contains a hole that would prevent execution of Dispose(): an exception might occur between the time when TemporaryFileStream is instantiated and the time when Dispose() is called. If this happens, Dispose() will not be invoked and the resource cleanup will have to rely on the finalizer. To avoid this problem, callers need to implement a try/finally block. Instead of requiring programmers to code such a block explicitly, C# provides a using statement expressly for the purpose (see Listing 10.22).


Listing 10.22: Invoking the using Statement

static class Program
{
  // ...
  static void Search()
  {
      using (TemporaryFileStream fileStream2 =                       
          new TemporaryFileStream(),                                 
          fileStream3 = new TemporaryFileStream())                   
      {
          // Use temporary file stream
      }

      // C# 8.0 or later                                             
      using TemporaryFileStream fileStream1 =                        
          new TemporaryFileStream();                                 
  }
}

In the first highlighted code snippet, the resultant CIL code is identical to the code that would be created if the programmer specified an explicit try/finally block for each declared variable, where Dispose() is called on the variable in the finally block. The using statement simply provides a syntax shortcut for the try/finally block.
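To make the equivalence concrete, here is a small sketch that runs the same failing operation once inside a using statement and once inside a hand-written try/finally. LoggedResource is a hypothetical stand-in for TemporaryFileStream that records its lifetime in a list rather than touching the file system, and UsingExpansionDemo is likewise an illustrative name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical disposable that logs when it is opened and disposed.
public sealed class LoggedResource : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;

    public LoggedResource(string name, List<string> log)
    {
        _name = name;
        _log = log;
        _log.Add($"open {_name}");
    }

    public void Dispose() => _log.Add($"dispose {_name}");
}

public static class UsingExpansionDemo
{
    public static List<string> WithUsing()
    {
        var log = new List<string>();
        try
        {
            using (LoggedResource resource = new LoggedResource("a", log))
            {
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            log.Add("caught");
        }
        return log;
    }

    public static List<string> WithTryFinally()
    {
        var log = new List<string>();
        try
        {
            LoggedResource resource = new LoggedResource("a", log);
            try
            {
                throw new InvalidOperationException("simulated failure");
            }
            finally
            {
                // Runs whether or not the body throws.
                resource?.Dispose();
            }
        }
        catch (InvalidOperationException)
        {
            log.Add("caught");
        }
        return log;
    }
}
```

Both methods produce the same log: the resource is opened, disposed, and only then is the exception observed by the outer catch block.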

Within this using statement, you can instantiate more than one variable by separating each variable from the others with a comma. The key considerations are that all variables must be of the same type, the type must implement IDisposable, and initialization occurs at the time of declaration. To enforce the use of the same type, the data type is specified only once rather than before each variable declaration.

C# 8.0 introduces a potential simplification with regard to resource cleanup. As shown in the second highlighted snippet of Listing 10.22, you can prefix the declaration of a disposable resource (one that implements IDisposable) with the using keyword. As is the case with the using statement, this will generate the try/finally behavior, with the finally block placed just before the variable goes out of scope (in this case, before the closing curly brace of the Search() method). One additional constraint on the using declaration is that the variable is read-only, so it can’t be assigned a different value.
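As a brief illustration of where that implicit finally block lands, the sketch below declares two resources with using declarations and records the order of events. ScopedResource and UsingDeclarationDemo are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical disposable that logs its own disposal.
public sealed class ScopedResource : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;

    public ScopedResource(string name, List<string> log)
    {
        _name = name;
        _log = log;
    }

    public void Dispose() => _log.Add($"dispose {_name}");
}

public static class UsingDeclarationDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        Work(log);
        log.Add("after Work");
        return log;
    }

    private static void Work(List<string> log)
    {
        using ScopedResource first = new ScopedResource("first", log);
        using ScopedResource second = new ScopedResource("second", log);

        log.Add("working");

        // first = new ScopedResource("other", log);
        // The line above would not compile: a using variable is read-only.
    }   // Disposal happens here, in reverse order of declaration.
}
```

Run() returns the entries "working", "dispose second", "dispose first", and "after Work", showing that both resources are released at the closing brace of Work() rather than at the point of last use.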

End 8.0

Garbage Collection, Finalization, and IDisposable

There are several additional noteworthy items to point out in Listing 10.21. First, the IDisposable.Dispose() method contains an important call to System.GC.SuppressFinalize(). Its purpose is to remove the TemporaryFileStream class instance from the finalization (f-reachable) queue. This is possible because all cleanup was done in the Dispose() method rather than waiting for the finalizer to execute.

Without the call to SuppressFinalize(), the instance of the object will be included in the f-reachable queue—a list of all the objects that are mostly ready for garbage collection, except they also have finalization implementations. The runtime cannot garbage-collect objects with finalizers until after their finalization methods have been called. However, garbage collection itself does not call the finalization method. Rather, references to finalization objects are added to the f-reachable queue and are processed by an additional thread at a time deemed appropriate based on the execution context. In an ironic twist, this approach delays garbage collection for the managed resources—when it is most likely that these very resources should be cleaned up earlier. The reason for the delay is that the f-reachable queue is a list of “references”; as such, the objects are not considered garbage until after their finalization methods are called and the object references are removed from the f-reachable queue.

Note

Objects with finalizers that are not explicitly disposed will end up with an extended object lifetime. Even after all explicit references have gone out of scope, the f-reachable queue will have references, keeping the object alive until the f-reachable queue processing is complete.

For this reason, Dispose() invokes System.GC.SuppressFinalize. Invoking this method informs the runtime that it should not add this object to the finalization queue, but instead should allow the garbage collector to de-allocate the object when it no longer has any references (including any f-reachable references).

Second, Dispose() calls Dispose(bool disposing) with an argument of true. The result is that the Dispose() method on Stream is invoked (cleaning up its resources and suppressing its finalization). Next, the temporary file itself is deleted immediately upon calling Dispose(). This important call eliminates the need to wait for the finalization queue to be processed before cleaning up potentially expensive resources.

Third, rather than calling Close(), the finalizer now calls Dispose(bool disposing) with an argument of false. The result is that Stream is not closed (disposed) even though the file is deleted. The condition around closing Stream ensures that if Dispose(bool disposing) is called from the finalizer, the Stream instance itself will also be queued up for finalization processing (or possibly it would have already run depending on the order). Therefore, when executing the finalizer, objects owned by the managed resource should not be cleaned up, as this action will be the responsibility of the finalization queue.

Fourth, you should use caution when creating both a Close()-type method and a Dispose() method. It is not clear by looking at only the API that Close() calls Dispose(), so developers will be left wondering whether they need to explicitly call both Close() and Dispose().

Fifth, to increase the probability that the functionality defined in the finalizer will execute before a process shuts down even in .NET Core, you should register the code with the AppDomain.CurrentDomain.ProcessExit event handler. Any finalization code registered with this event handler will be invoked barring an abnormal process termination (discussed in the next section).
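One way the fifth point might be sketched is shown below. This is an illustration under stated assumptions, not the chapter's own listing: ExitAwareTempFile is a hypothetical class, and the use of a weak reference in the handler is a design choice of this sketch so that the event subscription does not keep the instance alive and thereby prevent it from ever being collected.

```csharp
#nullable enable
using System;
using System.IO;

// Hypothetical resource class that cleans up a temporary file on Dispose(),
// on finalization, or when the process exits normally.
public class ExitAwareTempFile : IDisposable
{
    private readonly EventHandler _processExitHandler;

    public ExitAwareTempFile()
    {
        File = new FileInfo(Path.GetTempFileName());

        // The handler captures only a weak reference, so subscribing to the
        // event does not root this instance.
        var weakSelf = new WeakReference<ExitAwareTempFile>(this);
        _processExitHandler = (_, __) =>
        {
            if (weakSelf.TryGetTarget(out ExitAwareTempFile? self))
            {
                self.Dispose();
            }
        };
        AppDomain.CurrentDomain.ProcessExit += _processExitHandler;
    }

    public FileInfo? File { get; private set; }

    ~ExitAwareTempFile() => Dispose(false);

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    // This sketch owns no other disposable objects, so the disposing flag
    // does not change the behavior here.
    protected virtual void Dispose(bool disposing)
    {
        // Unregister so that the handler is not invoked after cleanup.
        AppDomain.CurrentDomain.ProcessExit -= _processExitHandler;

        try
        {
            File?.Delete();
        }
        catch (IOException exception)
        {
            Console.Error.WriteLine(exception);
        }
        File = null;
    }
}
```

With this arrangement, the file is deleted by whichever mechanism runs first: an explicit Dispose() (or using), the finalizer, or the ProcessExit handler. Calling Dispose() more than once is harmless because File is set to null after the first call.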

Begin 7.0

Lazy Initialization

Begin 4.0

In the preceding section, we discussed how to deterministically dispose of an object with a using statement and how the finalization queue will dispose of resources in the event that no deterministic approach is used.

A related pattern is called lazy initialization or lazy loading. Using lazy initialization, you can create (or obtain) objects when you need them rather than beforehand—the latter can be an especially problematic situation when those objects are never used. Consider the FileStream property of Listing 10.24.

Listing 10.24: Lazy Loading a Property

using System.IO;

class DataCache
{
  // ...

  public TemporaryFileStream FileStream =>
      InternalFileStream ?? (InternalFileStream =
          new TemporaryFileStream());

  private TemporaryFileStream? InternalFileStream
      { get; set; } = null;
  // ...
}

In the FileStream expression-bodied property, we check whether InternalFileStream is null before returning its value directly. If InternalFileStream is null, we first instantiate the TemporaryFileStream object and assign it to InternalFileStream before returning the new instance. Thus, the TemporaryFileStream required in the FileStream property is created only when the getter on the property is called. If the getter is never invoked, the TemporaryFileStream object is never instantiated, and we save whatever execution time such an instantiation would cost. Obviously, if the cost of instantiation is negligible or the instantiation is inevitable (and postponing the inevitable is less desirable), simply assigning it during declaration or in the constructor makes sense.
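The same idea can be written a little more compactly with the null-coalescing assignment operator (??=), available in C# 8.0 and later. The sketch below uses hypothetical ExpensiveResource and LazyHolder types, with a counter that exists only to show when construction actually happens. Like the listing above, it is not thread-safe: two threads reading the property at the same time could each create an instance.

```csharp
#nullable enable

// Hypothetical expensive type; the counter only makes construction observable.
public class ExpensiveResource
{
    public static int InstancesCreated;

    public ExpensiveResource() => InstancesCreated++;
}

public class LazyHolder
{
    private ExpensiveResource? _resource;

    // The instance is created on first access; later reads return the
    // cached instance.
    public ExpensiveResource Resource =>
        _resource ??= new ExpensiveResource();
}
```

Constructing a LazyHolder leaves ExpensiveResource.InstancesCreated unchanged; the first read of Resource increments it, and subsequent reads return the same object.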

Summary

This chapter provided a whirlwind tour of many topics related to building solid class libraries. All the topics pertain to internal development as well, but they are much more critical to building robust classes. Ultimately, the focus here was on forming more robust and programmable APIs. In the category of robustness, we can include namespaces and garbage collection. Both of these topics fit in the programmability category as well, along with overriding object’s virtual members, operator overloading, and XML comments for documentation.

Exception handling heavily depends on inheritance, by defining an exception hierarchy and enforcing custom exceptions to fit within this hierarchy. Furthermore, the C# compiler uses inheritance to verify catch block order. In Chapter 11, you will see why inheritance is such a core part of exception handling.
