The previous chapters covered most of the constructs for defining classes and structs. However, several details remain to round out the type definition with fit-and-finish-type functionality. This chapter explains how to put the final touches on a type declaration.
object Members
Chapter 6 discussed how all classes and structs derive from object
. In addition, it reviewed each method available on object
and discussed how some of them are virtual. This section discusses the details concerning overriding the virtual methods.
ToString()
By default, calling ToString()
on any object will return the fully qualified name of the class. Calling ToString()
on a System.IO.FileStream
object will return the string System.IO.FileStream
, for example. For some classes, however, ToString()
can be more meaningful. On string
, for example, ToString()
returns the string value itself. Similarly, returning a Contact
’s name would make more sense. Listing 10.1 overrides ToString()
to return a string representation of Coordinate
.
public struct Coordinate
{
    public Coordinate(Longitude longitude, Latitude latitude)
    {
        Longitude = longitude;
        Latitude = latitude;
    }

    public Longitude Longitude { get; }
    public Latitude Latitude { get; }

    public override string ToString() =>
        $"{ Longitude } { Latitude }";

    // ...
}
Write methods such as Console.WriteLine()
and System.Diagnostics.Trace.Write()
call an object’s ToString()
method (see note 1), so overriding the method often outputs more meaningful information than the default implementation. Consider overriding the ToString()
method whenever relevant diagnostic information can be provided from the output—specifically, when the target audience is developers, since the default object.ToString()
output is a type name and is not end-user friendly. Regardless, avoid returning an empty string or null
, as the lack of output will be very confusing. ToString()
is useful for debugging from within a developer IDE or writing to a log file. For this reason, you should keep the strings relatively short (one screen width) so that they are not cut off. However, the lack of localization and other advanced formatting features makes this approach less suitable for general end-user text display.
1. Unless there is an implicit cast operator, as described in Advanced Topic: Cast Operator.
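As a small illustration of the point about write methods, the following sketch uses a hypothetical Contact class (its shape is an assumption, not one of the chapter's listings) to show Console.WriteLine() picking up an overridden ToString(), next to the default type-name output of a plain object.

```csharp
using System;

// Hypothetical type for illustration; not one of the chapter's listings.
public sealed class Contact
{
    public Contact(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    public string FirstName { get; }
    public string LastName { get; }

    // Return something a developer can recognize in a log or debugger.
    public override string ToString() => $"{FirstName} {LastName}";
}

public static class Program
{
    public static void Main()
    {
        Contact contact = new Contact("Inigo", "Montoya");

        // Console.WriteLine(object) calls ToString() on its argument.
        Console.WriteLine(contact);       // Inigo Montoya

        // Without an override, the fully qualified type name is returned.
        Console.WriteLine(new object());  // System.Object
    }
}
```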
GetHashCode()
Overriding GetHashCode()
is more complex than overriding ToString()
. Even so, you should override GetHashCode()
when you are overriding Equals()
, and there is a compiler warning to indicate this step is recommended if you don’t. Overriding GetHashCode()
is also a good practice when you use the type as a key in a hash table collection (e.g., System.Collections.Hashtable
and System.Collections.Generic.Dictionary<TKey, TValue>
).
The purpose of the hash code is to efficiently balance a hash table by generating a number that corresponds to the value of an object. Here are some implementation principles for a good GetHashCode()
implementation:
Required: Equal objects must have equal hash codes (if a.Equals(b)
, then a.GetHashCode() == b.GetHashCode()
).
Required: GetHashCode()
’s return value should be constant over the life of a particular object, even if the object’s data changes. In many cases, you should cache the return value to enforce this constraint. However, when caching the value, be sure not to use the hash code when checking equality; if you do, two objects with identical data, one of which cached its hash code before its identity properties changed, will incorrectly compare as unequal.
Required: GetHashCode()
should not throw any exceptions; GetHashCode()
must always successfully return a value.
Performance: Hash codes should be unique whenever possible. However, since hash codes return only an int
, there inevitably will be an overlap in hash codes for objects that have potentially more values than an int
can hold, which is virtually all types. (An obvious example is long
, since there are more possible long
values than an int
could uniquely identify.)
Performance: The possible hash code values should be distributed evenly over the range of an int
. For example, a hash algorithm that ignores the fact that strings in Latin-based languages consist primarily of the first 128 ASCII characters would produce a very uneven distribution of string hash values and would not be a strong GetHashCode()
algorithm.
Performance: GetHashCode()
should be optimized for performance. GetHashCode()
is generally used in Equals()
implementations to short-circuit a full equals comparison if the hash codes are different. As a result, it is frequently called when the type is used as a key type in dictionary collections.
Performance: Small differences between two objects should result in large differences between hash code values—ideally, a 1-bit difference in the object should result in approximately 16 bits of the hash code changing, on average. This helps ensure that the hash table remains balanced no matter how it is “bucketing” the hash values.
Security: It should be difficult for an attacker to craft an object that has a particular hash code. Such an attack seeks to flood a hash table with large amounts of data that all hash to the same value. The hash table implementation can then become inefficient, resulting in a denial-of-service attack.
These guidelines and rules are, of course, contradictory: It is very difficult to come up with a hash algorithm that is fast and meets all of these guidelines. As with any design problem, you’ll need to use a combination of good judgment and realistic performance measurements to come up with a good solution.
Consider the GetHashCode()
implementation for the Coordinate
type shown in Listing 10.2.
public struct Coordinate
{
    public Coordinate(Longitude longitude, Latitude latitude)
    {
        Longitude = longitude;
        Latitude = latitude;
    }

    public Longitude Longitude { get; }
    public Latitude Latitude { get; }

    public override int GetHashCode() =>
        HashCode.Combine(
            Longitude.GetHashCode(), Latitude.GetHashCode());

    // ...
}
There are numerous well-established algorithms for the GetHashCode()
implementation, each with satisfactory results in terms of the guidelines described (see http://bit.ly/39yP8lm). However, the easiest approach is to call System.HashCode
’s Combine()
method, specifying a GetHashCode()
result from each of the identifying fields—the fields that produce your object’s uniqueness. (If the identifying fields are numbers, be wary of mistakenly using the fields themselves rather than their hash code values.) ValueTuple
invokes HashCode.Combine()
; thus, it may be easier to remember that you can simply create a ValueTuple
with the same identifying fields (not their hash codes) and invoke the resulting tuple’s GetHashCode()
member.
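The two approaches just described can be sketched side by side. The GridCell struct below is hypothetical (it is not one of the chapter's listings), and the sketch assumes a target framework that provides System.HashCode, such as .NET Core 2.1 or later. It checks only that equal values produce equal hash codes within one process; it makes no claim that the two approaches return the same number.

```csharp
using System;

// Hypothetical type for illustration; not one of the chapter's listings.
public readonly struct GridCell
{
    public GridCell(int row, int column)
    {
        Row = row;
        Column = column;
    }

    public int Row { get; }
    public int Column { get; }

    // Approach 1: combine the hash codes of the identifying fields.
    public int CombineHash() =>
        HashCode.Combine(Row.GetHashCode(), Column.GetHashCode());

    // Approach 2: build a tuple of the fields themselves
    // (not their hash codes) and use the tuple's hash code.
    public int TupleHash() => (Row, Column).GetHashCode();

    public override int GetHashCode() => CombineHash();
}

public static class Program
{
    public static void Main()
    {
        GridCell first = new GridCell(3, 7);
        GridCell second = new GridCell(3, 7);

        // Equal identifying fields yield equal hash codes.
        Console.WriteLine(first.GetHashCode() == second.GetHashCode()); // True
        Console.WriteLine(first.TupleHash() == second.TupleHash());     // True
    }
}
```

Hash codes from System.HashCode are not guaranteed to be stable across processes, so the sketch compares them only within a single run and never prints or stores the raw values.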
Note that Coordinate
does not cache the value of the hash code. Since each field in the hash code calculation is readonly
, the value can’t change. However, implementations should cache the hash code if the calculated values could change or if a cached value could offer a significant performance advantage. If you do decide to cache the hash code, do not use it when checking equality. Doing so may cause an object with mutable identity properties to fail an equality check because the hash code was calculated before the identity properties changed.
Equals()
Overriding Equals()
without overriding GetHashCode()
results in a warning such as that shown in Output 10.1.
Output 10.1
warning CS0659: '<Class Name>' overrides Object.Equals(object? o) but does not override Object.GetHashCode()
Generally, developers expect overriding Equals()
to be trivial, but it includes a surprising number of subtleties that require careful thought and testing.
Two references are identical if both refer to the same instance. object
includes a static method called ReferenceEquals()
that explicitly checks for this object identity (see Figure 10.1).
However, reference equality is not the only type of equality. Two object instances can also be considered equal if the values of some or all of their members are equal. Consider the comparison of two ProductSerialNumber
s shown in Listing 10.3.
public sealed class ProductSerialNumber
{
    // ...
}

class Program
{
    static void Main()
    {
        ProductSerialNumber serialNumber1 =
            new ProductSerialNumber("PV", 1000, 09187234);
        ProductSerialNumber serialNumber2 = serialNumber1;
        ProductSerialNumber serialNumber3 =
            new ProductSerialNumber("PV", 1000, 09187234);

        // These serial numbers ARE the same object identity
        if(!ProductSerialNumber.ReferenceEquals(serialNumber1,
            serialNumber2))
        {
            throw new Exception(
                "serialNumber1 does NOT " +
                "reference equal serialNumber2");
        }
        // And, therefore, they are equal
        else if(!serialNumber1.Equals(serialNumber2))
        {
            throw new Exception(
                "serialNumber1 does NOT equal serialNumber2");
        }
        else
        {
            Console.WriteLine(
                "serialNumber1 reference equals serialNumber2");
            Console.WriteLine(
                "serialNumber1 equals serialNumber2");
        }

        // These serial numbers are NOT the same object identity
        if (ProductSerialNumber.ReferenceEquals(serialNumber1,
            serialNumber3))
        {
            throw new Exception(
                "serialNumber1 DOES reference " +
                "equal serialNumber3");
        }
        // But they are equal (assuming Equals is overridden
        // and the != operator is overloaded)
        else if(!serialNumber1.Equals(serialNumber3) ||
            serialNumber1 != serialNumber3)
        {
            throw new Exception(
                "serialNumber1 does NOT equal serialNumber3");
        }

        Console.WriteLine( "serialNumber1 equals serialNumber3" );
    }
}
The results of Listing 10.3 appear in Output 10.2.
Output 10.2
serialNumber1 reference equals serialNumber2
serialNumber1 equals serialNumber2
serialNumber1 equals serialNumber3
As the last assertion demonstrates by its use of ReferenceEquals()
, serialNumber1
and serialNumber3
are not the same reference. However, the code constructs them with the same values, and both are logically associated with the same physical product. If one instance was created from data in the database and another was created from manually entered data, you would expect the instances to be equal, so that the product would not be duplicated (reentered) in the database. Two identical references are obviously equal; however, two different objects could be equal but not reference equal. Such objects will not have identical object identities, but they may have key data that identifies them as being equal objects.
Only reference types can be reference equal, thereby supporting the concept of identity. Calling ReferenceEquals()
on value types will always return false
because value types are boxed when they are converted to object
for the call. Even when the same variable is passed in both (value type) parameters to ReferenceEquals()
, the result will still be false
because the values are boxed independently. Listing 10.4 demonstrates this behavior: Because each argument is put into a “different box” in this example, they are never reference equal.
Note
Calling ReferenceEquals()
on value types will always return false
.
public struct Coordinate
{
    public Coordinate(Longitude longitude, Latitude latitude)
    {
        Longitude = longitude;
        Latitude = latitude;
    }

    public Longitude Longitude { get; }
    public Latitude Latitude { get; }
    // ...
}

class Program
{
    public static void Main()
    {
        //...

        Coordinate coordinate1 = new Coordinate(
            new Longitude(48, 52), new Latitude(-2, -20));

        // Value types will never be reference equal
        if ( Coordinate.ReferenceEquals(coordinate1,
            coordinate1) )
        {
            throw new Exception(
                "coordinate1 reference equals coordinate1");
        }

        Console.WriteLine(
            "coordinate1 does NOT reference equal itself" );
    }
}
In contrast to the definition of Coordinate
as a reference type in Chapter 9, the definition going forward is that of a value type (struct
) because the combination of Longitude
and Latitude
data is logically thought of as a value and its size is less than 16 bytes. (In Chapter 9, Coordinate
aggregated Angle
rather than Longitude
and Latitude
.) A contributing factor to declaring Coordinate
as a value type is that it is a (complex) numeric value that has operations on it. In contrast, a reference type such as Employee
is not a value that you manipulate numerically, but rather refers to an object in real life.
Equals()
To determine whether two objects are equal (i.e., if they have the same identifying data), you use an object’s Equals()
method. The implementation of this virtual method on object
uses ReferenceEquals()
to evaluate equality. Since this implementation is often inadequate, it is sometimes necessary to override Equals()
with a more appropriate implementation.
Note
The implementation of object.Equals()
, the default implementation on all objects before overriding, relies on ReferenceEquals()
alone.
For objects to equal one another, the expectation is that the identifying data within them will be equal. For ProductSerialNumber
s, for example, the ProductSeries
, Model
, and Id
must be the same; however, for an Employee
object, perhaps comparing EmployeeId
s would be sufficient to determine equality. To correct the object.Equals()
implementation, it is necessary to override it. Value types, for example, override the Equals()
implementation to instead use the fields that the type includes.
The steps for overriding Equals()
are as follows:
1. Check for null.
2. Check for equivalent types.
3. Invoke a typed helper method that can treat the operand as the compared type rather than an object (see the Equals(Coordinate obj) method in Listing 10.5).
4. Possibly check for equivalent hash codes to short-circuit an extensive, field-by-field comparison. (Two objects that are equal cannot have different hash codes.)
5. Check base.Equals().
6. Compare each identifying field for equality.
7. Override GetHashCode().
8. Override the == and != operators (see the next section).
Listing 10.5 shows a sample Equals()
implementation.
public struct Longitude
{
    // ...
}

public struct Latitude
{
    // ...
}

public struct Coordinate: IEquatable<Coordinate>
{
    public Coordinate(Longitude longitude, Latitude latitude)
    {
        Longitude = longitude;
        Latitude = latitude;
    }

    public Longitude Longitude { get; }
    public Latitude Latitude { get; }

    public override bool Equals(object? obj)
    {
        // STEP 1: Check for null
        if (obj is null)
        {
            return false;
        }
        // STEP 2: Equivalent data types;
        // can be avoided if type is sealed
        if (GetType() != obj.GetType())
        {
            return false;
        }
        // STEP 3: Invoke the strongly typed helper version of Equals()
        return Equals((Coordinate)obj);
    }

    public bool Equals(Coordinate obj)
    {
        // STEP 1: Check for null if the parameter is a reference type
        // (not needed here because Coordinate is a struct)
        // if (ReferenceEquals(obj, null))
        // {
        //     return false;
        // }

        // STEP 4: Possibly check for equivalent hash codes
        // but not if the identity properties are mutable
        // and the hash code is cached.
        // if (GetHashCode() != obj.GetHashCode())
        // {
        //     return false;
        // }

        // STEP 5: Check base.Equals if base overrides Equals()
        if ( !base.Equals(obj) )
        {
            return false;
        }

        // STEP 6: Compare identifying fields for equality
        // using an overload of Equals on Longitude
        return ( (Longitude.Equals(obj.Longitude)) &&
            (Latitude.Equals(obj.Latitude)) );
    }

    // STEP 7: Override GetHashCode
    public override int GetHashCode() { /* ... */ }
}
In this implementation, the first two checks are relatively obvious. However, step 2 can be avoided if the type is sealed.
Steps 4 to 6 occur in an overload of Equals()
that takes the Coordinate
data type specifically. This way, a comparison of two Coordinate
s will avoid Equals(object? obj)
and its GetType()
check altogether.
Since GetHashCode()
is not cached and is no more efficient than step 6, the GetHashCode()
comparison is commented out. Regardless, because GetHashCode()
does not necessarily return a unique value (it simply identifies when operands are different), on its own it does not conclusively identify equal objects. Furthermore, you should not compare the hash code when identity values are mutable and the hash code is cached; if you do, a comparison of equal objects will return false
.
If base.Equals()
is not implemented, you could eliminate step 5. However, if base.Equals()
was added later, you would be missing an important check. For this reason, you should consider adding it by default.
Like GetHashCode()
, Equals()
should never throw any exceptions. It is a valid choice to compare any object with any other object, and doing so should never result in an exception.
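For a reference type, the same steps might look like the following sketch. The Employee class here is hypothetical, and treating EmployeeId as its only identifying field is an assumption made for illustration. Because the class is sealed, the separate type check of step 2 collapses into the `as` conversion.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical reference type; EmployeeId is assumed to be
// the only identifying field.
public sealed class Employee : IEquatable<Employee>
{
    public Employee(int employeeId, string name)
    {
        EmployeeId = employeeId;
        Name = name;
    }

    public int EmployeeId { get; }
    public string Name { get; }

    // Steps 1-3: 'as' yields null for null or for any other type,
    // and the typed overload then handles the comparison.
    public override bool Equals(object? obj) => Equals(obj as Employee);

    public bool Equals(Employee? other)
    {
        // Null check that does not depend on an overloaded == operator.
        if (other is null)
        {
            return false;
        }
        // Step 6: compare the identifying field only.
        return EmployeeId == other.EmployeeId;
    }

    // Step 7: equal objects must produce equal hash codes.
    public override int GetHashCode() => EmployeeId.GetHashCode();
}

public static class Program
{
    public static void Main()
    {
        Employee first = new Employee(42, "Inigo Montoya");
        Employee second = new Employee(42, "I. Montoya");

        Console.WriteLine(ReferenceEquals(first, second)); // False
        Console.WriteLine(first.Equals(second));           // True

        // The set treats the second instance as a duplicate of the first.
        HashSet<Employee> employees = new HashSet<Employee> { first };
        Console.WriteLine(employees.Add(second));          // False
    }
}
```

Neither Equals() overload can throw here: a null or unrelated argument simply produces false.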
GetHashCode() and Equals() with Tuples
As shown in the previous two sections, the implementations of Equals()
and GetHashCode()
are fairly complex, yet the actual code is generally boilerplate. For Equals()
, it’s necessary to compare all the contained identifying data structures while avoiding infinite recursion and null
reference exceptions. For GetHashCode()
, it’s necessary to combine the unique hash code of each of the non-null
contained identifying data structures in an exclusive OR operation. With C# 7.0 tuples, this turns out to be quite simple.
For Equals(Coordinate coordinate)
, you can group each of the identifying members into a tuple and compare them to the target argument of the same type:
public bool Equals(Coordinate coordinate) =>
    (Longitude, Latitude).Equals(
        (coordinate.Longitude, coordinate.Latitude));
(One might argue that this would be more readable if each identifying member were explicitly compared instead, but I leave that for the reader to arbitrate.) Internally, the tuple (System.ValueTuple<...>
) uses EqualityComparer<T>
, which relies on the type parameter’s implementation of IEquatable<T>
(which contains only a single Equals(T other)
member). Therefore, to correctly override Equals
, you need to follow this guideline: DO implement IEquatable<T>
when overriding Equals()
. That way, your own custom data types will leverage your custom implementation of Equals()
rather than Object.Equals()
.
Perhaps the more compelling of the two implementations is GetHashCode()
and its use of the tuple. Rather than engaging in the complex gymnastics of an exclusive OR operation of the non-null
identifying members, you can simply instantiate a tuple of all identifying members and return the GetHashCode()
value for the tuple, like so:
public override int GetHashCode() =>
    (Radius, StartAngle, SweepAngle).GetHashCode();
Note that as of C# 7.3, tuples support ==
and !=
, which they arguably should have supported from the start. Operator overloading is the topic we investigate next.
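A brief sketch of tuple comparison follows. The element names Degrees and Minutes are chosen only for illustration, and the operator form assumes a compiler set to C# 7.3 or later.

```csharp
using System;

public static class Program
{
    public static void Main()
    {
        (int Degrees, int Minutes) first = (48, 52);
        (int Degrees, int Minutes) second = (48, 52);

        // Equals() is available wherever tuples are.
        Console.WriteLine(first.Equals(second)); // True

        // Element-wise == and != require C# 7.3 or later.
        Console.WriteLine(first == second);      // True
        Console.WriteLine(first != (48, 53));    // True
    }
}
```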
The preceding section looked at overriding Equals()
and provided the guideline that the class should also implement ==
and !=
. Implementing any operator is called operator overloading. This section describes how to perform such overloading not only for ==
and !=
, but also for other supported operators.
For example, string
provides a +
operator that concatenates two strings. This is perhaps not surprising, because string
is a predefined type, so it could possibly have special compiler support. However, C# provides for adding +
operator support to a class or struct. In fact, all operators are supported except x.y
, f(x)
, new
, typeof
, default
, checked
, unchecked
, delegate
, is
, as
, =
, and =>
. One particularly noteworthy operator that cannot be implemented is the assignment operator; there is no way to change the behavior of the =
operator.
Before going through the exercise of implementing an operator overload, consider the fact that such operators are not discoverable through IntelliSense. Unless the intent is for a type to act like a primitive type (e.g., a numeric type), you should avoid overloading an operator.
Comparison Operators (==, !=, <, >, <=, >=)
Once Equals()
is overridden, there is a possible inconsistency. That is, two objects could return true
for Equals()
but false
for the ==
operator because ==
performs a reference equality check by default. To correct this flaw, it is important to overload the equals (==
) and not equals (!=
) operators as well.
For the most part, the implementation for these operators can delegate the logic to Equals()
, or vice versa. However, for reference types, some initial null
checks are required first (see Listing 10.6).
public sealed class ProductSerialNumber
{
    // ...

    public static bool operator ==(
        ProductSerialNumber leftHandSide,
        ProductSerialNumber rightHandSide)
    {
        // Check if leftHandSide is null
        // (operator == would be recursive)
        if(leftHandSide is null)
        {
            // Return true if rightHandSide is also null
            // and false otherwise
            return rightHandSide is null;
        }

        return leftHandSide.Equals(rightHandSide);
    }

    public static bool operator !=(
        ProductSerialNumber leftHandSide,
        ProductSerialNumber rightHandSide)
    {
        return !(leftHandSide == rightHandSide);
    }
}
Note that in this example, we use ProductSerialNumber
rather than Coordinate
to demonstrate the logic for a reference type, which has the added complexity of a null
value.
You should avoid using the equality operator within an equality operator (leftHandSide == null
). Doing so would recursively call back into the method, resulting in a loop that continues until the stack overflows. To avoid this problem, you can use is null
(C# 7.0 or later) or ReferenceEquals()
to check for null
.
Binary Operators (+, -, *, /, %, &, |, ^, <<, >>)
You can add an Arc
to a Coordinate
. However, the code so far provides no support for the addition operator. Instead, you need to define such a method, as Listing 10.7 demonstrates.
struct Arc
{
    public Arc(
        Longitude longitudeDifference,
        Latitude latitudeDifference)
    {
        LongitudeDifference = longitudeDifference;
        LatitudeDifference = latitudeDifference;
    }

    public Longitude LongitudeDifference { get; }
    public Latitude LatitudeDifference { get; }
}

struct Coordinate
{
    // ...

    public static Coordinate operator +(
        Coordinate source, Arc arc)
    {
        Coordinate result = new Coordinate(
            new Longitude(
                source.Longitude + arc.LongitudeDifference),
            new Latitude(
                source.Latitude + arc.LatitudeDifference));
        return result;
    }
}
The +
, -
, *
, /
, %
, &
, |
, ^
, <<
, and >>
operators are implemented as binary static methods, where at least one parameter is of the containing type. The method name is the operator symbol prefixed by the keyword operator
. As shown in Listing 10.8, given the definition of the -
and +
binary operators, you can add and subtract an Arc
to and from the coordinate. Note that Longitude
and Latitude
will also require implementations of the +
operator because they are called by source.Longitude + arc.LongitudeDifference
and source.Latitude + arc.LatitudeDifference
.
public class Program
{
    public static void Main()
    {
        Coordinate coordinate1, coordinate2;
        coordinate1 = new Coordinate(
            new Longitude(48, 52), new Latitude(-2, -20));
        Arc arc = new Arc(new Longitude(3), new Latitude(1));

        coordinate2 = coordinate1 + arc;
        Console.WriteLine(coordinate2);

        coordinate2 = coordinate2 - arc;
        Console.WriteLine(coordinate2);

        coordinate2 += arc;
        Console.WriteLine(coordinate2);
    }
}
The results of Listing 10.8 appear in Output 10.3.
Output 10.3
51° 52' 0 E -1° -20' 0 N
48° 52' 0 E -2° -20' 0 N
51° 52' 0 E -1° -20' 0 N
For Coordinate
, you implement the -
and +
operators to return coordinate locations after adding/subtracting Arc
. This allows you to string multiple operators and operands together, as in result = ((coordinate1 + arc1) + arc2) + arc3
. Moreover, by supporting the same operators (+
/-
) on Arc
(see Listing 10.9 later in this chapter), you could eliminate the parentheses. This approach works because the result of the first operand (arc1 + arc2
) is another Arc
, which you can then add to the next operand of type Arc
or Coordinate
.
In contrast, consider what would happen if you provided a -
operator that had two Coordinate
s as parameters and returned a double
corresponding to the distance between the two coordinate
s. Adding a double
to a Coordinate
is undefined, so you could not string together operators and operands. Caution is in order when defining operators that return a different type, because doing so is counterintuitive.
Combining Assignment with Binary Operators (+=, -=, *=, /=, %=, &=, …)
As previously mentioned, there is no support for overloading the assignment operator. However, assignment operators in combination with binary operators (+=
, -=
, *=
, /=
, %=
, &=
, |=
, ^=
, <<=
, and >>=
) are effectively overloaded when overloading the binary operator. Given the definition of a binary operator without the assignment, C# automatically allows for assignment in combination with the operator. Using the definition of Coordinate
in Listing 10.7, therefore, you can have code such as
coordinate += arc;
which is equivalent to the following:
coordinate = coordinate + arc;
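The same behavior can be seen in a self-contained sketch. The Meters struct is hypothetical and declares only the binary + operator, yet += compiles.

```csharp
using System;

// Hypothetical value type for illustration.
public readonly struct Meters
{
    public Meters(double value)
    {
        Value = value;
    }

    public double Value { get; }

    // Only the binary + operator is declared; no += exists.
    public static Meters operator +(Meters left, Meters right) =>
        new Meters(left.Value + right.Value);
}

public static class Program
{
    public static void Main()
    {
        Meters distance = new Meters(1.5);

        // Treated by the compiler as: distance = distance + new Meters(2.5);
        distance += new Meters(2.5);

        Console.WriteLine(distance.Value); // 4
    }
}
```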
Conditional Logical Operators (&&, ||)
Like assignment operators, conditional logical operators cannot be overloaded explicitly. However, because the logical operators &
and |
can be overloaded, and the conditional operators comprise the logical operators, effectively it is possible to overload conditional operators. x && y is processed as x & y, but only if x does not evaluate to false; otherwise, y is never evaluated and the result is x. Similarly, x || y is processed as x | y only if x does not evaluate to true. To enable support for evaluating a type to true or false (in an if statement, for example), it is necessary to overload the true/false unary operators.
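The following sketch shows the short-circuit behavior with a hypothetical Check struct; the type, its Passed property, and the Evaluate() helper are assumptions for illustration. The helper writes a line each time an operand is evaluated, so the skipped right operand is visible in the output.

```csharp
using System;

// Hypothetical type for illustration.
public readonly struct Check
{
    public Check(bool passed)
    {
        Passed = passed;
    }

    public bool Passed { get; }

    // Both must be declared before && and || can be applied to Check.
    public static bool operator true(Check check) => check.Passed;
    public static bool operator false(Check check) => !check.Passed;

    // && and || are built on these two operators.
    public static Check operator &(Check left, Check right) =>
        new Check(left.Passed && right.Passed);
    public static Check operator |(Check left, Check right) =>
        new Check(left.Passed || right.Passed);
}

public static class Program
{
    private static Check Evaluate(string name, bool passed)
    {
        Console.WriteLine($"Evaluating {name}");
        return new Check(passed);
    }

    public static void Main()
    {
        // operator false returns true for the left operand,
        // so the right operand is never evaluated.
        Check both = Evaluate("first", false) && Evaluate("second", true);

        // operator true lets a Check control an if statement.
        if (both)
        {
            Console.WriteLine("Both passed");
        }
        else
        {
            Console.WriteLine("At least one failed");
        }
    }
}
```

Running the sketch writes "Evaluating first" followed by "At least one failed"; "Evaluating second" never appears.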
Unary Operators (+, -, !, ~, ++, --, true, false)
Overloading unary operators is very similar to overloading binary operators, except that they take only one parameter, also of the containing type. Listing 10.9 overloads the +
and -
operators for Longitude
and Latitude
and then uses these operators when overloading the same operators in Arc
.
public struct Latitude
{
    // ...
    public static Latitude operator -(Latitude latitude)
    {
        return new Latitude(-latitude.DecimalDegrees);
    }
    public static Latitude operator +(Latitude latitude)
    {
        return latitude;
    }
}

public struct Longitude
{
    // ...
    public static Longitude operator -(Longitude longitude)
    {
        return new Longitude(-longitude.DecimalDegrees);
    }
    public static Longitude operator +(Longitude longitude)
    {
        return longitude;
    }
}

public struct Arc
{
    // ...
    public static Arc operator -(Arc arc)
    {
        // Uses unary - operator defined on
        // Longitude and Latitude
        return new Arc(-arc.LongitudeDifference,
            -arc.LatitudeDifference);
    }
    public static Arc operator +(Arc arc)
    {
        return arc;
    }
}
Just as with numeric types, the +
operator in this listing doesn’t have any effect and is provided for symmetry.
Overloading true
and false
is subject to the additional requirement that both must be overloaded, not just one of the two. The signatures are the same as with other operator overloads; however, the return type must be bool
, as demonstrated in Listing 10.10.
public static bool operator false(IsValid item)
{
    // ...
}

public static bool operator true(IsValid item)
{
    // ...
}
You can use types with overloaded true
and false
operators in if
, do
, while
, and for
controlling expressions.
Currently, there is no support in Longitude
, Latitude
, and Coordinate
for casting to an alternative type. For example, there is no way to cast a double
into a Longitude
or Latitude
instance. Similarly, there is no support for assigning a Coordinate
using a string
. Fortunately, C# provides for the definition of methods specifically intended to handle the converting of one type to another. Furthermore, the method declaration allows for specifying whether the conversion is implicit or explicit.
Defining a conversion operator is similar in style to defining any other operator, except that the “operator” is the resultant type of the conversion. Additionally, the operator
keyword follows a keyword that indicates whether the conversion is implicit or explicit (see Listing 10.11).
public struct Latitude
{
    // ...

    public Latitude(double decimalDegrees)
    {
        DecimalDegrees = Normalize(decimalDegrees);
    }

    public double DecimalDegrees { get; }

    // ...

    public static implicit operator double(Latitude latitude)
    {
        return latitude.DecimalDegrees;
    }
    public static implicit operator Latitude(double degrees)
    {
        return new Latitude(degrees);
    }

    // ...
}
With these conversion operators, you now can convert double
s implicitly to and from Latitude
objects. Assuming similar conversions exist for Longitude
, you can simplify the creation of a Coordinate
object by specifying the decimal degrees for each portion of the coordinate (e.g., coordinate = new Coordinate(43, 172);
).
Note
When implementing a conversion operator, either the return or the parameter must be of the enclosing type—in support of encapsulation. C# does not allow you to specify conversions outside the scope of the converted type.
The difference between defining an implicit and an explicit conversion operator centers on preventing an unintentional implicit conversion that results in undesirable behavior. You should be aware of two possible consequences of using the explicit conversion operator. First, conversion operators that throw exceptions should always be explicit. For example, it is highly likely that a string will not conform to the format that a conversion from string
to Coordinate
requires. Given the chance of a failed conversion, you should define the particular conversion operator as explicit, thereby requiring that you be intentional about the conversion and ensure that the format is correct or, alternatively, that you provide code to handle the possible exception. Frequently, the pattern for conversion is that one direction (string
to Coordinate
) is explicit and the reverse (Coordinate
to string
) is implicit.
A second consideration is that some conversions will be lossy. Converting from a float
(4.2
) to an int
is entirely valid, assuming an awareness of the fact that the decimal portion of the float
will be lost. Any conversions that will lose data and will not successfully convert back to the original type should be defined as explicit. If an explicit cast is unexpectedly lossy or invalid, consider throwing a System.InvalidCastException
.
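To make the implicit/explicit distinction concrete, here is a minimal sketch using a hypothetical Percentage struct; the type and its 0 to 100 range are assumptions for illustration. The conversion that always succeeds is implicit, while the one that can fail is explicit and throws System.InvalidCastException.

```csharp
using System;

// Hypothetical type for illustration.
public readonly struct Percentage
{
    private Percentage(double value)
    {
        Value = value;
    }

    public double Value { get; }

    // Always succeeds and loses nothing, so implicit is reasonable.
    public static implicit operator double(Percentage percentage) =>
        percentage.Value;

    // Can fail, so the caller must opt in with a cast.
    public static explicit operator Percentage(double value)
    {
        if (value < 0 || value > 100)
        {
            throw new InvalidCastException(
                $"{value} is outside the range 0 to 100.");
        }
        return new Percentage(value);
    }
}

public static class Program
{
    public static void Main()
    {
        Percentage half = (Percentage)50.0; // Explicit cast required
        double raw = half;                  // Implicit conversion
        Console.WriteLine(raw);             // 50

        try
        {
            double invalid = (Percentage)150.0;
            Console.WriteLine(invalid);     // Not reached
        }
        catch (InvalidCastException exception)
        {
            Console.WriteLine(exception.Message);
        }
    }
}
```

Omitting the cast on the first line of Main() would be a compile-time error, which is the point: the caller has to acknowledge that the conversion may fail.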
Instead of placing all code into one monolithic binary file, C# and the underlying CLI framework allow you to spread code across multiple assemblies. This approach enables you to reuse assemblies across multiple executables.
Frequently, the code we write could be useful to more than one program. Imagine, for example, using the Longitude
, Latitude
, and Coordinate
classes from a mapping program and a digital photo geocoding program or writing a command-line parser class. Classes and sets of classes like these can be written once and then reused from many different programs. As such, they need to be grouped together into an assembly called a library or class library and written for the purposes of reuse rather than only within a single program.
To create a library rather than a console project, follow the same directions as provided in Chapter 1, with one exception: For Dotnet CLI, use Class Library or classlib for the template.
Similarly, with Visual Studio 2019, from the File->New Project… menu item (Ctrl+Shift+N), use the Search text box to find all Class Library templates, and then select Class Library (.NET Standard)—the Visual C# version, of course. Use GeoCoordinates for the project name.
Next, place the source code from Listing 10.9 into separate files, one per struct, naming each file after the struct it contains, and then build the project. Building the project compiles the C# code into an assembly (a GeoCoordinates.dll
file) and places it into a subdirectory of .\bin\
.
Given the library, we need to reference it from a program. For example, for a new console program using the Program
class from Listing 10.8, we need to add a reference to the GeoCoordinates.dll
assembly, identifying where the library is located and embedding metadata that uniquely identifies the library into the program. There are several ways to do this. First, you can reference the library project file (*.csproj
), thus identifying which project contains the library source code and forming a dependency between the two projects. You can’t compile the program referencing the library until the library is compiled. This dependency causes the library to compile (if it isn’t compiled already) when the program compiles.
The second approach is to reference the assembly file itself. In other words, reference the compiled library (*.dll
) rather than the project. This makes sense when the library is compiled separately from the program, such as by another team within your organization.
Third, you can reference a NuGet package, as described in the next section.
Note that it isn’t only console programs that can reference libraries and packages. In fact, any assembly can reference any other assembly. Frequently, one library will reference another library, creating a chain of dependencies.
In Chapter 1, we discussed creating a console program. Doing so created a program that included a Main
method—the entry point at which the program will begin executing. To add a reference to the newly created assembly, we continue where we left off with an additional command for adding a reference:
dotnet add .\HelloWorld\HelloWorld.csproj package .\GeoCoordinates\bin\Debug\netcoreapp2.0\GeoCoordinates.dll
Following the package
argument is the file path of the compiled assembly that the project references.
Rather than referencing the assembly, you can reference the project file. As already mentioned, this chains the projects together so that building the program will trigger the class library to compile first if it hasn’t compiled already. The advantage is that as the program compiles, it will automatically locate the compiled class library assembly—whether it be in the debug or release directory, for example. The command for referencing a project file is as follows:
dotnet add .\HelloWorld\HelloWorld.csproj reference .\GeoCoordinates\GeoCoordinates.csproj
If you have the source code for a class library and that source code changes frequently, consider referencing the class library using the class library project file rather than the compiled assembly.
Upon completion of either the project or the compiled assembly reference, your project can compile with the Program
class source code found in Listing 10.8.
In Chapter 1, we also discussed creating a console program with Visual Studio. This created a program that included a Main
method. To add a reference to the GeoCoordinates
assembly, click the Project->Add Reference… menu item. Next, from the Projects\Solution tab, select the GeoCoordinates project and click OK to confirm the reference.
Similarly, to add an assembly reference, follow the same process as before, clicking the Project->Add Reference… menu item. However, this time click the Browse… button and navigate to and select the GeoCoordinates.dll
assembly.
As with Dotnet CLI, you can compile the program project with the Program
class source code found in Listing 10.8.
Begin 4.0
Starting with Visual Studio 2010, Microsoft introduced a library packaging system called NuGet. This system is intended to provide a means to easily share libraries across projects and between companies. Frequently, a library assembly is more than just a single compiled file. It might have configuration files, additional resources, and metadata associated with it. Unfortunately, before NuGet, there was no manifest that identified all the dependencies. Furthermore, there was no standard provider or package library for where the referenced assemblies could be found.
NuGet addresses both issues. Not only does NuGet include a manifest that identifies the author(s), companies, dependencies, and more, it also comes with a default package provider at NuGet.org where packages can be uploaded, updated, indexed, and then downloaded by projects that are looking to leverage them. With NuGet, you can reference a NuGet package (*.nupkg
) and have it automatically installed from one of your preconfigured NuGet provider URLs.
The NuGet package is accompanied by a manifest (a *.nuspec
file) that contains all the additional metadata included in the package. Additionally, it provides all the additional resources you may want—localization files, config files, content files, and so on. In the end, the NuGet package is an archive of all the individual resources combined into a single ZIP file—albeit with the .nupkg
extension. If you rename the file with a *.zip
extension, you can open and examine the file using any common compression utility.
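Because a package is an ordinary ZIP archive, its contents can also be listed from code. The following is a minimal sketch, not part of the NuGet tooling: PackageInspector and ListEntries are hypothetical names chosen for illustration, and the package path is whatever you pass on the command line. It relies only on the ZipFile and ZipArchive types in System.IO.Compression.

```csharp
using System;
using System.Collections.Generic;
using System.IO.Compression;

public static class PackageInspector
{
    // Returns the entry names stored inside a .nupkg file.
    // A .nupkg is a ZIP archive, so ZipFile can open it directly.
    public static List<string> ListEntries(string packagePath)
    {
        var names = new List<string>();
        using (ZipArchive archive = ZipFile.OpenRead(packagePath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                names.Add(entry.FullName);
            }
        }
        return names;
    }
}

public static class PackageInspectorProgram
{
    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("Usage: <program> <path-to-package.nupkg>");
            return;
        }

        // Print one archive entry per line (manifest, assemblies, and so on).
        foreach (string name in PackageInspector.ListEntries(args[0]))
        {
            Console.WriteLine(name);
        }
    }
}
```

Running this against a package should list the *.nuspec manifest alongside the compiled assemblies and any other resources the package carries.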
Begin 7.0
Adding a NuGet package to your project using Dotnet CLI requires executing a single command:
>dotnet add .\HelloWorld\HelloWorld.csproj package Microsoft.Extensions.Logging.Console
This command checks each of the registered NuGet package providers for the specified package and downloads it. (You can also trigger the download explicitly using the command dotnet restore
.)
To create a local NuGet package, use the dotnet pack
command. This command generates a GeoCoordinates.1.0.0.nupkg
file, which you can reference using the add ... package
command.
The digits following the assembly name correspond to the package version number. To specify the version number explicitly, edit the project file (*.csproj
) and add a <Version>...</Version>
child element to the PropertyGroup
element.
End 7.0
If you followed the instructions laid out in Chapter 1, you already have a HelloWorld
project. Starting with that project, you can add a NuGet package using Visual Studio 2019 as follows:
Click the Project->Manage NuGet Packages… menu item (see Figure 10.2).
Select the Browse filter (generally the Installed filter is selected, so be sure to switch to Browse to add new package references), and then enter Microsoft.Extensions.Logging.Console into the Search (Ctrl+E) text box. Note that a partial name such as Logging.Console will also filter the list (see Figure 10.3).
Click the Install button to install the package into the project.
Upon completion of these steps, it is possible to begin using the Microsoft.Extensions.Logging.Console
library, along with any dependencies that it may have (which are automatically added in the process).
As with Dotnet CLI, you can use Visual Studio to build your own NuGet package using the Build->Pack <Project Name> menu item. Similarly, you can specify the package version number from the Package tab of the Project Properties.
Once the package or project is referenced, you can begin using it as though all the source code was included in the project. Listing 10.12 shows, for example, how to use the Microsoft.Extensions.Logging
library, and Output 10.4 shows the sample output.
using Microsoft.Extensions.Logging;

public class Program
{
    public static void Main(string[] args)
    {
        using ILoggerFactory loggerFactory =
            LoggerFactory.Create(builder =>
                builder.AddConsole()/*.AddDebug()*/);

        ILogger logger = loggerFactory.CreateLogger(
            categoryName: "Console");

        logger.LogInformation(
            $@"Hospital Emergency Codes: = '{ string.Join("', '", args)}'");
        // ...

        logger.LogWarning("This is a test of the emergency...");
        // ...
    }
}
Output 10.4
>dotnet run -- black blue brown CBR orange purple red yellow
info: Console[0]
      Hospital Emergency Codes: = 'black', 'blue', 'brown', 'CBR', 'orange', 'purple', 'red', 'yellow'
warn: Console[0]
      This is a test of the emergency...
The Microsoft.Extensions.Logging.Console
NuGet package is used to log data to the console. In this case, we log both an information message and a warning, and both messages appear in the console.
If you also referenced the Microsoft.Extensions.Logging.Debug
library, you could add an .AddDebug()
invocation after or before the AddConsole()
invocation. The result would be that output similar to Output 10.4 would also appear in the debug output window of Visual Studio (select the Debug->Windows->Output menu) or Visual Studio Code (with the View->Debug Console menu).
The Microsoft.Extensions.Logging.Console
NuGet package has three dependencies, including Microsoft.Extensions.Logging
. Each of these is listed under the Dependencies\Packages node of the project in the Solution Explorer window of Visual Studio. When you add a NuGet package, all of its dependencies are added automatically.
End 4.0
Just as classes serve as an encapsulation boundary for behavior and data, so assemblies provide for similar boundaries among groups of types. Developers can break a system into assemblies and then share those assemblies with multiple applications or integrate them with assemblies provided by third parties.
public
or internal
Access Modifiers on Type Declarations
By default, a class or struct without any access modifier is defined as internal
.2 The result is that the class is inaccessible from outside the assembly. Even if another assembly references the assembly containing the class, all internal classes within the referenced assemblies will be inaccessible.
2. Excluding nested types, which are private
by default.
Just as private
and protected
provide levels of encapsulation to members within a class, so C# supports the use of access modifiers at the class level for control over the encapsulation of the classes within an assembly. The access modifiers available are public
and internal
. To expose a class outside the assembly, the class must be marked as public
. Therefore, before compiling the GeoCoordinates.dll
assembly, it is necessary to modify the type declarations as public
(see Listing 10.13).
public struct Coordinate
{
    // ...
}

public struct Latitude
{
    // ...
}

public struct Longitude
{
    // ...
}

public struct Arc
{
    // ...
}
Similarly, declarations such as class
and enum
can be either public
or internal
.3 The internal access modifier is not limited to type declarations; that is, it is also available on type members. Consequently, you can designate a type as public
but mark specific methods within the type as internal
so that the members are available only from within the assembly. It is not possible for the members to have a greater accessibility than the type. If the class is declared as internal
, public members on the type will be accessible only from within the assembly.
3. You can decorate nested classes with any access modifier available to other class members (e.g., private
). However, outside the class scope, the only access modifiers that are available are public
and internal
.
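The interaction between type-level and member-level modifiers can be sketched as follows. This is an illustration only: the members of Arc and the ArcFormatter class are hypothetical (the listings do not show them), and the reflection calls in Main simply report what the compiler recorded for each type.

```csharp
using System;

// Visible to any assembly that references this one.
public struct Arc
{
    public Arc(double degrees) { Degrees = degrees; }

    public double Degrees { get; }

    // Callable only from code in the same assembly,
    // even though the containing type is public.
    internal double ToRadians() => Degrees * Math.PI / 180.0;
}

// No access modifier: the type defaults to internal.
class ArcFormatter
{
    // Declared public, but effectively capped by the internal type.
    public string Format(Arc arc) =>
        $"{arc.Degrees} deg = {arc.ToRadians():F4} rad";
}

public static class AccessModifierDemo
{
    public static void Main()
    {
        Console.WriteLine(typeof(Arc).IsPublic);             // True
        Console.WriteLine(typeof(ArcFormatter).IsNotPublic); // True

        // Both calls compile because this code is in the same assembly.
        Console.WriteLine(new ArcFormatter().Format(new Arc(180)));
    }
}
```

From a second assembly that references this one, Arc and its Degrees property would be usable, whereas ToRadians() and ArcFormatter would not compile.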
protected internal
Type Modifier
Another type member access modifier is protected internal
. Members with an accessibility modifier of protected internal
will be accessible from all locations within the containing assembly and from classes that derive from the type, even if the derived class is not in the same assembly. The default member access modifier is private
, so when you add an access modifier (other than public
), the member becomes slightly more visible.
Note
Members with an accessibility modifier of protected internal
will be accessible from all locations within the containing assembly and from classes that derive from the type, even if the derived class is not in the same assembly.
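As a rough illustration of this rule, the sketch below declares a protected internal member and calls it from an unrelated class in the same assembly. Shape, Circle, and Describe are hypothetical names used only for this example; the reflection check confirms that the member is recorded as accessible to derived types or to the containing assembly.

```csharp
using System;
using System.Reflection;

public class Shape
{
    // Accessible anywhere in this assembly and from derived
    // classes, including derived classes in other assemblies.
    protected internal virtual string Describe() => "shape";
}

public class Circle : Shape
{
    protected internal override string Describe() => "circle";
}

public static class ProtectedInternalDemo
{
    // Reports whether the named instance method is protected internal.
    public static bool IsProtectedInternal(Type type, string methodName)
    {
        var method = type.GetMethod(
            methodName, BindingFlags.Instance | BindingFlags.NonPublic);
        return method != null && method.IsFamilyOrAssembly;
    }

    public static void Main()
    {
        // This class does not derive from Shape, yet the call
        // compiles because it is in the same assembly.
        Shape shape = new Circle();
        Console.WriteLine(shape.Describe());                               // circle
        Console.WriteLine(IsProtectedInternal(typeof(Shape), "Describe")); // True
    }
}
```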
End 7.2
As mentioned in Chapter 2, all data types are identified by the combination of their namespace and their name. However, in the CLR, there is no such thing as a “namespace.” The type’s name actually is the fully qualified type name, including the namespace. For the classes you defined earlier, there was no explicit namespace declaration. Classes such as these are automatically declared as members of the default global namespace. It is likely that such classes will experience a name collision, which occurs when you attempt to define two classes with the same name. Once you begin referencing other assemblies from third parties, the likelihood of a name collision increases even further.
More important, there are thousands of types in the CLI framework and multiple orders of magnitude more outside the framework. Finding the right type for a particular problem, therefore, could potentially be a significant challenge.
The resolution to both of these problems is to organize all the types, grouping them into logical related categories called namespaces. For example, classes outside the System
namespace are generally placed into a namespace corresponding with the company, product name, or both. Classes from Addison-Wesley, for example, are placed into an Awl
or AddisonWesley
namespace, and classes from Microsoft (not System
classes) are located in the Microsoft
namespace. The second level of a namespace should be a stable product name that will not vary between versions. Stability, in fact, is key at all levels. Changing a namespace name is a version-incompatible change that should be avoided. For this reason, you should avoid using volatile names (organization hierarchy, fleeting brands, and so on) within a namespace name.
Namespaces should be labeled using PascalCase, but if your brand uses nontraditional casing, it is acceptable to use the brand casing. (Consistency is key, so if that will be problematic—with PascalCase or brand-based casing—favor the use of whichever convention will produce the greater consistency.) You use the namespace
keyword to create a namespace and to assign a class to it, as shown in Listing 10.14.
// Define the namespace AddisonWesley
namespace AddisonWesley
{
    class Program
    {
        // ...
    }
}
// End of AddisonWesley namespace declaration
All content between the namespace declaration’s curly braces will then belong within the specified namespace. In Listing 10.14, for example, Program
is placed into the namespace AddisonWesley
, making its full name AddisonWesley.Program
.
NOTE
In the CLR, there is no such thing as a “namespace.” Rather, the type’s name is the fully qualified type name.
Like classes, namespaces support nesting. This provides for a hierarchical organization of classes. All the System
classes relating to network APIs are in the namespace System.Net
, for example, and those relating to the Web are in System.Web
.
There are two ways to nest namespaces. The first approach is to nest them within one another (similar to classes), as demonstrated in Listing 10.15.
// Define the namespace AddisonWesley
namespace AddisonWesley
{
    // Define the namespace AddisonWesley.Michaelis
    namespace Michaelis
    {
        // Define the namespace
        // AddisonWesley.Michaelis.EssentialCSharp
        namespace EssentialCSharp
        {
            // Declare the class
            // AddisonWesley.Michaelis.EssentialCSharp.Program
            class Program
            {
                // ...
            }
        }
    }
}
// End of AddisonWesley namespace declaration
Such a nesting will assign the Program
class to the AddisonWesley.Michaelis.EssentialCSharp
namespace.
The second approach is to use the full namespace in a single namespace declaration in which a period separates each identifier, as shown in Listing 10.16.
// Define the namespace AddisonWesley.Michaelis.EssentialCSharp
namespace AddisonWesley.Michaelis.EssentialCSharp
{
    class Program
    {
        // ...
    }
}
// End of AddisonWesley.Michaelis.EssentialCSharp namespace declaration
Regardless of whether a namespace declaration follows the pattern shown in Listing 10.15, that in Listing 10.16, or a combination of the two, the resultant CIL code will be identical. The same namespace may occur multiple times, in multiple files, and even across assemblies. For example, with the convention of one-to-one correlation between files and classes, you can define each class in its own file and surround it with the same namespace declaration.
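To see that the two declaration styles land in the same namespace, consider the following sketch. It declares one class with nested namespace blocks and another with the dotted form, and then prints the fully qualified names. The Buyer and Seller classes here are empty placeholders, and NamespaceDemo is a hypothetical name.

```csharp
using System;

// Nested declaration.
namespace AddisonWesley
{
    namespace Michaelis
    {
        namespace EssentialCSharp
        {
            public class Buyer { }
        }
    }
}

// Dotted declaration; this reopens the same namespace.
namespace AddisonWesley.Michaelis.EssentialCSharp
{
    public class Seller { }

    public static class NamespaceDemo
    {
        public static void Main()
        {
            // Prints AddisonWesley.Michaelis.EssentialCSharp.Buyer
            Console.WriteLine(typeof(Buyer).FullName);

            // Prints True: both types share one namespace.
            Console.WriteLine(
                typeof(Seller).Namespace == typeof(Buyer).Namespace);
        }
    }
}
```

In practice, the two declarations would normally live in separate files, each wrapped in the same namespace declaration.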
Given that namespaces are key for organizing types, it is frequently helpful to use the namespace for organizing all the class files. For this reason, it is a good idea to create a folder for each namespace, placing a class such as AddisonWesley.Fezzik.Services.RegistrationService
into a folder hierarchy corresponding to the name.
When using Visual Studio projects, if the project name is AddisonWesley.Fezzik
, you should create one subfolder called Services
into which RegistrationService.cs
is placed. You would then create another subfolder (Data
, for example) into which you place classes relating to entities within the program (RealestateProperty
, Buyer
, and Seller
, for example).
Chapter 1 introduced comments. However, you can use XML comments for more than just notes to other developers reviewing the source code. XML-based comments follow a practice popularized with Java. Although the C# compiler ignores all comments as far as the resultant executable goes, the developer can use command-line options to instruct the compiler 4 to extract the XML comments into a separate XML file. By taking advantage of the XML file generation, the developer can generate documentation of the API from the XML comments. In addition, C# editors can parse the XML comments in the code and display them to developers as distinct regions (e.g., as a different color from the rest of the code) or parse the XML comment data elements and display them to the developer.
4. The C# standard does not specify whether the C# compiler or a separate utility should take care of extracting the XML data. However, all mainstream C# compilers include the necessary functionality via a compile switch instead of within an additional utility.
Figure 10.4 demonstrates how an IDE can take advantage of XML comments to assist the developer with a tip about the code he is trying to write. Such coding tips offer significant assistance in large programs, especially when multiple developers share code. For this to work, however, the developer obviously must take the time to enter the XML comments within the code and then direct the compiler to create the XML file. The next section explains how to accomplish this.
Begin 2.0
Starting with Visual Studio 2019, you can also embed simple HTML into a comment, and it will be reflected in the tips. For example, surrounding console
with <strong>
and </strong>
will cause the word “console” to display in bold in Figure 10.4.
Consider the listing of the DataStorage
class, as shown in Listing 10.17.
Listing 10.17 uses both XML-delimited comments that span multiple lines and single-line XML comments in which each line requires a separate three-forward-slash delimiter (///
).
Given that XML comments are designed to document the API, they are intended for use only in association with C# declarations, such as the class or method shown in Listing 10.17. Any attempt to place an XML comment inline with the code, unassociated with a declaration, will result in a warning by the compiler. The compiler makes the association simply because the XML comment appears immediately before the declaration.
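For readers without the listing at hand, the following sketch suggests what a class commented in this way might look like. The comment text is taken from the generated XML shown in Listing 10.18; everything else is an assumption made for illustration, including the method bodies, the file-naming scheme, the GetFileName helper, the minimal Employee class, and the demo Main. Store is documented with single-line XML comments, and Load with a delimited XML comment.

```csharp
using System;
using System.IO;

/// <summary>
/// DataStorage is used to persist and retrieve
/// employee data from the files.
/// </summary>
public class DataStorage
{
    /// <summary>
    /// Save an employee object to a file
    /// named with the Employee name.
    /// </summary>
    /// <remarks>
    /// This method uses
    /// <seealso cref="System.IO.FileStream"/>
    /// in addition to
    /// <seealso cref="System.IO.StreamWriter"/>
    /// </remarks>
    /// <param name="employee">
    /// The employee to persist to a file</param>
    /// <date>January 1, 2000</date>
    public static void Store(Employee employee)
    {
        // Hypothetical body: write one field per line.
        using FileStream stream = new FileStream(
            GetFileName(employee.FirstName, employee.LastName),
            FileMode.Create);
        using StreamWriter writer = new StreamWriter(stream);
        writer.WriteLine(employee.FirstName);
        writer.WriteLine(employee.LastName);
    }

    /** <summary>
     * Loads up an employee object
     * </summary>
     * <remarks>
     * This method uses
     * <seealso cref="System.IO.FileStream"/>
     * in addition to
     * <seealso cref="System.IO.StreamReader"/>
     * </remarks>
     * <param name="firstName">
     * The first name of the employee</param>
     * <param name="lastName">
     * The last name of the employee</param>
     * <returns>
     * The employee object corresponding to the names
     * </returns>
     * <date>January 1, 2000</date> **/
    public static Employee Load(string firstName, string lastName)
    {
        // Hypothetical body: read the two lines written by Store.
        using FileStream stream = new FileStream(
            GetFileName(firstName, lastName), FileMode.Open);
        using StreamReader reader = new StreamReader(stream);
        string first = reader.ReadLine() ?? firstName;
        string last = reader.ReadLine() ?? lastName;
        return new Employee(first, last);
    }

    // Hypothetical naming scheme for the data file.
    private static string GetFileName(string firstName, string lastName) =>
        firstName + lastName + ".dat";
}

// Hypothetical minimal Employee type.
public class Employee
{
    public Employee(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    public string FirstName { get; }
    public string LastName { get; }
}

public static class DataStorageDemo
{
    public static void Main()
    {
        DataStorage.Store(new Employee("Inigo", "Montoya"));
        Employee loaded = DataStorage.Load("Inigo", "Montoya");
        Console.WriteLine(loaded.FirstName + " " + loaded.LastName);
    }
}
```

Note that the date element is not one of the standard tags; the compiler passes unrecognized but well-formed tags through to the generated XML file.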
Although C# allows any XML tag to appear in comments, the C# standard explicitly defines a set of tags to be used. <seealso cref="System.IO.StreamWriter"/>
is an example of using the seealso
tag. This tag creates a link between the text and the System.IO.StreamWriter
class.
End 2.0
The compiler checks that the XML comments are well formed and issues a warning if they are not. To generate the XML file, add a DocumentationFile
element to the PropertyGroup
element:
<DocumentationFile>$(OutputPath)\$(TargetFramework)\$(AssemblyName).xml</DocumentationFile>
This element causes an XML file to be generated in the output directory during the build, using <assemblyname>.xml
as the filename. Using the DataStorage
class listed earlier and the project setting shown here, the resultant DataStorage.xml
file appears as shown in Listing 10.18.
<?xml version="1.0"?>
<doc>
    <assembly>
        <name>DataStorage</name>
    </assembly>
    <members>
        <member name="T:DataStorage">
            <summary>
            DataStorage is used to persist and retrieve
            employee data from the files.
            </summary>
        </member>
        <member name="M:DataStorage.Store(Employee)">
            <summary>
            Save an employee object to a file
            named with the Employee name.
            </summary>
            <remarks>
            This method uses
            <seealso cref="T:System.IO.FileStream"/>
            in addition to
            <seealso cref="T:System.IO.StreamWriter"/>
            </remarks>
            <param name="employee">
            The employee to persist to a file</param>
            <date>January 1, 2000</date>
        </member>
        <member name="M:DataStorage.Load(System.String,System.String)">
            <summary>
            Loads up an employee object
            </summary>
            <remarks>
            This method uses
            <seealso cref="T:System.IO.FileStream"/>
            in addition to
            <seealso cref="T:System.IO.StreamReader"/>
            </remarks>
            <param name="firstName">
            The first name of the employee</param>
            <param name="lastName">
            The last name of the employee</param>
            <returns>
            The employee object corresponding to the names
            </returns>
            <date>January 1, 2000</date>
        </member>
    </members>
</doc>
The resultant file includes only the amount of metadata that is necessary to associate an element back to its corresponding C# declaration. This is important because, in general, it is necessary to use the XML output in combination with the generated assembly to produce any meaningful documentation. Fortunately, tools such as the free GhostDoc5 and the open source project NDoc6 can generate documentation.
5. See http://submain.com/ to learn more about GhostDoc.
6. See http://ndoc.sourceforge.net to learn more about NDoc.
Garbage collection is obviously a core function of the runtime. Its purpose is to restore memory consumed by objects that are no longer referenced. The emphasis in this statement is on memory and references: The garbage collector is responsible only for restoring memory; it does not handle other resources such as database connections, handles (files, windows, etc.), network ports, and hardware devices such as serial ports. Also, the garbage collector determines what to clean up, based on whether any references remain. Implicitly, this means that the garbage collector works with reference objects and restores memory on the heap only. Additionally, it means that maintaining a reference to an object will delay the garbage collector from reusing the memory consumed by the object.
All references discussed so far are strong references because they maintain an object’s accessibility and prevent the garbage collector from cleaning up the memory consumed by the object. The framework also supports the concept of weak references. Weak references do not prevent garbage collection on an object, but they do maintain a reference so that if the garbage collector does not clean up the object, it can be reused.
Weak references are designed for reference objects that are expensive to create, yet too expensive to keep around. Consider, for example, a large list of objects loaded from a database and displayed to the user. The loading of this list is potentially expensive, and once the user closes the list, it should be available for garbage collection. However, if the user requests the list multiple times, a second expensive load call will always be required. With weak references, it becomes possible to use code to check whether the list has been cleaned up, and if not, to re-reference the same list. In this way, weak references serve as a memory cache for objects. Objects within the cache are retrieved quickly, but if the garbage collector has recovered the memory of these objects, they will need to be re-created.
Once a reference object (or collection of objects) is recognized as worthy of potential weak reference consideration, it needs to be assigned to System.WeakReference
(see Listing 10.19).
using System;

public static class ByteArrayDataSource
{
    private static byte[] LoadData()
    {
        // Imagine a much larger number
        byte[] data = new byte[1000];
        // Load data
        // ...
        return data;
    }

    private static WeakReference<byte[]>? Data { get; set; }

    public static byte[] GetData()
    {
        byte[]? target;
        if (Data is null)
        {
            target = LoadData();
            Data = new WeakReference<byte[]>(target);
            return target;
        }
        else if (Data.TryGetTarget(out target))
        {
            return target;
        }
        else
        {
            // Reload the data and assign it (creating a strong
            // reference) before setting WeakReference's Target
            // and returning it.
            target = LoadData();
            Data.SetTarget(target);
            return target;
        }
    }
}
// ...
Admittedly, this code uses generics, which aren’t discussed in this book until Chapter 12. However, you can safely ignore the <byte[]>
text both when declaring the Data
property and when assigning it. While there is a nongeneric version of WeakReference
, there is little reason to consider it.7
7. Unless programming with .NET Framework 4.5 or earlier.
The bulk of the logic appears in the GetData()
method. The purpose of this method is to always return an instance of the data—whether from the cache or by reloading it. GetData()
begins by checking whether the Data
property is null
. If it is, the data is loaded and assigned to a local variable called target
. This creates a reference to the data so that the garbage collector will not clear it. Next, we instantiate a WeakReference
and pass a reference to the loaded data so that the WeakReference
object has a handle to the data (its target); then, if requested, such an instance can be returned. Do not pass an instance that does not have a local reference to WeakReference
, because it might get cleaned up before you have a chance to return it (i.e., do not call new WeakReference<byte[]>(LoadData())
).
If the Data
property already has an instance of WeakReference
, then the code calls TryGetTarget()
and, if there is an instance, assigns target
, thus creating a reference so that the garbage collector will no longer clean up the data.
Lastly, if WeakReference
’s TryGetTarget()
method returns false, we load the data, assign the reference with a call to SetTarget()
, and return the newly instantiated object.
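The role of the local strong reference can be observed in isolation with a short sketch. WeakReferenceDemo and IsAlive are hypothetical names; the example assumes only that a strongly referenced object survives a collection, which is why GC.KeepAlive() is called after the check.

```csharp
using System;

public static class WeakReferenceDemo
{
    // Returns true if the weak reference's target has not been collected.
    public static bool IsAlive(WeakReference<byte[]> weak) =>
        weak.TryGetTarget(out _);

    public static void Main()
    {
        byte[] data = new byte[1000];                // strong reference
        var weak = new WeakReference<byte[]>(data);  // weak reference

        GC.Collect();

        // True: the array is still strongly referenced by data.
        Console.WriteLine(IsAlive(weak));

        // Keep the strong reference alive up to this point.
        GC.KeepAlive(data);
    }
}
```

Once no strong reference remains, a later collection may reclaim the array, at which point TryGetTarget() returns false. Exactly when that happens is up to the garbage collector, so the sketch makes no claim about it.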
Garbage collection is a key responsibility of the runtime. Nevertheless, it is important to recognize that the garbage collection process centers on the code’s memory utilization. It is not about the cleaning up of file handles, database connections, ports, or other limited resources.
Finalizers allow developers to write code that will clean up a class’s resources. Unlike constructors that are called explicitly using the new
operator, finalizers cannot be called explicitly from within the code. There is no new
equivalent such as a delete
operator. Rather, the garbage collector is responsible for calling a finalizer on an object instance. Therefore, developers cannot determine at compile time exactly when the finalizer will execute. All they know is that the finalizer will run sometime between when an object was last used and generally when the application shuts down normally. The deliberate hedge introduced by the word “generally” highlights the fact that finalizers might not execute. This possibility is obvious when you consider that a process might terminate abnormally. For instance, events such as the computer being turned off or a forced termination of the process, such as when debugging the process, will prevent the finalizer from running. However, with .NET Core, even under normal circumstances, finalizers may not get processed before the application shuts down. As we shall see in the next section, it thus may be necessary to take additional action to register finalization activities with other mechanisms.
Note
You cannot determine at compile time exactly when the finalizer will execute.
The finalizer declaration is identical to the destructor syntax of C#’s predecessor—that is, C++. As shown in Listing 10.20, the finalizer declaration is prefixed with a tilde before the name of the class.
using System;
using System.IO;

public class TemporaryFileStream
{
    public TemporaryFileStream(string fileName)
    {
        File = new FileInfo(fileName);
        // For a preferable solution use FileOptions.DeleteOnClose.
        Stream = new FileStream(
            File.FullName, FileMode.OpenOrCreate,
            FileAccess.ReadWrite);
    }

    public TemporaryFileStream()
        : this(Path.GetTempFileName()) { }

    // Finalizer
    ~TemporaryFileStream()
    {
        try
        {
            Close();
        }
        catch (Exception exception)
        {
            // Write event to logs or UI
            // ...
        }
    }

    public FileStream? Stream { get; private set; }
    public FileInfo? File { get; private set; }

    public void Close()
    {
        Stream?.Dispose();
        try
        {
            File?.Delete();
        }
        catch (IOException exception)
        {
            Console.WriteLine(exception);
        }
        Stream = null;
        File = null;
    }
}
Finalizers do not allow any parameters to be passed, so they cannot be overloaded. Furthermore, finalizers cannot be called explicitly—that is, only the garbage collector can invoke a finalizer. Access modifiers on finalizers are therefore meaningless, and as such, they are not supported. Finalizers in base classes will be invoked automatically as part of an object finalization call.
Note
Finalizers cannot be called explicitly; only the garbage collector can invoke a finalizer.
Because the garbage collector handles all memory management, finalizers are not responsible for de-allocating memory. Rather, they are responsible for freeing up resources such as database connections and file handles—resources that require an explicit activity that the garbage collector doesn’t know about.
In the finalizer shown in Listing 10.20, we start by disposing of the FileStream
. This step is optional because the FileStream
has its own finalizer that provides the same functionality as Dispose()
. The purpose of invoking Dispose()
now is to ensure that it is cleaned up when TemporaryFileStream
is finalized, since the latter is responsible for instantiating the former. Without the explicit invocation of Stream?.Dispose()
, the garbage collector will clean it up independently from the TemporaryFileStream
once the TemporaryFileStream
object is garbage collected and releases its reference on the FileStream
object. That said, if we didn’t need a finalizer for resource cleanup anyway, it would not make sense to define a finalizer just for invoking FileStream.Dispose()
. In fact, limiting the need for a finalizer to only objects that need resource cleanup that the runtime isn’t already aware of (resources that don’t have finalizers) is an important guideline that significantly reduces the number of scenarios where it is necessary to implement a finalizer.
In Listing 10.20, the purpose of the finalizer is to delete the file8—an unmanaged resource in this case. Hence we have the call to File?.Delete()
. Now, when the finalizers are executed, the file will get cleaned up.
8. Listing 10.20 is somewhat a contrived example because there is a FileOptions.DeleteOnClose
option when instantiating the FileStream
, which triggers the file’s deletion when the FileStream
closes.
Finalizers execute on an unspecified thread, making their execution even less deterministic. This indeterminate nature makes an unhandled exception within a finalizer (outside of the debugger) likely to crash the application—and the source of this problem is difficult to diagnose because the circumstances that led to the exception are not clear. From the user’s perspective, the unhandled exception will be thrown relatively randomly and with little regard for any action the user was performing. For this reason, you should take care to avoid exceptions within finalizers. Instead, you should use defensive programming techniques such as checking for null
(refer to the use of the null-conditional operator in Listing 10.20). In fact, it is advisable to catch all exceptions in the finalizer and report them via an alternative means (such as logging or via the user interface) rather than keeping them as unhandled exceptions. This guideline leads to the try/catch block surrounding the Delete()
invocation.
Another potential option to force finalizers to execute is to invoke System.GC.WaitForPendingFinalizers()
. When this method is invoked, the current thread will be suspended until all finalizers for objects that are no longer referenced have executed.
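A rough sketch of this call in use follows. Tracked, FinalizerDemo, and CollectAndWait are hypothetical names, and the example assumes that allocating the object inside a separate, non-inlined method leaves no reference to it in the caller. The runtime does not guarantee that the finalizer will have run, so the count printed at the end is typical rather than certain.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public class Tracked
{
    // Number of Tracked instances whose finalizer has run.
    public static int FinalizedCount;

    ~Tracked()
    {
        Interlocked.Increment(ref FinalizedCount);
    }
}

public static class FinalizerDemo
{
    // Allocate in a separate, non-inlined method so that no local
    // variable in the caller keeps the object reachable.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Allocate()
    {
        new Tracked();
    }

    public static void CollectAndWait()
    {
        Allocate();
        GC.Collect();                  // request a collection
        GC.WaitForPendingFinalizers(); // block until queued finalizers finish
    }

    public static void Main()
    {
        CollectAndWait();

        // Typically 1, but the runtime makes no hard guarantee.
        Console.WriteLine(Tracked.FinalizedCount);
    }
}
```

The explicit GC.Collect() call is included because WaitForPendingFinalizers() only waits for finalizers that are already queued; it does not itself trigger a collection.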
using
Statement
The problem with finalizers on their own is that they don’t support deterministic finalization (the ability to know when a finalizer will run). Rather, finalizers serve the important role of being a backup mechanism for cleaning up resources if a developer using a class neglects to call the requisite cleanup code explicitly.
For example, consider the TemporaryFileStream
, which includes not only a finalizer but also a Close()
method. This class uses a file resource that could potentially consume a significant amount of disk space. The developer using TemporaryFileStream
can explicitly call Close()
to restore the disk space.
Providing a method for deterministic finalization is important because it eliminates a dependency on the indeterminate timing behavior of the finalizer. Even if the developer fails to call Close()
explicitly, the finalizer will take care of the call. In such a case, the finalizer will run later than if it was called explicitly.
Because of the importance of deterministic finalization, the base class library includes a specific interface for the pattern and C# integrates the pattern into the language. The IDisposable
interface defines the details of the pattern with a single method called Dispose()
, which developers call on a resource class to “dispose” of the consumed resources. Listing 10.21 demonstrates the IDisposable
interface and some code for calling it.
using System;
using System.IO;

static class Program
{
    // ...
    static void Search()
    {
        TemporaryFileStream fileStream =
            new TemporaryFileStream();

        // Use temporary file stream
        // ...

        fileStream.Dispose();

        // ...
    }
}

class TemporaryFileStream : IDisposable
{
    public TemporaryFileStream(string fileName)
    {
        File = new FileInfo(fileName);
        Stream = new FileStream(
            File.FullName, FileMode.OpenOrCreate,
            FileAccess.ReadWrite);
    }

    public TemporaryFileStream()
        : this(Path.GetTempFileName()) { }

    ~TemporaryFileStream()
    {
        Dispose(false);
    }

    public FileStream? Stream { get; private set; }
    public FileInfo? File { get; private set; }

    #region IDisposable Members
    public void Dispose()
    {
        Dispose(true);

        // Unregister from the finalization queue.
        System.GC.SuppressFinalize(this);
    }
    #endregion

    public void Dispose(bool disposing)
    {
        // Do not dispose of an owned managed object (one with a
        // finalizer) if called by member finalize,
        // as the owned managed objects finalize method
        // will be (or has been) called by finalization queue
        // processing already
        if (disposing)
        {
            Stream?.Close();
        }
        try
        {
            File?.Delete();
        }
        catch(IOException exception)
        {
            Console.WriteLine(exception);
        }
        Stream = null;
        File = null;
    }
}
From Program.Search()
, there is an explicit call to Dispose()
after using the TemporaryFileStream
. Dispose()
is the method responsible for cleaning up the resources (in this case, a file) that are not related to memory and therefore are not cleaned up implicitly by the garbage collector. Nevertheless, the execution here contains a hole that would prevent execution of Dispose()
—namely, the chance that an exception will occur between the time when TemporaryFileStream
is instantiated and the time when Dispose()
is called. If this happens, Dispose()
will not be invoked and the resource cleanup will have to rely on the finalizer. To avoid this problem, callers need to implement a try/finally block. Instead of requiring programmers to code such a block explicitly, C# provides a using
statement expressly for the purpose (see Listing 10.22).
static class Program
{
    // ...
    static void Search()
    {
        using (TemporaryFileStream fileStream2 =
            new TemporaryFileStream(),
            fileStream3 = new TemporaryFileStream())
        {
            // Use temporary file stream
        }

        // C# 8.0 or later
        using TemporaryFileStream fileStream1 =
            new TemporaryFileStream();
    }
}
In the first highlighted code snippet, the resultant CIL code is identical to the code that would be created if the programmer specified an explicit try/finally block, where fileStream.Dispose()
is called in the finally block. The using
statement, however, provides a syntax shortcut for the try/finally block.
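To make the equivalence concrete, the following sketch places a using statement next to a hand-written try/finally block. DemoResource is a hypothetical stand-in for TemporaryFileStream so that the example is self-contained, and UsingExpansion is likewise an illustrative name. The hand-written form approximates what the compiler generates rather than reproducing it exactly.

```csharp
using System;

// Hypothetical stand-in for TemporaryFileStream.
class DemoResource : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

static class UsingExpansion
{
    public static DemoResource WithUsing()
    {
        DemoResource captured;
        using (DemoResource resource = new DemoResource())
        {
            captured = resource;
            // Use resource
        }
        // Dispose() has already been called at this point.
        return captured;
    }

    public static DemoResource WithTryFinally()
    {
        DemoResource resource = new DemoResource();
        try
        {
            // Use resource
        }
        finally
        {
            // Runs whether or not the try block throws.
            resource?.Dispose();
        }
        return resource;
    }
}
```

In both methods the returned instance has already been disposed. The instance is returned here only so that its state can be inspected.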
Within this using
statement, you can instantiate more than one variable by separating each variable from the others with a comma. The key considerations are that all variables must be of the same type, the type must implement IDisposable
, and initialization occurs at the time of declaration. To enforce the use of the same type, the data type is specified only once rather than before each variable declaration.
C# 8.0 introduces a potential simplification with regard to resource cleanup. As shown in the second highlighted snippet of Listing 10.22, you can prefix the declaration of a disposable resource (one that implements IDisposable
) with the using
keyword. As is the case with the using
statement, this will generate the try/finally behavior, with the finally block placed just before the variable goes out of scope (in this case, before the closing curly brace of the Search()
method). One additional constraint on the using
declaration is that the variable is read-only, so it can’t be assigned a different value.
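The scoping rule can be observed with a small sketch. TracedResource and UsingDeclarationDemo are hypothetical names introduced only for this illustration. Each resource records its disposal in a shared log, which shows that using declarations are disposed when the enclosing scope ends, in reverse order of declaration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical disposable that records when it is disposed.
class TracedResource : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;

    public TracedResource(string name, List<string> log)
    {
        _name = name;
        _log = log;
    }

    public void Dispose() => _log.Add($"dispose {_name}");
}

static class UsingDeclarationDemo
{
    public static List<string> Run()
    {
        List<string> log = new List<string>();
        Work(log);
        log.Add("after Work");
        return log;
    }

    private static void Work(List<string> log)
    {
        using TracedResource first = new TracedResource("first", log);
        using TracedResource second = new TracedResource("second", log);
        log.Add("end of Work body");

        // The following line would not compile, because a using
        // variable is read-only:
        // first = new TracedResource("other", log);
    }   // second is disposed here, then first.
}
```

Calling UsingDeclarationDemo.Run() yields the entries "end of Work body", "dispose second", "dispose first", and "after Work", in that order.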
IDisposable
There are several additional noteworthy items to point out in Listing 10.21. First, the IDisposable.Dispose()
method contains an important call to System.GC.SuppressFinalize()
. Its purpose is to remove the TemporaryFileStream
class instance from the finalization (f-reachable) queue. This is possible because all cleanup was done in the Dispose()
method rather than waiting for the finalizer to execute.
Without the call to SuppressFinalize()
, the instance of the object will be included in the f-reachable queue—a list of all the objects that are mostly ready for garbage collection, except they also have finalization implementations. The runtime cannot garbage-collect objects with finalizers until after their finalization methods have been called. However, garbage collection itself does not call the finalization method. Rather, references to finalization objects are added to the f-reachable queue and are processed by an additional thread at a time deemed appropriate based on the execution context. In an ironic twist, this approach delays garbage collection for the managed resources—when it is most likely that these very resources should be cleaned up earlier. The reason for the delay is that the f-reachable queue is a list of “references”; as such, the objects are not considered garbage until after their finalization methods are called and the object references are removed from the f-reachable queue.
Note
Objects with finalizers that are not explicitly disposed will end up with an extended object lifetime. Even after all explicit references have gone out of scope, the f-reachable queue will have references, keeping the object alive until the f-reachable queue processing is complete.
For this reason, Dispose()
invokes System.GC.SuppressFinalize
. Invoking this method informs the runtime that it should not add this object to the finalization queue, but instead should allow the garbage collector to de-allocate the object when it no longer has any references (including any f-reachable references).
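A minimal sketch of this effect follows, assuming a hypothetical SuppressedResource type whose finalizer merely counts how often it runs. Because the only instance created is disposed, and Dispose() suppresses finalization, the counter should remain at zero even after a forced collection.

```csharp
using System;
using System.Threading;

// Hypothetical resource type for illustration only.
class SuppressedResource : IDisposable
{
    public static int FinalizerRuns;

    public void Dispose()
    {
        // Cleanup would happen here, so the finalizer is no longer
        // needed for this instance.
        GC.SuppressFinalize(this);
    }

    ~SuppressedResource()
    {
        Interlocked.Increment(ref FinalizerRuns);
    }
}

static class SuppressFinalizeDemo
{
    public static int DisposeThenCollect()
    {
        new SuppressedResource().Dispose();

        GC.Collect();
        GC.WaitForPendingFinalizers();

        // No finalizer ran for the disposed instance.
        return SuppressedResource.FinalizerRuns;
    }
}
```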
Second, Dispose()
calls Dispose(bool disposing)
with an argument of true
. The result is that the Dispose()
method on Stream
is invoked (cleaning up its resources and suppressing its finalization). Next, the temporary file itself is deleted immediately upon calling Dispose()
. This important call eliminates the need to wait for the finalization queue to be processed before cleaning up potentially expensive resources.
Third, rather than calling Close()
, the finalizer now calls Dispose(bool disposing)
with an argument of false
. The result is that Stream
is not closed (disposed) even though the file is deleted. The condition around closing Stream
exists because, if Dispose(bool disposing)
is called from the finalizer, the Stream
instance itself will also be queued up for finalization processing (or possibly it would have already run depending on the order). Therefore, when executing the finalizer, objects owned by the managed resource should not be cleaned up, as this action will be the responsibility of the finalization queue.
Fourth, you should use caution when creating both a Close()
method and a Dispose()
method. It is not clear by looking at only the API that Close()
calls Dispose()
, so developers will be left wondering whether they need to explicitly call Close()
and Dispose()
.
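If a type does offer both methods, one way to reduce the ambiguity is to have Close() do nothing more than forward to Dispose(), and to say so in the documentation. The following is a sketch of that idea only. The Connection type and its IsOpen property are hypothetical.

```csharp
using System;

// Hypothetical resource type; the names are illustrative only.
class Connection : IDisposable
{
    public bool IsOpen { get; private set; } = true;

    // Close() is offered as a domain-friendly alias that simply
    // forwards to Dispose().
    public void Close() => Dispose();

    // Dispose() remains the single place where cleanup happens.
    public void Dispose()
    {
        if (!IsOpen) return;   // Safe to call more than once.
        IsOpen = false;
        // Release the underlying resource here.
    }
}
```

With this arrangement, calling either method, or both in any order, leaves the object in the same state.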
Fifth, to increase the probability that the functionality defined in the finalizer will execute before a process shuts down even in .NET Core, you should register the code with the AppDomain.CurrentDomain.ProcessExit
event handler. Any finalization code registered with this event handler will be invoked barring an abnormal process termination (discussed in the next section).
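One possible shape for such a registration is sketched below. ExitAwareResource is a hypothetical type, and the cleanup body is left as a placeholder. The handler is removed again in Dispose() so that the event does not keep a disposed instance alive.

```csharp
using System;

// Hypothetical resource type; the cleanup body is a placeholder.
class ExitAwareResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public ExitAwareResource()
    {
        // Request cleanup when the process exits normally.
        AppDomain.CurrentDomain.ProcessExit += OnProcessExit;
    }

    private void OnProcessExit(object? sender, EventArgs e) => Dispose();

    public void Dispose()
    {
        if (IsDisposed) return;
        IsDisposed = true;

        // Unsubscribe so the event no longer references this instance.
        AppDomain.CurrentDomain.ProcessExit -= OnProcessExit;

        // Release the underlying resource here.
    }
}
```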
In the preceding section, we discussed how to deterministically dispose of an object with a using
statement and how the finalization queue will dispose of resources in the event that no deterministic approach is used.
A related pattern is called lazy initialization or lazy loading. Using lazy initialization, you can create (or obtain) objects when you need them rather than beforehand—the latter can be an especially problematic situation when those objects are never used. Consider the FileStream
property of Listing 10.24.
using System.IO;

class DataCache
{
    // ...
    public TemporaryFileStream FileStream =>
        InternalFileStream ?? (InternalFileStream =
            new TemporaryFileStream());

    private TemporaryFileStream? InternalFileStream
        { get; set; } = null;
    // ...
}
In the FileStream
expression-bodied property, we check whether InternalFileStream
is null
before returning its value directly. If InternalFileStream
is null
, we first instantiate the TemporaryFileStream
object and assign it to InternalFileStream
before returning the new instance. Thus, the TemporaryFileStream
required in the FileStream
property is created only when the getter on the property is called. If the getter is never invoked, the TemporaryFileStream
object would not be instantiated and we would save whatever execution time such an instantiation would cost. Obviously, if the cost of instantiation is negligible or the instantiation is inevitable (and postponing the inevitable is less desirable), simply assigning it during declaration or in the constructor makes sense.
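The deferred behavior can be observed with a small self-contained sketch. ExpensiveResource and ResourceCache are hypothetical names, and a static counter stands in for the cost of instantiation. The sketch uses the C# 8.0 ??= operator, which is shorthand for the ?? (x = ...) form described above.

```csharp
using System;

// Hypothetical expensive type that counts how often it is constructed.
class ExpensiveResource
{
    public static int InstancesCreated;

    public ExpensiveResource() => InstancesCreated++;
}

class ResourceCache
{
    private ExpensiveResource? _resource;

    // Created on first access only; later reads return the cached instance.
    public ExpensiveResource Resource =>
        _resource ??= new ExpensiveResource();
}
```

Constructing a ResourceCache creates no ExpensiveResource. The first read of Resource creates one, and every later read returns that same instance. Note that this simple form is not thread-safe: two threads reading the property at the same moment could each construct an instance.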
This chapter provided a whirlwind tour of many topics related to building solid class libraries. All the topics pertain to internal development as well, but they are much more critical to building robust classes. Ultimately, the focus here was on forming more robust and programmable APIs. In the category of robustness, we can include namespaces and garbage collection. Both of these topics fit in the programmability category as well, along with overriding object
’s virtual members, operator overloading, and XML comments for documentation.
Exception handling depends heavily on inheritance: it defines an exception hierarchy and requires custom exceptions to fit within that hierarchy. Furthermore, the C# compiler uses inheritance to verify catch block order. In Chapter 11, you will see why inheritance is such a core part of exception handling.