CHAPTER 32

Making Friends with the .NET Framework

The information in the preceding chapters is sufficient for writing objects that will function in the .NET Runtime, but those objects may not work as expected when used in collections or when debugged. This chapter details a few ways to improve this situation.

ToString( )

Overriding the ToString() function defined in the object class gives a useful representation of the values in an object. If this isn’t done, object.ToString() merely returns the fully qualified name of the type, which makes debugging more difficult.

Here’s an example of the default behavior:

using System;
public class Employee
{
    public Employee(int id, string name)
    {
       m_id = id;
       m_name = name;
    }
    int m_id;
    string m_name;
}
class Test
{
    public static void Main()
    {
       Employee herb = new Employee(555, "Herb");
       Console.WriteLine("Employee: {0}", herb);
    }
}

The preceding code will result in the following:

Employee: Employee

By overriding ToString(), the representation can be much more useful.

using System;
public class Employee
{
    public Employee(int id, string name)
    {
       m_id = id;
       m_name = name;
    }
    public override string ToString()
    {
       return(String.Format("{0}({1})", m_name, m_id));
    }
    int m_id;
    string m_name;
}
class Test
{
    public static void Main()
    {
       Employee herb = new Employee(555, "Herb");
       Console.WriteLine("Employee: {0}", herb);
    }
}

This gives a far better result:

Employee: Herb(555)

When Console.WriteLine() needs to convert an object to a string representation, it calls the virtual ToString() function, which dispatches to the object’s own implementation. If more control over formatting is desired, such as for a floating-point class that supports several formats, the IFormattable interface can be implemented. IFormattable is covered in the “Custom Object Formatting” section of Chapter 38.
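As a rough preview of what implementing IFormattable can look like, here is a minimal sketch. The Temperature class and its "C" and "F" format codes are hypothetical and chosen only for illustration; Chapter 38 covers the real details.

```csharp
using System;

// Hypothetical type; neither the class nor its format codes come from the chapter.
public class Temperature : IFormattable
{
    readonly double m_celsius;

    public Temperature(double celsius)
    {
        m_celsius = celsius;
    }

    public override string ToString()
    {
        return ToString("C", null);
    }

    // "C" selects Celsius and "F" selects Fahrenheit (assumed codes).
    // The "F1" passed to double.ToString() is the standard fixed-point format.
    public string ToString(string format, IFormatProvider provider)
    {
        if (String.IsNullOrEmpty(format))
        {
            format = "C";
        }
        switch (format)
        {
            case "C":
                return m_celsius.ToString("F1", provider) + " C";
            case "F":
                return (m_celsius * 9 / 5 + 32).ToString("F1", provider) + " F";
            default:
                throw new FormatException("Unknown format: " + format);
        }
    }
}
```

With this in place, a format item such as {0:F} in Console.WriteLine() passes "F" to the two-argument ToString(), while {0} falls back to the default format.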

Object Equality

Some classes have a strong concept of equality; for example, an Employee class might have a unique identifier associated with it. Such classes should expose that concept so that other classes can use it to check for equality. There are several different ways in which object equality can be defined in C#.

  • By overriding Equals(object obj)
  • By overloading the == and != operators
  • By implementing IEquatable<T>.Equals(T other)

The first two ways were present in all versions of C#. The first one unfortunately has a parameter of type object; this means that calling Equals() with a value type results in an unnecessary boxing and unboxing operation. It also permits you to write interesting code such as the following:

bool result = 13.Equals("aardvark");

The overloaded operators (== and !=) are strongly typed and therefore do not suffer from the same issues. With the introduction of generics, the strongly typed IEquatable<T> interface was introduced, and the original version of Equals() is used only if called with a parameter of type object.
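The difference between the two Equals() overloads can be seen in a small sketch. The Point struct below is a hypothetical type invented for this illustration; the comments note which overload the compiler is expected to choose for each call.

```csharp
using System;

// Hypothetical value type used only to show which overload is chosen.
public struct Point : IEquatable<Point>
{
    public readonly int X;
    public readonly int Y;

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Strongly typed: no boxing when called with a Point.
    public bool Equals(Point other)
    {
        return X == other.X && Y == other.Y;
    }

    // Weakly typed: a Point argument must be boxed by the caller.
    public override bool Equals(object obj)
    {
        return obj is Point && Equals((Point)obj);
    }

    public override int GetHashCode()
    {
        return X ^ Y;
    }
}

class Test
{
    public static void Main()
    {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        object boxed = b;
        Console.WriteLine(a.Equals(b));           // binds to Equals(Point)
        Console.WriteLine(a.Equals(boxed));       // binds to Equals(object)
        Console.WriteLine(a.Equals("aardvark"));  // compiles, but is never true
    }
}
```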

It is important that all of these implementations match. Here’s an example that extends Employee:

using System;
public class Employee: IEquatable<Employee>
{
    public Employee(int id, string name)
    {
       m_id = id;
       m_name = name;
    }
    public bool Equals(Employee other)
    {
       return this == other;
    }
    public override bool Equals(object obj)
    {
       // "as" yields null for a non-Employee instead of throwing
       return Equals(obj as Employee);
    }
    public static bool operator ==(Employee emp1, Employee emp2)
    {
       // handle null operands before touching any fields
       if (ReferenceEquals(emp1, null) || ReferenceEquals(emp2, null))
       {
            return ReferenceEquals(emp1, emp2);
       }
       if (emp1.m_id != emp2.m_id)
       {
            return false;
       }
       else if (emp1.m_name != emp2.m_name)
       {
            return false;
       }
       return true;
    }
    public static bool operator !=(Employee emp1, Employee emp2)
    {
       return !(emp1 == emp2);
    }
    int m_id;
    string m_name;
}
class Test
{
    public static void Main()
    {
       Employee herb = new Employee(555, "Herb");
       Employee herbClone = new Employee(555, "Herb");
       Employee andy = new Employee(123, "Andy");
       Console.WriteLine("Equal: {0}", herb.Equals(herbClone));
       Console.WriteLine("Equal: {0}", herb == herbClone);
       Console.WriteLine("Equal: {0}", herb == andy);
    }
}

This will produce the following output:

Equal: True
Equal: True
Equal: False

In this case, operator==() and operator!=() have also been overloaded, which allows the operator syntax to be used in the last two lines of Main(). These operators must be overloaded in pairs; they cannot be overloaded separately.1

Hashes and GetHashCode( )

The .NET Framework includes two related classes, the pregeneric Hashtable class and the Dictionary<TKey, TValue> class, which are very useful for doing fast lookup of objects by a key. They work by using a hash function, which produces an integer “key” for a specific instance of a class. This key is a condensed version of the contents of the instance. Different instances can have the same hash code, but with a good hash function this is rare.

A hash table uses this key as a way of drastically limiting the number of objects that must be searched to find a specific object in a collection of objects. It does this by first getting the hash value of the object, which will eliminate all objects with a different hash code, leaving only those with the same hash code to be searched. Since the number of instances with that hash code is small, searches can be much quicker.
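The narrowing step described above can be sketched with a toy collection. TinyHashSet, its fixed bucket count, and its method names are all assumptions made for this illustration; the real Hashtable and Dictionary classes are considerably more sophisticated.

```csharp
using System;
using System.Collections.Generic;

// A toy hash table for illustration only (hypothetical class).
public class TinyHashSet
{
    const int BucketCount = 16;   // arbitrary size chosen for the sketch
    readonly List<object>[] m_buckets = new List<object>[BucketCount];

    static int BucketIndex(object item)
    {
        // Mask off the sign bit so the index is never negative.
        return (item.GetHashCode() & 0x7FFFFFFF) % BucketCount;
    }

    public void Add(object item)
    {
        int index = BucketIndex(item);
        if (m_buckets[index] == null)
        {
            m_buckets[index] = new List<object>();
        }
        m_buckets[index].Add(item);
    }

    public bool Contains(object item)
    {
        // Only the one bucket selected by the hash code is searched.
        List<object> bucket = m_buckets[BucketIndex(item)];
        if (bucket == null)
        {
            return false;
        }
        foreach (object candidate in bucket)
        {
            if (candidate.Equals(item))
            {
                return true;
            }
        }
        return false;
    }
}
```

Note that Contains() relies on both GetHashCode(), to pick the bucket, and Equals(), to confirm the match within it.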

That’s the basic idea. For a more detailed explanation, please refer to a good data structures and algorithms book.2 Hashes are a tremendously useful construct.

The GetHashCode() function should be overridden in user-written classes because the values returned by GetHashCode() are required to be related to the value returned by Equals(). Two objects that are the same by Equals() must always return the same hash code.

The default implementation of GetHashCode() doesn’t work this way, and therefore it must be overridden to work correctly. If not overridden, the hash code will be identical only for the same instance of an object, and a search for an object that is equal but not the same instance will fail. It is therefore very important to override GetHashCode() for all objects that override equality.
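The failure mode can be demonstrated with two small key classes. BrokenKey and FixedKey are hypothetical names; the first overrides only Equals() (with the resulting compiler warning suppressed on purpose), and the second overrides both functions.

```csharp
using System;
using System.Collections.Generic;

#pragma warning disable 0659  // GetHashCode() is deliberately omitted below
public class BrokenKey
{
    readonly int m_id;
    public BrokenKey(int id) { m_id = id; }
    public override bool Equals(object obj)
    {
        BrokenKey other = obj as BrokenKey;
        return other != null && m_id == other.m_id;
    }
}
#pragma warning restore 0659

public class FixedKey
{
    readonly int m_id;
    public FixedKey(int id) { m_id = id; }
    public override bool Equals(object obj)
    {
        FixedKey other = obj as FixedKey;
        return other != null && m_id == other.m_id;
    }
    public override int GetHashCode() { return m_id.GetHashCode(); }
}

class Test
{
    public static void Main()
    {
        Dictionary<BrokenKey, string> broken = new Dictionary<BrokenKey, string>();
        broken.Add(new BrokenKey(555), "Herb");
        // Almost always False: the two instances have unrelated hash codes.
        Console.WriteLine(broken.ContainsKey(new BrokenKey(555)));

        Dictionary<FixedKey, string> good = new Dictionary<FixedKey, string>();
        good.Add(new FixedKey(555), "Herb");
        // True: equal keys now produce equal hash codes.
        Console.WriteLine(good.ContainsKey(new FixedKey(555)));
    }
}
```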

Here I extend the example to support GetHashCode():

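using System;
using System.Collections.Generic;
public class Employee: IEquatable<Employee>
{
    public Employee(int id, string name)
    {
       m_id = id;
       m_name = name;
    }
    // needed for the "Herb(555)" text in the output below
    public override string ToString()
    {
       return(String.Format("{0}({1})", m_name, m_id));
    }
    public bool Equals(Employee other)
    {
       return this == other;
    }
    public override bool Equals(object obj)
    {
       // "as" yields null for a non-Employee instead of throwing
       return Equals(obj as Employee);
    }
    public override int GetHashCode()
    {
       return m_id.GetHashCode();
    }
    public static bool operator ==(Employee emp1, Employee emp2)
    {
       // handle null operands before touching any fields
       if (ReferenceEquals(emp1, null) || ReferenceEquals(emp2, null))
       {
            return ReferenceEquals(emp1, emp2);
       }
       if (emp1.m_id != emp2.m_id)
       {
            return false;
       }
       else if (emp1.m_name != emp2.m_name)
       {
            return false;
       }
       return true;
    }
    public static bool operator !=(Employee emp1, Employee emp2)
    {
       return !(emp1 == emp2);
    }
    int m_id;
    string m_name;
}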
class Test
{
    public static void Main()
    {
       Employee herb = new Employee(555, "Herb");

       Employee george = new Employee(123, "George");

       Employee frank = new Employee(111, "Frank");
       Dictionary<Employee, string> employees =
                   new Dictionary<Employee, string>();
       employees.Add(herb, "414 Evergreen Terrace");
       employees.Add(george, "2335 Elm Street");
       employees.Add(frank, "18 Pine Bluff Road");
       Employee herbClone = new Employee(555, "Herb");
       string address = employees[herbClone];
       Console.WriteLine("{0} lives at {1}", herbClone, address);
    }
}

The code outputs the following:

Herb(555) lives at 414 Evergreen Terrace

In the Employee class, the m_id member is unique, so it is used for the hash code. In the Main() function, several employees are created and then used as keys to store their addresses. The lookup with herbClone succeeds because herbClone is equal to herb and returns the same hash code.

If there isn’t a single unique field, the hash code should be created out of the values contained in the object’s fields. If the Employee class didn’t have a unique identifier but did have fields for name and address, the hash function could use those. The following shows a hash function that could be used:3

public class Employee
{
    public Employee(string name, string address)
    {
       m_name = name;
       m_address = address;
    }
    public override int GetHashCode()
    {
       return m_name.GetHashCode() ^ m_address.GetHashCode();
    }
    string m_name;
    string m_address;
}

This implementation of GetHashCode() simply XORs the hash codes of the two fields together and returns the result.
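One weakness of plain XOR is that it is symmetric: swapping the name and address values gives the same hash code, and two identical field values cancel to zero. A common alternative is to fold the fields together with a multiplier. The following is a minimal sketch of that idea, not a recommendation from the text; the constants 17 and 31 are conventional choices rather than required values, and the Equals() override is added here only so that the two functions stay consistent.

```csharp
using System;

public class Employee
{
    public Employee(string name, string address)
    {
        m_name = name;
        m_address = address;
    }

    public override bool Equals(object obj)
    {
        Employee other = obj as Employee;
        return other != null &&
               m_name == other.m_name &&
               m_address == other.m_address;
    }

    public override int GetHashCode()
    {
        // unchecked: integer overflow is expected here and simply wraps around
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (m_name == null ? 0 : m_name.GetHashCode());
            hash = hash * 31 + (m_address == null ? 0 : m_address.GetHashCode());
            return hash;
        }
    }

    string m_name;
    string m_address;
}
```

Because each field is multiplied by a different power of 31, the order of the fields affects the result, and a null field no longer causes an exception.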

Design Guidelines

Any class that overrides Equals() should also override GetHashCode(). In fact, the C# compiler will issue a warning (CS0659) if you forget. The warning exists to prevent strange and difficult-to-debug behavior when the class is used in a Dictionary or Hashtable.

These classes depend on the fact that all instances that are equal have the same hash value. The default implementation of GetHashCode() for reference types, however, returns a value based on the identity of the instance rather than its contents. If this implementation were not overridden, it would be very easy to put objects in a hash table but not be able to retrieve them.

Value Type Guidelines

The System.ValueType class contains a version of Equals() that works for all value types, but this version can fall back on reflection to compare fields and can therefore be slow. It is recommended that Equals() be overridden for all value types.4

Reference Type Guidelines

For most reference types, users will expect that == will mean reference comparison, and in this case == should not be overloaded, even if the object implements Equals().

If the type has value semantics (something like a String or a BigNum), operator==() should be overloaded and Equals() should be overridden. If a class overloads + or -, that’s a pretty good indication that it should also overload == and override Equals().
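As a sketch of this guideline, here is a small immutable reference type with value semantics. The Money class is hypothetical and exists only to show a type that overloads + and therefore also provides ==, !=, Equals(), and GetHashCode().

```csharp
using System;

// Hypothetical immutable type with value semantics.
public sealed class Money : IEquatable<Money>
{
    readonly decimal m_amount;

    public Money(decimal amount)
    {
        m_amount = amount;
    }

    public static Money operator +(Money a, Money b)
    {
        return new Money(a.m_amount + b.m_amount);
    }

    public bool Equals(Money other)
    {
        // ReferenceEquals avoids calling the overloaded == recursively.
        return !ReferenceEquals(other, null) && m_amount == other.m_amount;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Money);
    }

    public override int GetHashCode()
    {
        return m_amount.GetHashCode();
    }

    public static bool operator ==(Money a, Money b)
    {
        if (ReferenceEquals(a, null))
        {
            return ReferenceEquals(b, null);
        }
        return a.Equals(b);
    }

    public static bool operator !=(Money a, Money b)
    {
        return !(a == b);
    }
}
```

With these members, an expression such as new Money(1m) + new Money(2m) == new Money(3m) compares amounts rather than references.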

1 This is required for two reasons. The first is that if a user uses ==, they can expect != to work as well. The other is to support nullable types, for which a == b does not imply !(a != b).

2 I’ve always liked Robert Sedgewick’s Algorithms.

3 This is by no means the only hash function that could be used, or even a particularly good one. Any good algorithms book will have more information on constructing good hash functions.

4 This makes conceptual sense, since types that have a concept of value generally have a concept of equality.
