Chapter 3. Language and API Gotchas

Languages like C# and VB.NET allow you to develop object-oriented code without the complexity of syntax found in languages like C++. Nevertheless, there are features of the C# and VB.NET languages, and the .NET Framework types, that are likely to run counter to your intuitions. In particular, your experience with other languages may lead you to expect that certain features will behave in familiar ways in .NET and that the .NET languages will be consistent in how they handle common tasks. If so, you’ll be in for some surprises. For example, .NET takes a different approach to copy constructors than C++ does. The .NET XML reader needs some help to work as quickly as you might expect. And the VB.NET and C# language compilers don’t handle object initialization in the same way; nor do they spot bad enumerations at compile time, as you might expect.

In this chapter I focus on gotchas at the language and API level of the .NET platform.

GOTCHA #20 Singleton isn’t guaranteed process-wide

A static/Shared field belongs to the class and is not part of any instance. Typically, when you see a static/Shared field, you know that no matter how many instances of the class exist, there is one and only one occurrence of this field. Often a static/Shared field is used to limit the number of instances of an object—the concept of Singleton. (Refer to “Exploring the Singleton Design Pattern” and "Implementing the Singleton Pattern in C#" in the section "on the web" in the Appendix for very good articles on this topic.) A singleton takes measures to make sure that no more than one instance of its type can be created in an application. One way to do this is to make the constructor of the class protected or private, and to provide a static/Shared method to fetch the object, as shown in Example 3-1.

Example 3-1. Example of a singleton

C# (SingletonAppDomain)

using System;

namespace Singleton
{
    public class MySingleton
    {
        public readonly DateTime creationTime;

        protected MySingleton()
        {
            creationTime = DateTime.Now;
        }


        protected static MySingleton theInstance =
            new MySingleton();

        public static MySingleton GetInstance()
        {
            return theInstance;
        }
    }
}

VB.NET (SingletonAppDomain)

 Public Class MySingleton
    Public ReadOnly creationTime As DateTime

    Protected Sub New()
        creationTime = DateTime.Now
    End Sub


    Protected Shared theInstance As New MySingleton

    Public Shared Function GetInstance() As MySingleton
        Return theInstance
    End Function

End Class

The MySingleton class is written so that at most one instance can be created. But here’s the gotcha: the unit of granularity for static/Shared fields in .NET is not the process, but the AppDomain. (Application domains provide isolation, unloading, and security boundaries for executing managed code.) And a process can contain more than one AppDomain. So the above code restricts MySingleton to one instance only within its AppDomain, but not within the entire process, as the code in Example 3-2 demonstrates.

Example 3-2. Singleton within AppDomain

C# (SingletonAppDomain)

using System;
using System.Threading;
using System.Reflection;

namespace Singleton
{
    class Test : MarshalByRefObject
    {
        public void Run()
        {

            MySingleton object1 = MySingleton.GetInstance();

            Console.WriteLine("Object created at {0}",
                object1.creationTime.ToLongTimeString());

            Thread.Sleep(1000);


            MySingleton object2 = MySingleton.GetInstance();
            Console.WriteLine("Object created at {0}",
                object2.creationTime.ToLongTimeString());
        }

        [STAThread]
        static void Main(string[] args)
        {
            Test anObject = new Test();


            anObject.Run();
            Thread.Sleep(1000);

            AppDomain domain =

                AppDomain.CreateDomain("MyDomain");
            Test proxy =
                domain.CreateInstance(
                    Assembly.GetExecutingAssembly().FullName,
                typeof(Test).FullName).Unwrap() as Test;

            proxy.Run();

            Thread.Sleep(1000);
            anObject.Run();

        }
    }
}

VB.NET (SingletonAppDomain)

Imports System.Threading

Public Class Test
        Inherits MarshalByRefObject
    Public Sub Run()

        Dim object1 As MySingleton = MySingleton.GetInstance()

        Console.WriteLine("Object created at {0}", _
         object1.creationTime.ToLongTimeString())

        Thread.Sleep(1000)


        Dim object2 As MySingleton = MySingleton.GetInstance()
        Console.WriteLine("Object created at {0}", _
         object2.creationTime.ToLongTimeString())
    End Sub

    Public Shared Sub Main()
        Dim anObject As Test = New Test


        anObject.Run()
        Thread.Sleep(1000)

        Dim domain As AppDomain = _
           AppDomain.CreateDomain("MyDomain")

        Dim proxy As Test = _
          CType( _
              domain.CreateInstance( _
                System.Reflection. _
                Assembly.GetExecutingAssembly().FullName, _
                GetType(Test).FullName).Unwrap(), Test)


        proxy.Run()

        Thread.Sleep(1000)
        anObject.Run()
    End Sub
End Class

In the above code you call the GetInstance() method of MySingleton from within the Test class’s Run() method. Then you create an object of Test within another AppDomain and call Run() on it. The output from the program is shown in Figure 3-1.

Figure 3-1. Output from Example 3-2

Notice that the four calls to GetInstance() made from within the default AppDomain (that is, calls to Run() from within the Main() method) fetch the same object of MySingleton (as seen in the first two and the last two statements of output). However, the calls to GetInstance() from the AppDomain you created produce a different instance of the MySingleton class.

In the example you create an AppDomain explicitly, so you at least know of its existence. There are times, however, when an AppDomain is created by the .NET Framework (as in ASP.NET) or by other APIs you use, without your being aware of it. A singleton behaves no differently in those cases.

There is an excellent discussion of how and why .NET creates new AppDomains in [Lowy03].

IN A NUTSHELL

A class’s static/Shared fields are unique only in the AppDomain where the class is loaded. Each new AppDomain created in your application produces a new copy of them.

GOTCHA #21 Default performance of Data.ReadXML

The System.Data.DataSet class provides great flexibility for disconnected access to data. Its capabilities to transform data into XML and to read data from XML come in very handy.

One major problem with using XML is performance. Consider the simple XML document in Example 3-3.

Example 3-3. A simple XML document

<root>
    <row>
      <item1>0</item1>
      <item2>0.703552246887028</item2>
      <item3>0.993569961746023</item3>
      <item4>0.147870197961046</item4>
      <item5>0.740130904009627</item5>
    </row>
    <row>
      <item1>1</item1>
      <item2>0.378916004383432</item2>
      <item3>0.143134204737439</item3>
      <item4>0.419504510434114</item4>
      <item5>0.403854837363518</item5>
  </row>
...

The root element contains a number of row elements. Each row contains five elements named item1, item2, etc. Each item contains a value of type double.

If I have 100 rows in this document, it takes 90 milliseconds to read the XML document into the DataSet using ReadXml().[2] If I have 1,000 rows, it takes 200 milliseconds. Not too bad. But if I have 5,000 rows, it takes 5,700 milliseconds. Finally, if I have 10,000 rows, it takes an objectionable 24,295 milliseconds (about 24 seconds).

Interestingly, if I use the System.Xml.XmlDocument parser class to parse the same XML document, it doesn’t take nearly that long. So what’s the problem with ReadXml()?

It turns out that ReadXml() spends most of its time not in parsing the XML document, but in analyzing it to understand its format. In other words, it tries to infer a schema from the XML. So you can achieve a significant speedup by preloading the schema into the DataSet before reading the XML. You can obtain the schema in several ways. For instance, you can ask the sender of the document to provide you with the schema; you can create it manually; or you can use the xsd.exe tool to generate it.
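As a rough sketch of the "create it manually" option, the schema can also be declared in code before the read. The names SchemaSketch and LoadRows are hypothetical, and the sketch assumes the layout from Example 3-3, with the DataSet named after the root element and a single table named row. It is not the approach measured in this gotcha.

```csharp
using System;
using System.Data;
using System.IO;

public static class SchemaSketch
{
    // Hypothetical helper: declares the schema by hand, then loads the XML.
    public static DataSet LoadRows(TextReader xml)
    {
        // Assumption: the DataSet name matches the document's root element.
        DataSet ds = new DataSet("root");
        DataTable rows = ds.Tables.Add("row");

        // item1 through item5, each holding a double as in Example 3-3.
        for (int i = 1; i <= 5; i++)
        {
            rows.Columns.Add("item" + i, typeof(double));
        }

        // IgnoreSchema reads the data into the schema declared above
        // instead of inferring one from the document.
        ds.ReadXml(xml, XmlReadMode.IgnoreSchema);
        return ds;
    }
}
```

You might call it as SchemaSketch.LoadRows(new StreamReader("data.xml")), where the file name is only a placeholder. Declaring the columns in code keeps the schema next to the code that depends on it, at the cost of having to update both when the document format changes.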

Example 3-4 shows the optimization realized when reading an XML document with 10,000 rows. It alternates between reading the XML without knowing its format and loading the format from an .xsd (XML Schema Definition) file before reading the data.

Example 3-4. Speedup due to preloading schema

C# (DataSetXMLSpeed)

using System;
using System.Data;

namespace ReadingXML
{
    class Test
    {
        private static void timeRead(bool fetchSchema)
        {
            DataSet ds = new DataSet();
            int startTick = Environment.TickCount;


            if (fetchSchema)
            {

                ds.ReadXmlSchema(@"..\..\data.xsd");
            }


            ds.ReadXml(@"..\..\data.xml");
         
            int endTick = Environment.TickCount;

            Console.WriteLine(
                "Time taken to read {0} rows is {1} ms",
                ds.Tables[0].Rows.Count,
                (endTick - startTick));
        }

        [STAThread]
        static void Main(string[] args)
        {
            Console.WriteLine("Reading XML into DataSet");
            timeRead(false);

            Console.WriteLine(
            "Reading XML into DataSet after reading Schema");
            timeRead(true);
        }
    }
}

VB.NET (DataSetXMLSpeed)

Module Test

    Private Sub timeRead(ByVal fetchSchema As Boolean)
        Dim ds As DataSet = New DataSet
        Dim startTick As Integer = Environment.TickCount


        If fetchSchema Then
            ds.ReadXmlSchema("..\data.xsd")
        End If


        ds.ReadXml("..\data.xml")

        Dim endTick As Integer = Environment.TickCount

        Console.WriteLine( _
         "Time taken to read {0} rows is {1} ms", _
         ds.Tables(0).Rows.Count.ToString(), _
         (endTick - startTick).ToString())
    End Sub

    Sub Main()
        Console.WriteLine("Reading XML into DataSet")
        timeRead(False)

        Console.WriteLine( _
            "Reading XML into DataSet after reading Schema")
        timeRead(True)
    End Sub

End Module

In this example you read the data.xml file containing 10,000 rows in the format discussed in Example 3-3. In the first run, you load the DataSet with the raw XML document. In the second run, you preload the DataSet with the data.xsd schema file, then ask the program to read the XML document. The data.xsd file was generated using the xsd.exe tool from the .NET command prompt as follows:

xsd data.xml

The time taken for each of these approaches is shown in the output in Figure 3-2.

Figure 3-2. Output showing the speedup from Example 3-4

Reading the XML document cold takes about 25 seconds, while reading it after preloading the schema takes just over half a second.

How does this differ in .NET 2.0 Beta 1? The speed of execution of ReadXml() has significantly improved in .NET 2.0. For the case of 10,000 rows without preloading the schema, it takes only around 1,000 ms. The time taken after preloading the schema was less than 420 ms. It still helps to preload the schema.

IN A NUTSHELL

Preload the schema into the DataSet before calling ReadXml(). This eliminates the time ReadXml() spends inferring the schema from the XML document, which makes a significant difference in performance as the XML file size grows.

SEE ALSO

Gotcha #9, “Typeless ArrayList isn’t type-safe.”

GOTCHA #22 enum lacks type-safety

enum provides convenience and improved productivity. The possible values get listed in IntelliSense, so it’s easy to select the one you want during programming. If your method takes an enum as a parameter, the users of your API will typically select a value from the list presented by IntelliSense. But unfortunately they don’t have to, which could lead to code like that shown in Example 3-5. In this program, Method1() receives an enum and accesses the array resource based on that value. First, you pass three valid values of the Size enum to Method1(). Then, you pass an invalid value of 3. The output is shown in Figure 3-3.

Example 3-5. Example to study type-safety of enum

C# (EnumSafety)

using System;

namespace EnumTypesafety
{
    class Program
    {

        private static int[] resource = new int[] {0, 1, 2};

        public enum Size
        {
            Small,
            Medium,
            Large
        }

        public static void Method1(Size theSize)
        {
            Console.WriteLine(theSize);
            Console.WriteLine("Resource: {0}",
                resource[(int)theSize]);
        }

        [STAThread]
        static void Main(string[] args)
        {
            Method1(Size.Small);
            Method1(Size.Large);
            Method1((Size) 1);
            Method1((Size) 3);
        }
    }
}

VB.NET (EnumSafety)

Module Program
    Private resource() As Integer = New Integer() {0, 1, 2}

    Public Enum Size
        Small
        Medium
        Large
    End Enum

    Public Sub Method1(ByVal theSize As Size)
        Console.WriteLine(theSize)
        Console.WriteLine("Resource: {0}", _
            resource(Convert.ToInt32(theSize)))
    End Sub

    Sub Main()
        Method1(Size.Small)
        Method1(Size.Large)
        Method1(CType(1, Size))
        Method1(CType(3, Size))
    End Sub
End Module

So what happens if the value sent in for the enum does not match one of the permissible values? At compile time, no error or warning is reported. Users are allowed to send Method1() an invalid value of 3 for the enum.

What’s going on here? The answer lies in the translation to MSIL. The MSIL generated from the above code is shown in Figure 3-4.

Figure 3-3. Output from Example 3-5

Figure 3-4. MSIL for the Main method in Example 3-5

There is no difference between passing one of the correct values and one of the incorrect ones. Under the hood, there is no type safety or range checking in place. That’s why you get a runtime exception instead of a compile-time error when you access the array with the index provided. Too bad the compiler does not catch this or even give you a warning.

What can you do about this? Within methods that receive an enum, make sure the given value is valid. You have to do this before using an enum parameter to make your code robust. A modified Method1() that takes care of this checking is shown in Example 3-6, along with a Main() modified to catch the thrown exception. The output is shown in Figure 3-5.

Example 3-6. Example of type-safe usage of enum

C# (EnumSafety)

using System;

namespace EnumTypesafety
{
    class Program
    {
        private static int[] resource = new int[] {0, 1, 2};
        public enum Size
        {
            Small,
            Medium,
            Large
        }

        public static void Method1(Size theSize)
        {

            if(System.Enum.IsDefined(typeof(Size), theSize))
            {
                Console.WriteLine(theSize);
                Console.WriteLine("Resource: {0}",
                    resource[(int)theSize]);

            }
            else
            {
                throw new ApplicationException(
                        "Invalid input for Size");
            }
        }

        [STAThread]
        static void Main(string[] args)
        {

            try
            {
                Method1(Size.Small);
                Method1(Size.Large);
                Method1((Size)(1));
                Method1((Size)(3));

            }
            catch(ApplicationException ex)
            {
                Console.WriteLine(ex.Message);
                Console.WriteLine(ex.StackTrace);
            }
        }
    }
}

VB.NET (EnumSafety)

Module Program
    Private resource() As Integer = New Integer() {0, 1, 2}

    Public Enum Size
        Small
        Medium
        Large
    End Enum
    Public Sub Method1(ByVal theSize As Size)

        If System.Enum.IsDefined(GetType(Size), theSize) Then
            Console.WriteLine(theSize)
            Console.WriteLine("Resource: {0}", _
                resource(Convert.ToInt32(theSize)))

        Else
            Throw New ApplicationException( _
                "Invalid input for Size")
        End If
    End Sub
    Sub Main()

        Try
            Method1(Size.Small)
            Method1(Size.Large)
            Method1(CType(1, Size))
            Method1(CType(3, Size))

        Catch ex As ApplicationException
            Console.WriteLine(ex.Message)
            Console.WriteLine(ex.StackTrace)
        End Try
    End Sub
End Module

Figure 3-5. Output after the modifications in Example 3-6

This code verifies that the enum value you’ve been passed is valid by calling System.Enum.IsDefined(). If it is not, it throws an exception. This prevents you from accessing the resource array with an invalid index. While using System.Enum.IsDefined() to verify the enum value may appear logical, there are some inherent problems in using it. First, the call to IsDefined() is expensive, as it relies heavily on reflection and metadata. Second, IsDefined() checks if the value is one of the possible values, not necessarily the ones you expect. This will be a problem if a new value is added to the enum during versioning. Refer to http://blogs.msdn.com/brada/archive/2003/11/29/50903.aspx for more details on this.
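One way to sidestep both concerns is to list the values your method actually handles. The following is only a sketch: SizeChecks and IsKnownSize are hypothetical names, and the Size enum is repeated so the fragment stands on its own.

```csharp
using System;

public enum Size
{
    Small,
    Medium,
    Large
}

public static class SizeChecks
{
    // Hypothetical helper: accepts only the values this code was
    // written to handle, without consulting the enum's metadata.
    public static bool IsKnownSize(Size theSize)
    {
        switch (theSize)
        {
            case Size.Small:
            case Size.Medium:
            case Size.Large:
                return true;
            default:
                // Any other underlying integer, including values added
                // to the enum in a later version, is rejected here.
                return false;
        }
    }
}
```

Method1() could call SizeChecks.IsKnownSize(theSize) in place of System.Enum.IsDefined(). The trade-off is that the switch has to be updated by hand whenever you decide to support a newly added value.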

Another problem with enum types relates to serialization. The deserialization of an enum may break if you change a value in the enum after serializing an object.

IN A NUTSHELL

Do not assume that the value of an enum received as a method parameter is within range. Check to verify it. This will make your code more robust. Program defensively.

SEE ALSO

Gotcha #15, “rethrow isn’t consistent,” and Gotcha #16, “Default of Option Strict (off) isn’t good.”

GOTCHA #23 Copy Constructor hampers extensibility

You use classes to model concepts in an object-oriented system and create instances of your classes throughout an application. You may be interested in making a copy of an object at runtime. How do you make such a copy? In C++, you don’t have to do anything special; C++ gives you a default copy constructor, a constructor that takes an instance of the class as its parameter. But this is a mixed blessing (or is it a curse?). The default C++ copy constructor makes what is called a shallow copy; i.e., the contents of the source object are copied member by member into the other object. A deep copy copies not only the contents of an object, but also the contents of the objects that this object refers to. A deep copy does not copy just one object; it copies a tree of objects. Whether you need a shallow copy or a deep copy depends on the relationship between the object and its contents. For instance, consider Example 3-7.

Example 3-7. A class with different relationships with its contents

C# (CopyingObjects)

public class Person
{
    private int age;
    private Brain theBrain;
    private City cityOfResidence;
}

VB.NET (CopyingObjects)

Public Class Person
    Private age as Integer
    Private theBrain as Brain
    Private cityOfResidence as City
End Class

If you make a copy of a Person object, you most likely want the new person to have a separate Brain, but may want to refer to (share) the City of the other person. From the object modeling point of view, the person aggregates the Brain but associates with the City. Generally you want to deep-copy the aggregated object, but you may want to shallow-copy the associated object, or just set it to null/Nothing. At the code level, you use a reference to represent both aggregation and association. There is a semantic mismatch between the object model and how it is expressed in the language. There is no way for the compiler or the runtime to figure out whether an object is being associated or aggregated. You have to implement the logic to properly copy an object. Without it, any effort to do so is just a guess, and probably not correct.

This is the problem with the C++ approach. Unfortunately, C++ decided to err on the side of shallow copy. Instead of saying, “Hmm, I have no idea how to make a copy so I won’t even try,” C++ decided, “Hmm, I have no idea how to make a copy so I’ll make a shallow copy.”

.NET decided to err on the side of caution. It says “I can’t possibly make a copy of an object without the programmer clearly specifying the intent.” So .NET doesn’t provide a default copy constructor.

Thus if you want to make a copy of an object, you just write your own copy constructor, right? Let’s explore this further in Example 3-8.

Example 3-8. Writing a copy constructor

C# (CopyingObjects)

//Brain.cs
using System;

namespace Copy
{
    public class Brain
    {
        public Brain() {}


        public Brain(Brain another)
        {
            //Code to properly copy Brain can go here
        }

        public override string ToString()
        {
            return GetType().Name + ":" + GetHashCode();
        }

    }
}


//Person.cs

using System;

namespace Copy
{
    public class Person
    {
        private int theAge;
        private Brain theBrain;

        public Person(int age, Brain aBrain)
        {
            theAge = age;
            theBrain = aBrain;
        }

        public Person(Person another)
        {
            theAge = another.theAge;
            theBrain = new Brain(another.theBrain);
        }

        public override string ToString()
        {
            return "This is person with age " +
                       theAge + " and " +
                       theBrain;
        }

    }
}


//Test.cs
using System;

namespace Copy
{
    class Test
    {
        [STAThread]
        static void Main(string[] args)
        {
            Person sam = new Person(1, new Brain());
            Person bob = new Person(sam);
            // You rely on the copy constructor of Brain
            // to make a good deep copy
            Console.WriteLine(sam);
            Console.WriteLine(bob);
        }
    }
}

VB.NET (CopyingObjects)

'Brain.vb
Public Class Brain
    Public Sub New()

    End Sub

    Public Sub New(ByVal another As Brain)
        ' Code to properly copy Brain can go here
    End Sub

    Public Overrides Function ToString() As String
        Return Me.GetType().Name & ":" & GetHashCode()
    End Function

End Class


'Person.vb
Public Class Person
    Private theAge As Integer
    Private theBrain As Brain

    Public Sub New(ByVal age As Integer, ByVal aBrain As Brain)
        theAge = age
        theBrain = aBrain
    End Sub

    Public Sub New(ByVal another As Person)
        theAge = another.theAge
        theBrain = New Brain(another.theBrain)
        ' You rely on the copy constructor of Brain
        ' to make a good deep copy
    End Sub

    Public Overrides Function ToString() As String
        Return "This is person with age " & _
            theAge & " and " & _
            theBrain.ToString()
    End Function
End Class


'Test.vb
Module Test

    Sub Main()
        Dim sam As New Person(1, New Brain)
        Dim bob As Person = New Person(sam)

        Console.WriteLine(sam)
        Console.WriteLine(bob)
    End Sub

End Module

This example has a Person class with theAge and theBrain as its members. Person has a constructor and a copy constructor. The Main() method in Test copies Person sam to Person bob. The output is shown in Figure 3-6.

Figure 3-6. Output from Example 3-8

When it prints the first Person (sam), the age is 1 and the Brain’s hash code value is 1. When it prints the second Person (bob), which was copied from the instance sam, the age is 1 but the Brain’s hash code is 2.

Tip

Generally speaking, you should not use the hash code to determine identity. Even if the hash code values are the same, it does not mean the objects are identical. Here, however, since the hash code is different, you can infer that the objects are different. In reality you might use something like a GUID in each object to determine its uniqueness, or you could test the references to the Brain of the two objects to confirm that they are different. (The issues of dealing with the hash code and determining the identity of objects can get complicated. For good discussions on these topics refer to “Common Object Operations,” “Equals vs. ==,” and “Hashcode,” in the section “on the web” in the Appendix.)
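The reference test mentioned in this tip might be sketched as follows. SketchBrain is a hypothetical stand-in for the chapter’s Brain class, and Identity.SameObject() is an illustrative helper, not part of the sample code; the only framework call involved is Object.ReferenceEquals().

```csharp
using System;

// Hypothetical stand-in for Brain, so the sketch compiles on its own.
public class SketchBrain
{
}

public static class Identity
{
    // Returns true only when both references point at the very same
    // object; hash codes and overridden Equals() play no part.
    public static bool SameObject(object first, object second)
    {
        return Object.ReferenceEquals(first, second);
    }
}
```

Given two variables that refer to one SketchBrain, SameObject() returns true; given two separately constructed instances, it returns false, however their hash codes happen to compare.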

So, in Example 3-8 you created a copy of the Person with its own Brain. Have you solved the problem of properly copying the object? Not really, because the Person’s copy constructor depends on the Brain class. It specifically creates an instance of Brain. What if you have a class that derives from Brain, as shown in Example 3-9?

Example 3-9. Incorrect copying

C# (CopyingObjects)

//SmarterBrain.cs
using System;

namespace Copy
{
    public class SmarterBrain : Brain
    {
        public SmarterBrain()
        {
        }

        public SmarterBrain(SmarterBrain another) : base(another)
        {
        }
    }
}


//Test.cs
using System;

namespace Copy
{
    class Test
    {
        [STAThread]
        static void Main(string[] args)
        {
            Person sam = new Person(1, new SmarterBrain());
            Person bob = new Person(sam);

            Console.WriteLine(sam);
            Console.WriteLine(bob);
        }
    }
}

VB.NET (CopyingObjects)

'SmarterBrain.vb

Public Class SmarterBrain
    Inherits Brain


    Public Sub New()
    End Sub

    Public Sub New(ByVal another As SmarterBrain)
        MyBase.New(another)
    End Sub
End Class


'Test.vb
Module Test

    Sub Main()
        Dim sam As New Person(1, New SmarterBrain)
        Dim bob As Person = New Person(sam)

        Console.WriteLine(sam)
        Console.WriteLine(bob)
    End Sub

End Module

SmarterBrain inherits from Brain. In the Main() method of Test you create an instance of SmarterBrain and send it to the Person object. The output after this enhancement is shown in Figure 3-7.

Figure 3-7. Output after change in Example 3-9

While the first Person instance (sam) has an instance of SmarterBrain, the copied instance (bob) is left with just a plain vanilla Brain. What went wrong? The Person’s copy constructor asks for a new Brain to be created regardless of the actual type of the object referred to by theBrain. How about the fix in Example 3-10?

Example 3-10. A fix?

C# (CopyingObjects)

        public Person(Person another)
        {
            theAge = another.theAge;


            if(another.theBrain is SmarterBrain)
            {
                theBrain = new SmarterBrain(
                    (SmarterBrain) another.theBrain);
            }
            else
            {
                theBrain = new Brain(another.theBrain);
            }
        }

VB.NET (CopyingObjects)

    Public Sub New(ByVal another As Person)
        theAge = another.theAge


        If TypeOf another.theBrain Is SmarterBrain Then
            theBrain = New SmarterBrain( _
                CType(another.theBrain, SmarterBrain))
        Else
            theBrain = New Brain(another.theBrain)
        End If
    End Sub

Here you have modified the copy constructor of the Person class to use Runtime Type Identification (RTTI). It seems to fix the problem.

But what do you think about this solution? Not exactly elegant, is it? Actually, it’s awful. It requires Person, which aggregates Brain, to know about all the subclasses of Brain. (The upside of code like this is job security. You will be around forever fixing and tweaking it.)

As it stands, the Person class is not extensible for the addition of new types of Brains. It fails the Open-Closed Principle (OCP). Refer to [Martin03] for details on this and other object-oriented design principles.

How can you fix the code so it makes a proper copy of the object? The correct option is the prototype pattern, which is based on abstraction and polymorphism [Freeman04, Gamma95]. You depend on a prototypical instance to create a copy. This is discussed in the next gotcha, “Clone() has limitations.”

IN A NUTSHELL

Writing a public copy constructor leads to extensibility problems. You should not use a public copy constructor in C++, Java, and the .NET languages.

GOTCHA #24 Clone() has limitations

You saw the extensibility problems posed by the use of a public copy constructor in Gotcha #23, “Copy Constructor hampers extensibility.” How do you make a good copy of an object?

If all you need is a shallow copy, do you have to write all the code yourself? No, .NET provides a MemberwiseClone() method that performs a shallow copy. However, to make sure this is not inadvertently used like the default copy constructor in C++, it is protected, not public. If you need to do a simple shallow copy, you provide a method that you implement using MemberwiseClone(). However, there are a couple of problems with this:

  • You can’t invoke MemberwiseClone() from within a copy constructor. This is because MemberwiseClone() creates and returns an object, and there is no way to return this from a copy constructor.

  • MemberwiseClone() does not use a constructor to create an object and can’t deal with objects that have readonly fields, as discussed later in this gotcha.
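A shallow-copy method built on MemberwiseClone() might look like the following sketch. Basket and ShallowCopy() are hypothetical names chosen for illustration; they are not part of the CopyingObjects sample.

```csharp
using System.Collections;

// Hypothetical class used only to illustrate a shallow copy.
public class Basket
{
    public int Count;
    public ArrayList Items = new ArrayList();

    // MemberwiseClone() is protected, so the class exposes it
    // through a public method of its own.
    public Basket ShallowCopy()
    {
        // No constructor runs here; the fields are copied as they are,
        // so the copy refers to the same Items list as the original.
        return (Basket) MemberwiseClone();
    }
}
```

After a call to ShallowCopy(), changing Count on the copy leaves the original untouched, but adding to the copy’s Items also changes what the original sees, because both objects hold a reference to one and the same list.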

It is better to rely on polymorphism to create an object of the appropriate class. This is the intent of the System.ICloneable interface. You can implement ICloneable on the Brain class and call its Clone() method to copy the object, as shown in Example 3-11.

Example 3-11. Using ICloneable

C# (CopyingObjects)

//Brain.cs
using System;

namespace Copy
{
    public class Brain : ICloneable
    {
        //...

        #region ICloneable Members

        public object Clone()
        {
            return MemberwiseClone();
        }

        #endregion
    }
}


//Person.cs
//...

    public class Person
    {
        //...
        public Person(Person another)
        {
            theAge = another.theAge;

            theBrain = another.theBrain.Clone() as Brain;
        }
    }

VB.NET (CopyingObjects)

'Brain.vb
Public Class Brain
    Implements ICloneable


    '...

    Public Function Clone() As Object _
        Implements System.ICloneable.Clone
        Return MemberwiseClone()
    End Function
End Class


'Person.vb
Public Class Person

    '...
    Public Sub New(ByVal another As Person)
        theAge = another.theAge
        theBrain = CType(another.theBrain.Clone(), Brain)
    End Sub

End Class

In this version, you implement ICloneable on the Brain class, and in its Clone() method do a shallow copy using MemberwiseClone(). For now, a shallow copy is good enough. The output of the program is shown in Figure 3-8.

Figure 3-8. Output from Example 3-11

The Clone() method copies the object correctly. The Person class is extensible to adding new types of Brain classes as well. Looks good. Are you done?

Well, unfortunately, not yet! Let’s think about this some more. Say the Brain has an identifier. (Brains don’t usually, but just for the sake of this example, assume that the idea makes sense.) So, here is the Brain class with its identifier in Example 3-12.

Example 3-12. A class with an identifier

C# (CopyingObjects)

//Brain.cs
using System;

namespace Copy
{
    public class Brain : ICloneable
    {

        private int id;
        private static int idCount;

        public Brain()
        {

            id =
            System.Threading.Interlocked.Increment(ref idCount);
        }

        public Brain(Brain another)
        {
            //Code to properly copy Brain can go here
        }
        public override string ToString()
        {
            return GetType().Name + ":" + id;
        }

        #region ICloneable Members

        public object Clone()
        {
            return MemberwiseClone();
        }

        #endregion
    }
}

VB.NET (CopyingObjects)

'Brain.vb
Public Class Brain
    Implements ICloneable


    Private id As Integer
    Private Shared idCount As Integer

    Public Sub New()

        id = System.Threading.Interlocked.Increment(idCount)
    End Sub

    Public Sub New(ByVal another As Brain)
        ' Code to properly copy Brain can go here
    End Sub

    Public Overrides Function ToString() As String

        Return Me.GetType().Name & ":" & id
    End Function

    Public Function Clone() As Object _
        Implements System.ICloneable.Clone
        Return MemberwiseClone()
    End Function
End Class

The Brain class has an id and a static/Shared field idCount. Within the constructor you increment (in a thread-safe manner) the idCount and store the value in the id field. You use this id instead of the hash code in the ToString() method. When you execute the code you get the output as in Figure 3-9.

Figure 3-9. Output from Example 3-12

Both the objects of SmarterBrain end up with the same id. Why’s that? It’s because the MemberwiseClone() method does not call any constructor. It just creates a new object by making a copy of the original object’s memory. If you want to make id unique among the instances of Brain, you need to do it yourself. Let’s fix the Clone() method, as shown in Example 3-13, by creating a clone using the MemberwiseClone() method, then modifying its id before returning the clone. The output after this change is shown in Figure 3-10.

Example 3-13. Fixing the Clone() to maintain unique id

C# (CopyingObjects)

        public object Clone()
        {

            Brain theClone = MemberwiseClone() as Brain;
            theClone.id =
                System.Threading.Interlocked.Increment(ref idCount);

            return theClone;
        }

VB.NET (CopyingObjects)

    Public Function Clone() As Object _
        Implements System.ICloneable.Clone

        Dim theClone As Brain = CType(MemberwiseClone(), Brain)
        theClone.id = _
            System.Threading.Interlocked.Increment(idCount)
        Return theClone
    End Function

Figure 3-10. Output from Example 3-13

That looks better. But let’s go just a bit further with this. If id is a unique identifier for the Brain object, shouldn’t you make sure it doesn’t change? So how about making it readonly? Let’s do just that in Example 3-14.

Example 3-14. Problem with readonly and Clone()

C# (CopyingObjects)

//Brain.cs
// ...
    public class Brain : ICloneable
    {

        private readonly int id;
        private static int idCount;

        // ...

VB.NET (CopyingObjects)

'Brain.vb
Public Class Brain
    Implements ICloneable


    Private ReadOnly id As Integer
    Private Shared idCount As Integer

    '...

As a result of this change, the C# compiler gives the error:

   A readonly field cannot be assigned to (except in a constructor or a variable
   initializer).

In VB.NET, the error is:

   'ReadOnly' variable cannot be the target of an assignment.

A readonly field can be assigned a value at the point of declaration or within any of the constructors, but not in any other method. But isn’t the Clone() method a special method? Yes, but not special enough. So if you have a readonly field that needs to have unique values, the Clone() operation will not work.

Joshua Bloch discusses cloning very clearly in his book Effective Java [Bloch01]. He states, “... you are probably better off providing some alternative means of object copying or simply not providing the capability.” He goes on to say, “[a] fine approach to object copying is to provide a copy constructor.”

Unfortunately, as you saw in Gotcha #23, “Copy Constructor hampers extensibility,” the use of a copy constructor leads to extensibility issues. Here’s the dilemma: I say copy constructors are a problem and Bloch says you can’t use Clone(). So what’s the answer?

Providing a copy constructor is indeed a fine approach, as Bloch states—as long as it’s with a slight twist. The copy constructor has to be protected and not public, and it should be invoked within Brain.Clone() instead of within the copy constructor of Person. The modified code is shown in Example 3-15.

Example 3-15. A copy that finally works

C# (CopyingObjects)

//Brain.cs
using System;

namespace Copy
{
    public class Brain : ICloneable
    {
        private readonly int id;
        private static int idCount;

        public Brain()
        {
            id =
            System.Threading.Interlocked.Increment(ref idCount);
        }


        protected Brain(Brain another)
        {
            id = System.Threading.Interlocked.Increment(ref idCount);
        }

        public override string ToString()
        {
            return GetType().Name + ":" + id;
        }

        #region ICloneable Members

        public virtual object Clone()
        {
            return new Brain(this);
        }

        #endregion
    }
}


//SmarterBrain.cs
using System;

namespace Copy
{
    public class SmarterBrain : Brain
    {
        public SmarterBrain()
        {
        }

        protected SmarterBrain(SmarterBrain another)
            : base(another)
        {
        }

        public override object Clone()
        {
            return new SmarterBrain(this);
        }

    }
}

VB.NET (CopyingObjects)

'Brain.vb
Public Class Brain
    Implements ICloneable

    Private ReadOnly id As Integer
    Private Shared idCount As Integer
    Public Sub New()
        id = System.Threading.Interlocked.Increment(idCount)
    End Sub


    Protected Sub New(ByVal another As Brain)
        id = System.Threading.Interlocked.Increment(idCount)
    End Sub

    Public Overrides Function ToString() As String
        Return Me.GetType().Name & ":" & id
    End Function


    Public Overridable Function Clone() As Object _
        Implements System.ICloneable.Clone
        Return New Brain(Me)
    End Function
End Class


'SmarterBrain.vb

Public Class SmarterBrain
    Inherits Brain

    Public Sub New()

    End Sub


    Protected Sub New(ByVal another As SmarterBrain)
        MyBase.New(another)
    End Sub


    Public Overrides Function Clone() As Object
        Return New SmarterBrain(Me)
    End Function
End Class

Now you have made the copy constructors of Brain and SmarterBrain protected. Also, you have made the Brain.Clone() method virtual/overridable. In it, you return a copy of the Brain created using the copy constructor. In the overridden Clone() method of SmarterBrain, you use the copy constructor of SmarterBrain to create a copy. When the Person class invokes theBrain.Clone(), polymorphism assures that the appropriate Clone() method in Brain or SmarterBrain is called, based on the real type of the object at runtime. This makes the Person class extensible as well. The output after the above modifications is shown in Figure 3-11.

Figure 3-11. Output from Example 3-15

A similar change to the Person class results in the code shown in Example 3-16.

Example 3-16. Proper copying of Person class

C# (CopyingObjects)

//Person.cs

using System;

namespace Copy
{
    public class Person : ICloneable
    {
        private int theAge;
        private Brain theBrain;

        public Person(int age, Brain aBrain)
        {
            theAge = age;
            theBrain = aBrain;
        }


        protected Person(Person another)
        {
            theAge = another.theAge;

            theBrain = another.theBrain.Clone() as Brain;
        }

        public override string ToString()
        {
            return "This is person with age " +
                       theAge + " and " +
                       theBrain;
        }

        #region ICloneable Members

        public virtual object Clone()
        {
            return new Person(this);
        }

        #endregion
    }
}


//Test.cs
using System;

namespace Copy
{
    class Test
    {
        [STAThread]
        static void Main(string[] args)
        {
            Person sam = new Person(1, new SmarterBrain());

            //Person bob = new Person(sam);
            Person bob = sam.Clone() as Person;

            Console.WriteLine(sam);
            Console.WriteLine(bob);
        }
    }
}

VB.NET (CopyingObjects)

'Person.vb
Public Class Person
    Implements ICloneable

    Private theAge As Integer
    Private theBrain As Brain

    Public Sub New(ByVal age As Integer, ByVal aBrain As Brain)
        theAge = age
        theBrain = aBrain
    End Sub


    Protected Sub New(ByVal another As Person)
        theAge = another.theAge

        theBrain = CType(another.theBrain.Clone(), Brain)
    End Sub

    Public Overrides Function ToString() As String
        Return "This is person with age " & _
            theAge & " and " & _
            theBrain.ToString()
    End Function


    Public Overridable Function Clone() As Object _
        Implements System.ICloneable.Clone
        Return New Person(Me)
    End Function
End Class



'Test.vb
Module Test

    Sub Main()
        Dim sam As New Person(1, New SmarterBrain)

        'Dim bob As Person = New Person(sam)
        Dim bob As Person = CType(sam.Clone(), Person)

        Console.WriteLine(sam)
        Console.WriteLine(bob)
    End Sub

End Module

IN A NUTSHELL

Avoid public copy constructors and do not rely on MemberwiseClone(). Public copy constructors lead to extensibility problems, and MemberwiseClone() can cause problems if your class has readonly fields. A better approach is to write a Clone() method that invokes your class’s protected copy constructor.

GOTCHA #25 Access to static/Shared members isn’t enforced consistently

A static/Shared method belongs to a class and not to any specific instance. Furthermore, it is never polymorphic. C# has taken the high road of disallowing the call to static members using an object reference. Unfortunately, VB.NET does not impose the same restriction. The downside to calling Shared members on a reference is that it may lead to confusion at times. Consider Example 3-17.

Example 3-17. Invoking Shared member using a reference

C# (Shared)

C# does not allow the call to static members using an object reference. So this is not an issue for C# programmers. It only concerns VB.NET, C++, and Java programmers.

VB.NET (Shared)

'Base.vb
Public Class Base
    Public Overridable Sub Method1()
        Console.WriteLine("Base.Method1")
    End Sub

    Public Shared Sub Method2()
        Console.WriteLine("Base Method2")
    End Sub
End Class

'Derived.vb
Public Class Derived
    Inherits Base

    Public Overrides Sub Method1()
        Console.WriteLine("Derived.Method1")
    End Sub

    Public Shared Sub Method2()
        Console.WriteLine("Derived Method2")
    End Sub
End Class

'Test.vb
Public Class Test
    Public Shared Sub Run(ByVal b As Base)
        b.Method1()
        b.Method2()
    End Sub

    Public Shared Sub Main()
        Dim object1 As New Derived

        Console.WriteLine("--------- Using object of Derived")
        object1.Method1()
        object1.Method2()
        Run(object1)
    End Sub
End Class

In the VB.NET version of Example 3-17, there is a class named Base and a class named Derived that inherits from Base. Base has an overridable method named Method1() and a Shared method named Method2(). The Derived class overrides Method1() and also provides a Shared method named Method2().

In the Test code, you create a Derived object, then call Method1() and Method2() on it using the object reference object1. Next you pass the reference object1 to a method Run(), which treats it as a Base type. Within the Run() method you are still dealing with an object of Derived. When the Run() method invokes the two methods using the Base type reference, the method invoked for the call to Method1() is on Derived due to polymorphism. However, the method invoked for the call to Method2() is on Base, not on Derived, even though the object being pointed to by the reference is of type Derived. When the program executes, you get the output shown in Figure 3-12.

Figure 3-12. Output from Example 3-17

Note that when the code is compiled a warning is generated for Method2() of Derived. It recommends that you mark Method2() with the Shadows keyword. Marking it Shadows will not change the output of the program, however. The call to Method1() using the Base reference is polymorphic and goes to the Derived class’s Method1(). However, the call to Method2() is not polymorphic. Within the Run() method, it is statically bound to the method of Base at compile time. As a result, Method2() of Base is invoked rather than Method2() of Derived. In fact, when a Shared method is accessed using an object reference, the compiler replaces the object reference with the class name at the time of compilation. In this example, b.Method2() is replaced by Base.Method2(). While the call b.Method1() results in a polymorphic callvirt instruction in MSIL, the call to b.Method2() simply becomes a static call. Figure 3-13 shows the MSIL generated from the above code.

Figure 3-13. MSIL for Run() method in Example 3-17

Given that static/Shared methods are not polymorphic, it is easy to get confused if they are invoked using an object reference, especially if the static/Shared methods are part of the derived class as well. As a good coding practice, you should refrain from calling Shared members using an object reference in VB.NET (and C++ and Java as well). Instead, use the class to access them. Instead of calling b.Method2(), write Base.Method2().
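For comparison, here is a minimal C# sketch of the same hierarchy. The class names mirror Example 3-17, but the SharedDemo class and the string-returning method signatures are assumptions made for illustration only. The point it shows is that C# forces you to name the class when you call a static method, so the binding is visible in the source.

```csharp
using System;

public class Base
{
    public virtual string Method1() { return "Base.Method1"; }
    public static string Method2() { return "Base.Method2"; }
}

public class Derived : Base
{
    public override string Method1() { return "Derived.Method1"; }

    // 'new' documents that this static method hides Base.Method2().
    public new static string Method2() { return "Derived.Method2"; }
}

public static class SharedDemo
{
    public static string Run(Base b)
    {
        // b.Method2() would not compile in C# (error CS0176);
        // naming the class makes the static binding explicit.
        return b.Method1() + ", " + Base.Method2();
    }

    public static void Main()
    {
        Console.WriteLine(Run(new Derived()));
        // prints "Derived.Method1, Base.Method2"
    }
}
```

Method1() is dispatched on the runtime type, while Method2() is whatever class you name, regardless of the object passed in.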

How does this differ in .NET 2.0 Beta 1? The VB.NET compiler issues a warning (not an error) if you access a Shared member using an object reference. The warning generated is:

    warning BC42025: Access of shared member through an instance; qualifying expression
    will not be evaluated.

If you configure Visual Studio to treat warnings as errors (see Gotcha #12, “Compiler warnings may not be benign”), you will avoid this gotcha.

IN A NUTSHELL

Refrain from accessing Shared members of a class through an object reference. Use the class to access them.

GOTCHA #26 Details of exception may be hidden

When you receive an exception, you try to figure out its cause. At times, though, the exception you get does not give you enough details, so you are left wondering what really went wrong. At these times you should look deeper into the exception object to see if more information is present in the InnerException.

Consider the XmlSerializer class, which makes the tasks of parsing and creating an XML document almost trivial in .NET. However, when it fails, it fails with style. I have found it painful to diagnose the problems. Then somehow I discovered that the actual error message is hidden in the InnerException property. Look at Example 3-18.

Example 3-18. Failure of XMLSerializer

C# (XMLSerializer)

//SomeType.cs
using System;

namespace XmlSerializerException
{
    public class SomeType
    {
        private int val;

        public int TheValue
        {
            get { return val; }
            set { val = value; }
        }
    }
}

//Program.cs
using System;
using System.Collections;
using System.Xml.Serialization;
using System.IO;

namespace XmlSerializerException
{
    class Program
    {
        [STAThread]
        static void Main(string[] args)
        {
            ArrayList myList = new ArrayList();
            myList.Add(new SomeType());

            try
            {
                using(FileStream fileStrm
                          = new FileStream("output.xml",
                          FileMode.Create))
                {
                    XmlSerializer theSerializer
                        = new XmlSerializer(
                            typeof(ArrayList));
                    theSerializer.Serialize(fileStrm, myList);
                }
            }
            catch(InvalidOperationException ex)
            {
                Console.WriteLine(
                    "OOps: The Problem is \"{0}\"",
                    ex.Message);
            }
            catch(Exception catchAllEx)
            {
                Console.WriteLine(
                    "OOps: The Problem is \"{0}\"",
                    catchAllEx.Message);
                throw;
            }
        }
    }
}

VB.NET (XMLSerializer)

'SomeType.vb

Public Class SomeType
    Private val As Integer

    Public Property TheValue() As Integer
        Get
            Return val
        End Get
        Set(ByVal Value As Integer)
            val = Value
        End Set
    End Property
End Class

'Program.vb
Imports System.IO
Imports System.Xml.Serialization

Module Program

    Sub Main()
        Dim myList As New ArrayList
        myList.Add(New SomeType)

        Try
            Dim fileStrm As New FileStream("output.xml", _
                FileMode.Create)

            Dim theSerializer As New XmlSerializer(GetType(ArrayList))

            theSerializer.Serialize(fileStrm, myList)
        Catch ex As InvalidOperationException
            Console.WriteLine( _
             "OOps: The Problem is ""{0}""", _
             ex.Message)
        Catch catchAllEx As Exception
            Console.WriteLine( _
             "OOps: The Problem is ""{0}""", _
             catchAllEx.Message)
            Throw
        End Try
    End Sub

End Module

In this example you create an ArrayList and populate it with one SomeType object. Then you create an XmlSerializer and ask it to serialize the ArrayList to the file output.xml. This looks pretty straightforward. But when you execute the code, you get the exception shown in Figure 3-14.

Figure 3-14. Exception from Example 3-18

Not a very helpful message, is it? You could sit there scratching your head until all your hair falls out. If the code is a bit more complicated, it can be even more frustrating to find the real problem. But if you modify the catch statement to print the details from the InnerException property, you get more meaningful information. The modified catch block is shown in Example 3-19.

Example 3-19. Looking for InnerException

C# (XMLSerializer)

            catch(InvalidOperationException ex)
            {
                Console.WriteLine(
                    "OOps: The Problem is \"{0}\"",
                    ex.Message);
                if (ex.InnerException != null)
                {
                    Console.WriteLine(
                        "The real problem is {0}",
                        ex.InnerException);
                }
            }

VB.NET (XMLSerializer)

        Catch ex As InvalidOperationException
            Console.WriteLine( _
             "OOps: The Problem is ""{0}""", _
             ex.Message)

            If Not ex.InnerException Is Nothing Then
                Console.WriteLine( _
                    "The real problem is {0}", ex.InnerException)
            End If

In addition to displaying the information from the exception, it also displays details of its InnerException. The output after the code change appears in Figure 3-15.

Figure 3-15. Detailed error reported from change in Example 3-19

The inner exception clearly tells you what the problem is, and now it seems obvious: the XmlSerializer has no idea what types of objects will be held in the ArrayList. Now this is disappointing: if you specify all the types that may be in the ArrayList, then the code is non-extensible; it will violate the Open-Closed Principle. (See the sidebar “The Open-Closed Principle (OCP)” in Gotcha #23, “Copy Constructor hampers extensibility,” for more details on this principle.) In this case, you have to use the XmlInclude attribute to indicate what types the ArrayList can hold. (You will have to change the attribute declaration if you decide to add an object of a new type to it. But that is another problem.) The point here is that you can get the information about what went wrong by examining the InnerException of the received exception.
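As a rough sketch of what telling the serializer about element types can look like, the following uses a different mechanism from the XmlInclude attribute named above: the XmlSerializer constructor overload that accepts an array of extra types. The KnownTypesDemo class and the SerializeList() helper are hypothetical names, and writing to a StringWriter instead of a file is a simplification. The extensibility concern is unchanged, since the type list still has to be maintained by hand.

```csharp
using System;
using System.Collections;
using System.IO;
using System.Xml.Serialization;

public class SomeType
{
    private int val;

    public int TheValue
    {
        get { return val; }
        set { val = value; }
    }
}

public static class KnownTypesDemo
{
    // Hypothetical helper: serializes the list to a string instead of a file.
    public static string SerializeList(ArrayList list)
    {
        // The second argument lists the element types the serializer may encounter.
        XmlSerializer serializer =
            new XmlSerializer(typeof(ArrayList), new Type[] { typeof(SomeType) });

        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, list);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        ArrayList myList = new ArrayList();
        myList.Add(new SomeType());
        Console.WriteLine(SerializeList(myList));
    }
}
```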

An easy way to examine the InnerException of an exception is to use the Exception.ToString() method to display its information, instead of using its Message property. (See Gotcha #15, “rethrow isn’t consistent.”)

IN A NUTSHELL

Look at the InnerException for a fuller understanding of the problem when you receive an exception. In general, examine the InnerException if there is one. If you will be logging an exception, remember to log not only the exception details, but the InnerException information as well. And remember that the InnerException is itself an Exception; it might contain its own InnerException.
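Because an InnerException may carry its own InnerException, logging code can walk the whole chain. The following is a minimal sketch of one way to do that; ExceptionReport and Describe() are hypothetical names, and the two exceptions created in Main() are stand-ins rather than the ones XmlSerializer actually throws.

```csharp
using System;
using System.Text;

public static class ExceptionReport
{
    // Hypothetical helper: one line per exception, indented by nesting depth.
    public static string Describe(Exception ex)
    {
        StringBuilder report = new StringBuilder();
        int depth = 0;

        for (Exception current = ex; current != null; current = current.InnerException)
        {
            report.Append(' ', depth * 2);
            report.Append(current.GetType().Name);
            report.Append(": ");
            report.Append(current.Message);
            report.Append('\n');
            depth++;
        }

        return report.ToString();
    }

    public static void Main()
    {
        // Stand-in exceptions for illustration only.
        Exception inner = new ArgumentException("type was not expected");
        Exception outer = new InvalidOperationException("error generating XML", inner);

        Console.Write(Describe(outer));
    }
}
```

Exception.ToString() already includes inner exceptions along with stack traces; a helper like this is only useful when you want a compact, message-only summary.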

SEE ALSO

Gotcha #15, “rethrow isn’t consistent” and Gotcha #23, “Copy Constructor hampers extensibility.”

GOTCHA #27 Object initialization sequence isn’t consistent

When you create an object, the memory for the instance is allocated, each of its fields is initialized with the default value defined by the CTS, and then the constructor is invoked. If you create an object of a derived class, then all fields of the base are initialized and the constructor of the base is invoked before any field of the derived class is initialized. This is conventional wisdom derived from languages such as C++ and Java. But it is not the sequence that is followed in C#. In fact, the sequence of initialization is not the same between C# and VB.NET. Take a look at Example 3-20.

Warning

The object initialization sequence in C# is well-documented in section 10.10.3 of the C# Language Specification (see “on the web” in the Appendix). My worry is not the fact that the object initialization sequence differs in C#, as compared to C++ or Java. It is that the sequence is not consistent between .NET languages, for instance, between C# and VB.NET.

Example 3-20. Object initialization sequence

C# (Initialization)

//SomeClass1.cs
using System;

namespace ObjectInitSequence
{
    public class SomeClass1
    {
        public SomeClass1()
        {
            Console.WriteLine("Constructor of SomeClass1 called");
        }
    }
}

// SomeClass2.cs
using System;

namespace ObjectInitSequence
{
    public class SomeClass2
    {
        public SomeClass2()
        {
            Console.WriteLine("Constructor of SomeClass2 called");
        }
    }
}

//Base.cs
using System;

namespace ObjectInitSequence
{

    public class Base
    {
        private SomeClass1 obj1 = new SomeClass1();

        public Base()
        {
            Console.WriteLine("Constructor of Base called");
        }
    }
}

//Derived.cs
using System;

namespace ObjectInitSequence
{

    public class Derived : Base
    {
        private SomeClass2 obj2 = new SomeClass2();

        public Derived()
        {
            Console.WriteLine("Constructor of Derived called");
        }
    }
}

//Test.cs
using System;

namespace ObjectInitSequence
{
    class Test
    {
        [STAThread]
        static void Main(string[] args)
        {
            Derived obj = new Derived();
        }
    }
}

VB.NET (Initialization)

'SomeClass1.vb
Public Class SomeClass1
    Public Sub New()
        Console.WriteLine("Constructor of SomeClass1 called")
    End Sub
End Class

'SomeClass2.vb
Public Class SomeClass2
    Public Sub New()
        Console.WriteLine("Constructor of SomeClass2 called")
    End Sub
End Class

'Base.vb

Public Class Base
    Private obj1 As SomeClass1 = New SomeClass1

    Public Sub New()
        Console.WriteLine("Constructor of Base called")
    End Sub
End Class

'Derived.vb

Public Class Derived
    Inherits Base
    Private obj2 As SomeClass2 = New SomeClass2

    Public Sub New()
        Console.WriteLine("Constructor of Derived called")
    End Sub
End Class

'Test.vb
Module Test
    Sub Main()

        Dim obj As Derived = New Derived
    End Sub

End Module

In the above code, the class Base has a field of type SomeClass1. The class Derived, which inherits from Base, has a field of type SomeClass2. Each of these classes has a constructor that prints a message announcing itself. What is the sequence of field initialization and constructor calls when an object of Derived is created? Before you answer, you may want to ask, which language?! The C# code given above produces the output shown in Figure 3-16.

Figure 3-16. Output from C# version of Example 3-20

However, the VB.NET version of the code produces different results, shown in Figure 3-17.

Figure 3-17. Output from VB.NET version of Example 3-20

While the two programs are identical except for the language used to write them, the behavior is different. In C#, the Derived class’s fields are initialized, then those of the Base class. Next, the constructors are called top-down, the Base constructor first and then the Derived constructor. In the case of the VB.NET program, however, the sequence is different (and conformant with the sequence in C++ and Java). The initialization of fields in Base and the invocation of the Base class constructor complete before any field of the Derived class is initialized.
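The C# half of this sequence can be checked without reading console output by recording each step in a list. This is only a sketch: InitLog, Record(), and InitOrderDemo are hypothetical names, and the int fields exist solely so that their initializers run. It uses a generic List, so it assumes a version of C# newer than the one in the listings above.

```csharp
using System;
using System.Collections.Generic;

public static class InitLog
{
    public static readonly List<string> Steps = new List<string>();

    // Records a step and returns a value so it can be used in a field initializer.
    public static int Record(string step)
    {
        Steps.Add(step);
        return Steps.Count;
    }
}

public class Base
{
    private int baseField = InitLog.Record("Base field");

    public Base() { InitLog.Record("Base constructor"); }
}

public class Derived : Base
{
    private int derivedField = InitLog.Record("Derived field");

    public Derived() { InitLog.Record("Derived constructor"); }
}

public static class InitOrderDemo
{
    public static void Main()
    {
        new Derived();
        Console.WriteLine(string.Join(" -> ", InitLog.Steps));
        // In C#: Derived field -> Base field -> Base constructor -> Derived constructor
    }
}
```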

While you are wondering about this, here are some even more interesting questions. What is the sequence if I derive a C# class from a VB.NET class? What happens if I derive a VB.NET class from a C# class which in turn is derived from another VB.NET class? If I derive a C# class from a VB.NET class, then the derived members will be initialized before the base members. However, if I derive a VB.NET class from a C# class, then the base members will be initialized before any derived members. In case you have more than two levels of inheritance and you mix languages between levels, the sequence depends on the language of the derived class at each level (good luck).

IN A NUTSHELL

Clearly understand the sequence in which objects are initialized in C# versus VB.NET. Understanding the sequence will help avoid surprises from this rather odd inconsistency.

SEE ALSO

Gotcha #23, “Copy Constructor hampers extensibility,” Gotcha #24, “Clone() has limitations,” and Gotcha #28, “Polymorphism kicks in prematurely.”

GOTCHA #28 Polymorphism kicks in prematurely

Polymorphism is the most cherished feature in object-oriented programming. An important tenet in object modeling is that objects be kept in a valid state at all times. Ideally, you should never be able to invoke methods on an object until it has been fully initialized. Unfortunately in .NET, it isn’t difficult to violate this with the use of polymorphism. Unlike C++, in .NET polymorphism kicks in even before the execution of the constructor has completed. This behavior is similar to Java.

Let’s review polymorphism for a moment. It assures that the virtual/overridable method that is called is based on the real type of the object, and not just the type of the reference used to invoke it. For instance, say foo() is a virtual/overridable method on a base class, and a class that derives from the base overrides that method. Assume also that you have two references baseReference and derivedReference of the base type and derived type. Let both of these references actually refer to the same instance of the derived class. Now, regardless of how the method is called, either as baseReference.foo() or derivedReference.foo(), the same method foo() in the derived class is invoked. This is due to the effect of polymorphism or dynamic binding.

While this sounds great, the problem is that polymorphism enters into the picture before the derived class’s constructor is even called. Consider Example 3-21.

Example 3-21. Polymorphism during construction

C# (PolymorphismTooSoon)

//Room.cs
using System;

namespace ProblemPolymorphismConstruction
{
    public class Room
    {
        public void OpenWindow()
        {
            Console.WriteLine("Room window open");
        }
        public void CloseWindow()
        {
            Console.WriteLine("Room window closed");
        }
    }
}

//ExecutiveRoom.cs
using System;

namespace ProblemPolymorphismConstruction
{
    public class ExecutiveRoom : Room
    {
    }
}

//Employee.cs
using System;

namespace ProblemPolymorphismConstruction
{
    public class Employee
    {

        public Employee()
        {
            Console.WriteLine("Employee's constructor called");


            Work();
        }

        public virtual void Work()
        {
            Console.WriteLine("Employee is working");
        }
    }
}

//Manager.cs
using System;

namespace ProblemPolymorphismConstruction
{

    public class Manager : Employee
    {
        private Room theRoom = null;
        private int managementLevel = 0;

        public Manager(int level)
        {
            Console.WriteLine("Manager's constructor called");

            managementLevel = level;

            if (level < 2)
                theRoom = new Room();
            else
                theRoom = new ExecutiveRoom();
        }


        public override void Work()
        {
            Console.WriteLine("Manager's work called");


            theRoom.OpenWindow();
            base.Work();
        }
    }
}

//User.cs
using System;

namespace ProblemPolymorphismConstruction
{
    class User
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Creating Manager");

            Manager mgr = new Manager(1);

            Console.WriteLine("Done");
        }
    }
}

VB.NET (PolymorphismTooSoon)

'Room.vb

Public Class Room
    Public Sub OpenWindow()
        Console.WriteLine("Room window open")
    End Sub

    Public Sub CloseWindow()
        Console.WriteLine("Room window closed")
    End Sub
End Class

'ExecutiveRoom.vb
Public Class ExecutiveRoom
    Inherits Room
End Class

'Employee.vb

Public Class Employee

    Public Sub New()
        Console.WriteLine("Employee's constructor called")


        Work()
    End Sub

    Public Overridable Sub Work()
        Console.WriteLine("Employee is working")
    End Sub
End Class

'Manager.vb

Public Class Manager
    Inherits Employee
    Private theRoom As Room = Nothing
    Private managementLevel As Integer = 0

    Public Sub New(ByVal level As Integer)
        Console.WriteLine("Manager's constructor called")

        managementLevel = level

        If level < 2 Then
            theRoom = New Room
        Else
            theRoom = New ExecutiveRoom
        End If
    End Sub

    Public Overrides Sub Work()
        Console.WriteLine("Manager's work called")

        theRoom.OpenWindow()
        MyBase.Work()
    End Sub
End Class

'User.vb
Module User

    Sub Main()
        Console.WriteLine("Creating Manager")
        Dim mgr As New Manager(1)

        Console.WriteLine("Done")
    End Sub
End Module

In the example given above, you have a Room class with OpenWindow() and CloseWindow() methods. The ExecutiveRoom derives from Room, but does not have any additional functionality as yet. The Employee has a constructor that invokes its Work() method. The Work() method, however, is declared virtual/overridable in the Employee class.

In the Manager class, which inherits from Employee, you have a reference of type Room. Depending on the Manager’s level, in the constructor of the Manager, you assign the theRoom reference to either an instance of Room or an instance of ExecutiveRoom. In the overridden Work() method in the Manager class, you invoke the method on theRoom to open the window and then invoke the base class’s Work() method. Looks reasonable so far, doesn’t it? But when you execute this program you get a NullReferenceException as shown in Figure 3-18.

Figure 3-18. Exception from Example 3-21

Notice that in the creation of the Manager object, the Employee’s constructor is called first. From the Employee’s constructor, the call to Work() polymorphically calls Manager.Work().

Why? In the Employee constructor, even though the self reference this/Me is of type Employee, the real instance is of type Manager. But at this point, the constructor of Manager has not been invoked. As a result, the reference theRoom is still null/Nothing. The Work() method, however, assumes that the object has been constructed and tries to access theRoom. Hence the NullReferenceException.

Ideally, no method should ever be called on an object until its constructor has completed. However, the above example shows that there are situations where this can happen.

As a side note, if you initialize theRoom at the point of declaration to a Room instance, you half-fix the problem. The C# code will run fine, but the VB.NET code will still throw the exception. The reason for this? The difference in the sequence of initialization between the two languages, as discussed in Gotcha #27, "Object initialization sequence isn’t consistent.”
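The C# half of that side note can be seen in a minimal sketch. This is a trimmed-down variant of Example 3-21 written for illustration, not the original listing: the class bodies are reduced to the essentials and the message printed by OpenWindow() is an assumption. It relies on the C# rule that a derived class's field initializers run before the base class constructor body.

```csharp
using System;

public class Room
{
    // Message text is assumed for this sketch.
    public void OpenWindow() { Console.WriteLine("Room window opened"); }
}

public class Employee
{
    public Employee()
    {
        // Virtual call from a constructor: dispatches to the
        // most derived override, here Manager.Work().
        Work();
    }

    public virtual void Work() { Console.WriteLine("Employee is working"); }
}

public class Manager : Employee
{
    // In C#, this initializer runs before Employee's constructor body,
    // so theRoom is already set when Work() is called polymorphically.
    private Room theRoom = new Room();

    public override void Work()
    {
        theRoom.OpenWindow();
        base.Work();
    }
}

public static class Program
{
    public static void Main()
    {
        new Manager();
    }
}
```

Run as written, this prints "Room window opened" followed by "Employee is working" and does not throw. The equivalent VB.NET code would still fail, for the reason given in Gotcha #27.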

IN A NUTSHELL

Understand the consequence of calling virtual/overridable methods from within a constructor. If you need to further initialize your object, provide an Init() method that users of your object can call after the constructor completes. This even has a name: two-phase construction.
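A minimal sketch of two-phase construction, assuming a reduced version of the Employee and Manager classes from Example 3-21. The method name Init() follows the suggestion above; the class bodies and the OpenWindow() message are assumptions made for this illustration.

```csharp
using System;

public class Room
{
    // Message text is assumed for this sketch.
    public void OpenWindow() { Console.WriteLine("Room window opened"); }
}

public class Employee
{
    public Employee()
    {
        // Phase one: construction only. No virtual calls here.
        Console.WriteLine("Employee's constructor called");
    }

    // Phase two: the user of the object calls Init() after
    // the constructor has completed.
    public void Init()
    {
        Work();
    }

    public virtual void Work() { Console.WriteLine("Employee is working"); }
}

public class Manager : Employee
{
    private Room theRoom;

    public Manager()
    {
        Console.WriteLine("Manager's constructor called");
        theRoom = new Room();
    }

    public override void Work()
    {
        Console.WriteLine("Manager's work called");
        theRoom.OpenWindow();   // safe: theRoom was set in the constructor
        base.Work();
    }
}

public static class Program
{
    public static void Main()
    {
        Manager mgr = new Manager();
        mgr.Init();   // both constructors have finished by now
    }
}
```

The cost of this design is that callers must remember to call Init(); forgetting it leaves the object only partly set up, so the requirement is worth documenting on the class.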

GOTCHA #29 Unit testing private methods

NUnit is an excellent tool that allows you to write unit-testing code, thereby improving the robustness of your application. (Refer to Gotcha #8, "Division operation isn’t consistent between types" for a brief introduction to NUnit.) It serves as a critical aid in refactoring—helping you identify the code changes you need to make as your design evolves. Developers who have started using the tool find it hard to imagine writing any code and refactoring it without the test harness and the support provided by NUnit.

Where should your test cases go? The pundits recommend that you place the test classes (called the “test fixture”) in the same assembly as the code you’re testing. This allows you to test not only the public members of a class, but the internal members as well. While this sounds great, there is one problem. Doing so might force you to make some of the class’s methods internal instead of private in order to test them. Also, just to make your class testable, sometimes you might find yourself writing methods that aren’t otherwise needed. I’ve even heard the suggestion to use compiler flags to make a method internal for testing and private for release. Such options make the code less readable and result in some very unpleasant code-maintenance nightmares. Relaxing the access control for the sake of testing is also not desirable, especially when there is an alternative. Consider Example 3-22 to test a simple User class.

Example 3-22. NUnit test for a simple User

C# (TestPrivate)

//Test.cs
using System;
using NUnit.Framework;
using System.Security.Cryptography;

namespace UnitTest
{
    [TestFixture]
    public class Test
    {
        private User theUser;

        [SetUp]
        public void CreateUser()
        {
            theUser = new User();
        }

        [Test]
        public void TestSetPassword()
        {
            string PASSWORD = "Cod!ng";


            theUser.ChangePassword(null, PASSWORD);
            // How do you assert that the password has been set?
            // You can rely on calling the GetPassword method to do this.
            // However, do you really want to provide a
            // method to get the password?

            // OK, let's write one for now.

            byte[] hashCode = new SHA256Managed().ComputeHash(
                System.Text.Encoding.ASCII.GetBytes(PASSWORD));

            string hashCodeString = BitConverter.ToString(hashCode);


            Assert.AreEqual(hashCodeString, theUser.GetPassword());
        }
    }
}

//User.cs
using System;
using System.Security.Cryptography;

namespace UnitTest
{
    public class User
    {
        private string password;

        public void ChangePassword(
                string oldPassword, string thePassword)
        {
            // Make sure that the caller is either creating
            // a new password, or knows the old password

            if ((password == null && oldPassword == null)
                || CreateHash(oldPassword) == password)
            {
                password = CreateHash(thePassword);
            }
            else
            {
                throw new ApplicationException("Invalid password");
            }
        }

        internal string GetPassword()
        {
            return password;
        }

        private string CreateHash(string input)
        {
            byte[] hashCode = new SHA256Managed().ComputeHash(
                System.Text.Encoding.ASCII.GetBytes(input));

            return BitConverter.ToString(hashCode);
        }
    }
}

VB.NET (TestPrivate)

'Test.vb
Imports NUnit.Framework
Imports System.Security.Cryptography

<TestFixture()> _
Public Class Test

    Private theUser As User

    <SetUp()> _
    Public Sub CreateUser()
        theUser = New User
    End Sub

    <Test()> _
    Public Sub TestSetPassword()
        Dim PASSWORD As String = "Cod!ng"


        theUser.ChangePassword(Nothing, PASSWORD)
        'How do you assert that the password has been set?
        'You can rely on calling the GetPassword method to do this.
        'However, do you really want to provide a
        'method to get the password?

        'OK, let's write one for now.

        Dim hashCode() As Byte = New SHA256Managed().ComputeHash( _
                     System.Text.Encoding.ASCII.GetBytes(PASSWORD))

        Dim hashCodeString As String = BitConverter.ToString(hashCode)


        Assert.AreEqual(hashCodeString, theUser.GetPassword())
    End Sub
End Class


'User.vb
Imports System
Imports System.Security.Cryptography

Public Class User
    Private password As String

    Public Sub ChangePassword(ByVal oldPassword As String, _
            ByVal thePassword As String)
        'Make sure that the caller is either creating a new password,
        'or knows the old password
        If (password Is Nothing AndAlso oldPassword Is Nothing) OrElse _
                CreateHash(oldPassword) = password Then
            password = CreateHash(thePassword)
        Else
            Throw New ApplicationException("Invalid password")
        End If
    End Sub

    Friend Function GetPassword() As String
        Return password
    End Function

    Private Function CreateHash(ByVal input As String) As String
        Dim hashCode() As Byte = New SHA256Managed().ComputeHash( _
                     System.Text.Encoding.ASCII.GetBytes(input))

        Return BitConverter.ToString(hashCode)

    End Function
End Class

In this example, you have a User class that needs to be tested. You are writing a test case for the ChangePassword() method. After the call to ChangePassword(), you want to check whether the password has been set correctly. How do you do that? From within the Test class, you can access the public and internal/friend members of the User class (since Test is in the same assembly as User). The only option I can think of here is to write an internal/friend method named GetPassword() in the User class to make ChangePassword() testable. This might not be desirable: you might not want to expose the password. Furthermore, you might not need GetPassword() in the application, since you are writing it just to test ChangePassword().

Why not make the test case a nested class of the User class? (Of course, I am not suggesting that you make all test fixtures nested classes. But if you need your test case to access the private inner workings of a class, writing it as a nested class accomplishes this.) The modified code with Test as a nested class is shown in Example 3-23.

Example 3-23. Test as nested class

C# (TestPrivate)

using System;
using System.Security.Cryptography;
using NUnit.Framework;

namespace UnitTest
{

    public class User
    {
        private string password;

        public void ChangePassword(
            string oldPassword, string thePassword)
        {
            if ((password == null && oldPassword == null)
                || CreateHash(oldPassword) == password)
            {
                password = CreateHash(thePassword);
            }
            else
            {
                throw new ApplicationException("Invalid password");
            }
        }

        private string CreateHash(string input)
        {
            byte[] hashCode = new SHA256Managed().ComputeHash(
                System.Text.Encoding.ASCII.GetBytes(input));

            return BitConverter.ToString(hashCode);
        }

        // In .NET 2.0, with Partial Classes, this can be
        // in a separate file
        [TestFixture]
        public class Test
        {
            private User theUser;

            [SetUp]
            public void CreateUser()
            {
                theUser = new User();
            }

            [Test]
            public void TestSetPassword()
            {
                string PASSWORD = "Cod!ng";

                theUser.ChangePassword(null, PASSWORD);

                Assert.AreEqual(theUser.password,
                    theUser.CreateHash(PASSWORD));
            }
        }
    }
}

VB.NET (TestPrivate)

'User.vb
Imports System
Imports System.Security.Cryptography
Imports NUnit.Framework

Public Class User
    Private password As String
    Public Sub ChangePassword(ByVal oldPassword As String, _
            ByVal thePassword As String)
        If (password Is Nothing AndAlso oldPassword Is Nothing) OrElse _
                CreateHash(oldPassword) = password Then
            password = CreateHash(thePassword)
        Else
            Throw New ApplicationException("Invalid password")
        End If
    End Sub

    Private Function CreateHash(ByVal input As String) As String
        Dim hashCode() As Byte = New SHA256Managed().ComputeHash( _
                     System.Text.Encoding.ASCII.GetBytes(input))

        Return BitConverter.ToString(hashCode)

    End Function

    'In .NET 2.0, with Partial Classes, this can be in a separate file
    <TestFixture()> _
    Public Class Test

        Private theUser As User

        <SetUp()> _
        Public Sub CreateUser()
            theUser = New User
        End Sub

        <Test()> _
        Public Sub TestSetPassword()
            Dim PASSWORD As String = "Cod!ng"

            theUser.ChangePassword(Nothing, PASSWORD)

            Assert.AreEqual(theUser.password, _
                theUser.CreateHash(PASSWORD))
        End Sub
    End Class

End Class

In this code, the Test class is a nested class of the User class. Nested classes have full access to private members of the nesting class. This allows a more convenient way to test the class implementation without compromising the encapsulation or access control. The test case being executed under NUnit is shown in Figure 3-19.

Figure 3-19. NUnit test executing a nested test fixture

One disadvantage of this approach is that the class file User.cs (or User.vb) now becomes larger. Furthermore, you may not want to release your test cases with your assembly. This will not be an issue in the next release of .NET, where partial classes are allowed. Then you’ll be able to write the User class in one or more files and keep the test cases in different files. That will also make it easier to remove test cases from the build in production.
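As a rough sketch of how that split might look with partial classes, the two parts of User below would live in separate files (the file names in the comments are assumptions). To keep the sketch compilable without a reference to NUnit, the nested class uses a plain method that returns a bool; in a real fixture it would carry the [TestFixture] and [Test] attributes and use Assert.AreEqual, as in Example 3-23. The hash is computed with SHA256.Create() here rather than SHA256Managed, which is a substitution made for this sketch.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// User.cs (assumed file name): the production code.
public partial class User
{
    private string password;

    public void ChangePassword(string oldPassword, string thePassword)
    {
        if ((password == null && oldPassword == null)
            || CreateHash(oldPassword) == password)
        {
            password = CreateHash(thePassword);
        }
        else
        {
            throw new ApplicationException("Invalid password");
        }
    }

    private string CreateHash(string input)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hashCode = sha.ComputeHash(Encoding.ASCII.GetBytes(input));
            return BitConverter.ToString(hashCode);
        }
    }
}

// User.Test.cs (assumed file name): the nested test, kept in its own
// file so that it can be left out of a production build.
public partial class User
{
    public class Test
    {
        // Stand-in for an NUnit [Test] method. As a nested class,
        // it can read the private field and call the private method.
        public bool PasswordIsStoredAsHash()
        {
            User theUser = new User();
            theUser.ChangePassword(null, "Cod!ng");
            return theUser.password == theUser.CreateHash("Cod!ng");
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(new User.Test().PasswordIsStoredAsHash());
    }
}
```

Because the test part is an ordinary source file, excluding it from the release build leaves User.cs untouched.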

IN A NUTSHELL

If you find yourself changing the accessibility of fields and methods just so you can test them, or you start introducing methods only for the sake of testing other methods (e.g., GetPassword() in Example 3-22), consider writing those tests as nested classes. The tests that depend only on non-private members of a class should still be written as higher-level classes in the same project.



[2] Thanks to Ruby Hjelte for bringing this to my attention during a recent project.
