Serialization

At this point, we are experts on reading and writing various datatypes to and from streams. However, we have not yet considered perhaps the most important datatype: an object. How do you save an object to a stream and load it back so that it is in the same state as when it was saved?

One way is to provide explicit save and load methods on your class that take a Stream object as an argument. Within each method, you can explicitly save or load each member field of the class.
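As a minimal sketch of this explicit approach, the following class writes and reads its own fields with BinaryWriter and BinaryReader. The class name ManualFoo, its fields, and the ManualSaveLoadDemo helper are hypothetical and exist only for illustration.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical class; named ManualFoo to keep it apart from the Foo used later.
class ManualFoo
{
    public int Id;
    public double Weight;
    public string Name = "";

    // Write each member field to the stream in a fixed order.
    public void Save(Stream stream)
    {
        // leaveOpen is true, so the caller remains responsible for the stream.
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, true))
        {
            writer.Write(Id);
            writer.Write(Weight);
            writer.Write(Name);
        }
    }

    // Read the fields back in exactly the order Save wrote them.
    public void Load(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.UTF8, true))
        {
            Id = reader.ReadInt32();
            Weight = reader.ReadDouble();
            Name = reader.ReadString();
        }
    }
}

static class ManualSaveLoadDemo
{
    // Save to an in-memory stream, rewind, and load into a fresh object.
    public static ManualFoo RoundTrip(ManualFoo original)
    {
        using (var ms = new MemoryStream())
        {
            original.Save(ms);
            ms.Position = 0;
            var copy = new ManualFoo();
            copy.Load(ms);
            return copy;
        }
    }

    static void Main()
    {
        var copy = RoundTrip(new ManualFoo { Id = 10, Weight = 20.5, Name = "Jay" });
        Console.WriteLine("{0} {1} {2}", copy.Id, copy.Weight, copy.Name);
    }
}
```

Note that Save and Load must agree on the order and type of every field, which is exactly the bookkeeping that has to be repeated for each class.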

There is nothing wrong with this technique, but when dealing with large numbers of classes, it soon becomes painful to add the logic to each class. Why can't we write a generic mechanism to load and save any arbitrary object? After all, the metadata contains all the information about all the fields of a class. It would be easy to enumerate all the fields of an object in a generic manner to save or load them. .NET does provide such a generic mechanism.
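To make the idea of enumerating fields through metadata concrete, here is a rough sketch that uses reflection to capture and restore the fields of an arbitrary object. It is not how the .NET formatters are implemented; the names ReflectedFoo and FieldState are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical class for illustration.
class ReflectedFoo
{
    private int m_i = 10;
    private double m_d = 20.5;
    public string Name = "Jay";

    public void Clear() { m_i = 0; m_d = 0.0; Name = ""; }
    public override string ToString() { return m_i + " " + m_d + " " + Name; }
}

static class FieldState
{
    const BindingFlags AllInstanceFields =
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

    // Copy every instance field that reflection reports for the type
    // into a name/value map.
    public static Dictionary<string, object> Capture(object obj)
    {
        var state = new Dictionary<string, object>();
        foreach (FieldInfo field in obj.GetType().GetFields(AllInstanceFields))
            state[field.Name] = field.GetValue(obj);
        return state;
    }

    // Write the captured values back into the fields of the same names.
    public static void Restore(object obj, Dictionary<string, object> state)
    {
        foreach (FieldInfo field in obj.GetType().GetFields(AllInstanceFields))
            field.SetValue(obj, state[field.Name]);
    }

    static void Main()
    {
        var foo = new ReflectedFoo();
        Dictionary<string, object> saved = Capture(foo);
        foo.Clear();
        Restore(foo, saved);
        Console.WriteLine(foo);
    }
}
```

This sketch ignores several things a real mechanism must handle, such as private fields declared in base classes (GetFields does not return them for a derived type) and fields that refer to other objects.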

The process of saving the state of an object into a stream is a common programming task, referred to as serialization.

Serialization is an important part of the .NET Framework. The remoting infrastructure and services depend on serialization. For example, the serialized representation of an object can be taken to a different machine where the object can be reconstructed. Given an efficient serialization framework, an object may simply be serialized to a stream of bytes in memory and transmitted to the remote machine.

Under .NET, a type that supports serialization has to indicate this by means of the System.SerializableAttribute class-level attribute. This is illustrated in the following code excerpt:

// Project Serialization

[Serializable]
class Foo {
     ...
}

If serialization is attempted on a type that is not marked as Serializable, the system throws a SerializationException. This exception is also thrown if any object in the serialization graph is not marked as Serializable.

By default, all the member fields defined in the type get serialized. To omit a specific field from serialization, you can apply the System.NonSerializedAttribute attribute to the field. This is illustrated in the following code excerpt:

// Project Serialization

[Serializable]
class Foo {
     private int m_i;
     [NonSerialized] private double m_d;
     private string m_s;
     ...
}

The object is now ready for serialization, but how do you initiate the serialization process?

Formatters

Serialization can be initiated either by an application or by the runtime. The application initiates serialization, for example, to store the state of the object on a disk file when the application is being exited. The runtime initiates serialization, for example, when the object is being passed to a different application domain, perhaps to a different machine.

The format of the serialized data depends on why the object is being serialized. For example, if the object is being transferred to a different computer using HTTP, storing data in a SOAP-compliant XML format makes sense. However, if the object is being stored to a file or being transferred using TCP, storing data in a binary form is more efficient. Therefore, it makes sense to decouple the logic of serialization and the format of the serialized data.

Under .NET, the format of the output is controlled by what is called a formatter, that is, a type that implements the standard interface IFormatter. Here are some members of the interface that are relevant to our current discussion:

public interface IFormatter {
     void Serialize(Stream stream, Object root);
     Object Deserialize(Stream stream);
     StreamingContext Context {get; set;}
     ...
}

The Serialize method serializes an object (and all its children) to a stream.

The Deserialize method reads the data back from the stream and reconstructs the state of the object.

The Context property provides a mechanism for the initiator to supply additional information to the object being serialized. Its type, StreamingContext, exposes two properties, State and Context.

The StreamingContext.State property is an enumeration of type StreamingContextStates that indicates why the data is being serialized. For example, a value of CrossMachine implies the serialized data is for a remote computer. A value of CrossProcess implies the data is for a different process on the local computer. Look into the SDK documentation for the rest of the enumeration values.

The StreamingContext.Context property provides a way to supply any additional information, in the form of an object, to the object being serialized.

Note that you are not required to specify a StreamingContext object in your code. The runtime provides an appropriate context to the object being serialized.

Enough with IFormatter! Let's see how we can use a formatter.

The BCL provides a formatter called BinaryFormatter to serialize an object to a binary format. The output is very compact and can be parsed quickly. The following code excerpt illustrates the use of this formatter:

// Project Serialization

public static void UseBinaryStream() {
     Foo f1 = new Foo(10, 20.5, "Jay");

     // Open a file for binary write
     FileStream fsw = File.OpenWrite("Output.bin");
     BinaryFormatter bf = new BinaryFormatter();
     bf.Serialize(fsw, f1);
     fsw.Close();

     // Open the file for reading and recreate Foo
     FileStream fsr = File.OpenRead("Output.bin");
     Foo f2 = (Foo) bf.Deserialize(fsr);
     fsr.Close();
}

It is worth noting that serialization of an object is not limited to its public member fields. To store the state of the object with complete fidelity, all the member fields, including the private ones, are serialized.

As the serialization logic is independent of the stream being used, it is relatively trivial to modify the preceding code to use a MemoryStream instead. Project Serialization also contains such an example.

There is yet another useful formatter the BCL provides called the SoapFormatter. This formatter generates SOAP-compliant XML-based output. If you replace BinaryFormatter with SoapFormatter in the preceding code, here is how the output looks (project SoapFormatter):

<SOAP-ENV:Envelope
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
     xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
     xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
     SOAP-ENV:encodingStyle=
          "http://schemas.xmlsoap.org/soap/encoding/"
     xmlns:a1="http://schemas.microsoft.com/clr/assem/main">
     <SOAP-ENV:Body>
       <a1:Foo id="ref-1">
          <m_i>10</m_i>
          <m_s id="ref-3">Jay</m_s>
       </a1:Foo>
     </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

A final note on formatters: Although the BCL-provided formatters will meet most of your needs, it is possible to create your own formatter. The BCL supplies an abstract base class called Formatter that provides some helper methods for implementing the IFormatter interface. Adventurous readers can implement their own formatter by inheriting from this class.

Custom Serialization

Occasionally, an object itself may wish to finely control how it gets serialized. For example, the object might want to save its internal state in a more compressed way if the data is being written to a file. Perhaps the object does not wish to save some data if it is being serialized for the purpose of being rebuilt on a different computer.

To provide custom serialization, a class has to support a standard interface, ISerializable. Here is its definition:

public interface ISerializable {
     void GetObjectData(SerializationInfo info,
       StreamingContext context);
}

When serialization is in progress, the framework checks if the object implements the ISerializable interface, in which case it calls GetObjectData on the object. This gives the object a chance to serialize itself. The parameter SerializationInfo holds the serialization data. The object can inject its own data into SerializationInfo by means of a method called AddValue. This is illustrated in the following code excerpt:

// Project CustomSerialization

[Serializable]
class Foo : ISerializable {
     ...
     public void GetObjectData(SerializationInfo info,
         StreamingContext ctx) {
       info.AddValue("My iVal", m_i);
       info.AddValue("My dVal", m_d);
       info.AddValue("My sVal", m_s);
     }
}

Method AddValue can be called multiple times to inject multiple entries. Each entry forms a key-value pair in which the key is a string. The value can be of any base datatype. The SerializationInfo class provides many overloaded AddValue methods to deal with the various base datatypes.

Parameter StreamingContext provides the contextual information. Recall that this information is set either explicitly by the application or implicitly by the runtime.

Note that implementing ISerializable on a class does not remove the need for the [Serializable] attribute. Without this attribute present on the class, the common language runtime does not even consider serializing instances of the class.

Deserialization

Now you know how to save an object's state. How do you read the data back to restore the state of the object? Implementing just the ISerializable interface is not enough. You also have to provide a constructor overload for your class that takes SerializationInfo and StreamingContext as parameters. This is illustrated in the following code excerpt:

// Project CustomSerialization

[Serializable]
class Foo : ISerializable {
     ...

     public Foo(SerializationInfo info, StreamingContext ctx) {
       m_i = info.GetInt32("My iVal");
       m_d = info.GetDouble("My dVal");
       m_s = info.GetString("My sVal");
     }
}

During deserialization, the runtime calls this constructor, giving the object a chance to initialize its internal state. The parameter SerializationInfo provides many GetXXX methods to retrieve various base datatypes.
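The two halves of custom serialization can be exercised together without a formatter. The sketch below plays the formatter's role by hand: it creates a SerializationInfo, calls GetObjectData, and passes the result to the deserialization constructor. The class ContextAwareFoo, its fields, and the Copy helper are hypothetical, and a real formatter does considerably more work. The sketch also assumes that the context-dependent behavior described earlier is decided by testing the CrossMachine flag. On recent .NET versions, some of these types may be reported as obsolete by the compiler.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical class for illustration.
[Serializable]
class ContextAwareFoo : ISerializable
{
    private readonly int m_i;
    private readonly string m_scratchPath;   // only meaningful on this machine

    public ContextAwareFoo(int i, string scratchPath)
    {
        m_i = i;
        m_scratchPath = scratchPath;
    }

    // Deserialization constructor: reads back the entries by key.
    protected ContextAwareFoo(SerializationInfo info, StreamingContext ctx)
    {
        m_i = info.GetInt32("i");
        m_scratchPath = info.GetString("scratchPath");
    }

    public int I { get { return m_i; } }
    public string ScratchPath { get { return m_scratchPath; } }

    public void GetObjectData(SerializationInfo info, StreamingContext ctx)
    {
        info.AddValue("i", m_i);

        // Do not ship the machine-local path to a different computer.
        bool crossMachine =
            (ctx.State & StreamingContextStates.CrossMachine) != 0;
        info.AddValue("scratchPath", crossMachine ? null : m_scratchPath);
    }

    // Demo-only helper that stands in for a formatter.
    public static ContextAwareFoo Copy(ContextAwareFoo source,
        StreamingContextStates why)
    {
        var ctx = new StreamingContext(why);
        var info = new SerializationInfo(typeof(ContextAwareFoo),
            new FormatterConverter());
        source.GetObjectData(info, ctx);
        return new ContextAwareFoo(info, ctx);
    }

    static void Main()
    {
        var foo = new ContextAwareFoo(10, "scratch.tmp");
        var local = Copy(foo, StreamingContextStates.File);
        var remote = Copy(foo, StreamingContextStates.CrossMachine);
        Console.WriteLine(local.ScratchPath ?? "(none)");
        Console.WriteLine(remote.ScratchPath ?? "(none)");
    }
}
```

The keys passed to AddValue and to the GetXXX methods must match exactly; asking for a key that was never added raises a SerializationException.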

Deserialization Completion

Deserialization is quite simple for objects that have no dependencies on other objects. In real life, the root object being serialized points to many other objects that in turn point to other objects. Sometimes, from an object's perspective, it is desirable to know if the deserialization process is complete; that is, if the entire object graph has been deserialized.

An object that wants to receive a notification at the end of deserialization must implement a standard interface, IDeserializationCallback. This interface defines just one method, OnDeserialization, which the runtime calls at the end of deserialization. You can implement the interface as shown in the following code excerpt:

// Project Deserialize

[Serializable]
class Foo : ISerializable , IDeserializationCallback {
     ...
     public void OnDeserialization(Object sender) {
       Console.WriteLine("Deserialization complete");
     }
}

XML Serializer

In the new era of communication, XML has become a standard format for information exchange between businesses. The BinaryFormatter provides a compact format, but it works only between .NET applications. Serializing to XML creates a message that is readable on any platform by anyone. When developing business applications, it is becoming quite common to write to or read from XML documents. The format of the XML document typically conforms to a given XML Schema Definition (XSD) schema (.xsd) document.

.NET provides a class, XmlSerializer (namespace System.Xml.Serialization), that enables you to control how objects can be serialized into XML output and how objects can be rebuilt from XML input.

Technically, XmlSerializer belongs to the XML Class Library and not the BCL. However, I am covering it here because it is relevant to our current discussion on serialization.

Using XmlSerializer is similar to using a formatter. The following code excerpt demonstrates its usage. Here, an instance of class BookInfo is serialized to a document. Later the document is deserialized to create a new instance of BookInfo:

// Project XmlSerialize

public class BookInfo {
     public String ISBN {
       get{return m_ISBN;}
       set{m_ISBN = value;}
     }
     public String Title {
       get{return m_Title;}
       set{m_Title = value;}
     }
     public String Author {
       get{return m_Author;}
       set{m_Author = value;}
     }

     public float Price = 0.0f;
     private String m_Title = "";
     private String m_Author = "";
     private String m_ISBN = "";
}

static public void Run() {
     BookInfo b1 = new BookInfo();
     b1.ISBN = "0130886742";
     b1.Title = "COM+ Programming";
     b1.Author = "Pradeep Tapadiya";
     b1.Price = 40.0f;

     // Create (or truncate) a file for XML output. File.OpenWrite would
     // leave stale bytes behind if a longer file already existed.
     FileStream fsw = File.Create("Output00.xml");
     XmlSerializer xs = new XmlSerializer(typeof(BookInfo));
     xs.Serialize(fsw, b1);
     fsw.Close();

     // Open the file for reading
     FileStream fsr = File.OpenRead("Output00.xml");
     BookInfo b2 = (BookInfo) xs.Deserialize(fsr);
     fsr.Close();
}

XmlSerializer serializes only the public fields and public read/write properties of a type. Contrast this with a formatter, which saves even the private fields and does nothing special for properties. Also, the serializer pays no attention to either the [Serializable] attribute or the ISerializable interface.

Here is the output when the preceding program is executed:

<?xml version="1.0"?>
<BookInfo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Price>40</Price>
  <ISBN>0130886742</ISBN>
  <Title>COM+ Programming</Title>
  <Author>Pradeep Tapadiya</Author>
</BookInfo>

As can be seen from the output, each of the public fields and properties is saved as an XML element. The name of the root node matches that of the class and the name of each XML element matches that of the corresponding field or property of the class.
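The following sketch illustrates the point about visibility by serializing to an in-memory string instead of a file. The Account class and the XmlVisibilityDemo helper are hypothetical; the private field m_pin never reaches the XML, so it comes back with its default value.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type for illustration. XmlSerializer needs a public class
// with a public parameterless constructor.
public class Account
{
    public string Owner = "";
    private int m_pin = 1234;   // private: not written by XmlSerializer

    public void SetPin(int pin) { m_pin = pin; }
    public int GetPin() { return m_pin; }
}

public static class XmlVisibilityDemo
{
    // Serialize to a string rather than a file.
    public static string ToXml(Account account)
    {
        var xs = new XmlSerializer(typeof(Account));
        using (var sw = new StringWriter())
        {
            xs.Serialize(sw, account);
            return sw.ToString();
        }
    }

    // Rebuild an Account from the XML text.
    public static Account FromXml(string xml)
    {
        var xs = new XmlSerializer(typeof(Account));
        using (var sr = new StringReader(xml))
        {
            return (Account) xs.Deserialize(sr);
        }
    }

    public static void Main()
    {
        var a1 = new Account();
        a1.Owner = "Jay";
        a1.SetPin(9999);

        string xml = ToXml(a1);
        Console.WriteLine(xml);

        Account a2 = FromXml(xml);
        // Owner survives; the PIN reverts to the field initializer's value.
        Console.WriteLine(a2.Owner + " " + a2.GetPin());
    }
}
```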

The XML serialization mechanism, however, provides a flexible way to format the output. As I mentioned earlier, the format of the XML document typically conforms to a given XSD schema. Let's say, for example, that the XSD schema for the book is defined as follows:

<xsd:schema targetNamespace=""
     xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
  <xsd:element name="book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="name" type="xsd:string" />
        <xsd:element name="author" type="xsd:string" />
      </xsd:sequence>
      <xsd:attribute name="isbn" type="xsd:string" />
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

Details about XSD schemas can be found in the SDK documentation. The schema presented here essentially states that the root node should be named book, the ISBN should be an attribute named isbn, and the name and author are XML elements of type string. Given this, an instance of the output may look like the following:

<book isbn="0130886742">
  <name>COM+ Programming</name>
  <author>Pradeep Tapadiya</author>
</book>

To customize the output, you can define XML serialization attributes on the class and its public elements. Table 5.4 shows some common attributes and their usage.

Table 5.4. XML Serialization Attributes

Attribute       Description
XmlRoot         Identifies the class or struct as the root node. Typically used to give the root element a name other than the class name.
XmlElement      The public property or field should be serialized as an XML element. Typically used to give the element a name other than the field name.
XmlAttribute    The public property or field should be serialized as an XML attribute. Can also give the attribute a name other than the field name.
XmlArray        The public property or field should be serialized as an array. Useful when an array of objects needs to be serialized.
XmlArrayItem    Identifies a type that can be placed into a serialized array.
XmlIgnore       Do not serialize the specific public property or field.

Using these attributes, we can revise our BookInfo class as follows:

// Project XmlSerialize

[XmlRoot(ElementName="book")]
public class BookInfo {
     [XmlAttribute(AttributeName="isbn")]
     public String ISBN {
       ...
     }
     [XmlElement(ElementName="name")]
     public String Title {
       ...
     }

     [XmlElement(ElementName="author")]
     public String Author {
       ...
     }

     [XmlIgnore]
     public float Price = 0.0f;

     ...
}
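For reference, here is a self-contained sketch that applies the same attributes and prints the resulting XML. BookRecord is a condensed, hypothetical stand-in for BookInfo that uses public fields instead of properties. The XmlSerializerNamespaces step is an addition that is not part of the original sample; it suppresses the default xmlns:xsi and xmlns:xsd declarations so that the output resembles the instance document shown earlier.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Condensed, hypothetical stand-in for BookInfo.
[XmlRoot(ElementName = "book")]
public class BookRecord
{
    [XmlAttribute(AttributeName = "isbn")]
    public string ISBN = "";

    [XmlElement(ElementName = "name")]
    public string Title = "";

    [XmlElement(ElementName = "author")]
    public string Author = "";

    [XmlIgnore]
    public float Price;
}

public static class BookXmlDemo
{
    public static string ToXml(BookRecord book)
    {
        var xs = new XmlSerializer(typeof(BookRecord));

        // An empty prefix/namespace pair suppresses the default
        // xmlns:xsi and xmlns:xsd declarations on the root element.
        var ns = new XmlSerializerNamespaces();
        ns.Add("", "");

        using (var sw = new StringWriter())
        {
            xs.Serialize(sw, book, ns);
            return sw.ToString();
        }
    }

    public static void Main()
    {
        var b = new BookRecord();
        b.ISBN = "0130886742";
        b.Title = "COM+ Programming";
        b.Author = "Pradeep Tapadiya";
        b.Price = 40.0f;   // ignored because of [XmlIgnore]

        Console.WriteLine(ToXml(b));
    }
}
```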

Serializer versus Formatter

Although SoapFormatter also can be used to serialize an object into XML, formatters and the XML serializer solve two different problems. A formatter is used to serialize an object with the utmost fidelity. The XML serializer, on the other hand, is used to process XML documents that typically conform to a given XSD schema. It is not associated with the runtime serialization architecture as formatters are and is controlled by a different set of attributes than those used by the formatters.


Although the ability to produce XML documents that conform to a given schema is very powerful, it also has some limitations that you should be aware of. One such limitation that we have already seen is that the private fields cannot be serialized. Another limitation is that if an object graph contains circular references, then the object cannot be serialized.
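The circular-reference limitation is easy to reproduce. In the following sketch, the ListNode class and the TrySerialize helper are hypothetical. Serializing a chain of nodes works until the last node points back to the first, at which point XmlSerializer throws an InvalidOperationException.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type for illustration.
public class ListNode
{
    public string Name = "";
    public ListNode Next;
}

public static class CircularReferenceDemo
{
    // Returns true if the graph starting at head could be serialized.
    public static bool TrySerialize(ListNode head)
    {
        var xs = new XmlSerializer(typeof(ListNode));
        try
        {
            using (var sw = new StringWriter())
            {
                xs.Serialize(sw, head);
            }
            return true;
        }
        catch (InvalidOperationException)
        {
            // Thrown when the serializer detects a circular reference.
            return false;
        }
    }

    public static void Main()
    {
        var a = new ListNode();
        a.Name = "a";
        var b = new ListNode();
        b.Name = "b";
        a.Next = b;

        Console.WriteLine(TrySerialize(a));   // a -> b: no cycle

        b.Next = a;                           // a -> b -> a: cycle
        Console.WriteLine(TrySerialize(a));
    }
}
```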

As XSD schemas are so frequently used to specify XML formats, the SDK provides a tool called the XML Schema Definition Tool (xsd.exe) that lets you generate a strongly typed C# class based on an existing XSD schema. For example, assuming the XSD schema for the book is defined in file BookSchema.xsd, the following command generates an output file, BookSchema.cs, containing the corresponding C# class:

xsd.exe BookSchema.xsd /c

It is also possible to generate an XSD schema either from the XML output definition or from an assembly that defines the type to be serialized. The following command line, for example, uses XML instance data from the file BookInstance.xml and generates an XSD schema in the file BookInstance.xsd:

xsd.exe BookInstance.xml

The samples dealing with xsd.exe can be found under the project UsingXSD. More information on xsd.exe can be found in the SDK documentation.
