Advanced Topics

This section contains some advanced topics. You can skip this section if you wish and visit it later. The topics include the following:

  • Building assemblies that contain multiple modules as well as other files such as HTML files.

  • Loading resources such as strings and images programmatically and localizing resources by means of satellite assemblies.

  • The internals of the manifest and the digital signing of strong-named assemblies.

  • Using Reflection to inspect metadata and to invoke methods programmatically.

Multifile Assemblies

So far, all the assemblies we have built consisted of a single module. Although this is the most common case for an assembly, it is also possible to build assemblies containing multiple modules.

From the assembly consumer's perspective (the external view), an assembly contains MSIL code, metadata, manifest, and resources. From the assembly developer's perspective (the internal view), an assembly consists of one or more modules, each of which contains metadata and manifest and may additionally contain MSIL code and resources.

Why do we need a multimodule assembly? There are two cases in which creating a multimodule assembly makes sense:

  1. To mix code from multiple programming languages. The respective language compiler can produce modules that can all be combined into a single assembly.

  2. A more compelling reason is that a module is loaded into memory only if a type from the module is accessed. This allows you to lazy load a module. This is quite useful over a slow link. For example, if an assembly is being accessed over the Internet, the runtime downloads a module only if needed. You could put the frequently used types in one module and less frequently used types in another. If the client doesn't access the less frequently used types, the corresponding module is never downloaded, thereby improving performance.

Adding Modules

Let's see how we can develop a multimodule assembly. We can take ConsoleGreeting.cs and WindowsGreeting.cs from our previous chapter and build a module for each of the source files. We then create an assembly using these two modules.

The C# compiler provides a command-line switch, -t:module, to generate a module file. Using this option, the modules ConsoleGreeting.mod and WindowsGreeting.mod can be generated as follows (Project MultiModuleAssembly):

csc.exe -t:module -out:ConsoleGreeting.mod ConsoleGreeting.cs
csc.exe -t:module -out:WindowsGreeting.mod WindowsGreeting.cs

Note that in this example, one module is built per source file. In general, a module can be based on any number of source files. The first release of Visual Studio .NET does not have any support for building modules.

To create an assembly from the modules, the C# compiler provides another command switch, -addmodule. Using this switch, any number of modules can be added to an assembly, as shown in the following code (Project MultiModuleAssembly):

csc.exe -t:library -out:Greeting.dll 
     -addmodule:ConsoleGreeting.mod 
     -addmodule:WindowsGreeting.mod

The generated assembly technically consists of three modules. The file Greeting.dll is the prime module. Recall that a prime module stores the assembly manifest.

Note that module Greeting.dll does not contain any MSIL code. It is not necessary for a module to contain any code.

Although csc.exe can be used to build a multimodule assembly, you can also use the Assembly Linker (al.exe) for this purpose. Using this tool, our assembly could have been created as follows:

al.exe -t:library -out:Greeting.dll 
     ConsoleGreeting.mod WindowsGreeting.mod
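
Once an assembly has been built, its modules can be listed at runtime. The following is a minimal sketch, not part of the sample projects: the class name ModuleLister and the helper GetModuleNames are hypothetical, and the code inspects its own (single-module) assembly rather than Greeting.dll.

```csharp
using System;
using System.Reflection;

public static class ModuleLister
{
    // Returns the file names of all modules that make up the given assembly.
    public static string[] GetModuleNames(Assembly assembly)
    {
        Module[] modules = assembly.GetModules();
        string[] names = new string[modules.Length];
        for (int i = 0; i < modules.Length; i++)
        {
            names[i] = modules[i].Name;
        }
        return names;
    }

    public static void Main()
    {
        // Hypothetical target: the assembly that contains this class.
        Assembly assembly = typeof(ModuleLister).Assembly;

        // The module holding the assembly manifest (the prime module).
        Console.WriteLine("Prime module: {0}", assembly.ManifestModule.Name);

        foreach (string name in GetModuleNames(assembly))
        {
            Console.WriteLine("Module: {0}", name);
        }
    }
}
```

Run against a multimodule assembly such as Greeting.dll, the same loop should report one entry per module, with the prime module among them.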

Adding Non-PE Files

An assembly is not just limited to module files. It can also contain non-PE files such as bitmap files or HTML files. The assembly linker provides a switch, -link, to link external files to the assembly. The following command line, for example, adds ReadMe.htm as an external file to the assembly:

al.exe -t:library -out:Greeting.dll -link:ReadMe.htm 
     ConsoleGreeting.mod WindowsGreeting.mod

Resources

Consider the following sample code that displays a greeting to the console:

using System;
class MyApp
{
     public static void Main() {
       String greeting = "Hello everyone!";
       Console.WriteLine(greeting);
     }
}

In this code, the string “Hello everyone!” is hard-coded. However, the .NET Framework also provides a way to define the string in an external file. That external file can then be embedded into the assembly. The application can subsequently load the resource programmatically.

What's the big deal about moving strings to an external file? For one, it simplifies localization, the process of customizing an application for multiple human languages. The team responsible for translating the strings (and images, if need be) to various languages won't have to deal with the code; they just deal with text-based files. A second reason, and a more important one, is that resources for a specific culture can be embedded into a resource-only assembly, and the proper resource can be loaded based on the current culture setting of the application. Such resource-only assemblies are referred to as satellite assemblies.

Embedding Resources

Embedding a resource in an assembly is a multistep process. In the first step, the resources have to be defined in an XML format referred to as the ResX format. A .ResX file can hold strings as well as binary images. The necessary XML schema is documented in the SDK under the ResourceSchema property of ResXResourceWriter class.

Utilities to Manipulate Resources

ResX files are text based. To store a binary image file (such as a .jpg or a .bmp file) into the .ResX file, the binary data has to be converted into ASCII format using Base64 encoding. The SDK samples include a tool, ResXGen, that takes a binary image file as input and converts it to XML-formatted .ResX output file. The source code for ResXGen is also provided.

Some other useful utilities in the SDK samples include ResDump, a tool to enumerate the resources in a .resources file, and ResEditor, a GUI tool to add string and image resources to a .ResX or a .resources file.


If you are dealing with just the string resources, as in our case, then .NET provides an alternative to using .ResX files. Name–value pairs can be defined in a text file (typically with extension .txt) where the name is a string that describes the resource and the value is the resource string itself. The following excerpt shows the content of our text-based resource file:

Greeting = Hello everyone!

In the second step, the .txt or the .ResX file has to be converted to a binary resource file (.resources). The framework provides a tool called the Resource File Generator (ResGen.exe) that can be used to generate such a file. ResGen.exe expects an input filename and an optional output filename as parameters. If an output filename is not specified, it creates a binary resource file that has the same name as the input file but with a .resources extension. It is not important to know the format of this file.

Assuming our text-based resource file is named MyStrings.txt, the following command line generates a .resources binary resource file:

resgen.exe MyStrings.txt MyStrings.resources

Converting between Resource File Formats

Resgen.exe is capable of converting any resource file (.txt, .resx, or .resources) to any other resource file. The conversion is done based on the extension of the input and the output filenames. Just remember that text files can only contain string resources. If a .resx file or a .resources file contains any images, converting it to a .txt file will lose the image information.
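
A binary .resources file can also be produced and inspected in code, which gives a rough idea of what ResGen.exe does for a name-value text file. The following is a minimal sketch using the ResourceWriter and ResourceReader classes from System.Resources; the class name ResourceRoundTrip, its helper methods, and the temporary file location are hypothetical.

```csharp
using System;
using System.Collections;
using System.IO;
using System.Resources;

public static class ResourceRoundTrip
{
    // Writes a single string resource to a binary .resources file.
    public static void WriteResources(string path)
    {
        using (ResourceWriter writer = new ResourceWriter(path))
        {
            writer.AddResource("Greeting", "Hello everyone!");
            writer.Generate(); // emit the resources to the file
        }
    }

    // Enumerates the file and returns the string stored under the given name,
    // or null if no such resource exists.
    public static string ReadString(string path, string name)
    {
        using (ResourceReader reader = new ResourceReader(path))
        {
            foreach (DictionaryEntry entry in reader)
            {
                if ((string)entry.Key == name)
                {
                    return (string)entry.Value;
                }
            }
        }
        return null;
    }

    public static void Main()
    {
        // Hypothetical location; any writable path would do.
        string path = Path.Combine(Path.GetTempPath(), "MyStrings.resources");
        WriteResources(path);
        Console.WriteLine(ReadString(path, "Greeting"));
        File.Delete(path);
    }
}
```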


The last step is to embed the binary resource file into an assembly. Assuming our new code, revised for loading a string dynamically, is stored in HelloAll.cs, the following command line creates an assembly HelloAll.exe that has MyStrings.resources embedded in it:

csc.exe -t:exe -out:HelloAll.exe 
     -res:MyStrings.resources HelloAll.cs

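So how does the revised code in HelloAll.cs look? Here it is. The changes are the using System.Resources directive and the two statements that create the ResourceManager and load the string: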

// Project EmbeddedResources

using System;
using System.Resources;

class MyApp
{
     public static void Main() {
       ResourceManager rm = new
         ResourceManager("MyStrings", typeof(MyApp).Assembly);
       String greeting = rm.GetString("Greeting");
       Console.WriteLine(greeting);
     }
}

The framework provides a class ResourceManager (namespace System.Resources) that can be used to load resources dynamically. ResourceManager defines many overloaded constructors. The one that we are using requires two parameters. The first parameter is the root name of the resource file. For the MyStrings.resources file, the root name is MyStrings. The second parameter is the assembly where this resource can be found. As we know that the resource is in the same assembly as MyApp is, we can ask the runtime to locate the assembly that contains the definition of type MyApp, as shown in the code.

Once the ResourceManager object has been created, we can load the string by calling the GetString method on the object. This method takes the resource identifier as a parameter.

Note that ResourceManager also provides a more generic method, GetObject, that can be used to load any type of resource (text or image). For example, we could also use the following line of code to load our string:

String greeting2 = (String) rm.GetObject("Greeting");

Loading an image is equally simple, as shown in the following line of code:

Image img = (Image) rm.GetObject("MyImage");

It is left as an exercise for you to extend the sample program to deal with embedded images.

Satellite Assemblies

Let's build a multilanguage application. We will extend our earlier example to deal with U.S. English and Spanish.

Recall that the common language runtime supports the notion of culture-neutral assemblies. A culture-neutral assembly does not have any culture-specific settings; it contains resources that can be used under any culture.

Typically, the main assembly is built to be culture neutral. The idea is that if a requested resource for a particular culture is not found in the satellite assembly, the runtime can fall back to the main assembly to load the resource.

For our example, we will build the satellite assembly with Spanish culture. The main assembly will embed an English language resource string, but will be built culture-neutral.

Let's create two subdirectories, en-US and es-ES. The importance of the directory names will become evident when we examine how the runtime locates satellite assemblies. I have also created a file MyStrings.txt under both the subdirectories, one with English strings and the other with Spanish strings.

The first step is to generate .resources files for both the languages. The command lines are shown here:

resgen.exe en-US\MyStrings.txt en-US\MyStrings.en-US.resources
resgen.exe es-ES\MyStrings.txt es-ES\MyStrings.es-ES.resources

Notice the filenames for the output files. The standard convention is to specify a resource filename as <root-file-name>.<culture-name>.resources.

The next step is to build the Spanish satellite assembly, as shown in the following command line:

al.exe -out:es-ES\HelloAll.Resources.dll -c:es-ES 
     -embed:es-ES\MyStrings.es-ES.resources

The assembly linker supports a switch, -c:<culture-name>, to specify the culture for the assembly, as shown in the command line.

For proper lookup during runtime, the root filename of the satellite assembly should be the same as that of the main assembly and the extension should be marked .Resources.dll. For example, the satellite assembly for HelloAll.exe would be HelloAll.Resources.dll. The filename is not case sensitive.

A culture string can also be assigned in the source code using the AssemblyCultureAttribute (namespace System.Reflection) assembly-level attribute (attributes are covered in the next chapter). Here is an example:

// Set assembly's culture to U.S. English
[assembly: AssemblyCulture("en-US")]

The main assembly is typically culture neutral and hence should not be assigned any culture string.

To build the main assembly, the following command can be used:

csc -res:en-US\MyStrings.en-US.resources,MyStrings.resources 
     HelloAll.cs

Note the comma-separated syntax for the resource being embedded; the first part is the actual filename of the resource and the second part is the name given to the resource when stored in the assembly. This is because of the way the resource manager looks up a resource.

To understand the lookup algorithm, it would help us to make the following assumptions:

  1. The name of the main assembly is HelloAll.exe.

  2. The root name of the resource file is MyStrings. In the code, this is the first parameter to the resource manager's constructor.

  3. The current culture is es-ES.

  4. The identifier for the string to be loaded is Greeting.

Here is how the resource manager performs the lookup:

  1. Try to locate culture-specific HelloAll.Resources.Dll in the GAC.

  2. Try to locate HelloAll.Resources.Dll in the es-ES subdirectory.

  3. Verify whether the culture of the assembly is es-ES.

  4. Try to locate the resource file MyStrings.es-ES.resources from the satellite assembly.

  5. Try to load the named resource, Greeting, from MyStrings.es-ES.resources.

  6. If any of these steps fail, try to locate the culture-neutral resource file MyStrings.resources from the main assembly.

  7. If Step 6 fails, throw an exception of type MissingManifestResourceException.

  8. Otherwise, try to load the named resource, Greeting. If the resource is not found, return a null reference.

In this algorithm, the filenames and directory names are all case-insensitive.
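
The order in which cultures are tried can be explored with the CultureInfo class. As far as I know, the resource manager's fallback follows the CultureInfo.Parent chain (for example es-ES, then es, then the culture-neutral resources), which is slightly more detailed than the steps listed above. The following is a minimal sketch under that assumption; the class name CultureFallback and the helper GetFallbackChain are hypothetical, and the code only prints culture names without loading any resources.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CultureFallback
{
    // Returns the culture names from the given culture up to the invariant
    // (culture-neutral) culture, following CultureInfo.Parent.
    public static List<string> GetFallbackChain(string cultureName)
    {
        List<string> chain = new List<string>();
        CultureInfo culture = new CultureInfo(cultureName);
        while (!culture.Equals(CultureInfo.InvariantCulture))
        {
            chain.Add(culture.Name);
            culture = culture.Parent;
        }
        // The invariant culture has an empty name; use a readable label.
        chain.Add("(invariant)");
        return chain;
    }

    public static void Main()
    {
        foreach (string name in GetFallbackChain("es-ES"))
        {
            Console.WriteLine(name);
        }
    }
}
```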

There are ways to customize this behavior. For example, there is an overloaded method GetString that can be used to load a string from a specific culture, irrespective of the current cultural settings. As a matter of fact, even the current culture can be changed, as shown in the following code excerpt, taken from HelloAll.cs:

// Project SatelliteAssemblies

using System;
using System.Globalization;
using System.Threading;

class MyApp
{
     public static void Main(String[] args) {
       if (args.Length == 1) {
         String sCulture = args[0];
         CultureInfo culture = new CultureInfo(sCulture);
         Thread.CurrentThread.CurrentUICulture = culture;
       }

       ...
     }
}

.NET classifies the properties of a culture into two groups—CurrentCulture and CurrentUICulture. The first one is used for sorting and formatting purposes and the second one is used for user-interface purposes. This distinction was created, for example, to support large enterprises that want their employees to be able to use the local language for the user interface but always have currencies and dates formatted the same way. The resources are loaded (by ResourceManager) using the CurrentUICulture setting (unless a specific culture setting is explicitly requested in the call). Each thread within an application can have a different culture. The preceding code sets the culture of the current thread based on the command-line arguments passed. Run the program as follows to display the greeting in Spanish:

HelloAll.exe es-ES
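
The split between the two settings can be observed directly. The following is a minimal sketch, not taken from the sample project: the class name CultureSplit and the helper FormatNumber are hypothetical. It sets the two properties to different cultures and formats a number, which depends on CurrentCulture only.

```csharp
using System;
using System.Globalization;
using System.Threading;

public static class CultureSplit
{
    // Formats a number while the thread uses one culture for formatting
    // and another (possibly different) culture for user-interface resources.
    public static string FormatNumber(double value, string formatCulture, string uiCulture)
    {
        Thread thread = Thread.CurrentThread;
        CultureInfo savedCulture = thread.CurrentCulture;
        CultureInfo savedUICulture = thread.CurrentUICulture;
        try
        {
            thread.CurrentCulture = new CultureInfo(formatCulture);
            thread.CurrentUICulture = new CultureInfo(uiCulture);
            // ToString consults CurrentCulture; CurrentUICulture plays no part here.
            return value.ToString("N2");
        }
        finally
        {
            // Restore the thread's original settings.
            thread.CurrentCulture = savedCulture;
            thread.CurrentUICulture = savedUICulture;
        }
    }

    public static void Main()
    {
        // U.S. formatting with a Spanish user interface, then the reverse.
        Console.WriteLine(FormatNumber(1234.5, "en-US", "es-ES"));
        Console.WriteLine(FormatNumber(1234.5, "es-ES", "en-US"));
    }
}
```

In the first call the number is formatted using U.S. conventions even though the user-interface culture is Spanish; a ResourceManager used on the same thread would look for es-ES resources.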

Manifest Tables

Internally, the metadata is organized as a block of binary data that consists of several tables. These tables can be broadly classified into two groups—type metadata tables and manifest metadata (or simply manifest) tables. The type metadata contains information on each type within the module. The manifest metadata contains some record-keeping information.

Each module in the assembly stores type metadata tables as well as manifest tables. Recall that there is one module within the assembly with a manifest that contains some extra information pertaining to the assembly as a whole. The module is called the prime module and the manifest it stores is referred to as the assembly manifest.

When dealing with assemblies, it is not really that important to know the manifest of each module in the assembly. What is important is to know the contents of the assembly manifest. Table 3.3 describes some important tables in the assembly manifest.

In general, unless explicitly stated, a manifest refers to the assembly manifest.

Table 3.3. Assembly Manifest Tables

Table Name             Description
AssemblyDef            Contains a single entry containing the assembly's name, version, culture, and so on.
FileDef                Contains one entry for each module and resource file that is part of the assembly.
ManifestResourceDef    Contains information on each resource that is part of the assembly.
ExportedTypeDef        Contains one entry for each public type exported from all the modules in the assembly.
AssemblyRef            Contains one entry for each assembly referenced by the module. Note that each module individually defines the list of referenced assemblies.
ModuleRef              Contains one entry identifying the module. It includes the module's filename and its MVID.

All the assembly manifest tables listed in Table 3.3, except ExportedTypeDef, can also be found in the manifest for other (nonprime) modules in the assembly. For a nonprime module, the AssemblyDef table does not contain any entries.

Storing Assembly References

When an application is built, the compiler stores the name, the version number, the culture, and the public key token (if any) of the referenced assemblies in the AssemblyRef table of the manifest.

It is interesting to note that the AssemblyRef table stores the public key token (instead of the public key) for each referenced assembly. This is done to conserve file space.


It is interesting to learn the internals of building a strong-named assembly.

The FileDef metadata table of the assembly's manifest contains the list of files that make up the assembly. As each file is added to the manifest, the file's content is hashed and the hash value is stored along with the file name. The default hash algorithm is SHA-1 but can be changed in two ways: using the AssemblyAlgorithmIdAttribute (namespace System.Reflection) attribute or with al.exe's -algid switch.
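
To get a feel for the per-file hash, the following minimal sketch computes a SHA-1 hash over a file's contents. It only illustrates the hashing step, not how the compiler or al.exe records the value in the FileDef table; the class name FileHashSketch, the helper HashFile, and the temporary file are hypothetical.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class FileHashSketch
{
    // Computes the SHA-1 hash of a file's entire contents.
    public static byte[] HashFile(string path)
    {
        using (SHA1 sha1 = SHA1.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            return sha1.ComputeHash(stream);
        }
    }

    public static void Main()
    {
        // Hypothetical stand-in for a module file.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "module contents");

        byte[] hash = HashFile(path);
        // A SHA-1 hash is always 160 bits (20 bytes) long.
        Console.WriteLine("{0} bytes: {1}", hash.Length, BitConverter.ToString(hash));

        File.Delete(path);
    }
}
```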

Once the PE file containing the assembly manifest (the prime module) has been built, the PE file's entire content is hashed. The hash algorithm used here is always SHA-1 and cannot be changed (although this may change in later releases). This hash value (160 bits, or 20 bytes, in size) is signed with the specified private key and the resulting Rivest-Shamir-Adleman (RSA) digital signature is stored in a reserved section (not included in the hash) within the PE file. Another section of the PE file, called the .NET runtime header, is updated to reflect the location where the digital signature is embedded within the file.

The specified public key is also embedded in the manifest. The public–private key mechanism guarantees that no other company can produce an assembly with the same public key, as long as your company does not share the key pair with others. Those interested in learning more about the public–private key and the RSA digital signature can see the MSDN documentation on cryptographic APIs (CryptoAPI).

The bottom line of this whole process is that it provides the common language runtime a foolproof way to ensure that a shared assembly has not been tampered with. When the assembly is being installed into the GAC, the system hashes the PE file's contents and compares the hash value with the RSA digital signature embedded in the file (after it is unsigned with the public key). If the values are identical, then the file's content has not been tampered with. This is a very fast check. Similar strategies have been used for signing e-mails.

In the case of a multimodule assembly, the integrity check is performed only on the module that contains the manifest. For all other modules, the check is performed when the module gets loaded at runtime.

Note that only a strong-named assembly can be installed into the GAC. Attempts to install any other assembly result in an error.

It should be noted that the strong-name mechanism only guarantees that an assembly, once created, has not been tampered with. It doesn't tell you who the publisher of the assembly is. If the publisher wants to associate its identity with the assembly, then the publisher must use Microsoft's Authenticode technology. Covering Authenticode is beyond the scope of this book.

Delayed Signing

Once a public–private key pair is generated, the private key should never be compromised. Many companies prefer that the private key be accessed only by a few privileged people in the company. The public key can be freely distributed.

The inability to access the private key could be a huge burden while developing and testing the assembly. Fortunately, the framework provides a mechanism to develop an assembly without using a private key at all; just the public key is sufficient. This mechanism is called delayed signing.

The Strong Name tool (sn.exe) provides a switch, -p, to extract the public key from a strong-named file (the file containing the public–private key pair). The following command, for example, extracts the public key from MyKey.snk and stores it in MyKey.public:

sn.exe -p MyKey.snk MyKey.public

MyKey.snk can now be stored away in a safe place and the public key file is distributed to the developers.

The next step is to embed the public key information (using the familiar AssemblyKeyFile attribute) and indicate to the compiler that the signing of the assembly is being delayed. This is done by means of the assembly-level attribute AssemblyDelaySign. Relevant code is shown here (Project DelayedSigning):

[assembly: AssemblyKeyFile("MyKey.public")]
[assembly: AssemblyDelaySign(true)]

Obtaining Public Key Token

Here is a quick way to obtain the public key token from a strong-named file; that is, the file containing the key pair. First extract the public key from the key pair using sn.exe and save it in a file. For example, the following command extracts the public key from MyKey.snk and saves it in MyKey.public:

sn.exe -p MyKey.snk MyKey.public

Next, run sn.exe -t on the public key file, as shown here:

sn.exe -t MyKey.public

The command displays the public key token in the console window.

Note that running the command on the original file that contains both the public and the private key also generates a public key token. However, this value is bogus because the file does not store any extra information to indicate that it contains something besides the public key. As a result, sn.exe cannot determine if the file contains extra information. It simply runs through the bits and returns a result.
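
The token can also be derived in code. To my knowledge, the public key token is formed from the last 8 bytes of the SHA-1 hash of the public key, in reverse order. The following is a minimal sketch under that assumption; the class name PublicKeyTokenSketch and the helper ComputeToken are hypothetical, and the key is taken from the assembly that defines System.Object rather than from MyKey.public.

```csharp
using System;
using System.Reflection;
using System.Security.Cryptography;

public static class PublicKeyTokenSketch
{
    // Derives an 8-byte token from a public key: hash the key with SHA-1,
    // then take the last 8 bytes of the hash in reverse order.
    public static byte[] ComputeToken(byte[] publicKey)
    {
        byte[] hash;
        using (SHA1 sha1 = SHA1.Create())
        {
            hash = sha1.ComputeHash(publicKey);
        }

        byte[] token = new byte[8];
        for (int i = 0; i < token.Length; i++)
        {
            token[i] = hash[hash.Length - 1 - i];
        }
        return token;
    }

    public static void Main()
    {
        // Hypothetical input: the public key of the core library assembly.
        AssemblyName name = typeof(object).Assembly.GetName();
        byte[] publicKey = name.GetPublicKey();
        if (publicKey == null || publicKey.Length == 0)
        {
            Console.WriteLine("The assembly does not carry a public key.");
            return;
        }

        Console.WriteLine("Computed: {0}", BitConverter.ToString(ComputeToken(publicKey)));
        Console.WriteLine("Reported: {0}", BitConverter.ToString(name.GetPublicKeyToken()));
    }
}
```

If the assumption holds, the two printed values match.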


When the source is compiled, the compiler embeds the public key information (in the AssemblyDef table) so that other assemblies that reference this assembly can generate and use the public key token. In addition, the compiler leaves enough space in the resulting PE file for the RSA digital signature (the compiler can determine how much space is needed).

Note that, instead of using the assembly-level attributes, delayed signing can also be accomplished by using al.exe's -keyfile and -delaysign switches.

At this point, the resulting assembly does not have a valid signature. If we try to install this assembly in the GAC, the .NET Framework assumes that the assembly has been tampered with and rejects it. To force the runtime to accept the assembly, you must tell it to skip the verification of this assembly. This is accomplished using sn.exe with the -Vr switch, as shown here:

sn.exe -Vr ConsoleGreeting.dll

A bit of a warning is in order. You should never do something like this with an assembly you don't know about.

The assembly can now be installed in the GAC:

gacutil.exe -i ConsoleGreeting.dll

At this point, you can go ahead and test your application as normal.

Note that the -Vr switch does not actually modify the assembly. Instead, it adds the specified assembly's strong name to a list of assemblies for which verification should be skipped on the local machine. The list is stored as a set of subkeys under the registry key HKLM\Software\Microsoft\StrongName\Verification. If you plan to install the assembly on a different machine, you need to run sn.exe -Vr on the file once again on the new machine.

Once you are ready to ship the assembly, it can be signed with the original private key, as shown here:

sn.exe -R ConsoleGreeting.dll MyKey.snk

Reflection

All .NET-compliant compilers are required to generate full metadata information about every class type in the compiled source code file. Among other things, this metadata contains a list of modules in the assembly, a declaration for each type in the assembly, the base class and the interfaces the type inherits from, the name and type (methods, properties, events, and fields) of all its members, and the signature of each of the methods. Figure 3.4 shows a simplified layout of the information contained in the type metadata.

Figure 3.4. Metadata hierarchy within an assembly.


The type metadata is organized in a hierarchical fashion:

  • An assembly contains a list of modules.

  • A module contains a list of types and global methods. Note that C# does not allow defining global methods. A method, even if defined as static, still needs to be defined as part of a type.

  • A type contains a list of methods, a list of fields, a list of properties, and a list of events.

  • A method contains a list of parameters and a return type.

Not shown in Figure 3.4 is that all the elements of the code including assembly, module, type, method, and so on, also contain a list of associated attributes.

Here is the sample code that corresponds to the type Calculator used in Figure 3.4. The code is compiled into an assembly named MyCalculator.exe:

// Project Reflection/MyCalculator

using System;

namespace MyCompany
{
     public class Calculator
     {
       public int Add(int x, int y) {
         return (x+y);
       }

       public static void Main() {
         Calculator calc = new Calculator();
         int sum = calc.Add(10, 20);
         Console.WriteLine(sum);
       }
     }
}

The .NET Framework provides a set of classes under the namespace System.Reflection to inspect the metadata. The following code excerpt illustrates how you can load an assembly and obtain information on modules, types, and methods contained in the assembly:

// Project Reflection/MetadataViewer
// Note: Code has been modified slightly for easy reading

public static void DumpAssembly(String assemblyName) {
     Assembly assembly = Assembly.Load(assemblyName);
     Console.WriteLine("Assembly={0}", assembly.FullName);

     // Get the list of modules in the assembly
     Module[] moduleList = assembly.GetModules();
     foreach(Module mod in moduleList) {
       Console.WriteLine("Module={0}", mod.Name);

       // Get a list of types in the module
       Type[] typeList = mod.GetTypes();
       foreach(Type type in typeList) {
         Console.WriteLine(" Type={0}", type.Name);

         // Get a list of methods in the type
         MethodInfo[] methodList = type.GetMethods();
         foreach(MethodInfo method in methodList) {
           Console.Write("  Method= {0} {1}(",
             method.ReturnType, method.Name);
           ParameterInfo[] pL = method.GetParameters();
           for(int i=0;i<pL.Length;i++) {
             // Separate consecutive parameters with a comma
             if (i > 0) Console.Write(", ");
             Console.Write("{0} {1}",
               pL[i].ParameterType, pL[i].Name);
           } // for each param
           Console.WriteLine(")"); // close the parameter list
         } // for each method
       } // for each type
     } // for each module
}

Information on the classes Assembly, Module, Type, MethodInfo, and ParameterInfo can be obtained from the SDK documentation. Each of these classes encapsulates a specific element of the code.

Here is the output when the program is run against MyCalculator.exe:

Assembly=MyCalculator, Version=0.0.0.0, Culture=neutral, 
     PublicKeyToken=null
Module=mycalculator.exe
  Type=Calculator
     Method= System.Int32 GetHashCode()
     Method= System.Boolean Equals(System.Object obj)
     Method= System.String ToString()
     Method= System.Int32 Add(System.Int32 x, System.Int32 y)
     Method= System.Void Main()
     Method= System.Type GetType()

As can be seen, assembly MyCalculator.exe contains one module. This module contains type Calculator, which defines methods Add and Main, as expected.

You may be wondering where the other four methods on Calculator came from. These methods are defined by class System.Object. Recall that every type under .NET directly or indirectly inherits from System.Object.
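
If only the methods declared by the type itself are of interest, the inherited ones can be filtered out with binding flags. The following is a minimal sketch: the class DeclaredMethods and its helper are hypothetical, and Calculator is a cut-down stand-in for the sample type (without the namespace and without Main).

```csharp
using System;
using System.Reflection;

// Cut-down stand-in for the sample's Calculator type.
public class Calculator
{
    public int Add(int x, int y)
    {
        return x + y;
    }
}

public static class DeclaredMethods
{
    // Returns the names of the public methods declared directly on the type,
    // skipping those inherited from base classes such as System.Object.
    public static string[] GetDeclaredMethodNames(Type type)
    {
        MethodInfo[] methods = type.GetMethods(
            BindingFlags.Public | BindingFlags.Instance |
            BindingFlags.Static | BindingFlags.DeclaredOnly);

        string[] names = new string[methods.Length];
        for (int i = 0; i < methods.Length; i++)
        {
            names[i] = methods[i].Name;
        }
        return names;
    }

    public static void Main()
    {
        foreach (string name in GetDeclaredMethodNames(typeof(Calculator)))
        {
            Console.WriteLine("Method={0}", name);
        }
    }
}
```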

Using Reflection, it is also possible to invoke a method programmatically. In the following code excerpt, method Add on an instance of Calculator is invoked dynamically:

// Project Reflection/MetadataViewer

public static void MethodInvokeDemo() {
     Type t =
       Type.GetType("MyCompany.Calculator, MyCalculator");
     object calc = Activator.CreateInstance(t);

     MethodInfo mi = t.GetMethod("Add");
     object[] pL = new object[] { 10 /*x*/, 20 /*y*/};
     int sum = (int) mi.Invoke(calc, pL);
     Console.WriteLine("Sum={0}", sum);
}

Under .NET, just as an assembly can be represented by a display name, a type can also be represented by a display name. The syntax for the display name for a type is:

Namespace.TypeName <,assembly name>

Given the display name of a type, the type can be loaded by calling the static method Type.GetType, and an instance of the type (Calculator in our case) can be created by calling the static method System.Activator.CreateInstance. The rest of the code obtains the information on the method that we are interested in invoking, packs the method parameters in an array, and invokes the method on the Calculator instance.
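
The display name of a type that is already loaded is available through the Type.AssemblyQualifiedName property. The following is a minimal sketch that obtains the name and loads the type again from it; the class TypeNameRoundTrip, its helper Reload, and the nested Sample type are hypothetical.

```csharp
using System;

public static class TypeNameRoundTrip
{
    // Hypothetical type used only to demonstrate the round trip.
    public class Sample
    {
    }

    // Loads a type again from its assembly-qualified display name.
    public static Type Reload(Type type)
    {
        return Type.GetType(type.AssemblyQualifiedName);
    }

    public static void Main()
    {
        Type original = typeof(Sample);
        Console.WriteLine(original.AssemblyQualifiedName);

        Type reloaded = Reload(original);
        Console.WriteLine("Same type: {0}", original == reloaded);

        // Create an instance from the reloaded type, as in the text.
        object instance = Activator.CreateInstance(reloaded);
        Console.WriteLine(instance.GetType().FullName);
    }
}
```

Note that the assembly-qualified name includes the full assembly display name (version, culture, and public key token), whereas the example in the text uses only the simple assembly name.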

This concludes our basic introduction to Reflection under .NET. For other possibilities with Reflection and metadata inspection, check the SDK documentation as well as the SDK samples. You may also wish to check out the System.Reflection.Emit namespace; it contains classes to allow generating metadata and MSIL instructions programmatically and optionally generate a PE file on the disk.
