This chapter discusses the organization, deployment, and execution of assemblies and modules. It also provides a detailed examination of the metadata segment responsible for assembly and module identity and interaction: the manifest. As you might recall from Chapter 1, an assembly can include several modules. Any module of a multimodule assembly can—and does, as a rule—carry its own manifest, but only one module per assembly carries the manifest that contains the assembly’s identity. This module is referred to as the prime module. Thus, each assembly, whether multimodule or single-module, contains only one prime module.
What Is an Assembly?
An assembly is a deployment unit, a building block of a managed application. Assemblies are reusable, allowing different applications to use the same assembly. Assemblies carry a full self-description in their metadata, including version information that allows the common language runtime to use a specific version of an assembly for a particular application.
This arrangement eliminates what’s known as DLL Hell, the situation created when upgrading one application renders another application inoperative because both happen to use identically named DLL(s) of different versions.
Assemblies are classified as either private or shared. Structurally and functionally, these two kinds of assemblies are the same, but they differ in how they are named and deployed and in the level of version checks performed by the loader.
A private assembly is considered part of a particular application, not intended for use by other applications. A private assembly is deployed in the same directory as the application or in a subdirectory of this directory. This kind of deployment shields the private assembly from other applications, which should not have access to it.
Being part of a particular application, a private assembly is usually created by the same author (person, group, or organization) as other components specific to this application and is thus considered to be primarily the author’s responsibility. Consequently, naming and versioning requirements are relaxed for private assemblies, and the common language runtime does not enforce these requirements. The name of a private assembly must be unique within the application.
A shared assembly is not part of a particular application and is designed to be used widely by various applications. Shared assemblies are usually authored by groups or organizations other than those responsible for the applications that use these assemblies. A prominent example of shared assemblies is the set of assemblies constituting the .NET Framework class library.
As a result of such positioning, the naming and versioning requirements for shared assemblies are much stricter than those for private assemblies. Names of shared assemblies must be globally unique. Additional assembly identification is provided by strong names, which use cryptographic public/private key pairs to ensure the strong name’s uniqueness and to prevent name spoofing. The central part of the strong name is the strong name signature (mentioned in Chapter 5)—a hash of the assembly’s prime module encrypted with the publisher’s private key. Assembly metadata carries the publisher’s public key, which is used to verify the strong name signature. A strong name also provides the consumer of the shared assembly with information about the identity of the assembly publisher. If the common language runtime cryptographic checks pass, the consumer can be sure that the assembly comes from the expected publisher, assuming that the publisher’s private encryption key was not compromised.
Shared assemblies are deployed into the machine-wide repository called global assembly cache (GAC). The GAC stores multiple versions of shared assemblies side by side. The loader looks for the shared assemblies in the GAC.
Under some circumstances, an application might need to deploy a shared assembly in its directory to ensure that the appropriate version is loaded. In such a case, the shared assembly is being used as a private assembly, so it is not in fact shared, whether it is strong named or not.
Application Domains As Logical Units of Execution
Operating systems and runtimes typically provide some form of isolation between applications running on the system. This isolation is necessary to ensure that code running in one application cannot adversely affect other, unrelated applications. In modern operating systems, this isolation is achieved by using hardware-enforced process boundaries, where a process, occupying a unique virtual address space, runs exactly one application and scopes the resources that are available for that process to use.
Managed code execution has similar needs for isolation. Such isolation can be provided at a lower cost in a managed application, however, considering that managed applications run under the control of the common language runtime and are verified to be type-safe.
The runtime allows multiple applications to be run in a single operating system process, using a construct called an application domain to isolate the applications from one another. Since all memory allocation requested by an application is done by the CLR, it is easy for the CLR to give an application access to only those objects that were allocated by the application and to block the application’s attempts to access objects allocated in another application domain. In many respects, application domains are the CLR equivalent of an operating system process.
Specifically, isolation in managed applications means the following:
The following examples describe scenarios in which it is useful to run multiple applications in the same process:
Hosting environments such as ASP.NET or Internet Explorer need to run managed code on behalf of the user and take advantage of the application isolation features provided by application domains. In fact, it is the host that determines where the application domain boundaries lie and in what domain user code is run, as these examples show:
Since isolation demands that the code or resources of one application must not be directly accessible from code running in another application, no direct calls are allowed between objects in different application domains. Cross-domain communications are limited to either copying objects or creating special proxy objects, which are the object’s “representatives” in other domains, giving the code in other domains access to instance fields and methods of the object. In regard to cross-domain communications, the objects fall into one of the following three categories:
The CLR relies on the verifiable type safety of the code (discussed in Chapter 13) to provide fault isolation between domains at a much lower cost than that incurred by the process isolation used in operating systems. The isolation is based on static type verification, and as a result, the hardware ring transitions or process switches are not necessary.
Manifest
The metadata that describes an assembly and its modules is referred to as a manifest. The manifest carries the following information:
Figure 6-1 shows the mutual references that take place between the metadata tables constituting the manifest.
Figure 6-1. Mutual references between the manifest’s metadata tables
Assembly Metadata Table and Declaration
The Assembly metadata table contains at most one record, which appears in the prime module’s metadata. The table has the following column structure:
In ILAsm, the Assembly is declared in the following way (for example):
.assembly mscorlib
{
.publickey = (00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 )
.hash algorithm 0x00008004
.ver 2:0:0:0
}
The ILAsm syntax of the Assembly declaration is
.assembly <flags> <name> { <assemblyDecl>* }
where <flags> ::=
<none> // (0x0000) Assembly cannot be retargeted,
// platform is defined by PE and CLR headers.
| cil // (0x0010) Assembly is pure MSIL
| x86 // (0x0020) Assembly is x86-specific
| ia64 // (0x0030) Assembly is IA64-specific
| amd64 // (0x0040) Assembly is AMD64-specific
| arm // (0x0050) Assembly is ARM-specific (introduced in v4.5)
| retargetable // (0x0100) Assembly can be retargeted
| windowsruntime // (0x0200) Assembly is a Windows Runtime metadata stub
// or managed app (introduced in v4.5)
and <assemblyDecl> ::=
.hash algorithm <int32> // Set hash algorithm ID
| .ver <int32>:<int32>:<int32>:<int32> // Set version numbers
| .publickey = ( <bytes> ) // Set public encryption key
| .locale <quotedString> // Set assembly culture
| <securityDecl> // Set requested permissions
| <customAttrDecl> // Define custom attribute(s)
In this declaration, <int32> denotes an integer number, 4 bytes in size. The notation <bytes> represents a sequence of two-digit hexadecimal numbers, each representing 1 byte; this form, bytearray, is often used in ILAsm to represent binary objects of arbitrary size. Finally, <quotedString> denotes, in general, a composite quoted string, such as a construct like "ABC"+"DEF"+"GHI". The concatenation with the plus sign is useful for defining very long strings, although in this case we don’t need concatenation for strings such as en-US or nl-BE.
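The two notations can be illustrated with a small Python sketch. It is not part of ILAsm or of any .NET tool; the function names parse_bytearray and join_composite are hypothetical, and the string handling is deliberately simplified (it assumes the quoted parts contain no plus signs or escaped quotes).

```python
import re

def parse_bytearray(text: str) -> bytes:
    """Convert an ILAsm bytearray such as '(B7 7A 5C )' into raw bytes."""
    inner = text.strip()
    if not (inner.startswith("(") and inner.endswith(")")):
        raise ValueError("bytearray must be enclosed in parentheses")
    tokens = inner[1:-1].split()
    if any(re.fullmatch(r"[0-9A-Fa-f]{2}", t) is None for t in tokens):
        raise ValueError("each byte must be written as two hexadecimal digits")
    return bytes(int(t, 16) for t in tokens)

def join_composite(text: str) -> str:
    """Concatenate a composite quoted string such as '"ABC"+"DEF"'."""
    # Simplification: the parts must not contain '+' or escaped quotes.
    parts = [p.strip() for p in text.split("+")]
    if any(len(p) < 2 or p[0] != '"' or p[-1] != '"' for p in parts):
        raise ValueError("every part must be a quoted string")
    return "".join(p[1:-1] for p in parts)
```

With these helpers, the eight bytes of a public key token and the three-part string mentioned above reduce to an ordinary bytes object and an ordinary string.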
AssemblyRef Metadata Table and Declaration
The AssemblyRef (assembly reference) metadata table defines the external dependencies of an assembly or a module. Both prime and nonprime modules can—and do, as a rule—contain this table. The only assembly that does not depend on any other assembly, and hence has an empty AssemblyRef table, is Mscorlib.dll, the root assembly of the .NET Framework class library.
The column structure of the AssemblyRef table is as follows:
In ILAsm, an AssemblyRef is declared in the following way (for example):
.assembly extern mscorlib
{
.publickeytoken = (B7 7A 5C 56 19 34 E0 89 )
.ver 2:0:0:0
}
The ILAsm syntax for an AssemblyRef declaration is
.assembly extern <flags> <name> { <assemblyRefDecl>* }
where <flags> ::=
<none> // (0x0000) Must be so for versions older than 4.5.
| cil // (0x0010) Assembly is pure MSIL
| x86 // (0x0020) Assembly is x86-specific
| ia64 // (0x0030) Assembly is IA64-specific
| amd64 // (0x0040) Assembly is AMD64-specific
| arm // (0x0050) Assembly is ARM-specific (introduced in v4.5)
| windowsruntime // (0x0200) Assembly is a Windows Runtime metadata stub
// or managed app (introduced in v4.5)
and <assemblyRefDecl> ::=
.ver <int32>:<int32>:<int32>:<int32> // Set version numbers
| .publickey = ( <bytes> ) // Set public encryption key
| .publickeytoken = ( <bytes> ) // Set public encryption key token
| .locale <quotedString> // Set assembly locale (culture)
| .hash = ( <bytes> ) // Set hash value
| <customAttrDecl> // Define custom attribute(s)
As you might have noticed, ILAsm does not provide a way to set the flags in the AssemblyRef declaration except flags specific to v4.5 or later. The explanation is simple: in older versions, the only flag relevant to an AssemblyRef is the flag indicating whether the AssemblyRef carries a full unhashed public encryption key, and this flag is set only when the .publickey directive is used.
When referencing a strong-named assembly, you are required to specify .publickeytoken (or .publickey, which is rarely used in AssemblyRefs) and .ver. The only exception to this rule among the strong-named assemblies is Mscorlib.dll.
If .locale is not specified, the referenced assembly is presumed to be “culture neutral.”
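The relationship between .publickey and .publickeytoken can be sketched in Python. The sketch relies on the commonly documented rule that the token is the last 8 bytes of the SHA-1 hash of the full public key, taken in reverse order; the function names public_key_token and as_bytearray are hypothetical and exist only for illustration.

```python
import hashlib

def public_key_token(public_key: bytes) -> bytes:
    """Return the 8-byte token derived from a full public key blob."""
    digest = hashlib.sha1(public_key).digest()  # 20-byte SHA-1 hash
    return digest[-8:][::-1]                    # last 8 bytes, reversed

def as_bytearray(data: bytes) -> str:
    """Format bytes in the ILAsm bytearray style, e.g. '(B7 7A )'."""
    return "(" + "".join(f"{b:02X} " for b in data) + ")"
```

Because the token is much shorter than the key it is derived from, it is the more convenient form for AssemblyRef declarations, which is consistent with .publickey being rarely used there.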
An interesting situation arises when you need to use two or more versions of the same assembly side by side. An assembly is identified by its name, version, public key (or public key token), and culture. It would be extremely cumbersome to list all these identifications every time you reference an assembly: “I want to call method Bar of class Foo from assembly SomeOtherAssembly, and I want the version number such-and-such, the culture nl-BE, and....” Of course, if you didn’t need to use different versions side by side, you could simply refer to an assembly by name.
ILAsm provides an AssemblyRef aliasing mechanism to deal with such situations. The AssemblyRef declaration can be extended as shown here:
.assembly extern <flags> <name> as <alias> { <assemblyRefDecl>* }
Whenever you need to reference this assembly, you can use its <alias>, as shown in this example:
.assembly extern SomeOtherAssembly as OldSomeOther
{ .ver 1:1:1:1 }
.assembly extern SomeOtherAssembly as NewSomeOther
{ .ver 1:3:2:1 }
...
call int32 [OldSomeOther]Foo::Bar(string)
...
call int32 [NewSomeOther]Foo::Bar(string)
...
The alias is not part of metadata. Rather, it is simply a language tool, needed to identify a particular AssemblyRef among several identically named AssemblyRefs. The IL disassembler generates aliases for AssemblyRefs whenever it finds identically named AssemblyRefs in the module metadata.
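How a tool might hand out aliases for identically named AssemblyRefs can be sketched as follows. This is an illustration only: the AssemblyRef class, the assign_aliases function, and the name_N naming scheme are assumptions made for the example, not the scheme the IL disassembler actually uses.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class AssemblyRef:
    name: str
    version: tuple  # (major, minor, build, revision)

def assign_aliases(refs):
    """Map each AssemblyRef to the name used for it in the source text."""
    totals = Counter(r.name for r in refs)
    seen = Counter()
    aliases = {}
    for ref in refs:
        if totals[ref.name] == 1:
            aliases[ref] = ref.name  # unique name: no alias needed
        else:
            seen[ref.name] += 1
            # Hypothetical scheme: append a running number to the name.
            aliases[ref] = f"{ref.name}_{seen[ref.name]}"
    return aliases
```

The point of the sketch is that aliases are needed only where names collide, and that they live in the source text rather than in the AssemblyRef records themselves.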
Autodetection of Referenced Assemblies
Version 2.0 of the IL assembler introduced a way to reference the assemblies without specifying their version, public key token, and other attributes:
.assembly extern <name> as <alias> { auto }
When the keyword auto is specified, the ILAsm compiler queries the GAC and tries to find an assembly with the specified name. If it succeeds, it reads the assembly attributes (version, public key, culture) and puts these attributes into the generated AssemblyRef metadata record.
Note that the autodetection feature works only for referenced assemblies installed in the GAC.
The referenced assembly attributes may be partially specified and combined with autodetection, thus narrowing the search; for example:
.assembly extern OtherAssembly { .ver 1:3:*:* auto }
This directive prompts the IL assembler to query the GAC for an assembly named OtherAssembly whose major version number is 1 and whose minor version number is 3, with any build and revision numbers. If such an assembly is found in the GAC, its missing attributes are retrieved and put into the respective entries of the AssemblyRef record.
If more than one assembly matching the search criteria is found, the one with the highest version is taken.
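The selection rule just described (match the specified components, accept anything for the wildcards, prefer the highest version) can be sketched in Python. The list of installed versions stands in for a GAC query, and all function names are hypothetical.

```python
def parse_version_pattern(text):
    """Turn '1:3:*:*' into (1, 3, None, None); None means 'any value'."""
    parts = text.split(":")
    if len(parts) != 4:
        raise ValueError("a version has exactly four components")
    return tuple(None if p == "*" else int(p) for p in parts)

def matches(pattern, version):
    """True if every specified component equals the candidate's component."""
    return all(p is None or p == v for p, v in zip(pattern, version))

def pick_version(pattern_text, installed):
    """Return the highest installed version matching the pattern, or None."""
    pattern = parse_version_pattern(pattern_text)
    candidates = [v for v in installed if matches(pattern, v)]
    # Tuples compare component by component, so max() yields the highest version.
    return max(candidates, default=None)
```

For example, with versions 1.2.9.9, 1.3.0.5, 1.3.2.1, and 2.0.0.0 installed, the pattern 1:3:*:* selects 1.3.2.1.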
In this regard, the IL assembler differs from other managed compilers (VB, C#, VC++), as those compilers require the specification of referenced assemblies via the file path instead of querying the GAC. This might play a bad trick on a programmer, because the CLR loader always tries to load the assemblies from the GAC first (as is described in the next section), and in the unlikely event of a mismatch between referenced assemblies installed in the GAC and those specified by the file path, the application will be executed against assemblies different from those it was built against.
The Loader in Search of Assemblies
When you define an AssemblyRef in the metadata, you expect the loader to find exactly this assembly and load it into the application domain. Let’s have a look at the process of finding an external assembly and binding it to the referencing application.
Given an AssemblyRef, the process of binding to that assembly is influenced by these factors:
As illustrated in Figure 6-2, the loader performs the following steps to locate a referenced assembly.
Figure 6-2. Searching for a referenced assembly
The .NET Framework retrieves its configuration from a set of configuration files. Each file represents settings that have different scopes. For example, the configuration file supplied with the installation of the common language runtime has settings that can affect all applications that use that version of the CLR. The configuration file supplied with an application (application configuration file) has settings that affect only that one application; this configuration file resides in the application directory. A publisher policy file is supplied by the publisher of a shared assembly, and it contains information about the assembly compatibility and redirects an assembly reference to a new version of the shared component. A publisher policy file is usually issued when the shared component is updated by its publisher. The publisher policy settings take precedence over the settings of the application configuration file. The administrator policy file, Machine.config, resides in the Configuration subdirectory of the CLR installation directory. This file contains settings defined by the administrator for this machine and takes precedence over any other configuration file. Overrides specified in the Machine.config file affect all applications running on this machine and cannot be in turn overridden.
Note that starting with v4.0, the machine-wide policies described here are not enforced by the CLR (see Chapter 17 for details).
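The precedence among the three kinds of configuration files can be sketched as a chain of version redirects. This is a simplified model, assuming that each file boils down to a plain old-version-to-new-version map for one assembly; the function name resolve_version and the dictionary representation are hypothetical, and real policy files carry more settings than this.

```python
def resolve_version(requested, app_config, publisher_policy, machine_config):
    """Apply version redirects stage by stage; later stages have the final say."""
    version = requested
    # Order reflects the precedence described in the text: the application
    # configuration first, then publisher policy, then Machine.config.
    for redirects in (app_config, publisher_policy, machine_config):
        version = redirects.get(version, version)
    return version
```

In this model a redirect in Machine.config is applied last, so nothing can override it, while a redirect in the application configuration file can still be changed by a publisher policy.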
In version 2.0 or later of the CLR running under a 64-bit operating system, the problems with assembly binding are exacerbated by the possible presence of both 32-bit and 64-bit versions of assemblies. To deal with the problem, the binding mechanism of the v2.0+ assembly loader uses the following classification of the assemblies:
This classification is called Processor Architecture and is an additional part of full assembly identity in versions 2.0+. The Processor Architecture is derived from the Machine entry of the COFF header, the type of the Optional NT header, and the two least significant bits (flags ILONLY and 32BITREQUIRED) of the CLR header flags (see Chapter 4 for details):
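One plausible reading of this derivation is sketched below in Python. The header constants are the standard PE and CLR header values; the decision rule itself is an assumption made for illustration (a 32-bit image for the I386 machine counts as platform agnostic only when ILONLY is set and 32BITREQUIRED is clear), and the function name processor_architecture is hypothetical.

```python
IMAGE_FILE_MACHINE_I386 = 0x014C
IMAGE_FILE_MACHINE_IA64 = 0x0200
IMAGE_FILE_MACHINE_AMD64 = 0x8664
PE32_MAGIC = 0x010B       # Optional NT header of a 32-bit image
PE32_PLUS_MAGIC = 0x020B  # Optional NT header of a 64-bit image
COMIMAGE_FLAGS_ILONLY = 0x1
COMIMAGE_FLAGS_32BITREQUIRED = 0x2

def processor_architecture(machine, optional_magic, clr_flags):
    """Classify an image from its COFF, Optional NT, and CLR header values."""
    if optional_magic == PE32_PLUS_MAGIC:
        if machine == IMAGE_FILE_MACHINE_AMD64:
            return "AMD64"
        if machine == IMAGE_FILE_MACHINE_IA64:
            return "IA64"
        return "unknown"
    if optional_magic == PE32_MAGIC and machine == IMAGE_FILE_MACHINE_I386:
        il_only = bool(clr_flags & COMIMAGE_FLAGS_ILONLY)
        needs_32bit = bool(clr_flags & COMIMAGE_FLAGS_32BITREQUIRED)
        # Assumed rule: pure IL without the 32-bit requirement is agnostic.
        return "MSIL" if il_only and not needs_32bit else "x86"
    return "unknown"
```

Under this rule, setting 32BITREQUIRED on an otherwise pure-IL image is what turns a platform-agnostic assembly into an x86-specific one.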
You should be careful declaring your assembly platform agnostic. To be truly platform agnostic, the assembly has to have no presumptions of pointer size, no unmanaged exports or imports, no embedded native code, and no thread-local storage (.tls section), and it has to reference no platform-specific assemblies or platform-specific unmanaged DLLs. The last condition is the worst of them all, because it is transitive. Many times developers have written an application (EXE) and declared it platform agnostic, only to discover that it crashed on 64-bit platforms: the application, being platform agnostic, created a 64-bit process and then tried to load a 32-bit specific referenced assembly into the 64-bit process. Kaboom! Or it tried to load a platform-agnostic assembly A, which in turn referenced assembly B, and B just happened to P/Invoke a 32-bit unmanaged DLL (see Chapter 18). Kaboom! The bright side of it is that such problems are usually discovered right away, not after the application has been shipped.
Versions 2.0 and later of the runtime consider all assemblies produced for versions 1.0 and 1.1 as 32-bit specific assemblies. It is only fair: versions 1.0 and 1.1 of the runtime did not support 64-bit platforms. The assemblies produced for versions 1.0 and 1.1 are identified by the metadata stream header (see Chapter 5); the version specified in this header is 1.0 for v1.0 and v1.1 assemblies and is 2.0 for v2.0+ assemblies.
Module Metadata Table and Declaration
The Module metadata table contains a single record that provides the identification of the current module. The column structure of the table is as follows:
Since only one entry of the Module record can be set explicitly (the Name entry), the module declaration in ILAsm is quite simple:
.module <name>
ModuleRef Metadata Table and Declaration
The ModuleRef metadata table contains descriptors of other modules referenced in the current module. The set of “other modules” includes both managed and unmanaged modules.
The relevant managed modules are the other modules of the current assembly. In ILAsm, they should be declared explicitly, and their declarations should be paired with File declarations (discussed in the following section). The IL assembler does not verify whether the referenced modules are present at compile time.
The unmanaged modules described in the ModuleRef table are simply unmanaged DLLs containing methods called from the current module using the platform invocation mechanism—P/Invoke, discussed in Chapter 18. These ModuleRef records usually are not paired with File records. They need not be explicitly declared in ILAsm because in ILAsm the DLL name is part of the P/Invoke specification, so the IL assembler emits respective ModuleRef records automatically.
There is one reason, however, to pair a ModuleRef record referring to an unmanaged module with a File record: you should do that if you want this unmanaged DLL to be part of your deployment. In this case, the unmanaged DLL will reside together with managed modules constituting your assembly, and it does not have to be on the path to be discovered.
A ModuleRef record contains only one entry, the Name entry, which is an offset in the #Strings stream. The ModuleRef declaration in ILAsm is not much more sophisticated than the declaration of Module:
.module extern <name>
As in the case of Module, <name> in ModuleRef is the name of the executable file with its extension but without a path, not exceeding 512 bytes in UTF-8 encoding.
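These naming constraints can be expressed as a small Python check. It is a sketch rather than the assembler's actual validation: the function name is hypothetical, the path test covers only the common separator characters, and the 512-byte limit is taken as inclusive and without a terminating zero byte, which is an assumption.

```python
def is_valid_module_name(name: str) -> bool:
    """Check a Module or ModuleRef name against the constraints in the text."""
    if not name or "/" in name or "\\" in name or ":" in name:
        return False  # empty, or carries path components
    if "." not in name.strip("."):
        return False  # no extension
    # The limit applies to the UTF-8 encoding, not to the character count.
    return len(name.encode("utf-8")) <= 512
```

Measuring the encoded length matters for non-ASCII names, where one character can occupy several bytes.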
File Metadata Table and Declaration
The File metadata table describes other files of the same assembly that are referenced in the current module. In single-module assemblies, this table is empty (unless you want to specify unmanaged DLLs as part of your deployment, as described earlier). The table has the following column structure:
The File declaration in ILAsm is
.file <flag> <name> .hash = ( <bytes> )
where <flag> ::=
<none> // The file is a managed PE file
| nometadata // The file is a pure resource file
If the hash value is not explicitly specified, the IL assembler finds the named file and computes the hash value using the hash algorithm specified in the Assembly declaration. If the file is not available at compile time, the HashValue entry of the respective File record is set to 0.
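The computation can be sketched in Python under two assumptions: that the hash is taken over the entire contents of the file, and that the algorithm IDs are the Windows CryptoAPI values (0x8004 for SHA-1, as in the Assembly declaration shown earlier, and 0x8003 for MD5). The function name hash_clause is hypothetical.

```python
import hashlib

HASH_ALGORITHMS = {0x8003: "md5", 0x8004: "sha1"}  # CALG_MD5, CALG_SHA1

def hash_clause(file_bytes, algorithm_id=0x8004):
    """Build the '.hash = ( ... )' clause of a File declaration."""
    if file_bytes is None:
        return None  # file unavailable at compile time: no hash to emit
    digest = hashlib.new(HASH_ALGORITHMS[algorithm_id], file_bytes).digest()
    return ".hash = (" + " ".join(f"{b:02X}" for b in digest) + ")"
```

A SHA-1 hash always yields 20 bytes, so the resulting clause has a fixed length regardless of the size of the file.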
The File declaration can also carry the .entrypoint directive, as shown in this example:
.file MainClass.dll
.hash = (01 02 03 04 05 06 ... )
.entrypoint
This sort of File declaration can occur only in the prime module of a multimodule assembly and only when the entry point method is defined in a nonprime module of the assembly. This clause of the File declaration does not affect the metadata, but it puts the appropriate file token in the EntryPointToken entry of the common language runtime header. See Chapter 4 for details about EntryPointToken and the CLR header.
The prime module of an assembly, especially a runnable application (EXE), must have a valid token in the EntryPointToken field of the CLR header; and this token must be either a Method token, if the entry point method is defined in the prime module, or a File token. In the latter case, the loader loads the relevant module and inspects its common language runtime header, which must contain a valid Method token in the EntryPointToken field.
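The distinction between the two kinds of entry point tokens can be sketched in Python. A metadata token keeps its table index in the most significant byte and the record index in the lower three bytes; 0x06 identifies the Method table and 0x26 the File table. The function name entry_point_kind and its return strings are hypothetical.

```python
METHOD_TABLE = 0x06  # table index carried by Method tokens
FILE_TABLE = 0x26    # table index carried by File tokens

def entry_point_kind(token: int) -> str:
    """Classify the EntryPointToken value of a CLR header."""
    table, rid = token >> 24, token & 0x00FFFFFF
    if rid == 0:
        return "none"     # no record is referenced
    if table == METHOD_TABLE:
        return "method"   # entry point is defined in this module
    if table == FILE_TABLE:
        return "file"     # entry point is defined in another module
    return "invalid"
```

A loader following the rule above would start execution directly for "method" and would open the referenced module and repeat the check there for "file".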
Managed Resource Metadata and Declaration
A resource is nonexecutable data that is logically deployed as part of an application. The data can take any number of forms such as strings, images, persisted objects, and so on. As Chapter 4 described, resources can be either managed or unmanaged (platform specific). These two kinds of resources have different formats and are accessed using managed and unmanaged APIs, respectively.
An application often must be customized for different cultures. A culture is a set of preferences based on a user’s language, sublanguage, and cultural conventions. In the .NET Framework, the culture is described by the CultureInfo class from the .NET Framework class library. A culture is used to customize operations such as formatting dates and numbers, sorting strings, and so on.
You might also need to customize an application for different countries or regions. A region defines a set of standards for a particular country or region of the world. In the .NET Framework, the class library describes a region using the RegionInfo class. A region is used to customize operations such as formatting currency symbols.
Localization of an application is the process of connecting the application’s executable code with the application’s resources that have been customized for specific cultures. Although a culture and a region together constitute a locale, localization is not concerned with customizing an application to a specific region. The .NET Framework and the common language runtime do not support the localization of component metadata, instead relying solely on the managed resources for this task.
The .NET Framework uses a hub-and-spoke model for packaging and deploying resources. The hub is the main assembly, which contains the executable code and the resources for a single culture (referred to as the neutral culture). The neutral culture is the fallback culture for the application. Each spoke connects to a satellite assembly that contains the resources for a single culture. Satellite assemblies do not contain code.
The advantages of this model are obvious. First, resources for new cultures can be added incrementally after an application is deployed. Second, an application needs to load only those satellite assemblies that contain the resources needed for a particular run.
The resources used in or exposed by an assembly can reside in one of the following locations:
The resource data is not directly used or validated by the deployment subsystem or the loader, so it can be of any kind.
All resource data embedded in a managed PE file resides in a contiguous block inside the .text section. The Resources data directory in the CLR header provides the RVA and size of embedded managed resources. Each individual resource is preceded by a 4-byte unsigned integer holding the resource’s length in bytes. Figure 6-3 shows the layout of embedded managed resources.
Figure 6-3. The layout of embedded managed resources
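The length-prefixed layout can be sketched in Python. The sketch assumes little-endian byte order, as used throughout PE files, and ignores any alignment padding a real file may place between resources; the function names are hypothetical, and offsets are counted from the start of the resource block.

```python
import struct

def read_resource(block: bytes, offset: int) -> bytes:
    """Return the resource stored at `offset` inside the resource block."""
    (length,) = struct.unpack_from("<I", block, offset)  # 4-byte unsigned length
    start = offset + 4
    if start + length > len(block):
        raise ValueError("resource runs past the end of the block")
    return block[start:start + length]

def pack_resources(resources):
    """Build a block from raw resources; return (block, offsets)."""
    block, offsets = bytearray(), []
    for data in resources:
        offsets.append(len(block))  # where this resource's length prefix starts
        block += struct.pack("<I", len(data)) + data
    return bytes(block), offsets
```

Packing three resources of 3, 0, and 5 bytes, for instance, yields offsets 0, 7, and 11, because each resource occupies its own length plus the 4-byte prefix.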
The ManifestResource metadata table, describing the managed resources, has the following column structure:
ILAsm syntax for the declaration of a managed resource is
.mresource <flag> <name> { <mResourceDecl>* }
where <flag> ::= public | private and <mResourceDecl> ::=
.assembly extern <alias> // Resource is imported from another
// assembly
| .file <name> at <int32> // Resource resides in another
// file of this assembly;
// <int32> is the offset
| <customAttrDecl> // Define custom attribute for this resource
The default flag value is private.
The directives .assembly extern and .file in the context of a managed resource declaration refer to the resource’s Implementation entry and are mutually exclusive. If Implementation references the AssemblyRef or File before it has been declared, the ILAsm compiler will diagnose an error.
If the Implementation entry is empty, the resource is presumed embedded in the current module. In this case, the IL assembler creates the PE file, loads the resource from the file according to the resource’s name, and writes it into the .text section of the PE file, automatically setting the Offset entry of the ManifestResource record. When the IL disassembler disassembles a PE file into a text file, the embedded managed resources are saved into binary files named after these resources, which allows the IL assembler to easily pick them up if the PE file needs to be reassembled.
There is a little catch there: names of managed resources may contain characters inappropriate for filenames. In such cases, the managed resources cannot be saved under their true names; on the other hand, you cannot change the resource names, because the resources are addressed by these names in the application. To deal with this situation, version 2.0 of ILAsm introduced aliasing of the managed resources similar to aliasing of referenced assemblies:
.mresource <flag> <name> as <filename> { <mResourceDecl>* }
The previous directive prompts the IL assembler to load the resource from the file <filename> and create the respective ManifestResource metadata record with the name <name>. When saving managed resources to files, the IL disassembler v2.0+ analyzes the resource names; if it finds colon, semicolon, comma, or backslash characters, it creates an alias for the resource, replacing these characters with an exclamation mark (!), an at sign (@), an ampersand (&), and a dollar sign ($), respectively. The resource is then saved in the alias-named file.
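The character substitution can be sketched in Python as a translation table. The function name resource_file_alias is hypothetical, and the sketch covers only the four characters named above.

```python
# Characters unsuitable for filenames and their replacements.
_REPLACEMENTS = str.maketrans({":": "!", ";": "@", ",": "&", "\\": "$"})

def resource_file_alias(resource_name: str):
    """Return the alias filename for a resource, or None if none is needed."""
    alias = resource_name.translate(_REPLACEMENTS)
    return alias if alias != resource_name else None
```

A resource whose name contains none of the four characters gets no alias and is saved under its own name.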
ILAsm does not offer any language constructs to address the managed resources because IL lacks the means to do so. Managed APIs provided by the .NET Framework class library—specifically, the System.Resources.ResourceManager class—are used to load and manipulate managed resources.
ExportedType Metadata Table and Declaration
The ExportedType metadata table contains information about the public classes (visible outside the assembly) that are declared in nonprime modules of the assembly. Only the prime module’s manifest can carry this table.
This table is needed because the loader expects the prime module of an assembly to hold information about all classes exported by the assembly. The union of the classes defined in the prime module and those in the ExportedType table gives the loader the full picture.
On the other hand, the intersection of the classes defined in the prime module and those in the ExportedType table must be nil. As a result, the ExportedType table can be nonempty only in the prime module of a multimodule assembly: if there are no nonprime modules, then all classes defined by this assembly reside in the prime module itself.
In versions 2.0+, the ExportedType table serves an additional function: it contains so-called class forwarders, which are conceptually close to reexports in the unmanaged world or to mail forwarding in everyday life. A forwarder indicates the assembly to which a given class, formerly residing in this assembly, has been moved. The forwarding mechanism allows you to refactor your multiassembly product without requiring all your customers to rebuild their applications.
The ExportedType table has the following column structure:
Some explanation is in order. Any time a type (class) is referenced in a module, the resolution scope should be provided to indicate where the referenced class is defined (in the current module, in another module of this assembly, or in another assembly). If the resolution scope is not provided, the referenced type should be declared in the current module. However, if this type cannot be found in the module referencing it and if the manifest of the prime module carries an identically named pseudo-ExportedType record indicating where the type is actually defined, the loader is nevertheless able to resolve the type reference. None of the current Microsoft managed compilers, excluding the IL assembler, uses this rather bizarre technique. The IL assembler has to be able to, for obvious reasons.
The exported types are declared in ILAsm as
.class extern <flag> <namespace>.<name> { <expTypeDecl>* }
where <flag> ::= public | nested public | forwarder and where <expTypeDecl> ::=
.file <name> // File where exported class is defined
| .class extern <namespace>.<name> // Enclosing exported type
| .class <int32> // Set TypeDefId explicitly (don't do that!)
| .assembly extern <name> // Forwarder
| <customAttrDecl> // Define custom attribute for this ExportedType
The directives .assembly extern, .file, and .class extern define the Implementation entry and are mutually exclusive. As in the case of the .mresource declaration, respective AssemblyRef, File, or ExportedType must be declared before being referenced by the Implementation entry.
It is fairly obvious that if Implementation is specified as .class extern, we are dealing with a nested exported type, and Flags must be set to nested public. Inversely, if Implementation is specified as .file, we are dealing with a top-level unnested class, and Flags must be set to public.
Order of Manifest Declarations in ILAsm
The general rule in ILAsm (and not only in ILAsm) is “declare, then reference.” In other words, it’s always safer, and in some cases outright required, to declare a metadata item before referencing it. There are times when you can reference a yet-undeclared item, such as calling a method that is defined later in the source code. But you cannot do this in the manifest declarations.
If you reexamine Figure 6-1, which illustrates the mutual references between the manifest metadata tables, you can discern the following list of dependencies:
To comply with the “declare, then reference” rule, the following sequence of declarations is recommended for ILAsm programs, with the manifest declarations preceding all other declarations in the source code:
Remember that only the manifests of prime modules carry Assembly and ExportedType declarations.
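One ordering that satisfies the “declare, then reference” rule for manifest items is sketched below. It is an illustration only, and every name in it (OtherLib, Native.dll, Part2.netmodule, Strings.resources, MyAsm, MyCo.Worker) is hypothetical.

```il
// 1. External assemblies and modules: they reference nothing else
//    in the manifest, so they can safely come first.
.assembly extern OtherLib { .ver 1:0:0:0 }
.module extern Native.dll

// 2. The identity of this assembly (prime module only) and the module name.
.assembly MyAsm { .ver 1:0:0:0 }
.module MyAsm.dll

// 3. Files: exported types and manifest resources may refer to them.
.file Part2.netmodule
.file nometadata Strings.resources

// 4. Exported types (prime module only): refer to files, external
//    assemblies, or enclosing exported types declared above.
.class extern public MyCo.Worker
{
  .file Part2.netmodule
}

// 5. Manifest resources: refer to files or external assemblies.
.mresource public MyAsm.Strings.resources
{
  .file Strings.resources at 0
}

// 6. Class, method, and other nonmanifest declarations follow from here.
```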
Single-Module and Multimodule Assemblies
A single-module assembly consists of a sole prime module. Manifests of single-module assemblies as a rule carry neither File nor ExportedType tables: there are no other files to declare, and all types are defined in the prime module. However, you might want to declare a File record for an unmanaged DLL you want to be part of the deployment, or your single-module assembly might use type forwarding via the ExportedType table.
The advantages of single-module assemblies include lower overhead, easier deployment, and slightly greater security. Overhead is lower because only one set of headers and metadata tables must be read, transmitted, and analyzed. Assembly deployment is simpler because only one PE file must be deployed. And the level of security can be slightly higher because the prime module of the assembly can be protected with a strong name signature, which is extremely difficult to counterfeit and virtually guarantees the authenticity of the prime module. Nonprime modules are authenticated only by their hash values (referenced in File records of the prime module) and are theoretically easier to spoof.
Manifests of the modules of a multimodule assembly carry File tables, and the manifest of the prime module of such an assembly might or might not carry ExportedType tables, depending on whether any public types are defined in nonprime modules.
The advantages of multimodule assemblies include easier development and...lower overhead. (No, I am not pulling your leg.) Both advantages stem from the obvious modularity of the multimodule assemblies.
Multimodule assemblies are easier to develop because if you distribute the functionality among the modules well, you can develop the modules independently and then incrementally add to the assembly. (I didn’t say that a multimodule assembly was easier to design.)
Lower overhead at run time results from the way the loader operates: it loads the modules only when they are referenced. So if only part of your assembly’s functionality is engaged in a certain execution session, only some of the modules constituting your assembly might be loaded. Of course, you cannot count on any such effect if the functionality is spread all over the modules and if classes defined in different modules cross-reference each other.
A well-known technique for building a multimodule assembly from a set of modules is based on a “spokesperson” approach: the modules are analyzed, and an additional prime module is created, carrying nothing but the manifest and (maybe) a strong name signature. Such a prime module carries no functionality or positive definitions of its own whatsoever; it is only a front for functional modules, a “spokesperson” dealing with the loader on behalf of the functional modules. The Assembly Linker tool, distributed with the .NET Framework, uses this technique to build multimodule assemblies from sets of nonprime modules.
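A spokesperson prime module of this kind might look as follows in ILAsm. This is a hand-written sketch with hypothetical names (Suite, Math.netmodule, Text.netmodule, and the two exported types), not the literal output of the Assembly Linker, and it assumes that both nonprime modules exist and define the listed public types.

```il
// The assembly identity lives here and nowhere else.
.assembly Suite { .ver 1:0:0:0 }
.module Suite.dll

// The functional (nonprime) modules the spokesperson fronts for.
.file Math.netmodule
.file Text.netmodule

// Public types defined in the nonprime modules, exposed outside the
// assembly through the prime module's manifest.
.class extern public Suite.Math.Matrix
{
  .file Math.netmodule
}
.class extern public Suite.Text.Tokenizer
{
  .file Text.netmodule
}

// No .class, .method, or .field definitions follow: the module is
// a manifest and nothing more.
```

If a strong name signature is wanted, it would be applied to this module alone; the functional modules remain covered only by the hash values recorded for their File entries.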
Summary of Metadata Validity Rules
In this section, I’ll summarize the validity rules for metadata contained in a manifest. Since some of these rules have a direct bearing on how the loader functions, the respective checks are performed at run time. Other rules describe “well-formed” metadata; violating one of these rules might result in rather peculiar effects during the program execution, but it does not represent a crash or security breach hazard, so the loader does not perform these checks. You can find the complete set of metadata validity rules in Partition II of the ECMA/ISO standard; the sections that follow here review the most important of them.
ILAsm does allow you to generate invalid metadata. Thus, it’s extremely important to carefully check your modules after compilation.
To find out whether any of the metadata in a module is invalid, you can run the PEVerify utility, included in the .NET Framework SDK, using the option /MD (metadata validation). Alternatively, you can invoke the IL disassembler. Choose View, MetaInfo, and Validate, and then press Ctrl+M. Both utilities use the Metadata Validator (MDValidator), which is built into the common language runtime.
Assembly Table Validity Rules
AssemblyRef Table Validity Rules
Module Table Validity Rules
ModuleRef Table Validity Rules
File Table Validity Rules
S ::= con | aux | lpt | prn | nul | com
N ::= 0..9
C ::= $ | :
ManifestResource Table Validity Rules
ExportedType Table Validity Rules