Chapter 8. Customizing How Assemblies Are Loaded

In Chapter 7, I describe the CLR default behavior for locating and loading assemblies. This default deployment model works well for common application scenarios ranging from rich client executables to Web applications to controls embedded in Web pages. Part of the reason the default deployment model has gained such broad acceptance is that it promotes concepts that address problems (most notably DLL Hell) that were prevalent in the native Microsoft Windows programming model. The CLR encourages a private deployment model that helps keep applications isolated and makes them easier to install, uninstall, and replicate. In addition, the CLR provides a hierarchical version policy system that gives application developers, administrators, and component vendors a say in which version of an assembly is loaded.

Although the characteristics of the default deployment model are positive, the CLR implementation of this model makes three fundamental assumptions that sometimes make it difficult to realize these benefits in other application scenarios. First, the CLR assumes that all assemblies are contained in files stored on disk. In some cases, assumptions are even made about the extensions of these files. For example, all dependent assemblies stored in an application’s directory must have a .dll extension. Second, the CLR assumes that all assemblies for an application are stored either in the application’s base directory, in the global assembly cache (GAC), or at locations identified in the application’s configuration file. Finally, the CLR assumes that the hierarchy of version policies applies equally in all scenarios.

A prime example of how these built-in assumptions make it hard to adapt the CLR deployment model to a new application environment is Microsoft SQL Server 2005. SQL Server 2005 has some deployment requirements that are at odds with how the CLR works by default. First, SQL Server stores all assemblies directly in the database, not as individual files in the file system. Second, the level of control over which versions of assemblies get loaded as part of a SQL Server application differs quite dramatically from other scenarios, such as rich client or Web applications. Specifically, SQL Server needs to disable the ability to run a different version of an assembly than the one that was originally installed with the application.

This requirement arises from the fact that SQL Server can persist instances of managed objects as data stored in the database. This occurs, for example, if a user defines a database column whose type is a class written in managed code (that is, a user-defined database type, in SQL Server terminology). In these cases, the data is persisted by serializing the object instance directly into the table. If, later, a different version of the type were used to deserialize the object, a mismatch of fields might occur and the type would not load. Similar issues occur if managed objects are used as part of the definition of a database index. If one version of a type is used to create the index and another version is used to read it, there’s a chance that the index might be invalid, resulting in dramatically decreased performance. In a production environment that requires a nearly perfect degree of reliability and consistent performance, the chance of failure introduced by loading a new version of an assembly cannot be tolerated. So SQL Server requires that the assemblies defined as part of the application are exactly the ones used when the application is run.

I can imagine many other scenarios in which you’d like to store assemblies somewhere other than in a standard portable executable file (PE file) on disk, search for them in a location the CLR normally wouldn’t look, or customize the way the default version policy system works. For example, instead of storing assemblies on disk, you might need to generate assemblies dynamically using the classes in the System.Reflection.Emit namespace. You might also want to load assemblies out of a "container" file such as a .cab or a .jar file. Furthermore, you might need to implement a new mechanism for locating assemblies. You might find this useful if, for example, you’re moving an existing application model from a different platform to the Microsoft .NET Framework.

You can take two approaches to customize the CLR default deployment model to accommodate the scenarios I’ve described. First, you can use some of the events and methods on the System.AppDomain and System.Reflection.Assembly classes (namely, the AppDomain.Load(byte[]...) and Assembly.Load(byte[]...) methods and the AppDomain.AssemblyResolve event). Although this approach enables you to customize the CLR entirely from within managed code, you can control only certain aspects of the assembly loading process. The second way you can customize the default deployment model is to use the CLR hosting APIs to write a host that implements an assembly loading manager. This approach requires you to write unmanaged code and requires more effort, but you can customize the CLR to a much greater extent because you are integrating with the CLR at a much lower level. In fact, the amount of customization available when writing an assembly loading manager is so extensive that you can completely replace the CLR assembly loading implementation. This is the approach that SQL Server has taken to implement its custom deployment model.

The goal of this chapter is to describe these two approaches in enough detail that you can decide which approach best fits your scenario.

The Cocoon Deployment Model

To demonstrate how to customize the CLR default behavior for locating and loading assemblies, I need to introduce a new deployment model in which you change all three of the built-in assumptions discussed earlier. Specifically, you need a model in which the assemblies are stored in a format other than the standard PE file format on disk, are found in places other than in the application’s base directory or the global assembly cache, and have different versioning rules. To this end, I introduce a new deployment model called a cocoon.[1]

A cocoon is a new packaging format for applications. A single cocoon file contains all the assemblies needed to run an application (minus the assemblies shipped as part of the .NET Framework). Packaging all of an application’s files into one single file simplifies deployment because the application is more self-contained: it can be installed, removed, or copied simply by moving a single file around. Cocoon files are very similar in concept to .cab files. After I describe how cocoon files are structured and built, I’ll walk through the steps needed to write a CLR host that runs applications contained in cocoons. In going through this exercise, I discuss the details of how to write an assembly loading manager. Toward the end of the chapter, I write a program that runs cocoons completely in managed code using the events and methods of System.AppDomain and System.Reflection.Assembly. This second program won’t provide the same level of customization as the CLR host does, but it will serve to demonstrate the different capabilities offered by the two approaches.

My implementation of the cocoon deployment model is based on object linking and embedding (OLE) structured storage files. The structured storage technology lends itself particularly well to this scenario because it includes concepts that map directly to directories and files on disk (namely, storages and streams). If you’re not familiar with structured storage, or your knowledge is a bit rusty, you can find plenty of documentation on the Microsoft Developer Network (MSDN) or in the platform SDK.

Cocoons are built by a utility I wrote called makecocoon.exe. This utility packages all executable files in the directory from which it’s run into a structured storage file with a .cocoon extension. Each file in the directory ends up as a stream in the .cocoon file. The name of the stream is set to the name of the file on disk, minus its file extension.

Makecocoon.exe takes as input the executable containing the entry point for the application and the name of the type within that executable that contains the main method. The name of the .cocoon file created by makecocoon.exe is based on the name of the main executable for the application. For example, consider an application called hrtracker that is contained in the directory shown in the following listing:

 Volume in drive C has no label.
 Volume Serial Number is 18EE-14D2

 Directory of C:\HRTracker

10/03/2003 10:30 AM     <DIR>         .
10/03/2003 10:30 AM     <DIR>         ..
10/01/2003 04:37 PM             50,688 HRTracker.exe
10/01/2003 04:36 PM            122,880 Benefits.dll
09/24/2003 01:56 PM             16,384 Employee.dll
10/01/2003 04:36 PM            453,348 Payroll.dll
              4 File(s)         643,300 bytes
              2 Dir(s)   45,701,091,328 bytes free

The following command would create a cocoon file named hrtracker.cocoon:

MakeCocoon HRTracker.exe HRTracker.Application

Hrtracker.cocoon contains a stream for each assembly as shown in Figure 8-1.

Figure 8-1. A .cocoon file for the HRTracker application
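The two naming rules described above (stream names are file names minus the extension, and the cocoon is named after the main executable) can be sketched in a few lines. This is only an illustration: StripExtension, StreamNameForFile, and CocoonNameForExe are hypothetical helpers, not functions from makecocoon.exe, and the sketch assumes it is given bare file names rather than full paths.

```cpp
#include <string>

// Returns the file name with its final extension removed, for example
// L"Payroll.dll" becomes L"Payroll". A name with no extension (or one that
// starts with its only dot) is returned unchanged.
// Hypothetical helper; assumes a bare file name, not a full path.
std::wstring StripExtension(const std::wstring &fileName)
{
    const std::wstring::size_type dot = fileName.find_last_of(L'.');
    if (dot == std::wstring::npos || dot == 0)
        return fileName;
    return fileName.substr(0, dot);
}

// Stream name for an assembly file: the file name minus its extension.
std::wstring StreamNameForFile(const std::wstring &fileName)
{
    return StripExtension(fileName);
}

// Cocoon file name: the main executable's name minus its extension,
// plus the ".cocoon" extension.
std::wstring CocoonNameForExe(const std::wstring &exeFileName)
{
    return StripExtension(exeFileName) + L".cocoon";
}
```

For the HRTracker directory, StreamNameForFile(L"Payroll.dll") yields the stream name Payroll, and CocoonNameForExe(L"HRTracker.exe") yields HRTracker.cocoon.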

In addition to the streams containing the main executable and its dependent assemblies, each .cocoon file also contains three additional streams. These extra streams contain data that is needed by the programs I write later in the chapter to run executables contained in .cocoon files. The first of these streams contains the name of the type in the main executable that contains the main method. This stream, called _entryPoint, is needed so you know which type to instantiate to run the application contained in the .cocoon file. The need for the other two streams isn’t quite as obvious. To understand the role these streams play, I need to introduce the notion of CLR binding identities.

CLR Binding Identities

Recall from Chapter 7 that assemblies can be referenced by strings consisting of the assembly’s friendly name and optional values for the public key used to sign the assembly, the assembly’s version, and the culture for any resources that the assembly contains. Working with these string-based identities can be problematic for three reasons:

  • Parsing these strings can be error prone. As I’ve shown, these strings must adhere to a specific format to be understood by the CLR. To parse them correctly, you must understand all the rules the CLR enforces on the format of the strings.

  • Assembly identities evolve over time. For example, version 2.0 of the .NET Framework introduces support for 64-bit processors. As part of this support, the string that identifies an assembly can now specify a dependency on a particular processor architecture. Assemblies can contain native code that depends on either a 64- or 32-bit processor, or the assembly can consist entirely of Microsoft intermediate language (MSIL) code, which is processor independent.

  • Determining equality between string-based identities is nearly impossible to get right. As discussed in Chapter 7, the steps taken by the CLR to resolve a reference to an assembly are quite involved. Given a reference to a specific assembly, the CLR can choose to load a different assembly depending on whether you are referencing an assembly with a strong name or a weak name, whether version policy is present, and so on. Attempting to duplicate these rules yourself would be prohibitively difficult. Later in this chapter I write a CLR host that implements an assembly loading manager to customize how the CLR loads assemblies. As you’ll see, an assembly loading manager can completely take over the process of loading an assembly—including the steps needed to determine which assembly to load given a reference.

To help alleviate these problems, the CLR hosting interfaces provide a set of methods that make it easy to work with string-based identities. These methods are part of an interface called ICLRAssemblyIdentityManager. Given an assembly, ICLRAssemblyIdentityManager gives you back a fully qualified string identity in the correct format. These canonical textual strings are what I was referring to earlier as binding identities. The nice thing about binding identities is that you can (and should) treat them as opaque identifiers for assemblies. So, you don’t need to parse them or interpret their contents in any way. The methods on ICLRAssemblyIdentityManager and the methods provided by the interfaces you use as part of an assembly loading manager handle all that for you. In fact, if you ever find yourself looking inside a binding identity, it’s likely you’re doing something wrong.

The extra streams included in each .cocoon file are needed because the assembly loading manager I write later in the chapter requires the use of binding identities. I use them in two specific places, hence the need for two additional binding identity–related streams in the .cocoon files. The first place I use a binding identity is to load the executable containing the entry point for the application in the cocoon. Remember that one of the goals of writing an assembly loading manager is to force the CLR to call the host to resolve references to assemblies contained in cocoon files. For this to work properly, all assemblies must be referenced by a full identity—the CLR will not call the assembly loading manager for partial references. I’ve added a stream to the .cocoon file that contains the binding identity (remember, these are fully qualified) for the assembly containing the application’s entry point. This stream is called _exeBindingIdentity.

I also need to use binding identities when the CLR calls the assembly loading manager to resolve a reference to an assembly. As you’ll see, the CLR passes the assembly reference to resolve in the form of a binding identity. You must know which stream in the .cocoon file contains the assembly with the given binding identity. The easiest way to implement this would have been simply to name the streams in the cocoon based on the binding identity of the assembly the stream contains. Unfortunately, OLE structured storage places constraints on how streams can be named, and binding identities violate those constraints. To work around this limitation, I name the assembly streams based on the assembly’s friendly name and create an index stream that maps binding identities to the names of the streams containing the assemblies. This mapping stream is called _index. The format of the _index stream is shown in Figure 8-2.

Figure 8-2. The _index stream in a .cocoon file
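To make the role of the index concrete, here is a minimal in-memory sketch of the mapping it represents. StreamIndexSketch is a hypothetical class for illustration only; it is not the CStreamIndex helper used by makecocoon.exe, and it says nothing about how the _index stream is laid out on disk. It assumes that a binding identity is treated as an opaque key and matched exactly as the CLR produced it.

```cpp
#include <map>
#include <string>

// Hypothetical in-memory form of the _index stream: each entry maps an
// opaque binding identity to the name of the stream holding that assembly.
class StreamIndexSketch
{
public:
    // Records that the assembly with this binding identity is stored in
    // the stream with the given name. A repeated identity overwrites the
    // earlier entry.
    void Add(const std::wstring &bindingIdentity,
             const std::wstring &streamName)
    {
        m_entries[bindingIdentity] = streamName;
    }

    // Returns true and sets streamName if the identity is in the index;
    // returns false and leaves streamName untouched otherwise.
    bool TryGetStreamName(const std::wstring &bindingIdentity,
                          std::wstring &streamName) const
    {
        const auto it = m_entries.find(bindingIdentity);
        if (it == m_entries.end())
            return false;
        streamName = it->second;
        return true;
    }

private:
    // The identity string is never parsed; it is used only as a key.
    std::map<std::wstring, std::wstring> m_entries;
};
```

The lookup direction matters: the CLR hands the host a binding identity, and the host needs the stream name, so the identity is the key and the stream name is the value.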

Now that you understand the need for the additional streams I had to create, take a look at the overall structure of a .cocoon file. To summarize, each .cocoon file has the following streams:

  • One stream for each assembly in the directory from which makecocoon.exe is run. These streams are named based on the simple name of the assembly they contain.

  • A stream named _entryPoint that identifies the type that contains the application’s main routine.

  • A stream named _exeBindingIdentity that contains the binding identity for the application’s .exe file.

  • A stream named _index that contains entries that map a given binding identity to the stream within the .cocoon that contains that assembly.

The platform SDK contains a utility called DocFile Viewer that you can use to look at the contents of structured storage files. Figure 8-3 shows the contents of the HRTracker cocoon file using DocFile Viewer.

Figure 8-3. A .cocoon file as shown in DocFile Viewer

Obtaining Binding Identities

Now that you’ve seen the role that binding identities will play in the cocoon scenario, take a look at the steps involved in obtaining these identities. As I mentioned, the ICLRAssemblyIdentityManager interface includes methods that return binding identities for a given assembly. In addition to returning binding identities, ICLRAssemblyIdentityManager also has methods that help determine the list of an assembly’s references, the list of files the CLR will look for when attempting to resolve a reference to an assembly, and so on. The complete list of methods on ICLRAssemblyIdentityManager is shown in Table 8-1.

Table 8-1. The Methods on ICLRAssemblyIdentityManager

Method Name | Description
GetBindingIdentityFromFile | Returns a binding identity for an assembly given a path to its manifest.
GetBindingIdentityFromStream | Returns a binding identity for an assembly given a stream that contains the assembly.
GetCLRAssemblyReferenceList | Translates string-based assembly references to binding identities. The list of binding identities returned from GetCLRAssemblyReferenceList is used in several places throughout the CLR hosting interfaces. For example, you use GetCLRAssemblyReferenceList later in this chapter as part of the assembly loading manager implementation. In addition, GetCLRAssemblyReferenceList is used in Chapter 9 when I talk about how to load assemblies domain neutral.
GetReferencedAssembliesFromFile | Given the filename of an assembly manifest, this method returns the list of that assembly’s references.
GetReferencedAssembliesFromStream | Given a stream containing an assembly, this method returns the list of that assembly’s references.
GetProbingAssembliesFromReference | Recall from Chapter 6 that one of the steps the CLR follows to resolve an assembly reference is to probe for that assembly in the ApplicationBase directory structure. GetProbingAssembliesFromReference returns the list of the files the CLR would look for when attempting to resolve a reference to a given assembly.

As shown in the table, ICLRAssemblyIdentityManager enables you to supply the assembly for which you’d like a binding identity by either providing a pathname to the file containing that assembly’s manifest or by supplying a pointer to an IStream that contains the assembly’s contents. Given these methods, two steps are involved in obtaining a binding identity for an assembly:

  1. Obtain a pointer to ICLRAssemblyIdentityManager.

  2. Call GetBindingIdentityFromFile (or Stream) to get a binding identity.

Step 1: Obtaining a Pointer to ICLRAssemblyIdentityManager

Unfortunately, obtaining a pointer to an ICLRAssemblyIdentityManager is more involved than obtaining pointers to the rest of the hosting interfaces implemented by the CLR. You may recall from Chapter 2 that a host typically uses the ICLRControl interface to request pointers to the hosting interfaces implemented by the CLR. ICLRAssemblyIdentityManager doesn’t follow this pattern. Instead, you must call a function named GetCLRIdentityManager to get a pointer of type ICLRAssemblyIdentityManager. Here’s the definition of GetCLRIdentityManager from mscoree.idl:

STDAPI GetCLRIdentityManager(REFIID riid, IUnknown **ppManager);

To make matters more complicated, GetCLRIdentityManager is implemented in the main CLR runtime DLL, mscorwks.dll, not in the startup shim (mscoree.dll) like the other functions we’ve used, such as CorBindToRuntimeEx. Even though GetCLRIdentityManager is implemented in mscorwks.dll, you must still go through mscoree.dll to access it. Recall from Chapter 3 that all accesses to the CLR from unmanaged code must go through mscoree.dll to make sure the proper CLR runtime DLLs are loaded when multiple versions are installed on the machine. As a result, you must access GetCLRIdentityManager dynamically through a function pointer obtained from the GetRealProcAddress function exported from mscoree.dll. GetRealProcAddress redirects the request for a particular function to the proper version of mscorwks.dll. The following sample code uses GetRealProcAddress to get a pointer to the GetCLRIdentityManager function and calls through that function pointer to get an interface of type ICLRAssemblyIdentityManager:

// Declare a type for our pointer to GetCLRIdentityManager.
typedef HRESULT (__stdcall *CLRIdentityManagerProc)(REFIID, IUnknown **);

// Declare variables to hold both the function pointer and the
// interface of type ICLRAssemblyIdentityManager.
CLRIdentityManagerProc pIdentityManagerProc = NULL;
ICLRAssemblyIdentityManager *pIdentityManager = NULL;

// Use GetRealProcAddress to get a pointer to GetCLRIdentityManager.
HRESULT hr = GetRealProcAddress("GetCLRIdentityManager",
   (void **)&pIdentityManagerProc);

// Call through the function pointer only if the lookup succeeded.
if (SUCCEEDED(hr))
{
   // Call GetCLRIdentityManager to get a pointer to
   // ICLRAssemblyIdentityManager.
   hr = (pIdentityManagerProc)(IID_ICLRAssemblyIdentityManager,
      (IUnknown **)&pIdentityManager);
}

Step 2: Calling GetBindingIdentityFromFile (or Stream)

Now that you’ve got a pointer of type ICLRAssemblyIdentityManager, you can call either GetBindingIdentityFromFile or GetBindingIdentityFromStream to obtain a binding identity for an assembly. Mscoree.idl defines these two methods as follows:

interface ICLRAssemblyIdentityManager : IUnknown
{
    HRESULT GetBindingIdentityFromFile(
        [in]    LPCWSTR     pwzFilePath,
        [in]    DWORD       dwFlags,
        [out, size_is(*pcchBufferSize)]   LPWSTR  pwzBuffer,
        [in, out]   DWORD   *pcchBufferSize
    );

    HRESULT GetBindingIdentityFromStream(
        [in]        IStream         *pStream,
        [in]        DWORD           dwFlags,
        [out, size_is(*pcchBufferSize)]    LPWSTR  pwzBuffer,
        [in, out]   DWORD           *pcchBufferSize
    );

    // other methods omitted
}

Makecocoon.exe deals with files, so it uses GetBindingIdentityFromFile exclusively. As discussed, the _index stream requires a binding identity for every file in the cocoon. So, GetBindingIdentityFromFile is called by makecocoon as it iterates through the files in the directory in preparation to add them to a cocoon. GetBindingIdentityFromFile takes as input a buffer in which it will store the binding identity for the assembly you request. However, binding identities vary in size based on certain factors, including the assembly’s friendly name, whether it has a strong name, and so on. Given this, there’s no way to know how much buffer space to allocate beforehand. As a result, the GetBindingIdentityFromFile method is designed to be called twice in succession. On the first call to GetBindingIdentityFromFile, you pass NULL for the buffer in which the binding identity is to be stored and 0 for the pcchBufferSize parameter. The CLR determines how much buffer space is required for the binding identity you are asking for and returns the required size in pcchBufferSize. Next, you allocate a buffer of the requested size and call GetBindingIdentityFromFile again, passing it the allocated buffer. After this second call returns, pwzBuffer contains the binding identity. The following code shows how you call GetBindingIdentityFromFile twice to obtain a binding identity for a given assembly:

// Call once to get the required buffer size. pszFileName
// contains the path to the manifest of the assembly for which you'd like
// a binding identity.
DWORD cbBuffer = 0;
HRESULT hr = m_pIdentityManager->GetBindingIdentityFromFile(
                  pszFileName,
                  0,
                  NULL,
                  &cbBuffer);

// Allocate a buffer of size cbBuffer. This example uses UNICODE strings,
// hence the multiplication by sizeof(wchar_t).
wchar_t *pBindingIdentity = (wchar_t *)malloc(cbBuffer*sizeof(wchar_t));

// Call again to actually get the binding identity.
hr = m_pIdentityManager->GetBindingIdentityFromFile(
                  pszFileName,
                  0,
                  pBindingIdentity,
                  &cbBuffer);
// pBindingIdentity now contains the binding identity.

// ...

// Remember to free the string containing the binding identity.
free(pBindingIdentity);
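The call-twice pattern is easy to get wrong when it is repeated at every call site, so it can help to wrap it once. The sketch below shows the shape of such a wrapper in portable C++. FakeGetIdentity is a hypothetical stand-in that only mimics the size-query contract described above; it is not the CLR method, and the assumption that the reported size includes the terminating null is mine. GetIdentityString is likewise an illustrative name.

```cpp
#include <cstdint>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical stand-in for a "call twice" API such as
// GetBindingIdentityFromFile. It is NOT the CLR function. It mimics the
// contract described in the text: *pcchBufferSize always receives the
// required size in characters (assumed here to include the terminator),
// and the call fails if the supplied buffer is missing or too small.
bool FakeGetIdentity(const wchar_t *source, wchar_t *buffer,
                     std::uint32_t *pcchBufferSize)
{
    const std::uint32_t required =
        static_cast<std::uint32_t>(std::wcslen(source)) + 1;
    const bool fits = buffer != nullptr && *pcchBufferSize >= required;
    if (fits)
        std::wmemcpy(buffer, source, required);
    *pcchBufferSize = required;
    return fits;
}

// Wraps both calls and returns the result as a std::wstring, so callers
// never allocate or free the buffer themselves. Returns an empty string
// if the second call fails.
std::wstring GetIdentityString(const wchar_t *source)
{
    // First call: no buffer, size 0. Only the required size is wanted,
    // so the failure result is expected and ignored.
    std::uint32_t cch = 0;
    FakeGetIdentity(source, nullptr, &cch);

    // Second call: a buffer of exactly the requested size.
    std::vector<wchar_t> buffer(cch);
    if (!FakeGetIdentity(source, buffer.data(), &cch))
        return std::wstring();

    return std::wstring(buffer.data());
}
```

Using a std::vector for the temporary buffer removes the need for the matching free call, which is the step most often forgotten on error paths.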

The Makecocoon.exe Program

Now that you’ve looked at all the pieces required to build makecocoon.exe, take a closer look at how the program works. Makecocoon.exe begins by creating a structured storage file based on the name of the executable file passed in. It then enumerates the contents of the directory looking for files with a .dll extension. For each .dll file, makecocoon.exe maps a view of the file’s contents into memory using the Win32 memory-mapped file APIs. Given the view of the file in memory, makecocoon.exe creates a new stream in the structured storage file and writes the contents of the mapped memory to that stream. As each stream is created, I build up a data structure that contains the name of the stream and the binding identity of the assembly contained in that stream. This data structure is eventually written to the _index stream I described earlier.
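When scanning a directory for .dll files, it is worth being precise about what "has a .dll extension" means, because a plain substring search also matches names such as Payroll.dll.config. The following is a minimal sketch of a stricter check. HasDllExtension is a hypothetical helper, not part of makecocoon.exe, and it assumes a case-insensitive match is wanted because Windows file names are not case sensitive.

```cpp
#include <cwctype>
#include <string>

// Returns true only if fileName ends with ".dll", ignoring case, and has
// at least one character before the extension.
// Hypothetical helper for illustration; not taken from makecocoon.exe.
bool HasDllExtension(const std::wstring &fileName)
{
    static const std::wstring kExtension = L".dll";
    if (fileName.size() <= kExtension.size())
        return false;

    // Compare the last four characters against ".dll" one at a time,
    // lowercasing each character of the file name first.
    const std::wstring::size_type start =
        fileName.size() - kExtension.size();
    for (std::wstring::size_type i = 0; i < kExtension.size(); ++i)
    {
        if (std::towlower(fileName[start + i]) != kExtension[i])
            return false;
    }
    return true;
}
```

With this check, Payroll.dll and BENEFITS.DLL are accepted, while Payroll.dll.config and HRTracker.exe are skipped.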

The source code for makecocoon.exe’s primary source file is given in Example 8-1. The program includes a few other files that contain helper classes for obtaining an ICLRAssemblyIdentityManager and for maintaining the index data structure. The complete source code can be found at this book’s companion Web site.

Example 8-1. Makecocoon.cpp

//
// MakeCocoon.cpp
//
// Takes a directory of files and makes a "cocoon." MakeCocoon.exe takes
// as input the main executable to wrap in the cocoon. It streams that
// executable, plus all DLLs in the same directory into an OLE structured
// storage file.

#include "stdafx.h"
#include "CStreamIndex.h"
#include "CCLRIdentityManager.h"

// Given an assembly file on disk, this function creates a stream under
// pRootStorage and writes the bytes of the assembly to that stream. It also
// creates an entry in the index that maps the name of the new stream to the
// binding identity of the file it contains.
HRESULT CreateStreamForAssembly(IStorage *pRootStorage,
                                CStreamIndex *pStreamIndex,
                                LPWSTR pAssemblyFileName)
{
   // Make sure you can open the file.
   HANDLE hFile = CreateFile(pAssemblyFileName, GENERIC_READ, 0, NULL,
       OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);

   if (hFile == INVALID_HANDLE_VALUE)
   {
      wprintf(L"Error opening file: %s\n", pAssemblyFileName);
      return E_FAIL;
   }

   wprintf(L"Creating Stream for Assembly in file: %s\n", pAssemblyFileName);

   // Get the file size so you know how many bytes to write to the OLE
   // structured storage file.
   DWORD dwSize = GetFileSize(hFile, NULL);

   // Map the file into memory.
   HANDLE hFileMapping = CreateFileMapping(hFile, NULL, PAGE_READONLY, 0,
                                              dwSize, NULL);
   PVOID pFile = MapViewOfFile(hFileMapping, FILE_MAP_READ, 0, 0, 0);

   // Pull the file extension off the name so you're left with just the
   // simple assembly name.
   wchar_t wszSimpleAsmName[MAX_PATH];
   ZeroMemory(wszSimpleAsmName, sizeof(wszSimpleAsmName));
   wcsncpy(wszSimpleAsmName, pAssemblyFileName, wcslen(pAssemblyFileName)-4);

   // Create a stream in which to store the assembly.
   IStream *pMainStream = NULL;
   HRESULT hr = pRootStorage->CreateStream(wszSimpleAsmName,
             STGM_DIRECT | STGM_CREATE | STGM_WRITE | STGM_SHARE_EXCLUSIVE,
             0, 0, &pMainStream);
   assert(SUCCEEDED(hr));

   // Write the assembly into the stream.
   ULONG ulSizeWritten = 0;
   hr = pMainStream->Write(pFile, dwSize, &ulSizeWritten);
   assert(SUCCEEDED(hr));
   assert(ulSizeWritten == dwSize);

   // Clean up - release the Stream, Unmap the file, and close handles.
   pMainStream->Release();
   UnmapViewOfFile(pFile);
   CloseHandle(hFileMapping);
   CloseHandle(hFile);

   // Add an entry to the index for this stream.
   CCLRIdentityManager *pIdentityManager = new CCLRIdentityManager();

   wchar_t *pBindingIdentity = pIdentityManager
      ->GetBindingIdentityForFile(pAssemblyFileName);
   assert(pBindingIdentity);

   hr = pStreamIndex->AddIndexEntry(wszSimpleAsmName, pBindingIdentity);
   assert(SUCCEEDED(hr));

   free(pBindingIdentity);
   delete pIdentityManager;

   return hr;
}

// Create a stream that holds a string. Use this to write entry point data
// into the storage and to record the binding identity of the assembly
// containing the application's executable.
HRESULT CreateStreamForString(IStorage *pRootStorage,
                              wchar_t *pszStreamName,
                              wchar_t *pszString)
{
   wprintf(L"Creating String Stream containing: %s\n", pszString);

   // Create a stream in which to store the string.
   IStream *pStringStream = NULL;
   HRESULT hr = pRootStorage->CreateStream(pszStreamName,
      STGM_DIRECT | STGM_CREATE | STGM_WRITE | STGM_SHARE_EXCLUSIVE,
      0, 0, &pStringStream);
   assert(SUCCEEDED(hr));

   // Write the string to the stream.
   ULONG ulSizeWritten = 0;
   DWORD dwSize = wcslen(pszString)*sizeof(wchar_t);
   hr = pStringStream->Write(pszString, dwSize, &ulSizeWritten);
   assert(SUCCEEDED(hr));
   assert(ulSizeWritten == dwSize);

   pStringStream->Release();

   return S_OK;

}

int wmain(int argc, wchar_t* argv[])
{
   // Make sure the correct number of arguments was passed.
   if (argc != 3)
   {
      wprintf(L"Usage: MakeCocoon <exe file name> "
              L"<name of type containing Main()>\n");
      return 0;
   }

   // Construct the filename for the cocoon. I use the name of the exe
   // minus ".exe" + the ".cocoon" extension.
   wchar_t wszCocoonName[MAX_PATH];
   ZeroMemory(wszCocoonName, sizeof(wszCocoonName));
   wcsncpy(wszCocoonName, argv[1], wcslen(argv[1])-4);
   wcscat(wszCocoonName, L".cocoon");

   // Create the structured storage file in which to store the assemblies.
   wprintf(L"Creating Cocoon: %s\n", wszCocoonName);
   IStorage *pRootStorage = NULL;
   HRESULT hr = StgCreateDocfile(wszCocoonName,
     STGM_DIRECT | STGM_READWRITE | STGM_CREATE | STGM_SHARE_EXCLUSIVE,
     0, &pRootStorage);
   assert(SUCCEEDED(hr));

   // Create the index you'll use to map stream names to binding identities.
   CStreamIndex *pStreamIndex = new CStreamIndex(pRootStorage);
// Initialize and start the CLR.
ICLRRuntimeHost *pCLR = NULL;
 hr = CorBindToRuntimeEx(
   L"v2.0.41013",
   L"wks",
   STARTUP_CONCURRENT_GC,
   CLSID_CLRRuntimeHost,
   IID_ICLRRuntimeHost,
   (PVOID*) &pCLR);

 assert(SUCCEEDED(hr));

 pCLR->Start();

 // Obtain an identity manager. This is a helper class that wraps the
 // methods provided by ICLRAssemblyIdentityManager.
 CCLRIdentityManager *pIdentityManager = new CCLRIdentityManager();

 // Get the binding identity for the application's executable.
 wchar_t *pExeIdentity = pIdentityManager
   ->GetBindingIdentityForFile(argv[1]);
 assert(pExeIdentity);

 // Create a stream to hold the binding identity of the exe file.
 hr = CreateStreamForString(pRootStorage, L"_exeBindingIdentity",
   pExeIdentity);
 assert(SUCCEEDED(hr));

 free(pExeIdentity);
 delete pIdentityManager;

 // Create a stream that contains the name of the type containing the
 // application's main() method.
 hr = CreateStreamForString(pRootStorage, L"_entryPoint", argv[2]);
 assert(SUCCEEDED(hr));

 // Create a stream for the exe file.
 hr = CreateStreamForAssembly(pRootStorage, pStreamIndex, argv[1]);
 assert(SUCCEEDED(hr));

 // Loop through the current directory creating streams for all
 // dependent assemblies.
 wchar_t bCurrentDir[MAX_PATH];
 GetCurrentDirectory(MAX_PATH, bCurrentDir);
 wcsncat(bCurrentDir, L"\*",2);

 WIN32_FIND_DATA fileData;
 HANDLE hFind = FindFirstFile(bCurrentDir, &fileData);

 while (FindNextFile(hFind, &fileData) != 0)
{
    // Determine if the file is a DLL - ignore everything else.
    wchar_t *pDllExtension = wcsstr(fileData.cFileName, L".dll");
    if (pDllExtension)
    {
           // Create a stream in the Compound File for the assembly.
           hr = CreateStreamForAssembly(pRootStorage, pStreamIndex,
               fileData.cFileName);
           assert(SUCCEEDED(hr));
       }
    }

    // Write the index to the structured storage file. This creates the
    // _index stream.
    pStreamIndex->WriteStream();

    // Clean up.
    delete pStreamIndex;
    FindClose(hFind);
    pRootStorage->Release();

    return 0;
}
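
The extension test in the loop above accepts any file whose name merely contains ".dll", so a name such as Customers.dll.config would also be packaged, and the comparison is case sensitive. The following is a minimal sketch of a stricter check. It is plain standard C++, and HasDllExtension is a hypothetical helper that is not part of the makecocoon sample.

```cpp
#include <cassert>
#include <cwctype>
#include <string>

// Hypothetical helper: returns true only when fileName ends in ".dll",
// compared without regard to case, and has at least one character
// before the extension.
bool HasDllExtension(const std::wstring &fileName)
{
    static const std::wstring kExtension = L".dll";
    if (fileName.size() <= kExtension.size())
        return false;

    const std::size_t start = fileName.size() - kExtension.size();
    for (std::size_t i = 0; i < kExtension.size(); ++i)
    {
        if (std::towlower(fileName[start + i]) != kExtension[i])
            return false;
    }
    return true;
}
```

In the loop, a call such as HasDllExtension(fileData.cFileName) could stand in for the wcsstr test.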

Implementing an Assembly Loading Manager

As described, writing a host using the CLR hosting APIs offers you the most control over how assemblies are loaded into an application domain. To demonstrate the range of customizations available, I write a host that runs applications encased in the cocoons described earlier. The host, runcocoon.exe, takes the name of the cocoon to run as input and uses the methods in the System.Reflection namespace to invoke the application’s main entry point to start it running. As the application runs, you’ll load its assemblies out of the .cocoon file instead of letting the CLR follow its default rules.

You’ll also implement different versioning rules than the ones the CLR would normally enforce. Specifically, you always use the assemblies that are contained in the cocoon as you’re running the application. If a different version of one of the assemblies is placed on disk somewhere, and version policy has been set that would normally cause that version to be used, you can generally ignore it. This keeps the cocoon static and isolated—changes made to the system by other applications won’t affect it. However, there is one scenario in which you would consider version policy. (This is why I said you would only generally ignore policy earlier.) The CLR default version policy system comprises three levels as discussed in Chapter 7: application, publisher, and administrator. In this versioning scheme, you ignore application and publisher policy, but pay attention to administrator policy. A primary purpose of this policy level is to give an administrator a way to specify that a particular version of an assembly should not be used on the system because of a security vulnerability, a consistent crash, or some other fatal flaw. So if a banned assembly is contained in a cocoon you are trying to run, you’ll fail to load it. Instead, you can print an error message and stop running the application. This new policy system is reasonable behavior and gives me a good chance to demonstrate how the hosting API can be used to implement a custom version policy scheme.

Recall from Chapter 2 that the COM interfaces in the hosting API are grouped into a set of managers. All of the interfaces in a given manager work together to provide a coherent set of functionality. One of these managers, the assembly loading manager, contains the two COM interfaces the host must implement to satisfy the requirements of the cocoon scenario: IHostAssemblyManager and IHostAssemblyStore. In addition to the implementations of these two interfaces, the runcocoon.exe host also contains an application domain manager, a host control object, and the main program logic that ties it all together (see Figure 8-4). In the next several sections, I describe each of these primary components of the host.

Figure 8-4. The architecture of the runcocoon.exe host

Implementing the IHostAssemblyManager Interface

IHostAssemblyManager is the primary interface in the assembly loading manager. That is, it is the interface the CLR asks for through the host control mechanism to determine whether you’d like to customize how the CLR loads assemblies. (Recall from Chapter 2 that the CLR calls the host’s implementation of IHostControl::GetHostManager at startup once for every primary interface to determine which managers a host supports.) The methods on IHostAssemblyManager are described in Table 8-2.

Table 8-2. The Methods on IHostAssemblyManager

| Method | Description |
| --- | --- |
| GetNonHostStoreAssemblies | Returns a list of assemblies that should be loaded by the CLR rather than by the host. |
| GetAssemblyStore | Returns the host’s implementation of the IHostAssemblyStore interface. The CLR calls methods on IHostAssemblyStore to enable the host to load an assembly. |
| GetHostApplicationPolicy | In Chapter 6, I discuss how application-level policy can be specified for an application domain using the ConfigurationFile property on AppDomainSetup. GetHostApplicationPolicy provides an alternate way to specify application-level policy. |

In addition to its role as the primary interface in the assembly loading manager, IHostAssemblyManager provides two key capabilities. First, it allows the host to specify the list of assemblies that should be loaded by the CLR instead of being redirected to the host. Second, IHostAssemblyManager allows the host to return its implementation of the other interface in this manager—IHostAssemblyStore. I discuss these capabilities in the next two sections.

Specifying Non-Host-Loaded Assemblies

When a host provides an implementation of the assembly loading manager, the CLR calls the host directly to load an assembly instead of going through its normal resolution and loading process. Although this is exactly what is needed for the assemblies you’ve stored in the cocoon, or for the assemblies users have stored in the database in the SQL Server scenario (for example), it’s almost always the case that the host does not want to take over the responsibility for loading some assemblies, namely, those assemblies that are shipped by Microsoft as part of the .NET Framework platform. These include assemblies such as System, System.Windows.Forms, and System.Web. Although it’s possible for a host to load these assemblies, doing so leads to complications downstream. To imagine the issues you can run into, consider what would happen if the cocoons you are running contained the .NET Framework assemblies in addition to the assemblies that make up the application. Although this would make the cocoons even more self-contained, it causes some additional implementation concerns that you, as the writer of the host, might not be willing to tackle. The first complication is keeping the .NET Framework assemblies matched to the CLR. Recall from Chapter 3 that the CLR automatically enforces that the .NET Framework assemblies loaded into a process are the versions that were built and tested with the CLR that has been loaded. If a host were to load the .NET Framework itself, this benefit would be lost. In theory, you could figure out which assemblies to load based on a list of published version numbers, but even so, the host would just be guessing; only Microsoft as the builder of the .NET Framework knows the exact set of assemblies that are meant to work together.

The second complication involves handling servicing releases to the .NET Framework assemblies. Occasionally, Microsoft releases updates to the .NET Framework assemblies in the form of single bug fix releases, service packs, and so on. These updates are made directly to the .NET Framework assemblies stored in the global assembly cache. If a host were to package these assemblies in a custom format and load them from that format as I discussed doing with cocoons, the applications run by the host would not pick up these bug fix releases because the versions of those assemblies stored in the global assembly cache would not be used. Although you could argue this extra isolation is desired, there are clearly cases when the host would want the applications it runs to use the updated .NET Framework assemblies. The classic case when this behavior is desired is to pick up a bug fix that closes a security vulnerability, for example. If a host did want to be responsible for loading the .NET Framework assemblies, it could work around the servicing issue by loading the assemblies directly out of the global assembly cache itself. However, the process of doing so is not straightforward and, therefore, the benefits aren’t likely worth the extra work. The .NET Framework SDK does include a set of APIs that enables you to enumerate the contents of the global assembly cache (see fusion.h in the Include directory of the SDK), but there are no APIs that enable you to load an assembly directly from the cache. It might be tempting to think that a managed API such as System.Reflection.Assembly.Load could be used to load the .NET Framework assembly, but because you’ve hooked the loading process by implementing an assembly loading manager, that call would just get redirected back to your host anyway!

Because of the complexities of dealing with service releases and of guaranteeing the .NET Framework assemblies match the CLR that is loaded, most hosts choose to load only those assemblies that their users have built as part of the applications they are hosting and let the CLR load the .NET Framework assemblies.

In the cocoon scenario, it’s now clear that you’ll load the assemblies built as part of the application out of the cocoon, but you’ll let the CLR load the .NET Framework assemblies out of the global assembly cache. However, there is one more assembly I haven’t considered yet: the CocoonHostRuntime assembly that contains the application domain manager. This assembly is neither a .NET Framework assembly nor an assembly written by the user as part of the application. Rather, it is part of the host. You must decide whether you should load it yourself or leave it up to the CLR. In this case, there is no clear-cut answer. On one hand, you could include a copy of CocoonHostRuntime with each cocoon and load it yourself. On the other hand, you could carry it along with the runcocoon.exe host and have the CLR load it. For this sample, choose the latter. The CocoonHostRuntime assembly will be deployed to the same directory as runcocoon.exe and loaded by the CLR from there. Figure 8-5 gives a summary of the various assemblies involved in running a cocoon, including from where and by whom they are loaded.

Figure 8-5. Assembly loading in the cocoon scenario

The CLR determines which assemblies it should load as opposed to which assemblies it should ask the host to load by calling the host’s implementation of IHostAssemblyManager::GetNonHostStoreAssemblies. As the host, you have two choices for specifying how the CLR should behave regarding assemblies you’d like it to load. First, you can provide an exact list of assemblies you’d like the CLR to load. In this scenario, the CLR will load all assemblies in the list you provide and will call your implementation of IHostAssemblyStore for all others. Your other option is to let the CLR try to load all assemblies first by looking in the global assembly cache. If an assembly is found in the cache, it is loaded—the host is never asked. On the other hand, if the assembly could not be found in the global assembly cache, the host is asked to resolve the reference. In this case, if the host doesn’t successfully resolve the reference, the CLR continues to look for the assembly by probing in the ApplicationBase directory structure.
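
The two behaviors just described can be summarized in a small model. This is only a sketch: it is standard C++ with no CLR headers, the names Loader, BindScenario, and WhoLoads are hypothetical, and assemblies are identified by friendly name alone. What happens when you supply a list and the host cannot resolve an unlisted assembly is an assumption here; the model simply reports it as not found.

```cpp
#include <cassert>
#include <set>
#include <string>

// Who ends up loading an assembly in this simplified model.
enum class Loader { Clr, Host, NotFound };

// Hypothetical description of the environment for one bind request.
struct BindScenario
{
    // false models returning NULL from GetNonHostStoreAssemblies.
    bool hostSuppliedList = false;
    std::set<std::wstring> nonHostAssemblies; // names the CLR was asked to load
    std::set<std::wstring> gac;               // names in the global assembly cache
    std::set<std::wstring> hostStore;         // names the host can resolve
    std::set<std::wstring> appBase;           // names found by probing ApplicationBase
};

Loader WhoLoads(const BindScenario &s, const std::wstring &name)
{
    if (s.hostSuppliedList)
    {
        // Exact list: the CLR loads what is listed; the host gets the rest.
        if (s.nonHostAssemblies.count(name) != 0)
            return Loader::Clr;
        return s.hostStore.count(name) != 0 ? Loader::Host : Loader::NotFound;
    }

    // No list: global assembly cache first, then the host, then probing.
    if (s.gac.count(name) != 0)
        return Loader::Clr;
    if (s.hostStore.count(name) != 0)
        return Loader::Host;
    return s.appBase.count(name) != 0 ? Loader::Clr : Loader::NotFound;
}
```

With an assembly present both in the cache and in the host’s store, the model returns Loader::Host when a list is supplied that omits the assembly and Loader::Clr when no list is supplied, which is the trade-off between the two options.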

These options both have their pros and cons. The advantage of telling the CLR exactly which assemblies you’d like it to load is that you as the host always have complete control over the assemblies that you’d like to load. For example, in the cocoon scenario, this means that the CLR will never load an assembly from the global assembly cache that is also contained in the cocoon file. This preserves the isolation in that you always know that the assemblies encased in the cocoon files are the ones that are loaded.

On the other hand, the disadvantage of providing an exact list of assemblies for the CLR to load is that this list can become stale as the deployment environment changes around you. For example, say you’ve asked the CLR to load version 2.0.5000 of System.XML. Later, a version policy statement is issued (either by the publisher or the administrator) that redirects all references to System.XML from version 2.0.5000 to 2.0.6000. The CLR will apply this version policy and look to see whether the resulting reference is to an assembly you’ve asked it to load. In this case, the resulting reference will not be in the list of assemblies you’ve asked the CLR to load (because the version is different), so the CLR will call your implementation of IHostAssemblyStore to load the assembly. In this particular case, you can work around this by not providing a version number when you tell the CLR to load System.XML. Doing so, however, results in looser binding semantics than you might want. Either way, you can see how the installation of new assemblies and the presence of version policy can invalidate the list you provide.

As discussed, the alternative is to let the CLR look for all assemblies in the global assembly cache before giving the host the opportunity to load the assembly. Although this approach gets around the fragility problems you might see when you’re providing a full list, it can result in the CLR loading some assemblies you wish it wouldn’t. For example, say that an application in a cocoon file uses a statistical package in an assembly named AcmeStatistics. AcmeStatistics has a strong name and is packaged in the cocoon file along with the application that uses it. Furthermore, assume that another, completely unrelated application installed the AcmeStatistics assembly in the global assembly cache. If the CLR is given the first chance to load all assemblies, it’s possible that the copy of AcmeStatistics in the global assembly cache will be loaded instead of the copy contained in the cocoon file. If the AcmeStatistics assembly in the global assembly cache is exactly the same as the copy in the cocoon file, it really doesn’t matter from where it is loaded. However, because you are allowing the CLR to load AcmeStatistics from a location other than the cocoon, it is possible that the assembly that is loaded differs from the one contained in the cocoon file. For example, it could be that the copy of AcmeStatistics in the global assembly cache is a service release that just happens to have the same version number as the one in the cocoon. It’s also possible that a version policy statement is present that redirects all references to the version of AcmeStatistics contained in the cocoon to the version in the global assembly cache.

GetNonHostStoreAssemblies returns a list of the assemblies you’d like the CLR to load. The list of assemblies is in the form of a pointer to an interface called ICLRAssemblyReferenceList, as you can see in the following definition from mscoree.idl:

interface IHostAssemblyManager: IUnknown
{
    HRESULT GetNonHostStoreAssemblies
            (
            [out] ICLRAssemblyReferenceList **ppReferenceList
            );

    // Other methods omitted.
}

Telling the CLR to attempt to load all assemblies first is straightforward—you just return NULL for *ppReferenceList as shown in the following example:

HRESULT STDMETHODCALLTYPE CCocoonAssemblyManager::GetNonHostStoreAssemblies(
                  ICLRAssemblyReferenceList **ppReferenceList)
{
     *ppReferenceList = NULL;
     return S_OK;
}

If, on the other hand, you’d like to supply the CLR with an exact list of assemblies to load, you must obtain a pointer to an ICLRAssemblyReferenceList that describes your list. Obtaining such an interface is done using the GetCLRAssemblyReferenceList method from ICLRAssemblyIdentityManager discussed earlier. Here’s the definition of GetCLRAssemblyReferenceList from mscoree.idl:

interface ICLRAssemblyIdentityManager : IUnknown
{
    HRESULT GetCLRAssemblyReferenceList(
        [in]   LPCWSTR  *ppwzAssemblyReferences,
        [in]   DWORD    dwNumOfReferences,
        [out]  ICLRAssemblyReferenceList    **ppReferenceList
    );
    // Other methods omitted.
}

GetCLRAssemblyReferenceList takes an array of string-based identities describing the assemblies you’d like the CLR to load. These identities are in the standard string-based form used in Chapter 7; that is:

"<assemblyName>, Version=<version>, PublicKeyToken=<token>,
   culture=<culture>"

Given an array of assembly identities, along with a count of the number of items in the array, GetCLRAssemblyReferenceList returns an interface of type ICLRAssemblyReferenceList that you can pass to GetNonHostStoreAssemblies.

The string-based identities you pass to GetCLRAssemblyReferenceList can be either fully qualified (that is, they contain values for the public key token, version, and culture in addition to the required friendly name) or partial. The ability to specify partial identities in this case comes in handy, especially when referring to the .NET Framework assemblies. Recall from Chapter 3 that the CLR ensures that the .NET Framework assemblies that are loaded match the CLR that is running in the process. As a result, there’s really no need to specify a version number when referring to an assembly that is part of the .NET Framework. In this case, it’s much better to leave it to the CLR to determine which version to load.
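
To see why a partial identity keeps matching after version policy changes a reference while a fully versioned one does not, consider the following simplified model. It is a sketch in standard C++ and is not the CLR’s matching algorithm: Identity, TrimLower, ParseIdentity, and EntryMatches are hypothetical names, and the rule assumed here is that a list entry matches a reference when every field the entry specifies agrees with the reference.

```cpp
#include <cassert>
#include <cwctype>
#include <map>
#include <string>

// Hypothetical parsed form of "Name, Key=Value, Key=Value".
struct Identity
{
    std::wstring name;                               // lowercased
    std::map<std::wstring, std::wstring> attributes; // keys and values lowercased
};

// Strips surrounding blanks and lowercases the text.
std::wstring TrimLower(const std::wstring &text)
{
    const std::size_t first = text.find_first_not_of(L" \t");
    if (first == std::wstring::npos)
        return std::wstring();
    const std::size_t last = text.find_last_not_of(L" \t");
    std::wstring result = text.substr(first, last - first + 1);
    for (wchar_t &ch : result)
        ch = static_cast<wchar_t>(std::towlower(ch));
    return result;
}

Identity ParseIdentity(const std::wstring &text)
{
    Identity id;
    std::size_t start = 0;
    bool firstPart = true;
    while (start <= text.size())
    {
        std::size_t comma = text.find(L',', start);
        if (comma == std::wstring::npos)
            comma = text.size();
        const std::wstring part = text.substr(start, comma - start);
        if (firstPart)
        {
            id.name = TrimLower(part); // the friendly name comes first
            firstPart = false;
        }
        else
        {
            const std::size_t eq = part.find(L'=');
            if (eq != std::wstring::npos)
                id.attributes[TrimLower(part.substr(0, eq))] =
                    TrimLower(part.substr(eq + 1));
        }
        start = comma + 1;
    }
    return id;
}

// True when every field the list entry specifies agrees with the reference.
bool EntryMatches(const Identity &entry, const Identity &reference)
{
    if (entry.name != reference.name)
        return false;
    for (const auto &attribute : entry.attributes)
    {
        const auto found = reference.attributes.find(attribute.first);
        if (found == reference.attributes.end() ||
            found->second != attribute.second)
            return false;
    }
    return true;
}
```

Under this model, an entry that names only the friendly name and public key token still matches a reference whose version has been redirected, whereas an entry that pins the old version no longer does.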

Runcocoon.exe’s implementation of GetNonHostStoreAssemblies tells the CLR to load the mscorlib, System, and CocoonHostRuntime assemblies as shown in the following code snippet. References to all other assemblies are redirected to the implementation of IHostAssemblyStore.

// The names of the assemblies you'd like the CLR to load
const wchar_t *wszNonHostAssemblies[] =
{

   L"CocoonHostRuntime, PublicKeyToken=38c3b24e4a6ee45e",
   L"mscorlib, PublicKeyToken=b77a5c561934e089",
   L"System, PublicKeyToken=b77a5c561934e089",
};

// RunCocoon's implementation of GetNonHostStoreAssemblies
HRESULT STDMETHODCALLTYPE CCocoonAssemblyManager::GetNonHostStoreAssemblies(
               ICLRAssemblyReferenceList **ppReferenceList)
{

   // GetIdentityManager is a private method that uses GetRealProcAddress to
   // call GetCLRIdentityManager to get the ICLRAssemblyIdentityManager
   // interface.
   ICLRAssemblyIdentityManager *pIdentityManager = GetIdentityManager();

   DWORD dwCount =
      sizeof(wszNonHostAssemblies)/sizeof(wszNonHostAssemblies[0]);


   HRESULT hr = pIdentityManager->GetCLRAssemblyReferenceList(
                                 wszNonHostAssemblies,
                                 dwCount,
                                 ppReferenceList);
   assert(SUCCEEDED(hr));

   pIdentityManager->Release();

   // Propagate any failure to the CLR instead of masking it with S_OK.
   return hr;
}

Another, less obvious use for GetNonHostStoreAssemblies is to enable a host to prevent a particular assembly from ever being loaded into a process. I discuss some of the motivation for this in Chapter 12, but for now, suffice it to say that some .NET Framework assemblies just don’t make sense in certain hosting environments. For example, it probably doesn’t make sense for the System.Windows.Forms assembly ever to be loaded in a server environment such as Microsoft ASP.NET or SQL Server. By not including such an assembly in the list returned from GetNonHostStoreAssemblies, and then refusing to load it yourself when your implementation of IHostAssemblyStore is called, you can prevent particular assemblies from ever being loaded.
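
The refusal half of that technique can be sketched as follows. This is an illustration in standard C++ rather than code from runcocoon.exe: VetHostLoad and the blocked set are hypothetical, the result type stands in for HRESULT, and the constant carries the same numeric value that HRESULT_FROM_WIN32(ERROR_FILE_NOT_FOUND) produces on Windows.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>

// Stand-ins for S_OK and HRESULT_FROM_WIN32(ERROR_FILE_NOT_FOUND).
const std::uint32_t kOk = 0;
const std::uint32_t kFileNotFound = 0x80070002u;

// Hypothetical check a host might run at the top of ProvideAssembly.
// friendlyName is the simple assembly name taken from the bind identity.
std::uint32_t VetHostLoad(const std::set<std::wstring> &blocked,
                          const std::wstring &friendlyName)
{
    if (blocked.count(friendlyName) != 0)
        return kFileNotFound; // refuse, so the bind fails
    return kOk;               // carry on and return a stream for the assembly
}
```

The check only has an effect for assemblies that are left out of the list returned from GetNonHostStoreAssemblies, because those are the only requests that reach the host.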

Returning an Assembly Store

GetAssemblyStore, the final method to discuss on IHostAssemblyManager, is used to return an implementation of the IHostAssemblyStore interface. IHostAssemblyStore is the interface you implement to load assemblies out of the cocoon file as described in the next section. All hosts that implement GetNonHostStoreAssemblies will likely want to implement GetAssemblyStore, too. After all, without an implementation of IHostAssemblyStore, there would be no way to load any assembly other than those returned from GetNonHostStoreAssemblies. GetAssemblyStore’s only parameter is a pointer into which you’ll return the implementation of IHostAssemblyStore as shown in the following method signature:

interface IHostAssemblyManager: IUnknown
{
    // Other methods omitted ...
    HRESULT GetAssemblyStore([out] IHostAssemblyStore **ppAssemblyStore);
};

Runcocoon.exe’s implementation of IHostAssemblyStore is contained in a class called CCocoonAssemblyStore. The implementation of GetAssemblyStore is straightforward: simply create a new instance of CCocoonAssemblyStore, cast it to a pointer to the IHostAssemblyStore interface, and return it through the out parameter as shown in the following code:

HRESULT STDMETHODCALLTYPE CCocoonAssemblyManager::GetAssemblyStore(
               IHostAssemblyStore **ppAssemblyStore)
{
    m_pAssemblyStore = new CCocoonAssemblyStore(m_pRootStorage);

    *ppAssemblyStore = (IHostAssemblyStore *)m_pAssemblyStore;
    return S_OK;
}

Implementing the IHostAssemblyStore Interface

The IHostAssemblyStore interface contains methods that hosts can use to load assemblies from formats other than standard PE files stored in the file system. It is this interface that enables SQL Server to load assemblies directly out of the database and the runcocoon.exe host to load assemblies directly from OLE structured storage files. Instead of returning the assembly as a filename on disk, implementers of IHostAssemblyStore return assemblies in the form of a pointer to an IStream interface. This enables you to load assemblies from literally anywhere that you can store or construct a contiguous set of bytes. It is IHostAssemblyStore that also enables you to customize how the default CLR version policy system works.

As you’ve seen, the CLR determines whether a host wishes to implement a custom assembly store using a two-step process. First, it calls the host’s implementation of IHostControl::GetHostManager, passing in the IID for IHostAssemblyManager. If the host returns an interface pointer, the CLR knows that the host implements the assembly loading manager. Next, the CLR calls IHostAssemblyManager::GetAssemblyStore to get a pointer to the IHostAssemblyStore interface representing the custom store.

IHostAssemblyStore contains two methods: one the CLR calls to resolve references to assemblies (ProvideAssembly), and another that is called to resolve references to individual files within an assembly (ProvideModule). This latter method is called only for assemblies that consist of more than one file.

Take a look at how to implement these two methods. As before, I describe them in the context of the runcocoon.exe host.

Resolving Assembly References

If a host provides an implementation of IHostAssemblyStore, the CLR will call the ProvideAssembly method to resolve all references to assemblies not contained in the non-host-store assemblies list. The input to ProvideAssembly is a structure called AssemblyBindInfo that contains not only the identity of the assembly to load, but also information about how the default CLR version policy system would affect the reference to that assembly. Assemblies resolved out of a custom store are returned to the CLR in the form of a pointer to an IStream interface. If you’re not familiar with how IStream works, there’s plenty of information on MSDN or in the platform SDK.

In addition to the AssemblyBindInfo structure and the stream through which the assembly is returned, ProvideAssembly also contains parameters that enable you to return debugging information, to associate any host-specific context data with a particular assembly bind, and to specify a unique identity for the assembly you are returning (more on this later). Here’s the definition for ProvideAssembly from mscoree.idl:

interface IHostAssemblyStore: IUnknown
{
    HRESULT ProvideAssembly
            (
            [in] AssemblyBindInfo *pBindInfo,
            [out] UINT64          *pAssemblyId,
            [out] UINT64          *pContext,
            [out] IStream        **ppStmAssemblyImage,
            [out] IStream        **ppStmPDB);
    // Other method definitions omitted...
}

The AssemblyBindInfo Structure

The AssemblyBindInfo structure has four fields:

typedef struct _AssemblyBindInfo
{
    DWORD           dwAppDomainId;
    LPCWSTR         lpReferencedIdentity;
    LPCWSTR         lpPostPolicyIdentity;
    DWORD           ePolicyLevel;
} AssemblyBindInfo;

The first field, dwAppDomainId, identifies the application domain into which the assembly will be loaded. This field isn’t particularly useful in the runcocoon.exe host because there is only one application domain. To understand why this field is needed, consider what would happen if the host were capable of running multiple cocoons simultaneously in the same process. In this case, you’d probably choose to load each cocoon into its own application domain. Given the fact that there is only one implementation of IHostAssemblyStore per process, you’d have no way of identifying which cocoon file to load the requested assembly from without the dwAppDomainId field. The way you’d likely implement this is to keep a table that maps application domain IDs to .cocoon files. Then when ProvideAssembly was called, you’d use the dwAppDomainId you’re passed to find the appropriate cocoon file in the table. The unique identifier for an application domain can be obtained using the Id property on System.AppDomain.
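
A table of that kind might look like the following sketch. It is standard C++ with no CLR headers, and CocoonRegistry, Register, and TryFind are hypothetical names. A real host would probably also need to guard the table with a lock if binds can arrive on more than one thread, which the sketch leaves out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical registry: application domain ID -> path of the cocoon it runs.
class CocoonRegistry
{
public:
    // Record the cocoon when the host creates the application domain.
    void Register(std::uint32_t appDomainId, const std::wstring &cocoonPath)
    {
        m_cocoons[appDomainId] = cocoonPath;
    }

    // Look up the cocoon for the ID received in the dwAppDomainId field.
    // Returns false when the application domain is unknown.
    bool TryFind(std::uint32_t appDomainId, std::wstring *cocoonPath) const
    {
        assert(cocoonPath != nullptr);
        const auto found = m_cocoons.find(appDomainId);
        if (found == m_cocoons.end())
            return false;
        *cocoonPath = found->second;
        return true;
    }

private:
    std::map<std::uint32_t, std::wstring> m_cocoons;
};
```

ProvideAssembly would call TryFind with pBindInfo->dwAppDomainId and then open the requested assembly’s stream from the cocoon it gets back.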

The rest of the fields in the AssemblyBindInfo structure identify the assembly that the host needs to load. This information is contained in three fields: lpReferencedIdentity, ePolicyLevel, and lpPostPolicyIdentity. The first field, lpReferencedIdentity, contains the identity of the original assembly as referenced by its caller. The ePolicyLevel field indicates whether that original reference would be redirected by any version policy were the CLR to load that assembly. (The values for ePolicyLevel are defined by the EBindPolicyLevels enumeration discussed later in the chapter.) That is, the ePolicyLevel field tells you whether any version policy that would affect the reference is present on the system. Finally, the lpPostPolicyIdentity field contains the assembly that would be referenced if the policy identified in ePolicyLevel were actually applied. Look at the following example to see how the values of these fields work together. Consider the case in which code is running in one of the cocoons that loads an assembly like so:

Assembly a = Assembly.Load("Customers, Version=1.1.0.0, " +
   "PublicKeyToken=865931ab473067d1, culture=neutral");

Furthermore, say the administrator of the machine has used the .NET Configuration tool to specify some version policy for the Customers assembly. Specifically, the administrator has specified policy that redirects all references to version 1.1 of Customers to version 2.0. In XML, that policy would look like the following:

<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="Customers"
          publicKeyToken="865931ab473067d1" />
        <bindingRedirect oldVersion="1.1.0.0" newVersion="2.0.0.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>

In this situation, the relevant fields in the AssemblyBindInfo structure would have the following values when ProvideAssembly is called:

lpReferencedIdentity = "Customers, Version=1.1.0.0,
   PublicKeyToken=865931ab473067d1, Culture=neutral";

lpPostPolicyIdentity = "Customers, version=2.0.0.0, culture=neutral,
   publickeytoken=865931ab473067d1, processorarchitecture=msil";

ePolicyLevel = ePolicyLevelAdmin

Note

You might notice that the format of lpPostPolicyIdentity is slightly different from the format of lpReferencedIdentity. Specifically, the keywords version, publickeytoken, and culture have a different case, and a new element called processorarchitecture appears. lpPostPolicyIdentity looks a bit different because it is a binding identity, whereas lpReferencedIdentity is the literal string from the assembly reference (the call to Assembly.Load in the example).
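
The step from the referenced version to the post-policy version can be modeled in a few lines. The following sketch is standard C++ and is not how the CLR implements policy: Version, ParseVersion, Redirect, and ApplyRedirect are hypothetical names, and the model covers a single redirect whose oldVersion is either one version or a "low-high" range.

```cpp
#include <array>
#include <cassert>
#include <string>

using Version = std::array<int, 4>;

// Parses "major.minor.build.revision"; missing parts are treated as 0 and
// any character other than a digit or a dot is ignored.
Version ParseVersion(const std::string &text)
{
    Version v = {{0, 0, 0, 0}};
    std::size_t part = 0;
    for (char ch : text)
    {
        if (ch == '.')
        {
            if (++part == v.size())
                break;
        }
        else if (ch >= '0' && ch <= '9')
        {
            v[part] = v[part] * 10 + (ch - '0');
        }
    }
    return v;
}

// Hypothetical model of one bindingRedirect element.
struct Redirect
{
    std::string oldVersion; // a single version or a "low-high" range
    std::string newVersion;
};

// Returns the version that would be requested after the redirect is applied.
Version ApplyRedirect(const Redirect &redirect, const Version &referenced)
{
    const std::size_t dash = redirect.oldVersion.find('-');
    const Version low = ParseVersion(redirect.oldVersion.substr(0, dash));
    const Version high = (dash == std::string::npos)
        ? low
        : ParseVersion(redirect.oldVersion.substr(dash + 1));

    // std::array compares element by element, which orders versions correctly.
    if (referenced >= low && referenced <= high)
        return ParseVersion(redirect.newVersion);
    return referenced;
}
```

For the policy in the example, a reference to 1.1.0.0 comes back as 2.0.0.0, while a reference to any other version is returned unchanged.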

Implementers of ProvideAssembly can use the information in lpReferencedIdentity, lpPostPolicyIdentity, and ePolicyLevel for informational purposes only. That is, the CLR will enforce that the assembly you return from ProvideAssembly has a binding identity that matches lpPostPolicyIdentity; you are not free to return an assembly with any identity you want. In some ways this restriction is unfortunate because it limits the flexibility of what you can do with an assembly loading manager. Nevertheless, a host still has control over version policy because you control how policy is applied within your application domain.

Even though the implementation of ProvideAssembly cannot return an assembly the CLR doesn’t expect, you can still implement some versioning rules quite easily. To demonstrate what you can do, let’s implement some versioning rules to ensure that the assemblies stored in the cocoon file are the exact ones you load at run time. Specifically, you won’t load a different version of an assembly based on the existence of version policy. You can avoid application-level policy easily because you control that for your application domain. As for publisher policy, there are a few ways to keep that from affecting you. As discussed in Chapter 6, the AppDomainSetup object has a property called DisallowPublisherPolicy you can set to cause all publisher policy statements to be ignored for code running in a particular application domain. Alternatively, you can specify the same setting using the <publisherPolicy> element of your application configuration file. This is the approach I’ve taken with runcocoon.exe. If you download the samples for this book, you’ll find a file called runcocoon.exe.config in the same directory as runcocoon.exe. This file uses the <publisherPolicy> element to turn off publisher policy for the applications contained in cocoon files:

<?xml version="1.0"?>
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <publisherPolicy apply="no" />
    </assemblyBinding>
  </runtime>
</configuration>

Now that I’ve discussed how to prevent application and publisher policy from affecting your host, turn your attention to policy specified by the administrator. The primary use of administrator-specified version policy is to provide a mechanism that administrators can use to prevent a particular version of an assembly from being used. Administrators use this policy to prevent any application from using an assembly that has a known security vulnerability, causes a fatal crash, and so on. It’s generally good practice to honor any policy set by an administrator. To that end, the runcocoon.exe host will not load any assembly that an administrator has explicitly disallowed through version policy. However, instead of loading the alternate version the administrator calls for, you can simply print an error message and discontinue execution. In a way, you’re taking the middle ground here: you are honoring what the administrator says by not loading the referenced assembly, but you’re not opening yourself up to the possibility of unintended behavior by executing an assembly that wasn’t originally tested as part of the application contained in the cocoon.

Testing to see whether the administrator has issued a version policy statement for an assembly in the cocoon is easy. Simply check for the appropriate value in the AssemblyBindInfo structure and return a "file not found" HRESULT to tell the CLR you can’t find the assembly (that is, you won’t load it). At this point, execution of the cocoon would stop with an exception. The following snippet from runcocoon’s implementation of ProvideAssembly shows how to do this:

HRESULT STDMETHODCALLTYPE CCocoonAssemblyStore::ProvideAssembly(
                  AssemblyBindInfo *pBindInfo,
                  UINT64           *pAssemblyId,
                  UINT64           *pContext,
                  IStream          **ppStmAssemblyImage,
                  IStream          **ppStmPDB)
{
   // Check to see if administrator policy was applied. If so, print an error
   // to the command line and return "file not found." This will cause the
   // execution of the cocoon to stop with an exception.
   if (pBindInfo->ePolicyLevel == ePolicyLevelAdmin)
   {
      wprintf(L"Administrator Version Policy is present that redirects:
         %s to %s . Stopping execution
",
         pBindInfo->lpReferencedIdentity, pBindInfo->lpPostPolicyIdentity);
      return HRESULT_FROM_WIN32(ERROR_FILE_NOT_FOUND);
   }

   // Rest of the function omitted for brevity...
}

The EBindPolicyLevels Enumeration

Most of the values in the EBindPolicyLevels enumeration are self-explanatory because they map directly to the levels of the default version policy scheme, such as application, publisher, or administrator. A few, however, don’t fit that pattern and require additional explanation. Here’s the definition of the enumeration from mscoree.idl:

typedef enum
{
    ePolicyLevelNone = 0x0,
    ePolicyLevelRetargetable = 0x1,
    ePolicyUnifiedToCLR = 0x2,
    ePolicyLevelApp = 0x4,
    ePolicyLevelPublisher = 0x8,
    ePolicyLevelHost = 0x10,
    ePolicyLevelAdmin = 0x20
} EBindPolicyLevels;

The first value that might not look familiar is ePolicyLevelRetargetable. Although it’s not likely you’ll ever see this value in your implementation of ProvideAssembly, it’s worth spending a few minutes understanding what it is for. ePolicyLevelRetargetable is related to a CLR feature that supports the different implementations of the CLR described in the European Computer Manufacturers Association (ECMA) standard. Because the CLR is part of an international standard, anyone can produce an implementation of it on any platform. The ePolicyLevelRetargetable value shows up if an implementation of the CLR other than the version shipped as part of the full .NET Framework chose to reference a different assembly than the one the application originally referenced. This is useful, for example, in alternate implementations that have different names for the .NET Framework assemblies.

The second value of EBindPolicyLevels that doesn’t fit the pattern of the familiar policy levels is ePolicyUnifiedToCLR. This value relates to the feature I discuss in Chapter 3 and Chapter 7 whereby a given CLR will load the matching versions of the .NET Framework assemblies. The term Unified comes from the sense that the CLR is unifying all references to the .NET Framework assemblies such that the set of assemblies that were shipped together are always used together. Two things would have to be true before you’d see this value passed to ProvideAssembly. First, your host would have to run an application that contains assemblies built with an older version of the CLR than the one running in the process (and hence would have references to the older .NET Framework assemblies). Second, your host would have to be responsible for loading some of the .NET Framework assemblies. In most scenarios, hosts don’t load the .NET Framework assemblies as I concluded during the discussion of the IHostAssemblyManager::GetNonHostStoreAssemblies method earlier in the chapter.
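When you are debugging a host, it can help to print the policy level that arrives in each call to ProvideAssembly. The following is a minimal sketch of such a helper, not code from runcocoon. It restates the enumeration so that it compiles without the CLR hosting headers, and the names DescribePolicyLevels and AdminPolicyApplied are hypothetical. Treating the value as a bit mask is an assumption suggested only by the power-of-two values; the runcocoon code in this chapter compares for equality instead.

```cpp
#include <string>

// Portable restatement of EBindPolicyLevels so the sketch compiles without
// mscoree.h. The values match the definition quoted from mscoree.idl.
enum EBindPolicyLevels : unsigned
{
    ePolicyLevelNone         = 0x0,
    ePolicyLevelRetargetable = 0x1,
    ePolicyUnifiedToCLR      = 0x2,
    ePolicyLevelApp          = 0x4,
    ePolicyLevelPublisher    = 0x8,
    ePolicyLevelHost         = 0x10,
    ePolicyLevelAdmin        = 0x20
};

// Hypothetical helper: builds a readable description of a policy value.
// ASSUMPTION: the value may carry more than one bit at a time.
std::wstring DescribePolicyLevels(unsigned levels)
{
    struct Entry { unsigned bit; const wchar_t *name; };
    static const Entry kEntries[] = {
        { ePolicyLevelRetargetable, L"Retargetable" },
        { ePolicyUnifiedToCLR,      L"UnifiedToCLR" },
        { ePolicyLevelApp,          L"App" },
        { ePolicyLevelPublisher,    L"Publisher" },
        { ePolicyLevelHost,         L"Host" },
        { ePolicyLevelAdmin,        L"Admin" }
    };

    if (levels == ePolicyLevelNone)
        return L"None";

    std::wstring result;
    for (const Entry &e : kEntries)
    {
        if (levels & e.bit)
        {
            if (!result.empty())
                result += L" | ";
            result += e.name;
        }
    }
    // No known bit was set: report that rather than an empty string.
    return result.empty() ? L"Unknown" : result;
}

// Hypothetical helper: true if administrator policy is among the levels.
bool AdminPolicyApplied(unsigned levels)
{
    return (levels & ePolicyLevelAdmin) != 0;
}
```

A host could call DescribePolicyLevels(pBindInfo->ePolicyLevel) in its logging code and keep the decision logic itself as simple as the equality test shown earlier.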

Associating Host-Specific Data with an Assembly

The pContext parameter to ProvideAssembly enables you to communicate any host-specific data about an assembly from the unmanaged portion of your host to the managed portion. pContext is a pointer to a 64-bit unsigned integer in which you can store any host-specific data to associate with the assembly you return from ProvideAssembly. This data can be retrieved in managed code using the HostContext property on System.Reflection.Assembly.

The SQL Server host provides a good example of how host-specific data for an assembly can be used. When an administrator registers an assembly in the SQL Server catalog, she indicates which predefined set of security permissions that assembly should be granted when it is run. SQL Server records this information in its catalog along with the contents of the assembly. When an assembly is returned from ProvideAssembly, SQL Server reads the data describing the requested permission set from the catalog and returns it in *pContext. On the managed side, this information is obtained from the HostContext property on Assembly and is used as input into the security policy system to make sure the proper permission set is granted to the assembly. More details about how to associate permissions with an assembly are provided in Chapter 10.
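To make the idea of a 64-bit context value concrete, here is a small sketch of how a host might pack two pieces of catalog data into the number it stores in *pContext. The layout (catalog row in the high 32 bits, permission set in the low 32 bits), the PermissionSet type, and all function names are assumptions for illustration only; the actual encoding SQL Server uses isn’t described here. In a real host the decoding half would run in managed code against the value read from Assembly.HostContext; it appears in C++ here only to show the round trip.

```cpp
#include <cstdint>

// Hypothetical permission sets a host might record in its catalog.
enum class PermissionSet : std::uint32_t
{
    Safe           = 1,
    ExternalAccess = 2,
    Unsafe         = 3
};

// Packs a catalog row number and a permission set into one 64-bit value
// suitable for storing in *pContext.
// ASSUMED layout: high 32 bits = catalog row, low 32 bits = permission set.
std::uint64_t MakeHostContext(std::uint32_t catalogRow, PermissionSet set)
{
    return (static_cast<std::uint64_t>(catalogRow) << 32) |
           static_cast<std::uint32_t>(set);
}

// Recovers the permission set from a context value.
PermissionSet PermissionSetFromContext(std::uint64_t context)
{
    return static_cast<PermissionSet>(
        static_cast<std::uint32_t>(context & 0xFFFFFFFFu));
}

// Recovers the catalog row from a context value.
std::uint32_t CatalogRowFromContext(std::uint64_t context)
{
    return static_cast<std::uint32_t>(context >> 32);
}
```

Because the CLR simply carries the number from the unmanaged side to the managed side, any encoding works as long as both halves of the host agree on it.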

Assigning Assembly Identity

Before you look at runcocoon’s implementation of ProvideAssembly, I have one more topic to discuss: how assemblies are uniquely identified within the CLR data structures at run time. When the CLR loads an assembly from disk, it uses the fully qualified pathname of the file, in addition to the assembly’s name, to uniquely identify the assembly. The pathname is used as part of the internal identity of an assembly partly to ensure application correctness, but also as a performance optimization. If the CLR is asked to load the same physical file from disk multiple times, it can reuse the memory and data structures it has already set up for the assembly instead of loading the assembly multiple times.

When hosts take over the assembly loading process and return pointers to streams from ProvideAssembly, there is nothing (at least that can be computed cheaply) to uniquely identify the bytes pointed to by that stream within the CLR. It’s up to the host to associate a unique piece of data with each stream that serves the same purpose that the filename does for an assembly loaded from the file system. That is, it enables the CLR to uniquely identify the assembly internally so performance can be increased by preventing multiple loads of the same assembly. The ability for the host to provide this unique identity is the purpose of the pAssemblyId parameter to ProvideAssembly.

Upon return from ProvideAssembly, *pAssemblyId is intended to hold a 64-bit number that uniquely identifies the assembly. If multiple calls to ProvideAssembly result in the same number being returned in *pAssemblyId, the CLR assumes the assemblies are the same and reuses the bytes and data structures it already has instead of mapping the contents of the stream again. The CLR treats the unique number assigned by the host as an opaque identity—it never interprets the number in any way. Therefore, the semantics of this unique identifier are completely up to the host. The host can choose to use a value from an internal data structure, it can generate a unique value based on the assembly (such as a hash of the name), and so on.
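One of the options mentioned above is to derive the identifier from a hash of the assembly’s name. The sketch below shows what that might look like, together with a toy model of the reuse behavior just described. Both pieces are illustrative assumptions: HashIdentity is an FNV-1a-style hash that mixes one wide character per step (the canonical algorithm mixes bytes), and LoadCache only imitates the CLR’s bookkeeping so that the effect of returning the same identifier twice is visible. Keep in mind that a hash collision would make the CLR treat two different assemblies as the same one, so a host that needs a guarantee should prefer an identifier it assigns itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// FNV-1a-style 64-bit hash over the characters of a binding identity.
// Hypothetical helper; any stable 64-bit function of the identity would do.
std::uint64_t HashIdentity(const std::wstring &identity)
{
    std::uint64_t hash = 14695981039346656037ull;   // FNV offset basis
    for (wchar_t ch : identity)
    {
        hash ^= static_cast<std::uint64_t>(ch);
        hash *= 1099511628211ull;                    // FNV prime
    }
    return hash;
}

// Toy stand-in for the CLR side of the contract: an assembly is "loaded"
// only the first time its identifier is seen; later requests reuse it.
class LoadCache
{
public:
    // Returns true if this identifier caused a new load, false on reuse.
    bool LoadOrReuse(std::uint64_t assemblyId, const std::wstring &identity)
    {
        auto inserted = m_loaded.emplace(assemblyId, identity);
        if (inserted.second)
            ++m_loadCount;
        return inserted.second;
    }

    std::size_t LoadCount() const { return m_loadCount; }

private:
    std::unordered_map<std::uint64_t, std::wstring> m_loaded;
    std::size_t m_loadCount = 0;
};
```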

The implementation of ProvideAssembly in runcocoon.exe uses a value from one of its internal data structures to uniquely identify assemblies to the CLR. Recall that each cocoon file has a stream named _index that contains a mapping of binding identities to the names of the streams containing the assemblies. When the CLR calls the implementation of ProvideAssembly, you would look through the index to find the name of the stream containing the assembly corresponding to the binding identity specified in pBindInfo->lpPostPolicyIdentity. When you find the appropriate index entry, remember its place in the index and use that as the assembly’s unique identifier. Given that each binding identity is unique within the index, the position makes a perfect unique identifier for an assembly. As an example, consider the contents of the _index stream shown in Figure 8-6.

Figure 8-6. The _index stream for the HRTracker application

When the CLR calls ProvideAssembly with a binding identity of

Payroll, version=10.0.0.0, culture=neutral, publickeytoken=3d9829272b3b00b1,
   processorarchitecture=msil

look in the index and find the requested identity in entry number 1. Return the number 1 in *pAssemblyId. Subsequent requests for the same binding identity will cause you to find the same entry in the index and therefore to return a pointer to the same stream within the cocoon file.

Again, the value you return in *pAssemblyId can be any number that serves to uniquely identify a given assembly in your host. The assembly’s position within the _index stream makes a perfect unique identifier in the cocoon scenario.
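The lookup just described can be sketched in a few lines. The code below is an illustration under stated assumptions, not the CStreamIndex class from runcocoon: the IndexEntry structure, the function name, and the choice to compare identities with an exact string match are all hypothetical, and the index is modeled as an in-memory vector rather than a stream read from the cocoon.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One row of a hypothetical in-memory copy of the _index stream.
struct IndexEntry
{
    std::wstring bindingIdentity;   // matches lpPostPolicyIdentity
    std::wstring streamName;        // stream in the cocoon holding the assembly
};

// Searches the index for bindingIdentity. On success, stores the entry's
// position in *pAssemblyId and the stream name in *pStreamName, and returns
// true. On failure, leaves both outputs untouched and returns false.
// ASSUMPTION: identities are compared with an exact, case-sensitive match.
bool FindStreamForBindingIdentity(const std::vector<IndexEntry> &index,
                                  const std::wstring &bindingIdentity,
                                  std::uint64_t *pAssemblyId,
                                  std::wstring *pStreamName)
{
    for (std::size_t i = 0; i < index.size(); ++i)
    {
        if (index[i].bindingIdentity == bindingIdentity)
        {
            *pAssemblyId = static_cast<std::uint64_t>(i);
            *pStreamName = index[i].streamName;
            return true;
        }
    }
    // A real ProvideAssembly would translate this into "file not found."
    return false;
}
```

Because the position of an entry never changes while the cocoon is open, repeated requests for the same binding identity always yield the same identifier, which is exactly the property the CLR relies on.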

Loading Assemblies from a Cocoon

Now that I’ve covered the concepts needed to implement ProvideAssembly, look at the implementation in runcocoon.exe. The facts that assemblies are returned from ProvideAssembly in the form of pointers to streams and the cocoons are constructed of streams named after the assemblies they contain make the implementation pretty straightforward. After all, OLE structured storage files support streams directly, so there’s no need for you to provide a custom implementation of IStream. All you need to do is use the structured storage APIs to open streams based on assembly name and return those streams directly from ProvideAssembly.

To recap, the implementation of ProvideAssembly in runcocoon.exe contains the following logic:

  1. Check to see whether administrator policy is set for the referenced assembly. If so, display an error message, set the appropriate error code, and return.

  2. Extract the binding identity of the requested assembly from the AssemblyBindInfo.lpPostPolicyIdentity field. Given this binding identity, look in the _index stream to find the name of the stream in the cocoon that contains the assembly corresponding to the requested binding identity. The implementation of ProvideAssembly does this using a helper class called CStreamIndex.

  3. Set a unique identifier for the assembly to the position of the requested assembly in the index.

  4. Open the stream that contains the assembly you’re looking for using the IStorage::OpenStream structured storage API.

The implementation of ProvideAssembly from runcocoon.exe is shown in the following code. As described, ProvideAssembly uses some helper classes to get its job done. The source for these helper classes, along with the full source for runcocoon, can be found on this book’s companion Web site.

HRESULT STDMETHODCALLTYPE CCocoonAssemblyStore::ProvideAssembly(
                  AssemblyBindInfo *pBindInfo,
                  UINT64           *pAssemblyId,
                  UINT64           *pContext,
                  IStream          **ppStmAssemblyImage,
                  IStream          **ppStmPDB)
{

   assert(m_pCocoonStorage);
   wprintf(L"ProvideAssembly called for binding identity: %s\n",
      pBindInfo->lpPostPolicyIdentity);

   // Check to see if administrator policy was applied. If so, print an error
   // to the command line and return "file not found." This will cause the
   // execution of the cocoon to stop with an exception.
   if (pBindInfo->ePolicyLevel == ePolicyLevelAdmin)
   {

      wprintf(L"Administrator Version Policy is present that redirects: "
              L"%s to %s. Stopping execution\n",
         pBindInfo->lpReferencedIdentity,
         pBindInfo->lpPostPolicyIdentity);
      return HRESULT_FROM_WIN32(ERROR_FILE_NOT_FOUND);
   }

   // The CStreamIndex class contains the contents of the _index stream.
   // Call this class to find and open the stream containing the
   // assembly described by AssemblyBindInfo.lpPostPolicyIdentity.
   HRESULT hr = m_pStreamIndex->GetStreamForBindingIdentity(
                                          pBindInfo->lpPostPolicyIdentity,
                                          pAssemblyId,
                                          ppStmAssemblyImage);

   // Don't use pContext for any host-specific data - set it to 0.
   // Also, don't return a stream containing debugging information
   // for this assembly.
   *pContext    = 0;
   *ppStmPDB    = NULL;

    return hr;
}

Resolving Module References

The ProvideModule method on IHostAssemblyStore exists to support assemblies that consist of multiple files. Before I get into the details of how ProvideModule works, some clarification of the terms assembly and module would be useful. Strictly speaking, an assembly is a collection of types and resources that act as a consistent unit in terms of versioning, deployment, security, and type visibility, among other characteristics. Nothing in the formal definition of an assembly says anything about how its contents are physically packaged. That is, the definition of an assembly does not dictate that all of an assembly’s contents must be contained in a single file. In practice, though, this is almost always the case. The primary reason for this is that most development tools present assemblies as single physical files. Nevertheless, some tools can build assemblies consisting of multiple files. For example, both the C# and Visual Basic .NET compilers support a command-line option called /addmodule that enables you to construct an assembly from multiple stand-alone files. In addition, you can use the SDK tool al.exe to build multi-file assemblies.

When an assembly consists of multiple files, one of those files contains an assembly manifest. A manifest is metadata that describes various aspects of the assembly, including its name and the files that make up the assembly. For example, consider the case of an assembly called Statistics that consists of three files: a file called statistics.dll, which contains the manifest, and two other code files called curves.dll and probability.dll. A high-level view of the contents of each of the files in this assembly is shown in Figure 8-7.

Figure 8-7. The contents of the files in the Statistics assembly

For purposes of the discussion of the IHostAssemblyStore interface, the file containing the assembly manifest (statistics.dll in the example) is called the assembly, whereas the other files in the assembly are called modules. When the Statistics assembly is initially referenced in code, the CLR calls the ProvideAssembly method to get the stream that contains the contents of statistics.dll. Then, if code contained in either curves.dll or probability.dll is referenced, the CLR calls ProvideModule to get the contents of those files.

Now that you understand when ProvideModule would be used, look at how to implement it. Many of the concepts I cover in the discussion of ProvideAssembly apply to ProvideModule as well. For example, the contents of modules are returned as pointers to IStream interfaces just as they are for assemblies. Also, the concept of assigning a unique identity to the stream that is returned applies here as well. So, many of the parameters to ProvideModule should look familiar. Here’s its definition from mscoree.idl:

interface IHostAssemblyStore: IUnknown
{
// Other method definitions omitted...

    HRESULT ProvideModule
            (
            [in] ModuleBindInfo *pBindInfo,
            [out] DWORD         *pdwModuleId,
            [out] IStream      **ppStmModuleImage,
            [out] IStream      **ppStmPDB);
}

As you can probably guess, the pdwModuleId parameter is used to assign a unique identity to the stream, the ppStmModuleImage parameter is used to return the IStream pointer representing the module, and ppStmPDB is an IStream pointer to the debugging information. The parameter that’s new is the ModuleBindInfo parameter. This parameter serves the same logical purpose as does the AssemblyBindInfo parameter to ProvideAssembly—it identifies the module to be loaded. The ModuleBindInfo structure has three fields as shown in the following definition:

typedef struct _ModuleBindInfo
{
    DWORD                       dwAppDomainId;
    LPCWSTR                     lpAssemblyIdentity;
    LPCWSTR                     lpModuleName;
} ModuleBindInfo;

The first field, dwAppDomainId, identifies the application domain into which the module will be loaded. This field serves the same purpose as does the dwAppDomainId field in the AssemblyBindInfo structure. Because a module is always part of an assembly, the implementer of ProvideModule must know which assembly contains the module being requested. The lpAssemblyIdentity field provides this information in the form of the string name of the containing assembly. The final field, lpModuleName, is the name of the module to load.

Not all CLR hosts support multi-file assemblies. As the creator of a new host, it’s up to you to decide how important multi-file assemblies are to your scenario. As I said earlier, tool support for creating multi-file assemblies isn’t great, so in practice you don’t see many of these assemblies. If you look at the popular hosts that exist today, you’ll see a mixed bag of support: the ASP.NET host, the Microsoft Internet Explorer host, and the default host all support multi-file assemblies, but the SQL Server host doesn’t. For purposes of this example, I’ve chosen not to support multi-file assemblies in runcocoon.exe. Opting not to support this scenario is easy from a coding perspective. All you need to do is return the HRESULT corresponding to ERROR_FILE_NOT_FOUND from IHostAssemblyStore::ProvideModule as shown in the following example:

HRESULT STDMETHODCALLTYPE CCocoonAssemblyStore::ProvideModule(
                  ModuleBindInfo *pBindInfo,
                  DWORD          *pdwModuleId,
                  IStream        **ppStmModuleImage,
                  IStream        **ppStmPDB)
{
   return HRESULT_FROM_WIN32(ERROR_FILE_NOT_FOUND);
}
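If your host did need multi-file assemblies, ProvideModule would have to find a module given the two strings in ModuleBindInfo. The following sketch shows one way that lookup might be organized. Everything in it is hypothetical: the ModuleTable class, its method names, and the stream names are inventions for illustration, and the 32-bit identifier simply mirrors the DWORD returned through pdwModuleId.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical table for a host that stores modules alongside assemblies.
// A module is keyed by the pair (containing assembly identity, module name),
// which corresponds to lpAssemblyIdentity and lpModuleName.
class ModuleTable
{
    using Key = std::pair<std::wstring, std::wstring>;
    struct Entry { std::uint32_t id; std::wstring streamName; };

public:
    // Registers a module and returns the identifier assigned to it.
    // Registering the same module twice returns the original identifier.
    std::uint32_t Add(const std::wstring &assemblyIdentity,
                      const std::wstring &moduleName,
                      const std::wstring &streamName)
    {
        Key key(assemblyIdentity, moduleName);
        auto found = m_modules.find(key);
        if (found != m_modules.end())
            return found->second.id;

        Entry entry{ m_nextId++, streamName };
        m_modules.emplace(key, entry);
        return entry.id;
    }

    // Looks up a module. On success, fills in the identifier and stream name
    // and returns true; otherwise returns false and leaves outputs untouched.
    bool Find(const std::wstring &assemblyIdentity,
              const std::wstring &moduleName,
              std::uint32_t *pModuleId,
              std::wstring *pStreamName) const
    {
        auto found = m_modules.find(Key(assemblyIdentity, moduleName));
        if (found == m_modules.end())
            return false;
        *pModuleId = found->second.id;
        *pStreamName = found->second.streamName;
        return true;
    }

private:
    std::map<Key, Entry> m_modules;
    std::uint32_t m_nextId = 1;
};
```

Keying on the pair rather than on the module name alone matters because two different assemblies are free to contain modules with the same file name.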

Bringing It All Together

The bulk of the implementation of the runcocoon.exe host is contained in the assembly loading manager I’ve been discussing in the last several sections. Let me now take that implementation and show what else is needed to make a fully functional runcocoon.exe. You need to take the following steps to complete the host:

  1. Open the .cocoon file passed as a command-line argument to runcocoon.exe.

  2. Initialize the CLR using CorBindToRuntimeEx.

  3. Create the objects that contain the implementation of the assembly loading manager and notify the CLR of their existence using a host control object.

  4. Use the application domain manager from the CocoonHostRuntime assembly to invoke the application contained in the cocoon. As the application runs, assemblies will be referenced and the implementation of IHostAssemblyStore will be called to load them from the cocoon.

The next several sections describe each step in detail.

Opening the Cocoon File

Because cocoons are OLE structured storage files, you can use the StgOpenStorage API from the platform SDK to open them. StgOpenStorage returns an IStorage pointer that you’ll save and use later to open the streams corresponding to the application’s assemblies. The following code from the main routine of runcocoon.exe uses StgOpenStorage to open the cocoon:

int wmain(int argc, wchar_t* argv[])
{
   HRESULT hr = S_OK;

   // Make sure a cocoon file was passed as a command-line argument.
   if (argc != 2)
   {
      wprintf(L"Usage: RunCocoon <cocoon file name>\n");
      return 0;
   }

   // Open the cocoon using the structured storage APIs.
   IStorage *pRootStorage = NULL;
   hr = StgOpenStorage(argv[1], NULL,
      STGM_READ | STGM_DIRECT | STGM_SHARE_EXCLUSIVE,
      NULL, 0, &pRootStorage);

   if (!SUCCEEDED(hr))
   {
      wprintf(L"Error opening cocoon file: %s\n", argv[1]);
      return 0;
   }
   // The rest of main omitted for brevity...
}

Initializing the CLR

After you’ve verified that you can open the cocoon, it’s time to initialize the CLR using CorBindToRuntimeEx. Your use of this API is straightforward: make sure the .NET Framework 2.0 version of the CLR gets loaded, then save the pointer to ICLRRuntimeHost so you can use it later to start the CLR, set the host control object, and access the ICLRControl interface to register your application domain manager with the CLR:

   // Start the CLR. Make sure the .NET Framework 2.0 build is used.
   ICLRRuntimeHost *pCLR = NULL;
   hr = CorBindToRuntimeEx(
      L"v2.0.41013",
      L"wks",
      STARTUP_CONCURRENT_GC,
      CLSID_CLRRuntimeHost,
      IID_ICLRRuntimeHost,
      (PVOID*) &pCLR);

Creating the Assembly Loading Manager and Host Control Object

It’s now time to hook your implementation of the assembly loading manager into the CLR so you get called to load assemblies out of the cocoon. Do this in three steps:

  1. Create an instance of the CCocoonAssemblyManager class. This class provides your implementation of IHostAssemblyManager. Recall from earlier in the chapter that this interface contains the GetNonHostStoreAssemblies method and also provides the CLR with your custom assembly store implementation through the GetAssemblyStore method.

  2. Create an instance of your host control object that provides the CLR with the implementation of your assembly loading manager. Runcocoon’s host control object is contained in the class CHostControl. This class implements the IHostControl interface that the CLR uses to discover which managers a host supports. When the CLR calls IHostControl::GetHostManager with the IID for IHostAssemblyManager, CHostControl returns your instance of CCocoonAssemblyManager (cast to a pointer to an IHostAssemblyManager interface, of course).

  3. The final step in hooking your assembly loading manager implementation into the CLR is to register your host control object with the CLR. Do this by passing an instance of CHostControl to ICLRRuntimeHost::SetHostControl.

The following code snippet from runcocoon’s main routine demonstrates these three steps:

int wmain(int argc, wchar_t* argv[])
{
   // The first part of main omitted...

   // Create an instance of CCocoonAssemblyManager. This class contains your
   // implementation of the assembly loading manager, specifically the
   // IHostAssemblyStore interface. Pass the IStorage for the cocoon's
   // root storage object to the constructor. CCocoonAssemblyManager saves
   // this pointer and uses it later to load assemblies from the cocoon using
   // IHostAssemblyStore.
   CCocoonAssemblyManager *pAsmManager = new
   CCocoonAssemblyManager(pRootStorage);
   assert(pAsmManager);

   // Create a host control that takes the new assembly loading manager. The
   // CHostControl class implements IHostControl, which the CLR calls at
   // startup to determine which managers you support. In this case,
   // support just the assembly loading manager.
   CHostControl *pHostControl = new CHostControl(NULL,
                       NULL,
                       NULL,
                       NULL,
                       NULL,
                       (IHostAssemblyManager *)pAsmManager,
                       NULL,
                       NULL,
                       NULL);

   // Tell the CLR about your host control object. Remember that you must do
   // this before calling ICLRRuntimeHost::Start.
   pCLR->SetHostControl((IHostControl *)pHostControl);

   // The rest of main omitted...
}
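The source for CHostControl isn’t shown in this chapter, but the dispatch it performs when the CLR asks for a manager can be sketched. The code below is a simplified model, not the real class: it uses stand-in types so that it compiles without the Windows and CLR hosting headers, identifies managers with a hypothetical ManagerId enumeration instead of an IID, and omits COM reference counting. Returning E_NOINTERFACE for managers the host doesn’t implement reflects my understanding of the IHostControl contract; treat it as an assumption and check the SDK documentation before relying on it.

```cpp
#include <cstdint>

// Stand-ins for HRESULT and two well-known codes. In a real host these come
// from the Windows headers.
typedef std::int32_t HResultSketch;
const HResultSketch kOk          = 0;
const HResultSketch kNoInterface = static_cast<HResultSketch>(0x80004002u);

// Hypothetical replacement for the IID the CLR passes to GetHostManager.
enum class ManagerId { Memory, Task, Assembly };

// Placeholder for the object that would implement IHostAssemblyManager.
struct AssemblyManagerStub { };

// Simplified model of a host control that supports only the assembly
// loading manager.
class HostControlSketch
{
public:
    explicit HostControlSketch(AssemblyManagerStub *pAssemblyManager)
        : m_pAssemblyManager(pAssemblyManager)
    {
    }

    // Hands back the assembly manager when asked for it and reports
    // "no such interface" for every other manager.
    // A real implementation would also AddRef the returned interface.
    HResultSketch GetHostManager(ManagerId id, void **ppObject) const
    {
        if (id == ManagerId::Assembly && m_pAssemblyManager != nullptr)
        {
            *ppObject = m_pAssemblyManager;
            return kOk;
        }
        *ppObject = nullptr;
        return kNoInterface;
    }

private:
    AssemblyManagerStub *m_pAssemblyManager;
};
```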

Invoking the Hosted Application

The application contained in the cocoon is invoked from runcocoon’s application domain manager. Your application domain manager is implemented by a class called CocoonDomainManager in the CocoonHostRuntime assembly. CocoonDomainManager has a Run method that takes the name of the assembly containing the application’s executable and the name of the type within that assembly that contains the main method. Run loads the assembly containing the executable using the Assembly.Load method. After the assembly is loaded, Run uses other methods in the System.Reflection namespace to launch the application. The code for the CocoonHostRuntime assembly is shown in Example 8-2.

Example 8-2. CocoonHostRuntime.cs

using System;
using System.Text;
using System.Reflection;

namespace CocoonHostRuntime
{
   public interface ICocoonDomainManager
   {
      void Run(string assemblyName, string typeName);
   }

   public class CocoonDomainManager : AppDomainManager, ICocoonDomainManager
   {
      public override void InitializeNewDomain(AppDomainSetup appDomainInfo)
      {
         // Set the flags so that the unmanaged portion of your
         // host gets notified of your domain manager via
         // IHostControl::SetAppDomainManager.
         InitializationFlags =
            DomainManagerInitializationFlags.RegisterWithHost;
      }

      // Run the "main" method from <assemblyName>.<typeName>.
      public void Run(string assemblyName, string typeName)
      {
         try
         {
            Assembly asm = Assembly.Load(assemblyName);
            Type t = asm.GetType(typeName, true, true);
            MethodInfo m = t.GetMethod("Main");
            m.Invoke(null, null);
         }
         catch (Exception e)
         {
            Console.WriteLine("Exception executing entry point: "
               + e.Message);
         }

      }
   }
}

Before you can use CocoonDomainManager to execute your hosted application, you need to get the name of the assembly and the type containing the application’s main from the cocoon. Recall that these names are contained in the _exeBindingIdentity and _entryPoint streams, respectively. The complete code for runcocoon’s main program is shown in Example 8-3.

Example 8-3. Runcocoon.cpp

//
// Runcocoon.cpp : The main program for the runcocoon host.
//

#include "stdafx.h"
#include "CHostControl.h"
#include "CStreamIndex.h"
#include "CCocoonAssemblyStore.h"
#include "CCocoonAssemblyManager.h"

// Reads the contents of pszStreamName and returns it in pszString.
// This method is used to read the _exeBindingIdentity and _entryPoint
// streams, which contain the binding identity of the assembly and the name of
// the type containing the application's entry point.
HRESULT GetStringFromStream(IStorage *pStorage, wchar_t *pszStreamName,
 wchar_t *pszString)
{

   IStream *pStream = NULL;
   HRESULT hr = pStorage->OpenStream(pszStreamName, 0,
                           STGM_READ | STGM_DIRECT | STGM_SHARE_EXCLUSIVE,
                           0, &pStream);
   assert(SUCCEEDED(hr));

   // Determine how many bytes to read based on the size of the stream.
   STATSTG stats;
   pStream->Stat(&stats, STATFLAG_DEFAULT);

   // Read the bytes into pszString.
   DWORD dwBytesRead = 0;
   hr = pStream->Read(pszString, stats.cbSize.LowPart, &dwBytesRead);
   assert(stats.cbSize.LowPart == dwBytesRead);
   assert(SUCCEEDED(hr));

   pStream->Release();

   return S_OK;
}

int wmain(int argc, wchar_t* argv[])
{
   HRESULT hr = S_OK;

   // Make sure a cocoon file was passed as a command-line argument.
   if (argc != 2)
   {
      wprintf(L"Usage: RunCocoon <cocoon file name>\n");
      return 0;
   }
   // Open the cocoon using the structured storage APIs.
   IStorage *pRootStorage = NULL;
   hr = StgOpenStorage(argv[1], NULL,
      STGM_READ | STGM_DIRECT | STGM_SHARE_EXCLUSIVE,
      NULL, 0, &pRootStorage);

   if (!SUCCEEDED(hr))
   {
      wprintf(L"Error opening cocoon file: %s\n", argv[1]);
      return 0;
   }

   // Start .NET Framework 2.0 version of the CLR.
   ICLRRuntimeHost *pCLR = NULL;
   hr = CorBindToRuntimeEx(
      L"v2.0.41013",
      L"wks",
      STARTUP_CONCURRENT_GC,
      CLSID_CLRRuntimeHost,
      IID_ICLRRuntimeHost,
      (PVOID*) &pCLR);

   assert(SUCCEEDED(hr));

   // Create an instance of CCocoonAssemblyManager. This class contains your
   // implementation of the assembly loading manager, specifically the
   // IHostAssemblyStore interface. Pass the IStorage for the cocoon's
   // root storage object to the constructor. CCocoonAssemblyManager saves
   // this pointer and uses it later to load assemblies from the cocoon using
   // IHostAssemblyStore.
   CCocoonAssemblyManager *pAsmManager = new
      CCocoonAssemblyManager(pRootStorage);
   assert(pAsmManager);

   // Create a host control object that takes the new assembly loading
   // manager. The CHostControl class implements IHostControl, which
   // the CLR calls at startup to determine which managers you support.
   // In this case, support just the assembly loading manager.
   CHostControl *pHostControl = new CHostControl(NULL,
                     NULL,
                     NULL,
                     NULL,
                     NULL,
                     (IHostAssemblyManager *)pAsmManager,
                     NULL,
                     NULL,
                     NULL);

   // Tell the CLR about your host control object. Remember that you
   // must do this before calling ICLRRuntimeHost::Start.
   hr = pCLR->SetHostControl((IHostControl *)pHostControl);
   assert(SUCCEEDED(hr));

   // Get the CLRControl object. Use this to set your AppDomainManager.
   ICLRControl *pCLRControl = NULL;
   hr = pCLR->GetCLRControl(&pCLRControl);
   assert(SUCCEEDED(hr));

   hr = pCLRControl->SetAppDomainManagerType(
      L"CocoonHostRuntime, Version=5.0.0.0, "
      L"PublicKeyToken=38c3b24e4a6ee45e, Culture=neutral",
      L"CocoonHostRuntime.CocoonDomainManager");
   assert(SUCCEEDED(hr));

   // Start the CLR.
   hr = pCLR->Start();

   // Get the binding identity for the exe contained in the cocoon.
   wchar_t wszExeIdentity[MAX_PATH];
   ZeroMemory(wszExeIdentity, MAX_PATH*sizeof(wchar_t));
   hr = GetStringFromStream(pRootStorage, L"_exeBindingIdentity",
      wszExeIdentity);

   // Get the name of the type containing the application's main method.
   wchar_t wszEntryType[MAX_PATH];
   ZeroMemory(wszEntryType, MAX_PATH*sizeof(wchar_t));
   hr = GetStringFromStream(pRootStorage, L"_entryPoint", wszEntryType);

   // Launch the application using your domain manager.
   ICocoonDomainManager *pDomainManager =
   pHostControl->GetDomainManagerForDefaultDomain();
   assert(pDomainManager);

   hr = pDomainManager->Run(wszExeIdentity, wszEntryType);
   assert(SUCCEEDED(hr));

   pDomainManager->Release();
   pCLRControl->Release();
   pHostControl->Release();
   return 0;
}

The complete source code for the runcocoon host can be found on this book’s companion Web site.
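One detail worth noting in Example 8-3 is that GetStringFromStream reads however many bytes the stream contains into a fixed-size buffer supplied by the caller. A more defensive variant would check the size first. The sketch below illustrates that check in portable C++; it is not part of runcocoon. The stream is modeled as a byte vector, the function name and parameters are hypothetical, and I’m assuming that the streams hold UTF-16 text without a terminating null, which is what the ZeroMemory calls in wmain suggest.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Copies the bytes of a stream (modeled here as a byte vector) into a
// caller-supplied buffer of UTF-16 code units and null-terminates it.
// Returns false, without writing anything, if the buffer is missing, the
// byte count is odd, or the text plus terminator won't fit.
bool CopyStreamToString(const std::vector<unsigned char> &streamBytes,
                        char16_t *pszString,
                        std::size_t capacityInChars)
{
    if (pszString == nullptr || capacityInChars == 0)
        return false;

    // UTF-16 text always occupies a whole number of 2-byte code units.
    if (streamBytes.size() % sizeof(char16_t) != 0)
        return false;

    std::size_t charCount = streamBytes.size() / sizeof(char16_t);
    if (charCount >= capacityInChars)   // leave room for the terminator
        return false;

    if (!streamBytes.empty())
        std::memcpy(pszString, streamBytes.data(), streamBytes.size());
    pszString[charCount] = u'\0';
    return true;
}
```

In the real host, the same idea would mean passing the buffer’s capacity to GetStringFromStream and comparing it against stats.cbSize before calling IStream::Read.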

Customizing How Assemblies Are Loaded Using Only Managed Code

By writing the runcocoon.exe host in unmanaged code, you were able to implement an assembly loading manager that enabled you to customize completely how the CLR loads assemblies. However, as you’ve seen, there were several new concepts to learn and a considerable amount of code to write. You can also customize the CLR assembly loading process to some degree by writing completely in managed code. The amount of customization available is less than what you can achieve by writing an assembly loading manager, but if what you need to accomplish can be done from within managed code, this approach can save you some time and effort.

In this section, you’ll rewrite the runcocoon.exe host in managed code. Doing so gives you a good chance to contrast the amount of customization available between an unmanaged CLR host and a managed extensible application. The extensible application, called runcocoonm.exe, will provide the same basic functionality that the unmanaged host did. That is, it will run applications encased in .cocoon files. You’ll invoke the application’s entry point just as you did in runcocoon.exe and load the application’s assemblies out of the cocoon file instead of having the CLR find them. Although on the surface the functionality you’ll be providing is the same, there are several subtle differences in the way the two programs work. Understanding these differences can help you decide which approach best meets your needs.

Before I describe how runcocoonm.exe is implemented, take a look at the pieces you need to build it. To start with, let me revisit the requirements I set for the cocoon deployment model:

  1. Assemblies must be loaded from formats other than standard executable files on disk. In this case, the assemblies must be loaded out of your custom deployment format, an OLE structured storage file.

  2. Assemblies must be loaded from a location other than the application’s base directory, the global assembly cache, or from locations described by codebases.

  3. The assemblies contained in the cocoon are the exact assemblies to be loaded when the application is run. The presence of external version policy won’t cause you to load a different assembly.

In the unmanaged implementation, these requirements were satisfied by writing an assembly loading manager. In managed code, you achieve a similar effect by using some of the managed methods and events on the System.AppDomain and the System.Reflection.Assembly classes. Specifically, the ability to load assemblies from alternate formats is provided by the versions of Assembly.Load and AppDomain.Load that enable you to specify an array of bytes containing the assembly you’d like to load. The ability to load assemblies from locations in which the CLR wouldn’t normally find them is provided by an event on System.AppDomain called AssemblyResolve. The third requirement—to be able to circumvent default version policy—isn’t directly provided in managed code. This is one of the primary limitations in what a managed program can do, as you’ll see in a bit.

The next few sections describe how the Assembly.Load(byte[]...) and AppDomain.Load(byte[]...) methods and the AppDomain.AssemblyResolve event work. Once you’ve seen how to implement the pieces, you’ll bring them together in the runcocoonm.exe sample.

The Load(byte[]...) Methods

In runcocoon.exe, you returned assemblies from the cocoon to the CLR by returning a pointer to an IStream interface from IHostAssemblyStore::ProvideAssembly. You can achieve the same effect from within managed code by passing the assembly you’d like to load as a managed byte array to AppDomain.Load and Assembly.Load. The following partial class definitions show the versions of Load that accept a byte array as input:

   public sealed class AppDomain : MarshalByRefObject, _AppDomain,
      IEvidenceFactory
   {
      // ...
      public Assembly Load(byte[] rawAssembly);

      public Assembly Load(byte[] rawAssembly,
                           byte[] rawSymbolStore);

      public Assembly Load(byte[] rawAssembly,
                           byte[] rawSymbolStore,
                           Evidence securityEvidence);
      // ...
   }

   public class Assembly : IEvidenceFactory, ICustomAttributeProvider,
      ISerializable
   {
      // ...
      public static Assembly Load(byte[] rawAssembly);

      public static Assembly Load(byte[] rawAssembly,
                                  byte[] rawSymbolStore);

      public static Assembly Load(byte[] rawAssembly,
                                  byte[] rawSymbolStore,
                                  Evidence securityEvidence);
      // ...
   }

As you can see by these definitions, both Assembly.Load and AppDomain.Load also enable you to pass a byte array containing the debugging file. In the unmanaged implementation, you accomplished this by returning an IStream pointer to the debugging file from IHostAssemblyStore::ProvideAssembly. I’m going to skip the Evidence parameter for now and leave it for the discussion of security in Chapter 10.

On the surface, the forms of Load that accept a byte array and IHostAssemblyStore::ProvideAssembly provide the same functionality—they both enable you to load an assembly from any store you choose. However, using the managed Load method to achieve this is much less efficient. To understand why, you need to take a high-level look at how memory is used by the CLR when it loads an assembly for execution. Before the CLR can run the code in an assembly, it reads the contents of the assembly into memory, verifies that it is well formed, and builds several internal data structures. All of this is done in a heap allocated by a component of the CLR called the class loader. Because these heaps hold only native CLR data structures and not managed objects, I refer to them as unmanaged heaps. In contrast, every process in which managed code is run has a heap where the managed objects are stored. This is the heap that is managed by the CLR garbage collector. I call this heap the managed heap. When an assembly is loaded from an IStream*, the CLR calls IStream::Read to pull the contents of the assembly into an unmanaged heap where it can be verified and then executed. Because the contents of the assembly can be directly loaded into an unmanaged heap, loading from an IStream* is very efficient, as shown in Figure 8-8.

Figure 8-8. Assembly loading from IStream*

Loading an assembly with the managed Load method is less efficient because extra copies of the assembly must be made in memory before the CLR can execute it. The Load method takes an array of managed byte objects. Because these objects are managed, they must live in the managed heap. However, the CLR ultimately needs a copy of the bytes in an unmanaged heap to run them. As a result, an extra copy is made by the CLR to move the assembly’s contents from the managed heap to an unmanaged heap. In the cocoon scenario, the case is even worse. You start by reading the contents of a stream into unmanaged memory. From there, you marshal those bytes to the managed heap so Load can be called. The CLR implementation of Load then copies the bytes back to unmanaged memory again! So, you’ve made two full copies of the assembly before it can be executed. This situation is shown in Figure 8-9.

Figure 8-9. Assembly loaded from managed byte array (byte[ ])

In addition to the fact that multiple memory copies must be made to prepare an assembly for execution, the Load method is also less efficient because it doesn’t provide a way for the caller to assign a unique identifier to the assembly. Recall that the CLR uses a unique identifier internally to prevent loading the same assembly multiple times. When loading an assembly from a file, the fully qualified filename is used as this unique identifier. When loading an assembly returned from IHostAssemblyStore::ProvideAssembly, the host creates a unique identity and returns it in the pAssemblyId parameter. Because there is no way to specify a unique identifier for an assembly loaded from a managed byte array, the CLR has no way to tell whether the same assembly is being loaded multiple times, so it must treat each call to Load as a separate assembly. As a result, scenarios in which the same assembly is loaded multiple times use much more memory than they would if the CLR could recognize the duplicates.

Despite its limitations, Load is still commonly used to load assemblies from custom formats. The reason, of course, is that it’s far easier to use than implementing an entire assembly loading manager. If you need to load an assembly from something other than a standard PE file, try using Load first. If you find that the performance is inadequate for your scenario, you can always go back and reimplement part of your application in unmanaged code to take advantage of an assembly loading manager.
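
As a minimal sketch of the simplest form of Load, the following helper reads an assembly into a managed byte array and hands it to Assembly.Load(byte[]). The names LoadFromBytesSketch and LoadFromBytes are hypothetical, and a file on disk stands in here for whatever custom store you use; File.ReadAllBytes is just the shortest way to obtain a byte array for illustration.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper, not part of the runcocoonm sample.
public static class LoadFromBytesSketch
{
    // Reads the raw contents of an assembly into a managed byte array
    // and asks the CLR to load the assembly from those bytes. The CLR
    // copies the bytes out of the managed heap before it can use them.
    public static Assembly LoadFromBytes(string path)
    {
        byte[] rawAssembly = File.ReadAllBytes(path);
        return Assembly.Load(rawAssembly);
    }
}
```

Calling LoadFromBytes twice with the same path and comparing the two results with Object.ReferenceEquals is an easy way to observe how the CLR treats repeated loads from a byte array.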

The AssemblyResolve Event

The CLR raises the AppDomain.AssemblyResolve event when it cannot resolve a reference to an assembly. Managed programs can load assemblies from locations in which the CLR wouldn’t normally find them by providing a handler for this event.

The key difference between resolving assemblies by handling the AssemblyResolve event and by implementing an assembly loading manager is one of timing: the AssemblyResolve event is raised only after the CLR has failed to locate an assembly, whereas the assembly loading manager (specifically IHostAssemblyStore::ProvideAssembly) is called before the CLR even starts looking. This difference has huge implications in that it prevents you from building an application model that is completely isolated from the way the CLR applies version policy and loads assemblies by default. As an example, consider how the cocoon deployment model is affected by this difference in behavior. Consider the case in which one of the assemblies contained in the cocoon file is also present in the global assembly cache. Because the CLR would look in the GAC first, the assembly would be found there and the event would never be raised, so you’d never have the chance to load the assembly out of the cocoon. Furthermore, consider the case in which version policy is present on the system for an assembly contained in the cocoon. Because policy is evaluated as part of the CLR normal resolution process, this could cause a different version of the assembly to be loaded, again without you ever getting the chance to affect this. So, as you can see, although the AssemblyResolve event does enable you to load an assembly from a location in which the CLR wouldn’t normally look, it doesn’t provide the same level of customization that you can achieve by writing an assembly loading manager in unmanaged code.
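
One way to observe this ordering for yourself is to count how often the event is raised during a load. The sketch below is an illustration only: ResolveOrderSketch and CountResolveEvents are made-up names, and the handler deliberately returns null so that it never supplies an assembly.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper for observing when AssemblyResolve is raised.
public static class ResolveOrderSketch
{
    private static int raised;

    // Counts the event and declines to resolve anything.
    private static Assembly CountingHandler(object sender, ResolveEventArgs e)
    {
        raised++;
        return null;
    }

    // Returns how many times AssemblyResolve was raised while the CLR
    // tried to load the named assembly.
    public static int CountResolveEvents(string assemblyName)
    {
        ResolveEventHandler handler = new ResolveEventHandler(CountingHandler);
        raised = 0;
        AppDomain.CurrentDomain.AssemblyResolve += handler;
        try
        {
            Assembly.Load(assemblyName);
        }
        catch (FileNotFoundException)
        {
            // Expected when neither the CLR nor the handler finds the assembly.
        }
        finally
        {
            AppDomain.CurrentDomain.AssemblyResolve -= handler;
        }
        return raised;
    }
}
```

Passing the full name of an assembly the CLR can already find, such as typeof(object).Assembly.FullName, should leave the count at zero, whereas a made-up name such as Hypothetical.Missing.Assembly causes the event to be raised before the load fails.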

To use the AppDomain.AssemblyResolve event, you simply create a delegate of type System.ResolveEventHandler and add it to the application domain’s list of handlers for the event as shown in the following code snippet:

class ResolveClass
{
   static Assembly AssemblyResolveHandler(Object sender, ResolveEventArgs e)
   {
      // Locate or create an assembly depending on your scenario and return
      // it.
      Assembly asm = ...
      return asm;

   }
  static void Main(string[] args)
  {
     // ...

     // Set up the delegate for the assembly resolve event.
     Thread.GetDomain().AssemblyResolve +=new
     ResolveEventHandler(ResolveClass.AssemblyResolveHandler);

     //...

  }
}

As you can see, the handler for the AssemblyResolve event takes as input an object of type ResolveEventArgs and returns an instance of Assembly. ResolveEventArgs has a public property called Name that contains the string name of the assembly the CLR could not locate. Upon return from the event handler, the CLR checks the assembly it has been given to make sure it has the identity given in the Name property. As long as the assembly you return has the correct identity, you’re free to take whatever steps you need in your event handler to find the assembly.
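
To make the shape of such a handler concrete, here is a small sketch that resolves assemblies from an in-memory table of byte arrays. Everything named here (InMemoryResolver, Add, GetSimpleName, Register) is hypothetical and stands in for your own store. The sketch uses the AssemblyName class to extract the simple name from ResolveEventArgs.Name, and it returns null when it has nothing to offer so that the CLR reports the failure as usual.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical resolver backed by an in-memory table of raw assemblies.
public static class InMemoryResolver
{
    // Maps a simple assembly name to the raw bytes of that assembly.
    private static readonly Dictionary<string, byte[]> store =
        new Dictionary<string, byte[]>(StringComparer.OrdinalIgnoreCase);

    public static void Add(string simpleName, byte[] rawAssembly)
    {
        store[simpleName] = rawAssembly;
    }

    // The requested name may be a simple name or a full display name
    // such as "MyLib, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null".
    public static string GetSimpleName(string requestedName)
    {
        return new AssemblyName(requestedName).Name;
    }

    // Matches the ResolveEventHandler signature.
    public static Assembly Resolve(object sender, ResolveEventArgs e)
    {
        byte[] rawAssembly;
        if (store.TryGetValue(GetSimpleName(e.Name), out rawAssembly))
            return Assembly.Load(rawAssembly);

        // Nothing in the store: let the CLR report the failure.
        return null;
    }

    // Adds the handler to the current application domain.
    public static void Register()
    {
        AppDomain.CurrentDomain.AssemblyResolve +=
            new ResolveEventHandler(Resolve);
    }
}
```

Keying the table by simple name mirrors what the cocoon sample does with stream names; a store that holds several versions of the same assembly would need a richer key.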

The Runcocoonm Sample

Now that you’ve seen how the AssemblyResolve event and the Load(byte[]...) methods work, it’s easy to put them together to implement the managed version of the cocoon host. You’ll create a delegate to handle the AssemblyResolve event and add it to the default application domain’s list of handlers. Then, just as you did in the unmanaged implementation, you’ll use the CocoonHostRuntime assembly to invoke the main entry point for the application contained in the cocoon. The CLR will fail to find the assemblies in the cocoon, so it will raise the AssemblyResolve event and your handler will get called. In your handler, you’ll use the ResolveEventArgs.Name property to determine which assembly you need to load. You pull that assembly out of the cocoon file as a managed array of bytes and call Assembly.Load.

Because the cocoon files are OLE structured storage files, you need some intermediate layer that reads the assembly using the structured storage interfaces (specifically IStorage and IStream) and then returns the contents as a series of bytes. I’ve written an unmanaged helper DLL called cocoonreader.dll that performs this task. The code in the runcocoonm.exe program uses the CLR Platform Invoke services to call cocoonreader.dll. The architecture of runcocoonm.exe is shown in Figure 8-10.

Figure 8-10. Runcocoonm architecture

The code for the complete program is shown in the following listings. Example 8-4 contains the code for cocoonreader.dll, and Example 8-5 shows the code for runcocoonm.exe.

Example 8-4. Cocoonreader.dll

//
// Cocoonreader.cpp: Contains utilities used by runcocoonm.exe to read
// assemblies out of OLE structured storage cocoon files.
//

#include "stdafx.h"

// Opens a cocoon file given a name. Each call to CocoonOpenCocoon must
// be matched by a call to CocoonCloseCocoon.
extern "C" __declspec(dllexport) HRESULT CocoonOpenCocoon(
LPWSTR pszCocoonName, IStorage **pRootStorage)
{
   return StgOpenStorage(pszCocoonName, NULL, STGM_READ | STGM_DIRECT |
      STGM_SHARE_EXCLUSIVE, NULL, 0, pRootStorage);

}

// Closes the cocoon file by releasing the cocoon's root storage.
extern "C" __declspec(dllexport) HRESULT CocoonCloseCocoon(
IStorage *pRootStorage)
{
   if (pRootStorage) pRootStorage->Release();
   return S_OK;

}

// Opens a stream within a cocoon given a name. Each call to CocoonOpenStream
// must be matched by a call to CocoonCloseStream.
extern "C" __declspec(dllexport) HRESULT CocoonOpenStream(
IStorage *pRootStorage, LPWSTR pszStreamName, IStream **pStream)

{
   return pRootStorage->OpenStream(pszStreamName, 0, STGM_READ | STGM_DIRECT |
      STGM_SHARE_EXCLUSIVE, 0, pStream);
}

// Closes a stream.
extern "C" __declspec(dllexport) HRESULT CocoonCloseStream(IStream *pStream)
{
   if (pStream) pStream->Release();
   return S_OK;
}

// Returns the size of a stream in bytes.
extern "C" __declspec(dllexport) HRESULT CocoonGetStreamSize(IStream *pStream,
DWORD *pSize)
{
   assert(pStream);

   // Get the statistics for the stream - which includes the size.
   STATSTG stats;
   pStream->Stat(&stats, STATFLAG_DEFAULT);

   // Return the size.
   *pSize = stats.cbSize.LowPart;
   return S_OK;

}

// Returns the contents of the stream. The caller is responsible
// for allocating and freeing the memory pointed to by pBytes.
extern "C" __declspec(dllexport) HRESULT CocoonGetStreamBytes(
IStream *pStream, BYTE *pBytes)
{
   assert (pStream);

   // Get the number of bytes to read.
   STATSTG stats;
   pStream->Stat(&stats, STATFLAG_DEFAULT);
   DWORD dwSize = stats.cbSize.LowPart;

   // Read from the stream.
   DWORD dwBytesRead = 0;
   pStream->Read(pBytes, dwSize, &dwBytesRead);
   assert (dwSize == dwBytesRead);
   return S_OK;
}

Example 8-5. Runcocoonm.exe

using System;
using System.Runtime.InteropServices;
using System.Reflection;
using System.Threading;
using CocoonRuntime;

namespace RunCocoonM
{

   class CCocoonHost
   {
      // Import the definitions for the helper routines from
      // cocoonreader.dll.
      [ DllImport( "CocoonReader.dll",CharSet=CharSet.Unicode)]
      public static extern int CocoonOpenCocoon(string cocoonName,
         ref IntPtr pCocoon);

      [ DllImport( "CocoonReader.dll",CharSet=CharSet.Unicode)]
      public static extern int CocoonCloseCocoon(IntPtr pCocoon);

      [ DllImport( "CocoonReader.dll",CharSet=CharSet.Unicode)]
      public static extern int CocoonOpenStream(IntPtr pCocoon,
         string streamName, ref IntPtr pStream);

      [ DllImport( "CocoonReader.dll",CharSet=CharSet.Unicode)]
      public static extern int CocoonCloseStream(IntPtr pStream);

      [ DllImport( "CocoonReader.dll",CharSet=CharSet.Unicode)]
      public static extern int CocoonGetStreamSize(IntPtr pStream,
         ref int size);

      [ DllImport( "CocoonReader.dll",CharSet=CharSet.Unicode)]
      public static extern int CocoonGetStreamBytes(IntPtr pStream,
         IntPtr streamBytes);

      static Assembly AssemblyResolveHandler(Object sender,
         ResolveEventArgs e)
      {
         // Get the name of the assembly you need to resolve from the
         // event args. You want just the simple text name. If the name
         // is fully qualified, you want just the portion before the
         // comma.
         string simpleAssemblyName;
         int commaIndex = e.Name.IndexOf(',');

         if (commaIndex == -1)
            simpleAssemblyName = e.Name;
         else
            simpleAssemblyName = e.Name.Substring(0, commaIndex);

         // Retrieve the cocoon from the application domain property.
         IntPtr pCocoon = (IntPtr) Thread.GetDomain().GetData("Cocoon");

         // Open the stream for the assembly. If the cocoon has no such
         // stream, return null so the CLR reports the load failure.
         IntPtr pStream = IntPtr.Zero;
         int hr = CocoonOpenStream(pCocoon, simpleAssemblyName, ref pStream);
         if (hr < 0 || pStream == IntPtr.Zero)
            return null;

         // Call the helper DLL to get the number of bytes in the stream
         // you're about to read. You need the size so you can allocate
         // the correct number of bytes in the managed array.
         int size = 0;
         CocoonGetStreamSize(pStream, ref size);

         // Allocate enough memory to hold the contents of the entire
         // stream.
         IntPtr pBytes = Marshal.AllocHGlobal(size);

         // Read the assembly from the cocoon.
         CocoonGetStreamBytes(pStream, pBytes);

         // Copy the bytes from unmanaged memory into your managed byte
         // array. You need the bytes in this format to call
         // Assembly.Load.
         byte[] assemblyBytes = new byte[size];
         Marshal.Copy(pBytes, assemblyBytes, 0, size);

         // Free the unmanaged memory.
         Marshal.FreeHGlobal(pBytes);

         // Close the stream.
         CocoonCloseStream(pStream);

         // Load the assembly from the byte array and return it.
         return Assembly.Load(assemblyBytes, null, null);
      }

      static string GetTypeNameString()
      {
         // Retrieve the cocoon from the application domain property.
         IntPtr pCocoon = (IntPtr) Thread.GetDomain().GetData("Cocoon");

         // Open the "_entryPoint" stream.
         IntPtr pStream = IntPtr.Zero;
         CocoonOpenStream(pCocoon, "_entryPoint", ref pStream);

         // Get the size in bytes of the stream containing the main type
         // name. You need to know the size so you can allocate the
         // correct amount of space to hold the name.
         int size = 0;
         CocoonGetStreamSize(pStream, ref size);

         // Allocate enough space to hold the main type name.
         IntPtr pBytes = Marshal.AllocHGlobal(size);

         // Read the main type name from the cocoon.
         CocoonGetStreamBytes(pStream, pBytes);

         // Copy the stream's contents from unmanaged memory into a
         // managed character array, then create a string from the
         // character array. The name is stored as 2-byte characters,
         // and Marshal.Copy counts array elements rather than bytes,
         // so copy size/2 characters.
         int charCount = size / 2;
         char[] typeNameChars = new char[charCount];
         Marshal.Copy(pBytes, typeNameChars, 0, charCount);
         string typeName = new string(typeNameChars, 0, charCount);

         // Free the unmanaged memory.
         Marshal.FreeHGlobal(pBytes);

         // Close the "_entryPoint" stream.
         CocoonCloseStream(pStream);

         return typeName;
      }

      [STAThread]
      static void Main(string[] args)
      {
         // Make sure the name of a cocoon file was passed on the
         // command line.
         if (args.Length != 1)
         {
            Console.WriteLine("Usage: RunCocoonM <cocoon file>");
            return;
         }

         // Open the cocoon file and store a pointer to it in
         // an application domain property. You need this value in
         // your AssemblyResolve event handler.
         IntPtr pCocoon = IntPtr.Zero;
         CocoonOpenCocoon(args[0], ref pCocoon);

         Thread.GetDomain().SetData("Cocoon", pCocoon);

         // Strip off the .cocoon from the command-line argument to get
         // the name of the assembly within the cocoon that contains the
         // Main method.
         int dotIndex = args[0].IndexOf('.');
         string assemblyName = args[0].Substring(0, dotIndex);

         // Get name of type containing the application's Main method
         // from the cocoon.
         string typeName = CCocoonHost.GetTypeNameString();

         // Set up the delegate for the assembly resolve event.
         Thread.GetDomain().AssemblyResolve += new
            ResolveEventHandler(CCocoonHost.AssemblyResolveHandler);

         // Use CocoonHostRuntime to invoke Main.
         CocoonDomainManager cdm = new CocoonDomainManager();
         cdm.Run(assemblyName, typeName);

         // Close the cocoon file.
         CocoonCloseCocoon(pCocoon);
      }
   }
}

Supporting Multifile Assemblies

Managed programs can support multifile assemblies by handling the Assembly.ModuleResolve event. This event is raised by the CLR if it cannot find one of an assembly’s modules at run time, just as the AssemblyResolve event is raised if the file containing the assembly’s manifest cannot be found.

Because modules are always part of an assembly, handlers for the ModuleResolve event are always associated with a particular assembly, not with the application domain as the AssemblyResolve event is. To register for the ModuleResolve event, create a delegate of type ModuleResolveEventHandler and add it to the appropriate assembly’s list of handlers. The following code shows a handler for the ModuleResolve event being registered for the currently executing assembly:

class ResolveClass
{
   static Module ModuleResolveHandler(Object sender, ResolveEventArgs e)
   {
      // Locate or create the module depending on your scenario and
      // return it.
      Module m = ...
      return m;

   }
   static void Main(string[] args)
   {
      // ...

      // Set up the delegate for the module resolve event.
      Assembly currentAssembly = Assembly.GetExecutingAssembly();
      currentAssembly.ModuleResolve += new
         ModuleResolveEventHandler(ResolveClass.ModuleResolveHandler);

      //...
   }
}

The ModuleResolve event takes as input an object of type ResolveEventArgs and returns an instance of a System.Reflection.Module. ResolveEventArgs is the same type of class that was passed to the AssemblyResolve event. In this case, the Name property contains the string name of the module the CLR could not locate.

The most common way to get an instance of Module to return from the ModuleResolve event is to load one from an array of bytes using the LoadModule method on System.Reflection.Assembly. (The other way to get a module is to create one dynamically using the classes in System.Reflection.Emit.) LoadModule has two overloads, one that enables you to specify debugging information in addition to the module itself and one that does not:

public Module LoadModule(String moduleName,
                         byte[] rawModule)

public Module LoadModule(String moduleName,
                         byte[] rawModule,
                         byte[] rawSymbolStore)

When calling LoadModule, you must pass the name of the module in the moduleName parameter. The CLR uses this string to identify which module is being loaded by checking the name against the list of modules stored in the assembly’s manifest. If you pass a moduleName that cannot be found in the manifest, the CLR throws an exception of type System.ArgumentException.
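
Putting the event and LoadModule together, a handler might look like the following sketch. The ModuleBytesProvider delegate and the ModuleResolverSketch class are hypothetical names introduced for illustration; the provider stands in for whatever store holds your modules. Returning null when the store has no bytes is a choice made in this sketch, not documented CLR guidance.

```csharp
using System;
using System.Reflection;

// Hypothetical callback that returns a module's raw bytes, or null if
// the store does not contain the module.
public delegate byte[] ModuleBytesProvider(string moduleName);

// Hypothetical helper that answers ModuleResolve for one assembly.
public sealed class ModuleResolverSketch
{
    private readonly Assembly owner;
    private readonly ModuleBytesProvider provider;

    public ModuleResolverSketch(Assembly owner, ModuleBytesProvider provider)
    {
        this.owner = owner;
        this.provider = provider;
    }

    // Matches the ModuleResolveEventHandler signature.
    public Module Resolve(object sender, ResolveEventArgs e)
    {
        byte[] rawModule = provider(e.Name);
        if (rawModule == null)
            return null; // assumption: signal "not found" by returning null

        // The module name must match an entry in the owner's manifest.
        return owner.LoadModule(e.Name, rawModule);
    }

    // Adds the handler to the owning assembly's ModuleResolve event.
    public void Register()
    {
        owner.ModuleResolve += new ModuleResolveEventHandler(Resolve);
    }
}
```

Because the handler is tied to a single assembly, the sketch keeps a reference to that assembly and calls LoadModule on it, passing through the module name the CLR asked for.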

Summary

The CLR is incredibly flexible in its ability to be adapted to a variety of application environments and deployment scenarios. In this chapter, I discussed how you can completely replace the way the CLR locates and loads assemblies if you need that amount of customization. In general, you can use two different techniques to customize the CLR default deployment model. The approach you pick will depend directly on the amount of customization you require. The CLR hosting APIs provide the hooks necessary to load assemblies from virtually any format or storage mechanism imaginable. Although this approach provides the greatest degree of flexibility, it also requires the most effort in terms of implementation. As an alternative to writing an unmanaged host, you can achieve some level of customization from completely within managed code. The primary capability you lose with this approach is the ability to change the way the version policy system works. You also see slightly poorer performance than you would if you were to write a full CLR host. If neither issue is of concern, you can save yourself some time and effort by taking advantage of the productivity gains you’ll realize by sticking with managed code.



[1] Thanks to my colleague Jim Hogg for the term cocoon.
