Managed Code Execution

To load and execute an assembly, the common language runtime has to be hosted within a process first. The .NET Framework provides APIs to let an application host the common language runtime. Some examples of such host applications are ASP.NET, SQL Server .NET, and so on.

An EXE-based assembly contains a small piece of bootstrapping code that points to a function exported by the .NET runtime, named CorExeMain. When such an assembly is executed, the OS creates a process and executes CorExeMain. This function in turn loads the common language runtime into the process and transfers control to it.

Using a standard PE file mechanism to bootstrap the common language runtime is a clever technique on the part of Microsoft. It ensures that many things “just work.” For example, to run the application, you can double-click an application in the Explorer window or enter its name at the command line.

The function CorExeMain loads the EE, reads the assembly's manifest, and loads the module containing the entry point for the application. From this point, the common language runtime (the EE, more specifically) goes through the following general steps:

1. Metadata validation

2. Code verification

3. Compilation of MSIL code to native machine code

4. Execution of the compiled code

Figure 4.3 shows the overall process of how a method is compiled and executed.

Figure 4.3. Method execution.


As the code executes, it may make references to other types. The common language runtime loads the module containing the referenced type, if it has not been loaded already, as illustrated in Figure 4.3.

Once a module has been loaded and validated, and the reference to the type has been resolved, the runtime is ready to execute the type's method. However, before this is done, the runtime has to take care of a few things:

  • The MSIL code for the method has to be validated. For example, the MSIL code should not contain any invalid opcodes.

  • The MSIL code has to be checked for unsafe code. This is called code verification.

  • Finally, the MSIL code for the method has to be converted to the native machine language instructions. This on-the-fly conversion is referred to as just-in-time (JIT) compilation.

We will cover these steps in more detail shortly.

After the MSIL instructions have been converted to native machine language instructions, the common language runtime steps aside and lets the native machine code execute. While the code is executing, the common language runtime provides any runtime service that is needed, such as automatic memory management, enhanced security, interoperability with COM components, and so on.

The process of loading a module when needed, metadata validation, verification, compilation, and then executing the native code is repeated until the execution is complete.

It is interesting to note that a module is loaded as a side effect of referencing a type contained in the module. If some types within an assembly are placed in a module such that a particular execution path never references any of the types in the module, then the runtime will not load the module. By bundling the least-used types in a separate module, you can improve execution performance, especially if the assembly needs to be downloaded over a slow link. If the module is not used, it will not be downloaded.
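The same load-on-first-reference behavior can be observed one level up, at assembly granularity. The following is a minimal sketch, not the chapter's own sample: it assumes the AppDomain.AssemblyLoad event is an acceptable stand-in for watching module loads, and the names LazyLoadDemo, RecordLoads, and UseXml are hypothetical. Which assemblies actually get reported depends on the runtime version and on what has already been loaded into the process.

```csharp
using System;
using System.Collections.Generic;

public static class LazyLoadDemo
{
    // Records the names of the assemblies that the runtime loads
    // while 'action' is running.
    public static List<string> RecordLoads(Action action)
    {
        var loaded = new List<string>();
        AssemblyLoadEventHandler handler =
            (sender, args) => loaded.Add(args.LoadedAssembly.GetName().Name);

        AppDomain.CurrentDomain.AssemblyLoad += handler;
        try
        {
            action();
        }
        finally
        {
            // Always unsubscribe so later loads are not recorded.
            AppDomain.CurrentDomain.AssemblyLoad -= handler;
        }
        return loaded;
    }

    // Hypothetical workload: the only method here that references
    // types from the XML library.
    public static void UseXml()
    {
        var doc = new System.Xml.XmlDocument();
        doc.LoadXml("<greeting/>");
    }
}
```

Calling LazyLoadDemo.RecordLoads(LazyLoadDemo.UseXml) and printing the returned list shows which assemblies, if any, were pulled in by that execution path. If UseXml is never called, nothing in this class gives the runtime a reason to load the XML library.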

Metadata Validation

As each module is loaded, the common language runtime performs a set of tests on it to ensure that the file format and the metadata are self-consistent. This set of tests is referred to as metadata validation. The ECMA specifications define what constitutes valid metadata. As of this writing, the specification consists of about 500 rules.

Note that MSIL code is not validated when the module is loaded. This is done on a per-method basis—when the method is about to be executed. As a result, if a method is never executed, its MSIL code is never validated.

Code Validation and Verification

Once a module has been loaded and validated and the reference to the type has been resolved, the common language runtime is ready to execute the type's method. However, before this is done, the common language runtime has to validate and verify the MSIL code.

Validation checks if the MSIL code is consistent. For example, the code should not contain any invalid MSIL instruction. This can happen, for example, as a result of a bug in the front-end compiler. If the validation fails, the runtime throws an exception.

Verification examines the MSIL code and the metadata to ensure that the code accesses only the memory locations it is authorized to access and that the code calls methods only through properly defined types. Such type safety is necessary to ensure that objects do not cause any inadvertent or malicious corruption of memory or other important resources. Using pointers, for example, generates potentially unsafe code.

During the verification process, the code is examined against a well-defined set of type-safe rules. The code is deemed unsafe if it fails any of the rules.

Do not confuse unsafe code with unmanaged code. Only managed code is verified for type safety. If the code is unsafe, it still remains managed code. Unmanaged code, on the other hand, is the code that the runtime has no control over, such as native calls to the platform.

Verification ensures that each type is only asked to execute valid operations. It is not possible for verification to check runtime conditions such as array bounds violation. Such runtime conditions are handled by the runtime, not verification.
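To illustrate the distinction, the following sketch (the class and method names are hypothetical) contains code that is type-safe, so verification has nothing to object to, yet it indexes past the end of an array. The runtime catches the violation when the statement executes and raises IndexOutOfRangeException.

```csharp
using System;

public static class BoundsCheckDemo
{
    // Returns true if reading past the end of the array raised
    // IndexOutOfRangeException at run time.
    public static bool ReadPastEnd()
    {
        int[] values = new int[3];
        int index = values.Length; // one past the last valid index

        try
        {
            // Type-safe code, but the index is out of range.
            Console.WriteLine(values[index]);
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            // The bounds check is performed by the runtime
            // as the code executes, not by verification.
            return true;
        }
    }
}
```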

Verification Is Limited

It is quite possible that your code may fail the verification test even though it is perfectly safe to use. This is due to the limitations of the verification process. Moreover, some languages do not produce verifiably type-safe code, which causes the verification to fail.


Under C#, if you wish to write unsafe code, you need to explicitly scope the code using the unsafe keyword, as shown in the following example:

// Project UnsafeCode

public void GetValueByRef(ref int value) {
    int retValue = 0;
    unsafe {
      byte* p = (byte*) &retValue;
      p[0] = 10;
    }
    value = retValue;
}

The unsafe keyword can also be applied at the method level or the type level. Here are some examples:

// Project UnsafeCode

public class Foo {
    ...

    unsafe public void GetValueByPointer(int* value) {
      *value = 20;
    }
}

public unsafe class Bar {
    private int* m_Count = null;

    ...
}

The Microsoft C# compiler produces verifiably type-safe code by default. You need to explicitly instruct the compiler to accept unsafe code. This is done using the -unsafe switch, as shown in the following command line:

csc.exe -unsafe MyCode.cs
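For reference, here is a self-contained variant of the earlier block-level example that can be compiled as is, provided unsafe code is enabled (the unsafe compiler switch, or the AllowUnsafeBlocks property in an SDK-style project file). The class name UnsafeDemo and the method name WriteFirstByte are hypothetical. Note that the value returned depends on the byte order of the machine, because the pointer addresses the first byte of the integer in memory.

```csharp
using System;

public static class UnsafeDemo
{
    // Writes 10 into the lowest-addressed byte of a local int
    // through a byte pointer, then returns the int.
    public static int WriteFirstByte()
    {
        int result = 0;
        unsafe
        {
            // Taking the address of a local variable is allowed
            // only inside an unsafe context.
            byte* p = (byte*)&result;
            p[0] = 10;
        }
        return result;
    }
}
```

On a little-endian machine the lowest-addressed byte is the least significant one, so WriteFirstByte returns 10. On a big-endian machine the same write would land in the most significant byte.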

Recall that verification is performed on a method that is about to be executed. During verification, if any unsafe code is encountered, the runtime throws an exception of type VerificationException (namespace System.Security). The unsafe code is not executed.

The only way to run unsafe code is to force the runtime to skip verification on the assembly. The assembly has to request a security permission called SkipVerification, and the administrator has to set the security policy on the local machine such that the assembly is granted this permission. We will look at this in Chapter 9 on security.

By default, all the assemblies on the local machine are granted full trust (which includes the SkipVerification permission). Files downloaded over the network (Internet or intranet) have a reduced set of permissions.

Offline Verification

It is also possible to validate and verify your assembly without executing it. The framework provides a tool called PEVerify (peverify.exe) to check if the MSIL code and the metadata in the assembly meet the type-safety requirements. The following command line, for example, checks if Foo.dll is type-safe:

peverify.exe Foo.dll

Check the SDK documentation for the command line switches for this tool.

Note that out of all the metadata validation rules defined in the ECMA specifications, PEVerify checks only the important ones. Perhaps the next version of the .NET Framework will have a more elaborate coverage.

Also note that there is an important behavioral difference between PEVerify and the JIT compiler. The JIT compiler verifies only those methods that are executed, whereas PEVerify verifies all the methods in the assembly. It is recommended that you test your assemblies with PEVerify before you ship them.

JIT Compilation

Once the method to be executed has been validated and verified, the MSIL code for the method has to be converted to the native machine language code; that is, code that is specific to the local machine's CPU architecture. This on-the-fly conversion is referred to as JIT compilation.

JIT compilation makes it possible to develop the code once but run it on different hardware platforms. All that is needed is the availability of a JIT compiler (or the JITter, as it is called) for the target CPU architecture. Of course, if your managed code makes native calls to the specific OS platform, then your code might not be able to run on another platform.

Because the JIT compiler is written for a specific CPU architecture, it can potentially perform hardware-specific optimizations, thus producing efficient native code.

Note that the JIT compiler compiles code only for the method that is about to be executed, not the type's module or the assembly. As a matter of fact, if a type's method is never called during the execution, the method's MSIL code is never compiled, saving both time and memory.

The scheme for activating JIT compilation is very simple and efficient. When a type is loaded by the common language runtime, the loader creates and attaches a stub to each of the type's methods. On the first call to the method, the stub passes control to the JIT compiler. The JIT compiler converts the MSIL code for that method into native code and patches the stub to point directly to the native code. As a result, subsequent calls to the JIT-compiled method proceed directly to the native code.
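The stub-and-patch idea can be modeled in a few lines of ordinary C#. The following is only an analogy, not how the runtime is actually implemented: a delegate field plays the role of the method's entry point, a private method plays the role of the stub, and a lambda stands in for the JIT-compiled native code. All names (LazyMethod, Stub, Compile, CompileCount) are hypothetical.

```csharp
using System;

public sealed class LazyMethod
{
    // Plays the role of the method's entry point.
    private Func<int, int> target;

    // Number of times the simulated compiler has run.
    public int CompileCount { get; private set; }

    public LazyMethod()
    {
        // Initially, every call is routed through the stub.
        target = Stub;
    }

    public int Invoke(int x)
    {
        return target(x);
    }

    // Stands in for the stub: "compiles" the method on the first
    // call, patches the entry point, and forwards the call.
    private int Stub(int x)
    {
        CompileCount++;
        Func<int, int> compiled = Compile();
        target = compiled; // later calls bypass the stub
        return compiled(x);
    }

    // Stands in for the JIT compiler producing native code.
    private static Func<int, int> Compile()
    {
        return value => value * 2;
    }
}
```

With this model, the first call to Invoke goes through Stub and pays the compilation cost. Every subsequent call reaches the compiled delegate directly, so CompileCount stays at 1 no matter how many times the method is invoked.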

In some cases, the JIT compiler may also expand the method inline, that is, within the calling code. This reduces the overhead associated with making a call.
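Whether a method is inlined is the JIT compiler's decision, but the framework lets you influence it through the MethodImpl attribute. The sketch below uses hypothetical method names. MethodImplOptions.NoInlining prevents inlining, while MethodImplOptions.AggressiveInlining, which was added in later versions of the framework (4.5 onward), is only a request that the compiler may decline.

```csharp
using System.Runtime.CompilerServices;

public static class InliningHints
{
    // The JIT compiler will not expand this method inline;
    // every call goes through a real call instruction.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int AddNoInline(int a, int b)
    {
        return a + b;
    }

    // Asks the JIT compiler to inline this method if it can.
    // This is a hint, not a guarantee.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int AddInline(int a, int b)
    {
        return a + b;
    }
}
```

Both methods return the same results. The attributes affect only how the calls are compiled, not what the methods compute.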

Native-Code Assemblies

The JIT compiler generates native code that stays in memory only for the specific process. The generated code is not retained to be used from one run to the next.

It is also possible to compile the MSIL code to native code before executing an assembly, perhaps during assembly installation. This pre-jitting results in reduced application startup time and improved runtime performance, as there is no need to create and attach stubs to each type's methods during run-time or to compile the methods dynamically.

The framework provides a tool called the Native Image Generator (ngen.exe) to compile an assembly. This tool not only creates a native image for the specified assembly but also installs the assembly into a special region of the runtime called the native image cache. The following command line, for example, compiles the assembly ConsoleGreeting.dll into a native image and adds the image to the native image cache:

ngen.exe ConsoleGreeting.dll

Note that even though the native image has been installed in the native image cache, the source assembly has to be present on the machine in the same location from which it was originally compiled. This is because the runtime checks whether the source assembly has been modified with respect to its native image. If so, then the runtime does not use the “stale” native image, but silently falls back to JIT compilation.

The native image is considered stale if the source assembly has been re-created, if the common language runtime has been modified, if the assembly bindings have been changed by way of configuration, and so forth.

The .NET Framework installs a few assemblies as native images. This can be observed from the GAC viewer. Another way to get a list of currently installed native images on a machine is to use the -show switch on ngen.exe, as shown here:

ngen.exe -show

To remove an image from the native image cache, you can use ngen.exe with the -delete switch. You can also use gacutil.exe with the -ungen switch, as shown in the following command line:

gacutil.exe -ungen ConsoleGreeting

Rebasing Executables

Under Windows, each executable stores a value that identifies the starting address where the executable should be loaded in a process's memory. However, to avoid any address overlaps, the OS may have to load the executable at a different location. This relocation results in increased load time for the executable. The relocation penalty can be avoided if each executable that will be loaded within a process is assigned a base address that avoids any load-time address collisions.

Under .NET, base-address collision is not a problem for assemblies that contain IL code. Recall that the runtime “extracts” the MSIL code from the assembly and loads it into the execution engine.

Native-code assemblies, however, are loaded into the process's memory space. Therefore, it is a good idea to properly base such assemblies before compiling them into native code. The C# compiler creates executables with the base address of 0x00400000. However, this default behavior can be customized by using the -baseaddress compiler switch.

Note that native-code images can be twice as large as the corresponding MSIL-code image. The exact factor depends on the content.

Also note that most of the .NET assemblies are loaded into the 0x79000000 to 0x7D000000 range.

Needless to say, if your managed code interacts with any unmanaged code (DLLs and COM servers), these unmanaged executables should also be rebased properly.


Code Execution

After the MSIL instructions have been converted to native machine language instructions, the common language runtime steps aside and lets the native machine code execute. While the code is executing, the common language runtime provides any runtime service that is needed, such as automatic memory management, enhanced security, interoperability with COM components and unmanaged code, and so on.

In this chapter, we look at perhaps the most important service that the runtime provides—automatic memory management. Other services such as security, interoperability, and so forth, are covered in later chapters.
