Process Execution

How and when is MSIL code compiled to native code? MSIL is compiled at run time, and only the managed methods that are actually called are compiled. This just-in-time compilation is part of a larger procedure called process execution. Most of the information in this section comes from the Shared Source CLI, often referred to as Rotor, which is Microsoft's shared-source implementation of the CLI. For more information on Rotor, visit this Web site: http://msdn.microsoft.com/net/sscli. This section explains method compilation and how the entry point method is identified and executed. The specific details may change over time; what matters are the concepts.

Process execution begins when a managed application is launched. At that time, the CLR is bootstrapped into the application. The CLR is bootstrapped from the mscoree.dll library. This library starts the process of loading the CLR into the memory of the application. _CorExeMain is the starting point in mscoree.dll. Every managed application includes a reference to mscoree.dll and _CorExeMain. You can confirm this with the dumpbin.exe tool. Execute the following command at a command prompt on any managed application for confirmation:

dumpbin /imports application.exe

Figure 14-2 shows the result of the dumpbin command.

Figure 14-2. The dumpbin command with the imports option

Managed applications have an embedded stub. The stub fools the Windows environment into loading a managed application by temporarily masking it as a native Windows application. The stub calls _CorExeMain in mscoree.dll. _CorExeMain then delegates to _CorExeMain2 in mscorwks.

_CorExeMain2 eventually calls SystemDomain::ExecuteMainMethod. As the name implies, ExecuteMainMethod is responsible for locating and executing the entry point method. The entry point method is a member of a class. The first step in executing the entry point method is locating that class.

During process execution, classes are represented by EEClass structures internally. ExecuteMainMethod calls ClassLoader::LoadTypeHandleFromToken to obtain an instance of EEClass for the class that contains the entry point method. LoadTypeHandleFromToken is provided the metadata token for the class and returns an instance of an EEClass as an out parameter. In a managed application, only classes that are touched have a representative EEClass structure. The important components of EEClass are a pointer to the parent class, a list of fields, and a pointer to a method table.
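EEClass and the other internal structures are not visible from managed code, but the metadata that the lookup starts from is. As a rough sketch, reflection can report the entry point recorded in an assembly, the class that declares it, and the metadata tokens of both. The names EntryPointInfo and Describe are hypothetical, chosen only for this illustration.

```csharp
using System;
using System.Reflection;

public static class EntryPointInfo
{
    // Returns a one-line description of the entry point recorded in the
    // assembly's metadata, or "no entry point" for a library (DLL).
    public static string Describe(Assembly assembly)
    {
        MethodInfo entry = assembly.EntryPoint;   // null when there is no entry point
        if (entry == null)
        {
            return "no entry point";
        }

        Type owner = entry.DeclaringType;         // the class that contains the method
        if (owner == null)
        {
            return string.Format("{0} (method token 0x{1:X8})",
                entry.Name, entry.MetadataToken);
        }

        return string.Format("{0}.{1} (type token 0x{2:X8}, method token 0x{3:X8})",
            owner.FullName, entry.Name, owner.MetadataToken, entry.MetadataToken);
    }
}
```

Calling EntryPointInfo.Describe(Assembly.GetEntryAssembly()) from inside a managed executable prints the class and method that the runtime located at startup. GetEntryAssembly can return null when the process is hosted by unmanaged code, so a real program would check for that first.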

The method table contains an entry for each method in the class. The entries are called method descriptors. Each method descriptor is subdivided into parts. The first part is m_CodeOrIL. Before a method is jitted, m_CodeOrIL contains the relative virtual address (RVA) of the method's MSIL code. The second part is a stub containing a thunk to the JIT compiler. The first time the method is called, the jitter is invoked through the stub. The jitter uses the IL RVA to locate the implementation of the method and compile it to native code, which is cached in memory. In addition, the stub and m_CodeOrIL parts are updated to reference the virtual address of the native code. This optimization prevents the same method from being jitted again; future calls simply invoke the cached native code.
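The patch-on-first-call idea can be modeled in a few lines of C#. This is only a toy: MethodDescSketch, its fields, and CompileCount are hypothetical names, and a delegate slot stands in for the stub and m_CodeOrIL. It is not how the CLR lays out a real method descriptor.

```csharp
using System;

// Toy model of a method descriptor. NOT the CLR's real data structure.
public sealed class MethodDescSketch
{
    private readonly Func<int, int> _body;  // stands in for the method's MSIL
    private Func<int, int> _entry;          // stands in for the stub / m_CodeOrIL slot

    // Number of times the pretend JIT compiler has run for this method.
    public int CompileCount { get; private set; }

    public MethodDescSketch(Func<int, int> body)
    {
        _body = body;
        _entry = JitThunk;   // before the first call, the slot points at the thunk
    }

    // Callers always go through the slot, never to the body directly.
    public int Invoke(int argument)
    {
        return _entry(argument);
    }

    // The first call lands here: "compile", patch the slot, then run the result.
    private int JitThunk(int argument)
    {
        CompileCount++;      // pretend this is the expensive compilation step
        _entry = _body;      // back-patch so later calls bypass the thunk
        return _entry(argument);
    }
}
```

For example, after var square = new MethodDescSketch(x => x * x); two calls to square.Invoke leave CompileCount at 1, because only the first call passes through the thunk.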

Roundtripping

Roundtripping refers to disassembling an application, modifying the resulting code, and then reassembling the application. This provides a mechanism for maintaining or otherwise updating an application without the original source code.

The following C# application simply totals two numbers. The program is called "Add," and the results are displayed in the console window:

using System;

namespace Donis.CSharpBook {
    public class Starter {
        public static void Main(string[] args) {
            if (args.Length < 2) {
                Console.WriteLine("Not enough parameters.");
                Console.WriteLine("Program exiting...");
                return;
            }
            try {
                byte value1 = byte.Parse(args[0]);
                byte value2 = byte.Parse(args[1]);
                byte total = (byte) (value1 + value2);
                Console.WriteLine("{0} + {1} = {2}",
                    value1, value2, total);
            }
            catch(Exception e) {
                Console.WriteLine(e.Message);
            }
        }
    }
}

Let’s assume that the above program was purchased from a third-party vendor for a quintillion dollars. Of course, the source code was not included with the application—even for a quintillion dollars. Almost immediately, the purchaser discovers a bug in the application, such that when the program is executed, the total is sometimes incorrect. Look at the following example:

C:\>add 200 150

200 + 150 = 94

The result of this equation should equal 350, not 94. How can the purchaser fix this problem without the source code? Roundtripping is the answer. This begins by disassembling the application. For convenience, the ILDASM disassembler is used:

ildasm /out=newadd.il add.exe

Open newadd.il in a text editor. In examining the MSIL code, we can find the culprit easily. The add instruction adds two byte variables, and the conv instruction that follows narrows the result so that it can be stored in another byte variable. This is unsafe: when the sum exceeds 255, the largest value a byte can hold, the total overflows. Instead of notifying the application of the overflow, the conversion discards the high-order bits and an incorrect value is stored (350 becomes 350 - 256 = 94). This is the reason for the errant results. Add the ovf suffix to the conv instruction, changing conv.u1 to conv.ovf.u1, to correct the problem. An exception is now raised when the overflow occurs.
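The same difference can be reproduced in C# source, which is a convenient way to confirm the diagnosis before editing any MSIL. The sketch below uses the unchecked and checked keywords; in a checked context the compiler emits overflow-checking instructions such as conv.ovf.u1. ByteAdder and its method names are hypothetical and are not part of the Add program.

```csharp
using System;

public static class ByteAdder
{
    // Mirrors the original program: add, then a plain narrowing conversion
    // that silently keeps only the low 8 bits of the sum.
    public static byte AddWrapping(byte value1, byte value2)
    {
        return unchecked((byte)(value1 + value2));
    }

    // Mirrors the patched program: the checked conversion throws
    // OverflowException when the sum does not fit in a byte.
    public static byte AddChecked(byte value1, byte value2)
    {
        return checked((byte)(value1 + value2));
    }
}
```

ByteAdder.AddWrapping(200, 150) returns 94, matching the faulty output, while ByteAdder.AddChecked(200, 150) throws an OverflowException instead of returning a wrong total.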

You can use roundtripping to add features that are not otherwise available in C#. For example, the version of C# covered here supports general exception handling but doesn't support exception filters directly. You can implement an exception filter directly in MSIL. In MSIL, when an exception is raised, the exception filter determines whether the exception handler executes. If the exception filter evaluates to one, the handler runs. If the result is zero, the handler is skipped.
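Later versions of the language (C# 6.0 and up) expose the same mechanism through the when clause, which makes the filter semantics easy to try without hand-editing MSIL. The following is a minimal sketch; FilterDemo and Run are hypothetical names, and the Boolean parameter plays the role of the filter's one-or-zero result.

```csharp
using System;

public static class FilterDemo
{
    // Returns "handled" when the filter lets the handler run,
    // and "skipped" when the filter rejects the exception.
    public static string Run(bool filterResult)
    {
        try
        {
            try
            {
                throw new OverflowException("total does not fit in a byte");
            }
            catch (OverflowException) when (filterResult)   // the exception filter
            {
                return "handled";
            }
        }
        catch (OverflowException)
        {
            // Reached only when the inner filter evaluated to false.
            return "skipped";
        }
    }
}
```

FilterDemo.Run(true) returns "handled" and FilterDemo.Run(false) returns "skipped", because a rejected exception keeps propagating to the next enclosing handler.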

Here is a partial listing of the disassembled program. It is modified to throw an exception when the addition overflows the total. An exception filter also has been added to the exception handling. The changes are the conv.ovf.u1 instruction at IL_003f and the filter block, which replaces the commented-out catch clause:

.try
{
    IL_0029: nop
    IL_002a: ldarg.0
    IL_002b: ldc.i4.0
    IL_0035: ldelem.ref
    IL_0036: call uint8 [mscorlib]System.Byte::Parse(string)
    IL_003b: stloc.1
    IL_003c: ldloc.0
    IL_003d: ldloc.1
    IL_003e: add
    IL_003f: conv.ovf.u1
    IL_0040: stloc.2
    IL_0041: ldstr "{0} + {1} = {2}"
    IL_0046: ldloc.0
    IL_0047: box [mscorlib]System.Byte
    IL_004c: ldloc.1
    IL_004d: box [mscorlib]System.Byte
    IL_0052: ldloc.2
    IL_0053: box [mscorlib]System.Byte
    IL_0058: call void [mscorlib]System.Console::WriteLine(string,
        object,
        object,
        object)
    IL_005d: nop
    IL_005e: nop
    IL_005f: leave.s IL_0072
} // end .try
filter
{
    pop
    ldc.i4.1
    endfilter
}
// catch [mscorlib]System.Exception
{
    IL_0061: stloc.3
    IL_0062: nop
    IL_0063: ldloc.3

Because the filter always returns one, every exception is handled. Use the ILASM assembler to reassemble the application, which completes the round trip. Here is the command:

ilasm newadd.il

Run and test the newadd application. The changed program is more robust. Roundtripping has succeeded!
