Chapter 14. MSIL Programming

Microsoft Intermediate Language (MSIL) is the programming language of the Common Language Runtime (CLR) and Microsoft's implementation of the Common Intermediate Language (CIL) for managed code. A managed application undergoes two compilations. The first compilation is from source code to MSIL and is performed by the language compiler. The second compilation occurs at run time, when the MSIL code is compiled to native code. The CLR performs the second compilation as part of process execution. From the perspective of the CLR, managed applications are simply MSIL code and metadata. The original source code language is unimportant to the CLR. For this reason, .NET is considered language-agnostic, or language-independent. One of the goals of the CLR is that a managed application executes identically regardless of the source language, whether C# or Microsoft Visual Basic .NET.

MSIL promotes the concept of compile-once-and-run-anywhere in the .NET environment. Just-in-time (JIT) compilers, otherwise known as jitters, compile assemblies into native binary code that targets a specific platform. You can write an application or component once and then deploy it to Microsoft Windows, Linux, and other environments that support a compliant .NET run time. A benefit of compile-once-and-run-anywhere is the ability to assemble applications from components deployed on disparate hardware and platforms. This was one of the objectives of component technologies such as Component Object Model (COM) and Common Object Request Broker Architecture (CORBA), but the objective was never truly realized. .NET makes this a reality. If platform-agnostic code is a design goal for an application, best practices must be adopted to insulate the managed application from platform-specific code. This includes avoiding or isolating interoperability and calls to native application programming interfaces (APIs).

Your application also can be pre-jitted. Normally, the jitter compiles each method the first time it is used, which distributes the cost of compiling your program across the execution of the application. Pre-jitting provides an alternative to this methodology. You can compile the entire application up front and cache the resulting native binary in the native image cache, a reserved area of the global assembly cache. When pre-jitted, the cost of compiling an application is incurred prior to execution. Whenever the application is run, the cached binary is used instead of incremental compilation. A tool called Ngen is provided with the .NET Framework for pre-jitting .NET applications. Pre-jitting is sometimes preferred, such as with a large application where most of the features are touched in common usage scenarios, or when optimal runtime performance is critical.

MSIL is a full-featured, object-oriented programming (OOP) language. A C# program compiles to MSIL. However, there are differences when compared with C# programming. For example, global functions are allowed in MSIL but not supported in the C# language. Despite being a lower-level language, MSIL has expanded language elements. It encompasses the constituents common to most object-oriented languages: classes, structures, inheritance, transfer-of-control statements, an assortment of arithmetic operators, and much more. Indeed, you can write .NET applications directly in MSIL.
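
To make the point about global functions concrete, here is a minimal sketch of a module-level method in MSIL. The assembly name globalfunc and the method name Greet are hypothetical and chosen only for illustration. Both methods are declared outside any class, which has no direct C# equivalent.

```il
// Sketch: global (module-level) methods, which C# cannot declare.
// The names globalfunc and Greet are illustrative assumptions.
.assembly extern mscorlib {}
.assembly globalfunc {}

// Declared outside any .class, so the method belongs to the module itself.
.method static public void Greet() cil managed {
    .maxstack 1
    ldstr "Hello from a global method!"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
}

// The entry point is also global and calls Greet without a class qualifier.
.method static public void Main() cil managed {
    .entrypoint
    .maxstack 1
    call void Greet()
    ret
}
```

Assuming the file is saved as globalfunc.il, it can be assembled with ilasm in the same way as any other MSIL source file.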

This is a book on C# programming. In that context, why is understanding MSIL important? An understanding of MSIL code advances a deeper appreciation and comprehension of C# programming and .NET. Managed code isn’t just magic. Understanding MSIL removes much of the mystery and helps C# developers better maintain, debug, and write smart, efficient, and robust code.

Applications, particularly production applications purchased from third-party vendors, are sometimes available without the original source code. How is an application maintained or debugged without the source code? A native application would require not only the ability to disassemble the code but also solid knowledge of assembly language, which is a challenge for many developers. For a managed application, as part of the assembly, the MSIL code is usually available. MSIL is much easier to understand and interpret when compared to assembly language. The exception is when the assembly is obfuscated. Several tools, including Intermediate Language Disassembler (ILDASM), can disassemble an assembly to provide access to the MSIL code. With the MSIL code, a developer can essentially read the application. You can even modify the code as MSIL and reassemble the application. This is called roundtripping. Of course, this assumes that the developer is competent in MSIL programming.

In a native application, debugging without a debug file (.pdb) is a challenge (which is an understatement). Debugging a native application without symbol files invariably means interpreting assembly code. As previously mentioned, that is a challenge. More than a general understanding of assembly programming is needed to debug without symbol files. MSIL can be viewed as the assembly code of the CLR. Debugging a managed application without the germane symbol files requires more than a superficial understanding of MSIL. However, when compared to the tedious task of reading assembly, working with MSIL is a leisure cruise.

MSIL is instructive in managed programming. Learning MSIL programming is learning C# programming. What algorithms are truly efficient? When has boxing occurred? Which source code routines expand the footprint of the application? These secrets can be found in understanding MSIL code.
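
As one illustration of the boxing question, the following sketch shows roughly what a C# statement such as Console.WriteLine("Count: {0}", count), where count is an int, typically looks like in MSIL. The assembly name boxing and the local variable name count are assumptions made for this example. The box instruction is the visible evidence that a value type is being converted to an object.

```il
// Sketch: spotting a boxing operation in MSIL.
// The assembly name and the local variable are illustrative assumptions.
.assembly extern mscorlib {}
.assembly boxing {}

.namespace Donis.CSharpBook {
    .class Starter {
        .method static public void Main() cil managed {
            .entrypoint
            .maxstack 2
            .locals init (int32 count)
            ldc.i4.s 42          // push the integer constant 42
            stloc.0              // count = 42
            ldstr "Count: {0}"   // push the format string
            ldloc.0              // push the int32 value
            // WriteLine(string, object) expects an object reference,
            // so the int32 value must be boxed first.
            box [mscorlib]System.Int32
            call void [mscorlib]System.Console::WriteLine(string, object)
            ret
        }
    }
}
```

A string argument in the same position would need no box instruction, because a string is already a reference type.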

The ability to code inline MSIL in a C# application is suggested in numerous blogs on programming, but it is not currently available. I am an advocate of inline MSIL for C#. MSIL is more than an abstraction of higher-level source code. There are unique features in MSIL code that are not exposed in C#. In addition, the C# compiler is a code generator that emits MSIL code. In rare circumstances, compiler-generated MSIL might not be optimal for your specific application. For these reasons, I favor inline MSIL. However, the problem with inline MSIL is maintaining safe code. Inline MSIL is inherently unsafe and could lead to abuse. If a safe implementation of inline MSIL is not possible, it should not be added to the language.

As mentioned, managed applications undergo two compilations. First the language compiler and then the run time (jitter) compiles the application. You can compile MSIL code directly into an assembly with the MSIL compiler, which is the Intermediate Language Assembler (ILASM). Conversely, you can disassemble an assembly with ILDASM.

This chapter is an overview of MSIL programming, not a comprehensive narrative on MSIL. The intention is to convey enough information on the language to aid in the interpretation, maintenance, and debugging of C# applications. For an authoritative explanation of MSIL, I recommend Inside Microsoft .NET IL Assembler, written by Serge Lidin (Microsoft Press, 2002). Serge Lidin is one of the original architects of the ILASM compiler, the ILDASM disassembler, and other tools included in the .NET Framework. Alternatively, consult the European Computer Manufacturers Association (ECMA) documents pertaining to CIL, which are available online at http://www.ecma-international.org/publications/standards/Ecma-335.htm.

"Hello World" Application

An example is a great place to begin the exploration of MSIL code programming. The following is a variation of the universally known "Hello World!" application. It displays "Hello Name!". (Name is a local variable.)

// Hello World Application

.assembly extern mscorlib {}
.assembly hello {}

/* Starter class with entry point method */

.namespace Donis.CSharpBook {
    .class Starter {
        .method static public void Main() cil managed {
            .maxstack 2
            .entrypoint
            .locals init (string name)
            ldstr "Donis"
            stloc.0
            ldstr "Hello, {0}!"
            ldloc name
            call void [mscorlib]System.Console::WriteLine(
                string, object)
            ret
        }
    }
}

Here is the command line that compiles the MSIL code to create a hello executable:

ilasm /exe /debug hello.il

The exe option indicates that the target is an executable, which is also the default; in this example the result is a console application. The dll option specifies a library target. The debug option asks the assembler to generate a debug file (.pdb) for the application. A debug file is useful for a variety of reasons, including viewing source code in a debugger or disassembler.
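
For contrast with the executable target, here is a minimal sketch of MSIL source intended for the dll option. The assembly name greeter, the class Greeter, and the method Greet are hypothetical names introduced for this example. A library has no .entrypoint directive, and the class is marked public so that other assemblies can use it.

```il
// Sketch: a library target, assembled with: ilasm /dll greeter.il
// The names greeter, Greeter, and Greet are illustrative assumptions.
.assembly extern mscorlib {}
.assembly greeter {}

.namespace Donis.CSharpBook {
    // public makes the class visible to other assemblies.
    .class public Greeter {
        // No .entrypoint directive: a library is loaded, not started.
        .method static public string Greet(string name) cil managed {
            .maxstack 2
            ldstr "Hello, {0}!"   // push the format string
            ldarg.0               // push the name argument
            call string [mscorlib]System.String::Format(string, object)
            ret                   // return the formatted string
        }
    }
}
```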

The elements of the application are explained in more detail throughout this chapter. A brief explanation is given here. The application begins with comments and three directives:

// Hello World Application
.assembly extern mscorlib {}
.assembly hello {}
/* Starter class with entry point method */
.namespace Donis.CSharpBook {

MSIL supports C#-style comments, both single-line and multiline. The first directive is an external reference to the Mscorlib library. Mscorlib.dll contains the core of the .NET Framework Class Library (FCL), which includes the System.Console class. The second assembly directive declares the simple name of the assembly, which is hello. Notice that the simple name does not include the extension. The third directive defines a new namespace.

The next two lines define a class and a method within that class. The class directive introduces a class named Starter, which implicitly inherits the System.Object class. Because no visibility attribute is specified, the class is not visible outside the assembly; adding the public keyword would make it public. The method directive, which is the next directive, defines Main as a member method. Main is a managed, public, and static function. The cil keyword indicates that the method contains Intermediate Language (IL) code:

.class Starter {
    .method static public void Main() cil managed {

The Main method begins with three directives. The .maxstack directive sets the size of the evaluation stack to two slots. The .entrypoint directive designates Main as the entry point of the application. By convention, Main is always the entry point of a C# executable. In MSIL, the entry point method is whatever method contains the .entrypoint directive, which could be a method other than Main. Finally, the .locals directive declares a local string variable called name. The init option of the .locals directive initializes a local variable to a default value before the method executes:

.maxstack 2
.entrypoint
.locals init (string name)
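
To illustrate that the entry point need not be named Main, the following sketch places the .entrypoint directive in a method called Run. The assembly name entry and the method name Run are assumptions made for this example.

```il
// Sketch: the entry point is whichever method carries .entrypoint.
// The names entry and Run are illustrative assumptions.
.assembly extern mscorlib {}
.assembly entry {}

.namespace Donis.CSharpBook {
    .class Starter {
        // There is no Main method; execution begins in Run.
        .method static public void Run() cil managed {
            .entrypoint
            .maxstack 1
            ldstr "Started from Run"
            call void [mscorlib]System.Console::WriteLine(string)
            ret
        }
    }
}
```

An entry point method must still be static, whatever its name.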

Table 14-1 explains the MSIL code of the Main method.

Table 14-1. Hello World MSIL code

Instruction | Description
----------- | -----------
ldstr "Donis" | Loads the string "Donis" onto the evaluation stack.
stloc.0 | Stores "Donis" from the evaluation stack into the first local variable, which is called name.
ldstr "Hello, {0}!" | Loads the string "Hello, {0}!" onto the evaluation stack.
ldloc name | Loads the local variable onto the evaluation stack. (Local variables can be referenced by index or by name.)
call void [mscorlib] System.Console::WriteLine(string, object) | Calls the Console::WriteLine method, which consumes the two items on the evaluation stack as parameters from right to left. "Hello, xxx!" is displayed, where xxx is replaced by the topmost item on the evaluation stack.
ret | Returns from the method.
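
The next sketch extends the same ideas to two local variables and two format arguments. The assembly name hello2 and the local variable greeting are hypothetical and used only for illustration. It shows locals being referenced both by index and by name, and arguments being pushed in left-to-right order so that the last argument ends up on top of the evaluation stack.

```il
// Sketch: locals by index and by name, and argument push order.
// The assembly name hello2 and the local greeting are assumptions.
.assembly extern mscorlib {}
.assembly hello2 {}

.namespace Donis.CSharpBook {
    .class Starter {
        .method static public void Main() cil managed {
            .entrypoint
            .maxstack 3                  // format string plus two arguments
            .locals init (string greeting, string name)
            ldstr "Hello"
            stloc.0                      // by index: slot 0 is greeting
            ldstr "Donis"
            stloc name                   // by name: the assembler resolves slot 1
            ldstr "{0}, {1}!"            // first parameter, pushed first
            ldloc greeting               // second parameter, fills {0}
            ldloc.1                      // third parameter, fills {1}; top of stack
            call void [mscorlib]System.Console::WriteLine(
                string, object, object)
            ret
        }
    }
}
```

Because both arguments are strings, which are reference types, no box instruction is needed before the call.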
