In software engineering, the concept of abstraction is extremely
important. We often use abstraction to hide the complexity of system
or application services, providing instead a simple interface to the
consumer. As long as we can keep the interface the same, we can
change the hideous internals, and different consumers can use the
same interface.
As programming languages advanced, computer scientists introduced different incarnations of language-abstraction layers, such as p-code and bytecode. Produced by the Pascal-P compiler, p-code is an intermediate language that supports procedural programming. Generated by Java compilers, bytecode is an intermediate language that supports object-oriented programming. Bytecode is a language abstraction that allows Java code to run on different operating platforms, as long as the platforms have a Java Virtual Machine (JVM) to execute bytecode.
Microsoft calls its own language-abstraction layer the Common Intermediate Language (CIL), or IL for short. Similar to bytecode, IL supports all object-oriented features, including data abstraction, inheritance, polymorphism, and useful concepts such as exceptions and events. In addition to these features, IL supports other concepts, such as properties, fields, and enumerations. Any .NET language may be converted into IL, so .NET supports multiple languages and perhaps multiple platforms in the future (as long as the target platforms have a CLR).
Shipped with the .NET SDK, the MSIL Instruction Set Specification
describes the important IL instructions that language compilers
should use. In addition to this specification, the .NET SDK includes
another important document for IL development, The IL Assembly
Language Programmers’ Reference. Both of these documents are intended
for developers who write compilers and tools, but you should read
them to further understand how IL fits into .NET. While you can
develop a valid .NET assembly using the supported IL instructions and
features, you’ll find IL to be very tedious because the instructions
are a bit cryptic. However, should you decide to write pure IL code,
you could use the IL Assembler (ilasm.exe) to turn your IL code into
a .NET PE file.[9]
Enough with the theory: let’s take a look at some IL.
Here’s an excerpt of IL code for the hello.exe program that we wrote
earlier:[10]
.class private auto ansi MainApp
       extends [mscorlib]System.Object
{
  .method public hidebysig static
          void Main( ) cil managed
  {
    .entrypoint
    .maxstack 8
    ldstr "C# hello world!"
    call void [mscorlib]System.Console::WriteLine(class System.String)
    ret
  } // end of method MainApp::Main

  .method public hidebysig specialname rtspecialname
          instance void .ctor( ) cil managed
  {
    .maxstack 8
    ldarg.0
    call instance void [mscorlib]System.Object::.ctor( )
    ret
  } // end of method MainApp::.ctor
} // end of class MainApp
Ignoring the weird-looking, syntactic details, you can see that IL is conceptually the same as any other object-oriented language. Clearly, there is a class that is called MainApp that derives from System.Object. This class supports a static method called Main( ), which contains the code to dump out a text string to the console. Although we didn’t write a constructor for this class, our C# compiler has added the default constructor for MainApp to support object construction.
Since a lengthy discussion of IL is beyond the scope of this book, let’s just concentrate on the Main( ) method to examine its implementation briefly. First, you see the following method signature:
.method public hidebysig static void Main( ) cil managed
This signature declares a method that is public, meaning that it can
be called by anyone, and static, meaning it’s a class-level
method. The name of this method is Main( ). The cil managed
attributes indicate that Main( ) contains IL code that is to be
managed, or executed, by the CLR. The hidebysig attribute says that
this method hides only the methods with the same name and signature
defined earlier in the class hierarchy, rather than every inherited
method of that name. This is how method hiding works in C#; C++, by
contrast, hides by name alone. Having gone over the method
signature, let’s talk about the method body itself:
{
  .entrypoint
  .maxstack 8
  ldstr "C# hello world!"
  call void [mscorlib]System.Console::WriteLine(class System.String)
  ret
} // end of method MainApp::Main
This method uses two directives: .entrypoint and .maxstack. The
.entrypoint directive specifies that Main( ) is the one and only
entry point for this assembly. The .maxstack directive specifies
the maximum stack slots needed by this method; in this case, the
maximum number of stack slots required by Main( ) is eight. Stack
information is needed for each IL method because IL instructions are
stack-based, allowing language compilers to generate IL code easily.
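To make the idea of stack slots concrete, here is a minimal Python sketch (not IL, and not part of the .NET SDK) that tracks how deep the evaluation stack gets while the body of Main( ) runs. The STACK_EFFECTS table and the max_stack_depth helper are hypothetical names made up for this illustration, and the pop and push counts are assumptions read off the listing above.

```python
# Hypothetical stack effects (values popped, values pushed) for the three
# instructions in Main( ). This table is an illustration, not a CLR API.
STACK_EFFECTS = {
    "ldstr": (0, 1),                   # pushes one string reference
    "call WriteLine(string)": (1, 0),  # pops one argument, returns void
    "ret": (0, 0),                     # void method: nothing to pop
}


def max_stack_depth(instructions):
    """Return the deepest evaluation stack reached by a straight-line sequence."""
    depth = 0
    deepest = 0
    for name in instructions:
        pops, pushes = STACK_EFFECTS[name]
        if pops > depth:
            raise ValueError(f"{name} needs {pops} value(s), stack holds {depth}")
        depth = depth - pops + pushes
        deepest = max(deepest, depth)
    return deepest


main_body = ["ldstr", "call WriteLine(string)", "ret"]
print(max_stack_depth(main_body))  # prints 1
```

Under these assumptions the body never holds more than one value at a time, so the declared limit of eight reads as an upper bound rather than an exact count.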
In addition to these directives, this method uses three IL
instructions. The first IL instruction, ldstr, loads our literal
string onto the stack so that the code in the same block can use it.
The next IL instruction, call, invokes the WriteLine( ) method, which
picks up the string from the stack. The call IL instruction expects
the method’s arguments to be on the stack, with the first argument
being the first object pushed onto the stack, the second argument
being the second object pushed onto the stack, and so forth. In
addition, when you use the call instruction to invoke a method, you
must specify the method’s signature. For example, examine the method
signature of WriteLine( ):
void [mscorlib]System.Console::WriteLine(class System.String)
and you’ll see that WriteLine( ) is a static method of the
Console class. The Console class belongs to the System namespace
and is found in the mscorlib assembly. The WriteLine( ) method takes
a System.String object and returns void. The last thing to note
in this IL snippet is that the ret IL instruction simply returns
control to the caller.
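The three instructions can be mimicked with a toy evaluation stack. The following Python sketch is only an illustration of the stack discipline described above, not how the CLR is implemented: the run function, the METHODS table, and the write_line stand-in are all hypothetical names, and a method table keyed by a bare name is a simplification of the full signature that call requires.

```python
def write_line(text):
    """Stand-in for System.Console::WriteLine(string); returns nothing."""
    print(text)


# Hypothetical method table: name -> (callable, number of arguments).
METHODS = {"WriteLine": (write_line, 1)}


def run(program, methods=METHODS):
    """Execute (opcode, operand) pairs on a fresh evaluation stack."""
    stack = []
    for opcode, operand in program:
        if opcode == "ldstr":
            stack.append(operand)          # push the literal string
        elif opcode == "call":
            func, argc = methods[operand]
            if argc > len(stack):
                raise ValueError(f"{operand} needs {argc} argument(s)")
            # Arguments were pushed first to last, so the top argc values
            # are already in call order.
            split = len(stack) - argc
            args = stack[split:]
            del stack[split:]
            result = func(*args)
            if result is not None:         # a void method pushes nothing
                stack.append(result)
        elif opcode == "ret":
            return stack                   # hand control back to the caller
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack


MAIN = [("ldstr", "C# hello world!"), ("call", "WriteLine"), ("ret", None)]
run(MAIN)  # prints C# hello world!
```

Passing a different method table to run makes it easy to check the argument order: with two ldstr instructions followed by a call to a two-argument method, the string pushed first arrives as the first argument.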
[9] You can test this utility using the IL disassembler to load a .NET PE file and dump out the IL to a text file. Once you’ve done this, use the IL Assembler to convert the text file into a .NET PE file.
[10] Don’t compile this IL code: it’s incomplete because we’ve
removed the obscure details to make it easier to read. If you want
to see the complete IL code, use ildasm.exe on hello.exe.