Chapter 13

image

IL Instructions

When a method is executed, three categories of memory local to the method plus one category of external memory are involved. All these categories represent typed data slots, not simply an address interval as is the case in the unmanaged world. The external memory manipulated from the method is the community of the fields the method accesses (except the fields of value types belonging to the local categories). The local memory categories include an argument table, a local variable table, and an evaluation stack. Figure 13-1 describes data transitions between these categories. As you can see, all IL instructions resulting in data transfer have the evaluation stack as a source or a destination, or both.

9781430267614_Fig13-01.jpg

Figure 13-1. Method memory categories

The number of slots in the argument table is inferred from the method signature at the call site (not from the method signature specified when the method is defined—remember vararg methods). The number of slots in the local variable table is inferred from the local variable signature whose token is specified in the method header. The number of slots in the evaluation stack is defined by the MaxStack value of the method header, specified in ILAsm by the .maxstack directive (default evaluation stack depth is 8, as you remember).

The slots of the argument and local variable tables have static types, which can be any of the types defined in the .NET Framework and the application. The slots of the evaluation stack hold different types at different times during the course of the method execution. The types of stack slots change as the computations progress, and the same stack slots are used for different values. The execution engine of the common language runtime implements a coarser type system for the evaluation stack: the only types a stack slot can have at a given moment are int32, native int, int64, Float (the current implementation uses 64-bit floating-point representation, which covers both float32 and float64 types), & (a managed pointer), ObjectRef (an object reference, which is an instance pointer to an object), or an instance of a value type.

The IL instruction sequences that make up the IL code of a method can be valid or verifiable, or both, or neither. The concept of validity is easy to grasp: invalid instruction sequences are rejected by the JIT compiler, so nothing really bad can happen if you emit an invalid sequence—except that your code won’t run.

Verifiability of the code is a security issue, not a compilation issue. The verifiable code is guaranteed to access only the memory it is allowed to access and hence is not capable of any malice or hidden hacks, so you can download a verifiable component from a remote location and run it without fear. If the code is deemed unverifiable—that is, if the code contains segments that just might contain a hack—the runtime security system will not allow it to be run except from a local disk. (I’ll discuss the verifiability of IL code in the “Code Verifiability” section.) Generally, it’s a good idea to check your executables with the PEVerify utility, distributed with the Microsoft .NET Framework SDK. This utility provides metadata validation and IL code verification, which includes checking both aspects—code validity and verifiability.

IL instructions consist of an operation code (opcode), which for some instructions is followed by an instruction parameter. Opcodes are either 1 byte or 2 bytes long; in the latter case, the first byte of the opcode is always 0xFE. In later sections of this chapter, opcodes are specified in parentheses following the instruction specification. Some instructions have synonyms, which I’ve also listed in parentheses immediately after the principal instruction name.

Long-Parameter and Short-Parameter Instructions

Many instructions that take an integer or an unsigned integer as a parameter have two forms. The long-parameter form requires a 4-byte integer, and the short-parameter form, recognized by the suffix .s, requires a 1-byte integer. Short-parameter instructions are used when the value of the parameter is in the range –128 through 127 for signed parameters and in the range 0 through 255 for unsigned parameters. The long-parameter form of an instruction can also be used for parameters within these ranges, but it leads to unnecessary bloating of the IL code.

Instructions that take a metadata token as a parameter don’t have short forms: metadata tokens are always used in the IL stream in uncompressed and uncoded form, as 4-byte unsigned integers.

The byte order of the integers embedded in the IL stream must be little endian—that is, the least significant byte comes first.

Labels and Flow Control Instructions

Flow control instructions include branching instructions, the switch instruction, exiting and ending instructions used with managed EH blocks, and a return instruction.

All these instructions, except the return instruction, use integer offsets (in bytes) from the current position within the method IL code to specify the target instruction. The “current position” in this case is the offset of the beginning of the next instruction—the one following the flow control instruction. The target offset (which is the sum of the current position and the offset specified in the branching instruction) must point at the beginning of some instruction of this method. In other words, the target offset cannot be less than zero, cannot be larger than the method’s code size, and cannot point at the middle of an instruction. If the target instruction is prefixed (the prefix instructions are discussed later in this chapter), the target offset cannot point at the prefixed instruction directly and must point at the prefix instruction. From the flow control point of view, a combination of a prefix instruction and a prefixed instruction is a single instruction.

Unconditional Branching Instructions

Unconditional branching instructions take no arguments from the stack and have a signed integer parameter. The parameter specifies the offset in bytes from the current position (which is the beginning of the next instruction—the one following the branching instruction) within the IL stream. The ILAsm notation does allow you to specify the offset explicitly (for example, br -234), but this practice is not recommended for an obvious reason: it’s difficult to calculate the offset correctly when you’re writing in a programming language.

It is much safer and less troublesome to use labels instead, letting the ILAsm compiler calculate the correct offsets. Labels, which you’ve already encountered many times, are simple names followed by a colon.

...
Loop:
...
br Loop
...

By default, the IL assembler does not automatically choose between long-parameter and short-parameter forms. Thus, if you specify a short-parameter instruction and put the target label farther away than the short parameter permits, the calculated offset is truncated to 1 byte, and the IL assembler issues an error message. Versions 2.0 and later of the IL assembler feature the command-line option /OPT, which turns on the automatic replacement of long-parameter instructions by the short-parameter instructions whenever the parameter size permits.

Unconditional branching instructions take nothing from the evaluation stack and put nothing on it.

  • br <int32> (0x38): Branch <int32> bytes from the current position.
  • br.s <int8> (0x2B): The short-parameter form of br.

Conditional Branching Instructions

Conditional branching instructions differ from the unconditional instructions in one aspect only: they branch only if the condition (<value>, which they take from the evaluation stack) is true (nonzero) or false (zero).

  • brfalse (brnull, brzero) <int32> (0x39): Branch if <value> is 0.
  • brfalse.s (brnull.s, brzero.s) <int8> (0x2C): The short-parameter form of brfalse.
  • brtrue (brinst) <int32> (0x3A): Branch if <value> is nonzero.
  • brtrue.s (brinst.s) <int8> (0x2D): The short-parameter form of brtrue.

Comparative Branching Instructions

Comparative branching instructions take two values (<value1>, <value2>) from the evaluation stack and compare them according to the <condition> specified by the opcode. Not all combinations of types of <value1> and <value2> are valid; Table 13-1 lists the valid combinations.

Table 13-1. Valid Type Combinations in Comparison Instructions

Type of <value i >

Can Be Compared with Type

int32

int32, native int.

int64

int64.

native int

int32, native int, & (equality or nonequality comparisons only).

Float

Float. Without exception, all floating-point comparisons are formulated as <condition> or unordered. Unordered is true when at least one of the operands is NaN (“not a number,” undefined floating point value discussed below in Arithmetic Instructions section).

&(managed pointer)

native int (equality or nonequality comparisons only), &. Unless the compared values are pointers to the same array or value type or pointers to pinned variables, comparing two managed pointers should be limited to equality or nonequality comparisons because the garbage collection subsystem might move the managed pointers in an unpredictable way at unpredictable moments.

ObjectRef

ObjectRef (equality or nonequality comparisons only). “Greater-than” unsigned comparison is also admissible and is used to compare an object reference to null because objects are subject to garbage collection, and their references can be changed by the GC at will.

  • beq <int32> (0x3B): Branch if <value1> is equal to <value2>.
  • beq.s <int8> (0x2E): The short-parameter form of beq.
  • bne.un <int32> (0x40): Branch if the two values are not equal. Integer values are interpreted as unsigned; floating-point values are compared unordered.
  • bne.un.s <int8> (0x33): The short-parameter form of bne.un.
  • bge <int32> (0x3C): Branch if <value1> is greater or equal to <value2>.
  • bge.s <int8> (0x2F): The short-parameter form of bge.
  • bge.un <int32> (0x41): Branch if greater or equal. Integer values are interpreted as unsigned; floating-point values are compared unordered.
  • bge.un.s <int8> (0x34): The short-parameter form of bge.un.
  • bgt <int32> (0x3D): Branch if greater.
  • bgt.s <int8> (0x30): The short-parameter form of bgt.
  • bgt.un <int32> (0x42): Branch if greater. Integer values are interpreted as unsigned; floating-point values are compared unordered.
  • bgt.un.s <int8> (0x35): The short-parameter form of bgt.un.
  • ble <int32>(0x3E): Branch if less or equal.
  • ble.s <int8> (0x31): The short-parameter form of ble.
  • ble.un <int32> (0x43): Branch if less or equal. Integer values are interpreted as unsigned; floating-point values are compared unordered.
  • ble.un.s <int8> (0x36): The short-parameter form of ble.un.
  • blt <int32> (0x3F): Branch if less.
  • blt.s <int8> (0x32): The short-parameter form of blt.
  • blt.un <int32> (0x44): Branch if less. Integer values are interpreted as unsigned; floating-point values are compared unordered.
  • blt.un.s <int8> (0x37): The short-parameter form of blt.un.

The switch Instruction

The switch instruction implements a jump table. This instruction is unique in the sense that it has not one, not two, but N+1 parameters following it, where N is the number of cases in the switch. The first parameter is a 4-byte unsigned integer specifying the number of cases, and the following N parameters are 4-byte signed integers specifying offsets to the targets (cases). There is no short-parameter form of this instruction. The ILAsm notation is as follows:

switch(Label1,Label2,...,LabelN)
...                // Default case
Label1:
...
Label2:
...
...
LabelN:
...

As in the case of branching instructions, the ILAsm syntax allows you to replace the labels in a switch(...) instruction with explicit offsets, but I definitely do not recommend this.

The instruction takes one value from the stack and converts it to an unsigned integer. It then switches to the target according to the value of this unsigned integer. A 0 value corresponds to the first target offset on the list. If the value is greater than or equal to the number of targets, the switch instruction is ignored, and control is passed to the instruction immediately following it. In this sense, the default case in ILAsm is always the first (lexically) case of the switch.

  • switch <unsigned int32> <int32>...<int32> (0x45): Branch to one of the <int32> offsets; number of the offsets is specified in leading <unsigned int32>.

The break Instruction

This break instruction isnot equivalent to the break statement in C, which is used as an exit from the switch cases or loops. The break instruction in IL inserts a breakpoint into the IL stream and is used to indicate that if a debugger is attached, execution will stop and control will be given to the debugger. If a debugger is not present, the instruction does nothing. This instruction does not have parameters and does not touch the evaluation stack.

  • break (0x01): Debugging breakpoint.

Managed EH Block Exiting Instructions

The blocks of code involved in managed exception handling cannot be entered or exited by simple branching because of the strict stack state requirements imposed on them. The leave instruction, or its short-parameter form, is used to exit a guarded block (a try block) or an exception handler block. You cannot use this instruction, however, to exit a filter, finally, or fault block. (For more details about these blocks, see Chapter 14.)

The instruction has one integer parameter specifying the offset of the target and works the same way as an unconditional branching instruction except that it empties the evaluation stack before the branching. The ILAsm notation for this instruction is similar to the notation for unconditional branching instructions: leave <label> or leave <int32>; the latter one is highly unrecommended.

  • leave <int32> (0xDD): Clear the stack, and branch <int32> bytes from the current point.
  • leave.s <int8> (0xDE): The short-parameter form of leave.

EH Block Ending Instructions

IL has two specific instructions to mark the end of filter, finally, and fault blocks. Unlike leave, these instructions mark the lexical end of a block (the instruction that has the highest offset in the block) rather than an algorithmic end or point of exit (which may just as well be located in the middle of the block). These instructions have no parameters.

  • endfilter (0xFE 0x11):The lexical end of a filter block. The instruction takes one 4-byte integer value from the evaluation stack and signals the execution engine whether the associated exception handler should be engaged (a value of 1) or whether the search for the handler for this particular exception should be continued (a value other than 1) because this filter doesn’t know what to do with this particular exception.
  • endfinally (endfault) (0xDC): The lexical end of a finally or fault block. This instruction clears the evaluation stack.

The ret Instruction

The return instruction—ret—returns from a called method to the call site (immediately after the call site, to be precise). It has no parameters. If the called method should return a value of a certain type, exactly one value of the required type must be on the evaluation stack at the moment of return. The ret instruction causes this value to be removed from the evaluation stack of the called method and put on the evaluation stack of the calling method. If the called method returns void, its evaluation stack must be empty at the moment of return.

  • ret (0x2A): Return from a method.

Arithmetical Instructions

Arithmetical operations deal with numeric data processing and include stack manipulation instructions, constant loading instructions, indirect (by pointer) loading and storing instructions, arithmetical operations, bitwise operations, data conversion operations, logical condition check operations, and block operations.

Stack Manipulation

Stack manipulation instructions work with the evaluation stack and have no parameters.

  • nop (0x00): No operation; a placeholder only. The nop instruction is not exactly a stack manipulation instruction, since it does not touch the stack, but I’ve included it here rather than creating a separate category for it. The nop instruction is somewhat useful only in that, as it is a distinct opcode, a line of source code can be bound to it in the PDB file containing the debug information. The Visual Basic compiler introduces a lot of nop instructions because it wants to bind each and every line of the source code to the IL code. The reasoning behind this is not clear; perhaps the Visual Basic programmers want to be able to put breakpoints on comment lines.

    Another useful application of the nop instructions is specific to version 2.0 or later of the common language runtime (or, more exactly, its JIT compiler). Compilers, emitting the debug information into the PDB files, specify so-called code points, which bind source code lines and columns to offsets in the method code. In versions 1.0 and 1.1, the JIT compiler provided the ability to set the breakpoints at any code point specified in the PDB file. In version 2.0, an “optimized” mode was introduced, in which the JIT compiler effectively ignores the code points specified in the PDB and allows setting the breakpoints only according to some “heuristics.” These heuristics include, for example, the moments when the evaluation stack is empty or when nop is encountered. Empty evaluation stack heuristics work for most imperative high-level languages because, as a rule, each completed statement in these languages translates into code that begins and ends with the evaluation stack empty. This does not work, of course, for ILAsm, so in “optimized” mode you cannot set a breakpoint on an arbitrary instruction, and you cannot “walk” the ILAsm code instruction by instruction under a debugger. Inserting nop instructions after each meaningful instruction would probably help, but it is not a feasible option. Fortunately, the traditional mode, which allows setting the breakpoints according to PDB code points, is also supported in versions 2.0+. But I had to fight for preserving it and explain at lengths why it is important. You can find more details of debug modes in Chapter 16.

  • dup (0x25): Duplicate the value on the top of the stack. If the stack is empty, the JIT compiler fails because of the stack underflow.
  • pop (0x26): Remove the value from the top of the stack. The value is lost. If the stack is empty, the JIT compiler fails. It’s not healthy to invoke dup or pop on an empty stack.

Constant Loading

Constant loading instructions take at most one parameter (the constant to load) and load it on the evaluation stack. The ILAsm syntax requires explicit specification of the constants (in other words, you cannot use a variable or argument name), in decimal or hexadecimal form:

ldc.i4–1
ldc.i40xFFFFFFFF

Some instructions have no parameters because the value to be loaded is specified by the opcode itself.

Note that for integer and floating-point values, the slots of the evaluation stack are either 4- or 8-bytes wide, so the constants being loaded are converted to the suitable size.

  • ldc.i4 <int32> (0x20): Load <int32> on the stack.
  • ldc.i4.s <int8> (0x1F): Load <int8> on the stack.
  • ldc.i4.m1 (ldc.i4.M1) (0x15): Load –1 on the stack.
  • ldc.i4.0 (0x16): Load 0.
  • ldc.i4.1 (0x17): Load 1.
  • ldc.i4.2 (0x18): Load 2.
  • ldc.i4.3 (0x19): Load 3.
  • ldc.i4.4 (0x1A): Load 4.
  • ldc.i4.5 (0x1B): Load 5.
  • ldc.i4.6 (0x1C): Load 6.
  • ldc.i4.7 (0x1D): Load 7.
  • ldc.i4.8 (0x1E): Load 8. (I should have listed these in reverse order so then we could imagine ourselves on Cape Canaveral.)
  • ldc.i8 <int64> (0x21): Load <int64> on the stack.
  • ldc.r4 <float32> (0x22): Load <float32> (single-precision) on the stack.
  • ldc.r8 <float64> (0x23): Load <float64> (double-precision) on the stack. ILAsm permits the use of integer parameters or byte arrays in both the ldc.r4 and ldc.r8 instructions; this is useful when you need to load special floating-point codes denoting positive or negative infinity or NaN. In such cases, the integers are either converted to floating-point numbers or interpreted as binary images of the floating-point numbers; the byte arrays are always interpreted as such binary images:
    • ldc.r4 1078530000 loads floating-point number 1078530000.0.
    • ldc.r4 float32(1078530000) loads 3.1415901184082.
    • ldc.r4 (D0 0F 49 40) loads 3.1415901184082.

Indirect Loading

An indirect loading instruction takes a managed pointer (&) or an unmanaged pointer (native int) from the stack, retrieves the value at this pointer, and puts the value on the stack. The type of value to be retrieved is defined by the opcode. The indirect loading instructions have no parameters.

  • ldind.i1 (0x46): Load a signed 1-byte integer from the location specified by the pointer taken from the stack.
  • ldind.u1 (0x47): Load an unsigned 1-byte integer.
  • ldind.i2 (0x48): Load a signed 2-byte integer.
  • ldind.u2 (0x49): Load an unsigned 2-byte integer.
  • ldind.i4 (0x4A): Load a signed 4-byte integer.
  • ldind.u4 (0x4B): Load an unsigned 4-byte integer.
  • ldind.i8 (ldind.u8) (0x4C): Load an 8-byte integer, signed or unsigned.
  • ldind.i (0x4D): Load native int, an integer the size of a pointer.
  • ldind.r4 (0x4E): Load a single-precision floating-point value.
  • ldind.r8 (0x4F): Load a double-precision floating-point value.
  • ldind.ref (0x50): Load an object reference.

Indirect Storing

Indirect storing instructions take a value and an address, in that order, from the stack and store the value at the location specified by the address. Since you are copying the memory (stack slot to specified location) without the need to interpret it, all you care about really is the size of the value to be stored. That’s why the indirect storing instructions for integers don’t have “unsigned” modifications. The address can be a managed or an unmanaged pointer. The type of the value to be stored is specified in the opcode. These instructions have no parameters.

  • stind.ref (0x51): Store an object reference.
  • stind.i1 (0x52): Store a 1-byte integer.
  • stind.i2 (0x53): Store a 2-byte integer.
  • stind.i4 (0x54): Store a 4-byte integer.
  • stind.i8 (0x55): Store an 8-byte integer.
  • stind.i (0xDF): Store a pointer-size integer.
  • stind.r4 (0x56): Store a single-precision floating-point value.
  • stind.r8 (0x57): Store a double-precision floating-point value.

Arithmetical Operations

All instructions performing the arithmetical operations except the negation operation take two operands from the stack and put the result on the stack. If the result value does not fit the result type, the value is truncated. Table 13-2 lists the admissible type combinations of operands and their corresponding result types. If the type combination is not admissible, the result is undefined and will most probably cause an exception. Note also that this table lists generally admissible type combinations of the operands; particular arithmetic instructions may have their own limitations.

Table 13-2. Admissible Operand Types and Their Result Types in Arithmetical Operations

Operand Type

Operand Type

Result Type

int32

int32

int32

native int

native int

native int

native int

int32

native int

int64

int64

int64

int64

native int

int64

int64

int32

int64

&

int32, native int (addition or subtraction only, unverifiable)

&

Float

Float (except unsigned division)

Float

Float

int32 (except unsigned division)

Float

Float

native int (except unsigned division)

Float

Float

int64 (except unsigned division)

Float

&

& (subtraction only, unverifiable)

native int

The arithmetical operation instructions are as follows:

  • add (0x58): Addition.
  • sub (0x59): Subtraction.
  • mul (0x5A): Multiplication. For floating-point numbers, which have the special values infinity and NaN, the following rule applies:
    0 * infinity = NaN, x * NaN = NaN, x * infinity = infinity
  • div (0x5B): Division. For integers, division by 0 results in a DivideByZero exception. For floating-point numbers, the following rule applies:
    0 / 0 = NaN, infinity / infinity = NaN, x / infinity = 0
  • div.un (0x5C): Division of operands treated as unsigned (integer operands only).
  • rem (0x5D): Remainder, modulo. For integers, modulo 0 results in a DivideByZero exception. For floating-point numbers, the following rule applies:
    infinity rem x = NaN, x rem 0 = NaN, x rem infinity = x, x rem NaN = NaN
  • rem.un (0x5E): The remainder of unsigned operands (integer operands only).
  • neg (0x65): Negate—that is, invert the sign. This is the only unary arithmetical operation. It takes one operand rather than two from the evaluation stack and puts the result back. This operation is not applicable to managed pointers or object references but is applicable to unmanaged pointers. With integers, a peculiar situation can occur in which the maximum negative number does not change after negation because of the overflow condition during the operation, as shown in this example:
    ldc.i40x80000000// Max. negative number for int32,
                      //-2147483648
    neg
    call void[mscorlib]System.Console::WriteLine(int32)
    // Output: -2147483648;
    // The same effect with subtraction:
    ldc.i4.0
    ldc.i40x80000000
    sub
    call void[mscorlib]System.Console::WriteLine(int32)
    // Output: -2147483648;

    The previous problem is not in any way IL specific; it stems from the binary representation of integer numbers. Floating-point numbers don’t have this problem. Negating NaN returns NaN because NaN, which is not a number, has no sign.

If an arithmetical operation is applied to integer operands and the result overflows the target, the result is bit truncated to fit the target type, with most significant bits thrown away:

ldc.i40xFFFFFFF0// 4294967280
ldc.i40x000000FF// 255
add
call void[mscorlib]System.Console::WriteLine(int32)
// Output: 239 (0xEF);

When int32 and native int are used as operands of arithmetic instructions on a 64-bit platform, int32 operand is sign extended to native int.

Overflow Arithmetical Operations

Overflow arithmetical operations are similar to the arithmetical operations described in the preceding section except that they work with integer operands only and generate an Overflow exception if the result does not fit the target type. The ILAsm notation for the overflow arithmetical operations contains the suffix.ovf following the operation kind. The type compatibility list, shown in Table 13-3, is similar to the list shown in Table 13-2.

Table 13-3. Acceptable Operand Types and Their Result Types in Overflow Arithmetical Operations

Operand Type

Operand Type

Result Type

int32

int32

int32

native int

native int

native int

int32

native int

native int

int64

int64

int64

int64

native int

int64

int64

int32

int64

&

int32, native int (unsigned addition or subtraction only, unverifiable)

&

&

& (unsigned subtraction, unverifiable)

native int

The overflow operation instructions are as follows:

  • add.ovf (0xD6): Addition.
  • add.ovf.un (0xD7): Addition of unsigned operands.
  • sub.ovf (0xDA): Subtraction.
  • sub.ovf.un (0xDB): Subtraction of unsigned operands.
  • mul.ovf (0xD8): Multiplication.
  • mul.ovf.un (0xD9): Multiplication of unsigned operands.

Bitwise Operations

Bitwise operations have no parameters and are defined for integer types only; floating-point, pointer, and object reference operands are not allowed. As a result, the related operand type compatibility list, shown in Table 13-4, is pretty simple.

Table 13-4. Acceptable Operand Types and Their Result Types in Bitwise Operations

Operand Type

Operand Type

Result Type

int32

int32

int32

int32

native int

native int

native int

native int

native int

int64

int32

int64

int64

native int

int64

int64

int64

int64

Three of the bitwise operations are binary, taking two operands from the stack and placing one result on the stack; and one is unary, taking one operand from the stack and placing one result on the stack:

  • and (0x5F): Bitwise AND (binary).
  • or (0x60): Bitwise OR (binary).
  • xor (0x61): Bitwise exclusive OR (binary).
  • not (0x66): Bitwise inversion (unary). This operation (which is the equivalent of ldc.i4.1; add; neg;), rather than neg, can be used (to some extent) for the integer sign inversion of the maximum negative integer:
    ldc.i40x80000000// Max. negative number for int32,
                      // -2147483648
    not
    call void[mscorlib]System.Console::WriteLine(int32)
    // Output: 2147483647 (0x7FFFFFFF);
    // Of course, it's not +2147483648,
    // which cannot be expressed by an int32,
    // but at least we have the max. positive number

When int32 and native int are used as operands of bitwise instructions on a 64-bit platform, int32 operand is sign-extended to native int.

Shift Operations

Shift operations have no parameters and are defined for integer operands only. The shift operations are binary: they take from the stack the shift count and the value being shifted, in that order, and put the shifted value on the stack. The result always has the same type as the operand being shifted, which can be of any integer type. The type of the shift count cannot be int64 and is limited to int32 or native int.

  • shl (0x62): Shift left.
  • shr (0x63): Shift right (the most significant bits of the result assume the value of the sign bit of the value being shifted).
  • shr.un (0x64): Shift right, treating the shifted value as unsigned (the most significant bits of the result assume the zero value).

It is interesting that, as you can see, there are no rotational shift operations in the IL, and it is not obvious how these operations could be implemented through shl and shr.un because rotational shift operations are size specific: rol(i, n) == (shl(i, n%sizeof(i)) | shr.un(i, sizeof(i)-n%sizeof(i))), where i is the operand being shifted and n is the shift count.

Conversion Operations

The conversion operations have no parameters. They take a value from the stack, convert it to the type specified by the opcode, and put the result back on the stack. The specifics of the conversion obviously depend on the type of the converted value and the target type (the type to which the value is converted). If the type of the value on the stack is the same as the target type, no conversion is necessary, and the operation itself is doing nothing more than bloating the IL code.

For integer source and target types, several rules apply. If the target integer type is narrower than the source type (for example, int32 to int16, or int64 to int32), the value is truncated—that is, the most significant bytes are thrown away. If the situation is the opposite—if the target integer type is wider than the source—the result is either sign-extended or zero-extended, depending on the type of conversion. Conversions to signed integers use sign-extension, and conversions to unsigned integers use zero-extension.

If the source type is a pointer, it can be converted to either unsigned int64 or native unsigned int. In either case, if the converted pointer was managed, it is dropped from the GC tracking and is not automatically updated when the GC rearranges the memory layout. A pointer cannot be used as a target type.

If both source and target types are floating point, the conversion merely results in a change of precision. In float-to-integer conversions, the values are truncated toward 0; for example, the value 1.1 is converted to 1, and the value –2.3 is converted to –2. In integer-to-float conversions, the integer value is simply converted to floating point, possibly losing less significant mantissa bits.

Object references cannot be subject to conversion operations either as a source or as a target.

  • conv.i1 (0x67): Convert the value to int8.
  • conv.u1 (0xD2): Convert the value to unsigned int8.
  • conv.i2 (0x68): Convert the value to int16.
  • conv.u2 (0xD1): Convert the value to unsigned int16.
  • conv.i4 (0x69): Convert the value to int32.
  • conv.u4 (0x6D): Convert the value to unsigned int32.
  • conv.i8 (0x6A): Convert the value to int64.
  • conv.u8 (0x6E): Convert the value to unsigned int64. This operation can be applied to pointers.
  • conv.i (0xD3): Convert the value to native int.
  • conv.u (0xE0): Convert the value to native unsigned int. This operation can be applied to pointers.
  • conv.r4 (0x6B): Convert the value to float32.
  • conv.r8 (0x6C): Convert the value to float64.
  • conv.r.un (0x76): Convert an unsigned integer value to floating point value of “native size.” The “native size” is hardware-specific and generally must be wide enough to represent the result without loss of precision or range. For example, for x86 architecture, the “native floating-point size” is 80 bit (width of an FPU register). The “native-sized” floating-point values cannot be stored in memory and can exist only as intermediate computation results. You cannot have a variable of type “native float;” you can only have a float32 or a float64 variable, and when you store the result of conv.r.un operation in a float32 or a float64 variable, the value is truncated accordingly.

Overflow Conversion Operations

Overflow conversion operations differ from the conversion operations described in the preceding section in two aspects: the target types are exclusively integer types, and an Overflow exception is thrown whenever the value must be truncated to fit the target type. In short, the story is the same as it is with overflow arithmetical operations and arithmetical operations. Note that the overflow condition when converting an integer value depends on this value being considered signed or unsigned. That’s why there are special *.un variants of overflow conversion operations, which presume the integer value on stack being unsigned.

  • conv.ovf.i1 (0xB3): Convert the value to int8.
  • conv.ovf.u1 (0xB4): Convert the value to unsigned int8.
  • conv.ovf.i1.un (0x82): Convert an unsigned integer to int8.
  • conv.ovf.u1.un (0x86): Convert an unsigned integer to unsigned int8.
  • conv.ovf.i2 (0xB5): Convert the value to int16.
  • conv.ovf.u2 (0xB6): Convert the value to unsigned int16.
  • conv.ovf.i2.un (0x83): Convert an unsigned integer to int16.
  • conv.ovf.u2.un (0x87): Convert an unsigned integer to unsigned int16.
  • conv.ovf.i4 (0xB7): Convert the value to int32.
  • conv.ovf.u4 (0xB8): Convert the value to unsigned int32.
  • conv.ovf.i4.un (0x84): Convert an unsigned integer to int32.
  • conv.ovf.u4.un (0x88): Convert an unsigned integer to unsigned int32.
  • conv.ovf.i8 (0xB9): Convert the value to int64.
  • conv.ovf.u8 (0xBA): Convert the value to unsigned int64.
  • conv.ovf.i8.un (0x85): Convert an unsigned integer to int64.
  • conv.ovf.u8.un (0x89): Convert an unsigned integer to unsigned int64.
  • conv.ovf.i (0xD4): Convert the value to native int.
  • conv.ovf.u (0xD5): Convert the value to native unsigned int.
  • conv.ovf.i.un (0x8A): Convert an unsigned integer to native int.
  • conv.ovf.u.un (0x8B): Convert an unsigned integer to native unsigned int.

Logical Condition Check Instructions

Logical condition check operations are similar to comparative branching instructions except that they result not in branching but in putting the condition check result on the stack. The result type is int32, and its value is equal to 1 if the condition checks and 0 otherwise; in other words, logically the result is a Boolean value. The two operands being compared are taken from the stack, and since no branching is performed, the condition check instructions have no parameters.

The logical condition check instructions are useful when you want to store the result of the condition check for multiple use or for later use. If you need the condition check to decide only once and on the spot whether you need to branch, you would be better off using a comparative branching instruction.

The admissible combinations of operand types are the same as for comparative branching instructions (see Table 13-1). There are, however, fewer condition check instructions than conditional branching operations: some conditions are just logical negations of other conditions (“not equal” is not “equal,” “less or equal” is not “greater,” and so on), so it would be redundant to introduce special check instructions for such conditions.

  • ceq (0xFE 0x01): Check whether the two values on the stack are equal.
  • cgt (0xFE 0x02): Check whether the first value is greater than the second value. It’s the stack you are working with, so the “second” value is the one on the top of the stack.
  • cgt.un (0xFE 0x03): Check whether the first value is greater than the second; integer values are compared as unsigned, and floating-point values are compared as unordered.
  • clt (0xFE 0x04): Check whether the first value is less than the second value.
  • clt.un (0xFE 0x05): Check whether the first value is less than the second; integer values are compared as unsigned, and floating-point values are compared as unordered.
  • ckfinite (0xC3): This unary operation, which takes only one value from the stack, is applicable to floating-point values only. It throws an Arithmetic exception if the value is +infinity, -infinity, or NaN. Otherwise, the operation puts the same value back on the stack.

Block Operations

Two IL instructions deal with blocks of memory regardless of the type or types that make up this memory. Because of their type blindness, both instructions are unverifiable.

  • cpblk (0xFE 0x17): Copy a block of memory. The instruction has no parameters and takes three operands from the stack in the following order: the size (in bytes) of the block to be copied (unsigned int32), the source address (a pointer or native int), and the destination address (a pointer or native int). The source and destination addresses must be aligned on the size of native int unless the instruction is prefixed with the unaligned. instruction, described in the “Prefix Instructions” section later in this chapter. The cpblk instruction does not deduce the right direction of byte copying when the source and destination areas overlap, so there is no guarantee in such case. The cpblk instruction puts nothing on the stack.
  • initblk (0xFE 0x18): Initialize a block of memory. The instruction has no parameters and takes three operands from the evaluation stack: the size of the block in bytes (unsigned int32), the initialization value (int8), and the block start address (a pointer or native int). The alignment rules mentioned apply to the block start address. The initblk instruction puts nothing on the stack. As a result of this operation, each byte within the specified block is assigned the initialization value.

Addressing Arguments and Local Variables

A special group of IL instructions is dedicated to loading the values of method arguments and local variables on the evaluation stack and storing the values taken from the stack in local variables and method arguments. It is to be noted that in the case of vararg methods, the argument-addressing instructions described in the following sections cannot target the arguments of the variable part of the signature.

Method Argument Loading

The following instructions are used for loading method argument values on the evaluation stack:

  • ldarg <unsigned int16> (0xFE 0x09): Load the argument number <unsigned int16> on the stack. The argument enumeration is zero based, but it’s important to remember that instance methods have an “invisible” argument not specified in the method signature: the class instance pointer, this, which is always argument number 0. The static methods don’t have such an “invisible” argument, so for them the argument number 0 is the first argument specified in the method signature. The total number of arguments cannot exceed 65535 (0xFFFF), which means the argument ordinal cannot exceed 65534. This limitation stems from the fact that the Sequence entry of the Parameter metadata table is only 2 bytes wide.
  • ldarg.s <unsigned int8> (0x0E): The short-parameter form of ldarg.
  • ldarg.0 (0x02): Load argument number 0 on the stack.
  • ldarg.1 (0x03): Load argument number 1 on the stack.
  • ldarg.2 (0x04): Load argument number 2 on the stack.
  • ldarg.3 (0x05): Load argument number 3 on the stack.

Method Argument Address Loading

These two instructions are used for loading method argument addresses on the evaluation stack:

  • ldarga <unsigned int16> (0xFE 0x0A): Load the address of argument number <unsigned int16> on the stack.
  • ldarga.s <unsigned int8> (0x0F): The short-parameter form of ldarga.

Method Argument Storing

These two instructions are used for storing a value from the stack in a method argument slot:

  • starg <unsigned int16> (0xFE 0x0B): Take a value from the stack and store it in argument slot number <unsigned int16>. The value on the stack must be of the same type as the argument slot or must be convertible to the type of the argument slot. The convertibility rules and effects are the same as those for conversion operations, discussed earlier in this chapter.
  • starg.s <unsigned int8> (0x10): The short-parameter form of starg.

Method Argument List

The following instruction is used exclusively in vararg methods to retrieve the method argument list and put an instance of the value type [mscorlib]System.RuntimeArgumentHandle on the stack. Chapter 10 discusses the application of this instruction.

  • arglist (0xFE 0x00): Get the argument list handle.

Local Variable Loading

Local variable loading instructions are similar to argument loading instructions except that no “invisible” items appear among the local variables, so local variable number 0 is always the first one specified in the local variable signature.

  • ldloc <unsigned int16> (0xFE 0x0C): Load the value of local variable number <unsigned int16> on the stack. Like the argument numbers, local variable numbers can range from 0 to 65534 (0xFFFE). The value 65535, also admissible for unsigned 2-byte integers, is excluded because otherwise the counter of local variables would have to be 4 bytes wide. Limiting the number of the local variables, however standardized, seems arbitrary and implementation specific because the number of the local variables of a method is not stored in the metadata or in the method header, so this limitation comes purely from one particular implementation of the JIT compiler.
  • ldloc.s <unsigned int8> (0x11): The short-parameter form of ldloc.
  • ldloc.0 (0x06): Load the value of local variable number 0 on the stack.
  • ldloc.1 (0x07): Load the value of local variable number 1 on the stack.
  • ldloc.2 (0x08): Load the value of local variable number 2 on the stack.
  • ldloc.3 (0x09): Load the value of local variable number 3 on the stack.

Local Variable Reference Loading

The following instructions load references (managed pointers) to the local variables on the evaluation stack:

  • ldloca <unsigned int16> (0xFE 0x0D): Load the address of local variable number <unsigned int16> on the stack. The local variable number can vary from 0 to 65534 (0xFFFE).
  • ldloca.s <unsigned int8> (0x12): The short-parameter form of ldloca.

Local Variable Storing

It would be strange to have local variables and be unable to assign values to them. The following two instructions take care of this aspect of your life:

  • stloc <unsigned int16> (0xFE 0x0E): Take the value from the stack, and store it in local variable slot number <unsigned int16>. The value on the stack must be of the same type as the local variable slot or must be convertible to the type of the local variable slot. The convertibility rules and effects are the same as those for the conversion operations discussed earlier in this chapter.
  • stloc.s <unsigned int8> (0x13): The short-parameterform of stloc. You’ve probably noticed that using short-parameter forms of argument and local variable manipulation instructions results in a double gain against the standard form: not only is the parameter 1 byte instead of 2, but also the opcode is shorter.

Local Block Allocation

With all due respect to the object-oriented approach, sometimes it is necessary (or just convenient) to obtain a plain, C-style chunk of memory. The IL instruction set provides an instruction for such allocation. It is to be noted, however, that this memory is available only while the method is executing and is deallocated on the method exit (via ret or an exception). Only the allocating method itself and the methods it calls can access this memory.

  • localloc (0xFE 0x0F): Allocate a block of memory for the duration of the method execution. The instruction takes the block size (native unsigned int) from the evaluation stack and puts a managed pointer (&) to the allocated block on the evaluation stack. If not enough space is available on the native thread stack, a StackOverflow exception is thrown. This instruction must not appear within any exception handling block. Like any other block instruction, localloc is unverifiable.

Prefix Instructions

The prefix instructions listed in this section have no meaning per se but are used as prefixes for the pointer-consuming instructions—that is, the instructions that take a pointer value from the stack, such as ldind.*, stind.*, ldfld, stfld, ldobj, stobj, initblk, and cpblk—that immediately follow them. When used as prefixes of instructions that don’t consume pointers, the prefix instructions are ignored and do not carry on to the nearest pointer-consuming instruction.

  • unaligned. <unsigned int8> (0xFE 0x12): Indicates that the pointer(s) on the stack are aligned on <unsigned int8> rather than on the pointer size. The <unsigned int8> parameter must be 1, 2, or 4.
  • volatile. (0xFE 0x13): Indicates that the pointer on the stack is volatile—that is, the value it points at can be modified from another thread of execution and the results of its dereferencing therefore cannot be cached for performance considerations.

A prefix instruction affects only the immediately following instruction and does not mark the respective pointer as unaligned or volatile throughout the entire method. Both prefixes can be used with the same instruction—in other words, the pointer on the stack can be marked as both unaligned and volatile; in such a case, the order of appearance of the prefixes does not matter.

The ILAsm syntax requires the prefix instructions to be separated from the next instruction by at least a space symbol:

volatile. ldind.i4// Correct
 
volatile.
ldind.i4// Correct
 
volatile.ldind.i4// Syntax error

Such a mistake is unlikely with the unaligned. instruction because it requires an integer parameter:

unaligned.4 ldind.i4

The prefix instructions tail. and constrained. are specific to method calling, and the prefix instruction readonly. is specific to array manipulation. These prefix instructions are discussed in respective sections of this chapter.

Addressing Fields

Six instructions can be used to load a field value or an address on the stack or to store a value from the stack in a field. A field signature does not indicate whether the field is static or instance, so the IL instruction set defines separate instructions for dealing with instance and static fields. Instructions dealing with instance fields take the instance pointer—an object reference if the field addressed belongs to a class and a managed pointer if the field belongs to a value type—from the stack.

  • ldfld <token> (0x7B): Take the instance from the stack, and load the value of an instance field on the stack. <token> specifies the field being loaded and must be a valid FieldDef or MemberRef token, uncompressed and uncoded. The “instance” may be an object reference, a managed pointer to instance of a value type, or an instance of a value type itself. An unmanaged pointer to an instance of a value type can also be used, but it renders the code unverifiable.
  • ldsfld <token> (0x7E): Load the value of a static field on the stack.
  • ldflda <token> (0x7C): Take the instance pointer from the stack, and load a managed pointer to the instance field on the stack. Unlike ldfld, this instruction cannot use an instance of a value type and takes only an object reference or a pointer (managed or unmanaged) to an instance of a value type. Using an unmanaged pointer renders the code unverifiable.
  • ldsflda <token> (0x7F): Load a managed pointer to the static field on the stack.
  • stfld <token> (0x7D): Take the value from the stack, take the instance pointer from the stack, and store the value in the instance field. “The instance pointer” in this case, like in ldflda instruction, is either an object reference or a pointer to an instance of a value type. In this respect, stfld is asymmetric to ldfld.
  • stsfld <token> (0x80): Store the value from the stack in the static field.

The ILAsm notation requires full field specification, which is resolved to <token> at compile time:

ldfld int32Foo.Bar::ii

The applicable conversion rules when loading and storing values are the same as those discussed earlier. Note also that the fields cannot be of managed pointer type, as was discussed in Chapter 8.

Calling Methods

Methods can be called directly or indirectly. In addition, you can also use the special case of a so-called tail call, discussed in this section. The method signature indicates whether the method is instance or static, so separate instructions for instance and static methods are unnecessary. What the method signature doesn’t hold, however, is information about whether the method is virtual. As a result, separate instructions are used for calling virtual and nonvirtual methods. Besides, even a virtual method may be called as nonvirtual. In this case, the call is not dispatched through the class’s virtual table, and all your nice overrides have no effect. Because of that, the nonvirtual calls of the virtual methods have been declared unverifiable in version 2.0 except in some specific contexts (see “Direct Calls”).

Method call instructions have one parameter: the token of the method being called, either a MethodDef or a MemberRef. The arguments of the method call should be loaded on the stack in order of their appearance in the method signature, with the last signature parameter being loaded last, which is exactly the opposite of what you would normally expect. Instance methods have an “invisible” first argument (an instance pointer, which is an object reference for reference types or a managed pointer for value types) not present in the signature; when an instance method is called, this instance pointer should be loaded on the stack first, preceding all arguments corresponding to the method signature.

Unless the called method returns void, the return value is left on the stack by the callee when the call is completed.

Direct Calls

The IL instruction set contains three instructions intended for the direct method calls (well, “direct” in the sense that all these instructions directly specify the method being called; some purists would not consider a virtual call “direct” because under the hood it is done via the v-table):

  • jmp <token> (0x27): Abandon the current method and jump to the target method, specified by <token>, transferring the current arguments in their present state (which may be different from the original state because the calling method could change the argument values before the jump). Everything else of the current method, including local variables and locally allocated memory, is abandoned. At the moment jmp is invoked, the evaluation stack must be empty, and the arguments are transferred automatically. Because of this, the signature of the target method must exactly match the signature of the method invoking jmp. This instruction should not be used within EH blocks—.try, catch, filter, fault, or finally blocks, discussed in Chapter 14—or within a synchronized region (the code segment protected by a thread lock, such as a mutex). The jmp instruction is unverifiable.
  • call <token> (0x28): Call an instance or static method nonvirtually. You can also call a virtual method, but in this case it is called not through the type’s v-table. (If this sounds somehow vague to you, you might want to return to Chapter 10 and, more precisely, to the sample file Virt_not.il.) The real difference between virtual and nonvirtual instance methods becomes obvious when you create an instance of a class, cast it to the parent type of the class, and then call instance methods on this “child-posing-as-parent” instance. The nonvirtual methods are called directly, bypassing the child type’s v-table, so the parent’s methods will be called in this case. Virtual methods are called through the v-table, and hence the overriding child’s methods will be called. To confirm this, carry out a simple experiment: open the sample file Virt_not.il in a text editor, and change callvirt instance void A::Bar( ) to call instance void A::Bar( ). Then recompile the sample, and run it.
  • callvirt <token> (0x6F): Call the virtual method specified by <token>. This type of method call is conducted through the instance’s v-table. It is possible to call a nonvirtual instance method using callvirt. In this case, the method is called directly simply because the method cannot be found in the v-table. But unlike call, the callvirt instruction first checks the validity of the object reference (this pointer) before doing anything else, which is a very useful feature. The C# compiler exploits it shamelessly, emitting callvirt to call both virtual and nonvirtual instance methods of classes. I say “of classes” because callvirt requires an object reference as the this pointer and will not accept a managed pointer to a value type instance. To use callvirt with an instance of a value type, you need to box the instance first, thus converting it to a class instance carrying the v-table.

CLR 2.0 or later considers nonvirtual calls of virtual methods unverifiable except in the following cases:

  • If the called method is final (cannot be overridden).
  • If the instance reference is that of a boxed value type (value types are all sealed).
  • If the parent class of the called method is sealed.
  • If the calling method and the called method are instance methods of the same class.
  • If the instance pointer is a managed pointer to a value type.

These exceptions cover almost all cases when the called virtual method is guaranteed not to be overridden. I say “almost” because there is at least one such case not covered—when the type of the object reference is reliably traceable and the called method belongs to this type:

.class public A
{
    ...
    .method public virtual void f() { ... }
}
.class public B extends A
{
    ...
    .method public virtual void f() { ... }
}
...
newobj instance void B:: .ctor()
dup
// The objects on the stack are known to be B (not B's descendants cast to B)
call instance void B::f()// should be verifiable, the type matches the object
call instance void A::f()// unverifiable – the type doesn't match the object
...

Indirect Calls

Methods in IL can be called indirectly through the function pointer loaded on the evaluation stack. This allows you to make calls to computed targets—for example, to call a method by a function pointer returned by another method. Function pointers used in indirect calls are unmanaged pointers represented by native int. Two instructions load a function pointer to a specified method on the stack, and one other instruction calls a method indirectly:

  • ldftn <token> (0xFE 0x06): Load the function pointer to the method specified by <token> of MethodDef or MemberRef type.
  • ldvirtftn <token> (0xFE 0x07): Take the object reference (the instance pointer) from the stack, and load the function pointer to the method specified by <token>. The method is looked up in the instance’s v-table.
  • calli <token> (0x29): Take the function pointer from the stack, take all the arguments from the stack, and make an indirect method call according to the method signature specified by <token>. <token> must be a valid StandAloneSig token. The function pointer must be on the top of the stack. If the method returns a value, it is pushed on the stack at the completion of the call, just like in direct calls. The calli instruction is unverifiable, which is not surprising, considering that the call is made via an unmanaged pointer, which is itself unverifiable.

It’s easy enough to see that the combination ldftn/calli is equivalent to call, as long as you don’t consider verifiability, and the combination ldvirtftn/calli is equivalent to callvirt.

The ILAsm notation requires full specification of the method in the ldftn and ldvirtftn instructions, similar to the call and callvirt instructions. The method signature accompanying the calli instruction is specified as <call_conv> <ret_type>(<arg_list>). For example,

.locals init(native int fnptr)
...
ldftn void[mscorlib]System.Console::WriteLine(int32)
stloc.0// Store function pointer in local variable
...
ldc.i412345// Load argument
ldloc.0// Load function pointer
calli void(int32)
...

Tail Calls

Tail calls are similar to method jumps (jmp) in the sense that both lead to abandoning the current method and passing the arguments to the tail-called (jumped-at) method. However, since the arguments of a tail call have to be loaded on the evaluation stack explicitly (a tail call discards the stack frame of the current method, unlike a jump, which preserves the stack frame and can use the arguments already loaded), a tail call—unlike a jump—does not require the entire signature of the called method to match the signature of the calling method; only the return types must be the same or compatible. The tail calls are very useful in implementing massively recursive methods: the caller’s stack frame is discarded in the process of a tail call, so there is no risk of overflowing the stack, no matter how deep the recursion is. This is important for the functional languages, which use recursion instead of loops.

Tail calls are distinguished by the prefix instruction tail. immediately preceding a call, callvirt, or calli instruction:

  • tail. (0xFE 0x14): Mark the following call instruction as a tail call. This instruction has no parameters and does not work with the stack. In ILAsm, this instruction—like the other prefix instructions unaligned. and volatile., discussed earlier—must be separated from the call instruction that follows it by at least a space symbol.

The difference between a method jump and a tail call is that the tail call instruction pair is verifiable in principle, subject to the verifiability of the call arguments, as long as it is immediately followed (in the caller instruction stream) by the ret instruction. As is the case with other prefix instructions, it is illegal to bypass the prefix and branch directly to the prefixed instruction, in this case, call, callvirt, or calli.

Constrained Virtual Calls

Constrained virtual calls were introduced in version 2.0 of the common language runtime in order to deal with instantiations of generic types or methods when the type whose method is called is represented by a type parameter and hence equally might be a reference type or a value type:

.class public G<(IFoo)T>// Interface IFoo specifies method void Foo(int32)
{
    .method public static void CallVirtFoo(class!T t, int32val)
    {
        // How do I get the object ref?
        // If T is a reference type, I just needldarg.0
        // If T is a value type, I needldarga.s0 and thenbox
        ldarg.1
        callvirt instance void IFoo::Foo(int32)
...

Obviously, the applicable calling mechanism must be identified “on the spot,” when the virtual call is about to be executed and the nature of the type instance becomes known. This is not good—the IL code of the method becomes dependent on the nature of the generic parameter.

The constrained virtual calls, unlike unconstrained calls and virtual calls, require a managed pointer (&) to the type instance (this pointer), whether this instance is an object reference or a value type instance. In unconstrained calls, as you know, the this pointer must be an object reference (O) or a managed pointer (&) to a value type instance. Uniform usage of a managed pointer in constrained calls allows you to use the same IL instructions, “preparing” the instance pointer for the virtual call:

.class public G<(IFoo)T>// Interface IFoo specifies method void Foo(int32)
{
    .method public static void CallVirtFoo(class!T t, int32val)
    {
        ldarga.s0// load managed pointer to t
        ldarg.1
        constrained. class!T
        callvirt instance void IFoo::Foo(int32)
...

The applicable calling mechanism is identified as follows:

  • If the type is a reference type (and hence the this pointer is a managed pointer to object reference), then the this pointer is dereferenced yielding the object reference, and the virtual call is executed on this object reference.
  • If the type is a value type (and hence the this pointer is a managed pointer to a value type instance) and the type implements the specified method, then the nonvirtual call is executed on the this pointer; it is safe to do because value types are sealed, and the virtual methods implemented by them cannot be possibly overridden.
  • If the type is a value type and it does not implement the specified method (must have inherited it from its ancestors System.Object, System.ValueType and maybe System.Enum), then the this pointer is dereferenced yielding a value type instance, which is then boxed yielding an object reference, and the virtual call is executed on this object reference. You cannot use the same trick here and call the method nonvirtually because the value type doesn’t implement it; but at least one of its ancestors does, so the method is present in the (boxed) value type’s v-table.

Constrained calls are distinguished by the prefix instruction constrained. immediately preceding a callvirt or ldvirtftn instruction:

  • constrained. <token> (0xFE 0x16): Mark the following virtual call instruction as a constrained call. The <token> usually is a TypeSpec token representing a generic type variable ({E_T_VAR <ordinal>} or {E_T_MVAR <ordinal>}). However, <token> may be a TypeRef or TypeDef token as well, which means that constrained virtual calls can be used with nongeneric types.

The mechanism of constrained virtual calls unifies the way the methods can be called on reference and value types and hence is very useful to compilers, which now don’t have to figure out whether the instance is an object reference or a value type instance. Considering that some languages don’t even make a distinction between reference types and value types and treat all types as objects, the constrained virtual call mechanism is indeed a good addition to the IL instruction set.

Addressing Classes and Value Types

Being object oriented in its base, IL offers quite a few instructions dedicated specifically to manipulating class and value type instances:

  • ldnull (0x14): Load a null object reference on the stack. This is not the same as ldc.i4.0! The resulting bits are the same (all zero), but the type of the top slot of the evaluation stack becomes an object reference (O).
  • ldobj <token> (0x71): Load an instance of value type specified by <token> on the stack. This instruction takes from the stack the managed pointer to the value type instance to be loaded (obtained via the ldarga, ldloca, or ldflda/ldsflda instruction). <token> must be a valid TypeDef, TypeRef, or TypeSpec token. The name of the instruction is somewhat misleading, for it deals with value type instances rather than objects (class instances). The ILAsm notation requires full specification of the value type so that it can be resolved to the token. For example,
    ldobj[ .module other.dll]Foo.Bar
  • stobj <token> (0x81): Take the value type value—no, that’s not a typo—from the stack, take the managed pointer to the value type instance from the stack, and store the value type value in the instance referenced by the pointer. <token> indicates the value type and must be a valid TypeDef, TypeRef, or TypeSpec token. The ILAsm notation is similar to that used for ldobj.
  • ldstr <token> (0x72): Load a string reference on the stack. <token> is a token of a user-defined string, whose RID portion is actually an offset in the #US blob stream. This instruction performs a lot of work: by the token, the Unicode string is retrieved from the #US stream, an instance of the [mscorlib]System.String class is created from the retrieved string, and the object reference is pushed on the stack. In ILAsm, the string is specified explicitly either as a (possibly) composite quoted string, like
    ldstr"Hello World!"
    ldstr"Hello"+" World!"

    or as a byte array, like

    ldstr bytearray(A1 00 A2 00 A3 00 A4 00 A5 00 00 00)

    In the first case, at compile time the composite quoted string is converted to Unicode before being stored in the #US stream. In the second case, the byte array is stored “as is” without conversion. It can be padded with one 0 byte to make the byte count even (if you forget to do it, the IL assembler will do it as a courtesy). Storing a string in the #US stream gives the compiler the string token, which it puts into the IL stream.

  • cpobj <token> (0x70): Copy the value of one value type instance to another instance. This instruction pops the source and the target instance pointers and pushes nothing on the stack. Both instances must be of the value type specified by <token>, either a TypeDef token or a TypeRef token. The ILAsm notation for this instruction is similar to that used for ldobj or stobj.
  • newobj <token> (0x73): Allocate memory for a new instance of a class—not a value type—and call the instance constructor method specified by <token>. The object reference to newly created class instance is pushed on the stack. <token> must be a valid MethodDef, MemberRef or MethodSpec token of a .ctor. The instruction takes from the stack all the arguments explicitly specified in the constructor signature but does not take the instance pointer (no instance exists yet; it’s being created):
    newobj instance void[mscorlib]System.Object:: .ctor()

    The newobj instruction is also used for array creation:

    newobj instance void int32[0...,0...]:: .ctor(int32, int32)

    An array constructor takes as many parameters as there are undefined lower bounds and sizes of the array being created. (Hence, the same number of integer values must be loaded on the stack before newobj is invoked.) In the example just shown, both lower bounds of the two-dimensional array are specified in the array type, so you need to specify only two sizes.

  • initobj <token> (0xFE 0x15): Initialize the value type instance. This instruction takes an instance pointer—a managed pointer to a value type instance—from the stack. <token> specifies the value type and must be a valid TypeDef, TypeRef, or TypeSpec token. The initobj instruction zeroes all the fields of the value type instance, so if you need more sophisticated initialization, you might want to define .ctor and call it instead.
  • castclass <token> (0x74): Cast a class instance to the class specified by <token>. This instruction takes the object reference to the original instance from the stack and pushes the object reference to the cast instance on the stack. <token> must be a valid TypeDef, TypeRef, or TypeSpec token. If the specified cast is illegal, the instruction throws InvalidCast exception. The cast is legal if one of the following conditions is true:
    • If <token> indicates a class and the object reference on the stack is an instance of this class or of any class derived from it.
    • If <token> indicates an interface and the object reference is an instance of the class implementing this interface.
    • If <token> indicates a value type and the object reference is a boxed instance of this value type.
  • isinst <token> (0x75): Check to see whether the object reference on the stack is an instance of the class specified by <token>. <token> must be a valid TypeDef, TypeRef, or TypeSpec token. This instruction takes the object reference from the stack and pushes the result on the stack. If the check succeeds, the result is an object reference, as if castclass had been invoked; otherwise, the result is a null reference, as if ldnull had been invoked. This instruction does not throw exceptions. The check succeeds under the same conditions as listed above.
  • box <token> (0x8C): Convert a value type instance to an object reference. <token> specifies the value type being converted and must be a valid TypeDef, TypeRef, or TypeSpec token. This instruction takes the value type instance from the stack, creates a new instance of the type as an object, and pushes the object reference to this instance on the stack. Since a copy of the instance is created, all further changes of the original instance of the value type have no effect on the boxed instance.
  • unbox <token> (0x79): Revert a boxed value type instance from the object form to its value type form. <token> specifies the value type being converted and must be a valid TypeDef, TypeRef, or TypeSpec token. This instruction takes an object reference from the stack and pushes back a managed pointer to the value type instance. As you can see, unbox is asymmetric to the box instruction: box takes an instance of value type, and unbox returns a managed pointer to such instance; box creates a copy of the instance, and unbox returns a pointer to the value part of the existing instance.
  • unbox.any <token> (0xA5): Introduced in version 2.0 of the CLR, this unboxing instruction is symmetric to the box instruction because it returns an instance of the value type.
  • mkrefany <token> (0xC6): Take a pointer—either managed or unmanaged—from the stack, convert it to a typed reference (typedref), and push the typed reference back on the stack. The typed reference is an opaque handle that carries both type information and an instance pointer. The type of the created typedref is specified by <token>, which must be a valid TypeDef, TypeRef, or TypeSpec token. Typically, this instruction is used to create the typedref values to be passed as arguments to methods that expect typedref parameters. These methods split the typed references into type information and instance pointers using the refanytype and refanyval instructions.
  • refanytype (0xFE 0x1D): Take a typed reference from the stack, retrieve the type information, and push the internal type handle on the stack. This instruction has no parameters.
  • refanyval <token> (0xC2): Take a typed reference from the stack, retrieve the instance pointer (& or native int), and push it on the stack. This instruction has one parameter <token>, which must be a valid TypeDef, TypeRef, or TypeSpec token and must match the type of the typed reference or be its ancestor. In other words, the type of the typed reference must be castable to the type specified by <token>; otherwise, the instruction throws an exception. Why do you need to specify the type by <token> when the type is already present in the typed reference? Well, the type is indeed present, but you need to specify it explicitly for the sake of the verifier, which performs static analysis of the IL code. Without explicit specification of the type, the result of the refanyval instruction would have “whatever type was encoded in the typed reference,” unidentifiable in static analysis. And you don’t want refanyval instruction to be absolutely unverifiable, now do you?
  • ldtoken <token> (0xD0): Convert <token> to an internal handle to be used in calls to the [mscorlib]System.Reflection methods in the .NET Framework class library. The admissible token types are MethodDef, MemberRef, MethodSpec, TypeDef, TypeRef, TypeSpec, and FieldDef. The handle pushed on the stack is an instance of one of the following value types: [mscorlib]System.RuntimeMethodHandle, [mscorlib]System.RuntimeTypeHandle, or [mscorlib]System.RuntimeFieldHandle.

    The ILAsm notation requires full specification for classes (value types), methods, and fields used in ldtoken. This instruction is the only IL instruction that is not specific to methods only or fields only, and thus the keyword method or field must be used:

    ldtoken[mscorlib]System.String
    ldtoken method instance void[mscorlib]System.Object:: .ctor( )
    ldtoken field int32Foo.Bar::ff
  • sizeof <token> (0xFE 0x1C): Load the size in bytes of the value type specified by <token> on the stack. <token> must be a valid TypeDef, TypeRef, or TypeSpec token. This instruction can be applied to the reference types as well, but the usefulness of such an application is questionable: for reference types, sizeof always returns pointer size (4 or 8, depending on the underlying platform).
  • throw (0x7A): Take the object reference from the stack and throw it as a managed exception. See Chapter 14 for details about exception handling.
  • rethrow (0xFE 0x1A): Throw the caught exception again. This instruction can be used exclusively within exception handlers. This instruction does not take anything from the stack. The rethrown exception is whatever was last thrown on the corresponding thread.

Vector Instructions

Arrays and vectors are the only true generics implemented in the first release of the common language runtime. Vectors are “elementary” arrays, with one dimension and a zero lower bound. In signatures, vectors are represented by type ELEMENT_TYPE_SZARRAY, whereas “true” arrays are represented by ELEMENT_TYPE_ARRAY. The two different array types have different layouts and are for the most part unrelated to each other. You can, of course, declare a single-dimensional, zero-lower-bound array (whose ILAsm notation is <type>[0...]), which will be a true array, as opposed to a vector (whose ILAsm notation is <type>[ ]).

image Note  The IL instruction set defines specific instructions dealing with vectors but not with arrays. To handle array elements and arrays themselves, you need to call the methods of the .NET Framework class [mscorlib]System.Array, from which all arrays are derived. However, don’t look in vain among the System.Array’s methods to find the most useful ones—Get, Set, and Address. These methods are provided by the runtime, and unlike other runtime-provided methods, they are not reflected in the metadata of Mscorlib.dll. The Get method takes N (where N is the rank of the array) arguments (all int32) representing indexes in respective array dimensions and returns the value of the indexed element. The Address method takes the same arguments and returns the managed pointer to the indexed element. The Set method takes N indexes and the element value to be assigned, assigns the specified value to the indexed element, and returns void.

Now let’s get back to the vectors.

Vector Creation

In order to work with a vector, it is necessary to create one. The IL instruction set contains special instructions for vector creation and vector length querying:

  • newarr <token> (0x8D): Create a vector. <token> specifies the type of vector elements and must be a valid TypeDef, TypeRef, or TypeSpec token. This instruction pops the number of vector elements (native int or int32) from the stack and pushes an object reference to the created vector on the stack. If the operation fails, an OutOfMemory exception is thrown. If the number of elements happens to be negative, an Overflow exception is thrown. The elements of the newly created vector are zero initialized. For example,
    .locals init(int32[] arr)
    ldc.i4123
    newarr[mscorlib]System.Int32// newarr int32 would work too
    stloc.0

    For specific details about array creation, see the description of the newobj instruction.

  • ldlen (0x8E): Get the element count of a vector. This instruction takes an object reference to the vector instance from the stack and puts the element count (native int) on the stack.

Element Address Loading

You can obtain a managed pointer to a single vector element by using the following instruction:

  • ldelema <token> (0x8F): Get the address (a managed pointer) of a vector element. <token> specifies the type of the element and must be a valid TypeDef, TypeRef, or TypeSpec token. This instruction takes the element index (native int) and the vector reference (an object reference) from the stack and pushes the managed pointer to the element on the stack.
  • readonly. (0xFE 0x1E): Prefix instruction, introduced in version 2.0 of the CLR, to be used with the ldelema instruction. With this prefix, ldelema yields not a managed pointer to the element but a controlled mutability managed pointer, which can be used only as a source pointer but not as a destination pointer. For example, you can use a controlled mutability pointer as an instance pointer when accessing instance fields or methods, as a source pointer in cpobj instruction, or as a pointer in an ldind or ldobj instruction. But it cannot be used in any instruction that would change the instance it points to, such as initobj, stind, stobj, or mkrefany. The main reason for introducing this prefix instruction is to avoid the type check ldelema must execute when the type specified might be a reference class. If the ldelema instruction is preceded by the readonly. prefix, the type check is suppressed, because it is known that the obtained pointer cannot be used for writing purposes (the referent type instance cannot be mutated). Of course, the type might have a method that mutates the instance, and the controlled mutability pointer can be used to call this method and thus mutate the instance. Also, you can use this pointer for the stfld instruction, and if it doesn’t mutate the instance, I don’t know what does. So, a controlled mutability pointer is not exactly a read-only pointer. However, you can only fiddle with this very instance (pointed at by the controlled mutability pointer); you cannot replace it with another instance, and in this sense the pointer is read-only.

Element Loading

Element loading instructions load a vector element of an elementary type on the stack. All these instructions take the element index (native int) and the vector reference (an object reference) from the stack and put the value of the element on the stack. If the vector reference is null, the instructions throw a NullReference exception. If the index is negative or greater than or equal to the element count of the vector, an IndexOutOfRange exception is thrown.

  • ldelem.i1 (0x90): Load a vector element of type int8.
  • ldelem.u1 (0x91): Load a vector element of type unsigned int8.
  • ldelem.i2 (0x92): Load a vector element of type int16.
  • ldelem.u2 (0x93): Load a vector element of type unsigned int16.
  • ldelem.i4 (0x94): Load a vector element of type int32.
  • ldelem.u4 (0x95: Load a vector element of type unsigned int32.
  • ldelem.i8 (ldelem.u8) (0x96): Load a vector element of type int64.
  • ldelem.i (0x97): Load a vector element of type native int.
  • ldelem.r4 (0x98): Load a vector element of type float32.
  • ldelem.r8 (0x99): Load a vector element of type float64.
  • ldelem.ref (0x9A): Load a vector element of object reference type.
  • ldelem (ldelem.any) <token> (0xA3): Load a vector element of the type specified by <token>, which must be a valid TypeDef, TypeRef, or TypeSpec token. This instruction was introduced in version 2.0 of the CLR to deal with, as you probably guessed, vectors and arrays of a type defined by a generic type variable. The good thing about this instruction (and its sister instruction stelem described next) is that it allows you to load elements of vectors of an arbitrary value type.

Element Storing

Element storing instructions store a value from the stack in a vector element of an elementary type. All these instructions take the value to be stored, the element index (native int), and the vector reference (an object reference) from the stack and put nothing on the stack. Generally, the instructions can throw the same exceptions as the ldelem.* instructions described in the preceding section.

  • stelem.i (0x9B): Store a value in a vector element of type native int.
  • stelem.i1 (0x9C): Store a value in a vector element of type int8.
  • stelem.i2 (0x9D): Store a value in a vector element of type int16.
  • stelem.i4 (0x9E): Store a value in a vector element of type int32.
  • stelem.i8 (0x9F): Store a value in a vector element of type int64.
  • stelem.r4 (0xA0): Store a value in a vector element of type float32.
  • stelem.r8 (0xA1): Store a value in a vector element of type float64.
  • stelem.ref (0xA2): Store a value in a vector element of the object reference type. This instruction involves the casting of the object on the stack to the type of the vector element, so an InvalidCast exception can be thrown.
  • stelem (stelem.any) <token> (0xA4): Store a value in a vector element of the type specified by <token>, which must be a valid TypeDef, TypeRef, or TypeSpec token. This instruction was introduced in version 2.0 of the CLR to support vectors and arrays of the type defined by a generic type variable.

Special stelem.* instructions for unsigned integer types are missing for an obvious reason: the stelem.i* instructions are equally applicable to signed and unsigned integer types because the sizes of these integer types are the same.

Code Verifiability

The verification algorithm associates IL instructions with the number of stack slots occupied and available at each moment and with valid evaluation stack states. Stack overflows and underflows render the code not only unverifiable but invalid as well. The verification algorithm also requires that all local variables are zero initialized before the method execution begins. As a result, the .locals directive—at least one, if several of these are used throughout the method—must have the init clause in order for the method to be verifiable.

The verification algorithm simulates all possible control flow paths and branchings, checking to see whether the stack state corresponding to every reachable instruction is legal for this instruction. It is impossible, of course, to predict the actual values stored on the evaluation stack at every moment, but the number of stack slots occupied and the types of the slots can be analyzed.

As mentioned, the evaluation stack type system is coarser than the metadata type system used for field, argument, and local variable types. Hence, the type validity of instructions transferring data between the stack and other typed memory categories depends on the type conversion performed during such transfers. Table 13-5 lists type conversions between different type systems (for example, a value of a local variable of type int8 loaded on the stack becomes int32, and the managed pointer to the same local variable is int8&).

Table 13-5. Evaluation Stack Type Conversions

Metadata Type

Stack Type

Managed Pointer to Type

[unsigned] int8, bool

int32

int8&

[unsigned] int16, char

int32

int16&

[unsigned] int32

int32

int32&

[unsigned] int64

int64

int64&

native [unsigned] int, function pointer

native int

native int&

float32

Float (“native-sized”)

float32&

float64

Float (“native-sized”)

float64&

Value type

Same type (see substitution rules in this section)

Same type&

Object

Same type (see substitution rules in this section)

Same type&

According to verification rules, if top-of-stack has type A, then the current instruction, expecting to find a type B, is verifiable in the following cases only:

  • If A is a class and B is the same class or any class derived from A.
  • If A an interface and B is a class implementing this interface.
  • If both A and B are interfaces and the implementation of B requires the implementation of A.
  • If A is a class or an interface and B is a null reference.
  • If both A and B are function pointers and the signatures of their respective methods match.
  • If both A and B are vectors and their element types can be respectively substituted as outlined previously.
  • If both A and B are arrays of the same rank and their element types can be respectively substituted.
  • If both A and B are vectors or both are arrays of the same rank and their element types are identically sized integers (for example, int32[] and uint32[]).

These substitution rules set the limits of “type leeway” allowed for the IL code to remain verifiable. As the verification algorithm proceeds from one instruction to another along every possible path, it checks the simulated stack types against the types expected by the next instruction. Failure to comply with the substitution rules results in verification failure and possibly indicates invalid IL code.

A few verification rules, rather heuristic than formal, are based on the question “Is it possible in principle to do something unpredictable using this construct?”

  • Any code containing embedded native code is unverifiable.
  • Any code using unmanaged pointers is unverifiable.
  • Any code containing calls to methods that return managed pointers is unverifiable, unless the managed pointers are to instances of reference types or their parts. The reason is that, theoretically, the called method might return a managed pointer to one of its local variables or another “perishable” item. The instances of reference types reside in the GC heap and cannot be considered “perishable.”
  • An instance constructor of a class must call the instance constructor of the base class. The reason is that until the base class .ctor is called, the instance pointer (this) is considered uninitialized; and until this is initialized, no instance methods should be called.
  • When a delegate is being instantiated—its constructor takes a function pointer as the last argument—the newobj instruction must be immediately preceded by the ldftn or ldvirtftn instruction, which loads the function pointer. If anything appears between these two instructions, the code becomes unverifiable.

A great many additional rules regulate exception handling, but the place to discuss them is the next chapter.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.118.0.42