Usually the exception handling model of a programming language is considered the domain of that particular language’s runtime. Under the hood, each language has its own way of detecting exceptions and locating an appropriate exception handler. Some languages perform exception handling completely within the language runtime, whereas others rely on the structured exception handling (SEH) mechanism provided by the operating system—which in your case is Win32 or Win64.
In the world of managed code, exception handling is a fundamental feature of the common language runtime execution engine. The execution engine is fully capable of handling exceptions without regard to language, allowing exceptions to be raised in one language and caught in another. Moreover, the runtime does not dictate any particular syntax for handling exceptions. The exception mechanism is language neutral in that it is equally efficient for all languages.
No special metadata is captured for exceptions other than the metadata for the exception classes themselves. No association exists between a method of a class and the exceptions that the method might throw. Any method is permitted to throw any exception at any time.
Although we talk about managed exceptions thrown and caught within managed code, a common scenario involves a mix of both managed and unmanaged code. Execution threads routinely traverse managed and unmanaged blocks of code through the use of the common language runtime’s platform invocation mechanism (P/Invoke) and other interoperability mechanisms (see Chapter 18). Consequently, during execution, exceptions can be thrown or caught in either managed code or unmanaged code.
The runtime exception handling mechanism integrates seamlessly with the Win32/Win64 SEH mechanism so that exceptions can be thrown and caught within and between the two exception-handling systems.
EH Clause Internal Representation
Managed exception handling tables are located immediately after a method’s IL code, with the beginning of the table aligned on a double word boundary. It would be more accurate to say that “additional sections” are located after the method IL code, but the existing releases of the common language runtime allow only one kind of additional section: the exception handling section.
This additional section begins with the section header, which comes in two varieties (small and fat) and contains two entries, Kind and DataSize. In a small header, DataSize is represented by 1 byte, whereas in a fat header, DataSize is 3 bytes long. A Kind entry can contain the following binary flags:
The section header—padded with 2 bytes if small, which makes one wonder why the small header was introduced at all—is followed by a sequence of EH clauses, which can also have a small or fat format. Each EH clause describes a single triad made up of a guarded block, an exception identification, and an exception handler. The entries of small and fat EH clauses have the same names and meanings but different sizes, as shown in Table 14-1.
Table 14-1. EH Clause Entries
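As a rough illustration of the layout just described, the following Python sketch decodes one EH section from raw bytes. The entry widths (12-byte small clauses, 24-byte fat clauses) and the 0x40 value of the fat-format bit are taken from the ECMA-335 method body layout rather than from this chapter, and parse_eh_section is a hypothetical helper name, so treat this as a sketch under those assumptions rather than a reference decoder.

```python
import struct

FAT_FORMAT = 0x40  # assumed value of the "fat" bit in Kind (per ECMA-335)
HEADER_SIZE = 4    # small header: Kind + DataSize + 2 padding bytes; fat: Kind + 3-byte DataSize
FIELDS = ("Flags", "TryOffset", "TryLength",
          "HandlerOffset", "HandlerLength", "ClassTokenOrFilterOffset")

def parse_eh_section(data):
    """Decode one EH section into (is_fat, list of clause dicts)."""
    is_fat = bool(data[0] & FAT_FORMAT)
    if is_fat:
        data_size = int.from_bytes(data[1:4], "little")  # 3-byte DataSize
        fmt = "<6I"                                      # six 4-byte entries
    else:
        data_size = data[1]                              # 1-byte DataSize
        fmt = "<HHBHBI"                                  # 2+2+1+2+1+4 bytes
    clause_size = struct.calcsize(fmt)
    # DataSize is assumed to include the 4-byte header.
    count = (data_size - HEADER_SIZE) // clause_size
    clauses = []
    for i in range(count):
        values = struct.unpack_from(fmt, data, HEADER_SIZE + i * clause_size)
        clauses.append(dict(zip(FIELDS, values)))
    return is_fat, clauses

# A hand-built small section holding a single catch clause (made-up values).
small = bytes([0x01, 16, 0, 0]) + struct.pack("<HHBHBI", 0, 0, 10, 10, 5, 0x01000002)
is_fat, clauses = parse_eh_section(small)
```

Both header varieties occupy 4 bytes, so the clause array starts at the same offset in either case; only the width of each clause entry changes.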
Branching into or out of guarded blocks and handler blocks is illegal. A guarded block must be entered “through the top”—that is, through the instruction located at TryOffset—and handler blocks are entered only when they are engaged by the exception handling subsystem of the execution engine. To exit guarded and handler blocks, you must use the instruction leave (or leave.s). You might recall that in Chapter 2, this principle was formulated as “leave only by leave.” Another way to leave any block is to throw an exception using the throw or rethrow instruction.
Types of EH Clauses
Exception handling clauses are classified by the algorithm of the handler engagement. Four mutually exclusive EH clause types are available, and because of that, the Flags entry must hold one of the following values:
Figure 14-1 illustrates this process. If an exception of type A is thrown within the innermost guarded block, it is caught and processed by the first handler (catch A), and the finally handler is engaged after the first handler executes the leave instruction. If an exception of type B is thrown, it is caught by the third handler (catch B); this fact is registered by the runtime, and the finally handler is executed before the third handler. If no exception is thrown within the guarded block, the finally handler is engaged when the guarded block executes the leave instruction.
Figure 14-1. Engagement of the finally exception handler
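The engagement order described for Figure 14-1 can be imitated with ordinary Python exception handling. Python's exception model is not the CLR's, and the classes A and B and the function run are hypothetical stand-ins, but for this particular nesting the observable order of handlers is the same.

```python
class A(Exception):
    """Stand-in for exception type A in Figure 14-1."""

class B(Exception):
    """Stand-in for exception type B in Figure 14-1."""

def run(exc_type=None):
    """Execute the nested blocks, optionally raising exc_type in the guarded code."""
    log = []
    try:
        try:
            try:
                log.append("guarded")
                if exc_type is not None:
                    raise exc_type()
            except A:
                log.append("catch A")
        finally:
            log.append("finally")
    except B:
        log.append("catch B")
    return log

order_for_a = run(A)     # handler for A first, then finally
order_for_b = run(B)     # finally first, then handler for B
order_for_none = run()   # finally runs when the guarded block is left normally
```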
Label Form of EH Clause Declaration
The most generic form of ILAsm notation of an EH clause is
.try <label> to <label> <EH_type_specific> handler <label> to <label>
where <EH_type_specific> ::=
catch <class_ref> |
filter <label> |
finally |
fault
Take a look at this example:
BeginTry:
...
leave KeepGoing
BeginHandler:
...
leave KeepGoing
KeepGoing:
...
ret
.try BeginTry to BeginHandler catch [mscorlib]System.Exception
handler BeginHandler to KeepGoing
In the final lines of the example, the code .try <label> to <label> defines the guarded block, and handler <label> to <label> defines the handler block. In both cases, the second <label> is exclusive, pointing at the first instruction after the respective block. ILAsm imposes a limitation on the positioning of the EH clause declaration directives: all labels used in the directives must have already been defined. Thus, the best place for EH clause declarations in the label form is at the end of the method scope.
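To make the exclusive end labels concrete, here is a small Python sketch that turns label-form bounds into the offset and length pairs stored in an EH clause. The byte offsets assigned to the labels are invented for illustration, and clause_from_labels is a hypothetical helper, not part of any tool.

```python
def clause_from_labels(offsets, try_begin, try_end, handler_begin, handler_end):
    """Convert label-form bounds to offset/length pairs; end labels are exclusive."""
    return {
        "TryOffset": offsets[try_begin],
        "TryLength": offsets[try_end] - offsets[try_begin],
        "HandlerOffset": offsets[handler_begin],
        "HandlerLength": offsets[handler_end] - offsets[handler_begin],
    }

# Hypothetical byte offsets for the labels used in the example above.
offsets = {"BeginTry": 0, "BeginHandler": 12, "KeepGoing": 20}
clause = clause_from_labels(offsets, "BeginTry", "BeginHandler",
                            "BeginHandler", "KeepGoing")
```

Because each end label points at the first instruction after its block, a plain subtraction yields the block length, and the guarded block and handler in this example meet at BeginHandler without overlapping.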
In the case just presented, the handler block immediately follows the guarded block, but you could put the handler block anywhere within the method, provided it does not overlap with the guarded block or other handlers:
...
br AfterHandler // Can't enter the handler block on our own
BeginHandler:
...
leave KeepGoing
AfterHandler:
...
BeginTry:
...
leave KeepGoing
KeepGoing:
...
ret
.try BeginTry to KeepGoing catch [mscorlib]System.Exception
handler BeginHandler to AfterHandler
A single guarded block can have several catch or filter handlers:
...
br AfterHandler2 // Can't enter the handler block(s) on our own
BeginHandler1:
...
leave KeepGoing
AfterHandler1:
...
BeginHandler2:
...
leave KeepGoing
AfterHandler2:
...
BeginTry:
...
leave KeepGoing
KeepGoing:
...
ret
.try BeginTry to KeepGoing
catch [mscorlib]System.NullReferenceException
handler BeginHandler1 to AfterHandler1
.try BeginTry to KeepGoing catch [mscorlib]System.Exception
handler BeginHandler2 to AfterHandler2
In the case of multiple handlers—catch or filter but not finally or fault, as explained next—the guarded block declaration need not be repeated.
.try BeginTry to KeepGoing
catch [mscorlib]System.NullReferenceException
handler BeginHandler1 to AfterHandler1
catch [mscorlib]System.Exception
handler BeginHandler2 to AfterHandler2
The lexical order of handlers belonging to the same guarded block is the order in which the IL assembler emits the EH clauses and is the same order in which the execution engine of the runtime processes these clauses. You must be careful about ordering the handlers. For instance, if you swap the handlers in the preceding example, the handler for [mscorlib]System.Exception will always be executed, and the handler for [mscorlib]System.NullReferenceException will never be executed. This is because all standard exceptions are (and user-defined should be) derived, eventually, from [mscorlib]System.Exception, and hence all exceptions are caught by the first handler, leaving the other handlers unemployed.
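The effect of handler order can be modeled in a few lines of Python. This is only a sketch: Python's built-in Exception stands in for [mscorlib]System.Exception, the NullReferenceException class is a local stand-in, and first_matching_handler is a hypothetical name for the "first clause whose type matches wins" scan.

```python
class NullReferenceException(Exception):
    """Stand-in for [mscorlib]System.NullReferenceException."""

def first_matching_handler(clauses, exc):
    """Return the handler label of the first catch clause whose type matches exc."""
    for exc_type, handler_label in clauses:
        if isinstance(exc, exc_type):
            return handler_label
    return None

# Handlers in the order used in the example above: specific type first.
ordered = [(NullReferenceException, "BeginHandler1"), (Exception, "BeginHandler2")]
# The same handlers swapped: the base type now shadows the derived one.
swapped = list(reversed(ordered))

chosen_ordered = first_matching_handler(ordered, NullReferenceException())
chosen_swapped = first_matching_handler(swapped, NullReferenceException())
```

With the original order the NullReferenceException reaches BeginHandler1; with the swapped order the same exception is taken by BeginHandler2, and BeginHandler1 can never be selected.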
The finally and fault handlers cannot peacefully coexist with other handlers, so if a guarded block has a finally or fault handler, it cannot have anything else. To combine a finally or fault handler with other handlers, you need to nest the guarded and handler blocks within other guarded blocks, as shown in Figure 14-1, so that each finally or fault handler has its own personal guarded block.
Scope Form of EH Clause Declaration
The label form of the EH clause declaration is universal, ubiquitous, and close to the actual representation of the EH clauses in the EH table. The only quality the label form lacks is convenience. In view of that, ILAsm offers an alternative form of EH clause description: a scope form. You’ve already encountered the scope form in Chapter 2, which discussed protecting the code against possible surprises in the unmanaged code being invoked. Just to remind you, here’s what the protected part of the method (from the sample file Simple2.il on the Apress web site) looks like:
...
.try {
    // Guarded block begins
    call string [mscorlib]System.Console::ReadLine()
    // pop
    // ldnull
    ldstr "%d"
    ldsflda int32 Odd.or.Even::val
    call vararg int32 sscanf(string, string, ..., int32*)
    stloc.0
    leave.s DidntBlowUp
    // Guarded block ends
}
catch [mscorlib]System.Exception
{   // Exception handler begins
    pop
    ldstr "KABOOM!"
    call void [mscorlib]System.Console::WriteLine(string)
    leave.s Return
}   // Exception handler ends
DidntBlowUp:
...
The scope form can be used only for a limited subset of all possible EH clause configurations: the handler blocks must immediately follow the previous handler block or the guarded block. If the EH clause configuration is different, you must resort to the label form or a mixed form in which the guarded block is scoped but the catch handler is specified by IL offsets, or vice versa.
...
br AfterHandler
HandlerBegins:
// The exception handler code
...
leave KeepGoing
AfterHandler:
...
.try {
    // Guarded code
    ...
    leave KeepGoing
}
catch [mscorlib]System.Exception
    handler HandlerBegins to AfterHandler
...
KeepGoing:
...
The IL disassembler by default outputs the EH clauses in the scope form, at least those clauses that can be represented in this form. However, there is an option to suppress the scope form and output all EH clauses in their label form (command-line option /RAWEH). But let’s suppose for the sake of convenience that you can shape the code in such a way that the contiguity condition is satisfied, allowing you to use the scope form. A single guarded block with multiple handlers in scope form will look like this:
.try {
    // Guarded code
    ...
    leave KeepGoing
}
catch [mscorlib]System.NullReferenceException {
    // The exception handler #1 code
    ...
    leave KeepGoing
}
catch [mscorlib]System.Exception {
    // The exception handler #2 code
    ...
    leave KeepGoing
}
...
KeepGoing:
...
Much more readable, isn’t it? The nested EH configuration shown earlier in Figure 14-1 is easily understandable when written in scope form:
.try {
    .try {
        .try {
            // Guarded code
            ...
            leave L1
        }
        catch A {
            // This code works when exception A is thrown
            ...
            leave L2
        }
    } // No need for leave here!
    finally {
        // This code works in any case
        ...
        endfinally
    }
} // No need for leave here either!
catch B {
    // This code works when exception B is thrown in guarded code
    ...
    leave L3
}
L1:
...
L2:
...
L3:
...
The filter EH clauses in scope form are subject to the same limitation: the handler block must immediately follow the guarded block. But in a filter clause the handler block includes first the filter block and then, immediately following it, the actual handler, so the scope form of a filter clause looks like this:
.try {
    // Guarded code
    ...
    leave KeepGoing
}
filter {
    // Here we decide whether we should invoke the actual handler
    ...
    ldc.i4.1 // OK, let's invoke the handler
    endfilter
} {
    // Actual handler code
    ...
    leave KeepGoing
}
KeepGoing:
...
And, of course, you can easily switch between scope form and label form within a single EH clause declaration. The general ILAsm syntax for an EH clause declaration is as follows:
<EH_clause> ::= .try <guarded_block>
<EH_type_specific> <handler_block>
where
<guarded_block> ::= <label> to <label> | <scope>
<EH_type_specific> ::= catch <class_ref> |
filter <label> | filter <scope> |
finally |
fault
<handler_block> ::= handler <label> to <label> | <scope>
The nonterminals <label> and <class_ref> must be familiar by now, and the meaning of <scope> is obvious: “code enclosed in curly braces.”
Processing the Exceptions
The execution engine of the CLR processes an exception in two passes. The first pass determines which, if any, of the managed handlers will process the exception. Starting at the top of the EH table for the current method frame, the execution engine compares the address where the exception occurred to the TryOffset and TryLength entries of each EH clause. If it finds that the exception happened in a guarded block, the execution engine checks to see whether the handler specified in this clause will process the exception. (The “rules of engagement” for catch and filter handlers were discussed in previous sections.) If this particular handler can’t be engaged—for example, the wrong type of exception has been thrown—the execution engine continues traversing the EH table in search of other clauses that have guarded blocks covering the exception locus. The finally and fault handlers are ignored during the first pass.
If none of the clauses in the EH table for the current method are suited to handling the exception, the execution engine steps up the call stack and starts checking the exception against EH tables of the method that called the method where the exception occurred. In these checks, the call site address serves as the exception locus. This process continues from method frame to method frame up the call stack, until the execution engine finds a handler to be engaged or until it exhausts the call stack. The latter case is the end of the story: the execution engine cannot continue with an unhandled exception on its conscience, and the runtime executes all finally and fault handlers and then either aborts the application execution or offers the user a choice between aborting the execution and invoking the debugger, depending on the runtime configuration.
If a suitable handler is found, the execution engine swings into the second pass. The execution engine again walks the EH tables it worked with during the first pass and invokes all relevant finally and fault handlers. Each of these handlers ends with the endfinally instruction (or endfault, its synonym), signaling the execution engine that the handler has finished and that it can proceed with browsing the EH tables. Once the execution engine reaches the catch or filter handler it found on its first pass, it engages the actual handler.
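The two passes can be modeled with a short Python sketch. It is a simplified illustration, not the runtime's implementation: the frame and clause dictionaries, the dispatch function, and the clause names are all hypothetical, Python exception classes stand in for managed types, inner clauses are assumed to be listed before enclosing ones in each table, and the unhandled case simply stops instead of running the remaining finally and fault handlers.

```python
def covers(clause, locus):
    """True if locus lies inside the clause's guarded block (end is exclusive)."""
    start = clause["TryOffset"]
    return start <= locus < start + clause["TryLength"]

def dispatch(frames, exc):
    """Model two-pass dispatch; frames is innermost-first: [(locus, eh_table), ...]."""
    log = []
    target = None
    # Pass 1: find a catch or filter clause that accepts the exception.
    for depth, (locus, table) in enumerate(frames):
        for index, clause in enumerate(table):
            if not covers(clause, locus):
                continue
            if clause["kind"] == "catch":
                accepted = isinstance(exc, clause["type"])
            elif clause["kind"] == "filter":
                log.append("filter: " + clause["name"])
                accepted = clause["filter"](exc) == 1
            else:
                accepted = False  # finally and fault are ignored in pass 1
            if accepted:
                target = (depth, index)
                break
        if target is not None:
            break
    if target is None:
        log.append("unhandled")  # this sketch stops here
        return log
    # Pass 2: run covering finally/fault handlers, then the chosen handler.
    for depth, (locus, table) in enumerate(frames):
        for index, clause in enumerate(table):
            if (depth, index) == target:
                log.append("handler: " + clause["name"])
                return log
            if covers(clause, locus) and clause["kind"] in ("finally", "fault"):
                log.append(clause["kind"] + ": " + clause["name"])
    return log

callee_table = [
    {"kind": "finally", "name": "cleanup", "TryOffset": 0, "TryLength": 10},
]
caller_table = [
    {"kind": "catch", "name": "on-KeyError", "type": KeyError,
     "TryOffset": 0, "TryLength": 8},
    {"kind": "filter", "name": "on-ValueError",
     "filter": lambda e: 1 if isinstance(e, ValueError) else 0,
     "TryOffset": 0, "TryLength": 8},
]
# The exception occurs at offset 5 in the callee; the call site is offset 3 in the caller.
frames = [(5, callee_table), (3, caller_table)]
events = dispatch(frames, ValueError("boom"))
```

For this input, events is ["filter: on-ValueError", "finally: cleanup", "handler: on-ValueError"]: the caller's filter runs during the first pass, before the callee's finally handler, which is engaged only in the second pass.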
What happens to the method’s evaluation stack? When a guarded block is exited in any way, the evaluation stack is discarded. If the guarded block is exited peacefully, without raising an exception, the leave instruction discards the stack; otherwise, the evaluation stack is discarded the moment the exception is thrown.
During the first pass, the execution engine puts the exception object on the evaluation stack every time it invokes a filter block. The filter block pops the exception object from the stack and analyzes it, deciding whether this is a job for its actual handler block. The decision, in the form of int32 having the value 1 or 0 (engage the handler or don’t, respectively), is the only thing that must be on the evaluation stack when the endfilter instruction is reached; otherwise, the IL verification will fail. The endfilter instruction takes this value from the stack and passes it to the execution engine.
During the second pass, the finally and fault handlers are invoked with an empty evaluation stack. These handlers do nothing about the exception itself and work only with method arguments and local variables, so the execution engine doesn’t bother providing the exception object. If anything is left on the evaluation stack by the time the endfinally (or endfault) instruction is reached, it is discarded by endfinally (or endfault).
When the actual handler is invoked, the execution engine puts the exception object on the evaluation stack. The handler pops this object from the stack and handles it to the best of its abilities. When the handler is exited by using the leave instruction, the evaluation stack is discarded.
Table 14-2 summarizes the stack evolutions.
Table 14-2. Changes in the Evaluation Stack
Block | When the Block Is Entered, the Stack… | When the Block Is Exited, the Stack…
---|---|---
try | Must be empty | Is discarded
filter | Holds the exception object | Must hold a single int32 value, equal to 1 or 0, consumed by endfilter
handler | Holds the exception object | Is discarded
finally, fault | Is empty | Is discarded
Two IL instructions are used for raising an exception explicitly: throw and rethrow. The throw instruction takes the exception object (ObjectRef) from the stack and raises the exception. This instruction can be used anywhere, within or outside any EH block.
The rethrow instruction can be used within catch handlers only (not within the filter block), and it does not work with the evaluation stack. This instruction signals the execution engine that the handler that was supposed to take care of the caught exception has for some reason changed its mind and that the exception should therefore be offered to the higher-level EH clauses. The only blocks where the words “caught exception” mean something are the catch handler block and the filter block, but invoking rethrow within a filter block, though theoretically possible, is illegal. It is legal to throw the caught exception from the filter block, but it doesn’t make much sense to do so: the effect is the same as if the filter simply refused to handle the exception, by loading 0 on the stack and invoking endfilter.
Rethrowing an exception is not the same as throwing the caught exception, which you have on the evaluation stack upon entering a catch handler. The rethrow instruction preserves the call stack trace of the original exception so that the exception can be tracked down to its point of origin. The throw instruction starts the call stack trace anew, giving you no way to determine where the original exception came from.
Exception Types
Chapter 13 mentioned some of the exception types that can be thrown during the execution of IL instructions. Earlier chapters mentioned some of the exceptions thrown by the loader and the JIT compiler. Now it’s time to review all these exceptions in an orderly manner.
All managed exceptions defined in the .NET Framework classes are descendants of the [mscorlib]System.Exception class. This base exception type, however, is never thrown by the common language runtime. In the following sections, I’ve listed the exceptions the runtime does throw, classifying them by major runtime subsystems. Enjoying the monotonous repetition no more than you do, I’ve omitted the [mscorlib]System. part of the names, common to all exception types. As you can see, many of the exception type names are self-explanatory.
Loader Exceptions
The loader represents the first line of defense against erroneous applications, and the exceptions it throws are related to the file presence and integrity:
JIT Compiler Exceptions
The JIT compiler throws only two exceptions. The second one can be thrown only when the code is not fully trusted (for example, comes from the Internet).
Execution Engine Exceptions
The execution engine throws a wide variety of exceptions, most of them related to the operations on the evaluation stack. A few exceptions are thrown by the thread control subsystem of the engine.
Interoperability Services Exceptions
The following exceptions are thrown by the interoperability services of the common language runtime, which are responsible for managed and unmanaged code interoperation:
Subclassing the Exceptions
In addition to the plethora of exception types already defined in the .NET Framework classes, you can always devise your own types tailored to your needs. The best way to do this is to derive your exception types from the “standard” types listed in the preceding sections.
The following exception types are sealed and can’t be subclassed. Again, I’ve omitted the [mscorlib]System. portion of the names.
Caution: As mentioned earlier, I must warn you against devising your own exception types not derived from [mscorlib]System.Exception or some other exception type of the .NET Framework classes.
Unmanaged Exception Mapping
When an unmanaged exception occurs within a native code segment, the execution engine maps it to a managed exception that is thrown in its stead. The different types of unmanaged exceptions, identified by their status code, are mapped to the managed exceptions as described in Table 14-3.
Table 14-3. Mapping Between the Managed and Unmanaged Exceptions
Unmanaged Exception Status Code | Mapped to Managed Exception
---|---
STATUS_FLOAT_INEXACT_RESULT | ArithmeticException
STATUS_FLOAT_INVALID_OPERATION | ArithmeticException
STATUS_FLOAT_STACK_CHECK | ArithmeticException
STATUS_FLOAT_UNDERFLOW | ArithmeticException
STATUS_FLOAT_OVERFLOW | OverflowException
STATUS_INTEGER_OVERFLOW | OverflowException
STATUS_FLOAT_DIVIDE_BY_ZERO | DivideByZeroException
STATUS_INTEGER_DIVIDE_BY_ZERO | DivideByZeroException
STATUS_FLOAT_DENORMAL_OPERAND | FormatException
STATUS_ACCESS_VIOLATION | NullReferenceException
STATUS_ARRAY_BOUNDS_EXCEEDED | IndexOutOfRangeException
STATUS_NO_MEMORY | OutOfMemoryException
STATUS_STACK_OVERFLOW | StackOverflowException
All other status codes | Runtime.InteropServices.SEHException
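The mapping in Table 14-3 amounts to a lookup with a fallback, which the following Python sketch expresses. It works with the symbolic status names only (no numeric codes are assumed), and map_unmanaged is a hypothetical helper name.

```python
# Status names and managed exception names exactly as listed in Table 14-3.
SEH_TO_MANAGED = {
    "STATUS_FLOAT_INEXACT_RESULT": "ArithmeticException",
    "STATUS_FLOAT_INVALID_OPERATION": "ArithmeticException",
    "STATUS_FLOAT_STACK_CHECK": "ArithmeticException",
    "STATUS_FLOAT_UNDERFLOW": "ArithmeticException",
    "STATUS_FLOAT_OVERFLOW": "OverflowException",
    "STATUS_INTEGER_OVERFLOW": "OverflowException",
    "STATUS_FLOAT_DIVIDE_BY_ZERO": "DivideByZeroException",
    "STATUS_INTEGER_DIVIDE_BY_ZERO": "DivideByZeroException",
    "STATUS_FLOAT_DENORMAL_OPERAND": "FormatException",
    "STATUS_ACCESS_VIOLATION": "NullReferenceException",
    "STATUS_ARRAY_BOUNDS_EXCEEDED": "IndexOutOfRangeException",
    "STATUS_NO_MEMORY": "OutOfMemoryException",
    "STATUS_STACK_OVERFLOW": "StackOverflowException",
}

def map_unmanaged(status_name):
    """Return the managed exception name for a status name; unknown codes fall back to SEHException."""
    return SEH_TO_MANAGED.get(status_name, "Runtime.InteropServices.SEHException")

mapped_av = map_unmanaged("STATUS_ACCESS_VIOLATION")
mapped_other = map_unmanaged("STATUS_SOMETHING_ELSE")  # made-up name to show the fallback
```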
Summary of EH Clause Structuring Rules
The rules for structuring EH clauses within a method are neither numerous nor overly complex.
All the blocks—try, filter, handler, finally, and fault—of each EH clause must be fully contained within the method code. No block can protrude from the method.
The guarded blocks and the handler blocks belonging to the same EH clause or different EH clauses can’t partially overlap. A block either is fully contained within another block or is completely outside it. If one guarded block (A) is contained within another guarded block (B) but is not equal to it, all handlers assigned to A must also be fully contained within B.
A handler block of an EH clause can’t be contained within a guarded block of the same clause, and vice versa. And a handler block can’t be contained in another handler block that is assigned to the same guarded block.
A filter block can’t contain any guarded blocks or handler blocks.
All blocks must start and end on instruction boundaries—that is, at offsets corresponding to the first byte of an instruction. Prefixed instructions must not be split, meaning that you can’t have constructs such as tail. .try { call ... }.
A guarded block must start at a code point where the evaluation stack is empty.
The same handler block can’t be associated with different guarded blocks:
.try Label1 to Label2 catch A handler Label3 to Label4
.try Label4 to Label5 catch B handler Label3 to Label4 // Illegal!
If the EH clause is a filter type, the filter’s actual handler must immediately follow the filter block. Since the filter block must end with the endfilter instruction, this rule can be formulated as “the actual handler starts with the instruction after endfilter.”
If a guarded block has a finally or fault handler, the same block can have no other handler. If you need other handlers, you must declare another guarded block, encompassing the original guarded block and the handler:
.try {
    .try {
        .try {
            // Code that needs finally, catch, and fault handlers
            ...
            leave KeepGoing
        }
        finally {
            ...
            endfinally
        }
    }
    catch [mscorlib]System.StackOverflowException
    {
        ...
        leave KeepGoing
    }
}
fault {
    ...
    endfault
}
KeepGoing:
...
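Some of the rules above lend themselves to mechanical checking. The Python sketch below covers three of them only (no partial overlap, no handler block shared by different guarded blocks, and no mixing of finally or fault with other handlers on one guarded block). The clause representation with half-open (start, end) ranges and the check_clauses function are assumptions made for illustration; this is not how the runtime or any verifier is implemented.

```python
def partially_overlap(a, b):
    """a and b are half-open (start, end) ranges; nesting and disjointness are fine."""
    (a0, a1), (b0, b1) = a, b
    disjoint = a1 <= b0 or b1 <= a0
    nested = (a0 <= b0 and b1 <= a1) or (b0 <= a0 and a1 <= b1)
    return not (disjoint or nested)

def check_clauses(clauses):
    """Return a list of rule violations found in a method's EH clauses."""
    problems = []
    # Rule: blocks either nest or are disjoint.
    blocks = sorted({c["try"] for c in clauses} | {c["handler"] for c in clauses})
    for i, a in enumerate(blocks):
        for b in blocks[i + 1:]:
            if partially_overlap(a, b):
                problems.append(f"blocks {a} and {b} partially overlap")
    # Rule: one handler block belongs to one guarded block only.
    owner = {}
    for c in clauses:
        if owner.setdefault(c["handler"], c["try"]) != c["try"]:
            problems.append(f"handler {c['handler']} is shared by different guarded blocks")
    # Rule: a guarded block with finally or fault has no other handler.
    kinds_by_try = {}
    for c in clauses:
        kinds_by_try.setdefault(c["try"], []).append(c["kind"])
    for rng, kinds in kinds_by_try.items():
        if len(kinds) > 1 and any(k in ("finally", "fault") for k in kinds):
            problems.append(f"guarded block {rng} mixes finally/fault with other handlers")
    return problems

# Nested layout: the finally clause guards the inner try and its catch handler.
nested_ok = [
    {"kind": "catch", "try": (0, 10), "handler": (10, 20)},
    {"kind": "finally", "try": (0, 20), "handler": (20, 30)},
]
overlapping = [{"kind": "catch", "try": (0, 10), "handler": (5, 15)}]
mixed = [
    {"kind": "catch", "try": (0, 10), "handler": (10, 20)},
    {"kind": "finally", "try": (0, 10), "handler": (20, 30)},
]
```

Here check_clauses(nested_ok) reports nothing, while the other two inputs each produce one violation.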