Chapter 18. Unsafe Code

Unsafe code can access unmanaged memory, which is outside the realm of the Common Language Runtime (CLR). Conversely, safe code is limited to accessing the managed heap. The managed heap is managed by the Garbage Collector, which is a component of the CLR. Code restricted to the managed heap is intrinsically safer than code that accesses unmanaged memory. The CLR automatically releases unused objects, performs type verification, and conducts other checks on managed memory. This is not done automatically for unmanaged memory; rather, the developer is responsible for these tasks. With managed code, developers can focus on core application development instead of administrative tasks such as memory management. For this reason, safe code improves programmer productivity and customer satisfaction.

You can access unmanaged memory with raw pointers, which are only available to unsafe code. Pointers point to a fixed location in unmanaged memory, whereas reference types point to a movable location in managed memory. The CLR manages reference types, which includes controlling the lifetime of objects and calling cleanup code. Developers do not delete memory allocated for reference types. In C and C++ programs, where pointers are used extensively, developers are preoccupied with memory management. Despite this, improper pointer management causes many common problems, including memory leaks, accessing invalid memory, deleting bad pointers, and fencepost errors. Abstracting the nuances of pointer management and manipulation with reference types has made managed code safer than unmanaged code. However, when needed, you can write unsafe code and access pointers directly.

When is unsafe code appropriate? Unsafe code should be used as an exception, not the rule. There are specific circumstances in which unsafe code is recommended:

  • Unmanaged code often relies heavily on pointers. When porting this code to C#, incorporating some unsafe code in the managed application might make the conversion more straightforward. Most nontrivial C and C++ programs leverage pointers heavily.

  • Implementing a software algorithm where pointers are integral to the design might necessitate unsafe code.

  • Calling an unmanaged function that requires a function pointer as a parameter.

  • Pointers might be easier and more convenient when working with binary and memory-resident data structures.

  • Unmanaged pointers might improve performance and efficiencies in certain circumstances.

Code in an unmanaged section is considered unsafe and not accessible to the CLR. Therefore, no code verification, stack tracing, or other checking is performed on the unmanaged code, which makes the code less safe.

Developers sometimes need to call unmanaged code from managed applications. Although the Microsoft .NET Framework Class Library (FCL) contains most of the code needed for .NET application development, the FCL umbrella does not encompass everything. You might need to call application programming interfaces (APIs) in operating system libraries for behavior defined outside the FCL. In addition, third-party software might be available only as unmanaged code.

Alternatively, you might need to call managed code from an unmanaged module, such as during a callback. In addition, managed components might be exposed as COM objects to COM clients, which are unmanaged.

Platform invoke (P/Invoke) is the bridge between managed and unmanaged execution. The bridge is bidirectional. Marshaling is the primary concern of calls that cross the managed/unmanaged boundary and is the responsibility of the Interop marshaler. Marshaling converts parameters and return values between unmanaged and managed formats. Fortunately, marshaling is not always required, which can avoid unnecessary overhead. Certain types, known as blittable types, do not require transformation and are the same in managed and unmanaged memory.

You also can build bridges between managed code and COM components, which contain unmanaged code. The Runtime Callable Wrapper (RCW) helps managed code call COM components. The COM Callable Wrapper (CCW) wraps a managed component as a COM component. This makes the managed component accessible to unmanaged COM clients. COM components also are available directly via P/Invoke. However, the CCW and RCW are more convenient and are the recommended solutions in most circumstances. COM interoperability is not a topic for this book. COM Programming with Microsoft .NET, by John Paul Mueller and Julian Templeman (Microsoft Press, 2003), is an excellent resource for additional information on COM interoperability and .NET.

Because code access security does not extend to unsafe code, unsafe code is not trusted. Type verification, which helps prevent buffer overrun attacks, is not performed, nor is code verification. Therefore, the reliability of the unsafe code is undetermined. Because it is not trusted, elevated permissions are required to call unsafe code from managed code. For this reason, applications that rely on unsafe code might not execute successfully in every deployment situation and should be thoroughly tested in all potential scenarios. Managed code requires the SecurityPermission.UnmanagedCode permission to call unsafe code. The SuppressUnmanagedCodeSecurityAttribute attribute disables the stack walk that confirms the SecurityPermission.UnmanagedCode permission in callers. This attribute is a free pass for other managed code to call unsafe code. This option is convenient but potentially dangerous.

Managed applications that include unsafe code must be compiled with the unsafe option. The C# compiler option is simply /unsafe. In Microsoft Visual Studio 2008, this option is found in the project properties. In Solution Explorer, right-click the project name and choose Properties from the context menu. Alternatively, select <project> Properties from the Project menu. On the Build tab, choose the Allow Unsafe Code option, as shown in Figure 18-1.

Figure 18-1. The Build tab of the <project> Properties window, with the Allow Unsafe Code option selected

Unsafe Keyword

The unsafe keyword specifies the location of unsafe code. Code inside the target can be unsafe. When the keyword is applied to a type, all the members of that type are considered unsafe as well. You also can apply the unsafe keyword to specific members of a type. If applied to a function member, the entire function operates in the unsafe context.

In the following code, the ZStruct contains two fields that are pointers. Each is annotated with the unsafe keyword:

public struct ZStruct {
    public unsafe int* fielda;
    public unsafe double* fieldb;
}

In the following example, ZStruct is marked as unsafe. The unsafe context extends to the entire structure, which includes the two fields. Both fields are therefore considered unsafe.

public unsafe struct ZStruct {
    public int* fielda;
    public double* fieldb;
}

In addition, you can create an unsafe block using the unsafe statement. All code encapsulated by the block is in the unsafe context. The following code has an unsafe block and an unsafe method. Within the unsafe block in the Main method, MethodA is called and passed an int pointer as a parameter. MethodA is an unsafe method. It casts the int pointer to a byte pointer, which now points to the lowest-addressed byte of the int value. On a little-endian platform such as x86, that is the low-order byte. The value at that byte is returned from MethodA. For an int value of 296 (0x128), MethodA returns 40 (0x28).

public static void Main() {
    int number = 296;
    byte b;
    unsafe {
        b = MethodA(&number);
    }
    Console.WriteLine(b);
}

public unsafe static byte MethodA(int* pI) {
    byte* temp = (byte*) pI;

    return *temp;
}
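The result of 40 depends on byte order. The following is a minimal sketch, not part of the original listing, that reads all four bytes of an int in memory order. The names ByteView and BytesOf are hypothetical, and the printed bytes differ on a big-endian platform.

```csharp
using System;

public static class ByteView {
    // Copies the bytes of an int in memory order, lowest address first.
    public static unsafe byte[] BytesOf(int value) {
        // A value parameter lives on the stack, so its address can be taken directly.
        byte* p = (byte*)&value;
        byte[] result = new byte[sizeof(int)];
        for (int i = 0; i < sizeof(int); ++i) {
            result[i] = p[i];
        }
        return result;
    }

    public static void Main() {
        byte[] bytes = BytesOf(296);   // 296 == 0x00000128
        Console.WriteLine(BitConverter.IsLittleEndian
            ? "little-endian" : "big-endian");
        // Prints the bytes in hexadecimal, in memory order.
        Console.WriteLine(BitConverter.ToString(bytes));
    }
}
```

On a little-endian machine the first byte is 0x28, which is the 40 returned by MethodA.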

The unsafe status of a base class is not inherited by a derived class. Unless explicitly designated as unsafe, a derived class is safe. The derived class can use unsafe members of the base class that are visible.

In the following code, a compiler error occurs in the derived type. The fieldb member of YClass requires an unsafe context, which is not inherited from the ZClass base class. Add the unsafe keyword explicitly to fieldb, and the code will compile successfully:

public unsafe class ZClass {
    protected int* fielda;
}

public class YClass: ZClass {
    protected int* fieldb;  // compiler error
}
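As a sketch of the fix just described, the derived class below marks its own pointer field as unsafe. The BothNull method and the InheritanceDemo class are hypothetical additions so that the example has something to run.

```csharp
using System;

public unsafe class ZClass {
    protected int* fielda;
}

public class YClass : ZClass {
    // The unsafe modifier is applied explicitly; it is not inherited from ZClass.
    protected unsafe int* fieldb;

    // Any member that touches the pointer fields also needs an unsafe context.
    public unsafe bool BothNull() {
        // Pointer fields default to null.
        return fielda == null && fieldb == null;
    }
}

public static class InheritanceDemo {
    public static void Main() {
        Console.WriteLine(new YClass().BothNull());
    }
}
```

Marking the whole YClass as unsafe would work equally well; marking individual members keeps the unsafe surface smaller.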

Pointers

Unsafe code is often about direct access to pointers, which point to a fixed location in memory. Because the location is fixed, the pointer is reliable and can be used for dereferencing, pointer math, and other traditional pointer-type manipulation. Pointers are outside the control of the Garbage Collector. The developer (not the Garbage Collector) is responsible for managing the lifetime of the pointer, if necessary.

C# does not expose pointers automatically. Exposing a pointer requires an unsafe context. In C#, pointers normally are abstracted using references. The reference abstracts a pointer to memory on the managed heap. The reference and related memory are managed by the Garbage Collector and the related memory is subject to relocation. A movable pointer underlies a reference, which is why references are not available for direct pointer manipulation. Pointer manipulation on a movable address would yield unreliable results.

Here is the syntax for declaring a pointer:

unmanagedtype* identifier;
unmanagedtype* identifier = initializer;

You can declare multiple pointers in a single statement using comma delimiters. Notice that the syntax is slightly different from C or C++ languages:

int* pA, pB, pC; // C++: int *pA, *pB, *pC;

The unmanaged types (a subset of managed types) that can be used with pointers are sbyte, byte, short, ushort, int, uint, long, ulong, char, float, double, decimal, bool, and enum. Some managed types, such as string, are not included in this list. You can create pointers to user-defined structures, assuming that they contain only unmanaged types as fields. Pointer types do not inherit from System.Object, so they cannot be cast to or from System.Object.

Void pointers are allowed, but they are dangerous. This is a typeless pointer that can emulate any other pointer type. All pointer types can be cast implicitly to a void pointer. This unpredictability makes void pointers particularly unsafe. You cannot cast implicitly between unrelated pointer types. Explicit casting between most pointer types is allowed. As expected, the following code would cause a compiler error because of the pointer mismatch. This assignment could be forced with an explicit cast to another pointer type. In that circumstance, the developer assumes responsibility for the safety of the pointer assignment.

int val = 5;
float* pA = &val; // compiler error
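For comparison, here is a sketch of the explicit cast that forces the assignment. The names CastDemo and BitsAsFloat are hypothetical. The cast does not convert the value; it reinterprets the same 32 bits as a float, which is exactly the responsibility the developer takes on.

```csharp
using System;

public static class CastDemo {
    // Reinterprets the 32 bits of an int as a float through an explicit pointer cast.
    public static unsafe float BitsAsFloat(int bits) {
        int* pI = &bits;
        float* pF = (float*)pI;   // compiles, but the compiler no longer checks the types
        return *pF;
    }

    public static void Main() {
        // 0x3F800000 is the IEEE 754 single-precision bit pattern for 1.0.
        Console.WriteLine(BitsAsFloat(0x3F800000));
    }
}
```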

You can initialize a pointer with the address of a value or with another pointer. In the following code, both methods of initializing a pointer are shown:

public unsafe static void Main() {
    int ival = 5;
    int* p1 = &ival;  // pointer initialized with the address of a value
    int* p2 = p1;     // pointer initialized to another pointer
}

In the preceding code, the asterisk (*) is used to declare a pointer. The ampersand (&) is used to obtain the address of a variable. Table 18-1 describes the various symbols that are used with pointers.

Table 18-1. Pointer symbols

Symbol

Description

Pointer declaration (*)

For pointers, the asterisk symbol has two purposes. The first is to declare new pointer variables:

int* pA;

Pointer dereference (*)

The second purpose of the asterisk is to dereference a pointer. Pointers point to an address in memory. Dereferencing a pointer returns the value at that address in memory:

int val = 5;
int* pA = &val;
Console.WriteLine(*pA); // displays 5

You cannot dereference a void pointer.

Address of (&)

The ampersand symbol returns the memory location of a variable, which is a fixed value. The following code initializes the int pointer pA with the memory address of an int named val:

int* pA = &val;

Member access (->)

Arrow notation dereferences members of a pointer type found at a memory location. For example, you can access members of a structure using arrow notation and a pointer. In the following code, ZStruct is a structure, and fielda is an integer member of that type:

ZStruct obj = new ZStruct(5);
ZStruct* pObj = &obj;
int val1 = pObj->fielda;

Alternatively, you can dereference the pointer and access a member using dot syntax (.):

int val2 = (*pObj).fielda; // dot syntax

Pointer element ([])

A pointer element is an offset from the memory address of a pointer. For example, p[2] is an offset of two elements. Offsets are scaled by the size of the pointed-to type. If p is an int pointer, p[2] is an offset of eight bytes. In the following code, assume that ZStruct has two int fields in contiguous memory: fielda and fieldb:

ZStruct obj = new ZStruct(5);
int* pA = &obj.fielda;
Console.WriteLine(pA[1]); // fieldb

Pointer to a pointer (**)

A pointer to a pointer contains a location in memory that lists the address of another pointer. Although rarely useful, you can extend the chain of pointers even further (***, ****, and so on). You can dereference a pointer to a pointer with a double asterisk (**). Alternatively, you can dereference a pointer to a pointer using individual asterisks in separate steps:

int val = 5;
int* pA = &val;
int** ppA = &pA;
// Address stored in ppA, which is pA.
Console.WriteLine((int)ppA);
// Address stored in pA.
Console.WriteLine((int)*ppA);
// value at address stored in pA (5).
Console.WriteLine((int)**ppA);

Pointer addition (+)

Pointer addition adds a multiple of the size of the pointed-to type to the memory location. Adding n to a pointer advances it by n elements. This changes a pointer so that it points to a different location:

ZStruct obj = new ZStruct(5);
int* pA = &obj.fielda;
pA = pA + 2; // Add eight to pointer

Pointer subtraction (-)

Pointer subtraction subtracts a multiple of the size of the pointed-to type from the pointer. Subtracting n moves the pointer back by n elements. This changes a pointer so that it points to a different location:

ZStruct obj = new ZStruct(5);
int* pA = &obj.fieldd;
pA = pA - 3; // Subtract twelve from pointer

Pointer increment (++)

Pointer increment increments the pointer address by the size of the pointed-to type:

ZStruct obj = new ZStruct(5);
int* pA = &obj.fielda;
++pA; // increment pointer by four

Pointer decrement (--)

Pointer decrement decrements the pointer address by the size of the pointed-to type:

ZStruct obj = new ZStruct(5);
int* pA = &obj.fieldb;
--pA; // decrement pointer by four

Relational symbols

The relational operators, such as < > >= <= != ==, can be used to compare pointers. The comparison is based on memory location rather than on pointer type:

ZStruct obj = new ZStruct(5);
int* pA = &obj.fielda;
int val = 5;
int* pB = &val;
if (pA == pB) {
    Console.WriteLine("Pointers point to the same object.");
}
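The operators in Table 18-1 can be exercised together. The following sketch uses a stackalloc buffer rather than adjacent struct fields, so that contiguity does not depend on any assumption about struct layout. The names PointerOps and Run are hypothetical.

```csharp
using System;

public static class PointerOps {
    public static unsafe int[] Run() {
        int* buffer = stackalloc int[4];      // four contiguous ints
        for (int i = 0; i < 4; ++i) {
            buffer[i] = (i + 1) * 10;         // pointer element: 10, 20, 30, 40
        }

        int* p = buffer;
        int first = *p;                       // dereference
        p = p + 2;                            // addition: advances 2 * sizeof(int) bytes
        int third = *p;
        ++p;                                  // increment: one more element
        int fourth = *p;
        --p;                                  // decrement: back one element
        p = p - 2;                            // subtraction: back to the start
        int backAtStart = (p == buffer) ? 1 : 0;  // relational comparison of addresses

        return new int[] { first, third, fourth, backAtStart };
    }

    public static void Main() {
        foreach (int value in Run()) {
            Console.WriteLine(value);
        }
    }
}
```

Run returns 10, 30, 40, and 1. The final 1 confirms that the pointer is back at the base of the buffer.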

Pointer Parameters and Pointer Return Values

A pointer is a legitimate variable. As such, a pointer can be used as a variable in most circumstances, including as a parameter or return type. When used as a return type, you should ensure that the lifetime of the pointer target is the same as or greater than that of the function caller. For example, do not return a pointer to a local variable from a function—the local variable loses scope outside the function and the pointer then becomes invalid.

In the following code, a pointer is used as both a parameter and return type. MethodA accepts a pointer as a parameter. It then returns the same pointer. After the method call, both pB and pA point to the same location in memory. They are aliases. Therefore, Console.WriteLine displays the same number when the values at the pointers are displayed:

using System;

namespace Donis.CSharpBook {
    public class Starter {
        public unsafe static void Main() {
            int val = 5;
            int* pA = &val;
            int* pB;
            pB = MethodA(pA);
            Console.WriteLine("*pA = {0} | *pB = {1}",
                *pA, *pB);
        }

        public unsafe static int* MethodA(int* pArg) {
            *pArg += 15;
            return pArg;
        }
    }
}

The ref or out modifier can be applied to pointer parameters. Without the modifier, the memory location is passed by pointer. The pointer itself is passed by value on the stack. In the function, you can dereference the pointer and change values at the memory location. These changes will persist even after the function exits. However, changes to the pointer itself are discarded when the function exits. With the ref or out modifier, a pointer parameter is passed by reference. In the function, the pointer can be changed directly. Those changes continue to persist even after the function exits.

In the following code, both MethodA and MethodB have a pointer as a parameter. MethodA passes the pointer by value, whereas MethodB passes the pointer by reference. In both methods, the actual pointer is changed. The change is discarded when MethodA exits. When MethodB exits, the change persists:

using System;

namespace Donis.CSharpBook {
    public class Starter {

        public unsafe static void Main() {
            int val = 5;
            int* pA = &val;
            Console.WriteLine("Original: {0}", (int) pA);
            MethodA(pA);
            Console.WriteLine("MethodA:  {0}", (int) pA);
            MethodB(ref pA);
            Console.WriteLine("MethodB:  {0}", (int) pA);
        }
        public unsafe static void MethodA(int* pArg) {
            ++pArg;
        }

        public unsafe static void MethodB(ref int* pArg) {
            ++pArg;
        }
    }
}
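Raw addresses vary from run to run, so the difference can be easier to see by comparing the pointer with its starting position. The following sketch makes that comparison; the names RefPointerDemo, AdvanceByValue, AdvanceByRef, and Check are hypothetical.

```csharp
using System;

public static class RefPointerDemo {
    static unsafe void AdvanceByValue(int* pArg) {
        ++pArg;   // changes only the local copy of the pointer
    }

    static unsafe void AdvanceByRef(ref int* pArg) {
        ++pArg;   // changes the caller's pointer
    }

    // Element 0: the pointer is unchanged after the by-value call.
    // Element 1: the pointer moved one element after the by-ref call.
    public static unsafe bool[] Check() {
        int* data = stackalloc int[2];
        int* p = data;

        AdvanceByValue(p);
        bool unchanged = (p == data);

        AdvanceByRef(ref p);
        bool movedOne = (p == data + 1);

        return new bool[] { unchanged, movedOne };
    }

    public static void Main() {
        bool[] result = Check();
        Console.WriteLine("Unchanged after by-value call: {0}", result[0]);
        Console.WriteLine("Moved after by-ref call:       {0}", result[1]);
    }
}
```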

Fixed Statements

What is wrong with the following code?

int[] numbers = { 1,2,3,4,5,6 };
int* pI = numbers; // compiler error

The problem is that the numbers variable is an array, which is a reference type. The code will not compile because the array is assigned to a pointer. References are movable types and cannot be implicitly converted to pointers. However, structures are value types and are placed on the stack and outside of the control of the Garbage Collector. Struct values have a fixed address and are easily converted into pointers. In the preceding code, if the type were changed from an array to a structure, it would compile successfully. With the fixed statement, you pin the location of a movable type—at least temporarily. Be careful, though. Pinning memory for an extended period of time can interfere with efficient garbage collection.

Here is the code revised with the fixed statement. This code compiles successfully:

int[] numbers = { 1,2,3,4,5,6 };
fixed (int* pI = numbers) {
    // do something
}

The fixed statement pins memory for the span of a fixed block. In the block, the memory is unmovable and is exempt from garbage collection. You can access the pinned memory using the pointer from the fixed statement, which is a read-only pointer. When the fixed block exits, the memory is unpinned. Multiple pointers can be declared in the fixed statement. The pointers are delimited with commas, and only the first pointer is prefixed with the asterisk (*):

int[] n1 = { 1,2,3,4 };
int[] n2 = { 5,6,7,8 };
int[] n3 = { 9,10,11,12 };
fixed (int* p1 = n1, p2 = n2, p3 = n3) {
}

Here is a more complete example of using the fixed statement:

using System;

namespace Donis.CSharpBook {
    public class Starter {

        private static int[] numbers = { 5,10,15,20,25,30 };

        public unsafe static void Main() {
           int count = 0;
           Console.WriteLine(" Pointer   Value\n");
           fixed (int* pI = numbers) {
               foreach (int a in numbers) {
                   Console.WriteLine("{0} : {1}",
                       (int)(pI+count), *((int*)pI + count));
                   ++count;
               }
           }
        }
    }
}
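Because the pointer declared in a fixed statement is read-only, code that walks an array usually copies it into a second pointer. The following sketch shows that pattern; the names FixedSum and Sum are hypothetical.

```csharp
using System;

public static class FixedSum {
    // Adds the elements of an array by walking a pinned pointer.
    public static unsafe int Sum(int[] values) {
        if (values == null || values.Length == 0) {
            return 0;   // an empty array gives no element to point at
        }
        int total = 0;
        fixed (int* pStart = values) {
            int* pEnd = pStart + values.Length;   // one past the last element
            // pStart cannot be modified, so a working copy is advanced instead.
            for (int* p = pStart; p < pEnd; ++p) {
                total += *p;
            }
        }   // the array is unpinned here
        return total;
    }

    public static void Main() {
        int[] numbers = { 5, 10, 15, 20, 25, 30 };
        Console.WriteLine(Sum(numbers));
    }
}
```

The fixed block is kept as short as possible so that the array is pinned only while the pointer is in use.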

In the following code, ZClass is a class and a movable type. The fixed statement makes the ZClass object fixed in memory. A pointer to the integer member then is obtained:

public class Starter {
    public unsafe static void Main() {
        ZClass obj = new ZClass();
        fixed (int* pA = &obj.fielda) {
        }
    }
}

public class ZClass {
    public int fielda = 5;
}

The stackalloc Command

The stackalloc command allocates memory dynamically on the stack instead of the heap, which provides another option for allocating memory at run time. The lifetime of the allocation is the duration of the current function. The stackalloc command must be used within an unsafe context. It can be used to initialize only local pointers. The CLR will detect buffer overruns caused by the stackalloc command.

Here is the syntax for stackalloc:

type* stackalloc type[expression]

These are the particulars of the stackalloc command. It returns a pointer to an unmanaged type. The expression should evaluate to an integral value, which is the number of elements to be allocated. The resulting pointer points to the base of the memory allocation. This memory is fixed and not available for garbage collection. It is automatically released at the end of the function.

The following code allocates 26 characters on the stack. The subsequent for loop assigns alphabetic characters to each element. The final loop displays each character:

using System;

namespace Donis.CSharpBook {
    public unsafe class Starter {
        public static void Main() {
            char* pChar = stackalloc char[26];
            char* _pChar = pChar;
            for (int count = 0; count < 26; ++count) {
                (*_pChar) = (char)(((int)('A')) + count);
                ++_pChar;
            }
            for (int count = 0; count < 26; ++count) {
                Console.Write(pChar[count]);
            }
        }
    }
}
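A common use of stackalloc is a small scratch buffer that never needs to reach the managed heap. The following sketch formats a number as eight hexadecimal digits in such a buffer. The names HexFormatter and ToHex are hypothetical.

```csharp
using System;

public static class HexFormatter {
    // Formats a 32-bit value as eight uppercase hexadecimal digits.
    public static unsafe string ToHex(uint value) {
        const int Digits = 8;
        char* buffer = stackalloc char[Digits];   // released when ToHex returns

        // Fill the buffer from the last digit to the first.
        for (int i = Digits - 1; i >= 0; --i) {
            uint nibble = value & 0xF;
            buffer[i] = (char)(nibble < 10 ? '0' + nibble : 'A' + (nibble - 10));
            value >>= 4;
        }

        // The string constructor copies the characters, so the result outlives the buffer.
        return new string(buffer, 0, Digits);
    }

    public static void Main() {
        Console.WriteLine(ToHex(296));
    }
}
```

The buffer is never returned to the caller. Only the copied string leaves the method, which respects the lifetime rule for stack allocations.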

P/Invoke

You can call unmanaged functions from managed code using P/Invoke. Managed and unmanaged memory might be laid out differently, which could require marshaling of parameters or the return type. In .NET, marshaling is the responsibility of the Interop marshaler.

Interop Marshaler

The Interop marshaler is responsible for transferring data between managed and unmanaged memory. It automatically transfers data that is similarly represented in managed and unmanaged environments. For example, integers are identically formatted in both environments and automatically marshaled between managed and unmanaged environments. Types that are the same in both environments are called blittable types. Nonblittable types, such as strings, are managed types without an equivalent unmanaged type and must be marshaled. The Interop marshaler assigns a default unmanaged type for many nonblittable types. Developers can also explicitly marshal nonblittable types to specific unmanaged types with the MarshalAsAttribute type.
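The difference between blittable and nonblittable types can be observed with the Marshal class. The following is a sketch under stated assumptions: the types BlittablePoint, NamedPoint, and MarshalDemo are hypothetical, and the string field is given a fixed unmanaged size of 16 characters purely for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct BlittablePoint {      // only blittable fields
    public int X;
    public int Y;
}

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
public struct NamedPoint {          // the string field makes this nonblittable
    public int X;
    public int Y;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 16)]
    public string Name;             // marshaled as an inline char[16]
}

public static class MarshalDemo {
    // Copies a structure to unmanaged memory and back again.
    public static NamedPoint RoundTrip(NamedPoint point) {
        IntPtr memory = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(NamedPoint)));
        try {
            Marshal.StructureToPtr(point, memory, false);
            return (NamedPoint)Marshal.PtrToStructure(memory, typeof(NamedPoint));
        }
        finally {
            Marshal.FreeHGlobal(memory);   // unmanaged memory is not garbage collected
        }
    }

    public static void Main() {
        Console.WriteLine(Marshal.SizeOf(typeof(BlittablePoint)));  // 8
        Console.WriteLine(Marshal.SizeOf(typeof(NamedPoint)));      // 24
        NamedPoint p = new NamedPoint();
        p.X = 1; p.Y = 2; p.Name = "origin";
        Console.WriteLine(RoundTrip(p).Name);
    }
}
```

Marshal.SizeOf reports the unmanaged size, which for NamedPoint is two 4-byte integers plus the 16-byte character array.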

DllImport

DllImportAttribute imports a function that an unmanaged library exports. DllImportAttribute is in the System.Runtime.InteropServices namespace and has several options that configure how the named function is imported. The library is dynamically loaded with the LoadLibrary native API, and the underlying function pointer is initialized at run time. Because the attribute is evaluated at run time, most configuration errors, such as a misspelled library or function name, are not found at compile time; they surface only when the function is first called. For this reason, be careful when using this attribute and test each imported function.

Here is the syntax of DllImportAttribute:

[DllImport(options)] accessibility static extern returntype functionname(parameters)

Options are used to configure the import of the external function. The name of the library is the only required option. If it is not found in a directory within the environment path variable, the name of the library should include the fully qualified path. Accessibility is the visibility of the function, such as public or protected. Imported functions must be static and extern. The remainder of the statement is the managed signature of the function.

The following code imports three functions to display the vertical and horizontal size of the screen. GetDC, GetDeviceCaps, and ReleaseDC are Microsoft Win32 APIs. The imported functions are configured and exposed in the API class, which is a static class. The functions then are called from managed code. In the code, the IntPtr type is used. IntPtr is a pointer-sized integer that holds a pointer or handle, where IntPtr.Zero represents a null pointer or handle:

using System;
using System.Runtime.InteropServices;

namespace Donis.CSharpBook {
    public class Starter {
        public static void Main() {
            IntPtr hDC = API.GetDC(IntPtr.Zero);
            int v = API.GetDeviceCaps(hDC, API.VERTSIZE);
            Console.WriteLine("Vertical size of screen {0}mm.", v);
            int h = API.GetDeviceCaps(hDC, API.HORZSIZE);
            Console.WriteLine("Horizontal size of screen {0}mm.", h);
            int resp = API.ReleaseDC(IntPtr.Zero, hDC);
            if (resp != 1) {
                Console.WriteLine("Error releasing hdc");
            }
        }
    }

    public static class API {
        [DllImport("user32.dll")]
        public static extern IntPtr GetDC(IntPtr hWnd);

        [DllImport("user32.dll")]
        public static extern int ReleaseDC(IntPtr hWnd, IntPtr hDC);

        [DllImport("gdi32.dll")]
        public static extern int GetDeviceCaps(IntPtr hDC, int nIndex);

        public const int HORZSIZE = 4;  // horizontal size in millimeters
        public const int VERTSIZE = 6;  // vertical size in millimeters
        public const int HORZRES = 8;   // horizontal size in pixels
        public const int VERTRES = 10;  // vertical size in pixels
    }
}

In the preceding code, the only option used with DllImportAttribute is the library name. There are several other options, which are described in the following sections.

EntryPoint

This option explicitly names the imported function. Without this option, the name is implied from the managed function signature, as demonstrated in the preceding code example. When the imported name is ambiguous, the EntryPoint option is helpful. You can specify a unique name for the related managed function to remove ambiguity.

In the following code, MessageBox is being imported. Instead of using that name, which is the default, the assigned managed name is ShowMessage:

using System;
using System.Runtime.InteropServices;

namespace Donis.CSharpBook {
    public class Starter {
        public static void Main() {
            string caption = "Visual C# 2008";
            string text = "Hello, world!";
            API.ShowMessage(0, text, caption, 0);
        }
    }

    public class API {
        [DllImport("user32.dll", EntryPoint="MessageBox")]
        public static extern int ShowMessage(int hWnd,
            string text, string caption, uint type);
    }
}
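EntryPoint is also useful simply for giving an export a more descriptive managed name. The following sketch imports the Win32 GetTickCount function under the hypothetical name MillisecondsSinceBoot. The NativeClock class and the IsWindows guard are also hypothetical, and the import resolves only on Windows.

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeClock {
    // The export is named GetTickCount; managed callers see MillisecondsSinceBoot.
    [DllImport("kernel32.dll", EntryPoint = "GetTickCount")]
    public static extern uint MillisecondsSinceBoot();

    public static bool IsWindows() {
        return Environment.OSVersion.Platform == PlatformID.Win32NT;
    }

    public static void Main() {
        if (!IsWindows()) {
            Console.WriteLine("This example requires Windows.");
            return;
        }
        Console.WriteLine("Milliseconds since boot: {0}", MillisecondsSinceBoot());
    }
}
```

The library is loaded only on the first call to the imported function, so the guard prevents the failure that would otherwise occur on a platform without kernel32.dll.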

CallingConvention

This option sets the calling convention of the function. The default calling convention is Winapi, which maps to the standard calling convention in the Win32 environment as well as to the standard calling convention in the CE .NET environment. The calling convention is set with the CallingConvention enumeration. Table 18-2 lists the members of this enumeration.

Table 18-2. CallingConvention enumeration members

Member

Description

Cdecl

The caller removes the parameters from the stack, which is the calling convention for functions that have a variable-length argument list.

FastCall

This calling convention is not supported.

StdCall

The called method removes the parameters from the stack. This calling convention is commonly used for APIs and is the default for calling unmanaged functions with P/Invoke.

ThisCall

The first parameter of the function is the this pointer followed by the conventional parameters. In the function, the this pointer is cached in the ECX register and used to access instance members of an unmanaged class.

Winapi

Default calling convention of the current platform. For a Win32 environment, this is the StdCall calling convention. For Windows CE .NET, Cdecl is the default.

The following code imports the printf function, which is found in the C Runtime Library. The printf function accepts a variable number of parameters and supports the Cdecl calling convention:

using System;
using System.Runtime.InteropServices;

namespace Donis.CSharpBook {
    public class Starter {
        public static void Main() {
            int val1 = 5, val2 = 10;
            API.printf("%d+%d=%d", val1, val2, val1 + val2);
        }
    }

    public class API {
        [DllImport("msvcrt.dll", CharSet=CharSet.Ansi,
            CallingConvention=CallingConvention.Cdecl)]
        public static extern int printf(string formatspecifier,
            int lhs, int rhs, int total);
    }
}
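The calling convention also matters when unmanaged code calls back into managed code through a function pointer. The following sketch passes a managed delegate to the qsort function of the C Runtime Library, which expects a Cdecl callback. It is a sketch under stated assumptions: the names NativeSort, CompareCallback, CompareInts, Sort, and IsWindows are hypothetical, and the example requires msvcrt.dll, so it runs only on Windows.

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeSort {
    // The unmanaged caller invokes this delegate with the Cdecl convention.
    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    public delegate int CompareCallback(IntPtr left, IntPtr right);

    [DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
    private static extern void qsort([In, Out] int[] items, UIntPtr count,
        UIntPtr size, CompareCallback compare);

    // qsort passes the addresses of the two elements being compared.
    private static int CompareInts(IntPtr left, IntPtr right) {
        return Marshal.ReadInt32(left).CompareTo(Marshal.ReadInt32(right));
    }

    public static bool IsWindows() {
        return Environment.OSVersion.Platform == PlatformID.Win32NT;
    }

    public static void Sort(int[] items) {
        CompareCallback callback = CompareInts;
        qsort(items, new UIntPtr((uint)items.Length),
            new UIntPtr((uint)sizeof(int)), callback);
        GC.KeepAlive(callback);   // the delegate must outlive the unmanaged call
    }

    public static void Main() {
        if (!IsWindows()) {
            Console.WriteLine("This example requires Windows.");
            return;
        }
        int[] values = { 30, 10, 20 };
        Sort(values);
        foreach (int value in values) {
            Console.WriteLine(value);
        }
    }
}
```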

ExactSpelling

This option stipulates that the exact spelling of the function name is used to resolve the symbol. Names are not always what they seem. For example, the function names of many Win32 APIs are actually macros that map to the real API, which is an A-suffixed or W-suffixed method. The A version is the American National Standards Institute (ANSI) version, whereas the W (wide) version is the Unicode version of the function. The ANSI versus Unicode extrapolation pertains mostly to Win32 APIs that have string parameters. For example, the supposed CreateWindow API is a macro that maps to either the CreateWindowW or CreateWindowA API. For the DllImportAttribute, the version selected is determined in the CharSet option. If ExactSpelling is true, the function name is treated as the actual name and not translated regardless of the CharSet option. The default is false, which allows the function name to be translated to either the A or W version of the method.

The following code imports the GetModuleHandleW function specifically. ExactSpelling is true to use the name "as is":

using System;
using System.Runtime.InteropServices;

namespace Donis.CSharpBook {
    public class Starter {
        public static void Main() {
            IntPtr hModule = API.GetModuleHandleW(null);
        }
    }
    public class API {
        [DllImport("kernel32.dll", ExactSpelling=true,
            CharSet=CharSet.Unicode)]
        public static extern IntPtr GetModuleHandleW(string filename);
    }
}

PreserveSig

This option preserves the signature of a method. COM functions usually return an HRESULT, which is the error status of the call. The real return is the parameter decorated with the [out, retval] Interface Definition Language (IDL) attribute. In managed code, the HRESULT is consumed for error handling and the [out, retval] parameter is returned as the actual return. To resolve a COM function, the original signature cannot be preserved; it should be mapped to a COM signature. Conversely, the signature of non-COM functions should be preserved. PreserveSig defaults to true.

The following code demonstrates the PreserveSig option with a fictitious COM function:

public class API {
    [DllImport("ole32.dll", PreserveSig=false)]
    public static extern int SomeFunction();
}

Here is the original signature in COM:

HRESULT SomeFunction([out, retval] int* param)

SetLastError

This option asks the CLR to preserve the error code of the imported function. Most Win32 APIs return false if the function fails. False is minimally descriptive, so developers can call GetLastError for an integer error code that provides additional detail. GetLastError must be called immediately after the failed API; if not, the next API might reset the error code. In managed code, call Marshal.GetLastWin32Error to retrieve the error code. The Marshal type is in the System.Runtime.InteropServices namespace. SetLastError defaults to false.

In the following code, CreateDirectory and FormatMessage are imported in the API class. CreateDirectory creates a directory; FormatMessage converts a Win32 error code into a user-friendly message. For CreateDirectory, the SetLastError option is set to true. In Main, CreateDirectory is called with an invalid path. The "c*" drive is probably an incorrect drive on most computers. The call fails and returns false, which is stored in the resp variable. The error code is then retrieved with Marshal.GetLastWin32Error and converted into a message using the FormatMessage API. FormatMessage returns the user-friendly message as an out parameter:

using System;
using System.Text;
using System.Runtime.InteropServices;

namespace Donis.CSharpBook {
    public class Starter {
        public static void Main() {
            bool resp = API.CreateDirectory(@"c*:file.txt",
                IntPtr.Zero);
            if (resp == false) {
                StringBuilder message;
                int errorcode = Marshal.GetLastWin32Error();
                API.FormatMessage(
                    API.FORMAT_MESSAGE_ALLOCATE_BUFFER |
                    API.FORMAT_MESSAGE_FROM_SYSTEM |
                    API.FORMAT_MESSAGE_IGNORE_INSERTS,
                    IntPtr.Zero, errorcode,
                    0, out message, 0, IntPtr.Zero);
                Console.WriteLine(message);
            }
        }
    }

    public class API {
        [DllImport("kernel32.dll", SetLastError=true)]
        public static extern bool CreateDirectory(
            string lpPathName, IntPtr lpSecurityAttributes);

        [DllImport("kernel32.dll", SetLastError=false)]
        public static extern System.Int32 FormatMessage(
            System.Int32 dwFlags,
            IntPtr lpSource,
            System.Int32 dwMessageId,
            System.Int32 dwLanguageId,
            out StringBuilder lpBuffer,
            System.Int32 nSize,
            IntPtr va_list);

        public const int FORMAT_MESSAGE_ALLOCATE_BUFFER = 256;
        public const int FORMAT_MESSAGE_IGNORE_INSERTS = 512;
        public const int FORMAT_MESSAGE_FROM_STRING = 1024;
        public const int FORMAT_MESSAGE_FROM_HMODULE = 2048;
        public const int FORMAT_MESSAGE_FROM_SYSTEM = 4096;
        public const int FORMAT_MESSAGE_ARGUMENT_ARRAY = 8192;
        public const int FORMAT_MESSAGE_MAX_WIDTH_MASK = 255;
    }
}
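As an alternative to importing FormatMessage, the framework's Win32Exception class can look up the system message for an error code. The following is a minimal sketch under that assumption; LastErrorText and Describe are hypothetical names that do not appear in the chapter's API class.

```csharp
using System;
using System.ComponentModel;

// Hypothetical helper, not part of the chapter's API class.
public static class LastErrorText
{
    // Win32Exception looks up the system message for a Win32 error code,
    // so no FormatMessage import or buffer management is needed.
    public static string Describe(int errorCode)
    {
        Win32Exception ex = new Win32Exception(errorCode);
        return string.Format("Error {0}: {1}", ex.NativeErrorCode, ex.Message);
    }
}
```

In the Main method shown earlier, the call would be LastErrorText.Describe(Marshal.GetLastWin32Error()), made immediately after the failed CreateDirectory call so that no other API resets the error code first.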

CharSet

This option indicates how strings are interpreted in unmanaged memory, which can also affect how the ExactSpelling option resolves the function name. CharSet is an enumeration, and the default is CharSet.Ansi. Table 18-3 lists the members of the CharSet enumeration.

Table 18-3. CharSet enumeration members

Value            Description
CharSet.Ansi     Strings should be marshaled as ANSI.
CharSet.Unicode  Strings should be marshaled as Unicode.
CharSet.Auto     The appropriate conversion is decided at run time depending on the current platform.

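The following code marshals a string to unmanaged memory as ANSI. Because the ExactSpelling option defaults to false and the character set is ANSI, the runtime resolves GetModuleHandle to the GetModuleHandleA export, which has ANSI parameters. The returned module handle is declared as IntPtr so that it has the correct size on both 32-bit and 64-bit platforms:

using System;
using System.Runtime.InteropServices;

namespace Donis.CSharpBook {
    public class Starter {
        public static void Main() {
            // A null file name returns the handle of the calling process's module.
            IntPtr hModule = API.GetModuleHandle(null);
        }
    }

    public class API {
        [DllImport("kernel32.dll", CharSet=CharSet.Ansi)]
        public static extern IntPtr GetModuleHandle(string filename);
    }
}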
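To see what the two main character sets mean in unmanaged memory, a string can be copied out with the Marshal class and inspected byte by byte. The following is a minimal sketch: CharSetDemo and NativeBytes are hypothetical names, and the size calculation assumes plain ASCII text on a little-endian machine.

```csharp
using System;
using System.Runtime.InteropServices;

public static class CharSetDemo
{
    // Copies a managed string to unmanaged memory as ANSI or Unicode and
    // returns the resulting bytes, including the terminating null character.
    public static byte[] NativeBytes(string text, bool unicode)
    {
        // Assumption: ASCII text, so ANSI uses one byte per character.
        int size = unicode ? (text.Length + 1) * 2 : text.Length + 1;
        IntPtr p = unicode ? Marshal.StringToHGlobalUni(text)
                           : Marshal.StringToHGlobalAnsi(text);
        try
        {
            byte[] bytes = new byte[size];
            Marshal.Copy(p, bytes, 0, size);
            return bytes;
        }
        finally
        {
            Marshal.FreeHGlobal(p); // unmanaged memory is not garbage collected
        }
    }
}
```

For the string "Hi", the ANSI form is three bytes (48 69 00 in hexadecimal) and the Unicode form is six bytes (48 00 69 00 00 00). An unmanaged function that expects one form but receives the other would misread the text, which is why the CharSet option must match the imported function.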

BestFitMapping

This option affects the Unicode-to-ANSI mapping of text characters passed from managed to unmanaged functions running in the Microsoft Windows 98 or Microsoft Windows Millennium Edition (Windows Me) environment. If true, best-fit mapping is enabled. When there is not a direct character match, the Unicode character is mapped to the closest match in the ANSI code page. If no match is available, the Unicode character is mapped to a "?" character. The default is true.

ThrowOnUnmappableChar

This option requests an exception when an unmappable character is found in the Unicode-to-ANSI translation for Windows 98 and Windows Me. If true, an exception is raised when a Unicode character cannot be mapped to ANSI and would otherwise be converted to a "?" character. If false, no exception is raised and the conversion proceeds. The default is false. See the BestFitMapping option for additional details on Unicode-to-ANSI mapping.

Blittable Types

Blittable types have the same representation in managed and unmanaged memory. Therefore, the Interop marshaler does not need to convert them when marshaling between managed and unmanaged environments. Because conversion can be expensive, blittable types are more efficient than nonblittable types. For this reason, when possible, parameters and return types should be blittable types, which include System.Byte, System.SByte, System.Int16, System.UInt16, System.Int32, System.UInt32, System.Int64, System.UInt64, System.IntPtr, System.UIntPtr, System.Single, and System.Double. One-dimensional arrays of blittable types and formatted value types that contain only blittable types also are considered blittable. (Formatted types are explained in the next section.)

Nonblittable types have different representations in managed and unmanaged memory. Some nonblittable types are converted automatically by the Interop marshaler, whereas others require explicit marshaling. Strings and user-defined classes are examples of nonblittable types. A managed string can be marshaled as a variety of unmanaged string types: LPSTR, LPTSTR, LPWSTR, and so on. Classes are nonblittable unless they are formatted. In addition, a formatted class marshaled as a formatted value type is blittable.
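One rough way to probe whether a value is blittable is to try to pin it, because the runtime refuses to pin objects that contain nonblittable data. The following is a heuristic sketch only: BlittableCheck, Point, Named, and CanPin are hypothetical names, and pinning is not an exact test (a string, for example, can be pinned even though it is nonblittable).

```csharp
using System;
using System.Runtime.InteropServices;

public static class BlittableCheck
{
    [StructLayout(LayoutKind.Sequential)]
    public struct Point { public int X; public int Y; }         // only blittable fields

    [StructLayout(LayoutKind.Sequential)]
    public struct Named { public int Id; public string Name; }  // string field is nonblittable

    // Heuristic: GCHandle.Alloc throws ArgumentException when asked
    // to pin an object that contains nonblittable data.
    public static bool CanPin(object value)
    {
        try
        {
            GCHandle handle = GCHandle.Alloc(value, GCHandleType.Pinned);
            handle.Free();
            return true;
        }
        catch (ArgumentException)
        {
            return false;
        }
    }
}
```

With these definitions, an int array and a Point value can be pinned, whereas a Named value cannot because of its string field.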

Formatted Type

A formatted type is a user-defined type in which the memory layout of the members is explicitly specified. Formatted types are prefixed with the StructLayoutAttribute, which sets the layout of the members as described in the LayoutKind enumeration. Table 18-4 lists the members of the LayoutKind enumeration.

Table 18-4. LayoutKind enumeration members

Value                  Description
LayoutKind.Auto        The CLR sets the location of members in unmanaged memory. The type cannot be exposed to unmanaged code.
LayoutKind.Sequential  Members are stored in contiguous (sequential) unmanaged memory. The members are stored in textual order. If desired, set packing with the StructLayoutAttribute.Pack option.
LayoutKind.Explicit    This flag allows the developer to stipulate the order of the fields in memory using FieldOffsetAttribute. This is useful for representing a managed type as a C or C++ union type in unmanaged code.
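The effect of LayoutKind.Sequential and the Pack option can be observed with Marshal.SizeOf and Marshal.OffsetOf. The following is a minimal sketch; LayoutDemo, Padded, and Packed are hypothetical names chosen for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

public static class LayoutDemo
{
    [StructLayout(LayoutKind.Sequential)]            // default packing
    public struct Padded { public byte Flag; public int Value; }

    [StructLayout(LayoutKind.Sequential, Pack = 1)]  // no padding between members
    public struct Packed { public byte Flag; public int Value; }

    // Unmanaged size of the type, in bytes.
    public static int SizeOf(Type t)
    {
        return Marshal.SizeOf(t);
    }

    // Unmanaged offset of the Value field, in bytes.
    public static int OffsetOfValue(Type t)
    {
        return Marshal.OffsetOf(t, "Value").ToInt32();
    }
}
```

With default packing, the int member is aligned on a four-byte boundary, so Padded occupies 8 bytes and Value starts at offset 4. With Pack = 1, Packed occupies 5 bytes and Value starts at offset 1. The packing must match the packing used when the unmanaged structure was compiled.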

In the following code, the API class imports the GetWindowRect unmanaged API. This function retrieves the bounding rectangle of a window in screen coordinates. The parameters of GetWindowRect are a window handle and a pointer to a Rect structure, which also is defined in the API class. The Rect structure, which is initialized inside the function, is a formatted value type and is blittable. By default, value types are passed by value. To pass the structure by reference, the out modifier is applied to the Rect parameter:

public class API
{
    [DllImport("user32.dll")]
    public static extern bool GetWindowRect(
        IntPtr hWnd,
        out Rect windowRect);

    [StructLayout(LayoutKind.Sequential)]
    public struct Rect
    {
        public int left;
        public int top;
        public int right;
        public int bottom;
    }
}

Here is the code that uses the GetWindowRect API and the Rect structure:

API.Rect client = new API.Rect();
API.GetWindowRect(this.Handle, out client);
string temp = string.Format("Left {0} : Top {1} : "+
    "Right {2} : Bottom {3}", client.left,
    client.top, client.right, client.bottom);
MessageBox.Show(temp);

The following code is a version of the API class that defines a Rect class instead of a structure. Because the Rect class has the StructLayout attribute, it is a formatted type. Classes are passed by reference or by pointer by default, depending on the signature of the native API. The out modifier required for a structure (shown in the previous example code) is not necessary for a class:

class API2
{
    [DllImport("user32.dll")]
    public static extern bool GetWindowRect(
        IntPtr hWnd,
        Rect windowRect);

    [StructLayout(LayoutKind.Sequential)]
    public class Rect
    {
        public int left;
        public int top;
        public int right;
        public int bottom;
    }
}

Here is the code to call the GetWindowRect API using the Rect class:

API2.Rect client = new API2.Rect();
API2.GetWindowRect(this.Handle, client);
string temp = string.Format("Left {0} : Top {1} : " +
    "Right {2} : Bottom {3}", client.left,
    client.top, client.right, client.bottom);
MessageBox.Show(temp);

Unions are fairly common in C and C++ code. A union is a type in which the members share the same memory location. This conserves memory by overlaying mutually exclusive data in shared memory. C# does not offer a union type. In managed code, emulate a union in unmanaged memory with the LayoutKind.Explicit option of StructLayoutAttribute. Set each field of the union to the same offset, as shown in the following code:

[StructLayout(LayoutKind.Explicit)]
struct ZStruct {
   [FieldOffset(0)] int fielda;
   [FieldOffset(0)] short fieldb;
   [FieldOffset(0)] bool fieldc;
}
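Because the fields of such a type overlap, writing one field changes what the others read. The following is a minimal sketch of that behavior; IntOrFloat and UnionDemo are hypothetical names, and the fields are public only so that the example can access them.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
public struct IntOrFloat
{
    [FieldOffset(0)] public int AsInt;     // both fields start at offset 0,
    [FieldOffset(0)] public float AsFloat; // so they share the same four bytes
}

public static class UnionDemo
{
    // Writes the value through one field and reads its raw bits through the other.
    public static int BitsOf(float value)
    {
        IntOrFloat u = new IntOrFloat();
        u.AsFloat = value;
        return u.AsInt;
    }
}
```

UnionDemo.BitsOf(1.0f) yields 0x3F800000, the IEEE 754 bit pattern of 1.0, and Marshal.SizeOf(typeof(IntOrFloat)) reports 4 bytes rather than 8, which confirms that the fields overlap.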

Directional Attributes

Directional attributes explicitly control the direction of marshaling. Parameters can be assigned InAttribute, OutAttribute, or both attributes to affect marshaling. This is equivalent to [in], [out], and [in, out] of the IDL. The ref and out keywords in C# also imply a direction. Table 18-5 lists the attributes and related keywords.

Table 18-5. Directional attributes and C# keywords

Keyword                                Attribute                     IDL
None (this is the underlying default)  InAttribute                   [in]
ref                                    InAttribute and OutAttribute  [in, out]
out                                    OutAttribute                  [out]

The default directional attribute depends on the type of parameter and any modifiers.

StringBuilder

Strings are immutable and dynamically sized. An unmanaged API might require a fixed-length and modifiable string. In addition, some unmanaged APIs initialize the string with memory allocated at run time. The string type should not be used in these circumstances. Instead, use the StringBuilder class, which is found in the System.Text namespace. A StringBuilder is mutable, and its capacity can be set in advance to the buffer length that the API expects. Furthermore, you can initialize the StringBuilder with memory created in the unmanaged API.

In the following code, the GetWindowText unmanaged API is imported twice. GetWindowText retrieves the text from the specified window. For an overlapped window, this is text from the title bar. The second parameter of GetWindowText is a string, which is initialized with the window text during the function call. The first version of GetWindowText in the API class has a string parameter, whereas the version in the API2 class has a StringBuilder parameter. The GetWindowText application is a Windows Forms application that has two buttons. The first button calls API.GetWindowText and the second button calls API2.GetWindowText. Click the first button. Because of the string parameter in API.GetWindowText, an exception is raised because the API attempts to change that parameter. The second button invokes API2.GetWindowText, which uses the StringBuilder type, and the function runs successfully:

public class API
{
    [DllImport("user32.dll")]
    public static extern int GetWindowText(
        IntPtr hWnd, ref string lpString, int nMaxCount);
}

public class API2
{
    [DllImport("user32.dll")]
    public static extern int GetWindowText(
        IntPtr hWnd, StringBuilder lpString, int nMaxCount);
}

Here is the code from the button-click handlers of the form:

private void btnGetText_Click(object sender, EventArgs e)
{
    string windowtext=null;
    API.GetWindowText(this.Handle, ref windowtext, 25);
    MessageBox.Show(windowtext);
}

private void btnGetText2_Click(object sender, EventArgs e)
{
    // Preallocate the buffer so that its capacity covers nMaxCount characters.
    StringBuilder windowtext = new StringBuilder(25);
    API2.GetWindowText(this.Handle, windowtext, windowtext.Capacity);
    MessageBox.Show(windowtext.ToString());
}

Unmanaged Callbacks

Some unmanaged functions accept a callback as a parameter, which is a function pointer. The unmanaged function then invokes the function pointer to call a function in the managed caller. Callbacks typically are used for iteration. For example, the EnumWindows unmanaged API uses a callback to iterate handles of top-level windows.

.NET abstracts function pointers with delegates, which are type-safe and have a specific signature. In the managed signature, substitute a delegate for the callback parameter of the unmanaged signature.

These are the steps to implement a callback for an unmanaged function:

  1. Determine the unmanaged signature of the callback function.

  2. Define a matching managed signature as a delegate for the callback function.

  3. Implement a function to be used as the callback. The implementation of the function is essentially the response to the callback.

  4. Create a delegate and initialize it to the callback function.

  5. Invoke the unmanaged API and provide the delegate as the callback parameter.

The following code imports the EnumWindows unmanaged API. The first parameter of EnumWindows is a callback. EnumWindows enumerates top-level windows. The callback function is called at each iteration and is given the current window handle as a parameter. In this code, APICallback is a delegate and is compatible with the unmanaged signature of the callback:

class API
{
    [DllImport("user32.dll")]
    public static extern bool EnumWindows(
        APICallback lpEnumFunc,
        IntPtr lParam);

    // HWND and LPARAM are pointer-sized, so IntPtr is used for both.
    public delegate bool APICallback(IntPtr hWnd, IntPtr lParam);
}

EnumWindows is called in the click handler of a Windows Forms application. GetWindowHandle is passed as the callback function in the first parameter via an APICallback delegate. GetWindowHandle is called for each handle that is enumerated. During the enumeration, the managed function adds each handle to a list box:

private void btnHandle_Click(object sender, EventArgs e)
{
    API.EnumWindows(new API.APICallback(GetWindowHandle), IntPtr.Zero);
}

bool GetWindowHandle(IntPtr hWnd, IntPtr lParam)
{
    string temp = string.Format("{0:0000000}", hWnd.ToInt64());
    listBox1.Items.Add(temp);
    return true;   // true continues the enumeration
}
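The relationship between a delegate and the function pointer that unmanaged code receives can be explored without calling a native API. The following is a minimal sketch under stated assumptions: CallbackDemo, Visit, and AddToTotal are hypothetical names, and managed code stands in for the unmanaged caller by converting the pointer back to a delegate (the runtime may simply hand back the original delegate).

```csharp
using System;
using System.Runtime.InteropServices;

public static class CallbackDemo
{
    // Managed signature for a hypothetical native callback: BOOL (*)(int value).
    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    public delegate bool Visit(int value);

    private static int total;

    private static bool AddToTotal(int value)
    {
        total += value;
        return true;   // true means "continue enumerating"
    }

    // Keep the delegate in a field so that the garbage collector cannot
    // reclaim it while unmanaged code still holds the function pointer.
    private static readonly Visit callback = new Visit(AddToTotal);

    public static int Run()
    {
        total = 0;

        // This is the raw pointer that an API such as EnumWindows would receive.
        IntPtr fn = Marshal.GetFunctionPointerForDelegate(callback);

        // Stand-in for the unmanaged caller: invoke the callback through the pointer.
        Visit viaPointer = (Visit)Marshal.GetDelegateForFunctionPointer(fn, typeof(Visit));
        for (int i = 1; i <= 3; i++)
        {
            if (!viaPointer(i)) break;
        }
        return total;
    }
}
```

CallbackDemo.Run() adds the values 1, 2, and 3 through the callback. The design point worth noting is the static field: when an unmanaged API stores a callback and invokes it after the importing call returns, the delegate must remain referenced for as long as the pointer is in use.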

Explicit Marshaling

Explicit marshaling sometimes is required to convert nonblittable parameters, fields, or return types to proper unmanaged types. Marshaling is invaluable for strings, which have several possible representations in unmanaged memory. Strings default to LPSTR. Use MarshalAsAttribute to marshal a managed type explicitly as a specific unmanaged type. The UnmanagedType enumeration defines the unmanaged types available for marshaling. Table 18-6 lists the members of the UnmanagedType enumeration.

Table 18-6. UnmanagedType enumeration members

Member           Description
AnsiBStr         Length-prefixed ANSI string
AsAny            Dynamic type where the type is set at run time
Bool             Four-byte Boolean value
BStr             Length-prefixed Unicode string
ByValArray       Marshals an array by value; SizeConst sets the number of elements
ByValTStr        Inline fixed-length character array that is a member of a structure
Currency         COM currency type
CustomMarshaler  To be used with MarshalAsAttribute.MarshalType or MarshalAsAttribute.MarshalTypeRef
Error            HRESULT
FunctionPtr      C-style function pointer
I1               One-byte integer
I2               Two-byte integer
I4               Four-byte integer
I8               Eight-byte integer
IDispatch        IDispatch pointer for COM
Interface        COM interface pointer
IUnknown         IUnknown interface pointer
LPArray          Pointer to the first element of an unmanaged array
LPStr            Null-terminated ANSI string
LPStruct         Pointer to an unmanaged structure
LPTStr           Platform-dependent string
LPWStr           Null-terminated Unicode string
R4               Four-byte floating point number
R8               Eight-byte floating point number
SafeArray        Safe array in which the type, rank, and bounds are defined
Struct           Formatted value and reference types
SysInt           Platform-dependent integer (32 bits in Win32 environment)
SysUInt          Platform-dependent unsigned integer (32 bits in Win32 environment)
TBStr            Length-prefixed, platform-dependent string
U1               One-byte unsigned integer
U2               Two-byte unsigned integer
U4               Four-byte unsigned integer
U8               Eight-byte unsigned integer
VariantBool      Two-byte VARIANT_BOOL type
VBByRefStr       Microsoft Visual Basic–specific

GetVersionEx is imported in the following code. The function is called in Main to obtain information on the current operating system. GetVersionEx has a single parameter, which is a pointer to an OSVERSIONINFO structure. The last field in the structure is szCSDVersion, a null-terminated string that describes the latest installed service pack. In the unmanaged structure, this field is an inline array of 128 characters. In the sample code, the MarshalAs attribute marshals the field as a 128-character array. Because the default character set is ANSI, each character is one byte long:

using System;
using System.Runtime.InteropServices;

namespace Donis.CSharpBook {
    public class Starter {
        public static void Main() {
            API.OSVERSIONINFO info = new API.OSVERSIONINFO();
            info.dwOSVersionInfoSize = Marshal.SizeOf(info);
            bool resp = API.GetVersionEx(ref info);
            if (resp == false) {
                Console.WriteLine("GetVersionEx failed");
                return;   // the structure was not filled in
            }
            Console.WriteLine("{0}.{1}.{2}",
                info.dwMajorVersion,
                info.dwMinorVersion,
                info.dwBuildNumber);
        }
    }

    public class API {

        [DllImport("kernel32.dll")]
        public static extern bool GetVersionEx(
            ref OSVERSIONINFO lpVersionInfo);

        [StructLayout(LayoutKind.Sequential)]
        public struct OSVERSIONINFO {
            public System.Int32 dwOSVersionInfoSize;
            public System.Int32 dwMajorVersion;
            public System.Int32 dwMinorVersion;
            public System.Int32 dwBuildNumber;
            public System.Int32 dwPlatformId;
            [MarshalAs( UnmanagedType.ByValTStr, SizeConst=128 )]
                public String szCSDVersion;
        }
    }
}
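The size of an inline string field depends on both SizeConst and the character set of the structure, which Marshal.SizeOf can confirm. The following is a minimal sketch; VersionInfoSizes, InfoAnsi, and InfoUnicode are hypothetical names that mirror the OSVERSIONINFO layout shown above.

```csharp
using System;
using System.Runtime.InteropServices;

public static class VersionInfoSizes
{
    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
    public struct InfoAnsi
    {
        public int dwOSVersionInfoSize;
        public int dwMajorVersion;
        public int dwMinorVersion;
        public int dwBuildNumber;
        public int dwPlatformId;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 128)]
        public string szCSDVersion;   // 128 one-byte characters
    }

    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    public struct InfoUnicode
    {
        public int dwOSVersionInfoSize;
        public int dwMajorVersion;
        public int dwMinorVersion;
        public int dwBuildNumber;
        public int dwPlatformId;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 128)]
        public string szCSDVersion;   // 128 two-byte characters
    }

    public static int AnsiSize() { return Marshal.SizeOf(typeof(InfoAnsi)); }
    public static int UnicodeSize() { return Marshal.SizeOf(typeof(InfoUnicode)); }
}
```

The five integer fields occupy 20 bytes in both versions. The ANSI structure is therefore 148 bytes and the Unicode structure is 276 bytes. The value assigned to dwOSVersionInfoSize must agree with the version of the function that is actually called, which is why the sample computes it with Marshal.SizeOf rather than hard-coding it.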

Fixed-Size Buffers

In the previous code, the MarshalAs attribute defined a fixed-size field of 128 characters or bytes. As an alternative to the MarshalAs attribute, C# provides fixed-size buffers using the fixed keyword. The primary purpose of this keyword is to embed aggregate types, such as an array, in a structure. Fixed-size buffers are allowed in structures but not in classes.

There are several rules for using fixed-size buffers:

  • Fixed-size buffers are available only in unsafe contexts.

  • Fixed-size buffers can represent only one-dimensional arrays (vectors).

  • The array must have a specific length.

  • Fixed-size buffers are allowed only in struct types.

  • Fixed-size buffers are limited to bool, byte, char, short, int, long, sbyte, ushort, uint, ulong, float, and double types.

Here is the syntax of the fixed-size buffer:

attributes accessibility modifier fixed type identifier[expression]

The following code is another version of the OSVERSIONINFO structure. Instead of the MarshalAs attribute, this version uses the fixed keyword for the szCSDVersion field. Because a C# char is two bytes, a fixed char buffer of 128 elements occupies 256 bytes:

public class API {
    [StructLayout(LayoutKind.Sequential)]
    unsafe public struct OSVERSIONINFO {
        public System.Int32 dwOSVersionInfoSize;
        public System.Int32 dwMajorVersion;
        public System.Int32 dwMinorVersion;
        public System.Int32 dwBuildNumber;
        public System.Int32 dwPlatformId;
        public fixed char szCSDVersion[128];
    }
}