C# has great capabilities, especially when you consider that the underlying framework is entirely managed. Sometimes, however, you need to escape out of all the safety that C# provides and step back into the world of memory addresses and pointers. C# supports this action in two significant ways. The first option is to go through Platform Invoke (P/Invoke) and calls into APIs exposed by unmanaged dynamic link libraries (DLLs). The second way is through unsafe code, which enables access to memory pointers and addresses.
The majority of the chapter discusses interoperability with unmanaged code and the use of unsafe code. This discussion culminates with a small program that determines the processor ID of a computer. The code requires that you do the following:
Call into an operating system DLL and request allocation of a portion of memory for executing instructions.
Write some assembler instructions into the allocated area.
Inject an address location into the assembler instructions.
Execute the assembler code.
Aside from the P/Invoke and unsafe constructs covered here, the complete listing demonstrates the full power of C# and the fact that the capabilities of unmanaged code are still accessible from C# and managed code.
Whether a developer is trying to call a library of existing unmanaged code, accessing unmanaged code in the operating system not exposed in any managed API, or trying to achieve maximum performance for an algorithm by avoiding the runtime overhead of type checking and garbage collection, at some point there must be a call into unmanaged code. The Common Language Infrastructure (CLI) provides this capability through P/Invoke. With P/Invoke, you can make API calls into exported functions of unmanaged DLLs.
The APIs invoked in this section are Windows APIs. Although the same APIs are not available on other platforms, developers can still use P/Invoke for APIs native to their operating systems or for calls into their own DLLs. The guidelines and syntax are the same.
Once the target function is identified, the next step of P/Invoke is to declare the function with managed code. Just as with all regular methods that belong to a class, you need to declare the targeted API within the context of a class, but by using the extern
modifier. Listing 23.1 demonstrates how to do this.
using System;
using System.Runtime.InteropServices;

class VirtualMemoryManager
{
    [DllImport("kernel32.dll", EntryPoint="GetCurrentProcess")]
    internal static extern IntPtr GetCurrentProcessHandle();
}
In this case, the class is VirtualMemoryManager, because it will contain functions associated with managing memory. (This particular function is available directly off the System.Diagnostics.Process class, so there is no need to declare it in real code.) Note that the method returns an IntPtr; this type is explained in the next section.
The extern
methods never include any body and are (almost) always static. Instead of a method body, the DllImport
attribute, which accompanies the method declaration, points to the implementation. At a minimum, the attribute needs the name of the DLL that defines the function. The runtime determines the function name from the method name, although you can override this default by using the EntryPoint
named parameter to provide the function name. (The .NET framework will automatically attempt calls to the Unicode [...W
] or ASCII [...A
] API version.)
In this case, the external function, GetCurrentProcess()
, retrieves a pseudohandle for the current process that you will use in the call for virtual memory allocation. Here’s the unmanaged declaration:
HANDLE GetCurrentProcess();
Assuming the developer has identified the targeted DLL and exported function, the most difficult step is identifying or creating the managed data types that correspond to the unmanaged types in the external function.1 Listing 23.2 shows a more difficult API.
1. One particularly helpful resource for declaring Win32 APIs is http://www.pinvoke.net. It provides a great starting point for many APIs, helping you avoid some of the subtle problems that can arise when coding an external API call from scratch.
LPVOID VirtualAllocEx(
    HANDLE hProcess,        // The handle to a process. The
                            // function allocates memory within
                            // the virtual address space of this
                            // process.
    LPVOID lpAddress,       // The pointer that specifies a
                            // desired starting address for the
                            // region of pages that you want to
                            // allocate. If lpAddress is NULL,
                            // the function determines where to
                            // allocate the region.
    SIZE_T dwSize,          // The size of the region of memory to
                            // allocate, in bytes. If lpAddress
                            // is NULL, the function rounds dwSize
                            // up to the next page boundary.
    DWORD flAllocationType, // The type of memory allocation
    DWORD flProtect);       // The type of memory protection
VirtualAllocEx()
allocates virtual memory that the operating system specifically designates for execution or data. To call it, you need corresponding definitions in managed code for each data type; although common in Win32 programming, HANDLE
, LPVOID
, SIZE_T
, and DWORD
are undefined in the CLI managed code. The declaration in C# for VirtualAllocEx()
, therefore, is shown in Listing 23.3.
using System;
using System.Runtime.InteropServices;

class VirtualMemoryManager
{
    [DllImport("kernel32.dll")]
    internal static extern IntPtr GetCurrentProcess();

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr VirtualAllocEx(
        IntPtr hProcess,
        IntPtr lpAddress,
        IntPtr dwSize,
        AllocationType flAllocationType,
        uint flProtect);
}
One distinct characteristic of managed code is that primitive data types such as int
do not change their size on the basis of the processor. Whether the processor is 16, 32, or 64 bits, int
is always 32 bits. In unmanaged code, however, memory pointers will vary depending on the processor. Therefore, instead of mapping types such as HANDLE
and LPVOID
simply to int
s, you need to map to System.IntPtr
, whose size will vary depending on the processor memory layout. This example also uses an AllocationType
enum, which we discuss in the section “Simplifying API Calls with Wrappers” later in this chapter.
An interesting point to note about Listing 23.3 is that IntPtr
is useful for more than just pointers—that is, it is useful for other things such as quantities. IntPtr
does not mean just “pointer stored in an integer”; it also means “integer that is the size of a pointer.” An IntPtr
need not contain a pointer but simply needs to contain something the size of a pointer. Lots of things are the size of a pointer but are not actually pointers.
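To make the size difference concrete, the following minimal sketch prints the size of int next to the size of IntPtr and then stores a plain quantity in an IntPtr. The class and method names (IntPtrSizeDemo, PointerSizeMatchesProcess) are hypothetical and exist only for this illustration.

```csharp
using System;

static class IntPtrSizeDemo
{
    // Hypothetical helper: checks that IntPtr.Size matches the
    // bitness of the process that is currently running.
    public static bool PointerSizeMatchesProcess()
    {
        int expected = Environment.Is64BitProcess ? 8 : 4;
        return IntPtr.Size == expected;
    }

    static void Main()
    {
        // int is 4 bytes regardless of the processor.
        Console.WriteLine("sizeof(int) = " + sizeof(int));

        // IntPtr is 4 bytes in a 32-bit process and 8 in a 64-bit one.
        Console.WriteLine("IntPtr.Size = " + IntPtr.Size);

        // An IntPtr holding a quantity (a size in bytes), not an address.
        IntPtr size = (IntPtr)4096;
        Console.WriteLine("size = " + size.ToInt64());
    }
}
```

This is why dwSize (a SIZE_T) maps naturally to IntPtr even though it is never used as a pointer.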
ref Rather Than Pointers
Frequently, unmanaged code uses pointers for pass-by-reference parameters. In these cases, P/Invoke doesn’t require that you map the data type to a pointer in managed code. Instead, you map the corresponding parameters to ref (or out, depending on whether the parameter is in/out or just out). In Listing 23.4, lpflOldProtect, whose data type is PDWORD, returns the “pointer to a variable that receives the previous access protection of the first page in the specified region of pages.”2
2. MSDN documentation.
class VirtualMemoryManager
{
    // ...
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool VirtualProtectEx(
        IntPtr hProcess, IntPtr lpAddress,
        IntPtr dwSize, uint flNewProtect,
        ref uint lpflOldProtect);
}
Although lpflOldProtect
is documented as [out]
(even though the signature doesn’t enforce it), the description also mentions that the parameter must point to a valid variable and not NULL
. This inconsistency is confusing but commonly encountered. The guideline is to use ref
rather than out
for P/Invoke type parameters, since the callee can always ignore the data passed with ref
, but the converse will not necessarily succeed.
The other parameters are virtually the same as VirtualAllocEx()
except that lpAddress
is the address returned from VirtualAllocEx()
. In addition, flNewProtect
specifies the exact type of memory protection: page execute, page read-only, and so on.
StructLayoutAttribute for Sequential Layout
Some APIs involve types that have no corresponding managed type. Calling these APIs requires redeclaration of the type in managed code. You declare the unmanaged COLORREF struct, for example, in managed code (see Listing 23.5).
[StructLayout(LayoutKind.Sequential)]
struct ColorRef
{
    public byte Red;
    public byte Green;
    public byte Blue;
    // Turn off the warning about not accessing Unused
#pragma warning disable 414
    private byte Unused;
#pragma warning restore 414

    public ColorRef(byte red, byte green, byte blue)
    {
        Blue = blue;
        Green = green;
        Red = red;
        Unused = 0;
    }
}
Various Microsoft Windows color APIs use COLORREF
to represent RGB colors (i.e., levels of red, green, and blue).
The key in the Listing 23.5 declaration is StructLayoutAttribute
. By default, managed code can optimize the memory layouts of types, so layouts may not be sequential from one field to the next. To force sequential layouts so that a type maps directly and can be copied bit for bit (blitted) from managed to unmanaged code, and vice versa, you add the StructLayoutAttribute
with the LayoutKind.Sequential
enum value. (This is also useful when writing data to and from filestreams where a sequential layout may be expected.)
Since the unmanaged (C++) definition for struct
does not map to the C# definition, there is no direct mapping of unmanaged struct to managed struct. Instead, developers should follow the usual C# guidelines about whether the type should behave like a value or a reference type, and whether the size is small (approximately less than 16 bytes).
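One way to check that a sequential layout produces the byte order you expect is to ask the marshaler for the size and field offsets. The sketch below uses a hypothetical four-byte struct named Rgbx (a stand-in for illustration, not the ColorRef listing itself) together with Marshal.SizeOf and Marshal.OffsetOf.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for a four-byte color value.
[StructLayout(LayoutKind.Sequential)]
struct Rgbx
{
    public byte Red;
    public byte Green;
    public byte Blue;
    public byte Unused;
}

static class LayoutDemo
{
    // Returns the marshaled byte offset of the named field.
    public static int OffsetOf(string fieldName)
    {
        return (int)Marshal.OffsetOf<Rgbx>(fieldName);
    }

    static void Main()
    {
        // Four one-byte fields laid out in declaration order.
        Console.WriteLine("Size:   " + Marshal.SizeOf<Rgbx>());
        Console.WriteLine("Red:    " + OffsetOf("Red"));
        Console.WriteLine("Green:  " + OffsetOf("Green"));
        Console.WriteLine("Blue:   " + OffsetOf("Blue"));
        Console.WriteLine("Unused: " + OffsetOf("Unused"));
    }
}
```

With LayoutKind.Sequential the offsets follow declaration order (0, 1, 2, 3), which is what unmanaged code reading the same four bytes would assume.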
One inconvenient aspect of Win32 API programming is the fact that the APIs frequently report errors in inconsistent ways. For example, some APIs return a value (0
, 1
, false
, and so on) to indicate an error, whereas others set an out
parameter in some way. Furthermore, the details of what went wrong require additional calls to the GetLastError()
API and then an additional call to FormatMessage()
to retrieve an error message corresponding to the error. In summary, Win32 error reporting in unmanaged code seldom occurs via exceptions.
Fortunately, the P/Invoke designers provided a mechanism for error handling. To enable it, set the SetLastError named parameter of the DllImport attribute to true. You can then instantiate a System.ComponentModel.Win32Exception that is automatically initialized with the Win32 error data immediately following the P/Invoke call (see Listing 23.6).
class VirtualMemoryManager
{
    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr VirtualAllocEx(
        IntPtr hProcess,
        IntPtr lpAddress,
        IntPtr dwSize,
        AllocationType flAllocationType,
        uint flProtect);

    // ...
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool VirtualProtectEx(
        IntPtr hProcess, IntPtr lpAddress,
        IntPtr dwSize, uint flNewProtect,
        ref uint lpflOldProtect);

    [Flags]
    private enum AllocationType : uint
    {
        // ...
    }

    [Flags]
    private enum ProtectionOptions
    {
        // ...
    }

    [Flags]
    private enum MemoryFreeType
    {
        // ...
    }

    public static IntPtr AllocExecutionBlock(
        int size, IntPtr hProcess)
    {
        IntPtr codeBytesPtr;
        codeBytesPtr = VirtualAllocEx(
            hProcess, IntPtr.Zero,
            (IntPtr)size,
            AllocationType.Reserve | AllocationType.Commit,
            (uint)ProtectionOptions.PageExecuteReadWrite);

        // VirtualAllocEx() reports failure by returning NULL
        if (codeBytesPtr == IntPtr.Zero)
        {
            throw new System.ComponentModel.Win32Exception();
        }

        uint lpflOldProtect = 0;
        // VirtualProtectEx() reports failure by returning false
        if (!VirtualProtectEx(
            hProcess, codeBytesPtr,
            (IntPtr)size,
            (uint)ProtectionOptions.PageExecuteReadWrite,
            ref lpflOldProtect))
        {
            throw new System.ComponentModel.Win32Exception();
        }
        return codeBytesPtr;
    }

    public static IntPtr AllocExecutionBlock(int size)
    {
        return AllocExecutionBlock(
            size, GetCurrentProcessHandle());
    }
}
This code enables developers to provide the custom error checking that each API uses while still reporting the error in a standard manner.
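The pattern of "check the API-specific failure value, then throw a Win32Exception" can be isolated into a small helper. The sketch below is an illustration only: ThrowIfZero is a hypothetical name, and the error code is passed in explicitly so that the example runs without calling any Windows API. In real P/Invoke code you would typically use the parameterless Win32Exception constructor, which reads the code captured by SetLastError = true.

```csharp
using System;
using System.ComponentModel;

static class ErrorDemo
{
    // Win32 ERROR_ACCESS_DENIED, used here purely as sample data.
    public const int ErrorAccessDenied = 5;

    // Hypothetical helper: converts a NULL result from a
    // handle-returning API into a Win32Exception.
    public static IntPtr ThrowIfZero(IntPtr result, int lastError)
    {
        if (result == IntPtr.Zero)
        {
            throw new Win32Exception(lastError);
        }
        return result;
    }

    static void Main()
    {
        try
        {
            // Simulate an API that failed and returned NULL.
            ThrowIfZero(IntPtr.Zero, ErrorAccessDenied);
        }
        catch (Win32Exception exception)
        {
            // NativeErrorCode carries the original Win32 error code.
            Console.WriteLine(
                "Native error: " + exception.NativeErrorCode);
        }
    }
}
```

Callers then handle one exception type regardless of how each underlying API signals failure.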
Listings 23.1 and 23.3 declared the P/Invoke methods as internal or private. Except for the simplest of APIs, wrapping methods in public wrappers that reduce the complexity of the P/Invoke API calls is a good guideline that increases API usability and moves toward object-oriented type structure. The AllocExecutionBlock()
declaration in Listing 23.6 provides a good example of this approach.
SafeHandle
Frequently, P/Invoke involves a resource, such as a handle, that code needs to clean up after using. Instead of requiring developers to remember this step is necessary and manually code it each time, it is helpful to provide a class that implements IDisposable
and a finalizer. In Listing 23.7, for example, the address returned after VirtualAllocEx()
and VirtualProtectEx()
requires a follow-up call to VirtualFreeEx()
. To provide built-in support for this process, you define a VirtualMemoryPtr
class that derives from System.Runtime.InteropServices.SafeHandle
.
public class VirtualMemoryPtr :
    System.Runtime.InteropServices.SafeHandle
{
    public VirtualMemoryPtr(int memorySize) :
        base(IntPtr.Zero, true)
    {
        _ProcessHandle =
            VirtualMemoryManager.GetCurrentProcessHandle();
        _MemorySize = (IntPtr)memorySize;
        _AllocatedPointer =
            VirtualMemoryManager.AllocExecutionBlock(
                memorySize, _ProcessHandle);
        _Disposed = false;
    }

    public readonly IntPtr _AllocatedPointer;
    readonly IntPtr _ProcessHandle;
    readonly IntPtr _MemorySize;
    bool _Disposed;

    public static implicit operator IntPtr(
        VirtualMemoryPtr virtualMemoryPointer)
    {
        return virtualMemoryPointer._AllocatedPointer;
    }

    // SafeHandle abstract member
    public override bool IsInvalid
    {
        get
        {
            return _Disposed;
        }
    }

    // SafeHandle abstract member
    protected override bool ReleaseHandle()
    {
        if (!_Disposed)
        {
            _Disposed = true;
            GC.SuppressFinalize(this);
            VirtualMemoryManager.VirtualFreeEx(_ProcessHandle,
                _AllocatedPointer, _MemorySize);
        }
        return true;
    }
}
System.Runtime.InteropServices.SafeHandle
includes the abstract members IsInvalid
and ReleaseHandle()
. You place your cleanup code in the latter; the former indicates whether this code has executed yet.
With VirtualMemoryPtr
, you can allocate memory simply by instantiating the type and specifying the needed memory allocation.
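The same SafeHandle pattern applies to any unmanaged resource. As a smaller, platform-neutral sketch, the hypothetical HGlobalBuffer class below wraps memory obtained from Marshal.AllocHGlobal instead of VirtualAllocEx(); it is an assumption made for illustration, not part of the chapter's VirtualMemoryManager.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical SafeHandle that owns a block of unmanaged memory.
sealed class HGlobalBuffer : SafeHandle
{
    public HGlobalBuffer(int size) : base(IntPtr.Zero, true)
    {
        // Store the raw pointer in the protected handle field.
        SetHandle(Marshal.AllocHGlobal(size));
    }

    // SafeHandle abstract member: NULL means nothing to release.
    public override bool IsInvalid
    {
        get { return handle == IntPtr.Zero; }
    }

    // SafeHandle abstract member: runs once, on Dispose or finalization.
    protected override bool ReleaseHandle()
    {
        Marshal.FreeHGlobal(handle);
        return true;
    }
}

static class SafeHandleDemo
{
    // Writes a byte into the unmanaged block and reads it back.
    public static byte RoundTrip(byte value)
    {
        using (HGlobalBuffer buffer = new HGlobalBuffer(16))
        {
            IntPtr address = buffer.DangerousGetHandle();
            Marshal.WriteByte(address, value);
            return Marshal.ReadByte(address);
        } // ReleaseHandle() frees the memory here
    }

    static void Main()
    {
        Console.WriteLine(RoundTrip(42));
    }
}
```

The using statement guarantees the release call even if an exception is thrown, and the finalizer inherited from SafeHandle covers the case where the caller forgets to dispose.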
Once you declare the P/Invoke functions, you invoke them just as you would any other class member. The key, however, is that the imported DLL must be in the path, including the executable directory, so that it can be successfully loaded. Listings 23.6 and 23.7 demonstrate this approach. However, they rely on some constants.
Since flAllocationType
and flProtect
are flags, it is a good practice to provide constants or enums for each. Instead of expecting the caller to define these constants or enums, encapsulation suggests that you provide them as part of the API declaration, as shown in Listing 23.8.
class VirtualMemoryManager
{
    // ...
    /// <summary>
    /// The type of memory allocation. This parameter must
    /// contain one of the following values.
    /// </summary>
    [Flags]
    private enum AllocationType : uint
    {
        /// <summary>
        /// Allocates physical storage in memory or in the
        /// paging file on disk for the specified reserved
        /// memory pages. The function initializes the memory
        /// to zero.
        /// </summary>
        Commit = 0x1000,
        /// <summary>
        /// Reserves a range of the process's virtual address
        /// space without allocating any actual physical
        /// storage in memory or in the paging file on disk.
        /// </summary>
        Reserve = 0x2000,
        /// <summary>
        /// Indicates that data in the memory range specified by
        /// lpAddress and dwSize is no longer of interest. The
        /// pages should not be read from or written to the
        /// paging file. However, the memory block will be used
        /// again later, so it should not be decommitted. This
        /// value cannot be used with any other value.
        /// </summary>
        Reset = 0x80000,
        /// <summary>
        /// Allocates physical memory with read-write access.
        /// This value is solely for use with Address Windowing
        /// Extensions (AWE) memory.
        /// </summary>
        Physical = 0x400000,
        /// <summary>
        /// Allocates memory at the highest possible address.
        /// </summary>
        TopDown = 0x100000,
    }

    /// <summary>
    /// The memory protection for the region of pages to be
    /// allocated.
    /// </summary>
    [Flags]
    private enum ProtectionOptions : uint
    {
        /// <summary>
        /// Enables execute access to the committed region of
        /// pages. An attempt to read or write to the committed
        /// region results in an access violation.
        /// </summary>
        Execute = 0x10,
        /// <summary>
        /// Enables execute and read access to the committed
        /// region of pages. An attempt to write to the
        /// committed region results in an access violation.
        /// </summary>
        PageExecuteRead = 0x20,
        /// <summary>
        /// Enables execute, read, and write access to the
        /// committed region of pages.
        /// </summary>
        PageExecuteReadWrite = 0x40,
        // ...
    }

    /// <summary>
    /// The type of free operation.
    /// </summary>
    [Flags]
    private enum MemoryFreeType : uint
    {
        /// <summary>
        /// Decommits the specified region of committed pages.
        /// After the operation, the pages are in the reserved
        /// state.
        /// </summary>
        Decommit = 0x4000,
        /// <summary>
        /// Releases the specified region of pages. After this
        /// operation, the pages are in the free state.
        /// </summary>
        Release = 0x8000
    }
    // ...
}
The advantage of enums is that they group together the various values. Furthermore, they can limit the scope to nothing else besides these values.
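As a short illustration of how such flag enums are combined before being handed to the unmanaged API, the sketch below redeclares two of the AllocationType values as a standalone enum. The FlagsDemo class and its method are hypothetical helpers for this example.

```csharp
using System;

// Two values copied from the AllocationType declaration,
// redeclared here so that the sketch stands alone.
[Flags]
enum AllocationType : uint
{
    Commit = 0x1000,
    Reserve = 0x2000
}

static class FlagsDemo
{
    // Combines the flags and exposes the raw value that the
    // unmanaged function would receive.
    public static uint ReserveAndCommit()
    {
        return (uint)(AllocationType.Reserve | AllocationType.Commit);
    }

    static void Main()
    {
        AllocationType flags =
            AllocationType.Reserve | AllocationType.Commit;

        // The underlying value is the bitwise OR of both members.
        Console.WriteLine("0x{0:X}", (uint)flags);  // 0x3000

        // HasFlag tests whether an individual bit is set.
        Console.WriteLine(flags.HasFlag(AllocationType.Commit));
    }
}
```

Callers work with named members throughout, and only the P/Invoke boundary sees the numeric value.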
Whether they are focused on error handling, structs, or constant values, one goal of effective API developers is to provide a simplified managed API that wraps the underlying Win32 API. For example, Listing 23.9 overloads VirtualFreeEx()
with public versions that simplify the call.
class VirtualMemoryManager
{
    // ...
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool VirtualFreeEx(
        IntPtr hProcess, IntPtr lpAddress,
        IntPtr dwSize, IntPtr dwFreeType);

    public static bool VirtualFreeEx(
        IntPtr hProcess, IntPtr lpAddress,
        IntPtr dwSize)
    {
        bool result = VirtualFreeEx(
            hProcess, lpAddress, dwSize,
            (IntPtr)MemoryFreeType.Decommit);
        if (!result)
        {
            throw new System.ComponentModel.Win32Exception();
        }
        return result;
    }

    public static bool VirtualFreeEx(
        IntPtr lpAddress, IntPtr dwSize)
    {
        return VirtualFreeEx(
            GetCurrentProcessHandle(), lpAddress, dwSize);
    }

    [DllImport("kernel32", SetLastError = true)]
    static extern IntPtr VirtualAllocEx(
        IntPtr hProcess,
        IntPtr lpAddress,
        IntPtr dwSize,
        AllocationType flAllocationType,
        uint flProtect);
    // ...
}
One last key point related to P/Invoke is that function pointers in unmanaged code map to delegates in managed code. To set up a timer, for example, you would provide a function pointer that the timer could call back on, once it had expired. Specifically, you would pass a delegate instance that matches the signature of the callback.
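The delegate-to-function-pointer mapping can be observed without calling any native library. In the sketch below, the BinaryOp delegate type and the Add method are hypothetical; the code converts a delegate into the raw function pointer that unmanaged code would receive and then wraps that pointer back into a delegate to invoke it.

```csharp
using System;
using System.Runtime.InteropServices;

static class CallbackDemo
{
    // Hypothetical callback signature; unmanaged code would see
    // this as a pointer to a cdecl function taking two ints.
    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    public delegate int BinaryOp(int left, int right);

    static int Add(int left, int right)
    {
        return left + right;
    }

    public static int CallThroughFunctionPointer(int left, int right)
    {
        BinaryOp callback = Add;

        // The raw address that a native API would be given.
        IntPtr functionPointer =
            Marshal.GetFunctionPointerForDelegate(callback);

        // Wrap the address back into a callable delegate.
        BinaryOp fromPointer =
            Marshal.GetDelegateForFunctionPointer<BinaryOp>(
                functionPointer);
        int result = fromPointer(left, right);

        // Keep the original delegate alive while the pointer is in use.
        GC.KeepAlive(callback);
        return result;
    }

    static void Main()
    {
        Console.WriteLine(CallThroughFunctionPointer(2, 3));
    }
}
```

When a real native API holds on to the function pointer, the managed delegate must stay referenced for as long as the callback can fire; otherwise the garbage collector may reclaim it.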
Given the idiosyncrasies of P/Invoke, there are several guidelines to aid in the process of writing such code.
On occasion, developers may want to access and work with memory, and with pointers to memory locations, directly. This is necessary, for example, for certain operating system interactions as well as with certain types of time-critical algorithms. To support this capability, C# requires use of the unsafe code construct.
One of C#’s great features is the fact that it is strongly typed and supports type checking throughout the runtime execution. What makes this feature especially beneficial is that it is possible to circumvent this support and manipulate memory and addresses directly. You would do so when working with memory-mapped devices, for example, or if you wanted to implement time-critical algorithms. The key is to designate a portion of the code as unsafe.
Unsafe code is an explicit code block and compilation option, as shown in Listing 23.10. The unsafe
modifier has no effect on the generated CIL code itself, but rather is a directive to the compiler to permit pointer and address manipulation within the unsafe block. Furthermore, unsafe does not imply unmanaged.
class Program
{
    unsafe static int Main(string[] args)
    {
        // ...
    }
}
You can use unsafe
as a modifier to the type or to specific members within the type.
In addition, C# allows unsafe
as a statement that flags a code block to allow unsafe code (see Listing 23.11).
class Program
{
    static int Main(string[] args)
    {
        unsafe
        {
            // ...
        }
    }
}
Code within the unsafe
block can include unsafe constructs such as pointers.
Note
It is necessary to explicitly indicate to the compiler that unsafe code is supported.
When you write unsafe code, your code becomes vulnerable to the possibility of buffer overflows and similar outcomes that may potentially expose security holes. For this reason, it is necessary to explicitly notify the compiler that unsafe code occurs. To accomplish this, set AllowUnsafeBlocks
to true
in your CSPROJ file, as shown in Listing 23.12.
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>netcoreapp1.0</TargetFramework>
<ProductName>Chapter20</ProductName>
<WarningLevel>2</WarningLevel>
<AllowUnsafeBlocks>True</AllowUnsafeBlocks>
</PropertyGroup>
<Import Project="..\Versioning.targets" />
<ItemGroup>
<ProjectReference Include="..\SharedCode\SharedCode.csproj" />
</ItemGroup>
</Project>
Alternatively, you can pass the property on the command line when running dotnet build
(see Output 23.1).
Output 23.1
dotnet build /property:AllowUnsafeBlocks=True
Or, if invoking C# compiler directly, you need the /unsafe
switch (see Output 23.2).
Output 23.2
csc.exe /unsafe Program.cs
With Visual Studio, you can activate this feature by checking the Allow Unsafe Code checkbox from the Build tab of the Project Properties window.
The /unsafe
switch enables you to directly manipulate memory and execute instructions that are unmanaged. Requiring /unsafe
, therefore, makes explicit any exposure to potential security vulnerabilities that such code might introduce. With great power comes great responsibility.
Now that you have marked a code block as unsafe, it is time to look at how to write unsafe code. First, unsafe code allows the declaration of a pointer. Consider the following example:
byte* pData;
Assuming pData
is not null
, its value points to a location that contains one or more sequential byte
s; the value of pData
represents the memory address of the byte
s. The type specified before the *
is the referent type—that is, the type located where the value of the pointer refers. In this example, pData
is the pointer and byte
is the referent type, as shown in Figure 23.1.
Because pointers are simply integers that happen to refer to a memory address, they are not subject to garbage collection. C# does not allow referent types other than unmanaged types, which are types that are not reference types, are not generics, and do not contain reference types. Therefore, the following declaration is not valid:
string* pMessage;
Likewise, this declaration is not valid:
ServiceStatus* pStatus;
where ServiceStatus
is defined as shown in Listing 23.13. The problem, once again, is that ServiceStatus
includes a string
field.
struct ServiceStatus
{
    int State;
    string Description;  // Description is a reference type
}
In addition to custom structs that contain only unmanaged types, valid referent types include enums, predefined value types (sbyte
, byte
, short
, ushort
, int
, uint
, long
, ulong
, char
, float
, double
, decimal
, and bool
), and pointer types (such as byte**
). Lastly, valid syntax includes void*
pointers, which represent pointers to an unknown type.
Once code defines a pointer, it needs to assign a value before accessing it. Just like reference types, pointers can hold the value null
, which is their default value. The value stored by the pointer is the address of a location. Therefore, to assign the pointer, you must first retrieve the address of the data.
You could explicitly cast an int
or a long
into a pointer, but this rarely occurs without a means of determining the address of a particular data value at execution time. Instead, you need to use the address operator (&
) to retrieve the address of the value type:
byte* pData = &bytes[0]; // Compile error
The problem is that in a managed environment, data can move, thereby invalidating the address. The resulting error message will be “You can only take the address of [an] unfixed expression inside a fixed statement initializer.” In this case, the byte referenced appears within an array, and an array is a reference type (a movable type). Reference types appear on the heap and are subject to garbage collection or relocation. A similar problem occurs when referring to a value type field on a movable type:
int* a = &"message".Length;
Either way, assigning an address of some data requires that the following criteria are met:
The data must be classified as a variable.
The data must be an unmanaged type.
The variable needs to be classified as fixed, not movable.
If the data is an unmanaged variable type but is not fixed, use the fixed
statement to fix a movable variable.
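A local variable of an unmanaged type satisfies all of these criteria without any pinning, because it is classified as a fixed variable. The following sketch (with the hypothetical names AddressOfDemo and WriteThroughPointer) takes the address of a local int directly; it assumes compilation with unsafe code enabled.

```csharp
using System;

static class AddressOfDemo
{
    public static unsafe int WriteThroughPointer(
        int initial, int replacement)
    {
        // A local of unmanaged type is a fixed variable, so the
        // address operator is allowed without a fixed statement.
        int value = initial;
        int* pValue = &value;

        // Writing through the pointer changes the local itself.
        *pValue = replacement;
        return value;
    }

    static void Main()
    {
        Console.WriteLine(WriteThroughPointer(1, 10));
    }
}
```

Contrast this with &bytes[0], where the element belongs to a heap-allocated array and therefore needs the fixed statement.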
To retrieve the address of a movable data item, it is necessary to fix, or pin, the data, as demonstrated in Listing 23.14.
byte[] bytes = new byte[24];
fixed (byte* pData = &bytes[0])  // pData = bytes also allowed
{
    // ...
}
Within the code block of a fixed
statement, the assigned data will not move. In this example, bytes
will remain at the same address, at least until the end of the fixed
statement.
The fixed
statement requires the declaration of the pointer variable within its scope. This avoids accessing the variable outside the fixed
statement, when the data is no longer fixed. However, as a programmer, you are responsible for ensuring that you do not assign the pointer to another variable that survives beyond the scope of the fixed
statement—possibly in an API call, for example. Unsafe code is called “unsafe” for a reason; you must ensure that you use the pointers safely, rather than relying on the runtime to enforce safety on your behalf. Similarly, using ref
or out
parameters will be problematic for data that will not survive beyond the method call.
Since a string is an invalid referent type, it would appear invalid to define pointers to strings. However, as in C++, internally a string is a pointer to the first character of an array of characters, and it is possible to declare pointers to characters using char*
. Therefore, C# allows for declaring a pointer of type char*
and assigning it to a string within a fixed
statement. The fixed
statement prevents the movement of the string during the life of the pointer. Similarly, it allows any movable type that supports an implicit conversion to a pointer of another type, given a fixed statement.
You can replace the verbose assignment of &bytes[0]
with the abbreviated bytes
, as shown in Listing 23.15.
byte[] bytes = new byte[24];
fixed (byte* pData = bytes)
{
    // ...
}
Depending on the frequency and time needed for their execution, fixed
statements may have the potential to cause fragmentation in the heap because the garbage collector cannot compact fixed objects. To reduce this problem, the best practice is to pin blocks early in the execution and to pin fewer large blocks rather than many small blocks. Unfortunately, this preference must be tempered with the practice of pinning as little as possible for as short a time as possible, so as to minimize the chance that a collection will happen during the time that the data is pinned. To some extent, .NET 2.0 reduces this problem through its inclusion of some additional fragmentation-aware code.
Potentially you might need to fix an object in place in one method body and have it remain fixed until another method is called; this is not possible with the fixed
statement. If you are in this unfortunate situation, you can use methods on the GCHandle
object to fix an object in place indefinitely. You should do so only if it is absolutely necessary, however; fixing an object for a long time makes it highly likely that the garbage collector will be unable to efficiently compact memory.
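A minimal sketch of this GCHandle approach follows. The PinningDemo class and its method are hypothetical; the calls used are GCHandle.Alloc with GCHandleType.Pinned, AddrOfPinnedObject(), and Free(). No unsafe block is needed because the address is handled as an IntPtr.

```csharp
using System;
using System.Runtime.InteropServices;

static class PinningDemo
{
    public static byte WriteViaPinnedAddress(byte value)
    {
        byte[] bytes = new byte[24];

        // Pin the array; it stays at this address until Free().
        GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
        try
        {
            IntPtr address = handle.AddrOfPinnedObject();

            // Write to element 3 through the raw address.
            Marshal.WriteByte(address, 3, value);
            return bytes[3];
        }
        finally
        {
            // Always release the pin, or the array can never move.
            handle.Free();
        }
    }

    static void Main()
    {
        Console.WriteLine(WriteViaPinnedAddress(7));
    }
}
```

Here the pin is released in the same method for brevity; the point of GCHandle is that the handle could instead be stored and freed later from a different method.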
You should use the fixed
statement on an array to prevent the garbage collector from moving the data. However, an alternative is to allocate the array on the call stack. Stack allocated data is not subject to garbage collection or to the finalizer patterns that accompany it. Like referent types, the requirement is that the stackalloc
data is an array of unmanaged types. For example, instead of allocating an array of bytes on the heap, you can place it onto the call stack, as shown in Listing 23.16.
byte* bytes = stackalloc byte[42];
Because the data type is an array of unmanaged types, the runtime can allocate a fixed buffer size for the array and then restore that buffer once the pointer goes out of scope. Specifically, it allocates sizeof(T) * E
, where E
is the array size and T
is the referent type. Given the requirement of using stackalloc
only on an array of unmanaged types, the runtime restores the buffer back to the system by simply unwinding the stack, thereby eliminating the complexities of iterating over the f-reachable queue (see the “Garbage Collection” section and discussion of finalization in Chapter 10) and compacting reachable data. Thus, there is no way to explicitly free stackalloc
data.
The stack is a precious resource. Although it is small, running out of stack space will have a big effect—namely, the program will crash. For this reason, you should make every effort to avoid running out of stack space. If a program does run out of stack space, the best thing that can happen is for the program to shut down/crash immediately. Generally, programs have less than 1MB of stack space (and possibly a lot less). Therefore, take great care to avoid allocating arbitrarily sized buffers on the stack.
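One way to follow that advice is to cap the requested size before calling stackalloc. The sketch below is illustrative only: the 256-element limit and the names StackAllocDemo and SumOfSquares are assumptions, and the code must be compiled with unsafe code enabled.

```csharp
using System;

static class StackAllocDemo
{
    // Arbitrary cap chosen for this sketch (256 ints = 1KB).
    const int MaxElements = 256;

    public static unsafe int SumOfSquares(int count)
    {
        // Refuse sizes that could exhaust the stack.
        if (count < 0 || count > MaxElements)
        {
            throw new ArgumentOutOfRangeException(nameof(count));
        }

        // Allocates sizeof(int) * count bytes on the call stack.
        int* squares = stackalloc int[count];
        for (int i = 0; i < count; i++)
        {
            squares[i] = i * i;
        }

        int sum = 0;
        for (int i = 0; i < count; i++)
        {
            sum += squares[i];
        }
        return sum;
    } // The buffer is reclaimed when the method returns

    static void Main()
    {
        Console.WriteLine(SumOfSquares(4));  // 0 + 1 + 4 + 9 = 14
    }
}
```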
Accessing the data stored in a variable of a type referred to by a pointer requires that you dereference the pointer, placing the indirection operator prior to the expression. For example, byte data = *pData;
dereferences the location of the byte
referred to by pData
and produces a variable of type byte
. The variable provides read/write access to the single byte
at that location.
Using this principle in unsafe code allows the unorthodox behavior of modifying the “immutable” string, as shown in Listing 23.17. In no way is this strategy recommended, even though it does expose the potential of low-level memory manipulation.
string text = "S5280ft";
Console.Write("{0} = ", text);
unsafe  // Requires /unsafe switch
{
    fixed (char* pText = text)
    {
        char* p = pText;
        *++p = 'm';
        *++p = 'i';
        *++p = 'l';
        *++p = 'e';
        *++p = ' ';
        *++p = ' ';
    }
}
Console.WriteLine(text);
The results of Listing 23.17 appear in Output 23.3.
Output 23.3
S5280ft = Smile
In this case, you take the original address and increment it by the size of the referent type (sizeof(char)), using the pre-increment operator. Next, you dereference the address using the indirection operator and then assign the location with a different character. Similarly, using the + and - operators on a pointer changes the address by the operand multiplied by sizeof(T), where T is the referent type.
The comparison operators (==
, !=
, <
, >
, <=
, and >=
) also work to compare pointers. Thus, their use effectively translates to a comparison of address location values.
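The scaling by the size of the referent type, and the comparison of pointers as addresses, can both be seen in a few lines. The sketch below uses hypothetical names (PointerArithmeticDemo, ByteDistance, IsLaterAddress) and assumes compilation with unsafe code enabled.

```csharp
using System;

static class PointerArithmeticDemo
{
    // Returns how many bytes apart pStart and pStart + elements are.
    public static unsafe long ByteDistance(int elements)
    {
        int[] data = new int[8];
        fixed (int* pStart = data)
        {
            // Adding n to an int* moves n * sizeof(int) bytes.
            int* pLater = pStart + elements;

            // Subtracting byte* values yields the raw byte count.
            return (byte*)pLater - (byte*)pStart;
        }
    }

    // Compares two addresses within the same pinned array.
    public static unsafe bool IsLaterAddress(int elements)
    {
        int[] data = new int[8];
        fixed (int* pStart = data)
        {
            return (pStart + elements) > pStart;
        }
    }

    static void Main()
    {
        Console.WriteLine(ByteDistance(3));    // 3 * sizeof(int) = 12
        Console.WriteLine(IsLaterAddress(3));
    }
}
```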
One restriction on the dereferencing operator is the inability to dereference a void*
. The void*
data type represents a pointer to an unknown type. Since the data type is unknown, it can’t be dereferenced to produce a variable. Instead, to access the data referenced by a void*
, you must convert it to another pointer type and then dereference the latter type.
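For instance, the following sketch stores the address of an int in a void* and then casts it back to int* before dereferencing. The names VoidPointerDemo and ReadAsInt are hypothetical, and the code assumes compilation with unsafe code enabled.

```csharp
using System;

static class VoidPointerDemo
{
    public static unsafe int ReadAsInt(int value)
    {
        // A value parameter is a fixed variable, so its address
        // can be taken directly; void* discards the referent type.
        void* pUnknown = &value;

        // *pUnknown would not compile; convert to int* first.
        int* pInt = (int*)pUnknown;
        return *pInt;
    }

    static void Main()
    {
        Console.WriteLine(ReadAsInt(42));
    }
}
```

The cast is unchecked: converting to the wrong pointer type compiles without complaint and simply reinterprets the bytes, which is part of what makes the code unsafe.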
You can achieve the same behavior as implemented in Listing 23.17 by using the index operator rather than the indirection operator (see Listing 23.18).
string text;
text = "S5280ft";
Console.Write("{0} = ", text);
unsafe  // Requires /unsafe switch
{
    fixed (char* pText = text)
    {
        pText[1] = 'm';
        pText[2] = 'i';
        pText[3] = 'l';
        pText[4] = 'e';
        pText[5] = ' ';
        pText[6] = ' ';
    }
}
Console.WriteLine(text);
The results of Listing 23.18 appear in Output 23.4.
Output 23.4
S5280ft = Smile
Modifications such as those in Listings 23.17 and 23.18 can lead to unexpected behavior. For example, if you reassigned text
to "S5280ft"
following the Console.WriteLine()
statement and then redisplayed text
, the output would still be Smile
because the address of two equal string literals is optimized to one string literal referenced by both variables. In spite of the apparent assignment
text = "S5280ft";
after the unsafe code in Listing 23.17, the internals of the string assignment are an address assignment of the modified "S5280ft"
location, so text
is never set to the intended value.
Dereferencing a pointer produces a variable of the pointer’s underlying type. You can then access the members of the underlying type using the member access dot operator in the usual way. However, the rules of operator precedence require that *x.y
means *(x.y)
, which is probably not what you intended. If x
is a pointer, the correct code is (*x).y
, which is an unpleasant syntax. To make it easier to access members of a dereferenced pointer, C# provides a special member access operator: x->y
is a shorthand for (*x).y
, as shown in Listing 23.19.
unsafe
{
    Angle angle = new Angle(30, 18, 0);
    Angle* pAngle = &angle;
    System.Console.WriteLine("{0}° {1}' {2}\"",
        pAngle->Hours, pAngle->Minutes, pAngle->Seconds);
}
The results of Listing 23.19 appear in Output 23.5.
Output 23.5
30° 18' 0"
As promised at the beginning of this chapter, we finish up with a full working example of what is likely the most “unsafe” thing you can do in C#: obtain a pointer to a block of memory, fill it with the bytes of machine code, make a delegate that refers to the new code, and execute it. In this example, we use assembly code to determine the processor ID. If run on a Windows machine, it prints the processor ID. Listing 23.20 shows how to do it.
using System;
using System.Runtime.InteropServices;
using System.Text;

class Program
{
    public unsafe delegate void MethodInvoker(byte* buffer);

    public unsafe static int ChapterMain()
    {
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            unsafe
            {
                byte[] codeBytes = new byte[] {
                    0x49, 0x89, 0xd8,  // mov    %rbx,%r8
                    0x49, 0x89, 0xc9,  // mov    %rcx,%r9
                    0x48, 0x31, 0xc0,  // xor    %rax,%rax
                    0x0f, 0xa2,        // cpuid
                    0x4c, 0x89, 0xc8,  // mov    %r9,%rax
                    0x89, 0x18,        // mov    %ebx,0x0(%rax)
                    0x89, 0x50, 0x04,  // mov    %edx,0x4(%rax)
                    0x89, 0x48, 0x08,  // mov    %ecx,0x8(%rax)
                    0x4c, 0x89, 0xc3,  // mov    %r8,%rbx
                    0xc3               // retq
                };

                byte[] buffer = new byte[12];

                using (VirtualMemoryPtr codeBytesPtr =
                    new VirtualMemoryPtr(codeBytes.Length))
                {
                    Marshal.Copy(
                        codeBytes, 0,
                        codeBytesPtr, codeBytes.Length);

                    MethodInvoker method =
                        Marshal.GetDelegateForFunctionPointer<MethodInvoker>(codeBytesPtr);
                    fixed (byte* newBuffer = &buffer[0])
                    {
                        method(newBuffer);
                    }
                }
                Console.Write("Processor Id: ");
                Console.WriteLine(
                    ASCIIEncoding.ASCII.GetChars(buffer));
            } // unsafe
        }
        else
        {
            Console.WriteLine(
                "This sample is only valid for Windows");
        }
        return 0;
    }
}
The results of Listing 23.20 appear in Output 23.6.
Output 23.6
Processor Id: GenuineIntel
As demonstrated throughout this book, C# offers great power, flexibility, consistency, and a fantastic structure. This chapter highlighted the ability of C# programs to perform very low-level machine-code operations.
Before we end the book, Chapter 24 briefly describes the underlying execution framework and shifts the focus from the C# language to the broader context in which C# programs execute.