Chapter 10. Building Managed Code

 

Philosophy: One of my biggest challenges is to keep my intellectual arteries sufficiently pliable to adapt to and accept inevitable change.

 
 --Paul Harvey, radio legend, August 4, 2002 (age: 83 yrs)

Once again, I start this chapter with some basic definitions. If you are familiar with the .NET Framework, you might want to skip this section or bear with me on this remedial review. If you’re familiar with other Web service technologies, you might find this enlightening, or you might opt to skip this chapter altogether. I believe that all builders should be familiar with the basic blocks of the .NET Framework that I talk about here. This understanding will help in the move from classic Win32 (native) builds to managed code.

The Official Definition of Managed Code

The “official” definition of managed code from Partition 1 (Architecture) of the Tool Developers Guide in the .NET Framework SDK documentation is as follows:

Managed code is simply code that provides enough information to allow the common language runtime (CLR) to provide a set of core services, including these:

  • Given an address inside the code for a method, locate the metadata describing the method

  • Walk the stack

  • Handle exceptions

  • Store and retrieve security information

Managed code requires the .NET Framework (.NET FX, Fx, or just “the Framework” is the shorthand notation) to be installed on a computer to execute or run. The .NET Framework consists of three major parts: the CLR, the Framework Class Library, and ASP.NET.

You can install the .NET Framework on the platforms shown in Table 10.1.

Table 10.1. Platforms That the .NET Framework Can Be Installed On

Supports All of the .NET Framework | Supports the Entire .NET Framework Except Microsoft ASP.NET
Windows 2000 (all versions, no Service Packs required) | Windows 98
Windows XP Professional | Windows 98 SE
 | Windows Me
 | Windows NT 4.0 (all versions, Service Pack 6a required)
 | Windows XP Home Edition

Windows Server 2003 is the first operating system from Microsoft that shipped with the .NET Framework. All future operating systems from Microsoft will also include the .NET Framework, so you do not have to download or redistribute the parts that your code needs to run. You can install the .NET Framework on the existing platforms mentioned in Table 10.1 in various ways, but the easiest is to go to the Windows Update site (http://windowsupdate.microsoft.com) or just type windowsupdate in the address line of your browser. The Windows Update site might prompt you to install required hotfixes or service packs before installing the Framework. This is a good thing. Really.

You can find a lot of information on .NET by searching on the Internet. I want to keep this chapter focused on the aspects of building the managed code rather than on providing details of how .NET works, but this brief overview is necessary so we can get to the points of building managed code.

What Is the CLR, and How Does It Relate to Managed Code?

As mentioned in the previous section, the .NET Framework provides a runtime environment called the CLR, usually referred to as just “the runtime.” The CLR runs the code and provides services that make the development process easier.

Compilers and tools expose the runtime’s functionality and enable you to write code that benefits from this managed execution environment. Managed code is developed with a language compiler that targets the runtime; it benefits from features such as cross-language integration, cross-language exception handling, enhanced security, versioning and deployment support, a simplified model for component interaction, and debugging and profiling services.

To enable the runtime to provide services to managed code, language compilers must emit metadata that describes the types, members, and references in your code. Metadata is stored with the code; every loadable CLR portable executable (PE) file contains this metadata. The runtime uses the metadata to locate and load classes, lay out instances in memory, resolve method invocations, generate native code, enforce security, and set runtime context boundaries.

Managed data is data that lives on a special memory heap that the CLR allocates and releases automatically through a process called garbage collection. Garbage collection is a mechanism that allows the runtime to detect when an object can no longer be accessed. If the object defines a clean-up routine, called a “finalizer,” which the user writes, the runtime calls it and then automatically releases the memory used by that object. Some garbage collectors, like the one used by .NET, also compact memory, which decreases your program’s working set. I find the garbage collector in .NET to be the most impressive aspect of the platform.

Conversely, unmanaged code cannot use managed data; only managed code can access managed data. Unmanaged code does not enjoy the benefits afforded by the CLR: garbage collection, enhanced security, simplified deployment, rich debugging support, consistent error handling, language independence, and even the possibility of running on different platforms.

You can still create unmanaged code (which is the new name for the standard Win32 code you wrote before .NET [native]) with Visual Studio .NET by creating a Microsoft Foundation Class (MFC) or an Active Template Library (ATL) project in the latest version of Visual C++, which is included with Visual Studio .NET. In Chapter 9, “Build Security,” I discuss why you might still want to create unmanaged code. Furthermore, you might still have some legacy components that your .NET application needs to interop with.

You can also create managed code with Visual C++ thanks to something called C++ with Managed Extensions. There is also talk that Microsoft will support Visual Basic 6.0 for some time to come. Because the .NET Framework represents such a fundamental shift from Win32/COM, the two platforms will likely coexist for a number of years.

Managed Execution Process

According to the “.NET Framework Developer’s Guide,” the managed execution process includes the following steps:

  1. Choose a compiler—To obtain the benefits provided by the CLR, you must use one or more language compilers that target the runtime, such as Visual Basic, C#, Visual C++, or JScript.

  2. Compile your code to Microsoft Intermediate Language (MSIL)—Compiling translates your source code into MSIL and generates the required metadata. This is the only part of the execution process that the build team really cares about.

  3. Compile MSIL to native code—At execution time, a just-in-time (JIT) compiler translates the MSIL into native code. During this compilation, code must pass a verification process that examines the MSIL and metadata to find out whether the code can be determined to be type safe.

  4. Execute your code—The CLR provides the infrastructure that enables execution to take place, besides a variety of services that can be used during execution.

Figure 10.1 depicts this .NET compilation process in graphical form.


Figure 10.1. .NET compilation process.

The .NET applications are developed with a high-level language, such as C# or VB.NET. The next step is to compile this code into MSIL. MSIL is a full-fledged, object-aware language, and it’s possible (but unlikely, much like writing an application in assembly language) to build applications using nothing but MSIL. The JIT compiler (aka the jitter) then works on the MSIL in an assembly at execution time. JIT compilation takes into account the fact that some code might never be called during execution. Rather than using time and memory to convert all the MSIL in a portable executable (PE) file to native code, it converts the MSIL as needed during execution and stores the resulting native code so that it is accessible for subsequent calls. Sometimes people confuse JIT compiling with “building,” but it is only the bold text in Figure 10.1 that the build team really cares about.
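The compile-on-first-call behavior described above can be illustrated with a toy analog in Python. This is only a sketch of the caching idea, not how the CLR is implemented: the `LazyCompiler` class and its method names are invented for illustration, source strings stand in for MSIL, and "compiling" simply means turning a string into a Python function.

```python
class LazyCompiler:
    """Toy analog of JIT compilation: translate a method on first call, then reuse it."""

    def __init__(self, sources):
        self._sources = sources   # method name -> source text (stands in for MSIL)
        self._compiled = {}       # method name -> callable (stands in for native code)
        self.compile_count = 0    # how many translations actually happened

    def call(self, name, *args):
        if name not in self._compiled:
            # First call: translate now and keep the result for later calls.
            namespace = {}
            exec(self._sources[name], namespace)
            self._compiled[name] = namespace[name]
            self.compile_count += 1
        return self._compiled[name](*args)


sources = {
    "add": "def add(a, b):\n    return a + b\n",
    "never_called": "def never_called():\n    return 'unused'\n",
}
jit = LazyCompiler(sources)
first = jit.call("add", 2, 3)      # translated here, on first use
second = jit.call("add", 10, 20)   # served from the cache
```

In this sketch, `never_called` is never translated at all, which mirrors the point that code that is not executed costs no JIT time.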

This JIT compiling is what makes .NET distinctive, and sometimes confusing, when compared to unmanaged code builds. In the old world of building, you would just compile and link everything into an executable binary and then ship the binary or binaries. In the .NET or Web services world, you ship “assemblies” that need to be JIT compiled or “assembled” by the .NET Framework.

Note that the compilers for the .NET languages are included free with the .NET Framework. In addition, the C++ compiler is now free. Also notice that there is no concept of “linking” in .NET. Instead, code gets linked dynamically in the “runtime” platform that .NET provides.

The Definition of Assemblies As It Pertains to the .NET Framework

Assemblies are the building blocks of .NET Framework applications; they form the fundamental unit of deployment, version control, reuse, activation scoping, and security permissions. An assembly is a collection of types and resources that are built to work together and form a logical unit of functionality. An assembly provides the CLR with the information it needs to be aware of type implementations. To the runtime, a type does not exist outside the context of an assembly. The simplest way to look at an assembly is that it is either a .NET (managed) DLL or an EXE. Sometimes, an assembly can span a group of files (a multifile assembly), but that’s rare.

Now that we have discussed some basic building blocks of the .NET Framework, let’s move on to discuss some things you need to do when building managed code.

Delay Signing and When to Use It

Shawn Farkas, a tester on the CLR team, has made a lot of good points about delay signing, which I have gathered in this section. Most people who develop .NET applications know about the delay signing feature of the CLR. (If you don’t, check out MSDN’s “Delay Signing an Assembly” for more details.) Basically, delay signing allows a developer to add the public key token to an assembly without having access to the private key. Because the public key token is part of an assembly’s strong name, assemblies under development can carry the same identity that they will have when they are signed; however, not every developer has to have access to the private key.

For instance, to get an assembly signed at Microsoft, we have to submit it to a special signing group. These are the only people who have access to the full Microsoft key pair. Obviously, we don’t want to go through this process for every daily build of the framework, let alone for each developer’s private builds. (Imagine the debugging process if you had to wait for a central key group to sign each build you created.) Instead of going through all this overhead, we delay sign our assemblies until we get ready to make a release to the public, at which point we go through the formal signing process. You’ll learn more about this topic in Chapter 15, “Customer Service and Support.”

A delay-signed assembly contains only the public key token of the signing key, not an actual signature. (That’s because the person producing the delay-signed assembly most likely doesn’t have access to the private key that’s necessary to create a signature.) Inside the PE file that was produced, a delay-signed assembly has space reserved for a signature in the future, but that signature is just a block of zeros until the real signature is computed. Because this block is not likely to be the actual signature value of the assembly, these assemblies will fail to verify upon loading because their signatures are incorrect.
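As a small illustration of the "block of zeros" described above, the following Python sketch checks whether a reserved signature area has ever been filled in. It assumes the caller has already extracted the signature bytes from the PE file (locating them is not shown), and the function name and the 128-byte size are illustrative choices, not something taken from the .NET tools.

```python
def is_signature_placeholder(signature_block: bytes) -> bool:
    """Return True when the reserved signature space is non-empty and still all zeros."""
    return len(signature_block) > 0 and all(b == 0 for b in signature_block)


placeholder = bytes(128)           # reserved but never filled in (delay signed)
filled_in = bytes([0x5A]) * 128    # stands in for a real signature; the value is made up

delay_signed = is_signature_placeholder(placeholder)
fully_signed = not is_signature_placeholder(filled_in)
```

A check like this only tells you that a signature was never written; it says nothing about whether a non-zero signature is actually valid.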

Obviously, it wouldn’t be useful if a delay-signed assembly were completely unable to load. To work around this problem, you need to use the Strong Name tool (sn.exe) included in the .NET Fx tools to add assemblies to the skip verification list. The specific command line is as follows:

sn -Vr assembly [userlist]

Here, assembly is the name of the assembly to skip. In addition to referring to a specific assembly, assembly can be specified in the form *,publicKeyToken to skip verification for all assemblies with a given public key token. The optional userlist is a comma-separated list of users for which verification is skipped. If this part is left out, verification is skipped for all users.
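If a build script needs to issue this command, a small helper can assemble the arguments in the documented order. The sketch below is a hypothetical helper in Python: the function name, the assembly name `MyLibrary.dll`, the token `0123456789abcdef`, and the `CONTOSO` user names are all made up for illustration, and the helper only builds the argument list rather than running sn.exe.

```python
def skip_verification_command(assembly, users=None):
    """Build the argument list for `sn -Vr assembly [userlist]`."""
    command = ["sn", "-Vr", assembly]
    if users:
        # The user list is passed as a single comma-separated argument.
        command.append(",".join(users))
    return command


# Skip verification for one specific assembly, for all users.
one_assembly = skip_verification_command("MyLibrary.dll")

# Skip verification for every assembly with a given token, for two users only.
by_token = skip_verification_command(
    "*,0123456789abcdef", users=["CONTOSO\\alice", "CONTOSO\\bob"]
)
```

On a machine with the .NET Framework SDK tools on the path, the resulting list could be handed to `subprocess.run(command, check=True)`. Restricting the entry to named users, as in the second call, narrows the exposure discussed in the next section.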

The Problem with This Command

What this command does is tell the runtime not to verify the signature on an assembly that has the given public key token (if you use the *,publicKeyToken format), or just on a specific assembly. This is a gigantic security hole. You can easily read public key tokens from any assembly that you have access to. If you run ILDasm on System.dll, inside the manifest, you find the following line:

.publickey = (00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 )

This corresponds to the public key assigned to any assembly that is standardized by ECMA/ISO. You can easily compute the token from this value, but an easier way to get it is to use ILDasm on any assembly that references System.dll. For instance, looking at the manifest of System.Xml.dll under ILDasm shows the following lines:

.assembly extern System
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89 ) // .zV.4..
  .ver 1:0:5000:0
}

This code shows the ECMA/ISO public key token. It’s easy for a malicious developer to write an assembly named System.dll, with an assembly version of 1.0.5000.0, and put the public key just extracted from System.dll into the assembly. He won’t be able to compute a valid signature because he doesn’t have access to the ECMA/ISO private key, but that hardly matters because you’ve turned off strong name verification for this particular public key token. All he has to do is install this assembly in place of System.dll in your GAC, and he now owns your machine.

For this reason, don’t skip verification for assemblies unless you are developing them yourself, and be extra careful about what code is downloaded onto your machine that might claim to be from your organization.
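To make concrete how little effort it takes to derive a token from a public key, here is a minimal Python sketch. It relies on the documented rule that a public key token is the last 8 bytes of the SHA-1 hash of the public key, in reverse byte order. The function name is invented for illustration, and no expected output is stated here; compare the result with the .publickeytoken value that ILDasm reports.

```python
import hashlib


def public_key_token(public_key: bytes) -> str:
    """Last 8 bytes of the SHA-1 hash of the public key, byte-reversed, as hex."""
    digest = hashlib.sha1(public_key).digest()
    return digest[-8:][::-1].hex()


# The 16-byte .publickey value shown in the System.dll manifest earlier in this section.
ecma_key = bytes.fromhex("00000000000000000400000000000000")
token = public_key_token(ecma_key)
print(token)
```

Nothing in this calculation requires the private key, which is why a token alone proves nothing about who produced an assembly.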

Protecting Yourself from the Security Hole You Just Created

Even if you take these precautions inside your company, how can you be sure that someone external to your company cannot somehow disable checking the strong name on your assemblies and swap your assembly with an evil counterpart?

The short answer to this is that you can’t. The skip verification list is stored in the registry under HKLM\Software\Microsoft\StrongName\Verification\<asmName,publicKeyToken>, which is protected by an Access Control List (ACL) that contains both users and groups and the level of access that each has. By default, anyone can read this key, but only administrators can write to it. If a malicious developer manages to write a public key token into your user’s skip verification list, one of two things has happened:

  • Someone has modified the ACL, allowing more write access to this key than usual.

  • The malicious developer is already an administrator on the machine.

If the first bullet is true, revert the ACL to allow only administrators to write to the key, thus closing the hole. If the second bullet is true, the malicious developer already owns your machine. As an admin, this malicious developer could conceivably replace the CLR with a hacked version that doesn’t verify assembly signatures, or perhaps doesn’t implement CAS. If you’ve gotten into this second situation, “game over, thanks for playing.” The malicious person will already have control of your box and can do as he wants.

In summary, delay-signed assemblies increase security in development shops by reducing the number of people who need access to an organization’s private keys. However, the requirement that delay-signed assemblies need to be registered in the skip verification list means that developers’ machines are open to various forms of attack. Make sure that your developers are aware of the situation and don’t overuse your skip verification list, to help make your machines more secure in these environments. Again, this is something that the Central Build Team (CBT) should drive.
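One way to keep an eye on overuse of the skip verification list is to review its entries periodically. The Python sketch below classifies entry names of the form asmName,publicKeyToken, flagging the wildcard entries that skip verification for every assembly with a given token. It is only a sketch: the function name and the sample entries are hypothetical, and reading the entries out of the registry is deliberately left out.

```python
def audit_skip_entries(entry_names):
    """Split skip-verification entries into wildcard, single-assembly, and malformed groups."""
    wildcard, specific, malformed = [], [], []
    for name in entry_names:
        assembly, separator, token = name.rpartition(",")
        if not separator or not token:
            malformed.append(name)                      # no "name,token" shape
        elif assembly == "*":
            wildcard.append(token.lower())              # every assembly with this token
        else:
            specific.append((assembly, token.lower()))  # one named assembly
    return wildcard, specific, malformed


# Hypothetical entries, as they might appear as subkey names.
entries = ["*,0123456789ABCDEF", "MyLibrary,0123456789abcdef", "NoTokenHere"]
wildcard, specific, malformed = audit_skip_entries(entries)
```

Wildcard entries deserve the closest look, because they are the ones that open the door to the substitution attack described earlier in this section.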

One Solution or Many Solution Files?

In VS 2002/2003, C# and VB don’t have any notion of being up-to-date or of doing incremental builds. Because the timestamp of C#/VB output assemblies always changes when you build, any projects that depend on them will always be out-of-date with respect to those assemblies and will need to be rebuilt. This story is somewhat better in VS 2005, but there still is no notion of an “incremental build” against an assembly dependency. So if the assembly has changed at all, no matter how minor the change, all of the dependent projects will have to be rebuilt. This can be a big performance hit on your build times.
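The cascade described above follows from a plain timestamp comparison, which can be sketched in a few lines of Python. This is a simplified model for illustration, not how Visual Studio actually decides what to build; the function name and the timestamps are made up.

```python
def needs_rebuild(output_mtime, input_mtimes):
    """A target is out of date if it is missing or older than any of its inputs."""
    if output_mtime is None:
        return True
    return any(t > output_mtime for t in input_mtimes)


# Hypothetical timestamps: the application was last built at time 150.
app_built_at = 150
app_source_changed_at = 100

# The referenced library was recompiled at time 200, even though its own
# source did not change, so its output timestamp moved forward anyway.
library_built_at = 200

app_is_stale = needs_rebuild(app_built_at, [app_source_changed_at, library_built_at])
```

Because the library's timestamp always moves forward when it is compiled, every project that references it looks stale, and so do the projects that reference those in turn.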

How does Microsoft get around this? One answer is faster, bigger hardware, but a more practical one is the concept of the Central Build Team doing these large, time-consuming builds—to make sure everything works and plays well. The developers would then use file references in their own private solution file to decrease build times. If the code changes in the referenced files, the IDE will not automatically rebuild it, so they might not get all the current changes. This can be problematic at best.

Here are some different ideas to get around this problem:

  • Use a single solution file that contains all your .NET projects for a daily build, and keep that solution file checked into source control that the CBT owns. The Central Build Team members are the only ones allowed to make changes to the “golden” solution, and the developers put in a request to the CBT if they need to add or remove projects.

  • Each developer has a “private” solution file on his machine (which I’m willing to bet he has already) that he does not check in to source control. This allows him to have faster build times and use file references instead of project references, thus avoiding long rebuild times.

  • Another option is to break up the big single solution file and have each component team check in a “master” solution file that the CBT owns. This would take more time to set up but is probably the best way to build .NET projects.

Summary

This chapter started with a crash course in .NET, in which I pointed out the parts of the Framework most relevant to the build team. It probably seems like we took an awfully long road to get to the two main points of this chapter: delay signing tips and how many solution files you should be using to build. In fact, the only other major component of the .NET Framework that we did not talk about is the Global Assembly Cache (GAC).

Recommendations

You will find that building projects for the .NET Framework is a bit different from the classic “unmanaged code builds” that have been around since before Web services were ever dreamed up. In this chapter, I went over the parts of building .NET code that tend to trip people up; the following is a quick list of recommendations:

  • If you build managed code, learn the basic terms of the .NET Framework and the compilation process explained in this chapter.

  • Use delayed signing when developing your project to avoid having to sign the assemblies in conjunction with your daily build.

  • Understand the risk of exposing your developers’ machines to external attacks because of the skip verification list that is created when you delay sign.

  • Decide what is the most practical way of setting up your solution files for your .NET projects. Then enforce whatever policy you come up with through your CBT.
