Chapter 10 Code Safety

There is a well-worn adage in software engineering that you should first make your code correct, then make it fast. Most of this book has focused strictly on performance, but this chapter is a brief aside into some important topics that, while not strictly related to performance, may help you in your pursuit of high-performance, scalable applications. By adopting good practices that ensure the stability and reliability of your code, you free yourself to make more drastic changes for performance’s sake. When problems do occur, you will more easily narrow down the location of the issue.

Understand the Underlying OS, APIs, and Hardware

Heavy performance optimization is going to defy any abstractions you want to impose on your software. As mentioned numerous times in this book, you must understand the APIs you call in order to make intelligent decisions about how to use them, or whether to use them at all.

That is not enough, however. Take threading, for example. While various versions of the .NET Framework have added abstractions on top of threads that make asynchronous programming easier, taking full advantage of them requires you to understand how these features interact with the underlying OS threads and the OS scheduling algorithm. The same is true for debugging memory problems. The GC heap is remarkably simple to inspect, but if you have a huge process that loads thousands of types from hundreds of assemblies, you may run into problems outside of the pure managed world, which will require you to understand a process’s full memory layout.

Finally, the hardware is just as important. In the chapter on JIT, I mentioned things like locality of reference: putting bits of code and data that are used together physically near each other in memory so that they can be efficiently included in a processor’s cache. If you are lucky, your code will target a single hardware platform. If not, then you need to understand how each platform executes code differently. You may have different memory limits or different cache sizes, or even more substantial differences such as completely different memory models.
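To make locality of reference concrete, here is a minimal sketch. The LocalityDemo class and its method names are invented for illustration. Both methods compute the same sum over a rectangular array, but the first reads memory in the order the array is laid out, while the second jumps a full row between consecutive reads. On most hardware you would expect the first to be friendlier to the cache for large arrays, but the size of the difference depends on the platform, so measure rather than assume.

```csharp
using System;

// Hypothetical demo type, not part of the book's sample code.
public static class LocalityDemo
{
    // Reads elements in layout order: C# rectangular arrays are stored
    // row by row, so consecutive reads touch adjacent memory.
    public static long SumRowMajor(int[,] grid)
    {
        long total = 0;
        int rows = grid.GetLength(0);
        int cols = grid.GetLength(1);
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                total += grid[r, c];
            }
        }
        return total;
    }

    // Same result, but consecutive reads are one full row apart,
    // which tends to waste cache lines on large arrays.
    public static long SumColumnMajor(int[,] grid)
    {
        long total = 0;
        int rows = grid.GetLength(0);
        int cols = grid.GetLength(1);
        for (int c = 0; c < cols; c++)
        {
            for (int r = 0; r < rows; r++)
            {
                total += grid[r, c];
            }
        }
        return total;
    }
}
```

If you time the two methods on a large array, do it on each hardware platform you target, since cache sizes and line lengths differ.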

Restrict API Usage in Certain Areas of Your Code

There is no reason why you should allow all components to use the full breadth of every Framework and system API. For example, if you have a strict Task-based processing model, then centralize that functionality and prohibit any other components from accessing anything in the System.Threading namespace.
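One way to centralize a Task-based model might look like the following sketch. WorkDispatcher, Enqueue, and Drain are hypothetical names, not types from the Framework or from the book's sample code. The idea is that this one class is the only place that references System.Threading.Tasks, and every other component hands it plain delegates.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical platform-owned dispatcher: the only type that is
// allowed to reference System.Threading.* directly.
public static class WorkDispatcher
{
    private static readonly object Sync = new object();
    private static readonly List<Task> Pending = new List<Task>();

    // Callers pass plain delegates, so they need no using directive
    // for System.Threading or System.Threading.Tasks.
    public static void Enqueue<T>(Func<T> work, Action<T> onCompleted)
    {
        Task task = Task.Run(() => onCompleted(work()));
        lock (Sync)
        {
            Pending.Add(task);
        }
    }

    // Blocks until everything queued so far has finished.
    public static void Drain()
    {
        Task[] snapshot;
        lock (Sync)
        {
            snapshot = Pending.ToArray();
            Pending.Clear();
        }
        Task.WaitAll(snapshot);
    }
}
```

A component would call WorkDispatcher.Enqueue(() => Compute(), result => Store(result)), where Compute and Store stand in for its own methods. Because all scheduling decisions live in one class, changing them later (throttling, priorities, a different scheduler) does not touch the callers.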

These kinds of rules are particularly important for systems with an extension model. You usually want the platform executing all the hard, dangerous code, while the extensions do simple actions in their respective domain.

An excellent tool for enforcing these rules is FxCop, which is a free static code analysis tool that ships with Visual Studio. It comes with standard rules in categories such as Performance, Globalization, Security, and more, but you can add a library of your own rules. Many of the performance rules we discuss in this book can be represented as FxCop rules, for example:

  • Prohibiting use of “dangerous” namespaces
  • Banning use of Regex, especially if used improperly
  • Banning types or APIs that typically cause LOH allocations
  • Banning APIs that have better alternatives such as TryParse in lieu of Parse
  • Finding instances of double-casting
  • Finding instances of boxing
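As a reminder of what a few of these rules would look for, here is a small sketch. FlaggedPatterns and its method names are made up for illustration. Each pair shows a pattern a custom rule might flag next to an alternative it would accept.

```csharp
using System;

// Hypothetical examples of patterns a custom rule might flag.
public static class FlaggedPatterns
{
    // int.Parse throws on bad input. TryParse reports failure
    // through its return value instead of an exception.
    public static int ParseOrDefault(string text, int fallback)
    {
        int value;
        return int.TryParse(text, out value) ? value : fallback;
    }

    // Double-casting: "is" followed by a cast checks the type twice.
    public static int LengthDoubleCast(object o)
    {
        if (o is string)
        {
            return ((string)o).Length;
        }
        return -1;
    }

    // A single "as" cast plus a null check does the work once.
    public static int LengthSingleCast(object o)
    {
        var s = o as string;
        return s != null ? s.Length : -1;
    }

    // Boxing: passing an int where object is expected allocates a box.
    public static string FormatBoxed(int i)
    {
        return string.Format("{0}", i);
    }

    // Calling ToString on the int directly avoids the box.
    public static string FormatUnboxed(int i)
    {
        return i.ToString();
    }
}
```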

Before you start writing rules, keep in mind that FxCop can only analyze IL and metadata. It has no knowledge of C# or any other high-level language. Because of this, you will not be able to enforce static checks that rely on specific language patterns. Writing your own FxCop rules is easy, but there is little to no official documentation, and you will find yourself relying on analyzing the IL of your programs and making extensive use of IntelliSense to poke through the FxCop API. The more you understand IL, the more complicated rules you can develop.

You will first need to install the FxCop SDK, which is trickier than it should be. If you have Visual Studio Professional or better, then it has been included and rebranded as Code Analysis in the IDE, but it is still FxCop underneath. On my machine, the relevant files are located in C:\Program Files (x86)\Microsoft Visual Studio 11.0\Team Tools\Static Analysis Tools\FxCop.

If you cannot get access to the right version of Visual Studio, there are still a few options. The easiest way is from CodePlex at http://www.writinghighperf.net/go/32. If that project has disappeared by the time you read this, then try the Windows 7.1 SDK, which appears to have a broken web installer now, but you can get the ISO image at http://www.writinghighperf.net/go/33 and extract the installer from Setup\WinSDKNetFxTools\cab1.cab. There is a file inside that archive whose name begins with WinSDK_FxCopSetup.exe. Extract that file, rename it to FxCopSetup.exe, and you are on your way.

In the source code accompanying this book you will find projects related to FxCop. These are in their own solution file to avoid breaking the build for the rest of the sample projects. FxCopRules contains the rules that will be loaded by the FxCop engine and run against some target assembly. FxCopViolator contains a class with a number of violations that the rules will test against. Follow along with these projects as I explain the various components.

Before you can build the rules, you may need to edit the FxCopRules.csproj file to point to the correct SDK path. The current values are:

<PropertyGroup>
  <FxCopSdkDir>C:\Program Files (x86)\Microsoft Fxcop 10.0</FxCopSdkDir>
</PropertyGroup>
<ItemGroup>
  <Reference Include="$(FxCopSdkDir)\FxCopSdk.dll" />
  <Reference Include="$(FxCopSdkDir)\Microsoft.CCi.dll" />
</ItemGroup>

Update the FxCopSdkDir value to point to the FxCop installation directory, or wherever you have placed the appropriate DLLs.

Next, you will need to create a Rules.xml file that contains the metadata for each rule. Our first rule will look like this:

<?xml version="1.0" encoding="utf-8" ?>
<Rules FriendlyName="Custom Rules">
  <Rule TypeName="DisallowStaticFieldsRule"
        Category="Custom.Arbitrary"
        CheckId="HP100">
    <Name>Static fields are not allowed</Name>
    <Description>Static fields are not allowed because they lead to problems with thread safety.</Description>
    <Url>http://internaldocumentationsite/FxCop/HP100</Url>
    <Resolution>Make the static field '{0}' either readonly or const.</Resolution>
    <MessageLevel Certainty="90">Error</MessageLevel>
    <FixCategories>Breaking</FixCategories>
    <Email>[email protected]</Email>
    <Owner>Ben Watson</Owner>
  </Rule>
</Rules>

Note that the TypeName attribute must match the name of the rule class that we define next. This XML file must be included in the project with the Build Action set to Embedded Resource.

Each rule we define must derive from a class provided by the FxCop SDK and include some common information, such as the location of the XML rules manifest. To make this more convenient, it is a good idea to create a base class for all of your rules that provides this common functionality.

using Microsoft.FxCop.Sdk;
using System.Reflection;

namespace FxCopRules
{
    public abstract class BaseCustomRule : BaseIntrospectionRule
    {
        // The manifest name is the default namespace plus the name
        // of the XML rules file, without the extension.
        private const string ManifestName = "FxCopRules.Rules";

        // The assembly where the rule manifest is
        // embedded (the current assembly in our case).
        private static readonly Assembly ResourceAssembly =
            typeof(BaseCustomRule).Assembly;

        protected BaseCustomRule(string ruleName)
            : base(ruleName, ManifestName, ResourceAssembly)
        {
        }
    }
}

Next, define a class that derives from BaseCustomRule that will be for a specific violation you want to check. The first example will disallow all static fields, but allow const and readonly fields.

public class DisallowStaticFieldsRule : BaseCustomRule
{
    public DisallowStaticFieldsRule()
        : base(typeof(DisallowStaticFieldsRule).Name)
    {
    }

    public override ProblemCollection Check(Member member)
    {
        var field = member as Field;
        if (field != null)
        {
            // Find all static data that isn't const or readonly
            if (field.IsStatic && !field.IsInitOnly && !field.IsLiteral)
            {
                // field.FullName is an optional argument that will be used
                // to format the Resolution string's {0} parameter.
                var resolution = this.GetResolution(field.FullName);
                var problem = new Problem(resolution, field.SourceContext);
                this.Problems.Add(problem);
            }
        }
        return this.Problems;
    }
}

The BaseCustomRule class inherits a number of virtual Check method overloads with various types of arguments, which you can override to provide your functionality (by default, these methods do nothing). IntelliSense is your friend while writing FxCop rules, and in addition to the Check(Member member) overload used above, it reveals the following Check methods:

  • Check(ModuleNode moduleNode)
  • Check(Parameter parameter)
  • Check(Resource resource)
  • Check(TypeNode typeNode)
  • Check(string namespaceName, TypeNodeCollection types)

You can also examine individual lines of IL code from any method. Here’s a rule that prohibits string case conversion.

public class DisallowStringCaseConversionRule : BaseCustomRule
{
    public DisallowStringCaseConversionRule()
        : base(typeof(DisallowStringCaseConversionRule).Name)
    { }

    public override ProblemCollection Check(Member member)
    {
        var method = member as Method;
        if (method != null)
        {
            foreach (var instruction in method.Instructions)
            {
                if (instruction.OpCode == OpCode.Call
                    || instruction.OpCode == OpCode.Calli
                    || instruction.OpCode == OpCode.Callvirt)
                {
                    // The cast yields null if the operand is not a
                    // Method, so check before dereferencing it.
                    var targetMethod = instruction.Value as Method;
                    if (targetMethod != null
                        && (targetMethod.FullName == "System.String.ToUpper"
                            || targetMethod.FullName == "System.String.ToLower"))
                    {
                        var resolution = this.GetResolution(method.FullName);
                        var problem = new Problem(resolution,
                                                  method.SourceContext);
                        this.Problems.Add(problem);
                    }
                }
            }
        }

        return this.Problems;
    }
}
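The rule above only says what is banned. What callers should do instead depends on why they were converting case. Assuming the conversion exists only to compare strings without regard to case, one common alternative is sketched below. The CaseInsensitive class and SameText method are hypothetical names used for illustration.

```csharp
using System;

// Hypothetical helper showing a comparison that does not allocate
// upper-cased or lower-cased copies of its inputs.
public static class CaseInsensitive
{
    public static bool SameText(string a, string b)
    {
        // Instead of a.ToUpper() == b.ToUpper(), which creates
        // two temporary strings just to compare them.
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }
}
```

Ordinal comparison is an assumption here. If the strings are user-facing text, a culture-aware StringComparison value may be the right choice instead.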

For a final example, let’s look at a different way to tell FxCop to traverse the code. In addition to the Check methods described previously, you can override dozens of Visit* methods. These are called in a recursive descent through every node in the program graph, starting at the node you pick. You override just the Visit methods you need. Here’s an example that uses this to add a rule against instantiating a Thread object:

public class DisallowThreadCreationRule : BaseCustomRule
{
    public DisallowThreadCreationRule()
        : base(typeof(DisallowThreadCreationRule).Name)
    { }

    public override ProblemCollection Check(Member member)
    {
        var method = member as Method;
        if (method != null)
        {
            VisitStatements(method.Body.Statements);
        }

        // Return the problems collected by VisitConstruct,
        // as the other rules do.
        return this.Problems;
    }

    public override void VisitConstruct(Construct construct)
    {
        if (construct != null)
        {
            var binding = construct.Constructor as MemberBinding;
            if (binding != null)
            {
                // The cast yields null if the bound member is not an
                // instance constructor, so check before dereferencing it.
                var instanceInitializer =
                    binding.BoundMember as InstanceInitializer;
                if (instanceInitializer != null
                    && instanceInitializer.DeclaringType.FullName
                        == "System.Threading.Thread")
                {
                    var problem = new Problem(this.GetResolution(),
                                              construct.SourceContext);
                    this.Problems.Add(problem);
                }
            }
        }

        base.VisitConstruct(construct);
    }
}

It is pretty straightforward once you learn how it works. The biggest obstacle to creating your own rules is really the lack of documentation. To learn more about custom FxCop rules, read an excellent walkthrough by Jason Kresowaty at http://www.writinghighperf.net/go/34.

Centralize and Abstract Performance-Sensitive and Difficult Code

You should keep particularly difficult or performance-sensitive code centralized for easy maintenance and to prevent the rest of the system from making performance mistakes. This goes further than the well-known DRY (Don’t Repeat Yourself) principle, which says only that the same code should not exist in two locations and should be refactored into a single copy that is reusable from multiple locations.

Keep as much performance-sensitive code as possible in one place, preferably behind APIs that the rest of your application uses. For example, if your application downloads files via HTTP, you could wrap this in an API that exposes only the parts of downloading that the rest of your program needs to know (e.g., the URL you are requesting and the downloaded content). The API manages the complexity of the HTTP call, and your entire application goes through that API every time it needs to make an HTTP call. If you discover a performance problem with downloading, need to enforce a download queue, or need to make any other change, it is trivial to do behind the API. Remember that those APIs need to maintain the asynchronous nature of the operation.
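A minimal sketch of such a download API follows. IDownloader and HttpDownloader are hypothetical names, and the choice of HttpClient is an assumption; the text does not prescribe a particular HTTP stack. Callers see only a URL going in and content coming out, and the method returns a Task so the operation stays asynchronous.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical abstraction: the rest of the application depends only
// on this interface, never on the HTTP stack behind it.
public interface IDownloader
{
    Task<byte[]> DownloadAsync(Uri url);
}

public sealed class HttpDownloader : IDownloader
{
    // A single shared client so connections can be reused.
    private static readonly HttpClient Client = new HttpClient();

    public Task<byte[]> DownloadAsync(Uri url)
    {
        if (url == null)
        {
            throw new ArgumentNullException("url");
        }

        // A download queue, retries, or caching could be added here
        // later without changing any caller.
        return Client.GetByteArrayAsync(url);
    }
}
```

A caller would write something like byte[] content = await downloader.DownloadAsync(someUri), where downloader and someUri are its own variables, and would never touch HttpClient itself.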

Isolate Unmanaged or Unsafe Code

For many reasons, you should move away from unmanaged code if at all possible. As discussed in the introduction, the benefits of unmanaged code are often exaggerated, but the danger of memory corruption is all too real.

That said, if you have to keep any unmanaged code around (say, to talk to a legacy system, and it is too expensive to move the entire interface to the managed world), then isolate it well. There are many ways to do the isolation, but you absolutely want to avoid having random bits of your system call into unmanaged code all over the place. This is a recipe for chaos.

Ideally, split the unmanaged code into its own process to provide strict OS-level isolation. If that is not possible and you need the unmanaged code to be loaded into the same process, try to keep it in as few DLLs as possible and have all calls to it go through a centralized API that can enforce standard safeguards.

Treat managed code that is marked unsafe exactly like unmanaged code and isolate it to as small a scope as you can. You will also need to enable unsafe code in the project settings.
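As an illustration of keeping unsafe code in a small scope, consider the sketch below. UnsafeHelpers and its Sum method are invented for this example, and summing an array is not something that needs pointers; it only stands in for whatever operation genuinely does. The point is the shape: one small method contains the unsafe block, validates its input, and exposes a completely safe signature. It compiles only with unsafe code enabled for the project.

```csharp
using System;

// Hypothetical helper: the only place in the code base that uses
// pointers. Everything outside sees an ordinary, safe method.
public static class UnsafeHelpers
{
    public static unsafe int Sum(int[] values)
    {
        // Safeguard enforced at the boundary, before any pointer use.
        if (values == null)
        {
            throw new ArgumentNullException("values");
        }

        int total = 0;
        // Pin the array so the GC cannot move it while we hold a pointer.
        fixed (int* p = values)
        {
            // The loop bound comes from the managed array length,
            // so the pointer never reads past the end.
            for (int i = 0; i < values.Length; i++)
            {
                total += p[i];
            }
        }
        return total;
    }
}
```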

Prefer Code Clarity to Performance Until Proven Otherwise

Code readability and maintainability are more important than performance until proven otherwise. If you find you do need to make deep changes for performance reasons, make them as transparent as possible to the code above, and keep that higher level as clear as you can.

Once you do make the code worse to read in favor of performance, make sure you document in the code why you are doing it so that someone does not come by after you and “clean up” your elegant optimization by making it simpler.
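Such a comment might look like the sketch below. The Stats class and the profiling scenario it describes are hypothetical, made up to show the form of the comment rather than to report a real measurement.

```csharp
using System;

public static class Stats
{
    // PERF (hypothetical example): this is deliberately a manual loop
    // rather than a LINQ call. Suppose profiling had shown this method
    // on a hot path where enumerator overhead mattered. Do not
    // "simplify" it without measuring again.
    public static long Sum(int[] values)
    {
        if (values == null)
        {
            throw new ArgumentNullException("values");
        }

        long total = 0;
        for (int i = 0; i < values.Length; i++)
        {
            total += values[i];
        }
        return total;
    }
}
```

A good comment of this kind says what the simpler version would be, why it was rejected, and what evidence would justify changing it back.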

Summary

To ensure your code is safe, you must understand the implementation details at all levels. Isolate your riskiest code, especially native or unsafe code, to specific modules to limit exposure. Ban problematic APIs and coding patterns and enforce reasonable code standards to encourage safe practices. Enforce these practices with FxCop or other static analysis build tools. Do not sacrifice code clarity or maintainability for performance unless it is particularly justified.
