CHAPTER 36

Execution-Time Code Generation

If you come from a C++ background, it’s likely you have a very “compile-time” view of the world. Because a C++ compiler does all code generation when the code is compiled, C++ programs are static systems that are fully known at compile time.

The Common Language Runtime provides a different way of doing things. The compile-time world still exists, but it’s also possible to build dynamic systems where new code is added by loading assemblies or even by writing custom code on the fly.

Loading Assemblies

In the .NET Common Language Runtime, it’s possible to load an assembly from disk and to create instances of classes from that assembly. To demonstrate this, I’ll show how to build a simple logging facility that can be extended by the customer at runtime to send informational messages elsewhere.

The first step is to define the standard part of the facility.

using System;
using System.Collections.Generic;

public interface ILogger
{
    void Log(string message);
}
public class LogDriver
{
    List<ILogger> m_loggers = new List<ILogger>();
    public LogDriver()
    {
    }
    public void AddLogger(ILogger logger)
    {
       m_loggers.Add(logger);
    }
    public void Log(string message)
    {
       foreach (ILogger logger in m_loggers)
       {
            logger.Log(message);
       }
    }
}
public class LogConsole : ILogger
{
    public void Log(string message)
    {
       Console.WriteLine(message);
    }
}

You first define the ILogger interface that your loggers will implement and the LogDriver class that calls all the registered loggers whenever the Log() function is called. There’s also a LogConsole implementation that logs messages to the console. This file is compiled to an assembly named LogDriver.dll.

In addition to this file, there’s a small class to exercise the loggers.

class Test
{
    public static void Main()
    {
       LogDriver logDriver = new LogDriver();
       logDriver.AddLogger(new LogConsole());
       logDriver.Log("Log start: " + DateTime.Now.ToString());
       for (int i = 0; i < 5; i++)
       {
            logDriver.Log("Operation: " + i.ToString());
       }
       logDriver.Log("Log end: " + DateTime.Now.ToString());
    }
}

This code merely creates a LogDriver, adds a LogConsole to the list of loggers, and does some logging.

Making It Dynamic

It’s now time to add some dynamic ability to your system. A mechanism is needed so the LogDriver class can discover loggers in assemblies it does not know about at compile time. To keep the sample simple, the code will look for assemblies named LogAddIn*.dll.

The first step is to come up with another implementation of ILogger. The LogAddInToFile class logs messages to logger.log and lives in LogAddInToFile.dll.

// file=LogAddInToFile.cs
// compile with: csc /r:..\logdriver.dll /target:library logaddintofile.cs
using System;
using System.Collections;
using System.IO;
public class LogAddInToFile: ILogger
{
    StreamWriter streamWriter;
    public LogAddInToFile()
    {
       streamWriter = File.CreateText(@"logger.log");
       streamWriter.AutoFlush = true;
    }
    public void Log(string message)
    {
       streamWriter.WriteLine(message);
    }
}

This class doesn’t require much explanation. Next, the code to load the assembly needs to be added to the LogDriver class.

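    // These methods are added to the LogDriver class; the file also needs
    // "using System.IO;" and "using System.Reflection;" at the top.
    public void ScanDirectoryForLoggers()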
    {
       DirectoryInfo dir = new DirectoryInfo(@".");
       foreach (FileInfo f in dir.GetFiles(@"LogAddIn*.dll"))
       {
            ScanAssemblyForLoggers(f.FullName);
       }
    }
    void ScanAssemblyForLoggers(string filename)
    {
       Assembly a = Assembly.LoadFrom(filename);
       foreach (Type t in a.GetTypes())
       {
            if (t.GetInterface("ILogger") != null)
            {
                ILogger iLogger = (ILogger) Activator.CreateInstance(t);
                m_loggers.Add(iLogger);
            }
       }
    }
}

The ScanDirectoryForLoggers() function looks in the current directory for any files that match the expected file name pattern (LogAddIn*.dll). When one of the files is found, ScanAssemblyForLoggers() is called. This function loads the assembly and then iterates through each of the types contained in the assembly. If the type implements the ILogger interface, then an instance of the type is created using Activator.CreateInstance(), the instance is cast to the interface, and the resulting reference is added to the list of loggers.
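The test in ScanAssemblyForLoggers() matches on the interface name alone, and Activator.CreateInstance() throws for abstract classes and for types without a public parameterless constructor. The following is a minimal sketch of a stricter filter, not the chapter's implementation. The names LoggerScanner, CreateLoggers(), and LogBase are hypothetical, and the sketch scans the running assembly so that it needs no add-in file on disk.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public interface ILogger
{
    void Log(string message);
}

public class LogConsole : ILogger
{
    public void Log(string message)
    {
        Console.WriteLine(message);
    }
}

// Hypothetical abstract logger: it implements ILogger but cannot be instantiated.
public abstract class LogBase : ILogger
{
    public abstract void Log(string message);
}

// Hypothetical helper, introduced only for this illustration.
public static class LoggerScanner
{
    public static List<ILogger> CreateLoggers(Assembly assembly)
    {
        List<ILogger> loggers = new List<ILogger>();
        foreach (Type t in assembly.GetTypes())
        {
            // Compare against the ILogger type itself rather than its name.
            bool implementsILogger = typeof(ILogger).IsAssignableFrom(t);
            // Skip interfaces, abstract classes, and types that lack a
            // public parameterless constructor.
            bool canCreate = t.IsClass && !t.IsAbstract
                && t.GetConstructor(Type.EmptyTypes) != null;
            if (implementsILogger && canCreate)
            {
                loggers.Add((ILogger)Activator.CreateInstance(t));
            }
        }
        return loggers;
    }
}

class ScannerDemo
{
    static void Main()
    {
        Assembly current = Assembly.GetExecutingAssembly();
        foreach (ILogger logger in LoggerScanner.CreateLoggers(current))
        {
            logger.Log("Found logger: " + logger.GetType().Name);
        }
    }
}
```

A type-based check like this only succeeds when the add-in was compiled against the same LogDriver.dll that the host has loaded, because the runtime treats identically named interfaces from different assemblies as different types.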

If an even more dynamic implementation is desirable, a FileSystemWatcher object could be used to watch a specific directory, and any assemblies copied to that directory could then be loaded.
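As a rough sketch of the directory-watching idea, the code below uses System.IO.FileSystemWatcher to report new files that match the add-in pattern. The AddInWatcher class and its Watch() method are hypothetical names. The callback only prints the path here; in the logging sample it would presumably call the code that loads the assembly.

```csharp
using System;
using System.IO;

// Hypothetical helper, introduced only for this illustration.
public static class AddInWatcher
{
    // Calls onAddInFound with the full path of each newly created
    // LogAddIn*.dll file. The caller keeps the returned watcher alive
    // and disposes it to stop watching.
    public static FileSystemWatcher Watch(string directory, Action<string> onAddInFound)
    {
        FileSystemWatcher watcher = new FileSystemWatcher(directory, "LogAddIn*.dll");
        watcher.Created += (sender, e) => onAddInFound(e.FullPath);
        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}

class WatcherDemo
{
    static void Main()
    {
        using (AddInWatcher.Watch(".", path => Console.WriteLine("New add-in: " + path)))
        {
            Console.WriteLine("Watching for add-ins; press Enter to stop.");
            Console.ReadLine();
        }
    }
}
```

Two practical points apply to any real use. The Created event can fire before the copy has finished, so loading may need to be retried. The handler also runs on a thread other than the one that calls Log(), so the list of loggers would need synchronization.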

There are a few caveats when loading assemblies from a disk. First, the runtime locks the assembly file when it is loaded, so the file cannot be replaced or deleted while it is in use. Second, it’s not possible to unload a single assembly, so if unloading an assembly is required (to update a class, for example), it will need to be loaded in a separate application domain since application domains can be unloaded. For more information on application domains, consult the .NET Common Language Runtime documentation.

Code Generation at Runtime

If your code contains a complex algorithm and the performance depends on the speed of that algorithm, it may at times be more efficient to create a custom version of the algorithm and use that instead of the general-purpose one.

Note  The XML serializer and the regular expression engine are two places that use this technique to improve execution speed.

This technique involves three steps: expressing the custom version of the algorithm in code, translating that version into executable code and loading it, and calling the custom version of the code.

Expressing the Algorithm in Code

The algorithm can be expressed at two different levels: either at the C# source code level or at the .NET Intermediate Language (IL) level.

The code can be written in C# by simply creating a file with the C# source or by using the classes in the System.CodeDom namespace. Using the CodeDom namespace requires you to think like the compiler and parse the code to create the CodeDom constructs. It is usually easier to create the C# source directly.

To express the code directly in .NET IL, you can either use the classes in the System.Reflection.Emit namespace to build a complete assembly or, if the amount of code is small, use the DynamicMethod class from the same namespace.

It is significantly more work to express an algorithm in .NET IL, since you must do the work of the compiler. If you choose to take this approach, my recommendation is to first write your code in C# and compile it, use the ILDasm utility to view the resulting IL, and then base your IL on that code. Be sure to compile in release mode, because the debug mode IL contains extraneous information.

Translating and Loading the Code

If you have created the .NET IL directly, this step is pretty much done for you; you have a chunk of IL in hand, and it’s fairly straightforward to get a delegate from it that you can use.
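To make the IL route concrete, here is a minimal sketch that uses DynamicMethod to emit a method that cubes its argument and then wraps it in a delegate. The DynamicCuber class and the method name "Cube" are hypothetical; the IL sequence is what you would expect to see for x * x * x.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical helper, introduced only for this illustration.
public static class DynamicCuber
{
    public static Func<double, double> CreateCuber()
    {
        // A dynamic method that takes one double and returns a double.
        DynamicMethod method = new DynamicMethod(
            "Cube", typeof(double), new Type[] { typeof(double) });

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);   // push x
        il.Emit(OpCodes.Ldarg_0);   // push x
        il.Emit(OpCodes.Mul);       // x * x
        il.Emit(OpCodes.Ldarg_0);   // push x
        il.Emit(OpCodes.Mul);       // (x * x) * x
        il.Emit(OpCodes.Ret);       // return the value on the stack

        // Completing the method yields a strongly typed delegate.
        return (Func<double, double>)method.CreateDelegate(typeof(Func<double, double>));
    }
}

class DynamicCuberDemo
{
    static void Main()
    {
        Func<double, double> cube = DynamicCuber.CreateCuber();
        Console.WriteLine(cube(3.0));   // prints 27
    }
}
```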

If you created C# code, you will need to use the CodeDomProvider.CompileAssemblyFromFile() method; this performs the compilation and loads the resulting assembly. You can then use reflection to create an instance of the class.
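The sketch below shows the compile-and-load step with the CodeDOM classes. It assumes the .NET Framework, where CSharpCodeProvider can invoke the compiler, and it uses CompileAssemblyFromSource(), the string-based sibling of CompileAssemblyFromFile(), so that no source file has to exist on disk. The CodeDomCuber class and the generated Cuber class are hypothetical names.

```csharp
using System;
using System.CodeDom.Compiler;
using System.Reflection;
using Microsoft.CSharp;

// Hypothetical helper, introduced only for this illustration.
public static class CodeDomCuber
{
    public static Func<double, double> CreateCuber()
    {
        string source =
            "public static class Cuber " +
            "{ public static double Cube(double x) { return x * x * x; } }";

        CompilerParameters parameters = new CompilerParameters();
        parameters.GenerateInMemory = true;   // load the result instead of keeping a DLL

        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        {
            CompilerResults results =
                provider.CompileAssemblyFromSource(parameters, source);
            if (results.Errors.HasErrors)
            {
                throw new InvalidOperationException(
                    "Compilation failed: " + results.Errors[0].ErrorText);
            }

            // Reflection locates the method once; callers then use the delegate.
            MethodInfo method = results.CompiledAssembly.GetType("Cuber").GetMethod("Cube");
            return (Func<double, double>)
                Delegate.CreateDelegate(typeof(Func<double, double>), method);
        }
    }
}

class CodeDomCuberDemo
{
    static void Main()
    {
        Func<double, double> cube = CodeDomCuber.CreateCuber();
        Console.WriteLine(cube(2.0));   // prints 8
    }
}
```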

Calling the Code

Methods can be invoked in three ways.

  • Using reflection and method.Invoke()
  • Getting a delegate to a method and calling through that
  • Getting an interface to a class and calling through an interface method

The first approach is fairly slow; the second and third approaches are at least ten times faster.
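The three calling styles can be compared side by side. In this sketch the ICuber interface and Cuber class are hypothetical stand-ins for a type that was generated or loaded at runtime.

```csharp
using System;
using System.Reflection;

// Hypothetical interface and class standing in for dynamically loaded code.
public interface ICuber
{
    double Cube(double x);
}

public class Cuber : ICuber
{
    public double Cube(double x)
    {
        return x * x * x;
    }
}

class InvokeDemo
{
    static void Main()
    {
        // Treat the type as if it had been discovered at runtime.
        Type type = typeof(Cuber);
        object instance = Activator.CreateInstance(type);
        MethodInfo method = type.GetMethod("Cube");

        // 1. Reflection: the argument and the result are boxed on every call.
        double viaReflection = (double)method.Invoke(instance, new object[] { 2.0 });

        // 2. Delegate: bind to the method once, then call it like any other.
        Func<double, double> cube = (Func<double, double>)
            Delegate.CreateDelegate(typeof(Func<double, double>), instance, method);
        double viaDelegate = cube(2.0);

        // 3. Interface: cast once, then make ordinary interface calls.
        ICuber cuber = (ICuber)instance;
        double viaInterface = cuber.Cube(2.0);

        // prints 8 8 8
        Console.WriteLine("{0} {1} {2}", viaReflection, viaDelegate, viaInterface);
    }
}
```

In all three cases reflection is needed once to find or create the object; the difference is whether it is also used on every call.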

Design Guidelines

It’s much easier to work at the C# code level, but if you do this, you have to pay the cost of compiling the C# code into IL, which is on the order of half a second. If you use the code often and/or the speedup is significant, this may not matter, but for small chunks of code, the cost may be prohibitive.

When calling code, make sure to use a strongly typed interface or delegate; using reflection (or dynamic typing) will be much slower.

The C# Compiler As a Service

None of the described ways of generating code is particularly straightforward or elegant. What you would really like is a way to hook into the C# compiler and have it do the hard work for you. Unfortunately, the C# compiler (including C# 5.0) is written in C++, and there is no way to hook into it from C#.

There is, however, a project (code name Roslyn) to rewrite the C# compiler in C#¹ and ship that version as C# 6.0. Preview versions of that compiler are available, which are interesting to explore.²

Roslyn provides many new capabilities; you can reach into the compiler at any of the compilation phases—parsing, binding, and IL generation—and get the current state and even modify it. I will limit this exploration to using Roslyn to generate code on the fly.

Note  This example uses prerelease code. It is likely that some of the details will change before this version of C# is shipped.

You will be writing code to cube (raise to the third power) numbers. Consider the following code:

class CuberGenerator
{
    static string CubedExpression(double x)
    {
       return x.ToString() + " * " + x.ToString() + " * " + x.ToString();
    }
    public static Func<double> GetCuber(double x)
    {
       string program =
            @"class Cuber {public static double Cubed() { return " +
            CubedExpression(x) +
            "; } }";
       Console.WriteLine(program);
       SyntaxTree tree = SyntaxTree.ParseCompilationUnit(program);
       Compilation compilation = Compilation.Create(
                      "CuberGenerator.dll",
                      new CompilationOptions(OutputKind.DynamicallyLinkedLibrary),
                      new[] { tree },
                      new[] { new AssemblyFileReference(typeof(object).Assembly.Location) });
       Assembly compiledAssembly;
       using (var stream = new MemoryStream())
       {
            EmitResult compileResult = compilation.Emit(stream);
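            // ToArray() copies only the bytes written; GetBuffer() returns the
            // whole underlying buffer, including unused capacity.
            compiledAssembly = Assembly.Load(stream.ToArray());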
       }
       Type cuber = compiledAssembly.GetType("Cuber");
       MethodInfo getValue = cuber.GetMethod("Cubed");
       Func<double> cubedValue =
                (Func<double>)Delegate.CreateDelegate(typeof(Func<double>), getValue);
       return cubedValue;
    }
}

You start by creating the class that will perform the operation as a string. The string is then passed to SyntaxTree.ParseCompilationUnit(), which is equivalent to running the first part of the C# compiler over the class you’ve created. That generates a SyntaxTree, which contains a parsed version of the program. The syntax tree is then passed to Compilation.Create(), and calling Emit() on the resulting compilation performs the second part of compilation, converting the syntax tree to .NET IL that can be executed by the runtime. The emitted bytes are then loaded into an Assembly instance, and reflection is used to locate the method to be called. This can then be called using the following code:

Func<double> cuber = CuberGenerator.GetCuber(7);
Console.WriteLine(cuber());
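For comparison, here is a sketch of the same idea written against the type names used by the released Roslyn packages. It assumes a reference to the Microsoft.CodeAnalysis.CSharp NuGet package, which is not part of the prerelease sample above. The ReleasedCuberGenerator class is a hypothetical name. The sketch also formats the number with the invariant culture, so that the generated source does not depend on the current culture's decimal separator, and it checks the emit result before loading.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Reflection;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Emit;

// Hypothetical helper, introduced only for this illustration.
public static class ReleasedCuberGenerator
{
    public static Func<double> GetCuber(double x)
    {
        // The "d" suffix makes the literal a double, for example "7d".
        string value = x.ToString("R", CultureInfo.InvariantCulture) + "d";
        string program =
            "public static class Cuber { public static double Cubed() { return " +
            value + " * " + value + " * " + value + "; } }";

        // Parse the source text into a syntax tree.
        SyntaxTree tree = CSharpSyntaxTree.ParseText(program);

        // Describe the compilation: name, trees, references, and options.
        CSharpCompilation compilation = CSharpCompilation.Create(
            "CuberGenerated",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        using (MemoryStream stream = new MemoryStream())
        {
            // Emit() binds the code and writes the IL to the stream.
            EmitResult result = compilation.Emit(stream);
            if (!result.Success)
            {
                throw new InvalidOperationException(
                    "Compilation failed: " + result.Diagnostics[0]);
            }

            Assembly assembly = Assembly.Load(stream.ToArray());
            MethodInfo method = assembly.GetType("Cuber").GetMethod("Cubed");
            return (Func<double>)Delegate.CreateDelegate(typeof(Func<double>), method);
        }
    }
}

class ReleasedCuberDemo
{
    static void Main()
    {
        Func<double> cuber = ReleasedCuberGenerator.GetCuber(7);
        Console.WriteLine(cuber());   // prints 343
    }
}
```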

This example is, of course, contrived; it makes no sense to generate custom code just to cube a number. The technique is more reasonable when a general implementation of an algorithm is much slower than one specialized for the values at hand.

The SyntaxTree that was generated can also be used for source analysis; if you want to determine how many times a specific type is used as a method parameter, you can do that by traversing the syntax tree.

1 And to rewrite the VB .NET compiler in VB .NET.

2 The previews can run on both Visual Studio 2010 and Visual Studio 2012. Search for “download Roslyn CTP.”
