Part of the strength and flexibility of .NET comes from its compiler platform, which most people know by its project name, Roslyn. With the compiler platform, developers can analyze their code, enforce coding guidelines, and more. Building on Roslyn, Microsoft has also introduced source generators. Source generators leverage Roslyn to generate code at compile time and include that generated code in the compilation.
Roslyn
Developers rely heavily on their tools. Just look at what Visual Studio does to help you write better code, or at the rich ecosystem of Visual Studio extensions. IDE features like IntelliSense and Find All References need an understanding of the current code base; this is typically information that a compiler can provide. Compilers used to be black boxes that took your source code and transformed it into, in our case, intermediate language. With Roslyn, Microsoft aimed to open up the compiler platform and give it an API set that everyone can use to write code-enhancing tools.
Roslyn ships with an SDK. That SDK provides us with all the tools we need to hook into the compiler pipeline. From the SDK we get compiler APIs, diagnostic APIs, scripting APIs, and workspace APIs.
Compiler API
Diagnostic API
The diagnostic API is what gives us the “squiggly lines” in our code. It’s an API that analyzes syntax, assignments, and semantics based on Roslyn analyzers. It generates warnings or errors. This API can be used by linting tools to, for example, fail a build when certain team guidelines are not respected in a pull request.
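To make the idea of a diagnostic concrete, here is a minimal sketch that asks the parser for the diagnostics of a code snippet. It only covers syntax diagnostics produced by the parser, not custom analyzers, and it assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced; the DiagnosticsDemo class and its method name are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper, for illustration only.
public static class DiagnosticsDemo
{
    // Parses the source and returns the ID of every syntax diagnostic
    // the parser reported (for example CS1002 for a missing semicolon).
    public static IReadOnlyList<string> GetSyntaxDiagnosticIds(string source)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);
        return tree.GetDiagnostics()
                   .Select(diagnostic => diagnostic.Id)
                   .ToList();
    }
}
```

Each Diagnostic also carries a severity, a message, and a location, which is the information an IDE needs to draw a squiggle under the offending span.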
Scripting API
The scripting API is part of the compiler layer and can be used to run code snippets as scripts. It is used by, for example, the C# read-evaluate-print loop (REPL) to run snippets of C# against a running assembly.
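As a small illustration of the scripting API, the following sketch evaluates a C# expression given as a string. It assumes the Microsoft.CodeAnalysis.CSharp.Scripting NuGet package is referenced; the ScriptingDemo wrapper is a hypothetical name.

```csharp
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.CSharp.Scripting;

// Hypothetical wrapper, for illustration only.
public static class ScriptingDemo
{
    // Compiles and runs the expression as a script and returns its int result.
    public static Task<int> EvaluateAsync(string expression)
    {
        return CSharpScript.EvaluateAsync<int>(expression);
    }
}
```

Awaiting ScriptingDemo.EvaluateAsync("1 + 2") yields 3.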
Workspace API
The workspace API provides the entry point for code analysis and refactorings over entire solutions. It powers IDE functions like Find All References and Formatting.
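A full solution is usually loaded through an MSBuild-based workspace, but the formatting feature mentioned above can be tried with an in-memory workspace. This is a minimal sketch, assuming the Microsoft.CodeAnalysis.CSharp.Workspaces NuGet package is referenced; WorkspaceDemo is a hypothetical name.

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Formatting;

// Hypothetical helper, for illustration only.
public static class WorkspaceDemo
{
    // Parses the source and runs it through the workspace formatter.
    public static string Format(string source)
    {
        // AdhocWorkspace is an in-memory workspace; no solution file is needed.
        using var workspace = new AdhocWorkspace();
        SyntaxNode root = CSharpSyntaxTree.ParseText(source).GetRoot();
        SyntaxNode formatted = Formatter.Format(root, workspace);
        return formatted.ToFullString();
    }
}
```

The formatter only touches whitespace and line breaks; the tokens of the code stay the same.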
Syntax Tree
A syntax tree contains the full information of the code as typed by the developer, including comments, preprocessor directives, and whitespace.
The exact original code can be reconstructed from a syntax tree. A syntax tree is an immutable construct that was parsed from the original source code. In order to provide the full power of analytics and code refactoring, the syntax tree needs to be able to reproduce the exact code it was parsed from.
Syntax trees are thread-safe and immutable. A syntax tree is a state snapshot of the code. In-framework factory methods make sure that requested changes are pushed back to the code and that a new syntax tree is generated based on the latest state of the source code. Syntax trees have a lot of optimizations in place so that new instances of a tree can be generated very fast with little memory use.
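Both properties, full fidelity and immutability, can be observed in a few lines. The sketch below assumes the Microsoft.CodeAnalysis.CSharp NuGet package; SyntaxTreeDemo and its method names are hypothetical.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper, for illustration only.
public static class SyntaxTreeDemo
{
    // Parses the source and writes the tree back out as text.
    public static string RoundTrip(string source)
    {
        return CSharpSyntaxTree.ParseText(source).GetRoot().ToFullString();
    }

    // Renames the first class in the source. The original root is never
    // modified; ReplaceNode returns a new root instead.
    public static (string Original, string Renamed) RenameFirstClass(string source, string newName)
    {
        CompilationUnitSyntax root = CSharpSyntaxTree.ParseText(source).GetCompilationUnitRoot();
        ClassDeclarationSyntax cls = root.DescendantNodes().OfType<ClassDeclarationSyntax>().First();

        // Keep the whitespace around the old identifier on the new one.
        SyntaxToken identifier = SyntaxFactory.Identifier(
            cls.Identifier.LeadingTrivia, newName, cls.Identifier.TrailingTrivia);

        CompilationUnitSyntax newRoot = root.ReplaceNode(cls, cls.WithIdentifier(identifier));
        return (root.ToFullString(), newRoot.ToFullString());
    }
}
```

RoundTrip returns its input unchanged, comments and unusual spacing included, and the Original value returned by RenameFirstClass still shows the old class name after the rename.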
Roslyn SDK
The Syntax Visualizer is a great visual aid when working with the syntax tree. After installing the .NET Compiler Platform SDK, you will have access to new project templates in Visual Studio.
With these templates, you can build your own analyzers, code fixes, or code refactorings, both as a stand-alone console application and as a Visual Studio extension in the VSIX format. These templates are available for both C# and Visual Basic.
Creating an Analyzer
The logic of detecting MSBuild instances is abstracted away by the Roslyn SDK; all we need to do is call MSBuildLocator.QueryVisualStudioInstances().ToArray() to get a list of versions installed. Let’s empty the Main method and start implementing a code analyzer ourselves.
When analyzing code, we will need a SyntaxTree object. A SyntaxTree holds a parsed representation of a code document. In our example, we will inspect a piece of code and print its using directives to the console. Once we have our syntax tree parsed, we can extract a CompilationUnitSyntax object. This object represents our code document, divided into members, using directives, and attributes.
Generating the syntax tree and compilation unit
We are using a very simple code example to get the point across. We parse the code into a SyntaxTree and extract the compilation unit root from there.
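A minimal sketch of this step could look as follows. The sample code held in the Code constant and the ParseDemo class are hypothetical stand-ins, and the Microsoft.CodeAnalysis.CSharp NuGet package is assumed to be referenced.

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper, for illustration only.
public static class ParseDemo
{
    // A deliberately small piece of code to analyze.
    public const string Code = @"using System;
using System.Text;

namespace Demo
{
    class Program { }
}";

    // Parses the code into a SyntaxTree and returns its compilation unit root.
    public static CompilationUnitSyntax Parse(string code)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(code);
        return tree.GetCompilationUnitRoot();
    }
}
```

For the sample above, the root's Usings collection holds the two using directives and its Members collection holds the single namespace declaration.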
Next we will need a CSharpSyntaxWalker object. A syntax walker is an implementation of the Visitor design pattern. The Visitor pattern describes a way to decouple an object structure from an algorithm; more information on the pattern is found at https://en.wikipedia.org/wiki/Visitor_pattern.
Custom using directive syntax walker
In this example, we are overriding the VisitUsingDirective method from the CSharpSyntaxWalker base class. The base class comes with many overridable methods, each of which visits a specific type of syntax node. The VisitUsingDirective method visits all using directives in our syntax tree. The complete list of methods that can be overridden is found at https://docs.microsoft.com/en-us/dotnet/api/microsoft.codeanalysis.csharp.csharpsyntaxwalker.
Using the UsingDirectivesWalker
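The walker and its use might be sketched like this. The UsingDirectivesWalker name comes from the text; the Namespaces property and the WalkerDemo class are assumptions added so the result can be inspected as well as printed.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public class UsingDirectivesWalker : CSharpSyntaxWalker
{
    // Names of all visited using directives (assumed helper property).
    public List<string> Namespaces { get; } = new List<string>();

    public override void VisitUsingDirective(UsingDirectiveSyntax node)
    {
        string name = node.Name?.ToString();
        if (name != null)
        {
            Namespaces.Add(name);
            Console.WriteLine($"using {name}");
        }

        // Let the base class continue the traversal.
        base.VisitUsingDirective(node);
    }
}

// Hypothetical helper, for illustration only.
public static class WalkerDemo
{
    public static IReadOnlyList<string> CollectUsings(string code)
    {
        CompilationUnitSyntax root = CSharpSyntaxTree.ParseText(code).GetCompilationUnitRoot();
        var walker = new UsingDirectivesWalker();
        walker.Visit(root); // starts the traversal at the root node
        return walker.Namespaces;
    }
}
```

Calling walker.Visit on the compilation unit root triggers VisitUsingDirective once for every using directive in the tree.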
This has been a very simple example of how to extract a specific piece of code from a code snippet. This should help you get started with Roslyn. If you want to read more and dive deeper into Roslyn, the complete Roslyn documentation is a great resource: https://docs.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/.
Source Generators
A recent addition to the Compiler Platform is source generators. Source generators run during the compilation of your code. They can generate extra code files based on analysis of your code and include them in the compilation.
Source generators can be used to avoid reflection. Instead of generating classes at runtime, it might be possible to generate them at compile time, depending on your use case. Being able to generate extra classes at compile time instead of at runtime almost always means a performance increase. It is important to know, and remember, that source generators can only generate and inject extra code; they cannot change the code that was written by the developer.
Writing a Source Generator
Let us look at an example. We are going to write a source generator that takes any class decorated with a certain attribute and generates a record-type DTO from that class; DTOs are explained in more detail in the previous chapter. We will keep it quite simple for this demo generator, so do not worry about violating any DTO best practices in the generated code.
ISourceGenerator interface
ISourceGenerator consists of two methods. The Initialize method sets up the generator, while the Execute method does the actual generating of code.
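The shape of a generator can be sketched as an empty implementation of the interface. EmptyGenerator is a hypothetical name; the Generator attribute and the two context types come from the Microsoft.CodeAnalysis package.

```csharp
using Microsoft.CodeAnalysis;

[Generator]
public class EmptyGenerator : ISourceGenerator
{
    // Called once to set up the generator, for example to register a syntax receiver.
    public void Initialize(GeneratorInitializationContext context)
    {
    }

    // Called during compilation; this is where source code is generated and added.
    public void Execute(GeneratorExecutionContext context)
    {
    }
}
```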
Defining the attribute to filter on
Example of a decorated class
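A marker attribute and a class decorated with it could look like the sketch below. The GenerateDto name is taken from the text; the Product class and its properties are hypothetical sample data.

```csharp
using System;

// Marker attribute; it carries no data and only flags classes for the generator.
[AttributeUsage(AttributeTargets.Class)]
public sealed class GenerateDtoAttribute : Attribute
{
}

// Hypothetical sample class that the generator should pick up.
[GenerateDto]
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}
```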
SyntaxReceiver
A syntax receiver is a class that implements the ISyntaxReceiver interface. The interface contains one member, the OnVisitSyntaxNode method. This method is executed for every node in the syntax tree built by the Compiler Platform SDK. In this implementation, we inspect every node to see if it is of type ClassDeclarationSyntax. There are declaration syntax types for every type of node we can expect, including ClassDeclarationSyntax, InterfaceDeclarationSyntax, PropertyDeclarationSyntax, and so on. Once we have a ClassDeclarationSyntax that contains attributes, we use LINQ to check if the class contains our custom attribute. Once we have the IdentifierNameSyntax, we can verify that it has the name of the attribute we are filtering on, in this case GenerateDto. At this point, we have successfully detected a class that was decorated with the GenerateDto attribute, but we are not generating code yet; we are just traversing the syntax tree. That is why we save the found class nodes in an immutable property. A single syntax receiver instance is used for each generator run, so we can safely use properties to bring data from the receiver to the generator.
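A receiver along these lines might be sketched as follows. The DtoSyntaxReceiver and Candidates names are hypothetical, and accepting the long form GenerateDtoAttribute is an extra assumption.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public class DtoSyntaxReceiver : ISyntaxReceiver
{
    // Classes found so far that carry the GenerateDto attribute.
    public List<ClassDeclarationSyntax> Candidates { get; } = new List<ClassDeclarationSyntax>();

    public void OnVisitSyntaxNode(SyntaxNode syntaxNode)
    {
        // Only class declarations with at least one attribute list are of interest.
        if (!(syntaxNode is ClassDeclarationSyntax classNode) || classNode.AttributeLists.Count == 0)
        {
            return;
        }

        bool hasAttribute = classNode.AttributeLists
            .SelectMany(list => list.Attributes)
            .Select(attribute => attribute.Name)
            .OfType<IdentifierNameSyntax>()
            .Any(name => name.Identifier.ValueText == "GenerateDto"
                      || name.Identifier.ValueText == "GenerateDtoAttribute");

        if (hasAttribute)
        {
            Candidates.Add(classNode);
        }
    }
}
```

Because the check is purely syntactic, an attribute written with a namespace prefix would not be matched by this sketch; a semantic check is needed for that.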
Initializing a source generator
Checking for the receiver
Grabbing properties and using statements
Generating the using directives
Generating namespace and class declarations
Generating parameters and injecting the code
Finally, we use the list of properties from the class declaration to generate the record parameters. We get the data type from the property’s Type property and the name from its Identifier property. We use the StringBuilder’s Append method so that all parameters end up on one line instead of having a line break between each one. The parameters are separated by commas, and the trailing comma is removed. We then close the brackets and our code is finished. The AddSource method on the GeneratorExecutionContext object injects the source into the compilation right before the code gets compiled. Our generated code is now part of the user code and will be treated as such by the compiler.
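Putting the steps named in the listings above together, a complete generator could be sketched as follows. This is an illustration under assumptions rather than the chapter's original listing: the DtoGenerator and DtoSyntaxReceiver names, the Dto suffix, the hint name, and the "Generated" fallback namespace are all hypothetical choices.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Text;

// Compact receiver (hypothetical): collects classes decorated with [GenerateDto].
public class DtoSyntaxReceiver : ISyntaxReceiver
{
    public List<ClassDeclarationSyntax> Candidates { get; } = new List<ClassDeclarationSyntax>();

    public void OnVisitSyntaxNode(SyntaxNode syntaxNode)
    {
        if (syntaxNode is ClassDeclarationSyntax classNode &&
            classNode.AttributeLists.SelectMany(list => list.Attributes)
                     .Any(attribute => attribute.Name.ToString() == "GenerateDto"))
        {
            Candidates.Add(classNode);
        }
    }
}

[Generator]
public class DtoGenerator : ISourceGenerator
{
    public void Initialize(GeneratorInitializationContext context)
    {
        // Initializing: register the receiver so it sees every syntax node.
        context.RegisterForSyntaxNotifications(() => new DtoSyntaxReceiver());
    }

    public void Execute(GeneratorExecutionContext context)
    {
        // Checking for the receiver.
        if (!(context.SyntaxReceiver is DtoSyntaxReceiver receiver)) return;

        foreach (ClassDeclarationSyntax classNode in receiver.Candidates)
        {
            // Grabbing properties and using directives.
            List<PropertyDeclarationSyntax> properties =
                classNode.Members.OfType<PropertyDeclarationSyntax>().ToList();
            CompilationUnitSyntax root = classNode.SyntaxTree.GetCompilationUnitRoot();
            string namespaceName = classNode.Ancestors()
                .OfType<NamespaceDeclarationSyntax>()
                .FirstOrDefault()?.Name.ToString() ?? "Generated"; // assumed fallback

            // Generating the using directives.
            var builder = new StringBuilder();
            foreach (UsingDirectiveSyntax usingDirective in root.Usings)
            {
                builder.AppendLine(usingDirective.ToString());
            }

            // Generating namespace and record declaration.
            string dtoName = classNode.Identifier.ValueText + "Dto";
            builder.AppendLine($"namespace {namespaceName}");
            builder.AppendLine("{");
            builder.Append($"    public record {dtoName}(");

            // Generating parameters on one line, then dropping the final comma.
            foreach (PropertyDeclarationSyntax property in properties)
            {
                builder.Append($"{property.Type} {property.Identifier.ValueText}, ");
            }
            if (properties.Count > 0) builder.Length -= 2;

            builder.AppendLine(");");
            builder.AppendLine("}");

            // Injecting the code into the compilation.
            context.AddSource($"{dtoName}.g.cs", SourceText.From(builder.ToString(), Encoding.UTF8));
        }
    }
}
```

For a class Product with properties Id and Name, this sketch emits a record declared as ProductDto(int Id, string Name) in the same namespace as the source class. Only block-scoped namespaces are recognized here; classes in a file-scoped namespace end up in the fallback namespace.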
Adding a source generator to a project
Using the generated DTO objects
Note that you might need to restart Visual Studio before IntelliSense recognizes the generated objects.
Debugging Source Generators
Debugging a source generator
Select New instance of Visual Studio 2022. VS2022 will start up, load the source file for the generator, and pause right at the Debugger.Launch statement, just as if it were a breakpoint. From this point on, we are in debug mode; we can inspect variables, step over or into statements, and so on. The Debugger.Launch call can be placed anywhere in the generator, even in a syntax receiver.
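A common way to place the Debugger.Launch call is at the start of Initialize, as in the sketch below. The DEBUG guard and the IsAttached check are assumptions added here so that release builds and already-attached sessions are not interrupted; DebuggableGenerator is a hypothetical name.

```csharp
using System.Diagnostics;
using Microsoft.CodeAnalysis;

[Generator]
public class DebuggableGenerator : ISourceGenerator
{
    public void Initialize(GeneratorInitializationContext context)
    {
#if DEBUG
        // Prompts for a debugger to attach when the compiler loads the generator.
        if (!Debugger.IsAttached)
        {
            Debugger.Launch();
        }
#endif
    }

    public void Execute(GeneratorExecutionContext context)
    {
    }
}
```

Remember to remove the call, or keep it behind a guard like the one above, before shipping the generator, since it triggers on every compilation.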
Wrapping Up
.NET’s compiler platform is a powerful platform that does much more than just compile code. It is a complete inspection and linting tool. The platform ships with an SDK that allows us to write our own inspections and fixes; this helps teams enforce their agreements on code style and also helps detect bugs and anti-patterns.
Since .NET 5 the platform also has source generators. With source generators, we can generate code at compile time that gets injected into the compiler pipeline as if it were user-written code. Source generators can be a great help and can often replace code that previously relied on reflection, for example, to generate DTO types as we have seen.