How it works...

The instructions for this recipe are a bit more complex than the others, as we need to manage two separate crates. If your code doesn't compile, compare your version with the one used in the book at https://github.com/SirRade/rust-standard-library-cookbook/tree/master/chapter_five. We need to separate the code into two crates because providing a custom derive requires creating a procedural macro, as indicated by all of the instances of proc_macro in the code. A procedural macro is Rust code that runs alongside the compiler and interacts directly with it. Because of the special nature and unique restrictions of such code, it needs to be in a separate crate that is annotated with the following:

[lib]
proc-macro = true

This crate is typically named after the main crate with the _derive suffix added. In our example, the main crate is called chapter_five, so the crate providing the procedural macro is called chapter_five_derive.

In our example, we are going to create a derived version of good old Hello World: a struct or enum deriving HelloWorld will implement the HelloWorld trait, providing a hello_world() function with a friendly greeting containing its own name. Additionally, you can specify a HelloWorldName attribute to alter the message.
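To make the goal concrete, here is a rough hand-written sketch of what the derive is meant to produce for a single struct. The greeting() helper that returns a String is an assumption added for this sketch so the message can be inspected; the recipe itself only describes hello_world().

```rust
// The trait that the custom derive is meant to implement.
// NOTE: `greeting` is a hypothetical helper added for this sketch so that
// the message can be returned and checked; the recipe only needs `hello_world`.
trait HelloWorld {
    fn greeting() -> String;

    fn hello_world() {
        println!("{}", Self::greeting());
    }
}

struct Switzerland;

// Hand-written stand-in for the code the derive would generate.
impl HelloWorld for Switzerland {
    fn greeting() -> String {
        format!(
            "The struct or enum {} says: \"Hello world from {}!\"",
            "Switzerland", // the name of the annotated type
            "Switzerland"  // the name used in the greeting (no HelloWorldName set)
        )
    }
}
```

Calling Switzerland::hello_world() prints the greeting; with the derive in place, the impl block no longer has to be written by hand.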

The code in custom.rs should be self-explanatory. We begin by importing our derivation crate [2] where we need to include the #[macro_use] attribute in order to actually import the procedural macros. We then define our HelloWorld trait [5] and proceed to derive it on a bunch of structures [13, 16 and 19], just like we would with built-in derives such as Debug or Clone. Australia gets a custom message via the HelloWorldName attribute. Lastly, in the main function, we call the generated hello_world() function.

Let's take a look at chapter-five-derive/src/lib.rs now. Procedural macro crates typically begin by importing the proc_macro, quote, and syn crates. Attentive readers will have noticed that we didn't add proc_macro to the [dependencies] section in the crate's Cargo.toml. We didn't need to because this special support crate is provided by the standard Rust distribution.

The quote crate provides the quote! macro, which lets us translate Rust code into tokens that the compiler can use. The really useful feature of this macro is that it supports code interpolation of a variable by writing a # in front of it. This means that when we write the following, the value inside the struct_name variable is interpreted as Rust code:

impl HelloWorld for #struct_name { ... }

If struct_name has the Switzerland value, the following code will be generated:

impl HelloWorld for Switzerland { ... }
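Because quote! needs the external quote crate, the following is only a rough analogy using the standard library: a declarative macro_rules! macro in which $struct_name plays the role that #struct_name plays inside quote!. This is a different mechanism from a procedural macro, and the macro name impl_hello_world_for and the name() method are made up for illustration.

```rust
trait HelloWorld {
    // Hypothetical method for this sketch: returns the implementing type's name.
    fn name() -> &'static str;
}

// `$struct_name` is substituted into the template, much like `#struct_name`
// is substituted inside `quote!`.
macro_rules! impl_hello_world_for {
    ($struct_name:ident) => {
        impl HelloWorld for $struct_name {
            fn name() -> &'static str {
                stringify!($struct_name)
            }
        }
    };
}

struct Switzerland;

// Expands to `impl HelloWorld for Switzerland { ... }`.
impl_hello_world_for!(Switzerland);
```

After the expansion, Switzerland::name() can be called like any other trait function.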

The syn crate is a Rust parser built upon the nom parser combinator framework (https://github.com/Geal/nom), which you should check out as well if you're thinking about writing a parser. In fact, some of the crates used in Chapter 4, Serialization, are written with nom, too. Back on track, syn parses the code annotated by your custom attributes or derives and lets you work with the generated abstract syntax tree.

The convention for a custom derive is to create a function with the name of the derive in snake_case (pub fn hello_world, in our case) that parses the annotated code and then calls a function that generates the new code. The second function typically has the name of the first one, prefixed with impl. In our code, this is fn impl_hello_world.

In a proc_macro crate, only functions tagged with proc_macro_derive are allowed to be exported (marked pub). One consequence of this is that we are not able to move our HelloWorld trait into this crate; it wouldn't be allowed to be pub.

The proc_macro_derive annotation requires you to specify which name will be used for the derive (HelloWorld for us) and which attributes it allows. If we didn't want to accept the HelloWorldName attribute, we could simply omit the entire attributes section and annotate our function like this:

#[proc_macro_derive(HelloWorld)]

Because hello_world hooks itself directly into the compiler, it both accepts and returns a TokenStream, which is the compiler-internal representation of Rust code. We start by turning the TokenStream back into a String in order to be parsed again by syn. This is not an expensive action, as the TokenStream we receive from the compiler is not the entire program, but only the part annotated by our custom derive. For example, the String behind the TokenStream of the first struct annotated by HelloWorld is simply the following:

struct Switzerland;

We then parse this string with syn::parse_derive_input(&s), which basically tells syn that the code we want to parse is a struct or enum that is deriving something.

We then generate the code with the following:

let gen = impl_hello_world(&ast);

Then we convert it back into a TokenStream with this:

gen.parse()

The TokenStream is then injected back into the code by the compiler. As you can see, a procedural macro cannot change existing code, but only analyze it and generate additional code.

Here is the process described in hello_world:

Convert the TokenStream into a String
Parse the String with syn
Generate an implementation of another method
Parse the implementation back into a TokenStream

This process is typical of a custom derive. You can reuse the code presented here in nearly all basic procedural macros.
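The four steps can be mimicked with plain strings and the standard library. This is only a toy sketch of the shape of the pipeline: a real derive works with proc_macro::TokenStream, syn, and quote, and the function names parse_type_name, generate_impl, and derive_hello_world are invented for this illustration.

```rust
// Toy stand-ins for the four steps, using only the standard library.
// A real derive uses `proc_macro::TokenStream`, `syn`, and `quote` instead.

/// Step 2 (toy version): extract the type name from a declaration such as
/// `struct Switzerland;`. Returns `None` for input this sketch cannot handle.
fn parse_type_name(source: &str) -> Option<String> {
    let mut words = source.split_whitespace();
    match words.next()? {
        "struct" | "enum" => {}
        _ => return None,
    }
    let name: String = words
        .next()?
        .chars()
        .take_while(|c| c.is_alphanumeric() || *c == '_')
        .collect();
    if name.is_empty() {
        None
    } else {
        Some(name)
    }
}

/// Step 3 (toy version): generate the implementation as source text.
fn generate_impl(name: &str) -> String {
    format!(
        "impl HelloWorld for {name} {{ fn hello_world() {{ println!(\"Hello world from {name}!\"); }} }}",
        name = name
    )
}

/// Steps 1 to 4 chained together: source text in, generated text out.
fn derive_hello_world(input: &str) -> String {
    let name = parse_type_name(input).expect("expected a struct or enum declaration");
    generate_impl(&name)
}
```

Passing "struct Switzerland;" to derive_hello_world yields the text of an impl HelloWorld for Switzerland block, which is the string-level counterpart of what the compiler receives back as a TokenStream.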

Let's move on to impl_hello_world now. With the help of the ast passed, we can analyze the annotated structure. The ident member, which stands for identifier, tells us the name of the struct or enum. For instance, in the first struct that derives from HelloWorld, this is the "Switzerland" string.

We then decide which name to use in the greeting with the help of the little helper function get_name_attribute, which we will look at in a moment. It returns the value of the HelloWorldName attribute if it has been set. If not, we default to the identifier, converted to a string via as_ref [29]. How this is done is explained in the next recipe.
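The fallback logic boils down to choosing between an optional value and a default. A minimal sketch, assuming the attribute lookup has already produced an Option; greeting_name is a hypothetical helper, not a function from the recipe:

```rust
/// Picks the name used in the greeting: the attribute value when present,
/// otherwise the identifier of the annotated type.
/// (`greeting_name` is a hypothetical helper, not a function from the recipe.)
fn greeting_name(attribute_value: Option<&str>, identifier: &str) -> String {
    attribute_value.unwrap_or(identifier).to_string()
}
```

For Australia, which sets HelloWorldName, this would pick "the Land Down Under"; for Switzerland, which sets nothing, it falls back to "Switzerland".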

Finally, we create some quote::Tokens by writing the implementation and surrounding it with quote!. Notice again how we interpolate variables into the code by writing # in front of them. Additionally, while printing, we surround #identifier with stringify!, which turns an identifier into a string. We don't need to do this with #hello_world_identifier because it already holds a string. To understand why this is needed, let's look at the code that would be generated for the Switzerland struct if we didn't include stringify!:

impl HelloWorld for Switzerland {
    fn hello_world() {
        println!(
            "The struct or enum {} says: \"Hello world from {}!\"",
            Switzerland,
            "Switzerland"
        );
    }
}

Try it out for yourself, and you will be greeted with an error message stating something along the lines of "`Switzerland` cannot be formatted with the default formatter". This is because we are not printing the "Switzerland" string, but instead trying to print the concept of the Switzerland struct itself, which is clearly nonsense. To fix this, we just need to make sure that the interpolated variable is surrounded by quotes ("), which is exactly what stringify! does.
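The effect of stringify! can be checked in isolation with the standard library alone. In this small sketch, the function names type_name_as_text and greeting are chosen for illustration:

```rust
#[allow(dead_code)]
struct Switzerland;

/// Returns the type name as text. `stringify!` turns the tokens it is given
/// into a `&'static str` at compile time, without evaluating them.
fn type_name_as_text() -> &'static str {
    stringify!(Switzerland)
}

/// Builds the greeting with `stringify!` in place: both arguments are now
/// strings, so `{}` can format them.
fn greeting() -> String {
    format!(
        "The struct or enum {} says: \"Hello world from {}!\"",
        stringify!(Switzerland),
        "Switzerland"
    )
}
```

Removing stringify! from greeting() and passing Switzerland directly reproduces the formatting error described above, because the unit struct does not implement Display.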

Let's look at the final piece of the puzzle now: get_name_attribute. This function might look a little intimidating at first. Let's go through it step by step:

if let Some(attr) = ast.attrs.iter().find(|a| a.name() == ATTR_NAME) { ... }

Here we'll go through all available attributes and search for one named "HelloWorldName". If we don't find any, the function call already ends by returning None. Otherwise, we continue with the following line:

if let syn::MetaItem::NameValue(_, ref value) = attr.value { ... }

syn::MetaItem is simply what syn calls attributes. This line is necessary because there are many ways to write attributes in Rust. For example, a syn::MetaItem::Word can be written like #[foo]. An example of syn::MetaItem::List is #[foo(Bar, Baz, Quux)]. #[derive(...)] itself is also a syn::MetaItem::List. We, however, are only interested in syn::MetaItem::NameValue, which is an attribute in the form of #[foo = "bar"]. If the HelloWorldName attribute is not in this form, we panic! with a message explaining what the problem is. A panic in a procedural macro results in a compiler error. You can verify this by replacing #[HelloWorldName = "the Land Down Under"] in custom.rs with #[HelloWorldName].

Contrary to normal programs, it's okay for procedural macros to panic! often, because they panic! at compile time. When doing so, remember that errors originating from other crates are very nasty to debug, doubly so in any kind of macro, so it's incredibly important to write the error messages as explicitly as possible.

The last check we need to do is on the value of HelloWorldName. As we are going to print it, we want to accept only strings:

if let syn::Lit::Str(ref value_as_str, _) = *value { ... }

On success, we return the string. Otherwise, we again panic! with an error message detailing the problem.
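The three checks can be put together in a model that runs without syn. The MetaItem and Lit enums below are simplified stand-ins written for this sketch, not syn's real types, and the panic messages are illustrative wording rather than the book's exact text.

```rust
// Simplified stand-ins for the attribute shapes described above.
// These are NOT syn's real types; they only mirror the three forms.
#[derive(Debug, Clone, PartialEq)]
enum Lit {
    Str(String),
    Int(i64),
}

#[derive(Debug, Clone, PartialEq)]
enum MetaItem {
    Word(String),              // #[foo]
    List(String, Vec<String>), // #[foo(Bar, Baz, Quux)]
    NameValue(String, Lit),    // #[foo = "bar"]
}

impl MetaItem {
    /// The attribute's name, whatever its shape.
    fn name(&self) -> &str {
        match self {
            MetaItem::Word(name)
            | MetaItem::List(name, _)
            | MetaItem::NameValue(name, _) => name.as_str(),
        }
    }
}

const ATTR_NAME: &str = "HelloWorldName";

/// Returns the string value of `HelloWorldName` if the attribute is present,
/// and panics with an explicit message if it has the wrong shape.
fn get_name_attribute(attrs: &[MetaItem]) -> Option<String> {
    // Check 1: is there an attribute with the right name at all?
    let attr = attrs.iter().find(|a| a.name() == ATTR_NAME)?;
    match attr {
        // Checks 2 and 3: name-value form, with a string literal as value.
        MetaItem::NameValue(_, Lit::Str(value)) => Some(value.clone()),
        MetaItem::NameValue(_, other) => panic!(
            "Expected a string as the value of {}, found {:?} instead",
            ATTR_NAME, other
        ),
        _ => panic!(
            "Expected an attribute in the form #[{} = \"Some Name\"]",
            ATTR_NAME
        ),
    }
}
```

In this model a missing attribute gives None, a well-formed one gives Some with the string, and every malformed variant panics with a message naming the expected form, which is the behavior the real helper turns into a compiler error.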
