Complex macros

We can specify that we want multiple expressions in the left-hand side pattern of the macro definition by adding a * for zero or more matches or a + for one or more matches. Let's see how we can do that with a simplified my_vec![] macro:

macro_rules! my_vec {
    ($($x:expr),*) => {{
        let mut vector = Vec::new();
        $(vector.push($x);)*
        vector
    }}
}

Let's see what is happening here. First, on the left side, there are two $ signs, but only one of them introduces a variable. The outer one, $(...),*, marks the repetition itself: zero or more comma-separated matches of the pattern inside it. The inner one, $x:expr, is the variable that binds each matched expression. Then, on the right side, $(...)* repeats the vector.push($x); statement once for every expression we receive.

There is another new thing on the right-hand side. As you can see, the macro expansion starts and ends with double braces instead of single ones. The outer pair delimits the macro body, while the inner pair becomes part of the expansion. This is because, once the macro is expanded, the invocation is replaced by the generated code, and here that code has to be a single expression. Since we want to return the vector we are creating, we need a new scope whose last expression becomes the value of the scope once it gets executed. You will be able to see this more clearly in the next code snippet.

We can call the macro from the main() function:

fn main() {
    let my_vector = my_vec![4, 8, 15, 16, 23, 42];
    println!("Vector test: {:?}", my_vector);
}

It will be expanded to this code:

fn main() {
    let my_vector = {
        let mut vector = Vec::new();
        vector.push(4);
        vector.push(8);
        vector.push(15);
        vector.push(16);
        vector.push(23);
        vector.push(42);
        vector
    };
    println!("Vector test: {:?}", my_vector);
}

As you can see, we need those extra braces to create the scope that will return the vector so that it gets assigned to the my_vector binding.
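The + repetition mentioned at the start of this section works the same way; only the minimum number of matches changes. The following is a minimal sketch, not part of the original example: my_vec_nonempty! is a hypothetical name for a variant of my_vec! that requires at least one expression.

```rust
// Hypothetical variant of my_vec! that uses `+` instead of `*`,
// so at least one expression must be given.
macro_rules! my_vec_nonempty {
    ($($x:expr),+) => {{
        let mut vector = Vec::new();
        // One push statement is generated per matched expression.
        $(vector.push($x);)+
        vector
    }};
}

fn main() {
    let v = my_vec_nonempty![4, 8, 15];
    assert_eq!(v, vec![4, 8, 15]);

    // An empty invocation, my_vec_nonempty![], would be rejected at
    // compile time, because `+` needs at least one match.
}
```

With the * version, an empty invocation is accepted and simply produces an empty vector, although the compiler may then need a type annotation to infer the element type.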

You can have multiple repetition patterns on the left expression and they will be repeated for every use, as needed on the right. There is a nice example illustrating this behavior in the first edition of the official Rust book, which I have adapted here:

macro_rules! add_to_vec {
    ($( $x:expr; [ $( $y:expr ),* ] );* ) => {
        &[ $($( $x + $y ),*),* ]
    }
}

In this example, the macro can receive zero or more inputs of the form $x; [$y1, $y2, ...], separated by semicolons. So, each input consists of one expression, then a semicolon, then an opening bracket, any number of sub-expressions separated by commas, and a closing bracket. But what does the macro do with this input? Let's look at its right-hand side.

As you can see, this will create nested repetitions. It creates a slice (&[T]) of whatever we feed to it, so all the expressions we use must be of the same type. It will iterate over all $x variables, one per input group. So, if we feed it only one input, it will iterate once, for the expression to the left of the semicolon. Then, within each group, it will iterate once for every $y expression associated with that $x expression, add the two with the + operator, and include the result in the slice.

If this was too complex to understand, let's look at an example. Let's suppose we call the macro with 65; [22, 34] as input. In this case, 65 will be $x, and 22 and 34 will be the $y variables associated with 65. So, the result will be a slice like this: &[65+22, 65+34]. Or, if we calculate the results: &[87, 99].

If, on the other hand, we give two groups of variables by using 65; [22, 34]; 23; [56, 35] as input, in the first iteration, $x will be 65, while in the second one, it will be 23. The $y variables of 65 will be 22 and 34, as before, and the ones associated with 23 will be 56 and 35. This means that the final slice will be &[87, 99, 79, 58], where 87 and 99 are obtained the same way as before and 79 and 58 are the result of adding 23 to 56 and 23 to 35.
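Both invocations described above can be checked with a short program. This is only a sketch: the add_to_vec! definition is repeated so that the snippet is self-contained, and the binding names one_group and two_groups are chosen here for illustration.

```rust
macro_rules! add_to_vec {
    ($( $x:expr; [ $( $y:expr ),* ] );* ) => {
        &[ $($( $x + $y ),*),* ]
    }
}

fn main() {
    // One group: 65 is added to each value between the brackets.
    let one_group: &[i32] = add_to_vec!(65; [22, 34]);
    assert_eq!(one_group, [87, 99]);

    // Two groups separated by a semicolon: each $x is only combined
    // with the $y values of its own group.
    let two_groups: &[i32] = add_to_vec!(65; [22, 34]; 23; [56, 35]);
    assert_eq!(two_groups, [87, 99, 79, 58]);
}
```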

This gives you much more flexibility than functions, but remember that all of this is expanded at compile time, which can make your compilation much slower and, if the macro duplicates too much code, the final binary larger and possibly slower too. In any case, macros offer even more flexibility than this.

So far, all variables have been of the expr kind. We have used this by declaring $x:expr and $y:expr but, as you can imagine, there are other kinds of macro variables. The list follows:

  • expr: Expressions that you can write after an = sign, such as 76+4 or if a==1 {"something"} else {"other thing"}.
  • ident: An identifier or binding name, such as foo or bar.
  • path: A qualified path. This will be a path that you could write in a use statement, such as foo::bar::MyStruct or foo::bar::my_func.
  • ty: A type, such as u64 or MyStruct. It can also be a path to the type.
  • pat: A pattern that you can write on the left side of the = sign in a let binding or in a match arm, such as Some(t) or (a, b, _).
  • stmt: A full statement, such as a let binding like let a = 43;.
  • block: A block element that can have multiple statements and a possible expression between braces, such as {vec.push(33); vec.len()}.
  • item: What Rust calls items. For example, function or type declarations, complete modules, or trait definitions.
  • meta: A meta element, which you can write inside of an attribute (#[]). For example, cfg(feature = "foo").
  • tt: Any token tree that will eventually get parsed by a macro pattern, which means almost anything. This is useful for creating recursive macros, for example.
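To make some of these kinds more concrete, here is a minimal sketch that combines ident, ty, expr, pat, and block. The macros make_const_fn! and unwrap_or_run! are hypothetical names invented for this illustration; they do not come from any library.

```rust
// Hypothetical macro: generates a function called `$name` that
// returns `$value` with the return type `$t`.
macro_rules! make_const_fn {
    ($name:ident, $t:ty, $value:expr) => {
        fn $name() -> $t {
            $value
        }
    };
}

// Hypothetical macro: matches `$opt` against the pattern `$p`; if it
// matches, evaluates `$hit`, otherwise runs the fallback block.
macro_rules! unwrap_or_run {
    ($opt:expr, $p:pat => $hit:expr, $fallback:block) => {
        match $opt {
            $p => $hit,
            _ => $fallback,
        }
    };
}

make_const_fn!(answer, u64, 40 + 2);
make_const_fn!(greeting, &'static str, "hello");

fn main() {
    assert_eq!(answer(), 42);
    assert_eq!(greeting(), "hello");

    let doubled = unwrap_or_run!(Some(3), Some(t) => t * 2, { 0 });
    assert_eq!(doubled, 6);

    let fallback = unwrap_or_run!(None::<i32>, Some(t) => t * 2, { 0 });
    assert_eq!(fallback, 0);
}
```

Note that the binding t introduced by the pattern can be used in the expression that follows it, because both come from the call site of the macro.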

As you can imagine, some of these kinds of macro variables overlap, and some of them are simply more specific than others. An identifier such as foo, for example, is also a valid expression, so it could be matched by either ident or expr. The way a variable is used is verified on the right-hand side of the macro, in the expansion, so the compiler will complain if, for example, you try to use a statement where an expression is required.

There are some extra rules, too, as we can see in the Rust documentation (https://doc.rust-lang.org/book/first-edition/macros.html#syntactic-requirements). They restrict which tokens may come right after a variable in the pattern. Statements and expressions can only be followed by =>, a comma, or a semicolon. Types and paths can only be followed by =>, the as or where keywords, or one of these tokens: a comma, =, |, ;, :, >, [, or {. And finally, patterns can only be followed by =>, the if or in keywords, a comma, =, or |.
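As a minimal sketch of how these rules affect the design of a pattern, consider the hypothetical typed_let! macro below (the name is an assumption made for this example). The type variable is followed by = and the expression variable by a semicolon, both of which are allowed.

```rust
// Hypothetical macro: `$t:ty` is followed by `=` and `$value:expr`
// is followed by `;`, so the pattern satisfies the rules above.
macro_rules! typed_let {
    ($name:ident : $t:ty = $value:expr ;) => {
        let $name: $t = $value;
    };
}

// By contrast, a definition like the following is rejected by the
// compiler, because an expression variable cannot be followed by `+`:
//
// macro_rules! sum_two {
//     ($a:expr + $b:expr) => { $a + $b };
// }

fn main() {
    typed_let!(total: u64 = 40 + 2;);
    assert_eq!(total, 42);
}
```

The restriction exists because the expression parser would otherwise not know where $a ends: the + could just as well be part of the expression itself.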

Let's put this into practice by implementing the Mul trait for a small currency type that we will create. This is an adapted example of some work we did when creating the Fractal Credits digital currency. In this case, we will look at the implementation of the Amount type (https://github.com/FractalGlobal/utils-rs/blob/49955ead9eef2d9373cc9386b90ac02b4d5745b4/src/amount.rs#L99-L102), which represents a currency amount. Let's start with the basic type definition:

#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Amount {
    value: u64,
}

This amount will be divisible down to three decimal places, but it will always be an exact value. We should be able to add an Amount to the current Amount, or to subtract one from it. I will not explain these trivial implementations, but there is one implementation where macros can be of great help. We should be able to multiply the amount by any unsigned integer, so we should implement the Mul trait for the u8, u16, u32, and u64 types. We should also implement the Div and Rem traits, but I will leave those out, since they are a little bit more complex. You can check them in the implementation linked earlier.

The only thing the multiplication of an Amount with an integer should do is to multiply the value by the integer given. Let's see a simple implementation for u8:

use std::ops::Mul;

impl Mul<u8> for Amount {
    type Output = Self;

    fn mul(self, rhs: u8) -> Self::Output {
        Self { value: self.value * rhs as u64 }
    }
}

impl Mul<Amount> for u8 {
    type Output = Amount;

    fn mul(self, rhs: Amount) -> Self::Output {
        Self::Output { value: self as u64 * rhs.value }
    }
}

As you can see, I implemented it both ways so that you can put the Amount to the left and to the right of the multiplication. If we had to do this for all integers, it would be a big waste of time and code. And if we had to modify one of the implementations (especially for Rem functions), it would be troublesome to do it in multiple places. Let's use macros to help us.

We can define a macro, impl_mul_int!{}, which will receive a list of integer types and then implement the Mul trait in both directions between each of them and the Amount type. Let's see:

macro_rules! impl_mul_int {
    ($($t:ty)*) => ($(
        impl Mul<$t> for Amount {
            type Output = Self;

            fn mul(self, rhs: $t) -> Self::Output {
                Self { value: self.value * rhs as u64 }
            }
        }

        impl Mul<Amount> for $t {
            type Output = Amount;

            fn mul(self, rhs: Amount) -> Self::Output {
                Self::Output { value: self as u64 * rhs.value }
            }
        }
    )*)
}

impl_mul_int! { u8 u16 u32 u64 usize }

As you can see, we specifically ask for the given elements to be types and then we implement the trait for all of them. So, for any code that you want to implement for multiple types, you might as well try this approach, since it will save you from writing a lot of code and it will make it more maintainable.
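To see the generated implementations in use, here is a self-contained sketch that puts the pieces together. Two details are assumptions made for this example and are not in the code above: Debug is derived so that assert_eq! can print the values, and the reverse implementation builds the result with Amount { .. } directly. The binding name price is also just illustrative.

```rust
use std::ops::Mul;

// `Debug` is added here only so that `assert_eq!` can print values.
#[derive(Debug, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Amount {
    value: u64,
}

macro_rules! impl_mul_int {
    ($($t:ty)*) => ($(
        // Amount * integer
        impl Mul<$t> for Amount {
            type Output = Self;

            fn mul(self, rhs: $t) -> Self::Output {
                Self { value: self.value * rhs as u64 }
            }
        }

        // integer * Amount
        impl Mul<Amount> for $t {
            type Output = Amount;

            fn mul(self, rhs: Amount) -> Self::Output {
                Amount { value: self as u64 * rhs.value }
            }
        }
    )*)
}

impl_mul_int! { u8 u16 u32 u64 usize }

fn main() {
    let price = Amount { value: 1_500 };

    // The Amount can be on either side of the multiplication.
    assert_eq!(price * 3u8, Amount { value: 4_500 });
    assert_eq!(2u32 * price, Amount { value: 3_000 });
    assert_eq!(price * 10usize, Amount { value: 15_000 });
}
```

The integer literals carry explicit suffixes (3u8, 2u32) because, with several Mul implementations available, the compiler cannot pick one for an unsuffixed literal on its own.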
