Developing nonstandard string literals

There is a special kind of macro for defining nonstandard string literals. These look like ordinary string literals, but a macro is invoked wherever one appears in the code.

A good example is Julia's regular expression literal, such as r"^hello". It is not a standard string literal because of the r prefix in front of the opening double quote. Checking the data type of such a literal shows that a Regex object is created from the string.
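The check can be reproduced in the REPL with a short sketch like the following. It uses only Base, and the variable name re is just for illustration:

```julia
re = r"^hello"

typeof(re)                     # Regex
occursin(re, "hello world")    # true, since the string starts with "hello"
```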

We can also create our own nonstandard string literals. Let's try to work through a fun example together here.

Suppose that, for development purposes, we want to conveniently create sample data frames with different types of columns. Writing this out by hand with the DataFrame constructor is a little tedious.
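As a rough illustration, a two-column sample frame built by hand might look like the following. This is a sketch that assumes the DataFrames package is installed. The column names x1 and x2 are chosen here to match the naming used later in this section:

```julia
using DataFrames

# One keyword argument per column: name = vector of random values.
df = DataFrame(x1 = rand(Float64, 100000), x2 = rand(Int16, 100000))

size(df)    # (100000, 2)
```

Every additional column needs another name = rand(Type, n) argument, which is where the repetition comes from.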

Imagine that we occasionally need to create tens of columns with different data types. The code for creating such a data frame would be very long, and typing it all out would be tedious. Instead, we could design a string literal that contains the specification for constructing such a data frame. Let's call it an ndf (numerical data frame) literal.

The ndf specification just needs to encode the desired number of rows and the column types. For instance, the literal ndf"100000:f64,i16" represents a sample data frame with 100,000 rows and two columns, of types Float64 and Int16.

To implement this feature, we just define a macro called @ndf_str. The macro takes a string literal and creates the desired data frame accordingly. The following is one way to implement the macro:

using DataFrames

macro ndf_str(s)
    nstr, spec = split(s, ":")
    n = parse(Int, nstr)        # number of rows
    types = split(spec, ",")    # column type specifications

    num_columns = length(types)

    # shorthand => numeric type
    mappings = Dict(
        "f64"=>Float64, "f32"=>Float32,
        "i64"=>Int64, "i32"=>Int32, "i16"=>Int16, "i8"=>Int8)

    column_types = [mappings[t] for t in types]
    column_names = [Symbol("x$i") for i in 1:num_columns]

    # one name => random-vector pair per column
    DataFrame([column_names[i] => rand(column_types[i], n)
        for i in 1:num_columns]...)
end

The first few lines parse the string and determine the number of rows (n), as well as the type specifications of the columns (types). Then, a dictionary called mappings is created to map each shorthand to the corresponding numeric type. The column types are looked up in mappings, and the column names are generated as x1, x2, and so on. Finally, the macro calls the DataFrame constructor and returns the result.

Now that we have the macro defined, we can create a new data frame with a single literal, such as ndf"100000:f64,i16".

Nonstandard string literals can be quite useful in certain cases. We can see a string specification as a mini domain-specific language that is encoded in the string. As long as the string specification is well defined, it can make the code much more concise.

You may have noticed that the ndf_str macro returns a regular DataFrame object rather than an expression object, as it would normally do with macros. This is perfectly fine because the final DataFrame object will be returned as-is. You may think of an evaluation of a constant as just the constant itself. We can just return a value rather than an expression here because the returned value does not involve any variables from the call site or from the module.
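The same pattern can be seen in a smaller sketch that uses only Base. The ints literal and the first_three function are hypothetical names introduced here for illustration:

```julia
# Hypothetical literal: ints"1,2,3" is turned into a Vector{Int}
# when the macro is expanded.
macro ints_str(s)
    # `s` is the raw string; the parsed vector itself is returned, not an Expr.
    parse.(Int, split(s, ","))
end

first_three() = ints"1,2,3"

first_three()                      # [1, 2, 3]
first_three() === first_three()    # true: the same array is embedded in the code
```

One consequence worth keeping in mind is that the returned value is embedded in the expanded code as-is. If it is mutable, as a vector or a DataFrame is, every evaluation of that literal yields the same object, so mutating it in one place affects later uses.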

A curious mind might ask: why can't we just create a regular function for this? We certainly can for this toy example. However, using a string literal could improve performance in some cases.

For example, when we use the Regex string literal in a function, the Regex object is created only once, when the macro is expanded, before the function ever runs. If we use the Regex constructor instead, the object is created every single time the function is called.
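The difference can be observed with a small sketch that uses only Base. The two function names are hypothetical. Because Regex is a mutable type, the === operator compares object identity here:

```julia
# The literal is expanded once, so the same Regex object is reused on every call.
literal_regex() = r"^hello"

# The constructor runs on every call and builds a new Regex each time.
constructed_regex() = Regex("^hello")

literal_regex() === literal_regex()            # true
constructed_regex() === constructed_regex()    # false
```

Both functions return patterns that match the same strings. Only the cost of constructing the pattern differs.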

We have now concluded the topic of macros. We learned how to create macros by taking expressions and generating a new expression. We used the @macroexpand macro to debug the macro expansion process. We also learned how to handle macro hygiene. Finally, we took a look at nonstandard string literals and created our own using a macro.

Next, we will look at another metaprogramming facility called generated functions, which can be used to solve a different kind of problem than what regular macros can handle.
