© Thomas Mailund 2018

Thomas Mailund, Domain-Specific Languages in R, https://doi.org/10.1007/978-1-4842-3588-1_6

6. Lambda Expressions

Thomas Mailund

(1)Aarhus N, Staden København, Denmark

With the techniques we have seen so far, we are now able to implement some useful domain-specific languages. In this chapter, we examine a toy example, lambda expressions. It is perhaps not something we would use in real-world code, as it simply gives an alternative syntax to anonymous functions, which are already supported in R. However, it is an excellent example of code that is potentially useful and gives us a chance to experiment with syntax.

We will use the rlang package.

library(rlang)

Anonymous Functions

Lambda expressions are anonymous functions, in other words, functions we have not named. We already have anonymous functions in R. In fact, this is the default kind of function, since a function is anonymous until we assign it to a variable. If we do not want to save a function in a variable to get access to it later, we can just use the function expression to create it where we need it. For example, to map over a vector of numbers, we could write the following:

sapply(1:4, function(x) x**2)

## [1]  1  4  9 16

This is a toy example, since in a situation like this a vector expression such as the following is preferable, but it illustrates the point.

(1:4)**2

## [1]  1  4  9 16

Using function expressions is verbose, so we might want to construct an alternative syntax for anonymous functions. We can then use it as an exercise in constructing a domain-specific language. Our goal is to change the previous sapply syntax into this syntax:

sapply(1:4, x := x**2)

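We use the := operator for two reasons. One, R's parser recognizes it as an assignment operator but base R gives it no definition, so we are free to define it ourselves, something we cannot sensibly do with -> or <-. Two, it has the same low precedence as <-, lower than the arithmetic, comparison, and logical operators, so the full expressions on its left- and right-hand sides become its two arguments. Because R evaluates arguments lazily, the operator we create receives those expressions before they are evaluated.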
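To see how R parses such an expression, we can quote it and inspect the pieces. This is a small sketch in base R; the variable name e is only used for illustration here.

```r
# Quote the expression instead of evaluating it, so we can inspect the parse tree.
e <- quote(x := x * 2 + 1)

# The call has three elements: the operator, the left side, and the right side.
length(e)
as.character(e[[1]])  # the operator name
e[[2]]                # the left-hand side, a symbol
e[[3]]                # the whole right-hand side, kept together as one call
```

The arithmetic on the right-hand side stays grouped as a single operand of :=, which is what lets us treat it as a function body.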

To implement this syntax, we need to make the left-hand side of assignments into function headers, which means pair lists of arguments. We also need to make the right-hand side into a function body we can evaluate in the environment where we define the lambda expression. The good news is that this only involves techniques we have already seen. We can write a function for turning a list of arguments into a pair list that we can use to define the formal arguments of a function like this:

make_args_list <- function(args) {
  res <- replicate(length(args), substitute())
  names(res) <- args
  as.pairlist(res)
}
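As a quick check of what make_args_list produces, we can build a function from its result by hand. The sketch below repeats the definition so it runs on its own, and it constructs the function with base R's call("function", ...) purely for illustration; the names args and f are not part of the chapter's code.

```r
make_args_list <- function(args) {
  res <- replicate(length(args), substitute())
  names(res) <- args
  as.pairlist(res)
}

# Two parameters without default values, stored as a pair list.
args <- make_args_list(c("x", "y"))
names(args)
is.pairlist(args)

# Combine the pair list with a quoted body to get an ordinary closure.
f <- eval(call("function", args, quote(x + y)))
f(1, 2)
```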

For the assignment operator, we need to use substitute to avoid evaluating the two arguments. We then use make_args_list to turn the left-hand side into formal arguments, but we keep the right-hand side expression as it is. After that, we turn the combination into a function using new_function from the rlang package. Since we want to evaluate the new function in the scope where we define the lambda expression, we use caller_env to get this environment and provide it to new_function. The entire implementation is as simple as this:

`:=` <- function(header, body) {
  header <- substitute(header)
  body <- substitute(body)
  args <- make_args_list(as.character(header))
  new_function(args, body, caller_env())
}

Now, we can use the new syntax as syntactic sugar for anonymous functions.

sapply(1:4, x := x**2)

## [1]  1  4  9 16
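Because the new function is created in the caller's environment, a lambda expression should be able to see local variables at the point where it is defined, just like an ordinary closure. The following sketch repeats the definitions above so it is self-contained; make_scaler and times3 are hypothetical names introduced only for this illustration.

```r
library(rlang)

make_args_list <- function(args) {
  res <- replicate(length(args), substitute())
  names(res) <- args
  as.pairlist(res)
}

`:=` <- function(header, body) {
  header <- substitute(header)
  body <- substitute(body)
  args <- make_args_list(as.character(header))
  new_function(args, body, caller_env())
}

# The lambda is defined inside make_scaler, so its body can refer to k.
make_scaler <- function(k) x := k * x

times3 <- make_scaler(3)
times3(1:4)
```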

What about lambda expressions with more than one argument? We might want syntax similar to this:

mapply(x,y := x*y, x = 1:6, y = 1:2)

However, this is not possible since we cannot override how R interprets commas. If we want to group some parameters, we need to put them in a function call. We can do something like this:

mapply(.(x,y) := x*y, x = 1:6, y = 1:2)

## [1]  1  4  3  8  5 12

What happens here is that make_args_list translates all the components of the left-hand expression into function parameters. A function call object behaves like a list of expressions whose first element is the function being called, so in this particular example, we create a function with three arguments: ., x, and y. Since . is not used inside the function body, it does not matter that we do not provide it when the function is called. However, if we reuse one of the parameter names as the function name in the call, this happens:

mapply(x(x,y) := x*y, x = 1:6, y = 1:2)

## Error in (function (x, x, y) : argument 1 matches multiple formal arguments

We can get rid of the function name in calls by removing the first element in the list.

`:=` <- function(header, body) {
  header <- substitute(header)
  if (is.call(header)) header <- header[-1]
  body <- substitute(body)
  args <- make_args_list(as.character(header))
  new_function(args, body, caller_env())
}

Now the earlier example will work.

mapply(x(x,y) := x*y, x = 1:6, y = 1:2)

## [1]  1  4  3  8  5 12
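To make the effect of dropping the first element concrete, we can look at what as.character returns for a quoted header before and after the subsetting. This is a base R sketch; the variable h is only for illustration.

```r
# A quoted call: the "function name" . followed by the arguments x and y.
h <- quote(.(x, y))
is.call(h)

# Converting the whole call includes the function name as a parameter.
as.character(h)

# Dropping the first element leaves only the intended parameters.
as.character(h[-1])
```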

Experiments with Alternatives to the Syntax

Using an assignment operator to define a function in this way might not be the most obvious syntax you could choose, but we have plenty of options for playing around with alternatives.

We could start by wrapping the functionality we have implemented in a single, ordinary function. There is no reason to introduce a special operator if a function call is all we need, so instead, we could implement lambda expressions like this:

lambda <- function(...) {
  spec <- eval(substitute(alist(...)))
  n <- length(spec)
  args <- make_args_list(spec[-n])
  body <- spec[[n]]
  new_function(args, body, caller_env())
}

The idea here is that the lambda function takes a list of arguments where the last element is the function body and the preceding elements are the parameters of the lambda expression.

sapply(1:4, lambda(x, 4 * x**2))

## [1]  4 16 36 64

mapply(lambda(x, y, y*x), x = 1:4, y = 4:7)

## [1]  4 10 18 28

The eval(substitute(alist(...))) expression might look a little odd if you are not used to it. What we do is take the variable number of arguments, captured by the three dots argument, and create an expression that turns those into a list. The function alist, unlike list, will not evaluate the expressions but keep the arguments as they are, which is what we want in this case. The substitute expression only creates the expression, so we need to evaluate it with eval to get the actual list. Once we have the list, we make the first arguments into function parameters and the last into the body of the lambda expression and create the function.
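The capturing step can be tried in isolation. The helper below, called capture only for this sketch, does nothing but return the unevaluated arguments, so we can check that symbols and calls come back as they were written.

```r
# Return the arguments as a list of unevaluated expressions.
capture <- function(...) eval(substitute(alist(...)))

# Neither x nor y needs to exist; nothing is evaluated.
spec <- capture(x, y, x + y)

length(spec)
is.symbol(spec[[1]])  # the first argument is the symbol x
spec[[3]]             # the last argument is the call x + y
```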

In production code, we should add some checks to make sure that the lambda expression parameters are symbols and not general expressions. However, the full functionality for lambda expressions is present in the function we have just written.
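One way such checks might look is sketched below. This is not the chapter's implementation; the name checked_lambda and the error messages are assumptions made for illustration, and the definition of make_args_list is repeated so the example runs on its own.

```r
library(rlang)

make_args_list <- function(args) {
  res <- replicate(length(args), substitute())
  names(res) <- args
  as.pairlist(res)
}

checked_lambda <- function(...) {
  spec <- eval(substitute(alist(...)))
  n <- length(spec)
  if (n == 0) stop("a lambda expression needs at least a body")

  # Everything before the last element must be a bare symbol.
  params <- spec[-n]
  is_sym <- vapply(params, is.symbol, logical(1))
  if (!all(is_sym)) stop("lambda parameters must be symbols")

  args <- make_args_list(vapply(params, as.character, character(1)))
  new_function(args, spec[[n]], caller_env())
}

# A well-formed lambda behaves as before.
checked_lambda(x, y, x + y)(2, 3)

# A general expression in parameter position is rejected.
res <- tryCatch(checked_lambda(x + 1, x), error = function(e) conditionMessage(e))
res
```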

Of course, the lambda function does not behave like a normal function. The non-standard evaluation (NSE) we apply to make a function out of the arguments to lambda is very different from how functions normally behave, where the arguments we provide are considered values rather than symbols and expressions. To make it clear from the syntax that something different is happening, you could change the syntax. For example, we could go for square brackets instead of parentheses. We can implement a version that uses those like this:

lambda <- structure(NA, class = "lambda")
`[.lambda` <- function(x, ...) {
  spec <- eval(substitute(alist(...)))
  n <- length(spec)
  args <- make_args_list(spec[-n])
  body <- spec[[n]]
  new_function(args, body, caller_env())
}

We use it like this:

sapply(1:4, lambda[x, 4 * x**2])

## [1]  4 16 36 64

mapply(lambda[x, y, y*x], x = 1:4, y = 4:7)

## [1]  4 10 18 28

The approach here is to make lambda an object with a class we can use for defining a special case of the subscript operator. The sole purpose of lambda is to dispatch the subscript function to the right specialization, and that specialization of the subscript operator is the one that creates the new function. The only difference is that it takes an extra first argument, which is the lambda object. We do not use it for anything, so we just ignore it.

Don’t Do This at Home

Implementing syntactic sugar for lambda expressions as we have done saves us only minimal typing compared to using function expressions. Anyone familiar with R will recognize a function expression, but that might not be the case with our home-made syntax for them, so it will potentially do more harm than good. Consequently, I do not recommend that you construct a new syntax for language constructions that are already implemented in R. We implemented the lambda expressions here to illustrate how we can construct new syntax with very little code.
