Function composition is not only powerful and expressive but also pleasant to work with. It’s used to some extent in any programming style, but in FP, it’s used extensively. For example, have you noticed that when you use LINQ to work with lists, you can get a lot done with only a few lines of code? That’s because LINQ is a functional API, designed with composition in mind.
In this chapter, we’ll cover the basic concept and techniques of function composition and illustrate its use with LINQ. We’ll also implement an end-to-end server-side workflow in which we’ll use the Option API introduced in chapter 6. This example illustrates many of the ideas and benefits of the functional approach, so we’ll end the chapter with a discussion of those.
Let’s start by reviewing function composition and how it relates to method chaining. Function composition is part of any programmer’s implicit knowledge. It’s a mathematical concept you learn in school and then use every day without thinking about it too much. Let’s quickly brush up on the definition.
Given two functions, f and g, you can define a function h to be the composition of those two functions, notated as follows:
h = f · g
Applying h to a value x is the same as applying g to x and then applying f to the result:
h(x) = (f · g)(x) = f(g(x))
For example, say you want to get an email address for someone working at Manning. You can have a function calculate the local part (identifying the person) and another append the domain:
record Person(string FirstName, string LastName);

static string AbbreviateName(Person p)
   => Abbreviate(p.FirstName) + Abbreviate(p.LastName);

static string AppendDomain(string localPart)
   => $"{localPart}@manning.com";

static string Abbreviate(string s)
   => s.Substring(0, Math.Min(2, s.Length)).ToLower();
AbbreviateName and AppendDomain are two functions that you can compose to get a new function that yields the Manning email for my hypothetical collaborator. Take a look at the following listing.
Func<Person, string> emailFor =
   p => AppendDomain(AbbreviateName(p));    ❶

var joe = new Person("Joe", "Bloggs");
var email = emailFor(joe);

email // => jobl@manning.com
❶ emailFor composes AppendDomain with AbbreviateName.
There are a couple of things worth noting. First, you can only compose functions with matching types: if you’re composing (f · g), the output of g must be assignable to the input type of f.
Second, in function composition, functions appear in the reverse order in which they’re performed, so f · g is sometimes read as “f after g.” For example, in AppendDomain(AbbreviateName(p)), you first execute the rightmost function and then the one to its left. This is not ideal for readability, especially if you want to compose several functions.
C# doesn’t have any special syntactic support for function composition, and although you could define a HOF Compose to compose two or more functions, this doesn’t improve readability. This is why in C# it’s best to resort to method chaining instead.
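To make the readability point concrete, here is a minimal sketch of what such a Compose HOF could look like. The names FuncExt, ComposeExample, and EmailFor, as well as the argument order of Compose, are assumptions made for illustration; none of them come from a standard library.

```csharp
using System;

public static class FuncExt
{
    // Hypothetical helper: returns x => f(g(x)), that is, "f after g".
    public static Func<T1, R> Compose<T1, T2, R>(
        this Func<T2, R> f, Func<T1, T2> g)
        => x => f(g(x));
}

public static class ComposeExample
{
    // Same logic as the earlier example, but on plain strings.
    static readonly Func<string, string> Abbreviate =
        s => s.Substring(0, Math.Min(2, s.Length)).ToLower();

    static readonly Func<string, string> AppendDomain =
        localPart => $"{localPart}@manning.com";

    // Reads right to left: Abbreviate runs first, then AppendDomain.
    public static readonly Func<string, string> EmailFor =
        AppendDomain.Compose(Abbreviate);
}
```

With these definitions, ComposeExample.EmailFor("Joe") evaluates to jo@manning.com. The functions still appear in the reverse order of execution, so the helper does nothing for readability.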
The method chaining syntax (that is, chaining the invocation of several methods with the . operator) provides a more readable way of achieving function composition in C#. Given an expression, you can chain to it any method that’s defined as an instance or extension method on the type of the expression. For instance, the previous example would need to be modified as follows:
static string AbbreviateName(this Person p)           ❶
   => Abbreviate(p.FirstName) + Abbreviate(p.LastName);

static string AppendDomain(this string localPart)     ❶
   => $"{localPart}@manning.com";
❶ The this keyword makes this an extension method
You can now chain these methods to obtain the email for the person. The following listing shows this approach.
var joe = new Person("Joe", "Bloggs");
var email = joe.AbbreviateName().AppendDomain();

email // => jobl@manning.com
Notice that now the extension methods appear in the order in which they will be executed. This significantly improves readability, especially as the complexity of the workflow increases (longer method names, additional parameters, more methods to be chained), and it’s why method chaining is the preferable way of achieving function composition in C#.
Function composition is so important that it should also hold in the world of elevated values. Let’s stay with the current example of determining a person’s email address, but now we have an Option<Person> as a starting value. You would assume that the following holds:
Func<Person, string> emailFor =
   p => AppendDomain(AbbreviateName(p));    ❶

var opt = Some(new Person("Joe", "Bloggs"));

var a = opt.Map(emailFor);                  ❷

var b = opt.Map(AbbreviateName)             ❸
           .Map(AppendDomain);              ❸

a.Equals(b) // => true
❶ emailFor is composed of AppendDomain with AbbreviateName.
❸ Maps AbbreviateName and AppendDomain in separate steps
Whether you map AbbreviateName and AppendDomain in separate steps or map their composition emailFor in a single step, the result shouldn’t change. You should be able to safely refactor between these two forms.
More generally, if h = f · g, then mapping h onto a functor should be equivalent to mapping g over that functor and then mapping f over the result. This should hold for any functor and for any pair of functions—it’s one of the functor laws, so any implementation of Map should observe it.1
If this sounds complicated, that’s probably because it describes something that you intuitively feel should always obviously hold. Indeed, it’s not easy to break this law, but you could come up with a mischievous functor that, say, keeps an inner counter of how many times Map is applied (or otherwise changes its state with every call to Map), and then the preceding wouldn’t hold because b would have a greater inner count than a.
Simply put, Map should apply a function to the functor’s inner value(s) and do nothing else so that function composition holds when working with functors just as it does with normal values. The beauty of this is that you can use any functional library in any programming language and use any functor with confidence that a refactoring such as changing between a and b in the preceding snippet will be safe.
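The same law can be checked on a functor that ships with .NET. The following sketch uses IEnumerable, with LINQ’s Select playing the role of Map; the class name FunctorLawCheck and the method CompositionHolds are hypothetical names chosen for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FunctorLawCheck
{
    static string Abbreviate(string s)
        => s.Substring(0, Math.Min(2, s.Length)).ToLower();

    static string AppendDomain(string localPart)
        => $"{localPart}@manning.com";

    // True if mapping the composition in one step yields the same items
    // as mapping the two functions in separate steps.
    public static bool CompositionHolds(IEnumerable<string> names)
    {
        Func<string, string> composed = s => AppendDomain(Abbreviate(s));

        var a = names.Select(composed);
        var b = names.Select(Abbreviate).Select(AppendDomain);

        return a.SequenceEqual(b);
    }
}
```

Because Select does nothing other than apply the given function to each item, CompositionHolds should return true for any input sequence of non-null strings.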
You can write entire programs with function composition. Each function somehow processes its input, and the output becomes the input to the following function. When you do this, you start to look at your program in terms of data flow: the program is just a set of functions, and data flows through the program through one function and into the next. Figure 7.1 illustrates a linear flow—the simplest and most useful kind.
In the previous example, we made the AbbreviateName and AppendDomain methods chainable by making them extension methods. This is also the approach taken in the design of LINQ, and if you look at System.Linq.Enumerable, you’ll see that it contains dozens of extension methods for working with IEnumerable. Let’s look at an example of composing functions with LINQ.
Imagine that, given a population, you want to find the average earnings of the richest quartile (that is, the richest 25% of people in the target population). You could write something like the following listing.
record Person(decimal Earnings);

static decimal AverageEarningsOfRichestQuartile(List<Person> population)
   => population
      .OrderByDescending(p => p.Earnings)
      .Take(population.Count / 4)
      .Select(p => p.Earnings)
      .Average();
Notice how cleanly you can write this query using LINQ (compared to, say, writing the same query imperatively with control flow statements). You may have some sense that internally the code will iterate over the list and that Take will have an if check to only yield the requested number of items, but you don’t really care. Instead, you can lay out your function calls in the form of a flat workflow—a linear sequence of instructions:
Notice how similar the code is to the workflow description. Let’s look at it in terms of data flow: you can see the AverageEarningsOfRichestQuartile function as a simple program. Its input is a List<Person>, and the output is a decimal.
Furthermore, AverageEarningsOfRichestQuartile is effectively the composition of four functions, so that the input data flows through four transformative steps and is, thus, stepwise transformed into the output value as figure 7.2 shows.
The first function, OrderByDescending, preserves the type of the data and yields a population sorted by earnings. The second step also preserves the type of the data but changes the cardinality: if the input population is composed of n people, Take now only yields n/4 people. Select preserves the cardinality but changes the type to a list of decimals, and Average again changes the type to return a single decimal value.2
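To see the type change at each step, the following sketch splits the same query into explicitly typed intermediate values. The class name PopulationStatisticsSteps and the local variable names are assumptions introduced for illustration. Note that OrderByDescending strictly returns an IOrderedEnumerable<Person>, which is itself an IEnumerable<Person>.

```csharp
using System.Collections.Generic;
using System.Linq;

public record Person(decimal Earnings);

public static class PopulationStatisticsSteps
{
    // Same query as before, one step per line with its result type spelled out.
    // Like the original, it throws if the quartile is empty (fewer than 4 people).
    public static decimal AverageEarningsOfRichestQuartile(List<Person> population)
    {
        // Same element type, same cardinality, new order
        IOrderedEnumerable<Person> sorted =
            population.OrderByDescending(p => p.Earnings);

        // Same element type, cardinality reduced to n/4
        IEnumerable<Person> richest = sorted.Take(population.Count / 4);

        // Same cardinality, element type changes to decimal
        IEnumerable<decimal> earnings = richest.Select(p => p.Earnings);

        // Collapses the sequence into a single decimal
        decimal average = earnings.Average();

        return average;
    }
}
```

Written this way, the code is more verbose than the chained version, but it makes the data flow through the four steps easy to inspect.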
Let’s try to generalize this idea of data flow so that it applies not only to queries on IEnumerable but to data in general. When something of interest happens in your program (a request, a mouse click, or simply your program being started), you can think of that something as input. That input, which is data, then goes through a series of transformations as the data flows through a sequence of functions in your program.
The simple AverageEarningsOfRichestQuartile function shown in listing 7.3 demonstrates how the design of the LINQ library allows you to compose general-purpose functions into specific queries. There are some properties that make some functions more composable than others:3
Chainable—A this argument (implicit on instance methods and explicit on extension methods) makes it possible to compose through chaining.
General—The more specific the function, the fewer cases where it’s useful to compose it.
Shape-preserving—The function preserves the shape of the structure, so if it takes an IEnumerable, it returns an IEnumerable, and so on.
And, naturally, functions are more composable than actions. Because an Action has no output value, it’s a dead end, so it can only come at the end of a pipeline.
Notice that the LINQ functions we’ve used all score 100% based on these criteria, with the exception of Average, which is not shape-preserving. Also note that the core functions we defined in the Option API do well.
How composable is AverageEarningsOfRichestQuartile? Well, about 40%: it’s pure, and it has an output value, but it’s not an extension method, and it’s extremely specific. To demonstrate this, look at some code that consumes the function as part of a unit test:
[TestCase(ExpectedResult = 75000)]
public decimal AverageEarningsOfRichestQuartile()
{
   var population = Range(1, 8)
      .Select(i => new Person(Earnings: i * 10000))
      .ToList();

   return PopulationStatistics
      .AverageEarningsOfRichestQuartile(population);
}
The test passes, but the code also shows that AverageEarningsOfRichestQuartile doesn’t share the qualities of the LINQ methods it’s composed of: it’s not chainable, and it’s so specific that you’d hardly hope to reuse it. Let’s change that:
Split it into two more general functions: AverageEarnings (so you can query the average earnings for any segment of the population) and RichestQuartile (after all, there are many other properties of the richest quartile you may be interested in).
static decimal AverageEarnings(this IEnumerable<Person> pop)
   => pop.Average(p => p.Earnings);

static IEnumerable<Person> RichestQuartile(this IEnumerable<Person> pop)
   => pop.OrderByDescending(p => p.Earnings)
      .Take(pop.Count() / 4);
Notice how easy it was to do this refactoring! This is because of the compositional nature of the function we refactored: the new functions just compose fewer of the original building blocks. (If you had an implementation of the same logic with for and if statements, the refactoring would probably not have been as easy.) You can now rewrite the test as follows:
[TestCase(ExpectedResult = 75000)]
public decimal AverageEarningsOfRichestQuartile()
   => SamplePopulation
      .RichestQuartile()
      .AverageEarnings();

List<Person> SamplePopulation
   => Range(1, 8)
      .Select(i => new Person(Earnings: i * 10000))
      .ToList();
You can see how much more readable the test is now. By refactoring to smaller functions and to the extension method syntax, you’ve created more composable functions and a more readable interface.
TIP If you compose two pure functions, the resulting function is also pure, giving you all the benefits discussed in chapter 3. As a result, libraries consisting mainly of pure, composable functions (like LINQ) tend to be powerful and pleasant to use.
In this section, you’ve seen how LINQ provides (among many other things) a set of readily composable functions that work effectively with IEnumerable. Next, we’ll see how we can use declarative, flat workflows when working with Option. Let’s start by clarifying what we mean by workflows and why they matter.
Workflows are a powerful way of understanding and expressing application requirements. A workflow is a meaningful sequence of operations leading to a desired result. For example, a cooking recipe describes the workflow for preparing a dish.
Workflows can be effectively modeled through function composition. Each operation in the workflow can be performed by a function, and these functions can be composed into function pipelines that perform the workflow, just as you saw in the previous example involving data flowing through different transformations in a LINQ query.
We’re now going to look at a more complex workflow of a server processing a command. The scenario is that of a user requesting to make a money transfer through the Bank of Codeland (BOC) online banking application. We’re only concentrating on the server side, so the workflow is kicked off when the server receives a request to make a transfer. We can write a specification for the workflow as follows:
If the account has sufficient funds, debit the amount from the account.
Wire the funds via the SWIFT network.4
The entire money transfer workflow is fairly complex, so to get us started, let’s simplify it as follows:
Let’s say that all the steps following validation are part of the subworkflow of actually booking the transfer, which should only be triggered if validation passes (see figure 7.3).
Let’s take a stab at implementing this high-level workflow. Assume that the server uses ASP.NET Core to expose an HTTP API and that it’s set up so that requests are authenticated and routed to the appropriate MVC controller (in section 9.5.3, I’ll show you how to build Web APIs without the need for controllers), making it the entry point for implementing the workflow:
using Microsoft.AspNetCore.Mvc;

public class MakeTransferController : ControllerBase
{
   IValidator<MakeTransfer> validator;

   [HttpPost, Route("api/MakeTransfer")]      ❶
   public void MakeTransfer
      ([FromBody] MakeTransfer transfer)      ❷
   {
      if (validator.IsValid(transfer))
         Book(transfer);
   }

   void Book(MakeTransfer transfer)
      => // actually book the transfer...
}
❶ POST requests to this route are routed to this method.
❷ Deserializes the request body into a MakeTransfer
The details about the requested transfer are captured in a MakeTransfer
type, which is sent in the body of the user’s request. Validation is delegated to a service on which the controller depends, which implements this interface:
Now to the interesting part, the workflow itself:
public void MakeTransfer([FromBody] MakeTransfer transfer)
{
   if (validator.IsValid(transfer))
      Book(transfer);
}

void Book(MakeTransfer transfer)
   => // actually book the transfer...
That’s the imperative approach of explicit control flow. I’m always wary of using ifs: a single if may look harmless, but if you start allowing one if, nothing is keeping you from having dozens of nested ifs as additional requirements come in, and the complexity that ensues is what makes applications error-prone and difficult to reason about. Next, we’ll look at how to use function composition instead.
Remember that idea we had about data flowing through various functions? Let’s try to think of the transfer request as data flowing through validation and into the Book method that performs the transfer. Figure 7.4 shows how this would look.
There’s a bit of a problem with types: IsValid returns a Boolean, whereas Book requires a MakeTransfer object, so these two functions don’t compose, as figure 7.5 illustrates.
Furthermore, we need to ensure that the request data flows through the validation and into Book only if it passes validation. This is where Option can help us: we can use None to represent an invalid transfer request and Some<MakeTransfer> for a valid one.
Notice that, in doing so, we’re expanding the meaning we give to Option. We interpret Some not just to indicate the presence of data, but also the presence of valid data, just like we do in the smart constructor pattern. We can now rewrite the controller method as the following listing demonstrates.
public void MakeTransfer([FromBody] MakeTransfer transfer)
   => Some(transfer)
      .Where(validator.IsValid)
      .ForEach(Book);

void Book(MakeTransfer transfer)
   => // actually book the transfer...
We lift the transfer data into an Option and apply the IsValid predicate with Where; this yields a None if validation fails, in which case, Book won’t be called. In this example, Where is the highly composable function that allows us to glue everything together. This style may be unfamiliar, but it’s actually very readable: “Keep the transfer if it’s valid, then book it.”
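If you’re curious how Where and ForEach can short-circuit the pipeline, here is a minimal stand-in for Option, written only for this illustration. It is not the implementation from chapter 6 or from any library, and the WorkflowDemo class, with decimal amounts standing in for transfer requests, is likewise an assumption.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for the Option type, for illustration only.
public readonly struct Option<T>
{
    private readonly T _value;
    private readonly bool _isSome;

    private Option(T value) { _value = value; _isSome = true; }

    public static Option<T> Some(T value) => new Option<T>(value);
    public static Option<T> None => default;

    // Applies f to the inner value, if there is one.
    public Option<R> Map<R>(Func<T, R> f)
        => _isSome ? Option<R>.Some(f(_value)) : Option<R>.None;

    // Keeps the inner value only if it satisfies the predicate.
    public Option<T> Where(Func<T, bool> predicate)
        => _isSome && predicate(_value) ? this : None;

    // Runs a side effect on the inner value, if there is one.
    public void ForEach(Action<T> action)
    {
        if (_isSome) action(_value);
    }
}

public static class WorkflowDemo
{
    // Runs a lift/normalize/validate/book pipeline for each requested amount
    // and returns the amounts that were actually booked.
    public static List<decimal> BookValid(IEnumerable<decimal> requested)
    {
        var booked = new List<decimal>();
        foreach (var amount in requested)
            Option<decimal>.Some(amount)
                .Map(a => Math.Round(a, 2))   // normalize
                .Where(a => a > 0)            // validate
                .ForEach(booked.Add);         // book (the side effect)
        return booked;
    }
}
```

Calling WorkflowDemo.BookValid(new[] { 10.004m, -5m, 0m, 3m }) books only 10.00 and 3: for the other two amounts, Where yields None and ForEach does nothing.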
Once you have a workflow in place, it becomes easy to make changes, such as adding a step to the workflow. Suppose you want to normalize the request before validating it so that things like whitespace and casing don’t cause validation to fail.
How would you go about it? You need to define a function that performs the new step and then integrate it into your workflow. The following listing shows how to do this.
public void MakeTransfer([FromBody] MakeTransfer transfer)
   => Some(transfer)
      .Map(Normalize)               ❶
      .Where(validator.IsValid)
      .ForEach(Book);

MakeTransfer Normalize(MakeTransfer request)
   => // ...
❶ Plugs a new step into the workflow
More generally, if you have a business workflow, you should aim to express it by composing a set of functions, where each function represents a step in the workflow, and their composition represents the workflow itself. Figure 7.6 shows this one-to-one translation from steps in the workflow to functions in a pipeline.
To be precise, in this case we’re not composing these functions directly—as you’ve seen, the signatures don’t allow this—but rather as arguments to the HOFs defined in Option, as figure 7.7 shows.
Next, let’s see how we can implement the rest of the workflow.
Domain modeling means creating a representation for the entities and behaviors specific to the business domain in question. In this case, we need a representation for the bank account from which the transferred funds will be debited. We’ll look at domain modeling in more detail in chapter 11, but it’s good to see the fundamentals in the current scenario.
Let’s start with a ridiculously simplistic representation of a bank account that just captures the account balance. This is enough to illustrate the fundamental differences between the OO and functional approaches. The following listing shows how an OO implementation could look.
public class Account
{
   public decimal Balance { get; private set; }

   public Account(decimal balance) { Balance = balance; }

   public void Debit(decimal amount)
   {
      if (Balance < amount)
         throw new InvalidOperationException("Insufficient funds");

      Balance -= amount;
   }
}
In OOP, data and behavior live in the same object, and methods in the object can typically modify the object’s state. By contrast, in FP data is captured with “dumb” data objects while behavior is encoded in functions, so we’ll separate the two. We’ll use an AccountState object that only contains state and a static Account class that contains functions for interacting with an account.
More importantly, notice how the preceding implementation of Debit is full of side effects: exceptions if business validation fails and state mutation. Instead, we’re going to make Debit a pure function. Instead of modifying the existing instance, we’ll return a new AccountState with the new balance.
What about avoiding the debit if the funds on the account are insufficient? Well, by now you should have learned the trick! Use None to signal an invalid state and skip the following computations! The following listing provides a functional counterpart to the code in listing 7.6.
public record AccountState(decimal Balance);                  ❶

public static class Account                                   ❷
{
   public static Option<AccountState> Debit
      (this AccountState current, decimal amount)
      => (current.Balance < amount)
         ? None                                               ❸
         : Some(new AccountState(current.Balance - amount));  ❹
}
❶ An immutable record, only containing data
❸ None here signals that the debit operation failed.
❹ Some wraps the new state of the account as a result of the operation.
Notice how the OO implementation of Debit in listing 7.6 isn’t composable: it has side effects and returns void. The functional counterpart in listing 7.7 is completely different: it’s a pure function and returns a value, which can be used as input to the next function in the chain. Next, we’ll integrate this into the end-to-end workflow.
Now that we have the main workflow skeleton and our simple domain model in place, we’re ready to complete the end-to-end workflow. We still need to implement the Book function, which should do the following:
Let’s define two services that capture DB and SWIFT access:
public interface IRepository<T>
{
   Option<T> Get(Guid id);
   void Save(Guid id, T t);
}

interface ISwiftService
{
   void Wire(MakeTransfer transfer, AccountState account);
}
Using these interfaces is still an OO pattern, but let’s stick to it for now (you’ll see how to use just functions in chapter 9). Note that IRepository.Get returns an Option to acknowledge the fact that there’s no guarantee that an item will be found for any given Guid. The following listing displays the fully implemented controller, including the Book method that was missing until now.
public class MakeTransferController : ControllerBase
{
   IValidator<MakeTransfer> validator;
   IRepository<AccountState> accounts;
   ISwiftService swift;

   public void MakeTransfer([FromBody] MakeTransfer transfer)
      => Some(transfer)
         .Map(Normalize)
         .Where(validator.IsValid)
         .ForEach(Book);

   void Book(MakeTransfer transfer)
      => accounts.Get(transfer.DebitedAccountId)
         .Bind(account => account.Debit(transfer.Amount))
         .ForEach(account =>
            {
               accounts.Save(transfer.DebitedAccountId, account);
               swift.Wire(transfer, account);
            });
}
Let’s look at the newly added Book method. Notice that accounts.Get returns an Option (in case no account was found with the given ID), and Debit also returns an Option (in case there were insufficient funds). Therefore, we compose these two operations with Bind. Finally, we use ForEach to perform the side effects we need: saving the account with the new balance and wiring the funds to SWIFT.
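To try out this Get/Bind/ForEach chain in isolation, you could use something like the following sketch. The Option struct is a minimal stand-in written for this example, InMemoryAccounts is a hypothetical dictionary-backed replacement for IRepository<AccountState>, and BookingSketch.TryBook is an invented helper that reports whether the side effect ran. The SWIFT step is left out.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for Option, reduced to what this sketch needs.
public readonly struct Option<T>
{
    private readonly T _value;
    private readonly bool _isSome;

    private Option(T value) { _value = value; _isSome = true; }

    public static Option<T> Some(T value) => new Option<T>(value);
    public static Option<T> None => default;

    // Chains an Option-returning function; a None short-circuits the rest.
    public Option<R> Bind<R>(Func<T, Option<R>> f)
        => _isSome ? f(_value) : Option<R>.None;

    public void ForEach(Action<T> action)
    {
        if (_isSome) action(_value);
    }
}

public record AccountState(decimal Balance);

public static class Account
{
    // Pure: returns the new state, or None if funds are insufficient.
    public static Option<AccountState> Debit(this AccountState current, decimal amount)
        => current.Balance < amount
            ? Option<AccountState>.None
            : Option<AccountState>.Some(new AccountState(current.Balance - amount));
}

// Hypothetical in-memory store standing in for IRepository<AccountState>.
public class InMemoryAccounts
{
    private readonly Dictionary<Guid, AccountState> _store = new();

    public Option<AccountState> Get(Guid id)
        => _store.TryGetValue(id, out var account)
            ? Option<AccountState>.Some(account)
            : Option<AccountState>.None;

    public void Save(Guid id, AccountState account) => _store[id] = account;
}

public static class BookingSketch
{
    // Returns true only if the account exists and has sufficient funds.
    public static bool TryBook(InMemoryAccounts accounts, Guid id, decimal amount)
    {
        var booked = false;
        accounts.Get(id)
            .Bind(account => account.Debit(amount))
            .ForEach(account =>
            {
                accounts.Save(id, account);
                booked = true;
            });
        return booked;
    }
}
```

Either failure (unknown account or insufficient funds) produces a None, so the lambda passed to ForEach never runs and nothing is saved.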
There are a couple of obvious shortcomings in the overall solution. First, we’re effectively using Option to stop the computation if something goes wrong along the way, but we’re not giving any feedback to the user as to whether the request was successful or why. In chapter 8, you’ll see how to remedy this with Either and related structures; this allows you to capture error details without fundamentally altering the approach shown here.
Another problem is that saving the account and wiring the funds should be done atomically: if the process fails in the middle, we could have debited the funds without sending them to SWIFT. Solutions to this issue tend to be infrastructure-specific and aren’t specific to FP.5 Now that I’ve come clean about what’s missing, let’s discuss the good bits.
Something that should stand out when you look at the controller in listing 7.8 is that there are no if statements, no for statements, and so forth. In fact, there are practically no statements at all!
One fundamental difference between the functional and imperative style is that imperative code relies on statements; functional code relies on expressions. (For a refresher on how these differ, see the “Expressions, statements, declarations” sidebar.) In essence, expressions have a value; statements don’t. While expressions such as function calls can have side effects, statements only have side effects, so they don’t compose.
If you create workflows by composing functions as we have, side effects naturally gravitate towards the end of the workflow: functions like ForEach don’t have a useful return value, so that’s where the pipeline ends. This helps to isolate side effects, even visually.
The idea of programming without using statements can seem quite foreign at first, but as the code in this and previous chapters demonstrates, it’s perfectly feasible in C#. Notice that the only statements are the two within the last ForEach. This is fine because we want to have two side effects—there’s no point hiding that.
I recommend you try coding using just expressions. It doesn’t guarantee good design, but it certainly promotes better design.
When we prefer expressions to statements, our code becomes more declarative. It declares what’s being computed rather than instructing the computer on which specific operations to carry out. In other words, it’s higher-level and closer to the way in which we communicate with other human beings. For example, the top-level workflow in our controller reads as follows:
Discounting things like Map and Where, which essentially act as glue between the operations, this reads much like the verbal, bullet-point definition of the workflow. This means the code is closer to the spoken language and, hence, easier to understand and to maintain. Let’s contrast the imperative and declarative styles in table 7.1.
Imperative | Declarative
Tells the computer what to do (for example, “Add this item to this list”). | Tells the computer what you want (for example, “Give me all the items that match a condition”).
 | Side effects naturally gravitate toward the end of the expression evaluation.a
Statements can be readily translated into machine instructions. | There is more indirection (hence, potentially more optimizations) in the process of translating expressions to machine instructions.
a This is because side-effecting functions don’t normally return a value that can be used in further evaluation.
Another thing worth pointing out is that, because declarative code is higher-level, it’s hard to look at the implementation and see that it works without the confidence of unit tests. This is actually a good thing: it’s much better to convince yourself through unit tests than to rely on the false confidence of looking at the code and seeing that it looks like it’s doing the right thing.
The implementation we’ve looked at sheds some light on a natural way to structure applications with function composition. In any reasonably complex application, we tend to introduce some form of layering, distinguishing a hierarchy of high- to low-level components where the highest-level components are entry points into the application (in our example, the controller), and the lowest are exit points (in our example, the repository and SWIFT service).
Unfortunately, I’ve worked on many projects where layering is more of a curse than a blessing, as you need to traverse several layers for any operation. This is because there’s a tendency to structure invocations between layers, as in figure 7.8.
In this approach, there’s an implicit assumption that a layer should only call into an immediately adjacent layer. This makes the architecture rigid. Furthermore, it means that the whole implementation will be impure: because the lowest-level components have side effects (they typically access the DB or external APIs), everything above is also impure—a function that calls an impure function is itself impure.
In the approach demonstrated in this chapter, the interaction between layers looks more like figure 7.9.
A higher-level component can depend on any lower-level component but not vice versa. This is a more flexible and effective approach to layering. In our example, there’s a top-level workflow that composes functions exposed by lower-level components. There are a couple of advantages here:
You get a clear, synthetic overview of the workflow within the top-level component. This doesn’t preclude you from defining subworkflows within a lower-level component.
Mid-level components can be pure. In our example, the interaction between components looks like figure 7.10.
As you can see, the domain representation can (and should!) consist of pure functions only because there’s no interaction with lower-level components; there’s only computation of a result based on inputs. The same could be true of other functionality like validation (depending on what the validation consists of). Therefore, this approach helps you to isolate side effects and facilitates testing. Because the domain model and other mid-level components are pure functions, they can easily be tested without the need for mocks.
Without looking at any code or documentation, write the type of the functions OrderBy, Take, and Average, which were used to implement AverageEarningsOfRichestQuartile.
Check your answer with the MSDN documentation: http://mng.bz/MvwD. How is Average different?
Implement a general-purpose Compose function that takes two unary functions and returns the composition of the two.
Function composition means combining two or more functions into a new function, and it’s widely used in FP.
In C#, the extension method syntax allows you to use function composition by chaining methods.
Functions lend themselves to being composed if they are pure, chainable, and shape-preserving.
Workflows are sequences of operations that can be effectively expressed in your programs through function pipelines: one function for each step of the workflow with the output of each function fed into the next.
The LINQ library has a rich set of easily composable functions to work with IEnumerables, and you can use it as inspiration to write your own APIs.
Functional code prefers expressions over statements, unlike imperative code.
Relying on expressions leads to your code becoming more declarative and, hence, more readable.
1 There’s a second, even simpler functor law: if you Map the identity function (x → x) over a functor f, the resulting functor is identical to f. Simply put, the identity function should hold in the elevated world of functors.
2 Average also causes the whole chain of methods to be evaluated because it’s the only “greedy” method in the chain.
3 These are general guidelines. It will always be possible to compose functions that don’t have these properties, but in practice, these properties are good indicators of how easy and useful it will be to compose those functions.
4 SWIFT is an interbank network; as far as we’re concerned, it’s just a third-party application with which we need to communicate.
5 This problem is difficult and fairly common in distributed architectures. If you’re storing the accounts in a database, you could be tempted to open a DB transaction, save the account within the transaction, wire the funds, and only commit once that’s done. This still leaves you unprotected if the process dies after wiring the funds but before committing the transaction. A thorough solution is to atomically create a single task, representing both operations, and have a process that performs both and removes the task only when both have successfully been carried out. This means that any of the operations are potentially performed more than once so provisions need to be made for the operations to be idempotent. A reference text on these sorts of problems and solutions is Enterprise Integration Patterns by Gregor Hohpe and Bobby Woolf (Addison-Wesley, 2004).