11 Representing state and change

This chapter covers

  • The pitfalls of state mutation
  • Representing change without mutation
  • Enforcing immutability
  • Separating data and logic

Greek philosopher Heraclitus said that we cannot step into the same river twice; the river constantly changes, so the river that was there a moment ago is no longer. Many programmers would disagree, objecting that it’s the same river but its state has changed. Functional programmers try to stay true to Heraclitus’s thinking and would create a new river with every observation.

Most programs are built to represent things and processes in the real world, and because the world constantly changes, programs must somehow represent that change. The question is how we represent change. Commercial applications written in the imperative style have state mutation at their core: objects represent entities in the business domain, and change in the world is modeled by mutating the state of these objects.

We’ll start by looking at the weaknesses we introduce in our programs when we use mutation. We’ll then see how we can avoid these problems at the source by representing change without using mutation and, more pragmatically, how to enforce immutability in C#. Finally, we’ll see how working with immutable data naturally leads to separating data from logic.

11.1 The pitfalls of state mutation

State mutation means that memory is updated in place. An important problem with it is that concurrent access to shared mutable state is unsafe. You’ve already seen examples demonstrating loss of information due to concurrent updates in chapters 1 and 3; let’s now look at a more object-oriented scenario. Imagine a Product class with an Inventory field, representing the number of units in stock:

public class Product
{
   public int Inventory { get; private set; }
   public void ReplenishInventory(int units) => Inventory += units;
   public void ProcessSale(int units) => Inventory -= units;
}

If Inventory is mutable as this example shows, and you have concurrent threads updating its value, that can lead to race conditions, and the results can be unpredictable. Imagine that you have a thread replenishing the inventory, while another thread concurrently processes a sale, diminishing the inventory as figure 11.1 shows. If both threads read the same value before either has written its result, and the thread processing the sale writes last, the replenishment is overwritten and you end up with an overall decrease in inventory.

Figure 11.1 Loss of data as a result of concurrent updates. Both threads cause the Inventory value to be updated concurrently with the result that one of the updates is lost.

Not only has the update to replenish the inventory been lost, but the first thread now potentially faces a completely invalid state: a product that’s just been replenished has zero inventory.

If you’ve done some basic multithreading, you’re probably thinking, “Easy! You just need to wrap the updates to Inventory in a critical section using the lock statement.” It turns out that this solution, which works for this simple case, can become the source of some difficult bugs as the complexity of the system increases. (A sale affects not only the inventory, but the sales order, the company balance sheet, and so on.)
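As a rough illustration of both the lost update and the lock-based fix for this simple case, here is a minimal sketch. UnsafeProduct and LockedProduct are hypothetical variants of the Product class above (the names and the sync field are assumptions made for this example). The unsynchronized result varies from run to run, so no particular value should be expected from it.

```csharp
using System;
using System.Threading.Tasks;

const int Iterations = 100_000;

// Two threads hammer the same unsynchronized instance.
var unsafeProduct = new UnsafeProduct();
Parallel.Invoke(
   () => { for (var i = 0; i < Iterations; i++) unsafeProduct.ReplenishInventory(1); },
   () => { for (var i = 0; i < Iterations; i++) unsafeProduct.ProcessSale(1); });

// Same workload, but every update runs inside a critical section.
var lockedProduct = new LockedProduct();
Parallel.Invoke(
   () => { for (var i = 0; i < Iterations; i++) lockedProduct.ReplenishInventory(1); },
   () => { for (var i = 0; i < Iterations; i++) lockedProduct.ProcessSale(1); });

// Updates may be lost here, so this can print any value, not necessarily 0.
Console.WriteLine($"Without lock: {unsafeProduct.Inventory}");
// Each read-modify-write is serialized, so additions and subtractions cancel out.
Console.WriteLine($"With lock: {lockedProduct.Inventory}");

// Hypothetical: same shape as the Product class above.
public class UnsafeProduct
{
   public int Inventory { get; private set; }
   public void ReplenishInventory(int units) => Inventory += units;
   public void ProcessSale(int units) => Inventory -= units;
}

// Hypothetical: the same operations wrapped in lock statements.
public class LockedProduct
{
   private readonly object sync = new();
   private int inventory;

   public int Inventory { get { lock (sync) return inventory; } }
   public void ReplenishInventory(int units) { lock (sync) inventory += units; }
   public void ProcessSale(int units) { lock (sync) inventory -= units; }
}
```

Note that the lock protects only this one field; it does nothing to keep the inventory consistent with other data that a sale affects.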

If things can fail when a single variable is set, imagine when an update to an entity involves updating several fields. For example, imagine that when you update the inventory, you also set a flag indicating whether the product is low on inventory as the following listing shows.

Listing 11.1 Temporary inconsistency as a result of non-atomic updates

class Product
{
   int inventory;

   public bool IsLowOnInventory { get; private set; }

   public int Inventory
   {
      get => inventory;
      private set
      {
         inventory = value;
         // At this point, the object can be in an invalid state
         // from the perspective of any thread reading its properties.
         IsLowOnInventory = inventory <= 5;
      }
   }
}

This code defines an invariant: when inventory is 5 or less, then IsLowOnInventory must be true.

In a single-threaded setting, there aren’t any problems with the preceding code. But in a multithreaded setting, a thread could be reading the state of this object just as another thread is performing the update in the window during which Inventory has been updated but IsLowOnInventory hasn’t. (Notice that this window widens if the logic to compute IsLowOnInventory becomes more expensive.) During that window, the invariant can be broken, so the object would appear to be in an invalid state to the first thread. This will, of course, happen very rarely, and it will be nearly impossible to reproduce. This is part of the reason why bugs caused by race conditions are so hard to diagnose.

Indeed, race conditions are known to have caused some of the most spectacular failures in the software industry. If you have a system with concurrency and state mutation, it’s impossible to prove that the system is free of race conditions. In other words, if you want concurrency (and, given today’s tendency toward multicore processors and distributed computing, you hardly have a choice) and strong guarantees of correctness, you simply must give up mutation.

Lack of safe concurrent access may be the biggest pitfall of shared mutable state, but it’s not the only one. Another problem is the risk of introducing coupling, that is, a high degree of interdependence between different parts of your system. In listing 11.1, Inventory is encapsulated, meaning it can only be set from within the class, and according to OOP theory, that’s supposed to give you a sense of comfort. But how many methods in the Product class can set the inventory value? How many code paths lead into these methods so that they ultimately affect the value of Inventory? How many parts of the application can get the same instance of the Product and rely on the value of Inventory, and how many will be affected if you introduce a new component that causes Inventory to change?

For a non-trivial application, it’s difficult to answer these questions completely. This is why inventory, even though it’s a private field and can be set only via a private setter, qualifies as a global mutable state; as far as we can tell, it could be mutated by any part of the program via public methods in the enclosing class. As a result, mutable state couples the behavior of the various components that read or update that state, making it difficult to reason about the behavior of the system as a whole.

Finally, shared mutable state implies loss of purity. As explained in chapter 3, mutating global state (remember, that’s all state that’s not local to a function, including private variables) constitutes a side effect. If you represent change in the world by mutating objects in your system, you lose the benefits of function purity. For these reasons, the functional paradigm discourages state mutation altogether.

NOTE In this chapter, you’ll learn how to work with immutable data objects. That’s an important technique, but keep in mind that it’s not always sufficient to represent entities that change with time. Immutable data objects can represent the state of an entity at any given point in time, somewhat like a frame in a film, but to represent the entity itself, to get the full moving picture, you need a further abstraction that links those successive states together. We’ll discuss techniques for accomplishing that in chapters 13, 15, 18, and 19.

Local mutation is OK

Not all state mutation is equally evil. Mutating local state (state that’s only visible within the scope of a function) is inelegant but benign. For example, imagine the following function:

int Sum(int[] ints)
{
   var result = 0;
   foreach (int i in ints) result += i;
   return result;
}

Although we’re updating result, this isn’t visible from outside the scope of the function. As a result, this implementation of Sum is actually a pure function: it has no observable side effects from the point of view of a calling function.

Naturally, this code is also low-level. You can normally achieve what you want with built-in functions like Sum, Aggregate, and so on. In practice, it’s rare that you’ll find a legitimate case for mutating local variables.
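For comparison, here is a short sketch of the same computation using the built-in LINQ functions mentioned above. The sample array is an assumption made for this example.

```csharp
using System;
using System.Linq;

int[] ints = { 1, 2, 3, 4 };

// Sum is the most direct replacement for the hand-written loop.
int viaSum = ints.Sum();

// Aggregate threads an accumulator through the sequence;
// no variable is mutated in our own code.
int viaAggregate = ints.Aggregate(0, (acc, i) => acc + i);

Console.WriteLine(viaSum);        // 10
Console.WriteLine(viaAggregate);  // 10
```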

11.2 Understanding state, identity, and change

Let’s look more closely at change and mutation. By change, I mean change in the real world, such as when 50 units of stock become available for sale. Mutation means data is updated in place; as you saw in the Product class, when the Inventory value is updated, the previous value for Inventory is lost.

In FP, we represent change without mutation: values aren’t updated in place. Instead, we create new instances that represent the data with the desired changes, as figure 11.2 shows. The fact that the current level of inventory is 53 doesn’t obliterate the fact that it was previously 3.

Figure 11.2 In FP, change can be represented by creating new versions of the data.

In FP, we work with immutable values: once a value is initialized, it’s never updated.

Wrapping your head around immutable objects

If you’ve always used mutation to represent change, creating replicas of objects when their properties are updated can seem counterintuitive. For example, consider this code:

record Product(int Inventory);
 
static Product ReplenishInventory(Guid id, int units)
{
   Product original = RetrieveProduct(id);
   Product updated = new Product(original.Inventory + units);
   return updated;
}

In this code, Product is immutable, so we represent new inventory becoming available by creating a new Product instance. You may feel awkward about this because now there are two competing Product instances in memory, only one of which accurately represents the real-world product.

Note that in this example, the updated instance is returned, while the original instance goes out of scope and will therefore be garbage-collected. In many cases, the obsolete instance will simply be “forgotten” rather than overwritten.

But there are cases in which you do want several views of an entity to coexist. For example, say your employer offers free shipping for orders that are over $40. You might like to have a view of the order before and after a user removes an item to warn them if they have just lost the right to free delivery. Or, an update may be part of an in-memory transaction, and you may want to revert to the previous state of the entity if the transaction fails.
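The free-shipping scenario can be sketched as follows. Order, OrderLine, and the functions in OrderExt are hypothetical names introduced only for this illustration, as are the sample products and prices. The point is that both the earlier and the later view of the order remain available for comparison.

```csharp
using System;
using System.Collections.Immutable;
using System.Linq;

var before = new Order(ImmutableList.Create(
   new OrderLine("Book", 30m),
   new OrderLine("Pen", 15m)));

// Removing an item yields a new Order; `before` is left untouched.
var after = before.Remove("Pen");

// Both views coexist, so we can compare them.
if (before.QualifiesForFreeShipping() && !after.QualifiesForFreeShipping())
   Console.WriteLine("Heads up: this order no longer qualifies for free shipping.");

// Hypothetical types, for illustration only.
public record OrderLine(string Product, decimal Price);
public record Order(ImmutableList<OrderLine> Lines);

public static class OrderExt
{
   const decimal FreeShippingThreshold = 40m;

   public static decimal Total(this Order order)
      => order.Lines.Sum(line => line.Price);

   public static bool QualifiesForFreeShipping(this Order order)
      => order.Total() > FreeShippingThreshold;

   // Returns a modified copy rather than mutating the given order.
   public static Order Remove(this Order order, string product)
      => order with { Lines = order.Lines.RemoveAll(line => line.Product == product) };
}
```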

The idea that only the latest or current view of the data is valuable is just a prejudice deriving from mainstream practice. When you give it up, many new possibilities appear.

To refine or redefine your intuition about change and mutation, it’s useful to distinguish between things that change and things that don’t.

11.2.1 Some things never change

There are some things that we think of as inherently immutable. For example, your age may change from 30 to 31, but the number 30 is still the number 30, and 31 is still 31.

This is modeled in the Base Class Library (BCL) in that all primitive types are immutable. What about more complex types? Dates are a good example. The third of March is still the third of March, even though you may change an appointment in your calendar from the third of March to the fourth. This is also reflected in the BCL in that types that are used to represent dates such as DateTime are immutable. See this for yourself by typing the following in the REPL (use DateTime instead of DateOnly if you don’t have .NET 6):

var momsBirthday = new DateOnly(1966, 12, 13);
// John has the same birthday as Mom.
var johnsBirthday = momsBirthday;

// some time goes by...

// You realize that John's birthday is actually one day later.
johnsBirthday = johnsBirthday.AddDays(1);

johnsBirthday // => 14/12/1966
// Mom's birthday was not affected.
momsBirthday  // => 13/12/1966

In the preceding example, we start by saying that Mom and John have the same birthday, so we assign the same value to momsBirthday and johnsBirthday. When we then use AddDays to create a later date and assign it to johnsBirthday, this leaves momsBirthday unaffected. In this example, we are doubly protected from mutating the date:

  • Because System.DateOnly is a struct, it’s copied upon assignment, so momsBirthday and johnsBirthday are different instances.

  • Even if DateOnly were a class, so that momsBirthday and johnsBirthday pointed to the same instance, the behavior would still be the same because AddDays creates a new instance, leaving the underlying instance unaffected.

If, on the other hand, DateOnly were a mutable class and AddDays mutated the days of its instance, the value of momsBirthday would be updated as a result—or, rather, as a side effect—of updating johnsBirthday. (Imagine explaining to Mom that that’s the reason for your belated birthday wishes.)
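To make this counterfactual concrete, here is a sketch with a hypothetical MutableDate class. This type is not in the BCL; it’s invented for this example, and its day arithmetic is deliberately naive.

```csharp
using System;

var momsBirthday = new MutableDate(1966, 12, 13);
var johnsBirthday = momsBirthday;   // both variables point to the same instance

johnsBirthday.AddDays(1);           // mutates the shared instance in place

Console.WriteLine(johnsBirthday);   // 14/12/1966
Console.WriteLine(momsBirthday);    // 14/12/1966 (changed as a side effect)

// Hypothetical type, for illustration only.
public class MutableDate
{
   public int Year { get; private set; }
   public int Month { get; private set; }
   public int Day { get; private set; }

   public MutableDate(int year, int month, int day)
      => (Year, Month, Day) = (year, month, day);

   // Ignores month and year rollover; enough to show the aliasing problem.
   public void AddDays(int days) => Day += days;

   public override string ToString() => $"{Day:00}/{Month:00}/{Year}";
}
```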

Immutable types in the .NET framework

Here are the most commonly used immutable types in .NET’s Base Class Library:

  • DateTime, TimeSpan, DateTimeOffset, DateOnly, TimeOnly

  • Delegate

  • Guid

  • Nullable<T>

  • String

  • Tuple<T1>, Tuple<T1, T2>, ...

  • Uri

  • Version

Furthermore, all primitive types are immutable.

Now let’s define a custom immutable type. Say we represent a Circle like so:

readonly record struct Circle(Point Center, double Radius);

You would probably agree that it makes no sense that a circle should ever grow or shrink because it’s a completely abstract geometric entity. The preceding implementation reflects this by declaring the struct as readonly, which makes it immutable. This means that it will not be possible to update the values for Radius and Center; once created, the state of the circle can never change.

Structs should be immutable

Notice that I’ve defined Circle as a value type. Because value types are copied when passed between functions, it makes sense that structs should be immutable. This isn’t enforced by the compiler, so you could create a mutable struct. In fact, if you declare a record struct without the readonly modifier, you get a mutable struct.

Unlike with classes, any changes you make to a mutable struct propagate down but not up the call stack, potentially leading to unexpected behavior. For this reason, I recommend you always stick to immutable structs, the only exceptions being warranted by proven performance requirements.
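Here is a small sketch of what “down but not up” means in practice. MutableCounter and IncrementAndRead are hypothetical names used only for this illustration.

```csharp
using System;

var counter = new MutableCounter { Count = 1 };

// The callee receives a copy that includes the caller's change (Count = 1)...
int seenByCallee = IncrementAndRead(counter);

Console.WriteLine(seenByCallee);   // 2
// ...but the increment made inside the callee never reaches the caller's copy.
Console.WriteLine(counter.Count);  // 1

static int IncrementAndRead(MutableCounter c)
{
   c.Count++;        // mutates the local copy only
   return c.Count;
}

// Hypothetical mutable struct, for illustration only.
public struct MutableCounter
{
   public int Count;
}
```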

If you have a circle and you’d like a circle double the size, you can define functions to create a new circle based on an existing one. Here’s an example:

static Circle Scale(this Circle c, double factor)
   => c with { Radius = c.Radius * factor };
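A minimal usage sketch follows. The text doesn’t show a definition for Point, so the Point struct here is an assumption, as is the CircleExt class name that hosts the extension method.

```csharp
using System;

var unit = new Circle(new Point(0, 0), 1.0);
var doubled = unit.Scale(2);

Console.WriteLine(unit.Radius);     // 1 (the original circle is unchanged)
Console.WriteLine(doubled.Radius);  // 2
// unit.Radius = 3;                 // would not compile: the struct is readonly

// Assumed definition; any immutable point type would do.
public readonly record struct Point(double X, double Y);
public readonly record struct Circle(Point Center, double Radius);

public static class CircleExt
{
   public static Circle Scale(this Circle c, double factor)
      => c with { Radius = c.Radius * factor };
}
```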

OK, so far we haven’t used mutation, and these examples are pretty intuitive. What do numbers, dates, and geometric entities have in common? Their value captures their identity: they are value objects. If you change the value of a date . . . well, it identifies a different date! The problems begin when we consider objects whose value and identity are different things. We’ll look at this next.

11.2.2 Representing change without mutation

Many real-world entities change with time: your bank account, your calendar, your contacts list—all these things have a state that changes with time. Figure 11.3 illustrates this idea.

Figure 11.3 An entity whose state changes over time

For such entities, their identity isn’t captured by their value because their identity remains constant, whereas their value changes with time. Instead, their identity is associated with different states at different points in time. Your age may change, or your salary, but your identity doesn’t. To represent such entities, programs must model not only an entity’s state (that’s the easy part), but the transitions from one state to another and often the association of an identity with the entity’s current state.

We’ve discussed some reasons why mutation provides an imperfect mechanism for managing state transitions. In FP, states are not mutated; they’re snapshots that, like the frames of a film, represent an evolving reality but are in themselves static.

11.3 Using records to capture the state of domain entities

To illustrate immutable data objects in C#, let’s start working on AccountState, which we’ll use to represent the state of a bank account in the BOC application. The following listing shows our model.

Listing 11.2 A simple model for the state of a bank account

public enum AccountStatus
{ Requested, Active, Frozen, Dormant, Closed }
 
public record AccountState
(
   CurrencyCode Currency,
   AccountStatus Status = AccountStatus.Requested,
   decimal AllowedOverdraft = 0m,
   IEnumerable<Transaction> TransactionHistory = null
);
 
public record Transaction
(
   decimal Amount,
   string Description,
   DateTime Date
);

For brevity, I’ve omitted the definition of CurrencyCode, which simply wraps a string value such as EUR or USD similarly to the ConnectionString and SqlTemplate types we saw in section 9.4.1.

Because AccountState has several fields and not all may be meaningful all the time, I have provided some reasonable default values for all fields except the currency. To create an AccountState, all you really need is its currency:

var newAccount = new AccountState(Currency: "EUR");

This creates an AccountState with a default status of Requested. When you’re ready to activate the account, you can do this by using a with expression:

public static AccountState Activate(this AccountState original)
   => original with { Status = AccountStatus.Active };

This creates a new instance of AccountState, populated with all the values from the original except for Status, which is set to the new value. The original object is still intact:

var original = new AccountState(Currency: "EUR");
var activated = original.Activate();
 
original.Status    // Requested
original.Currency  // "EUR"
 
activated.Status   // Active
activated.Currency // "EUR"

Notice that you can use with expressions that set more than one property:

public static AccountState RedFlag(this AccountState original)
   => original with
   {
      Status = AccountStatus.Frozen,
      AllowedOverdraft = 0m
   };

Performance impact of using immutable objects

Working with immutable objects means that every time your data needs to change, you create a new, modified instance rather than mutating the object in place. “But isn’t that terribly inefficient?” you may be thinking.

There is indeed a small performance penalty for creating modified copies, as well as for creating a greater number of objects that will eventually need to be garbage-collected. This is also why FP isn’t practical in languages that lack automatic memory management.

But the performance impact is smaller than you might think because the modified instance is a shallow copy of the original. That is, objects referenced by the original object aren’t copied; only the reference is copied. With the exception of the field being updated, the new object is a bitwise replica of the original.

For example, when you create a new AccountState with an updated status, the list of transactions won’t be copied. Instead, the new object references the original list of transactions. (This too should be immutable, so it’s OK for different instances to share it.)
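The sharing described above can be observed directly. In this sketch, the CurrencyCode definition is an assumption (the text omits it), modeled as a thin wrapper with an implicit conversion from string; the sample transaction is also invented for the example.

```csharp
#nullable disable
using System;
using System.Collections.Generic;
using System.Collections.Immutable;

var history = ImmutableList.Create(
   new Transaction(100m, "Deposit", new DateTime(2021, 1, 1)));

var original = new AccountState("EUR", TransactionHistory: history);
var activated = original with { Status = AccountStatus.Active };

// Two distinct AccountState instances...
Console.WriteLine(object.ReferenceEquals(original, activated));   // False
// ...that point to the very same list of transactions.
Console.WriteLine(object.ReferenceEquals(
   original.TransactionHistory, activated.TransactionHistory));   // True

public enum AccountStatus
{ Requested, Active, Frozen, Dormant, Closed }

// Assumed shape of CurrencyCode, which the text does not show.
public record CurrencyCode(string Value)
{
   public static implicit operator CurrencyCode(string value) => new(value);
}

public record Transaction(decimal Amount, string Description, DateTime Date);

public record AccountState
(
   CurrencyCode Currency,
   AccountStatus Status = AccountStatus.Requested,
   decimal AllowedOverdraft = 0m,
   IEnumerable<Transaction> TransactionHistory = null
);
```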

with expressions are fast. Of course, in-place updates are even faster, so there’s a tradeoff between performance and safety. The performance penalty of creating shallow copies is likely to be negligible in the vast majority of cases. My advice is to put safety first and optimize later as needed.

Next, let’s see how we can further improve this model.

11.3.1 Fine-grained control on record initialization

Have another look at the proposed definition of AccountState (replicated in the following snippet) and see if you can spot any potential problems with it:

public record AccountState
(
   CurrencyCode Currency,
   AccountStatus Status = AccountStatus.Requested,
   decimal AllowedOverdraft = 0m,
   IEnumerable<Transaction> TransactionHistory = null
);

There are in fact a couple of issues here. One thing that immediately stands out is the default value of null for the list of transactions. The reason for providing a default value is that when a new account is created, it will have no previous transactions, so it makes sense to have this as an optional parameter. But we also don’t want null to potentially cause a NullReferenceException. Secondly, this record definition allows you to create an account by changing the currency of an existing account, like so:

var usdAccount = newAccount with { Currency = "USD" };

This makes no sense. Although the status of an account may go from, say, Requested to Active, once an account is opened with a given currency, that should never change. We’d like our model to represent this. Let’s see how we can address both issues, starting with the latter.

Read-only vs. init-only properties

When you use positional records, the compiler creates an init-only auto property for each parameter you declare. This is a property with a get and an init method; the latter is a setter that can only be called when the record instance is initialized. If we were to explicitly declare the Currency property as a public init-only auto property, just as the compiler would generate, it would look like this:

public record AccountState
(
   CurrencyCode Currency,
   AccountStatus Status = AccountStatus.Requested,
   decimal AllowedOverdraft = 0m,
   IEnumerable<Transaction> TransactionHistory = null
)
{
   public CurrencyCode Currency { get; init; } = Currency;
}

The following listing breaks this down so that you can see what every bit means.

Listing 11.3 Explicitly defining a property in a positional record definition

public record AccountState(CurrencyCode Currency /*...*/)
{
   // Currency here refers to the name of the property.
   public CurrencyCode Currency
   {
      get;     // Gets the value of the property
      init;    // Allows the value to be set only upon record initialization
   }
   =           // Introduces the property initializer
   // Currency here refers to the constructor parameter; this means that
   // upon initialization the Currency property is set to the value
   // provided for the Currency constructor parameter.
   Currency;
}

When you use a with expression to create a modified version of a record, the runtime creates a clone of the original and then calls the init method of any properties for which you’ve provided new values. Now, writing the property explicitly allows us to override the compiler’s defaults; in this case, we want to define the Currency property as a read-only auto property by removing the init method:

public CurrencyCode Currency { get; } = Currency;

Then a with expression attempting to create a modified version of an account with a different currency will not compile because there’s no init method for setting the Currency of the copy.
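The following sketch puts the read-only property in context. AccountState is trimmed to two fields for brevity, and CurrencyCode is an assumed definition (the text omits it).

```csharp
#nullable disable
using System;

var account = new AccountState("EUR");

// Fine: Status is still a compiler-generated init-only property.
var active = account with { Status = AccountStatus.Active };

// Does not compile: Currency has no init accessor.
// var usd = account with { Currency = "USD" };

Console.WriteLine(active.Currency.Value);   // EUR
Console.WriteLine(active.Status);           // Active

public enum AccountStatus
{ Requested, Active, Frozen, Dormant, Closed }

// Assumed shape of CurrencyCode, which the text does not show.
public record CurrencyCode(string Value)
{
   public static implicit operator CurrencyCode(string value) => new(value);
}

// Trimmed-down version of AccountState, for illustration only.
public record AccountState
(
   CurrencyCode Currency,
   AccountStatus Status = AccountStatus.Requested
)
{
   // Read-only: set once from the constructor parameter, never by a with expression.
   public CurrencyCode Currency { get; } = Currency;
}
```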

Immutable objects never change, so all properties of an immutable object must be either read-only or init-only:

  • Use init-only properties if it makes sense to create a copy where a property is given an updated value.

  • Use read-only properties otherwise.

As you’ve seen, the compiler-generated properties of positional records are init-only, so you need to explicitly declare them if you want them to be read-only.

Initializing an optional list to be empty

Now let’s go back to the problem of TransactionHistory, which is initialized to be null when no value is passed to the constructor for AccountState. What we really want is to have an empty list as the default value, so ideally we’d like to write

public record AccountState
(
   // ...
   IEnumerable<Transaction> TransactionHistory
      = Enumerable.Empty<Transaction>()
);

But this doesn’t compile because default values for optional arguments must be compile-time constants. The most concise solution is to explicitly define the TransactionHistory property and use a property initializer, as the following listing shows.

Listing 11.4 Initializing a record with an empty list

public record AccountState
(
   CurrencyCode Currency,
   AccountStatus Status = AccountStatus.Requested,
   decimal AllowedOverdraft = 0m,
   IEnumerable<Transaction> TransactionHistory = null
)
{
   public IEnumerable<Transaction>
      TransactionHistory { get; init; }
      // Refers to the constructor parameter
      = TransactionHistory
         // Uses an empty list if the constructor was given null
         ?? Enumerable.Empty<Transaction>();
}

While default values for method arguments must be compile-time constants, property initializers don’t have this constraint. Therefore, we can include some logic in the property initializer. The previous code replaces the auto-generated property for TransactionHistory with an explicit declaration; it’s essentially saying, “When a new AccountState is created, use the value given for the optional TransactionHistory constructor parameter to populate the TransactionHistory property, but use an empty list if it’s null.”

There are other possible approaches: you could explicitly define a constructor and have this logic in the constructor, or define a full property with a backing field and have this logic in the property’s init method.
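As a sketch of the first alternative, the record below drops the positional syntax and places the null check in an explicit constructor. This is one possible shape, not necessarily the one the BOC application uses; CurrencyCode is again an assumed definition.

```csharp
#nullable disable
using System;
using System.Collections.Generic;
using System.Linq;

var account = new AccountState("EUR");

// The history is empty rather than null, so enumerating it is safe.
Console.WriteLine(account.TransactionHistory.Any());   // False

public enum AccountStatus
{ Requested, Active, Frozen, Dormant, Closed }

// Assumed shape of CurrencyCode, which the text does not show.
public record CurrencyCode(string Value)
{
   public static implicit operator CurrencyCode(string value) => new(value);
}

public record Transaction(decimal Amount, string Description, DateTime Date);

public record AccountState
{
   public CurrencyCode Currency { get; }
   public AccountStatus Status { get; init; }
   public decimal AllowedOverdraft { get; init; }
   public IEnumerable<Transaction> TransactionHistory { get; init; }

   public AccountState
   (
      CurrencyCode currency,
      AccountStatus status = AccountStatus.Requested,
      decimal allowedOverdraft = 0m,
      IEnumerable<Transaction> transactionHistory = null
   )
   {
      Currency = currency;
      Status = status;
      AllowedOverdraft = allowedOverdraft;
      // The defaulting logic now lives in the constructor.
      TransactionHistory = transactionHistory ?? Enumerable.Empty<Transaction>();
   }
}
```

The tradeoff is verbosity: every property must be declared by hand, and the record no longer gets the members that the compiler generates for positional parameters, such as Deconstruct.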

11.3.2 Immutable all the way down

There is one more tweak. For an object to be immutable, all its members must be immutable. If you look at the definition for AccountState, there’s a catch. TransactionHistory is defined as an IEnumerable<Transaction>, and while Transaction is immutable, there are many mutable lists that implement IEnumerable. For example, consider the following code:

var mutableList = new List<Transaction>();
 
var account = new AccountState
(
   Currency: "EUR",
   TransactionHistory: mutableList
);
 
account.TransactionHistory.Count() // => 0
 
mutableList.Add(new(-1000, "Create trouble", DateTime.Now));
 
account.TransactionHistory.Count() // => 1

This code creates an AccountState with a mutable list, but the calling code still holds a reference to that list, so the list can still be mutated after the AccountState has been created. As a result, we cannot say that our definition of AccountState is truly immutable.

There are two possible solutions. You could change the type definition, declaring TransactionHistory to be an ImmutableList rather than an IEnumerable. Alternatively, you could rewrite the property as the following listing shows.

Listing 11.5 Making a record immutable even if given a mutable list

using System.Collections.Immutable;
 
public record AccountState // ...
{
   public CurrencyCode Currency { get; } = Currency;
 
   public IEnumerable<Transaction> TransactionHistory { get; init; }
      = ImmutableList.CreateRange
         (TransactionHistory ?? Enumerable.Empty<Transaction>());
}

This code creates an ImmutableList from the given IEnumerable, thus making AccountState truly immutable.

TIP If given an ImmutableList, CreateRange will just return it so that you don’t incur any overhead by using this approach. Otherwise, it will create a defensive copy, ensuring that any subsequent mutation to the given list does not affect AccountState.
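The defensive-copy behavior can be checked in isolation with a short sketch. Strings are used here instead of Transaction purely to keep the example small.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;

var mutableList = new List<string> { "a" };
var copy = ImmutableList.CreateRange(mutableList);

// Mutating the source list after the fact...
mutableList.Add("b");

// ...does not affect the immutable copy.
Console.WriteLine(copy.Count);          // 1
Console.WriteLine(mutableList.Count);   // 2

// When the input is already an ImmutableList, no copy should be needed;
// per the tip above, the same instance is expected to come back.
var again = ImmutableList.CreateRange(copy);
Console.WriteLine(object.ReferenceEquals(copy, again));
```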

If an account has an immutable list of transactions, how do you add a transaction to the list? You don’t. You create a new list that has the new transaction as well as all existing ones, and that will be part of a new AccountState. The following listing shows that adding a child to an immutable object involves the creation of a new parent object.

Listing 11.6 Adding a child to an immutable object

// Includes Prepend as an extension method on IEnumerable
using LaYumba.Functional;

public static AccountState Add
   (this AccountState account, Transaction trans)
   => account with
   {
      // A new IEnumerable, including existing values and the one being added
      TransactionHistory
         = account.TransactionHistory.Prepend(trans)
   };

Notice that in this particular case, we’re prepending the transaction to the list. This is domain-specific; in most cases, you’re interested in the latest transactions, so it’s efficient to keep the latest ones at the front of the list.

Copying a list every time a single element is added or removed may sound terribly inefficient, but this isn’t necessarily the case. We’ll discuss why in chapter 12.

Hurdles to using C# records

In this section, you’ve seen how we could use records to great effect to define custom immutable data types. However, records are a recent feature in C#, so it’s possible that you may encounter some hurdles when trying to adopt records.

Specifically, if you use an object-relational mapper (including Entity Framework), which uses change tracking to see which objects have changed and need to be updated in the DB, or relies on an empty constructor and settable properties to populate objects, you may not be able to use records. Another stumbling block could be serialization. While System.Text.Json supports serializing records to and from JSON, other serializers may not support records yet. In this case, consider using immutability by convention (discussed in the appendix). I expect that in time records will gain popularity and will eventually be supported by all major libraries.
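As a quick check of the System.Text.Json support mentioned above, the following sketch round-trips a Transaction record through JSON. The sample values are assumptions; the exact JSON text depends on serializer options, so it’s printed without a stated expectation.

```csharp
using System;
using System.Text.Json;

var original = new Transaction(-50m, "Groceries", new DateTime(2021, 3, 3));

// Serialize the record to JSON and read it back through its constructor.
string json = JsonSerializer.Serialize(original);
var roundTripped = JsonSerializer.Deserialize<Transaction>(json);

Console.WriteLine(json);
// Records compare by value, so the round-tripped copy equals the original.
Console.WriteLine(original == roundTripped);   // True

public record Transaction(decimal Amount, string Description, DateTime Date);
```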

11.4 Separating data and logic

One of the ways in which FP reduces coupling in your applications, therefore making them simpler and easier to maintain, is that it naturally leads to a separation between data and logic. This is the approach we’ve been following in the preceding section:

  • AccountState, which we defined in listing 11.2, only contains data.

  • Business logic, such as activating an account or adding a transaction, is modeled through functions.

We can group all these functions into a static Account class, including logic for creating new and updated versions of AccountState, as the following listing demonstrates.

Listing 11.7 A static class that includes account-specific business logic

public static class Account
{
   public static AccountState Create(CurrencyCode ccy) => new(ccy);
 
   public static AccountState Activate(this AccountState account)
      => account with { Status = AccountStatus.Active };
 
   public static AccountState Add
      (this AccountState account, Transaction trans)
      => account with
      {
         TransactionHistory
            = account.TransactionHistory.Prepend(trans)
      };
}

Account is a static class for representing changes to an account, including a factory function. While AccountState represents the state of the account at a given time, the functions in Account represent state transitions. This is illustrated in figure 11.4.

Figure 11.4 Representing state and logic related to an entity are separate concerns. In this example, AccountState captures the data representing an account, while Account is a collection of functions that model changes to an account.

When we write logic at a high level, we only rely on Account: for example,

var onDayOne = Account.Create("USD");
var onDayTwo = Account.Activate(onDayOne);

This means that FP allows you to treat representing state and representing state transitions as separate concerns. Also, business logic is higher-level compared to the data (Account depends on the lower-level AccountState).

Naming conventions

If you follow the approach of separating logic from data, you have to pick a naming convention to differentiate the data object from the class including the logic. Here, I used the entity name (Account) for the class containing the logic; this is because I like to have the best readability when referring to functions point-free: for example, Account.Activate in

Option<AccountState> Activate(Guid id)
   => GetAccount(id).Map(Account.Activate);

The more verbose AccountState, on the other hand, can often be omitted by using var. Other naming conventions are possible, of course. Pick what makes the most sense, and be consistent within your application.

Account is a class because C# syntax requires it (with the exception of top-level statements, you cannot declare methods outside of a type), but conceptually, it’s just a grouping of related functions. This can be referred to as a module. These functions don’t rely on any state in the enclosing class, so you can think of them as free-standing functions and of the class name as part of the namespace.

This separation between data (which is inert) and functions (which perform data transformations) is typical of FP. This is in stark contrast with OOP, where objects include both data and methods that mutate that data.

Separating data from logic results in simpler systems with less coupling that are, therefore, easier to understand and to maintain. It is also a logical choice when building distributed systems, where data structures need to be easy to serialize and pass between applications, while logic resides within those applications.
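As a rough illustration of why inert data objects travel well, the following sketch serializes a record to JSON and rebuilds it with System.Text.Json. The AccountSnapshot record is a hypothetical, flattened type I introduce just for this example; it is not the chapter's AccountState.

```csharp
using System;
using System.Text.Json;

var state = new AccountSnapshot("USD", "Active", 150.25m);

// The data object crosses the application boundary as plain JSON...
string json = JsonSerializer.Serialize(state);
Console.WriteLine(json);

// ...and the receiving application rebuilds an equal snapshot from it.
var received = JsonSerializer.Deserialize<AccountSnapshot>(json);
Console.WriteLine(received == state);   // True: records compare by value

// Hypothetical data object with no behavior attached
public record AccountSnapshot(string Currency, string Status, decimal Balance);
```

Because the record carries no logic, nothing is lost in transit; each application applies its own functions to the data it receives.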

Data-oriented programming (DOP)

Several of the ideas I’ve discussed in this chapter are relevant to DOP, a paradigm that advocates separating logic from data as a means to decrease the complexity of an application. FP and DOP are distinct, but there is some overlap. The principles of DOP are

  1. Separate logic from data entities.

  2. Use immutable data.

  3. Use generic structures to represent data entities.

FP also advocates using immutable data, and the use of immutable data and pure functions naturally leads to separating logic from data entities, as I demonstrated in this section.

As for the third principle, DOP advocates using generic structures to represent data; for example, instead of defining an AccountState type with a Currency property, you would use a dictionary, mapping the Currency key to the value of the account’s currency and similarly for other fields.ᵃ It turns out that you can represent data of any shape by using just lists, dictionaries, and primitives.

The main benefit of using generic structures to represent data is that you can handle data in a correspondingly general fashion; for example, given two snapshots of data of any shape, you can compare them and see what bits have changed. You can merge change sets and see if concurrent updates cause conflicts. That’s pretty powerful.
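The following sketch shows what such general handling might look like in C#. Data.Diff is a hypothetical helper I made up for illustration; it only handles flat dictionaries, and nested lists or dictionaries would need a recursive version.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

var before = new Dictionary<string, object>
{
   ["Currency"] = "USD",
   ["Status"] = "Requested",
   ["AllowedOverdraft"] = 0m,
};

// A second snapshot: a copy of the first with one field changed
var after = new Dictionary<string, object>(before)
{
   ["Status"] = "Active",
};

foreach (var (key, change) in Data.Diff(before, after))
   Console.WriteLine($"{key}: {change.Old} -> {change.New}");
// prints: Status: Requested -> Active

public static class Data
{
   // Compares two snapshots without knowing anything about their fields.
   // A key missing from one side shows up with null on that side.
   public static Dictionary<string, (object? Old, object? New)> Diff
      (IReadOnlyDictionary<string, object> before
      , IReadOnlyDictionary<string, object> after)
      => before.Keys.Union(after.Keys)
         .Select(key => (Key: key
            , Old: before.GetValueOrDefault(key)
            , New: after.GetValueOrDefault(key)))
         .Where(t => !Equals(t.Old, t.New))
         .ToDictionary(t => t.Key, t => (t.Old, t.New));
}
```

The same Diff function works for an account, a product, or any other entity represented this way, which is the generality the third principle is after.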

The obvious drawback is that you lose type safety, so it’s a bit of a hard sell for programmers who are used to working in statically typed languages like C#.

If you want to learn more about DOP, understand how separating logic and data simplifies life, and see why using generic structures to represent data entities can be worthwhile, see Data-Oriented Programming by Yehonathan Sharvit (Manning, 2021).

  

a If you wanted to follow this approach in C#, you would probably use the dynamic type for sugar-coating an underlying dictionary. This allows you to access field values with the dot notation.
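One way to try this out is with ExpandoObject from System.Dynamic, which is backed by a dictionary. This is only a sketch of the dot-notation idea: ExpandoObject is mutable, so it does not by itself satisfy the second DOP principle.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// ExpandoObject stores its members in a dictionary...
dynamic account = new ExpandoObject();
account.Currency = "USD";
account.Status = "Requested";

// ...so the same data is reachable both ways.
var fields = (IDictionary<string, object>)account;
Console.WriteLine(account.Currency);   // USD (dot notation, resolved at run time)
Console.WriteLine(fields["Status"]);   // Requested (dictionary access)
Console.WriteLine(fields.Count);       // 2
```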

Summary

  • FP discourages state mutation, which avoids several of its drawbacks, such as lack of thread safety, coupling, and impurity:

    • Things that don’t change are represented with immutable objects.
    • Things that change are also represented with immutable objects; these immutable snapshots represent an entity’s state at a given point. Change is represented by creating a new snapshot with the desired changes.
  • Use records to define custom immutable data types.

  • For a type to be immutable, all its children, including lists and other data structures, must also be immutable.

  • You can simplify your application and promote loose coupling by separating data from logic:

    • Use data objects (typically records) to encapsulate data.
    • Use functions (implemented as static methods within stateless static classes) to represent business logic.

1 The preceding examples refer to multithreading, but the same problems can arise if the source of concurrency is asynchrony or parallelism (these terms were described in the sidebar “Meaning and types of concurrency” in chapter 3).

2 The fundamental techniques I discuss in this section are ubiquitous in FP, but the concepts and metaphors I use to explain them are largely inspired by Rich Hickey, the creator of the Clojure programming language.

3 The creators of .NET took inspiration from Java, but in this case, they also learned from Java’s mistakes (Java had mutable dates until Java 8).

4 In reality, you can still mutate read-only variables by using reflection. But making a field read-only is a clear signal to any clients of your code that the field isn’t meant to be mutated.
