In this chapter
We’ve learned how to maintain immutability in our own code using copy-on-write. But we often have to interact with code that doesn’t use the copy-on-write discipline. There are libraries and existing code that we know treat data as mutable. How can we pass our immutable data to that code? In this chapter, we learn a practice for maintaining immutability when interacting with code that might change our data.
It’s time again for MegaMart’s monthly Black Friday sale (yes, they do one every month). The marketing department wants to promote old inventory to clear it out of the warehouse. The code they have to do that is old and has been added to over time. It works and is crucial for keeping the business profitable.
All of the code we’ve been managing for the shopping cart has treated the cart as immutable using a copy-on-write discipline. However, the Black Friday promotion code does not. It mutates the shopping cart quite a lot. It was written years ago, it works reliably, and there’s just no time to go back and rewrite it all. We need a way to safely interface with this existing code.
To trigger the Black Friday promotion, we’ll need to add one line of code, marked ❶ below, to add_item_to_cart().
function add_item_to_cart(name, price) {
  var item = make_cart_item(name, price);
  shopping_cart = add_item(shopping_cart, item);
  var total = calc_total(shopping_cart);
  set_cart_total_dom(total);
  update_shipping_icons(shopping_cart);
  update_tax_dom(total);
  black_friday_promotion(shopping_cart); ❶
}
❶ we need to add this line of code, but it will mutate the shopping cart
Calling this function will violate copy-on-write, and we can’t modify black_friday_promotion(). Luckily, there is another discipline that will let us call the function safely without violating copy-on-write. The discipline is called defensive copying. We use it to exchange data with code that mutates data.
The marketing team’s Black Friday sale code is untrusted. We don’t trust it because it doesn’t implement the copy-on-write immutability discipline that our code follows.
Our code forms a safe zone where we trust all of the functions to maintain immutability. We can mentally relax while we’re using code inside that circle.
The Black Friday code is outside of that safe zone, but our code still needs to run it. And in order to run it, we need to exchange data with it through its inputs and outputs.
Just to be extra clear: Any data that leaves the safe zone is potentially mutable. It could be modified by the untrusted code. Likewise, any data that enters the safe zone from untrusted code is potentially mutable. The untrusted code could keep references to it and modify it after sending it over. The challenge is to exchange data without breaking our immutability.
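To make the risk concrete, here is a minimal sketch of what can happen when data from the safe zone is handed straight to mutating code. The function untrusted_discount() and the cart layout (an array of items with name and price) are hypothetical stand-ins for illustration, not MegaMart’s real code.

```javascript
// Hypothetical stand-in for untrusted code: it mutates its argument
// and also keeps a reference so it can mutate again later.
var kept_by_untrusted = null;

function untrusted_discount(cart) {
  for (var i = 0; i < cart.length; i++) {
    cart[i].price = cart[i].price / 2; // mutates items in place
  }
  kept_by_untrusted = cart;            // holds on to our data
}

// Inside the safe zone we assume this cart never changes.
var safe_cart = [{ name: "shoes", price: 10 }];
var price_before = safe_cart[0].price;

untrusted_discount(safe_cart);         // data leaves the safe zone
var price_after = safe_cart[0].price;  // our "immutable" value has changed

// Later, the untrusted code uses the reference it kept.
kept_by_untrusted.push({ name: "surprise", price: 0 });
var length_later = safe_cart.length;   // the very same array has grown
```

Both problems appear here: the data was modified during the call, and it was modified again afterward through a reference we never knew about.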
We’ve seen the copy-on-write pattern, but it won’t quite help us here. In the copy-on-write pattern, we copy data before modifying it. We know exactly what modifications will happen, so we can reason about what needs to be copied. In this case, on the other hand, the Black Friday routine is so big and hairy that we don’t know exactly what will happen. We need a discipline with more protective power that will completely shield our data from modification. That discipline is called defensive copying. Let’s see how it works.
The solution to the problem of exchanging data with untrusted code is to make copies—two, in fact. Here’s how it works.
[Figure: data crossing the boundary of the safe zone. O is for original; C is for copy.]
As data enters the safe zone from the untrusted code, we can’t trust that the data is immutable. We immediately make a deep copy and throw away the mutable original. Since only trusted code has a reference to that copy, it’s immutable. That protects you as data enters.
You still need protection when data leaves. As we’ve said before, any data that leaves the safe zone should be considered mutable because the untrusted code can modify it. The solution is to make a deep copy and send the copy to the untrusted code. That protects you as data leaves.
That’s defensive copying in a nutshell. You make copies as data enters; you make copies as data leaves. The goal is to keep your immutable originals inside the safe zone and to not let any mutable data inside the safe zone. Let’s apply this discipline to Black Friday.
We need to call a function that mutates its argument, but we don’t want to break our hard-won immutable discipline. We can use defensive copies to protect data and maintain immutability. It’s called defensive because you are defending your original from modifications.
black_friday_promotion() modifies its argument, the shopping cart. We can deep copy the shopping cart and pass the copy to the function. That way, it won’t modify the original.
Before
function add_item_to_cart(name, price) {
  var item = make_cart_item(name, price);
  shopping_cart = add_item(shopping_cart, item);
  var total = calc_total(shopping_cart);
  set_cart_total_dom(total);
  update_shipping_icons(shopping_cart);
  update_tax_dom(total);
  black_friday_promotion(shopping_cart);
}
After
function add_item_to_cart(name, price) {
  var item = make_cart_item(name, price);
  shopping_cart = add_item(shopping_cart, item);
  var total = calc_total(shopping_cart);
  set_cart_total_dom(total);
  update_shipping_icons(shopping_cart);
  update_tax_dom(total);
  var cart_copy = deepCopy(shopping_cart); ❶
  black_friday_promotion(cart_copy);
}
❶ copy data as it leaves
That’s great, except we need the output from black_friday_promotion(). Its output is the set of modifications it makes to the shopping cart. Luckily, it made those modifications to cart_copy, which we still hold. But can we use cart_copy safely? Is it immutable? What if black_friday_promotion() keeps a reference to that shopping cart and modifies it later? These are the kinds of bugs you find weeks, months, or years later. The solution is to make another defensive copy as the data enters our code.
Before
function add_item_to_cart(name, price) {
  var item = make_cart_item(name, price);
  shopping_cart = add_item(shopping_cart, item);
  var total = calc_total(shopping_cart);
  set_cart_total_dom(total);
  update_shipping_icons(shopping_cart);
  update_tax_dom(total);
  var cart_copy = deepCopy(shopping_cart);
  black_friday_promotion(cart_copy);
}
After
function add_item_to_cart(name, price) {
  var item = make_cart_item(name, price);
  shopping_cart = add_item(shopping_cart, item);
  var total = calc_total(shopping_cart);
  set_cart_total_dom(total);
  update_shipping_icons(shopping_cart);
  update_tax_dom(total);
  var cart_copy = deepCopy(shopping_cart);
  black_friday_promotion(cart_copy);
  shopping_cart = deepCopy(cart_copy); ❶
}
❶ copy data as it enters
And that’s the defensive copy pattern. As we’ve seen, you protect yourself by making copies. You copy data as it leaves your system, and you copy it as it comes back in.
The copies we make need to be deep copies. We’ll see how to implement that in just a moment.
Defensive copying is a discipline that maintains immutability when you have to exchange data with code that does not maintain immutability. We’ll call that code untrusted code. Here are the two rules:
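Rule 1: Copy as data leaves your code. If you have immutable data that will leave your code and enter code that you don’t trust, follow these steps to protect your original:
1. Make a deep copy of the immutable data.
2. Pass the copy to the untrusted code.
Rule 2: Copy as data enters your code. If you are receiving data from untrusted code, that data might not be immutable. Follow these steps:
1. Immediately make a deep copy of the mutable data passed to your code.
2. Use the copy in your code.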
If you follow these two rules, you can interact with any code you don’t trust without breaking your immutable discipline.
Note that these rules could be applied in any order. Sometimes you send data out, and then data comes back. That’s what happens when your code calls a function from an untrusted library.
On the other hand, sometimes you receive data before you send data out. That happens when untrusted code calls a function in your code, like if your code is part of a shared library. Just keep in mind that the two rules can be applied in either order.
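The second ordering, where untrusted code calls into ours, might look something like the sketch below. It is an illustration only: add_item_for_caller() is a hypothetical entry point, add_item() is written here as a simple copy-on-write function, and the JSON-based deepCopy() is an assumption that only holds for JSON-legal data.

```javascript
// Assumption: carts contain only JSON-legal data, so a JSON round trip
// is an adequate deep copy for this sketch.
function deepCopy(thing) {
  return JSON.parse(JSON.stringify(thing));
}

// Copy-on-write inside the safe zone: shallow copy, then modify the copy.
function add_item(cart, item) {
  var new_cart = cart.slice();
  new_cart.push(item);
  return new_cart;
}

// Hypothetical entry point that untrusted code calls.
function add_item_for_caller(untrusted_cart, untrusted_item) {
  // Data is entering the safe zone: copy it immediately.
  var cart = deepCopy(untrusted_cart);
  var item = deepCopy(untrusted_item);

  // Trusted code works only with the copies.
  var result = add_item(cart, item);

  // Data is leaving the safe zone: hand back a copy.
  return deepCopy(result);
}

// The caller's side of the exchange.
var callers_cart = [{ name: "shoes", price: 10 }];
var returned = add_item_for_caller(callers_cart, { name: "socks", price: 2 });

// Nothing is shared between what the caller sent and what it got back.
var shares_structure = returned[0] === callers_cart[0];
```

Here the data is copied on the way in first and on the way out second, the reverse of the Black Friday example.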
Also note that sometimes there is no input or output to copy.
We are going to implement defensive copying a few more times. But before we move on to another implementation, let’s keep working on the code we just saw for the Black Friday promotion. We can improve it by wrapping it up.
We have successfully implemented defensive copying, but the code is a bit unclear with all of the copying going on. Plus, we’re going to have to call black_friday_promotion() many times in the future. We don’t want to risk getting the defensive copying wrong. Let’s wrap up the function in a new function that includes the defensive copying inside it.
Before
function add_item_to_cart(name, price) {
  var item = make_cart_item(name, price);
  shopping_cart = add_item(shopping_cart, item);
  var total = calc_total(shopping_cart);
  set_cart_total_dom(total);
  update_shipping_icons(shopping_cart);
  update_tax_dom(total);
  var cart_copy = deepCopy(shopping_cart);
  black_friday_promotion(cart_copy); ❶
  shopping_cart = deepCopy(cart_copy);
}
After
function add_item_to_cart(name, price) {
  var item = make_cart_item(name, price);
  shopping_cart = add_item(shopping_cart, item);
  var total = calc_total(shopping_cart);
  set_cart_total_dom(total);
  update_shipping_icons(shopping_cart);
  update_tax_dom(total);
  shopping_cart = black_friday_promotion_safe(shopping_cart);
}
function black_friday_promotion_safe(cart) {
  var cart_copy = deepCopy(cart);
  black_friday_promotion(cart_copy);
  return deepCopy(cart_copy);
} ❶
❶ extract this code into a new function
Now we can call black_friday_promotion_safe() without worry. It protects our data from modification. And now it’s much more convenient and clear to see what’s going on.
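We can convince ourselves that the wrapper works with a small experiment. In this sketch, black_friday_promotion() is a made-up stand-in (the real one is legacy code we can’t see), written to do the two things we fear most: mutate its argument and keep a reference for later. The compact deepCopy() is also an assumption that handles only arrays and plain objects.

```javascript
// Teaching-only deep copy for arrays, plain objects, and primitives.
function deepCopy(thing) {
  if (Array.isArray(thing)) return thing.map(function (x) { return deepCopy(x); });
  if (thing !== null && typeof thing === "object") {
    var copy = {};
    Object.keys(thing).forEach(function (key) {
      copy[key] = deepCopy(thing[key]);
    });
    return copy;
  }
  return thing;
}

// Hypothetical stand-in for the legacy code: mutates the cart in place
// and keeps a reference to it.
var legacy_reference = null;
function black_friday_promotion(cart) {
  cart.push({ name: "old stock", price: 1 });
  legacy_reference = cart;
}

// The wrapper from the text: copy on the way out, copy on the way in.
function black_friday_promotion_safe(cart) {
  var cart_copy = deepCopy(cart);
  black_friday_promotion(cart_copy);
  return deepCopy(cart_copy);
}

var cart_before = [{ name: "shoes", price: 10 }];
var cart_after = black_friday_promotion_safe(cart_before);

// Simulate the legacy code emptying its kept reference later on.
legacy_reference.length = 0;
```

After this runs, cart_before still holds one item and cart_after still holds two, even though the legacy code emptied the array it kept.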
Let’s look at another example.
Defensive copying is a common pattern that you might find outside of the traditional places. You may have to squint to see it, though.
Most web-based APIs are doing implicit defensive copying. Here’s a scenario of how that might look.
A web request comes into your API as JSON. The JSON is a deep copy of data from the client that is serialized over the internet. Your service does its work, then sends the response back as a serialized deep copy, also in JSON. It’s copying data on the way in and on the way back.
It’s doing defensive copying. One of the benefits of a service-oriented or microservices system is that the services are doing defensive copying when they talk to each other. Services with different coding practices and disciplines can communicate without problems.
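The JSON round trip can be sketched in a few lines. There is no real network here; the variable names are made up for illustration, and a string stands in for the bytes that would travel between client and server.

```javascript
// What the "client" holds in memory.
var client_cart = { items: [{ name: "shoes", price: 10 }], coupon: null };

// Serializing produces a string; no object references can cross the wire.
var wire_format = JSON.stringify(client_cart);

// The "server" parses the string into brand-new objects.
var server_cart = JSON.parse(wire_format);

// Mutating the server's copy cannot affect the client's data.
server_cart.items[0].price = 0;
var client_price = client_cart.items[0].price;
```

Keep in mind that this only behaves like a deep copy for JSON-legal data. Properties holding undefined or functions are dropped during serialization, and Date objects come back as strings.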
Erlang and Elixir (two functional programming languages) implement defensive copying as well. Whenever two processes in Erlang send messages to each other, the message (data) is copied into the mailbox of the receiver. Data is copied on the way into a process and on the way out. The defensive copying is key to the high reliability of Erlang systems.
For more information on Erlang and Elixir, see https://www.erlang.org and https://elixir-lang.org.
We can tap into the same benefits in our own modules that microservices and Erlang systems enjoy.
Copy-on-write
When to use it: Use copy-on-write when you need to modify data you control.
Where to use it: You should use copy-on-write everywhere inside the safe zone. In fact, copy-on-write defines your immutability safe zone.
Type of copy: Shallow copy, relatively cheap
Defensive copying
When to use it: Use defensive copying when exchanging data with untrusted code.
Where to use it: Use defensive copying at the borders of your safe zone for data that has to cross in or out.
Type of copy: Deep copy, relatively expensive
The difference between a deep copy and a shallow copy is that a deep copy shares no structure with the original. Every nested object and array is copied. In a shallow copy, we can share a lot of the structure—anything that doesn’t change can be shared.
In a deep copy, we make copies of everything. We use a deep copy because we don’t trust that any of it will be treated as immutable by the untrusted code.
Deep copies are obviously more expensive. That’s why we don’t do them everywhere. We only do them where we can’t guarantee that copy-on-write will be followed.
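Structural sharing is easy to see with identity checks. The sketch below assumes a cart that is an array of plain item objects and uses a compact, teaching-only deepCopy(); both are illustrative choices.

```javascript
// Teaching-only deep copy for arrays, plain objects, and primitives.
function deepCopy(thing) {
  if (Array.isArray(thing)) return thing.map(function (x) { return deepCopy(x); });
  if (thing !== null && typeof thing === "object") {
    var copy = {};
    Object.keys(thing).forEach(function (key) {
      copy[key] = deepCopy(thing[key]);
    });
    return copy;
  }
  return thing;
}

var original = [{ name: "shoes", price: 10 }, { name: "socks", price: 2 }];

// Shallow copy: a new outer array that points at the same item objects.
var shallow = original.slice();

// Deep copy: a new outer array and new item objects.
var deep = deepCopy(original);

var shallow_shares_items = shallow[0] === original[0]; // true
var deep_shares_items = deep[0] === original[0];       // false
```

The shallow copy is safe only as long as everyone who can reach those shared item objects follows copy-on-write. The deep copy makes no such assumption, which is why it costs more.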
Deep copy is a simple idea that should have a simple implementation. However, in JavaScript it is quite hard to get right because there isn’t a good standard library. Implementing a robust one is beyond the scope of this book.
I recommend using the implementation from the Lodash library (see lodash.com). Specifically, the function _.cloneDeep() (see lodash.com/docs/#cloneDeep) does a deep copy of nested data structures. The library is trusted by thousands if not millions of JavaScript developers.
However, just for completeness, here is a simple implementation that may satisfy your curiosity. It should work for all JSON-legal types and functions.
function deepCopy(thing) {
  if (Array.isArray(thing)) {
    var copy = [];
    for (var i = 0; i < thing.length; i++)
      copy.push(deepCopy(thing[i])); ❶
    return copy;
  } else if (thing === null) {
    return null;
  } else if (typeof thing === "object") {
    var copy = {};
    var keys = Object.keys(thing);
    for (var i = 0; i < keys.length; i++) {
      var key = keys[i];
      copy[key] = deepCopy(thing[key]); ❶
    }
    return copy;
  } else {
    return thing; ❷
  }
}
❶ recursively make copies of all of the elements
❷ strings, numbers, booleans, and functions are immutable so they don’t need to be copied
This function will not hold up to the quirks of JavaScript. There are many more types out there that this will fail on. However, as an outline of what needs to be done, it does a decent job. It shows that arrays and objects need to be copied, but also that the function will recurse into all of the elements of those collections.
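Here is one example of a quirk that trips up the simple implementation. A Date is an object with no enumerable keys of its own, so the object branch copies it into an empty plain object. The sketch repeats the deepCopy() function from above so that it runs on its own; the item shape is made up for illustration.

```javascript
// The teaching implementation from the text, repeated so this runs alone.
function deepCopy(thing) {
  if (Array.isArray(thing)) {
    var copy = [];
    for (var i = 0; i < thing.length; i++)
      copy.push(deepCopy(thing[i]));
    return copy;
  } else if (thing === null) {
    return null;
  } else if (typeof thing === "object") {
    var copy = {};
    var keys = Object.keys(thing);
    for (var i = 0; i < keys.length; i++) {
      var key = keys[i];
      copy[key] = deepCopy(thing[key]);
    }
    return copy;
  } else {
    return thing;
  }
}

var original = { name: "shoes", added: new Date(0) };
var copy = deepCopy(original);

var original_is_date = original.added instanceof Date; // true
var copy_is_date = copy.added instanceof Date;         // false: it is now {}
```

The name survives the copy, but the date is silently lost. Other built-in types such as Map and Set are handled just as poorly by this implementation.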
I highly recommend using a robust deep copy implementation from a widely used JavaScript library like Lodash. The deep copy function shown here is just for teaching purposes and should not be used in production.
Copy-on-write: Obviously, I’m more important. I help people keep their data immutable.
Defensive copying: That doesn’t make you more important. I help keep data immutable, too.
Copy-on-write: Well, my shallow copies are way more efficient than your deep copies.
Defensive copying: Well, you only have to worry about that because you need to make a copy EVERY SINGLE TIME data is modified. I only need to make copies when data enters or leaves the safe zone.
Copy-on-write: Exactly my point! There wouldn’t even be a safe zone without me.
Defensive copying: Well, I suppose you’re right about that. But your safe zone wouldn’t be any use at all if it couldn’t pass data to the outside. That’s where all the existing code and libraries are.
Copy-on-write: Well, I really think they should be using me in those legacy codebases and libraries, too. They could learn a lot from a discipline like me. Convert their writes to reads, and the reads naturally become calculations.
Defensive copying: Listen, that is never going to happen. Just accept it. There’s too much code out there. There aren’t enough programmers in the whole world to ever rewrite it all.
Copy-on-write: You’re right! (sobbing) I should face facts. I’m worthless without you!
Defensive copying: Oh, now I’m getting all emotional, too. (tears running down face) I can’t live without you, either!
Both: (hugs)
In this chapter we learned a more powerful yet more expensive discipline for maintaining immutability called defensive copying. It’s more powerful because it can implement immutability all by itself. It’s more expensive because it needs to copy more data. However, when you use defensive copying in tandem with copy-on-write, you can get all of the benefits of both—power when you need it, but shallow copies for efficiency.
In the next chapter, we will pull together everything we’ve learned so far and discover a way to organize our code to improve the design of our system.