In this chapter
We’ve been talking about immutability, and we’ve even implemented it in some places. In this chapter, we’re going to dive deep. We’ll learn to make immutable versions of all of the common JavaScript array and object operations we’re used to using.
We’ve already implemented some shopping cart operations using a copy-on-write discipline. Remember: That means we made a copy, modified the copy, then returned the copy. But there are a number of shopping cart operations we haven’t done yet. Here is a list of the operations we need or might need for the shopping cart and for shopping cart items:
** already implemented
*** operation on nested data
Jenna is skeptical that all of these operations can be done immutably. The fifth operation is harder, because it is modifying an item inside of a shopping cart. We call that nested data. How can we implement immutability on nested data? Well, let’s find out.
Let’s look at each of our operations in a new way. Some of our operations are reads. We are getting some information out of the data without modifying it. These are easy cases because nothing is being modified. Those don’t need any other work. Reads that get information from their arguments are calculations.
The rest of our operations are writes. They modify the data in some way. These will require a discipline, because we don’t want to modify any of the values that may be in use somewhere else.
Reads
Writes
** reads
*** writes
Three of our shopping cart operations are writes. We will want to implement these using an immutable discipline. As we’ve seen before, the discipline we’ll use is called copy-on-write. It’s the same discipline used in languages like Haskell and Clojure. The difference is those languages implement the discipline for you.
Because we’re in JavaScript, mutable data is the default, so we the programmers have to apply the discipline ourselves explicitly in the code.
What about an operation that reads and writes?
Sometimes we may want to modify the data (a write) and get some information out of it (read) at the same time. If you’re curious about that, we will get to that in just a few pages. The short answer is “Yes, you can do that.” You’ll see the long answer on page 122.
** write
*** reads
The copy-on-write discipline is just three steps in your code. If you implement these steps, you are doing copy-on-write. And if you replace every part of your code that modifies the global shopping cart with a copy-on-write, the shopping cart will never change. It will be immutable.
There are three steps of the copy-on-write discipline to be performed whenever you want to modify something that is immutable:
Reads
Writes
Let’s look at the function add_element_last() from the last chapter, which implements copy-on-write:
function add_element_last(array, elem) { ❶
var new_array = array.slice(); ❷
new_array.push(elem); ❸
return new_array; ❹
}
❶ we want to modify array
❷ make a copy
❸ modify the copy
❹ return the copy
Why does this work? How does this avoid modifying the array?
Copy-on-write converts writes into reads.
So here’s a question. Is add_element_last() a read or a write?
It doesn’t modify the data, and now it returns information, so it must be a read! We’ve essentially converted a write into a read. We’ll talk more about that soon.
Let’s look at another operation that modifies the cart. This one removes an item from the cart based on the name.
function remove_item_by_name(cart, name) {
var idx = null;
for(var i = 0; i < cart.length; i++) {
if(cart[i].name === name)
idx = i;
}
if(idx !== null)
cart.splice(idx, 1); ❶
}
❶ cart.splice() modifies the cart array
Is an array the best data structure to represent a shopping cart? Probably not. But this is the system as MegaMart implemented it. For now, at least, we have to work with what we found.
What does cart.splice() do?
.splice() is a method on arrays that lets you remove items from an array.
cart.splice(idx, 1) ❶ ❷
❶ remove one item
❷ at index idx
.splice() can do other things with different combinations of arguments, but we don’t use those things here.
This function modifies the cart (via cart.splice()). If we pass in the global shopping_cart to remove_item_by_name(), it will modify the global shopping cart.
However, we don’t want to modify the shopping cart anymore. We want to treat our shopping cart as immutable. Let’s apply a copy-on-write discipline to the remove_item_by_name() function.
We have a function that modifies the shopping cart, and we want it to use a copy-on-write discipline. First, we’ll make a copy of the cart.
Rules of copy-on-write
function remove_item_by_name(cart, name) {
var idx = null;
for(var i = 0; i < cart.length; i++) {
if(cart[i].name === name)
idx = i;
}
if(idx !== null)
cart.splice(idx, 1);
}
function remove_item_by_name(cart, name) {
var new_cart = cart.slice(); ❶
var idx = null;
for(var i = 0; i < cart.length; i++) {
if(cart[i].name === name)
idx = i;
}
if(idx !== null)
cart.splice(idx, 1);
}
❶ make a copy of the cart and save it to a local var
We’re making a copy, but we’re not doing anything with it. On the next page, we will change all code that modifies the cart argument to modify the copy of the cart.
Rules of copy-on-write
We just made a copy; now we need to use it. Let’s replace all usages of the original cart with our new copy.
Rules of copy-on-write
function remove_item_by_name(cart, name) {
var new_cart = cart.slice();
var idx = null;
for(var i = 0; i < cart.length; i++) {
if(cart[i].name === name)
idx = i;
}
if(idx !== null)
cart.splice(idx, 1);
}
function remove_item_by_name(cart, name) {
var new_cart = cart.slice();
var idx = null;
for(var i = 0; i < new_cart.length; i++) {
if(new_cart[i].name === name)
idx = i;
}
if(idx !== null)
new_cart.splice(idx, 1);
}
Now we aren’t modifying the original at all. But the copy is stuck inside the function. Next, let’s let it out by returning it.
Rules of copy-on-write
On the last page, we had gotten rid of all the modifications to the cart array. Instead, we modified a copy. Now we can do the last step of the copy-on-write and return the copy.
Rules of copy-on-write
function remove_item_by_name(cart, name) {
var new_cart = cart.slice();
var idx = null;
for(var i = 0; i < new_cart.length; i++) {
if(new_cart[i].name === name)
idx = i;
}
if(idx !== null)
new_cart.splice(idx, 1);
}
function remove_item_by_name(cart, name) {
var new_cart = cart.slice();
var idx = null;
for(var i = 0; i < new_cart.length; i++) {
if(new_cart[i].name === name)
idx = i;
}
if(idx !== null)
new_cart.splice(idx, 1);
return new_cart; ❶
}
❶ return the copy
And now we have a fully working copy-on-write version of the function. The only thing left to do now is to change how we use it. Let’s do that next.
Rules of copy-on-write
On the last page, we had gotten rid of all the modifications to the cart array. Instead, we modified a copy. Now we can do the last step of the copy-on-write and return the copy.
function delete_handler(name) {
remove_item_by_name(shopping_cart, name); ❶
var total = calc_total(shopping_cart);
set_cart_total_dom(total);
update_shipping_icons(shopping_cart);
update_tax_dom(total);
}
❶ this function used to modify the global
function delete_handler(name) {
shopping_cart = ❶
remove_item_by_name(shopping_cart, name);
var total = calc_total(shopping_cart);
set_cart_total_dom(total);
update_shipping_icons(shopping_cart);
update_tax_dom(total);
}
❶ now we have to modify the global in the caller
We will need to go to each call site for remove_item_by_name() and assign its return value to the global shopping_cart. We won’t do that here. It’s pretty boring.
We’ve made a few changes over a few pages. Let’s see it all in one place:
var idx = null;
for(var i = 0; i < cart.length; i++) {
if(cart[i].name === name)
idx = i;
}
if(idx !== null)
cart.splice(idx, 1);
}
function delete_handler(name) {
remove_item_by_name(shopping_cart, name);
var total = calc_total(shopping_cart);
set_cart_total_dom(total);
update_shipping_icons(shopping_cart);
update_tax_dom(total);
}
function remove_item_by_name(cart, name) {
var new_cart = cart.slice();
var idx = null;
for(var i = 0; i < new_cart.length; i++) {
if(new_cart[i].name === name)
idx = i;
}
if(idx !== null)
new_cart.splice(idx, 1);
return new_cart;
}
function delete_handler(name) {
shopping_cart =
remove_item_by_name(shopping_cart, name);
var total = calc_total(shopping_cart);
set_cart_total_dom(total);
update_shipping_icons(shopping_cart);
update_tax_dom(total);
}
We are going to do very similar copy-on-write operations all over. We can generalize the ones we’ve written so that they are more reusable, just like we did with add_element_last().
Let’s start with array’s .splice() method. We use .splice() in remove_item_by_name().
function removeItems(array, idx, count) {
array.splice(idx, count);
}
function removeItems(array, idx, count) {
var copy = array.slice();
copy.splice(idx, count);
return copy;
}
Now we can use it in remove_item_by_name().
function remove_item_by_name(cart, name) {
var new_cart = cart.slice(); ❶
var idx = null;
for(var i = 0; i < new_cart.length; i++) {
if(new_cart[i].name === name)
idx = i;
}
if(idx !== null)
new_cart.removeItems(idx, 1);
return new_cart;
}
❶ removeItems() copies the array so we don’t have to
function remove_item_by_name(cart, name) {
var idx = null;
for(var i = 0; i < cart.length; i++) {
if(cart[i].name === name)
idx = i;
}
if(idx !== null)
return removeItems(cart, idx, 1);
return cart; ❶
}
❶ bonus! we don’t have to make a copy of the array if we don’t modify it
Because we will likely use these operations a lot, implementing them in a reusable way can save a lot of effort. We won’t need to copy the boilerplate of copying the array or object all over.
One of the basic collection types in JavaScript is the array. Arrays in JavaScript represent ordered collections of values. They are heterogeneous, meaning they can have values of different types in them at the same time. You can access elements by index. JavaScript’s arrays are different from what are called arrays in other languages. You can extend and shrink them—unlike arrays in Java or C.
This gets the element at idx. Indexes start at 0.
> var array = [1, 2, 3, 4];
> array[2]
3
The assignment operator will mutate an array.
> var array = [1, 2, 3, 4];
> array[2] = "abc"
"abc"
> array
[1, 2, "abc", 4]
This contains the number of elements in the array. It’s not a method, so don’t use parentheses.
> var array = [1, 2, 3, 4];
> array.length
4
This mutates the array by adding el to the end and returns the new length of the array.
> var array = [1, 2, 3, 4];
> array.push(10);
5
> array
[1, 2, 3, 4, 10]
This mutates the array by dropping the last element and returns the value that was dropped.
> var array = [1, 2, 3, 4];
> array.pop();
4
> array
[1, 2, 3]
This mutates the array by adding el to the array at the beginning and returns the new length.
> var array = [1, 2, 3, 4];
> array.unshift(10);
5
> array
[10, 1, 2, 3, 4]
This mutates the array by dropping the first element (index 0) and returns the value that was dropped.
> var array = [1, 2, 3, 4];
> array.shift()
1
> array
[2, 3, 4]
This creates and returns a shallow copy of the array.
> var array = [1, 2, 3, 4];
> array.slice()
[1, 2, 3, 4]
This mutates the array by removing num items starting at idx and returns the removed items.
> var array = [1, 2, 3, 4, 5, 6];
> array.splice(2, 3); // remove 3 elements
[3, 4, 5]
> array
[1, 2, 6]
Sometimes a function plays two roles at the same time: It modifies a value and it returns a value. The .shift() method is a good example. Let’s see an example:
var a = [1, 2, 3, 4];
var b = a.shift();
console.log(b); // prints 1 ❶
console.log(a); // prints [2, 3, 4] ❷
❶ returns a value
❷ var a was modified
.shift() returns the first element of the array at the same time as it modifies the array. It’s both a read and a write.
How can you convert this to a copy-on-write?
In copy-on-write, we are essentially converting a write to a read, which means we need to return a value. But .shift() is already a read, so it already has a return value. How can we make this work? There are two approaches.
We will see both. When you have the choice, you should prefer the first approach. It more cleanly separates the responsibilities. As we saw in chapter 5, design is about pulling things apart.
We’ll see the first approach first.
Two approaches
There are two steps to this technique. The first is to split the read from the write. The second is to convert the write to a copy-on-write operation. That is done in the same way as any write operation.
The read of the .shift() method is simply its return value. The return value of .shift() is the first element of the array. So we just write a calculation that returns the first element of the array. It’s a read, so it shouldn’t modify anything. Because it doesn’t have any hidden inputs or outputs, it’s a calculation.
function first_element(array) {
return array[0]; ❶
}
❶ just a function that returns the first element (or undefined if the list is empty). it’s a calculation
We don’t need to convert first_element(), because as a read, it won’t modify the array.
The write of the .shift() method doesn’t need to be written, but we should wrap up the behavior of .shift() in its own function. We’ll discard the return value of .shift() just to emphasize that we won’t use the result.
function drop_first(array) {
array.shift(); ❶
}
❶ perform the shift but drop the return value
We have successfully separated the read from the write. But the write (drop_first()) mutates its argument. We should convert it to copy-on-write.
function drop_first(array) {
array.shift();
}
function drop_first(array) {
var array_copy = array.slice();
array_copy.shift();
return array_copy; ❶
}
❶ textbook copy-on-write here
Splitting the read from the write is the preferred approach because it gives us all of the pieces we need. We can use them separately or together. Before, we were forced to use them together. Now we have the choice.
This second approach, like the first approach, also has two steps. The first step is to wrap up the .shift() method in a function we can modify. That function is going to be both a read and a write. The second step is to convert it to be just a read.
The first one is to wrap up the .shift() method in a function we control and can modify. But in this case, we don’t want to discard the return value.
function shift(array) {
return array.shift();
}
In this case, we need to convert the shift() function we’ve written to make a copy, modify the copy, and return both the first element and the modified copy. Let’s see how we can do that.
function shift(array) {
return array.shift();
}
function shift(array) {
var array_copy = array.slice();
var first = array_copy.shift();
return { ❶
first : first,
array : array_copy
};
}
❶ we use an object to return two separate values
Another option is to use the approach we took on the previous page and combine the two return values into an object:
function shift(array) {
return {
first : first_element(array),
array : drop_first(array)
};
}
Because both of those functions are calculations, we don’t need to worry about the combination; it will also be a calculation.
If we read from a mutable value, we could get a different answer each time we read it, so reading mutable data is an action.
Writes modify data, so they are what make the data mutable.
If we get rid of all of the writes by converting them to reads, the data won’t ever change after it is created. That means it’s immutable.
Once we do make the data immutable, all of those reads become calculations.
The more data structures we treat as immutable, the more code we have in calculations and the less we have in actions.
We now have the tools to go through all of our code and convert everything to use immutable data everywhere. We convert all of the writes to reads. But there is a big problem that we haven’t faced: If everything is immutable, how can your application keep track of changes over time? How can the user add an item to their cart if nothing ever changes?
Kim makes a good point. We’ve implemented immutability everywhere, but we need one place that is still mutable so we can keep track of changes over time. We do have that place. It’s the shopping_cart global variable.
We are assigning new values to the shopping_cart global variable. It always points to the current value of the cart. In fact, we could say we are swapping in new values of the shopping cart.
The shopping_cart global variable is always going to point to the current value, and whenever we need to modify it, we’ll use this swapping pattern. This is a very common and powerful pattern in functional programming. Swapping makes it really easy to implement an undo command. We will revisit swapping and make it more robust in part 2.
Let’s be very clear: In general, immutable data structures use more memory and are slower to operate on than their mutable equivalents.
That said, there are many high-performance systems written using immutable data, including high-frequency trading systems, where performance is vitally important. That’s pretty good empirical proof that immutable data structures are fast enough for common applications. However, here are some more arguments.
Every application has performance bottlenecks that are hard to predict while you’re developing. Common wisdom dictates that we avoid optimizing before we’re sure the part we’re optimizing will make a difference.
Functional programmers prefer immutable data by default. Only if they find that something is too slow will they optimize for performance with mutable data.
Most languages (but certainly not all) have had years of research and hard work, making the garbage collector very fast. Some garbage collectors have been optimized so much that freeing memory is only one or two machine instructions. We can lean on all of that hard work. However, do try it for youself in your language.
If you look at the copy-on-write code that we’ve written so far, none of it is copying that much. For instance, if we have 100 items in our shopping cart, we’re only copying an array of 100 references. We aren’t copying all of the items themselves. When you just copy the top level of a data structure, it’s called a shallow copy. When you do a shallow copy, the two copies share a lot of references to the same objects in memory. This is known as structural sharing.
We are writing our own immutable data routines on top of JavaScript’s built-in data structures, using very straightforward code. This is fine for our application. However, languages that support functional programming often have immutable data structures built-in. These data structures are much more efficient than what we are doing. For instance, Clojure’s built in data structures are very efficient and were even the source of inspiration for other languages’ implementations.
How are they more efficient? They share a lot more structure between copies, which means less memory is used and less pressure is put on the garbage collector. They’re still based on copy-on-write.
So far, we’ve only made copy-on-write operations for JavaScript arrays. We need a way to set the price on a shopping cart item, which is represented with an object. The steps are the same:
We can make a copy of an array with the .slice() method. But there’s no equivalent way to make a copy of an object in JavaScript. What JavaScript does have, however, is a way of copying all keys and values from one object to another. If we copy all of the keys and values into an empty object, we’ve effectively made a copy. This method is called Object.assign(). Here’s how you use it to make a copy:
var object = {a: 1, b: 2};
var object_copy = Object.assign({}, object); ❶
❶ how you copy an object in JavaScript
We’ll use this method for copying objects. Here’s how we can use it to implement set_price(), which sets the price of an item object:
function setPrice(item, new_price) {
item.price = new_price;
}
function setPrice(item, new_price) {
var item_copy = Object.assign({}, item);
item_copy.price = new_price;
return item_copy;
}
The basic idea is just the same as for arrays. You can apply this to any data structure at all. Just follow the three steps.
JavaScript’s Objects are very much like hash maps or associative arrays that you find in other languages. Objects in JavaScript are collections of key/value pairs, where the keys are unique. The keys are always strings, but the values can be any type. Here are the operations we will use in our examples:
This looks up the value corresponding to key. If the key doesn’t exist, you’ll get undefined.
> var object = {a: 1, b: 2};
> object["a"]
1
You can also use a dot notation to access the values. This is convenient if key fits into JavaScript’s tokenization syntax rules.
> var object = {a: 1, b: 2};
> object.a
1
You can assign a value to a key using either syntax, which mutates the object. It sets the value for key. If key exists, it replaces the value. If the key doesn’t exist, it adds to it.
> var object = {a: 1, b: 2};
> object["a"] = 7;
7
> object
{a: 7, b: 2}
> object.c = 10;
10
> object
{a: 7, b: 2, c: 10}
This method mutates the object by removing a key/value pair given the key. You can use either look-up syntax.
> var object = {a: 1, b: 2};
> delete object["a"];
true
> object
{b: 2}
This one is complicated. Object.assign() copies all key/values pairs from object b to object a (mutating it). We can use it to make a copy of b by copying all key/value pairs to an empty object.
> var object = {x: 1, y: 2};
> Object.assign({}, object);
{x: 1, y: 2}
If we want to iterate through the key/value pairs in an object, we can do it indirectly by asking the object for all of its keys using the function Object.keys(). That returns an array of the keys in an object, which we can then loop through.
> var object = {a: 1, b: 2};
> Object.keys(object)
["a", "b"]
We still have one write left to convert to a read on our shopping cart. The operation that updates the price of an item given its name is still a write. However, that operation is interesting because it is modifying a nested data structure. It is modifying the item object inside of the shopping cart array.
It’s usually easier to convert the write for the deeper operation first. We implemented setPrice() in the exercise on page 137. We can use setPrice(), which operates on items, inside of setPriceByName(), which operates on carts.
function setPriceByName(cart, name, price) {
for(var i = 0; i < cart.length; i++) {
if(cart[i].name === name)
cart[i].price =
price;
}
}
function setPriceByName(cart, name, price) {
var cartCopy = cart.slice(); ❶
for(var i = 0; i < cartCopy.length; i++) {
if(cartCopy[i].name === name)
cartCopy[i] =
setPrice(cartCopy[i], price); ❷
}
return cartCopy;
}
❶ typical copy-on-write pattern of copying and modifying the copy
❷ we call a copy-on-write operation to modify the nested item
Nested writes follow the same pattern as non-nested writes. We make a copy, modify the copy, then return the copy. The only difference with nested operations is that we do another copy-on-write operation to modify the nested one.
If we modified the item directly, like we were doing in the original code, then our data would not be immutable. The references in the cart array may not change, but the values they refer to do change. That’s unnacceptable. The entire nested data structure has to remain unchanged for it to be immutable.
This is a very important concept. Everything in the nested data structure, from the top to the bottom, must be immutable. When we update a nested piece of data, we need to copy the inner value and all of the values on the way up to the top. It’s so important that we’ll spend a couple of pages really understanding what is getting copied.
Let’s say we have three items in a shopping cart: a t-shirt, shoes, and socks. Let’s take an inventory of our arrays and objects so far. We have one Array (the shopping cart) and three objects (a t-shirt, shoes, and socks in cart).
We want to set the price of the t-shirt to $13. To do that, we use the nested operation setPriceByName(), like so:
shopping_cart = setPriceByName(shopping_cart, "t-shirt", 13);
Let’s step through the code and count what gets copied:
function setPriceByName(cart, name, price) {
var cartCopy = cart.slice(); ❶
for(var i = 0; i < cartCopy.length; i++) {
if(cartCopy[i].name === name)
cartCopy[i] = setPrice(cartCopy[i], price); ❷
}
return cartCopy;
}
function setPrice(item, new_price) {
var item_copy = Object.assign({}, item); ❸
item_copy.price = new_price;
return item_copy;
}
❶ copy array
❷ we call setPrice() only once, when the loop finds the t-shirt
❸ copy object
We started with one array and three objects. What got copied? Well, only one array (the shopping cart) and one object (the t-shirt). Two objects were not copied. What’s going on?
We are making shallow copies of nested data, which results in structural sharing. That’s a lot of vocabulary words in one sentence. Let’s visualize it on the next page.
We started out with a shopping cart (one array) with three items (three objects). That’s four pieces of data total. We want to set the price of the t-shirt to $13.
We then made a shallow copy of the shopping cart. At first, the copy pointed to the same objects in memory.
The loop eventually found the t-shirt and called setPrice() on it. That function created a shallow copy of the t-shirt Object and changed the price to 13.
setPrice() returned that copy, and setPriceByName() assigned it in the array in place of the original t-shirt.
Although we had four pieces of data at the start (one array and three objects), only two of them (one array and one object) had to be copied. The other objects weren’t modified so we didn’t copy them. The original array and the copy are both pointing to everything that hasn’t changed. That’s the structural sharing that we’ve talked about before. As long as we don’t modify those shared copies, it is totally safe. Making copies allows us to keep the original and the copy without worrying that it will change.
In this chapter we learned the ins and outs of the copy-on-write discipline. Although it’s the same discipline you find in languages like Clojure and Haskell, in JavaScript you have to do all the work yourself. That’s why it’s convenient to wrap it up with some utility functions that handle everything for you. If you stick with those wrapper functions, you’ll be fine. Sticking with it is why it’s called a discipline.
The copy-on-write discipline is nice. However, not all of our code will use the wrappers we wrote. Most of us have lots of existing code written without the copy-on-write discipline. We need a way of exchanging data with that code without it changing our data. In the next chapter, we will learn another discipline called defensive copying.
3.143.4.181