6 Staying immutable in a mutable language

Image

In this chapter

  • Apply a copy-on-write discipline to ensure that data is not changed.
  • Develop copy-on-write operations for arrays and objects.
  • Make copy-on-write work well for deeply nested data.

We’ve been talking about immutability, and we’ve even implemented it in some places. In this chapter, we’re going to dive deep. We’ll learn to make immutable versions of all of the common JavaScript array and object operations we’re used to using.

Can immutability be applied everywhere?

We’ve already implemented some shopping cart operations using a copy-on-write discipline. Remember: That means we made a copy, modified the copy, then returned the copy. But there are a number of shopping cart operations we haven’t done yet. Here is a list of the operations we need or might need for the shopping cart and for shopping cart items:

Image Vocab time

We say data is nested when there are data structures within data structures, like an array full of objects. The objects are nested in the array. Think of nesting as in Russian nesting dolls—dolls within dolls within dolls.

We say data is deeply nested when the nesting goes on for a while. It’s a relative term, but an example might be objects within objects within arrays within objects within objects… The nesting can go on a long time.

Shopping cart operations

  1. Get number of items
  2. Get item by name
  3. Add an item**

    ** already implemented

  4. Remove an item by name
  5. Update quantity of an item by name***

    *** operation on nested data

Item operations

  1. Set the price
  2. Get price
  3. Get name
Image

Jenna is skeptical that all of these operations can be done immutably. The fifth operation is harder, because it is modifying an item inside of a shopping cart. We call that nested data. How can we implement immutability on nested data? Well, let’s find out.

Categorizing operations into reads, writes, or both

We can categorize an operation as either a read or a write

Let’s look at each of our operations in a new way. Some of our operations are reads. We are getting some information out of the data without modifying it. These are easy cases because nothing is being modified. Those don’t need any other work. Reads that get information from their arguments are calculations.

The rest of our operations are writes. They modify the data in some way. These will require a discipline, because we don’t want to modify any of the values that may be in use somewhere else.

Reads

  • Get information out of data
  • Do not modify the data

Writes

  • Modify the data

Shopping cart operations

  1. Get number of items.**
  2. Get item by name.**

    ** reads

  3. Add an item.***
  4. Remove an item by name.***
  5. Update quantity by name.***

    *** writes

Image Language safari

Immutable data is a common feature in functional programming languages, though not all of them have it by default. Here are some functional languages that are immutable by default:

  • Haskell
  • Clojure
  • Elm
  • Purescript
  • Erlang
  • Elixir

Others have immutable data in addition to mutable data by default. And some trust the programmer to apply the discipline where they choose.

Three of our shopping cart operations are writes. We will want to implement these using an immutable discipline. As we’ve seen before, the discipline we’ll use is called copy-on-write. It’s the same discipline used in languages like Haskell and Clojure. The difference is those languages implement the discipline for you.

Because we’re in JavaScript, mutable data is the default, so we the programmers have to apply the discipline ourselves explicitly in the code.

What about an operation that reads and writes?

Sometimes we may want to modify the data (a write) and get some information out of it (read) at the same time. If you’re curious about that, we will get to that in just a few pages. The short answer is “Yes, you can do that.” You’ll see the long answer on page 122.

Item operations

  1. Set the price.**

    ** write

  2. Get the price.***
  3. Get the name.***

    *** reads

The three steps of the copy-on-write discipline

The copy-on-write discipline is just three steps in your code. If you implement these steps, you are doing copy-on-write. And if you replace every part of your code that modifies the global shopping cart with a copy-on-write, the shopping cart will never change. It will be immutable.

There are three steps of the copy-on-write discipline to be performed whenever you want to modify something that is immutable:

Reads

  • Get information out of data
  • Do not modify the data

Writes

  • Modify the data
  1. Make a copy.
  2. Modify the copy (as much as you want!).
  3. Return the copy.

Let’s look at the function add_element_last() from the last chapter, which implements copy-on-write:

function add_element_last(array, elem) {

var new_array = array.slice();

new_array.push(elem);

return new_array;

}

we want to modify array

make a copy

modify the copy

return the copy

Why does this work? How does this avoid modifying the array?

  1. We make a copy of the array, but never modify the original.
  2. The copy is within the local scope of the function. That means no other code has access to it while we modify it.
  3. After we’re done modifying it, we let it leave the scope (we return it). Nothing will modify it after that.

Copy-on-write converts writes into reads.

So here’s a question. Is add_element_last() a read or a write?

It doesn’t modify the data, and now it returns information, so it must be a read! We’ve essentially converted a write into a read. We’ll talk more about that soon.

Converting a write to a read with copy-on-write

Let’s look at another operation that modifies the cart. This one removes an item from the cart based on the name.

function remove_item_by_name(cart, name) {

var idx = null;

for(var i = 0; i < cart.length; i++) {

if(cart[i].name === name)

idx = i;

}

if(idx !== null)

cart.splice(idx, 1);

}

cart.splice() modifies the cart array

Is an array the best data structure to represent a shopping cart? Probably not. But this is the system as MegaMart implemented it. For now, at least, we have to work with what we found.

What does cart.splice() do?

.splice() is a method on arrays that lets you remove items from an array.

cart.splice(idx, 1)

remove one item

at index idx

.splice() can do other things with different combinations of arguments, but we don’t use those things here.

This function modifies the cart (via cart.splice()). If we pass in the global shopping_cart to remove_item_by_name(), it will modify the global shopping cart.

However, we don’t want to modify the shopping cart anymore. We want to treat our shopping cart as immutable. Let’s apply a copy-on-write discipline to the remove_item_by_name() function.

We have a function that modifies the shopping cart, and we want it to use a copy-on-write discipline. First, we’ll make a copy of the cart.

Rules of copy-on-write

  1. Make a copy.
  2. Modify the copy.
  3. Return the copy.

Current

function remove_item_by_name(cart, name) {

 

var idx = null;

for(var i = 0; i < cart.length; i++) {

if(cart[i].name === name)

idx = i;

}

if(idx !== null)

cart.splice(idx, 1);

}

With copy of argument

function remove_item_by_name(cart, name) {

var new_cart = cart.slice();

var idx = null;

for(var i = 0; i < cart.length; i++) {

if(cart[i].name === name)

idx = i;

}

if(idx !== null)

cart.splice(idx, 1);

}

make a copy of the cart and save it to a local var

We’re making a copy, but we’re not doing anything with it. On the next page, we will change all code that modifies the cart argument to modify the copy of the cart.

Rules of copy-on-write

  1. ✓ 1. Make a copy.
  2. 2. Modify the copy.
  3. 3. Return the copy.

We just made a copy; now we need to use it. Let’s replace all usages of the original cart with our new copy.

Rules of copy-on-write

  1. ✓ 1. Make a copy.
  2. 2. Modify the copy.
  3. 3. Return the copy.

Current

function remove_item_by_name(cart, name) {

var new_cart = cart.slice();

var idx = null;

for(var i = 0; i < cart.length; i++) {

if(cart[i].name === name)

idx = i;

}

if(idx !== null)

cart.splice(idx, 1);

}

Modifying copy

function remove_item_by_name(cart, name) {

var new_cart = cart.slice();

var idx = null;

for(var i = 0; i < new_cart.length; i++) {

if(new_cart[i].name === name)

idx = i;

}

if(idx !== null)

new_cart.splice(idx, 1);

}

Now we aren’t modifying the original at all. But the copy is stuck inside the function. Next, let’s let it out by returning it.

Rules of copy-on-write

  1. ✓ 1. Make a copy.
  2. ✓ 2. Modify the copy.
  3. 3. Return the copy.

On the last page, we had gotten rid of all the modifications to the cart array. Instead, we modified a copy. Now we can do the last step of the copy-on-write and return the copy.

Rules of copy-on-write

  1. ✓ 1. Make a copy.
  2. ✓ 2. Modify the copy.
  3. 3. Return the copy.

Current

function remove_item_by_name(cart, name) {

var new_cart = cart.slice();

var idx = null;

for(var i = 0; i < new_cart.length; i++) {

if(new_cart[i].name === name)

idx = i;

}

if(idx !== null)

new_cart.splice(idx, 1);

 

}

Returning the copy

function remove_item_by_name(cart, name) {

var new_cart = cart.slice();

var idx = null;

for(var i = 0; i < new_cart.length; i++) {

if(new_cart[i].name === name)

idx = i;

}

if(idx !== null)

new_cart.splice(idx, 1);

return new_cart;

}

return the copy

And now we have a fully working copy-on-write version of the function. The only thing left to do now is to change how we use it. Let’s do that next.

Rules of copy-on-write

  1. ✓ 1. Make a copy.
  2. ✓ 2. Modify the copy.
  3. ✓ 3. Return the copy.

On the last page, we had gotten rid of all the modifications to the cart array. Instead, we modified a copy. Now we can do the last step of the copy-on-write and return the copy.

Current

function delete_handler(name) {

 

remove_item_by_name(shopping_cart, name);

var total = calc_total(shopping_cart);

set_cart_total_dom(total);

update_shipping_icons(shopping_cart);

update_tax_dom(total);

}

this function used to modify the global

With copy-on-write

function delete_handler(name) {

shopping_cart =

remove_item_by_name(shopping_cart, name);

var total = calc_total(shopping_cart);

set_cart_total_dom(total);

update_shipping_icons(shopping_cart);

update_tax_dom(total);

}

now we have to modify the global in the caller

We will need to go to each call site for remove_item_by_name() and assign its return value to the global shopping_cart. We won’t do that here. It’s pretty boring.

Complete diff from mutating to copy-on-write

We’ve made a few changes over a few pages. Let’s see it all in one place:

Original mutating version

 

 

var idx = null;

for(var i = 0; i < cart.length; i++) {

if(cart[i].name === name)

idx = i;

}

if(idx !== null)

cart.splice(idx, 1);

}

 

function delete_handler(name) {

 

remove_item_by_name(shopping_cart, name);

var total = calc_total(shopping_cart);

set_cart_total_dom(total);

update_shipping_icons(shopping_cart);

update_tax_dom(total);

}

Copy-on-write version

function remove_item_by_name(cart, name) {

var new_cart = cart.slice();

var idx = null;

for(var i = 0; i < new_cart.length; i++) {

if(new_cart[i].name === name)

idx = i;

}

if(idx !== null)

new_cart.splice(idx, 1);

return new_cart;

}

 

function delete_handler(name) {

shopping_cart =

remove_item_by_name(shopping_cart, name);

var total = calc_total(shopping_cart);

set_cart_total_dom(total);

update_shipping_icons(shopping_cart);

update_tax_dom(total);

}

These copy-on-write operations are generalizable

We are going to do very similar copy-on-write operations all over. We can generalize the ones we’ve written so that they are more reusable, just like we did with add_element_last().

Let’s start with array’s .splice() method. We use .splice() in remove_item_by_name().

Original

function removeItems(array, idx, count) {

 

array.splice(idx, count);

 

}

Copy-on-write

function removeItems(array, idx, count) {

var copy = array.slice();

copy.splice(idx, count);

return copy;

}

Now we can use it in remove_item_by_name().

Previous copy-on-write

function remove_item_by_name(cart, name) {

var new_cart = cart.slice();

var idx = null;

for(var i = 0; i < new_cart.length; i++) {

if(new_cart[i].name === name)

idx = i;

}

if(idx !== null)

new_cart.removeItems(idx, 1);

return new_cart;

}

removeItems() copies the array so we don’t have to

Copy-on-write using splice()

function remove_item_by_name(cart, name) {

 

var idx = null;

for(var i = 0; i < cart.length; i++) {

if(cart[i].name === name)

idx = i;

}

if(idx !== null)

return removeItems(cart, idx, 1);

return cart;

}

bonus! we don’t have to make a copy of the array if we don’t modify it

Because we will likely use these operations a lot, implementing them in a reusable way can save a lot of effort. We won’t need to copy the boilerplate of copying the array or object all over.

JavaScript arrays at a glance

One of the basic collection types in JavaScript is the array. Arrays in JavaScript represent ordered collections of values. They are heterogeneous, meaning they can have values of different types in them at the same time. You can access elements by index. JavaScript’s arrays are different from what are called arrays in other languages. You can extend and shrink them—unlike arrays in Java or C.

Lookup by index [idx]

This gets the element at idx. Indexes start at 0.

> var array = [1, 2, 3, 4];

> array[2]

3

Set an element [] =

The assignment operator will mutate an array.

> var array = [1, 2, 3, 4];

> array[2] = "abc"

"abc"

> array

[1, 2, "abc", 4]

Length .length

This contains the number of elements in the array. It’s not a method, so don’t use parentheses.

> var array = [1, 2, 3, 4];

> array.length

4

Add to the end .push(el)

This mutates the array by adding el to the end and returns the new length of the array.

> var array = [1, 2, 3, 4];

> array.push(10);

5

> array

[1, 2, 3, 4, 10]

Remove from the end .pop()

This mutates the array by dropping the last element and returns the value that was dropped.

> var array = [1, 2, 3, 4];

> array.pop();

4

> array

[1, 2, 3]

Add to the front .unshift(el)

This mutates the array by adding el to the array at the beginning and returns the new length.

> var array = [1, 2, 3, 4];

> array.unshift(10);

5

> array

[10, 1, 2, 3, 4]

Remove from the front .shift()

This mutates the array by dropping the first element (index 0) and returns the value that was dropped.

> var array = [1, 2, 3, 4];

> array.shift()

1

> array

[2, 3, 4]

Copy an array .slice()

This creates and returns a shallow copy of the array.

> var array = [1, 2, 3, 4];

> array.slice()

[1, 2, 3, 4]

Remove items .splice(idx, num)

This mutates the array by removing num items starting at idx and returns the removed items.

> var array = [1, 2, 3, 4, 5, 6];

> array.splice(2, 3); // remove 3 elements

[3, 4, 5]

> array

[1, 2, 6]

Image It’s your turn

Here’s an operation for adding a contact to a mailing list. It adds email addresses to the end of a list stored in a global variable. It’s being called by the form submission handler.

var mailing_list = [];

 

function add_contact(email) {

mailing_list.push(email);

}

 

function submit_form_handler(event) {

var form = event.target;

var email = form.elements["email"].value;

add_contact(email);

}

Your task is to convert this into a copy-on-write form. Here are some clues:

  1. add_contact() shouldn’t access the global variable. It should take a mailing_list as an argument, make a copy, modify the copy, then return the copy.
  2. Wherever you call add_contact(), you need to assign its return value to the mailing_list global variable.

Modify the code provided to make it follow a copy-on-write discipline. The answer is on the next page.

Image

Image

Image Answer

We had two things we needed to do:

  1. add_contact() shouldn’t access the global variable. It should take a mailing_list as an argument, make a copy, modify the copy, then return the copy.
  2. Wherever you call add_contact(), you need to assign its return value to the mailing_list global variable.

Here is how we could modify the code:

Original

var mailing_list = [];

 

function add_contact(email) {

 

 

mailing_list.push(email);

 

}

 

function submit_form_handler(event) {

var form = event.target;

var email =

form.elements["email"].value;

 

add_contact(email);

}

Copy-on-write

var mailing_list = [];

 

function add_contact(mailing_list,

email) {

var list_copy = mailing_list.slice();

list_copy.push(email);

return list_copy;

}

 

function submit_form_handler(event) {

var form = event.target;

var email =

form.elements["email"].value;

mailing_list =

add_contact(mailing_list, email);

}

What to do if an operation is a read and a write

Sometimes a function plays two roles at the same time: It modifies a value and it returns a value. The .shift() method is a good example. Let’s see an example:

var a = [1, 2, 3, 4];

var b = a.shift();

console.log(b); // prints 1

console.log(a); // prints [2, 3, 4]

returns a value

var a was modified

.shift() returns the first element of the array at the same time as it modifies the array. It’s both a read and a write.

How can you convert this to a copy-on-write?

In copy-on-write, we are essentially converting a write to a read, which means we need to return a value. But .shift() is already a read, so it already has a return value. How can we make this work? There are two approaches.

  1. Split the function into read and write.
  2. Return two values from the function.

We will see both. When you have the choice, you should prefer the first approach. It more cleanly separates the responsibilities. As we saw in chapter 5, design is about pulling things apart.

We’ll see the first approach first.

Two approaches

  1. Split function.
  2. Return two values.

Splitting a function that does a read and write

There are two steps to this technique. The first is to split the read from the write. The second is to convert the write to a copy-on-write operation. That is done in the same way as any write operation.

Splitting the operation into read and write

The read of the .shift() method is simply its return value. The return value of .shift() is the first element of the array. So we just write a calculation that returns the first element of the array. It’s a read, so it shouldn’t modify anything. Because it doesn’t have any hidden inputs or outputs, it’s a calculation.

function first_element(array) {

return array[0];

}

just a function that returns the first element (or undefined if the list is empty). it’s a calculation

We don’t need to convert first_element(), because as a read, it won’t modify the array.

The write of the .shift() method doesn’t need to be written, but we should wrap up the behavior of .shift() in its own function. We’ll discard the return value of .shift() just to emphasize that we won’t use the result.

function drop_first(array) {

array.shift();

}

perform the shift but drop the return value

Convert the write into a copy-on-write

We have successfully separated the read from the write. But the write (drop_first()) mutates its argument. We should convert it to copy-on-write.

Mutating

function drop_first(array) {

 

array.shift();

 

}

Copy-on-write

function drop_first(array) {

var array_copy = array.slice();

array_copy.shift();

return array_copy;

}

textbook copy-on-write here

Splitting the read from the write is the preferred approach because it gives us all of the pieces we need. We can use them separately or together. Before, we were forced to use them together. Now we have the choice.

Returning two values from one function

This second approach, like the first approach, also has two steps. The first step is to wrap up the .shift() method in a function we can modify. That function is going to be both a read and a write. The second step is to convert it to be just a read.

Wrap up the operation

The first one is to wrap up the .shift() method in a function we control and can modify. But in this case, we don’t want to discard the return value.

function shift(array) {

return array.shift();

}

Convert the read and write to a read

In this case, we need to convert the shift() function we’ve written to make a copy, modify the copy, and return both the first element and the modified copy. Let’s see how we can do that.

Mutating

function shift(array) {

 

return array.shift();

 

 

 

 

 

}

Copy-on-write

function shift(array) {

var array_copy = array.slice();

var first = array_copy.shift();

return {

first : first,

array : array_copy

};

}

we use an object to return two separate values

Another option

Another option is to use the approach we took on the previous page and combine the two return values into an object:

function shift(array) {

return {

first : first_element(array),

array : drop_first(array)

};

}

Because both of those functions are calculations, we don’t need to worry about the combination; it will also be a calculation.

Image It’s your turn

We just wrote copy-on-write versions of the .shift() method on arrays. Arrays also have a .pop() method, which removes the last item in the array and returns it. Like .shift(), .pop() is both a read and a write.

Your task is to convert this read and write to a read in the two different versions. Here is an example of how .pop() works:

var a = [1, 2, 3, 4];

var b = a.pop();

console.log(b); // prints 4

console.log(a); // prints [1, 2, 3]

Convert .pop() into copy-on-write versions.

1. Split the read and write into two functions

Image

Image

2. Return two values from one function

Image

Image

Image Answer

Our task was to rewrite .pop() to use a copy-on-write discipline. We will implement it in two separate ways.

1. Split the read and write into two operations

The first thing we’ll need to do is create two wrapper functions to implement the read and write portions separately.

function last_element(array) {

return array[array.length - 1];

}

 

function drop_last(array) {

array.pop();

}

this is a read

this is a write

The read is done. We won’t need to modify it anymore. But the write will need to be converted into a copy-on-write operation.

Original

function drop_last(array) {

 

array.pop();

 

}

Copy-on-write

function drop_last(array) {

var array_copy = array.slice();

array_copy.pop();

return array_copy;

}

2. Return two values

We start by creating a wrapper function for the operation. It doesn’t add any new functionality, but it will give us something to modify.

function pop(array) {

return array.pop();

}

Then we modify it to follow a copy-on-write discipline.

Original

function pop(array) {

 

return array.pop();

 

 

 

 

}

Copy-on-write

function pop(array) {

var array_copy = array.slice();

var first = array_copy.pop();

return {

first : first,

array : array_copy

};

}

Image Brain break

There’s more to go, but let’s take a break for questions

Q: How is the copy-on-write add_element_to_cart() a read?

A: The function add_element_to_cart() that implements the copy-on-write discipline is a read because it doesn’t modify the cart. You can look at it like it’s asking a question. The question might be “What would this cart look like if it also had this element?”

This last question is a hypothetical question. A lot of important thinking and planning is done with answers to hypothetical questions. Remember, calculations are often used for planning. We’ll see more examples in the pages to come.

Q: The shopping cart uses an array, and we have to search through the array to find the element with the given name. Is array the best data structure for this? Wouldn’t an associative data structure like an object be better?

A: Yes, it may be better to use an object. We often find that, in existing code, data structure decisions have already been made and we can’t easily change them. That’s the case here. We’ll have to continue with the shopping cart as an array.

Q: It seems like a lot of work to implement immutability. Is it worth it? Can it be easier?

A: JavaScript doesn’t have much of a standard library. It can feel like we’re always writing routines that should be part of the language. On top of that, JavaScript does not help us much with copy-on-write. In many languages, you would have to write your own routines for copy-on-write. It’s worth taking the time to ask if it’s worth the work.

First, you don’t have to write new functions. You can do it inline. However, it’s often more work. There’s a lot of repeated code, and each time you write it, you have to be focused enough to get it right. It’s much better to write the operations once and reuse them.

Fortunately, there are not that many operations that you’ll need. Writing these operations may feel tedious at first, but soon you won’t be writing new ones from scratch. You’ll be reusing the existing ones and composing them to make newer, more powerful ones.

Because it can be a lot of work up front, I suggest only writing the functions when you need them.

Image It’s your turn

Write a copy-on-write version of .push(), the array method. Remember, .push() adds one element to the end of an array.

function push(array, elem) {

Image

Image

}

Image Answer

function push(array, elem) {

var copy = array.slice();

copy.push(elem);

return copy;

}

Image It’s your turn

Refactor add_contact() to use the new push() function from the last exercise. Here is the existing code:

function add_contact(mailing_list, email) {

var list_copy = mailing_list.slice();

list_copy.push(email);

return list_copy;

}

 

 

function add_contact(mailing_list, email) {

Image

Image

}

Image Answer

function add_contact(mailing_list,

email) {

var list_copy = mailing_list.slice();

list_copy.push(email);

return list_copy;

}

 

function add_contact(mailing_list,

email) {

 

return push(mailing_list, email);

 

}

Image It’s your turn

Write a function arraySet() that is a copy-on-write version of the array assignment operator.

Example:

a[15] = 2;

 

function arraySet(array, idx, value) {

Image

Image

}

make a copy-on-write version of this operation

Image Answer

function arraySet(array, idx, value) {

var copy = array.slice();

copy[idx] = value;

return copy;

}

Reads to immutable data structures are calculations

Image

Reads to mutable data are actions

If we read from a mutable value, we could get a different answer each time we read it, so reading mutable data is an action.

Writes make a given piece of data mutable

Writes modify data, so they are what make the data mutable.

If there are no writes to a piece of data, it is immutable

If we get rid of all of the writes by converting them to reads, the data won’t ever change after it is created. That means it’s immutable.

Reads to immutable data structures are calculations

Once we do make the data immutable, all of those reads become calculations.

Converting writes to reads makes more code calculations

The more data structures we treat as immutable, the more code we have in calculations and the less we have in actions.

Applications have state that changes over time

We now have the tools to go through all of our code and convert everything to use immutable data everywhere. We convert all of the writes to reads. But there is a big problem that we haven’t faced: If everything is immutable, how can your application keep track of changes over time? How can the user add an item to their cart if nothing ever changes?

Kim makes a good point. We’ve implemented immutability everywhere, but we need one place that is still mutable so we can keep track of changes over time. We do have that place. It’s the shopping_cart global variable.

Image

We are assigning new values to the shopping_cart global variable. It always points to the current value of the cart. In fact, we could say we are swapping in new values of the shopping cart.

Image
Image

The shopping_cart global variable is always going to point to the current value, and whenever we need to modify it, we’ll use this swapping pattern. This is a very common and powerful pattern in functional programming. Swapping makes it really easy to implement an undo command. We will revisit swapping and make it more robust in part 2.

Immutable data structures are fast enough

Let’s be very clear: In general, immutable data structures use more memory and are slower to operate on than their mutable equivalents.

Image

That said, there are many high-performance systems written using immutable data, including high-frequency trading systems, where performance is vitally important. That’s pretty good empirical proof that immutable data structures are fast enough for common applications. However, here are some more arguments.

We can always optimize later

Every application has performance bottlenecks that are hard to predict while you’re developing. Common wisdom dictates that we avoid optimizing before we’re sure the part we’re optimizing will make a difference.

Functional programmers prefer immutable data by default. Only if they find that something is too slow will they optimize for performance with mutable data.

Garbage collectors are really fast

Most languages (but certainly not all) have had years of research and hard work, making the garbage collector very fast. Some garbage collectors have been optimized so much that freeing memory is only one or two machine instructions. We can lean on all of that hard work. However, do try it for youself in your language.

We’re not copying as much as you might think at first

If you look at the copy-on-write code that we’ve written so far, none of it is copying that much. For instance, if we have 100 items in our shopping cart, we’re only copying an array of 100 references. We aren’t copying all of the items themselves. When you just copy the top level of a data structure, it’s called a shallow copy. When you do a shallow copy, the two copies share a lot of references to the same objects in memory. This is known as structural sharing.

Functional programming languages have fast implementations

We are writing our own immutable data routines on top of JavaScript’s built-in data structures, using very straightforward code. This is fine for our application. However, languages that support functional programming often have immutable data structures built-in. These data structures are much more efficient than what we are doing. For instance, Clojure’s built in data structures are very efficient and were even the source of inspiration for other languages’ implementations.

How are they more efficient? They share a lot more structure between copies, which means less memory is used and less pressure is put on the garbage collector. They’re still based on copy-on-write.

Copy-on-write operations on objects

So far, we’ve only made copy-on-write operations for JavaScript arrays. We need a way to set the price on a shopping cart item, which is represented with an object. The steps are the same:

  1. Make a copy.
  2. Modify the copy.
  3. Return the copy.

We can make a copy of an array with the .slice() method. But there’s no equivalent way to make a copy of an object in JavaScript. What JavaScript does have, however, is a way of copying all keys and values from one object to another. If we copy all of the keys and values into an empty object, we’ve effectively made a copy. This method is called Object.assign(). Here’s how you use it to make a copy:

var object = {a: 1, b: 2};

var object_copy = Object.assign({}, object);

how you copy an object in JavaScript

We’ll use this method for copying objects. Here’s how we can use it to implement set_price(), which sets the price of an item object:

Original

function setPrice(item, new_price) {

 

item.price = new_price;

 

}

Copy-on-write

function setPrice(item, new_price) {

var item_copy = Object.assign({}, item);

item_copy.price = new_price;

return item_copy;

}

The basic idea is just the same as for arrays. You can apply this to any data structure at all. Just follow the three steps.

Image Vocab time

Shallow copies only duplicate the top-level data structure of nested data. For instance, if you have an array of objects, a shallow copy will only duplicate the array. The objects inside will be shared with both the original and the copy of the array. We will compare shallow and deep copies later on.

When two pieces of nested data share some of their references, it is called structural sharing. When it’s all immutable, structural sharing is very safe. Structural sharing uses less memory and is faster than copying everything.

JavaScript objects at a glance

JavaScript’s Objects are very much like hash maps or associative arrays that you find in other languages. Objects in JavaScript are collections of key/value pairs, where the keys are unique. The keys are always strings, but the values can be any type. Here are the operations we will use in our examples:

Look up by key [key]

This looks up the value corresponding to key. If the key doesn’t exist, you’ll get undefined.

> var object = {a: 1, b: 2};

> object["a"]

1

Look up by key .key

You can also use a dot notation to access the values. This is convenient if key fits into JavaScript’s tokenization syntax rules.

> var object = {a: 1, b: 2};

> object.a

1

Set value for key .key or [key] =

You can assign a value to a key using either syntax, which mutates the object. It sets the value for key. If key exists, it replaces the value. If the key doesn’t exist, it adds to it.

> var object = {a: 1, b: 2};

> object["a"] = 7;

7

> object

{a: 7, b: 2}

> object.c = 10;

10

> object

{a: 7, b: 2, c: 10}

Remove a key/value pair delete

This method mutates the object by removing a key/value pair given the key. You can use either look-up syntax.

> var object = {a: 1, b: 2};

> delete object["a"];

true

> object

{b: 2}

Copy an object Object.assign(a, b)

This one is complicated. Object.assign() copies all key/values pairs from object b to object a (mutating it). We can use it to make a copy of b by copying all key/value pairs to an empty object.

> var object = {x: 1, y: 2};

> Object.assign({}, object);

{x: 1, y: 2}

List the keys Object.keys()

If we want to iterate through the key/value pairs in an object, we can do it indirectly by asking the object for all of its keys using the function Object.keys(). That returns an array of the keys in an object, which we can then loop through.

> var object = {a: 1, b: 2};

> Object.keys(object)

["a", "b"]

Image It’s your turn

Write a function objectSet() that is a copy-on-write version of the object assignment operator.

Example:

o["price"] = 37;

 

function objectSet(object, key, value) {

Image

Image

}

make a copy-on-write version of this

Image Answer

function objectSet(object, key, value) {

var copy = Object.assign({}, object);

copy[key] = value;

return copy;

}

Image It’s your turn

Refactor setPrice() to use objectSet(), which we just wrote in the last exercise.

Existing code:

function setPrice(item, new_price) {

var item_copy = Object.assign({}, item);

item_copy.price = new_price;

return item_copy;

Write a function setQuantity(), using objectSet(), that sets the quantity of an item. Make sure it implements the copy-on-write discipline.

function setQuantity(item, new_quantity) {

Image

Image

}

Image Answer

function setQuantity(item, new_quantity) {

return objectSet(item, "quantity", new_quantity);

}

Image It’s your turn

Write a function setQuantity(), using objectSet(), that sets the quantity of an item. Make sure it implements the copy-on-write discipline.

function setQuantity(item, new_quantity) {

Image

Image

}

Image Answer

function setQuantity(item, new_quantity) {

return objectSet(item, "quantity", new_quantity);

}

Image It’s your turn

Write a copy-on-write version of the delete operator, which removes a key from an object.

Example:

> var a = {x : 1};

> delete a["x"];

> a

{}

 

function objectDelete(object, key) {

Image

Image

}

make a copy-on-write version of this

Image Answer

function objectDelete(object, key) {

var copy = Object.assign({}, object);

delete copy[key];

return copy;

}

Converting nested writes to reads

We still have one write left to convert to a read on our shopping cart. The operation that updates the price of an item given its name is still a write. However, that operation is interesting because it is modifying a nested data structure. It is modifying the item object inside of the shopping cart array.

It’s usually easier to convert the write for the deeper operation first. We implemented setPrice() in the exercise on page 137. We can use setPrice(), which operates on items, inside of setPriceByName(), which operates on carts.

Original

function setPriceByName(cart, name, price) {

 

for(var i = 0; i < cart.length; i++) {

if(cart[i].name === name)

cart[i].price =

price;

}

 

}

Copy-on-write

function setPriceByName(cart, name, price) {

var cartCopy = cart.slice();

for(var i = 0; i < cartCopy.length; i++) {

if(cartCopy[i].name === name)

cartCopy[i] =

setPrice(cartCopy[i], price);

}

return cartCopy;

}

typical copy-on-write pattern of copying and modifying the copy

we call a copy-on-write operation to modify the nested item

Nested writes follow the same pattern as non-nested writes. We make a copy, modify the copy, then return the copy. The only difference with nested operations is that we do another copy-on-write operation to modify the nested one.

If we modified the item directly, like we were doing in the original code, then our data would not be immutable. The references in the cart array may not change, but the values they refer to do change. That’s unnacceptable. The entire nested data structure has to remain unchanged for it to be immutable.

This is a very important concept. Everything in the nested data structure, from the top to the bottom, must be immutable. When we update a nested piece of data, we need to copy the inner value and all of the values on the way up to the top. It’s so important that we’ll spend a couple of pages really understanding what is getting copied.

What gets copied?

Let’s say we have three items in a shopping cart: a t-shirt, shoes, and socks. Let’s take an inventory of our arrays and objects so far. We have one Array (the shopping cart) and three objects (a t-shirt, shoes, and socks in cart).

We want to set the price of the t-shirt to $13. To do that, we use the nested operation setPriceByName(), like so:

shopping_cart = setPriceByName(shopping_cart, "t-shirt", 13);

Let’s step through the code and count what gets copied:

function setPriceByName(cart, name, price) {

var cartCopy = cart.slice();

for(var i = 0; i < cartCopy.length; i++) {

if(cartCopy[i].name === name)

cartCopy[i] = setPrice(cartCopy[i], price);

}

return cartCopy;

}

 

function setPrice(item, new_price) {

var item_copy = Object.assign({}, item);

item_copy.price = new_price;

return item_copy;

}

copy array

we call setPrice() only once, when the loop finds the t-shirt

copy object

We started with one array and three objects. What got copied? Well, only one array (the shopping cart) and one object (the t-shirt). Two objects were not copied. What’s going on?

We are making shallow copies of nested data, which results in structural sharing. That’s a lot of vocabulary words in one sentence. Let’s visualize it on the next page.

Image Vocab time

Let’s do a quick vocabulary review of some words we’ve already seen:

Nested data: Data structures inside data structures; we can talk about the inner data structure and the top-level data structure

Shallow copy: Copying only the top-level data structure in nested data

Structural sharing: Two nested data structures referencing the same inner data structure

Visualizing shallow copies and structural sharing

We started out with a shopping cart (one array) with three items (three objects). That’s four pieces of data total. We want to set the price of the t-shirt to $13.

Image

We then made a shallow copy of the shopping cart. At first, the copy pointed to the same objects in memory.

Image

The loop eventually found the t-shirt and called setPrice() on it. That function created a shallow copy of the t-shirt Object and changed the price to 13.

Image

setPrice() returned that copy, and setPriceByName() assigned it in the array in place of the original t-shirt.

Image

Although we had four pieces of data at the start (one array and three objects), only two of them (one array and one object) had to be copied. The other objects weren’t modified so we didn’t copy them. The original array and the copy are both pointing to everything that hasn’t changed. That’s the structural sharing that we’ve talked about before. As long as we don’t modify those shared copies, it is totally safe. Making copies allows us to keep the original and the copy without worrying that it will change.

Image It’s your turn

Let’s imagine we have a shopping_cart that has four items:

Image

When we run this line of code

setPriceByName(shopping_cart, "socks", 2);

what has to be copied? Circle all of the pieces of data that are copied.

Image Answer

We only have to copy the item that changes and everything on the path up from it. In this case, the “socks” item changes, so it is copied, and the array containing it must change to contain the new copy, so it needs to be copied, too.

Image

Image It’s your turn

Write a copy-on-write version of this nested operation:

function setQuantityByName(cart, name, quantity) {

for(var i = 0; i < cart.length; i++) {

if(cart[i].name === name)

cart[i].quantity = quantity;

}

}

 

 

function setQuantityByName(cart, name, quantity) {

Image

Image

}

Image Answer

function setQuantityByName(cart, name, quantity) {

var cartCopy = cart.slice();

for(var i = 0; i < cartCopy.length; i++) {

if(cartCopy[i].name === name)

cartCopy[i] =

objectSet(cartCopy[i], 'quantity', quantity);

}

return cartCopy;

}

Conclusion

In this chapter we learned the ins and outs of the copy-on-write discipline. Although it’s the same discipline you find in languages like Clojure and Haskell, in JavaScript you have to do all the work yourself. That’s why it’s convenient to wrap it up with some utility functions that handle everything for you. If you stick with those wrapper functions, you’ll be fine. Sticking with it is why it’s called a discipline.

Summary

  • In functional programming, we want to use immutable data. It is impossible to write calculations on mutable data.
  • Copy-on-write is a discipline for ensuring our data is immutable. It means we make a copy and modify it instead of modifying the original.
  • Copy-on-write requires making a shallow copy before modifying the copy, then returning it. It is useful for implementing immutability within code that you control.
  • We can implement copy-on-write versions of the basic array and object operations to reduce the amount of boilerplate we have to write.

Up next…

The copy-on-write discipline is nice. However, not all of our code will use the wrappers we wrote. Most of us have lots of existing code written without the copy-on-write discipline. We need a way of exchanging data with that code without it changing our data. In the next chapter, we will learn another discipline called defensive copying.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.143.4.181