path
elements are SVG’s answer to drawing irregular forms. Anything that’s not a rect
, circle
, or another simple shape can be drawn as a path
. The catch is that the syntax for defining path
values is not particularly human-friendly. For example, here is a line that we’ll generate from data in this chapter. Note the path
syntax, as specified in the element’s d
attribute and shown in Figure 11-1.
If you can read that, then you don’t need this book.
Fortunately, D3 has lots of built-in functions that generate path
s for you. You’ve already met the axis functions, which express scales as path
, line
, and text
elements. In later chapters, you’ll learn about d3.arc()
and d3.geoPath()
, both of which also generate path
s for different purposes. In this chapter, we’ll cover two other common uses of path
s: drawing line and area charts.
Let’s start with a simple line chart. Actually generating the line is quite simple, but we need some data in place first. For this chapter, I’m going to use a real-world dataset.
Line charts are great for time series, so I’ve decided to chart carbon dioxide measurements over time. I’ve downloaded the “Mauna Loa CO2 monthly mean data”, as provided by the National Oceanic & Atmospheric Administration’s Earth System Research Laboratory. (See the README.md in this chapter’s examples for details and a full citation.)
This data includes monthly average values of CO2 in parts per million, as measured at the Mauna Loa Observatory in Hawaii. An excerpt of the raw-text file looks like this:
# CO2 expressed as a mole fraction in dry air, micromol/mol, abbreviated as ppm # # (-99.99 missing data; -1 no data for #daily means in month) # # decimal average interpolated trend #days # date (season corr) 1958 3 1958.208 315.71 315.71 314.62 -1 1958 4 1958.292 317.45 317.45 315.29 -1 1958 5 1958.375 317.50 317.50 314.71 -1 1958 6 1958.458 -99.99 317.10 314.85 -1 1958 7 1958.542 315.86 315.86 314.98 -1 1958 8 1958.625 314.93 314.93 315.94 -1 1958 9 1958.708 313.20 313.20 315.91 -1 1958 10 1958.792 -99.99 312.66 315.61 -1 1958 11 1958.875 313.33 313.33 315.31 -1 1958 12 1958.958 314.67 314.67 315.61 -1 1959 1 1959.042 315.62 315.62 315.70 -1 1959 2 1959.125 316.38 316.38 315.88 -1 1959 3 1959.208 316.71 316.71 315.62 -1 1959 4 1959.292 317.72 317.72 315.56 -1 1959 5 1959.375 318.29 318.29 315.50 -1 1959 6 1959.458 318.15 318.15 315.92 -1 1959 7 1959.542 316.54 316.54 315.66 -1 1959 8 1959.625 314.80 314.80 315.81 -1 1959 9 1959.708 313.84 313.84 316.55 -1 1959 10 1959.792 313.26 313.26 316.19 -1 1959 11 1959.875 314.80 314.80 316.78 -1 1959 12 1959.958 315.58 315.58 316.52 -1 1960 1 1960.042 316.43 316.43 316.51 -1
Note that the first column indicates a year, while the second indicates a month (1–12). This version of the file contains values from March 1958 through January 2017. The fourth column, “average,” is the CO2 value we’re interested in.
You probably noticed that these values aren’t comma-separated; rather, they are provided in fixed-width columns in an otherwise unformatted text file. D3 needs a little more structure than this, so my first step is to wrangle this into CSV form, using the free, amazing tool OpenRefine.
I won’t detail the steps involved here, as there are other books and tutorials on using OpenRefine. In short, I cut out everything but the three relevant columns, and my resulting file, mauna_loa_co2_monthly_averages.csv, looks like this:
year,month,average 1958,3,315.71 1958,4,317.45 1958,5,317.5 1958,6,-99.99 1958,7,315.86 1958,8,314.93 1958,9,313.2 1958,10,-99.99 1958,11,313.33 1958,12,314.67 1959,1,315.62 1959,2,316.38 1959,3,316.71 1959,4,317.72 1959,5,318.29 1959,6,318.15 1959,7,316.54 1959,8,314.8 1959,9,313.84 1959,10,313.26 1959,11,314.8 1959,12,315.58 1960,1,316.43 …
Much better! Next, I’ll define a row conversion function, as described in Chapter 5.
var
rowConverter
=
function
(
d
)
{
return
{
//Make a new Date object for each year + month
date
:
new
Date
(
+
d
.
year
,
(
+
d
.
month
-
1
)),
//Convert from string to float
average
:
parseFloat
(
d
.
average
)
};
}
This will convert the three columns of data in our CSV into just two: date
and average
. The date
consists of a new Date()
, a new JavaScript Date
object, into which we are passing the year and the month. The +
operator, in this context, forces the d.year
and d.month
values to be typed as numbers instead of strings. And because JavaScript’s month counting begins at zero (as in, 0 = January, 1 = February, and 2 = March) but the data’s month counting begins at one (1 = January), I’ve subtracted one from each month value. (See MDN’s docs on the Date
object for more.)
The average
value is parsed from a string to a float.
We call the row conversion function when we load in the CSV data:
//Load in data
d3
.
csv
(
"mauna_loa_co2_monthly_averages.csv"
,
rowConverter
,
function
(
data
)
{
var
dataset
=
data
;
//Print data to console as table, for verification
console
.
table
(
dataset
,
[
"date"
,
"average"
]);
//…
Feast your eyes on console.table()
, a fancy improvement on console.log()
that prints everything in nice, pretty columns that are easier on your brain. Here I’ve (optionally) specified the names of the two columns of interest: date
and average
. You can verify for yourself in Figure 11-2 that the values match that from our original dataset.
Isn’t it beautiful?
The process of setting up our scales should be familiar by now. First, xScale
will handle time, and yScale
handles the CO2 values. I’ve set a zero baseline (low domain value of 0) to start.
xScale
=
d3
.
scaleTime
()
.
domain
([
d3
.
min
(
dataset
,
function
(
d
)
{
return
d
.
date
;
}),
d3
.
max
(
dataset
,
function
(
d
)
{
return
d
.
date
;
})
])
.
range
([
0
,
w
]);
yScale
=
d3
.
scaleLinear
()
.
domain
([
0
,
d3
.
max
(
dataset
,
function
(
d
)
{
return
d
.
average
;
})])
.
range
([
h
,
0
]);
With our data and scales in place, finally we get to draw a line! (It’s the small things in life…)
Start by defining a line generator function, much as we defined axis generator functions to make axes. To do this, we call d3.line()
, making sure to specify x and y accessors.
//Define line generator
var
line
=
d3
.
line
()
.
x
(
function
(
d
)
{
return
xScale
(
d
.
date
);
})
.
y
(
function
(
d
)
{
return
yScale
(
d
.
average
);
});
The x and y accessors tell the line generator how to decide where to place each point on the line. Note that for x, we specify the scaled value of d.date
, and y gets the scaled value of d.average
. Later, when the line generator is called, it will look for the bound data and loop through each value, using the accessor logic here to calculate where to position each point. Drawing the line itself is really just a matter of connecting those carefully positioned dots.
For this barebones example, I then create the SVG element and finally append a new path
element.
//Create SVG element
var
svg
=
d3
.
select
(
"body"
)
.
append
(
"svg"
)
.
attr
(
"width"
,
w
)
.
attr
(
"height"
,
h
);
//Create line
svg
.
append
(
"path"
)
.
datum
(
dataset
)
.
attr
(
"class"
,
"line"
)
.
attr
(
"d"
,
line
);
Huh? What happened to your old friend, the selectAll/data/enter/append pattern?
Until now, our examples involved one graphical mark corresponding to one data value (or one “row” of related values), as in our bar charts and scatterplots. A line chart, however, calls for a single graphical mark that represents many data values. We can skip the selectAll/data/enter/append pattern, because we already know how many new marks we need: just one.
Instead of using data()
to bind each value in our dataset
array to a different element, we use datum()
, the method for binding a single data value to a single element. The entire dataset
array is bound to the new path
we just created.
Following that, we assign a class of line
(to enable easy CSS selection and styling) and then—finally—set a d
attribute, passing in our line generator function as an argument. Since the data has already been bound to the path
, the line generator simply grabs that data, plots the points as we specified, and draws a line connecting them. This produces the output we saw in Figure 11-1, at the start of this chapter. The final code is in 01_line_chart.html.
To verify how the entire dataset
was bound to one element, try selecting that line in the console using d3.select(".line")
. Note the complete, 707-value-long array that appears under __data__
in Figure 11-3.
In 02_line_chart_axes.html, I’ve added some padding on the left and bottom edges plus axes, as shown in Figure 11-4.
Now, with the axes labeled, it’s time for a sanity check: could CO2 levels really have dropped below zero, as those downward spikes seem to indicate? (Are negative CO2 ppm values even possible? Answer: no.)
This spiky anomaly is explained by a note in the original datafile:
# (-99.99 missing data; -1 no data for #daily means in month)
Ah, so carbon-measuring machines (and even scientists themselves) are not infallible! A –99.99 value is not a true measurement, but stands in for “no data available for that month.”
We could manually remove those –99.99 values from our dataset. Or we could leave the data untouched, and use the line generator’s defined()
method to determine, on the fly, whether or not each individual value is defined (or valid). defined()
is just another configuration method, like x()
and y()
. If the result of its anonymous function is true, then that data value is included. If not, the value is excluded.
This simple use of defined()
checks merely to see if any value exists at all:
.
defined
(
function
(
d
)
{
return
d
;
})
So, if d
exists, this will be interpreted as true.
But in our case, the –99.99 values do exist; it’s just that we humans know they aren’t valid measurements. To exclude them, we could use a comparison operator, which will return a logical result (true or false):
//Define line generator
var
line
=
d3
.
line
()
.
defined
(
function
(
d
)
{
return
d
.
average
>=
0
;
})
.
x
(
function
(
d
)
{
return
xScale
(
d
.
date
);
})
.
y
(
function
(
d
)
{
return
yScale
(
d
.
average
);
});
For every row in the dataset, if d.average
is greater than or equal to zero, this will return true, and the value will be included. All negative values, including –99.99, will be thrown out.
See 03_line_chart_missing.html for that final code, which renders the chart shown in Figure 11-5.
Note the veeeeery tiny gaps, where there used to be (graphically dishonest) spikes.
Actually, this would be a nice time to show off SVG’s fundamentally scalable nature. Use your browser’s zoom functionality to zoom in to this chart as much as possible. Note that the entire chart—consisting of vector SVG elements—remains as sharp and clear as ever. Figure 11-6 shows our chart at 500% zoom, where the gaps are clearly visible.
In 04_line_chart_adjusted.html, I’ve adjusted the yScale
domain to emphasize the upward trajectory, so there is no longer a zero baseline. I’ve also manually added a red line corresponding to 350 ppm, considered the maximum “safe” level of atmospheric CO2 by many scientists, and used the stroke-dasharray
property to make it a dashed line. See Figure 11-7.
At this point, we are dealing in visual rhetoric—design decisions that express a point of view by influencing how others interpret our chart. (For great examples of climate change–related visual rhetoric, see Duarte Design’s graphics and animations, as used by Al Gore in the Academy Award–winning An Inconvenient Truth.)
Having acknowledged that, let’s say I want to emphasize the portions of the line that are above the 350 ppm line. I could approach that by simply creating a second data-driven line. I could use defined()
to include only data points of 350 or greater for the new red line, and to exclude such points for the original teal line:
//Define line generators
line
=
d3
.
line
()
.
defined
(
function
(
d
)
{
return
d
.
average
>=
0
&&
d
.
average
<=
350
;
})
.
x
(
function
(
d
)
{
return
xScale
(
d
.
date
);
})
.
y
(
function
(
d
)
{
return
yScale
(
d
.
average
);
});
dangerLine
=
d3
.
line
()
.
defined
(
function
(
d
)
{
return
d
.
average
>=
350
;
})
.
x
(
function
(
d
)
{
return
xScale
(
d
.
date
);
})
.
y
(
function
(
d
)
{
return
yScale
(
d
.
average
);
});
That, plus some styling changes for each line and a new text label, produces what you see in 05_line_chart_labeled.html and Figure 11-8.
This is just one of many possible visual solutions. A close look reveals small gaps in the line near the 350 ppm mark. An alternative approach would be to render two full lines in their entirety, but mask them using SVG clipping paths, as described in “Containing visual elements with clipping paths” in Chapter 9.
Areas are not too different from lines. If a line is a series of connected x/y points, then an area is just that same line, plus a second such line (usually a flat baseline), with all the space in between filled in.
Figure 11-9 illustrates this and shows what we’re working toward.
The main difference is calling d3.area()
to define an area generator instead of a line generator:
area
=
d3
.
area
()
.
defined
(
function
(
d
)
{
return
d
.
average
>=
0
;
})
.
x
(
function
(
d
)
{
return
xScale
(
d
.
date
);
})
.
y0
(
function
()
{
return
yScale
.
range
()[
0
];
})
.
y1
(
function
(
d
)
{
return
yScale
(
d
.
average
);
});
You’ll notice I’ve specified x
, y0
, and y1
accessors. The x
accessor is unchanged from earlier. y0
represents the area’s baseline, while y1
represents the top, or data value. You’ll notice that y1
is the same as our earlier y
accessor. But y0
is new. For every data value, this accessor returns the first value in yScale
’s range: that is, the “bottom” edge of the chart.
I don’t know about you, but to me this sure feels like a whole lot more carbon dioxide. (Suddenly, I hope you are reading this as an ebook powered by renewable energy. If you’re reading the dead-tree version, don’t feel bad; use what you’re learning here for good.)
One potential downside of an area chart is that, if there are missing data points, those absences will be keenly felt, as in Figure 11-9. Those slices punctuating the data don’t sit right with me, but they are graphically honest. You could go fudge the data to patch those holes, but then how would you sleep at night?
I’ll define the second area (the red one) like so:
dangerArea
=
d3
.
area
()
.
defined
(
function
(
d
)
{
return
d
.
average
>=
350
;
})
.
x
(
function
(
d
)
{
return
xScale
(
d
.
date
);
})
.
y0
(
function
()
{
return
yScale
(
350
);
})
.
y1
(
function
(
d
)
{
return
yScale
(
d
.
average
);
});
Note that, in this case, I specify a baseline value of yScale(350)
—that is, the 350 ppm mark—because this area doesn’t need to extend all the way down to the bottom of the chart. Nice!
Later in the code, we make sure to call the new area generators instead of the old line generators:
//Create areas
svg
.
append
(
"path"
)
.
datum
(
dataset
)
.
attr
(
"class"
,
"area"
)
.
attr
(
"d"
,
area
);
// <-- Area!
svg
.
append
(
"path"
)
.
datum
(
dataset
)
.
attr
(
"class"
,
"area danger"
)
.
attr
(
"d"
,
dangerArea
);
// <-- Area!
I also made some minor adjustments to variable names and CSS properties. But mainly you only need to change line
to area
and specify y0
and y1
accessors.
See that chart in 06_area_chart.html and Figure 11-10.
As a designer, I must point out that, yes, you can see the tiniest bit of the teal area bleeding around the edges of the red area. Most people won’t notice this, but if you do, I’m sorry for you, but glad to meet someone else with the same affliction. In any case, you could remove that cleanly by using SVG clipping paths (again, see Chapter 9).
18.191.62.122