Gnuplot provides a facility to map numeric values into a continuous range of colors. This allows you to create false-color plots (or heatmaps): graphs in which a numeric value is represented by a color, instead of by position.
In this appendix, we’ll first study how to set up such mappings between numbers and colors. Because the values in a false-color plot typically form a continuum, the mapping usually consists of a continuous color ramp or color gradient. The gnuplot command to create and manage such color gradients is set palette, and hence I’ll refer to these mappings as palettes. After introducing the commands to define palettes, we’ll look in detail at a catalog of example palettes for a variety of different applications and point out their strengths and weaknesses.
Next, we’ll study ways to use palettes to create colored graphs in gnuplot. Gnuplot’s palette feature is quite flexible, and you can use it for a variety of purposes, of which heatmaps are only one (but arguably the most important) example.
Before discussing all the various commands and options in detail, let’s quickly look at an example that demonstrates the capabilities that’ll be introduced in this appendix. Consider again the function that was studied in appendix C. There, you saw surface plots (such as figure C.1) and contour plots (figure C.8) of this function. Now we’re going to add some color!
Let’s begin simply. The following commands create a colored figure (figure D.1):
set isosamples 200
set palette defined ( 0 "blue", 0.5 "light-grey", 1 "red" )
splot [-2:2][-2:2] exp(-(x**2 + y**2))*cos(x/4)*sin(y)*cos(2*(x**2+y**2)) w pm3d
Two items are new. First, the set palette command is used to create a smooth color gradient palette in gnuplot. The w pm3d style in the splot command then uses this palette to create a colored surface by mapping the surface elevation into the palette. (The abbreviation stands for “palette-mapped three-dimensional” plot—you can see why.)
Of course, other color schemes are possible—that’s the point of the set palette command. Later in this appendix (in section D.2), you’ll learn how to set up your own palettes using the set palette command.
Color palettes are available not only for surface plots (as in figure D.1) but also for many other kinds of plots. Figure D.2 shows the same function again, but instead of using contour lines, it now uses only color to indicate the local value of the function at each point.
The graph in the left panel was constructed with a very simple palette, using the following commands:
set isosamples 400
set view map; set size square
set palette defined (-1 'white', 0 'blue', 0 'red', 1 'white' )
splot [-2:2][-2:2] exp(-(x**2 + y**2))*cos(x/4)*sin(y)*cos(2*(x**2+y**2)) w pm3d
The palette for the other graph is a tad more complicated; you’ll see its definition later, in section D.2.5.
For yet another kind of graph that uses a color palette, go back to figure 9.5, where we discussed data-dependent colors. The time has come to introduce all the detail that we skipped back then!
Gnuplot’s command to create color gradients is set palette. The command is pretty smart and does a lot of legwork for you. It’s also fairly complex and combines many different bits of functionality. In fact, one of its problems is its almost promiscuous flexibility: the command places few limits on user input, which can be both very convenient and very confusing, depending on whether you understand how gnuplot processes the arguments.
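There are basically two different ways to define a color gradient using set palette: you can specify a set of colors at fixed positions (nodes) and let gnuplot interpolate between them, or you can provide three functions, one for each color component.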
The set palette command can handle five different color models: RGB (red-green-blue), HSV (hue-saturation-value), CMY (cyan-magenta-yellow; mostly used in print publishing), YIQ (US color television standard), and CIE XYZ (a model based on color perception).[1] For computer graphics, RGB and HSV are by far the most important.
[1] Check the respective Wikipedia pages for details.
You can select the color model using the model keyword. For example:
set palette model RGB
set palette model HSV
The color model selection is sticky, meaning it remains in effect until explicitly changed: you don’t need to include the model specification in every call to set palette. The default is the RGB model.
The color model in set palette is sticky and silently stays in effect until explicitly overridden. Unless the color model is RGB, I always include the explicit choice of color model in each call to set palette.
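The stickiness is easy to trip over in an interactive session. Here is a minimal sketch of the defensive habit just described; the two palettes are only illustrative choices.

```gnuplot
# Rainbow in HSV: hue runs from 0 to 1, saturation and value stay at 1.
set palette model HSV defined ( 0 0 1 1, 1 1 1 1 )

# The model is still HSV at this point. Naming RGB explicitly makes sure
# that the named colors below are interpreted as intended.
set palette model RGB defined ( 0 "blue", 0.5 "light-grey", 1 "red" )

# Print the current palette settings, including the active color model.
show palette
```

Without the explicit model RGB in the second command, the named colors would be split into components and read as hue, saturation, and value.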
All color models represent color through exactly three components, and formulas exist to translate between the different color models. (Transparency would require a fourth component, but set palette doesn’t handle transparent colors, so three components are sufficient.)
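To specify the three components, the set palette command accepts three different formats: a named color (such as "red"), a hexadecimal string (such as "#ff0000" or "0xff0000"), or a triple of numbers between 0 and 1 (such as 1 0 0).
Here are two colors in all three formats:
"red"   "#ff0000"   1 0 0
"grey"  "0xc0c0c0"  0.75 0.75 0.75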
Here’s the first potential for confusion: the set palette command separates the input into three components (let’s call them A, B, and C) and interprets them according to the currently selected color model. Every format is permissible for every color model, and gnuplot makes no attempt to “guess” the intended color model from the input format.
Let’s consider an example. Regardless of whether your input was "red" or "#ff0000" or 1 0 0, gnuplot will interpret it as follows: first channel (component A) at maximum, and the other two channels (components B and C) at minimum. If the chosen color model is RGB, then this amounts to “full red, no green or blue.” But if the chosen color model is HSV, this means “full hue, no saturation or value” (that is, black, by virtue of the zero value in the third channel). Similarly, the arguments "grey", "0xc0c0c0", and 0.75 0.75 0.75 will all be interpreted as “all three channels at three-quarters of their maximum.” In the RGB space, this amounts to a light gray; but in the HSV space, the color will be a medium purple! In every case, the input is split into its three components first, and only then are the components interpreted according to the currently active color model. (In fairness, set palette warns you if you try to use named colors with color spaces other than RGB.)
The set palette command doesn’t restrict the input format to appropriate color models. Instead, all input is first broken into components, which are only then interpreted according to the currently selected color model.
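You can watch this happen by feeding the same numeric triples to both color models. This is only a sketch, and it assumes that an interactive terminal is active, so that test palette has somewhere to draw.

```gnuplot
# Nodes interpreted as RGB: a ramp from full red to light gray.
set palette model RGB defined ( 0 1 0 0, 1 0.75 0.75 0.75 )
test palette
pause -1 "RGB interpretation; press return to continue"

# The very same nodes interpreted as HSV: the ramp starts at black
# (value 0) and ends at a medium purple.
set palette model HSV defined ( 0 1 0 0, 1 0.75 0.75 0.75 )
test palette

# Restore the default model so that later commands are not affected.
set palette model RGB
```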
You can define a palette by specifying a set of colors at fixed locations within the plot range, and gnuplot will fill in the gaps in between. This is, in my opinion, the most useful way to create palettes in practice.
You use the defined keyword to specify the list of colors and positions. You can use any of the color formats discussed in the previous section, and hence the following four commands are equivalent:
set palette defined ( -1 "red", 0 "white", 1 "blue" )
set palette defined ( -1 "#ff0000", 0 "#ffffff", 1 "#0000ff" )
set palette defined ( -1 1 0 0, 0 1 1 1, 1 0 0 1 )
set palette defined ( -1 "red", 0 "#ffffff", 1 0 0 1 )
The defined keyword must be followed by a comma-separated list in parentheses. Each of the entries between the commas consists of a position, followed by a color. Don’t get confused when using the third format (numeric triples): each entry indeed consists of four numbers, without any punctuation between them.
There’s no limit on the number of entries in the list of colors. The values used as positions aren’t restricted: they can be positive or negative, integer or floating-point. The only constraint is that they must form a non-decreasing sequence of numbers. It’s possible to repeat a position, and doing so can be useful to create palettes with sharp transitions, as we’ll discuss later. Two examples:
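A sketch of two palettes that exercise these rules follows. The node positions and colors are illustrative assumptions on my part, not definitions taken from elsewhere in this appendix (except that the second one reuses the palette from figure D.2).

```gnuplot
# Five nodes at unevenly spaced positions; negative and floating-point
# values are fine as long as the sequence never decreases.
set palette defined ( -2.5 "dark-blue", -1 "blue", 0 "white", 0.5 "orange", 4 "dark-red" )

# Position 0 appears twice, so the color jumps from blue to red at that
# point instead of blending smoothly.
set palette defined ( -1 "white", 0 "blue", 0 "red", 1 "white" )
```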
This brings us to the second potential source of confusion when using set palette: the positions need not correspond to actual values from the plot range! In fact, they aren’t absolute positions at all—they’re merely relative positions in the interval spanned by the smallest and the greatest of the positions (that is, by the first and the last entries between the parentheses). The following four commands all define exactly the same palette:
set palette defined ( -1 "red", 0 "white", 1 "blue" )
set palette defined ( -10 "red", 0 "white", 10 "blue" )
set palette defined ( 0 "red", 0.5 "white", 1 "blue" )
set palette defined ( 7 "red", 14 "white", 21 "blue" )
In every case, the total range of values spanned by the positions will be shifted and scaled to the unit interval. All nodes retain their relative positions in the process. In particular, the positions need not correspond to actual values from the plot range; you use a special command (set cbrange—see section D.3) to map the entire palette to the desired plot range.
The positions assigned to nodes when using set palette defined (...) aren’t absolute positions; they’re relative positions within the interval of values spanned by the first and last node. Gnuplot internally normalizes the values by shifting and scaling the nodes into the unit interval for further processing.
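One way to convince yourself of the normalization is to list the nodes after each definition. In this sketch, both calls to show palette gradient should report the same node list; the cbrange at the end is an arbitrary example value.

```gnuplot
# Two definitions with different absolute positions but the same
# relative spacing.
set palette defined ( -10 "red", 0 "white", 10 "blue" )
show palette gradient

set palette defined ( 7 "red", 14 "white", 21 "blue" )
show palette gradient

# The numeric range is attached in a separate step, through cbrange.
set cbrange [-3:3]
```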
Gnuplot interpolates linearly between nodes, for each component separately. The interpolation is done in the currently selected color space, and only afterward are the interpolated values transformed to RGB for actual presentation. For example, the following command creates a standard rainbow palette:
set palette model HSV defined ( 0 0 1 1, 1 1 1 1 )
In HSV space, gnuplot interpolates linearly between a hue value of 0 and a hue value of 1, but if you look at the resulting palette in RGB space, the behavior is most definitely not linear.
It’s possible to read the nodes from file, using the file keyword. The file is treated as a regular data file, which means the using directive is available to pick out columns and to apply inline transformations.
Gnuplot expects three or four columns (the position, followed by the three color components). If only three columns are provided, the nodes that are read from file will be spaced uniformly along the palette. Again, the values of the color components that are read from file will be interpreted according to the currently active color model!
Here are two syntax examples, with and without an explicit using directive:
set palette file "palette.gpf"
set palette file "palette.gpf" using 1:2:3:4
The color values read from file (or resulting from an inline transformation) should be between 0 and 1. Values exceeding this interval are truncated back into this interval.
Any color values that are read from file using set palette file ... and that exceed the unit interval are truncated back into the unit interval.
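Many palette files found in the wild store color components as integers between 0 and 255. The following sketch shows how an inline transformation can rescale them. The file name rgb255.dat and its layout (position, then red, green, and blue) are assumptions made for this example.

```gnuplot
# Hypothetical file "rgb255.dat" with four columns:
# position, red, green, blue (components in the range 0..255).
# The using directive rescales each component into the unit interval.
set palette model RGB file "rgb255.dat" using 1:($2/255.):($3/255.):($4/255.)

# Check the result visually.
test palette
```

Naming the model explicitly guards against a sticky model left over from an earlier command.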
Rather than provide a set of nodes and let gnuplot interpolate between them, you can also provide a set of three functions (one for each color component). Depending on the color model chosen, the components will be interpreted as red, green, and blue channels (RGB), or as hue, saturation, and value channels (HSV), and so on.
Instead of using defined, you use the functions keyword in the set palette command. The functions are arbitrary, but each one must map the unit interval [0:1] into the unit interval [0:1]. The independent (dummy) variable in the function definition must be called gray (not x, not grey!).
Defining palettes using functions works best in the relatively intuitive HSV color space. Here are a few simple examples:
When defining a palette through functions with set palette functions ..., don’t forget that the independent (dummy) variable in the function definitions must be called gray. If you choose any other variable name (such as x), gnuplot will not warn you—things just won’t work!
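The following sketch shows three function-based palettes in HSV space. The particular functions are my own illustrative choices; each one maps the unit interval into the unit interval, and each uses gray as its variable.

```gnuplot
# Standard rainbow: hue sweeps the full circle at full saturation and value.
set palette model HSV functions gray, 1, 1

# Hue runs from blue (about 0.66) down to red (0), so the palette
# doesn't wrap around.
set palette model HSV functions 0.66*(1-gray), 1, 1

# Fixed red hue; saturation grows from 0 (white) to 1 (full red).
set palette model HSV functions 0, gray, 1
```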
The set palette command offers some further features that I want to mention in passing. Check the gnuplot standard reference documentation for details.
The command set palette cubehelix (which is intended to be used with the RGB color model) creates a palette that traces out a rainbow, while simultaneously increasing color intensity. Several sub-options allow you to control various aspects of the resulting palette.
The set palette rgbformulae command is intended to reduce the size of PostScript files. When using this option, you must define your palette in terms of several predefined functions. When exporting your graph to PostScript, these functions are written to the PostScript file so that the palette can be evaluated analytically when the PostScript file is processed. Because the palette need not be saved explicitly, the resulting PostScript file is smaller.
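To give you an idea of what these two features look like in use, here is a brief sketch. The numeric values are arbitrary choices; consult the reference documentation for the meaning of each sub-option and formula number.

```gnuplot
# Both features are meant for the RGB model, so select it explicitly.
set palette model RGB

# Cubehelix with explicit sub-options.
set palette cubehelix start 0.5 cycles -1.5 saturation 1
test palette

# List the predefined formulas, then pick three of them by number
# (one each for the red, green, and blue channels).
show palette rgbformulae
set palette rgbformulae 33,13,10
```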
The following command is a valuable tool when developing palettes:
test palette
It produces a figure (using the current settings of set output and set terminal!) like the one in figure D.3. The bottom panel shows the palette as a spectrum of colors; the top panel shows the intensity levels of the red, green, and blue components. (Regardless of the choice of color model, the test palette command always displays RGB components.) Also included in the top panel is the value of the NTSC luminance, which corresponds to the Y channel in the YIQ model. It indicates the result if the color were displayed on a black-and-white TV screen.
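Because test palette honors the current terminal and output settings, you can send the diagnostic figure to a file. This sketch assumes that your gnuplot build includes the pngcairo terminal; the file name is arbitrary.

```gnuplot
set terminal pngcairo size 640,480
set output "test-palette.png"

set palette defined ( 0 "blue", 0.5 "white", 1 "red" )
test palette

# Close the output file.
set output
```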
In addition to this graphical visualization of the palette with test palette, you can query the current definition using
show palette
This prints the current values of all sub-options of set palette to the screen, including the three function definitions if the palette was defined using set palette functions. If the palette was created from a set of discrete nodes using set palette defined (...), then the printed message won’t include explicit information on the nodes. Instead, you must use
show palette gradient
to list them.
It may be desirable to export a finished palette as a set of RGB triples: for example, to use it in another application. You can do so using the show palette palette command:
show palette palette {int:n} [ int | float ]
This prints a list of RGB triples to the output channel identified by the current value of set print. The mandatory integer argument gives the number of intermediate colors in the exported palette. By default, the output is formatted for human readability, but by giving either the int or the float keyword, you can obtain a listing that’s easier to parse from a program. (With int, the RGB values are integers in the range 0 to 255; with float, they’re floating-point values between 0 and 1.)
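Putting the pieces together, an export session might look like the following sketch. The file name and the number of colors are arbitrary choices.

```gnuplot
set palette defined ( 0 "blue", 0.5 "white", 1 "red" )

# List the nodes of the palette on the screen.
show palette gradient

# Redirect printed output to a file and export 16 colors as
# floating-point RGB triples.
set print "palette-rgb.txt"
show palette palette 16 float

# Restore the default output channel.
set print
```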
All the details supplied to the set palette command are included in the information persisted with save; hence an existing graph (and its palette) can be re-created using load, as you’d expect. You should be aware, though, that gnuplot saves the color for each node in a palette only in the third format, which consists of three numbers between 0 and 1. In particular, gnuplot will not write the names of named colors to file when using save. The resulting command files are therefore, by construction, much less human-readable. If you want to save a palette to file with the idea of later continuing to edit it manually, then you should edit the respective command file in an editor (see section 12.2), rather than use gnuplot’s save command.
Even though you may have defined a palette in terms of named colors, gnuplot will only persist the equivalent numerical values for each of the three color components to file. To retain the named colors, you must edit and save the command file in an external editor.
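You can observe this behavior with a short experiment, sketched here with an arbitrary file name. After running it, open session.gp in an editor and look for the set palette line: the nodes should appear as numeric triples, not as color names.

```gnuplot
set palette defined ( 0 "blue", 0.5 "white", 1 "red" )

# Write all current settings, including the palette, to a command file.
save "session.gp"

# Later, possibly in a new session: restore the settings and list the nodes.
load "session.gp"
show palette gradient
```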
Let’s look at some example palettes. The purpose here is twofold: first, I want to demonstrate how to use the set palette command; second, I want to provide you with recommendations for palettes and specific color choices that work well in practice. (For all the following, compare figure D.4.)[2] Defining your own palettes is particularly important because gnuplot’s default palette isn’t very good. To obtain good results, you need to define your own.
[2] You can find a very large collection of palettes at http://soliton.vm.bytemark.co.uk/pub/cpt-city. Although many of them are intended for graphic design, not data visualization, the site is a terrific source of inspiration.
Because gnuplot’s default palette isn’t very good, it’s essential that you learn how to define your own palettes. Moreover, false-color plots generally need palettes that have been adapted to the specific graph in order to be effective, necessitating you to have the required know-how.
Two palettes are particularly suitable to demonstrate the various features of the set palette command: the linear grayscale and the standard rainbow (panels A and B, respectively, in figure D.4). Here are four equivalent ways of creating a linear grayscale palette, using all combinations of the RGB and HSV color model together with either the defined or the functions feature:
set palette defined ( 0 "black", 1 "white" )
set palette functions gray, gray, gray
set palette model HSV defined ( 0 0 0 0, 1 0 0 1 )
set palette model HSV functions 0, 0, gray
The standard rainbow only makes sense to define in HSV space. Here are two ways, using nodes and functions, respectively:
set palette model HSV defined ( 0 0 1 1, 1 1 1 1 )
set palette model HSV functions gray, 1, 1
Neither of these two palettes works particularly well in practice. The standard rainbow doesn’t convey an intuitive sense of ordering; it wraps around (so that the highest and the smallest values are assigned similar colors); and its garish colors don’t aid perception. Graphs drawn using a simple grayscale are dominated by shades of black and white, which are harder to distinguish than shades of darker and lighter grays. Moreover, a grayscale isn’t in color!
Here are three palettes that are simple enough to keep in mind, so that you can recreate them on the spot. The first consists of the colors blue, white, and red (panel C in figure D.4):
set palette defined ( 0 "blue", 0.5 "white", 1 "red" )
You should commit this palette to memory! It has several advantages:
Despite its extreme simplicity, the blue/white/red palette is amazingly convenient and versatile. The color white should be replaced by a light gray if white would be hard to distinguish from the color of the background.
Another combination of colors that conveys a sense of ordering consists of the traffic light colors: red, yellow, and green (panel D in figure D.4). In addition to the sense of ordering, this particular trio of colors invariably also conveys a semantic “good/bad” meaning—which may or may not be appropriate:
set palette defined ( 0 "web-green", 0.5 "goldenrod", 1 "red" )
set palette defined ( 0 "#00c000", 0.5 "#ffc020", 1 "#ff0000" )
set palette defined ( 0 0 3/4. 0, 1 1 3/4. 1/8., 2 1 0 0 )
The three definitions given are equivalent—the one using gnuplot’s color names may be the easiest to remember, the one using hex strings is the most portable, and the one using numeric fractions makes the specific choice of colors clearest. In this palette, I use a warmer green and a dark shade of yellow: I’ve found that both colors look better than pure green and yellow at full intensity.
The red/yellow/green color combination conveys not only a high/low but also a semantic good/bad distinction. You need to decide whether this is appropriate for the visualization task at hand. Keep in mind that a pure yellow may be hard to see against a white background; instead, it’s better to use a darker shade of yellow or even a light orange. Also keep in mind that this palette will pose difficulties for people with red/green color vision deficiencies.
The last simple workhorse is the “improved rainbow” (panel E in figure D.4):
set palette model RGB defined ( 0 'blue', 1 'cyan', 2 'green', 3 'yellow', 4 'red', 5 'magenta' )
It fixes the two biggest problems of the standard rainbow: because the colors run from blue (cold) to red and magenta (hot), it conveys an intuitive sense of ordering. Moreover, because it doesn’t wrap around, the color mapping is unambiguous. The improved rainbow isn’t a very good palette, but it’s much better than the standard rainbow, and—because it only depends on primary colors as nodes—it’s still easy to remember and simple to set up.
The following palette, which is a further refinement on the improved rainbow palette, is similar to the default color palette in MATLAB (panel F in figure D.4):
set palette defined ( 0 'dark-blue', 1/8. 'blue', 3/8. 'cyan', 5/8. 'yellow', 7/8. 'red', 1 'dark-red' )
This is a good, all-purpose palette with a strong intuitive sense of ordering and several distinct visual gradients toward the center of the plot range. Furthermore, the central part of the palette consists of softer, pastel colors, which results in graphs that are easier on the eyes.
In a similar spirit, here’s an improved version of the red/yellow/green color combination. Like the most recent palette, this also uses pastel colors in the center of the plot area (panel G in figure D.4):
set palette defined ( 0 0 0.55 0.2, 0.3 0.65 0.85 0.4, 0.5 1 1 0.75, 0.7 0.95 0.68 0.38, 1 0.75 0 0.15 )
The next two palettes are very good but also very subtle: each is constructed from a large number of nodes with precisely chosen colors for each node. Instead of giving you an approximation, which would necessarily lose some of the subtlety of either palette, I recommend that you download the original files.
The first of these palettes[3] is based on work by the oceanographer William F. Haxby (panel H in figure D.4). It’s similar in spirit to the MATLAB palette you just saw, but it works with even milder colors. Highly recommended!
[3] “Hx-110-110,” cpt-city, http://mng.bz/T48T.
The other palette[4] was developed by computer scientist Kenneth Moreland and is an improvement on the simple blue/white/red palette (panel J in figure D.4). It uses non-saturated colors and special, nonlinear color ramps and thereby achieves an uncommonly uniform variation of intensity across the entire spectrum. Highly recommended!
[4] “Cool-warm,” cpt-city, http://mng.bz/m8bp.
It shouldn’t be lost on you that all the palettes in this section are improvements on simple palettes that were introduced earlier. The difference doesn’t come from a conceptual breakthrough, but from a relentless attention to detail.[5]
[5] You can find another interesting palette at www.gnuplotting.org/matlab-colorbar-parula-with-gnuplot. This palette combines a continuous change in hue with an almost linear increase in lightness over the entire color range.
High-quality palettes can’t be improvised but are the result of many incremental improvements in detail. Become familiar with and use high-quality palettes that are available.
All the palettes discussed so far consisted of smooth transitions and were suitable for continuously varying data. But sometimes that’s not what you want: sometimes it’s necessary to indicate sharp boundaries between categories. (Think of the difference between a political and a topographical map!)
Using the set palette command, you create a sharp color transition by listing one of the positions twice, giving a different color each time. The following traffic-light palette consists only of sharp transitions (panel K in figure D.4):
set palette defined ( 0 "green", 1 "green", 1 "yellow", 2 "yellow", 2 "red", 3 "red" )
Palettes can feature sharp transitions together with smooth gradients. Here’s a variant of the blue/white/red palette that emphasizes the break at the center of the plot range. You might use this palette to indicate the location where a function or data set changes sign—while still indicating total magnitude through the smooth color change on either side. Notice that the palette doesn’t fade to white (or gray) away from the center, but always retains some tint (panel L in figure D.4):
set palette defined ( -1 '#aaaaff', 0 'blue', 0 'red', 1 '#ffaaaa' )
Here’s a red/green palette with a sharp transition. In contrast to the blue/red palette you just saw, this one has the highest color intensity not in the center, but toward the edges. The fully saturated colors were chosen to be relatively mild and to have comparable lightness (panel M in figure D.4):
set palette defined (0 0 0.75 0, 0.2 0.2 0.8 0.2, 0.5 0.75 0.95 0.75, 0.5 1 0.66 0.66, 0.8 0.9 0.25 0.25, 1 0.9 0.1 0.1 )
Here’s a more complicated take on the same idea (panel N in figure D.4). This palette also exhibits red and blue color gradients away from the middle, but in addition it sports a rapid sequence of color changes right at the center point, thus very effectively highlighting the transition:[6]
[6] For a similar palette, see “Curvature,” cpt-city, http://mng.bz/QJ7x.
set palette defined ( 0 "dark-blue", 0.45 "blue", 0.495 "web-blue", 0.4995 "cyan", 0.5 "white", 0.5005 "yellow", 0.505 "sienna1", 0.55 "red", 1 "dark-red" )
A geo scale is yet another type of palette that offers a sense of ordering, using the kind of color sequence familiar from topographic maps (although in this case the sense of ordering is more conventional than inherent). Such palettes seem to work best when all the colors are particularly pale. Here’s an example (panel O in figure D.4):
set palette defined ( 0 0.7 0.85 0.6, 0.3 1 1 0.75, 0.7 0.75 0.6 0.4, 1 1 1 1 )
Geo palettes may include a sharp transition to indicate the boundary between sea and land. Here’s a rather complex example[7] (panel P in figure D.4):
[7] This palette was inspired by the standard palette used by maps on Wikipedia. See “Wikipedia elevation schemes,” cpt-city, http://mng.bz/6lBR.
set pal defined ( 0 0.45 0.66 0.85, 0.25 0.725 0.9 1, 0.33 0.85 0.95 1, \
                  0.33 0.68 0.82 0.65, 0.4 0.58 0.75 0.53, 0.575 0.95 0.9 0.75, \
                  0.8 0.66 0.52 0.32, 0.85 0.66 0.6 0.5, 1 0.95 0.95 0.95 )
Yet another idea is to create a palette from well-separated color bands (possibly on a smoothly varying background). Here’s an example[8] (panel Q in figure D.4):
[8] This is an improvement on the version given in the first edition. I’ve since learned that you need to use relatively wide color bands that don’t have sharp edges, but that fade away gently.
hue(x) = word( "0.75 0.66 0.5 0.33 0.16 0.1 0.0 0.83", 1+int(8*x) )
sat(x) = cos(2*pi*(x*7.4-0.2))
set palette model HSV functions hue(gray), sat(gray)>0?sat(gray):0, 1
If you want to superimpose the colored bands on a smoothly changing grayscale, you can replace the 1 in the last entry (for the value component) with an expression like sat(gray)>-0.5?1:gray. Personally, I find that the background clutters up the result too much.
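Written out in full, the grayscale-background variant might look like the following sketch. It repeats the hue and sat definitions from the banded palette unchanged and only swaps the third (value) component for the expression just mentioned.

```gnuplot
# Helper functions, repeated from the banded palette.
hue(x) = word( "0.75 0.66 0.5 0.33 0.16 0.1 0.0 0.83", 1+int(8*x) )
sat(x) = cos(2*pi*(x*7.4-0.2))

# Inside and near a color band the value stays at 1; between bands it
# follows gray, which produces the grayscale background.
set palette model HSV functions hue(gray), sat(gray)>0?sat(gray):0, sat(gray)>-0.5?1:gray

# Inspect the result.
test palette
```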
It isn’t at all obvious what makes a good palette. Here are some considerations and recommendations:
Finally, keep in mind that it all depends on the specific task at hand and on the purpose of your visualization. Consider a map: it may either be colored continuously (topographic map) or use discrete colors (political map). In the discrete case, the colors may be unsorted and chosen for contrast (to indicate different countries) or sorted (to indicate a sorted quantity, such as the average income per head for each country). Continuous quantities are always sorted, and therefore smooth palettes should always be constructed to convey a clear sense of ordering.
The colorbox shows the mapping of numeric values to colors. It fulfills a role similar to that of the key (see section 7.4), but for colored plots. You encountered examples in figures D.1 and D.2; further examples can be found in figures D.5 and D.6, in section D.4. In all cases, the colorbox appears to the right of the main figure.
The colorbox is only visible when you use palette-mapped colors. Its appearance is controlled through the set colorbox options:
set colorbox [ vertical | horizontal ]
             [ noborder | bdefault | border {idx:linestyle} ]
             [ default | user [ origin {pos:orig} ] [ size {pos:size} ] ]
             [ front | back ]
The colorbox can be oriented either horizontally or vertically (vertical is the default). It’s usually surrounded by a border, which you can turn off using noborder. Alternatively, a predefined line style can be selected as an integer argument to the border keyword. You can’t use explicit line options with set colorbox.
The standard size and position for the colorbox can be chosen using the default keyword. Alternatively, the keyword user selects customized sizing and placing of the colorbox, indicated through the appropriate optional arguments. For three-dimensional plots, the only permitted coordinate system is the screen system (see section 7.2); but for two-dimensional plots (including set view map plots), all coordinate systems can be used.
The colorbox can be suppressed using unset colorbox. Keep in mind that without the colorbox, the viewer can’t extract any quantitative information from your plot, because there’s no obvious mapping from colors to numbers and vice versa. Don’t do it unless you’re sure the graph is meaningful even without the colorbox.
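As an illustration of the user option, here is a sketch that places a horizontal colorbox below a map-view plot. The origin and size values are guesses that will likely need tuning for your terminal; without a coordinate-system prefix, I am assuming they’re read as screen coordinates. The plotted function is an arbitrary example.

```gnuplot
set view map; set size square
set isosamples 100
set palette defined ( 0 "blue", 0.5 "white", 1 "red" )

# Horizontal colorbox near the bottom edge of the canvas.
set colorbox horizontal user origin 0.15, 0.06 size 0.7, 0.03

splot [-2:2][-2:2] exp(-(x**2 + y**2))*sin(y) with pm3d

# Return to the standard placement afterward.
set colorbox default
```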
You can think of the colorbox as just a little plot within a larger one, and so it responds to all commands and options that manipulate plot ranges, axes, and tic marks. For the colorbox, there’s only a single axis, running from low values (and the corresponding colors) to high values (and colors). This axis is known as the cb (for colorbox) axis. Table D.1 lists all relevant options. None of these options change the way colors are distributed in the palette; they merely change the way numeric values are assigned to those colors.
Option name | Description | Section |
---|---|---|
set cbrange | Sets the range of numeric values covered by the colorbox, independently of the range of z values. | 8.2 |
set logscale cb | Distributes numeric values logarithmically across colors. | 3.4 |
set cblabel | Assigns a text label to the colorbox. This label is placed next to the colorbox. If the label is invisible, use set cblabel offset, because the default placement may put the label outside the canvas area. | 7.3.3 |
cbtics | Controls all aspects of tics for the colorbox (major tic marks). | 8.3 |
mcbtics | Controls minor tic marks for the colorbox. | 8.3.4 |
grid cbtics | Draws grid lines within the colorbox at the major colorbox tic positions. | 8.3.6 |
grid mcbtics | Draws a grid within the colorbox at the minor colorbox tic positions. | 8.3.6 |
cbdata time | Chooses time-series mode for the values in the colorbox. | 8.4 |
cbdtics | Uses weekdays as tic labels for numeric values. | 8.4.1 |
cbmtics | Uses months as tic labels for numeric values. | 8.4.1 |
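The following sketch combines several of the options from table D.1. The range, tic spacing, label text, and plotted function are arbitrary choices for illustration.

```gnuplot
set view map
set isosamples 100
set palette defined ( 0 "blue", 0.5 "white", 1 "red" )

# Map the palette to the interval [-1:1], with a major tic every 0.5
# and each major interval divided into 5 minor intervals.
set cbrange [-1:1]
set cbtics 0.5
set mcbtics 5
set cblabel "function value"

splot [-2:2][-2:2] exp(-(x**2 + y**2))*sin(y) with pm3d
```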
As I pointed out in section D.2.2, the positions assigned to the various nodes making up a palette are merely relative positions within the palette. In particular, they don’t necessarily correspond to the values that are being plotted. Hence there’s a separate step, wherein the range of the palette is mapped to the range of data values.
The relevant command in this context is set cbrange. Its syntax is equivalent to the familiar set xrange or set yrange command (see section 8.2). The set cbrange command assigns a specific numeric value to the minimum and the maximum position in the palette; intermediate values are distributed uniformly into the interval. To make this concrete, let’s say that you’ve defined a palette using
set palette defined ( 0 "blue", 0.5 "white", 1 "red" )
If the data that you want to plot now spans the range from -5 to +5, then you want to set cbrange to this interval:
set cbrange [-5:5]
All of this is extremely straightforward. Confusion can arise through three items in particular:
A common problem when using palettes is the occurrence of outliers in the data. (You’ll see a worked example in section D.5.1.) Imagine a data set where all data points take on values between 0 and 1 except for a handful of records that have a value of 10. Gnuplot’s default setting of set cbrange [*:*] means the palette will be autoscaled to the interval [0:10]. Most of the data points (and all the data points that you care about) are therefore mapped to the bottom tenth of the palette, with the consequence that they’re all drawn in more or less the same color. That’s not the intended effect.
The remedy is simple: fix cbrange to the relevant range of values using set cbrange [0:1]. Now, the “good” data points are mapped to the full range of the palette, and the few outliers are mapped to the color at the top end of the palette.
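To see the effect for yourself, here's a small sketch that you can paste into a gnuplot session. The inline data block and its values are made up for illustration: four points between 0 and 1 and a single outlier at 10.

```gnuplot
# Hypothetical data: x, y, value (the last record is the outlier)
$data << EOD
1 1 0.1
2 1 0.4
3 1 0.7
4 1 1.0
5 1 10
EOD

set palette defined ( 0 "blue", 0.5 "white", 1 "red" )
set cbrange [0:1]      # fix the range instead of relying on autoscaling
unset key
plot [0:6][0:2] $data u 1:2:3 w p pt 7 ps 3 lc palette
```

Comment out the set cbrange line and replot to compare against the autoscaled result, where the four "good" points are squeezed into the bottom tenth of the palette.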
It may be desirable to indicate data points that exceed the specified cbrange using a special color. This can be done—see section D.5.1 for an explanation.
Gnuplot’s default of autoscaling cbrange means a single outlier can distort the mapping of colors to values. Choose an explicit cbrange to make sure the interesting data points span the entire range of the palette. Gnuplot handles outliers that exceed the cbrange reasonably.
Finally, you can use logarithmic scaling with the colorbox. Setting set logscale cb distributes tic marks (that is, numeric values) logarithmically across the color spectrum. This command doesn’t affect the way the palette is constructed from the nodes; it only affects the way the colors are mapped to data values.
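The following sketch illustrates the difference. It assumes a cbrange of [1:1000], chosen purely for illustration, and uses the "+" pseudofile to generate sample points whose x value doubles as the color value.

```gnuplot
set palette defined ( 0 "blue", 0.5 "white", 1 "red" )
set logscale cb
set cbrange [1:1000]   # assumed range, for illustration only
set samples 50
unset key
# Column 1 serves as both the x coordinate and the color value
plot [1:1000] "+" u 1:(1):1 w p pt 7 ps 2 lc palette
```

With logarithmic scaling, the white node in the middle of the palette corresponds to a value of about 31.6 (the geometric mean of 1 and 1000), not to the arithmetic midpoint near 500. The palette definition itself is unchanged.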
Palettes can be used with different plot styles. First, wherever gnuplot expects an explicit color specification (such as rgb "red" or rgb "#ff0000"), you can instead provide a color from a palette using the keyword frac followed by the relative position in the palette. See the following listing and figure D.5.
set palette defined ( 0 'web-green', 1 'goldenrod', 2 'red' )
set grid lt 1 lw 1.5 lc palette frac 0.5
set label 1 "{/:Bold Origin}" at 1.5,-0.15 center tc palette frac 0.9
set arrow 1 from 1.1,-0.15 to 0.125,-0.02 lw 2 lc palette frac 0.9
plot [-5:3] airy(x) lw 2 lc palette frac 0.1
More interestingly, palettes can be used for data-dependent coloring: the color of each point or line segment is found by mapping the values of an additional data column into the palette, using the keyword palette in the plot command. Figure 9.5 showed how to use this feature to create a curve that changes its color continuously. The next listing shows another application (compare figure D.6).
set palette defined ( 0 'blue', 1 'grey', 2 'red' )
plot "cloud" u 1:2:3 palette pt 7
The first two columns determine the location of each point, and the third column is used to find the point’s color by mapping the numeric value into the provided palette.
Because data-dependent coloring depends on the ability to add columns to the using directive, it’s only available when plotting data. If you want to use it when plotting functions, you must employ the "+" pseudofile.
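As a minimal sketch of this workaround, the following commands color the Airy function by its own value. The choice of function, range, and number of samples is arbitrary.

```gnuplot
set palette defined ( 0 'blue', 1 'grey', 2 'red' )
set samples 200
unset key
# "+" supplies the sampled x values as column 1;
# the third entry in the using directive is the value mapped into the palette
plot [-5:3] "+" u 1:(airy($1)):(airy($1)) w l lw 2 lc palette
```

Any other expression can be used in the third position, for example ($1) to color the curve by its x coordinate instead.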
Furthermore, palettes can be used together with splot to create colored surface plots—that will be the topic of the following section. But the most exciting application of palettes is to use them in false-color plots that rely entirely on color to convey values in a data set. Because of their importance, they’re treated in their own section later in this appendix (section D.5).
Palettes were originally developed to color surface plots generated by splot (see appendix C). You saw an example at the beginning of this appendix (see figure D.1). The splot command uses palettes when the with pm3d style is given explicitly as part of the splot command, or when pm3d mode is enabled globally using set pm3d.
The set pm3d command controls various aspects of the way color is used:
set pm3d [ at [b|s|t] | map ]
         [ implicit | explicit ]
         [ border {lineoptions} | hidden3d {idx:linestyle} | nohidden3d ]
         [ interpolate {int:xsteps},{int:ysteps} ]
         [ corners2color [ mean|geomean|harmean|rms|median|min|max|c1|c2|c3|c4 ] ]
         [ scansautomatic | scansforward | scansbackward | depthorder ]
         [ flush [ begin | center | end ] ]
         [ [no]ftriangles ]
         [ clip1in | clip4in ]
In pm3d mode, gnuplot constructs a surface from colored, nontransparent polygons. Because the polygons are opaque, no explicit hidden-line removal is required—instead, surface areas closer to the observer hide surface areas further away. The resulting effect therefore depends on the order or direction in which the surface is drawn. Although gnuplot usually chooses a reasonable strategy for drawing surfaces, it helps to keep this point in mind when working in pm3d mode. (The gnuplot standard reference documentation contains more information on the algorithm.)
A colored, opaque surface can be drawn at three positions: at the top of the plotting box, on the plotted surface, or at the bottom. The position is specified using the keyword at together with a combination of the letters b (bottom), s (surface), and t (top). Each letter can appear twice (for example, set pm3d at bsb: this is one instance where the way surfaces are drawn in pm3d mode is potentially relevant). You can use the keyword map as a shortcut for set view map; set pm3d at b.
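For instance, the following sketch draws the colored surface itself and, in addition, a flat colored projection at the bottom of the plotting box. The function and ranges are arbitrary choices for illustration.

```gnuplot
set pm3d at bs         # b: bottom of the box, s: the surface itself
set isosamples 40
unset key
splot [-2:2][-2:2] exp(-(x**2 + y**2)) w pm3d
```

Changing the argument to at st or at b is a quick way to see how the three positions differ.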
By default, the command set pm3d puts pm3d into implicit mode, meaning all surfaces drawn with splot are drawn using colored, nontransparent polygons. If you want to combine colored surfaces together with transparent, wire-mesh surfaces in a single graph, you need to choose explicit mode using set pm3d explicit. In explicit mode, you must specify pm3d as part of the splot command:
splot f(x,y) w l, g(x,y) w pm3d
This plots the function f(x,y) with a transparent wire mesh, but the function g(x,y) with a colored, opaque surface.
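A complete version of this example might look as follows. The two functions are hypothetical placeholders; substitute your own.

```gnuplot
# Hypothetical example functions
f(x,y) = x**2 - y**2
g(x,y) = -5 + sin(x*y)

set pm3d explicit      # only surfaces marked "w pm3d" are colored
set isosamples 30
splot [-2:2][-2:2] f(x,y) w l, g(x,y) w pm3d
```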
A colored surface can be drawn together with a wire mesh of the same surface using set pm3d hidden3d. This command takes as an additional, mandatory argument the index of a (previously defined) line style, which is used for the wire mesh. When using this plot mode, don’t forget to switch off the regular surface and hidden-line removal using unset surface; unset hidden3d. An alternative to hidden3d is the border keyword, which can be followed by any of the familiar line options (see the tip at the end of section 6.2.3) that should be used for drawing the wire mesh.
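Here's a brief sketch of the border variant. It assumes a gnuplot version in which set pm3d accepts the border keyword with line options, as described above; the function and the line properties are arbitrary.

```gnuplot
# Outline each colored surface element with a thin black line
set pm3d border lc rgb "black" lw 0.5
set isosamples 30
unset key
splot [-2:2][-2:2] exp(-(x**2 + y**2)) w pm3d
```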
As an alternative to set dgrid3d (see section C.5), pm3d has a similar interpolating capability, triggered by the keyword interpolate. It takes two mandatory arguments, giving the number of interpolation steps in both x and y directions.
Two sub-options to set pm3d control the way the surface is constructed. The corners2color keyword selects how the color of each polygon is determined from the z coordinates of its four corners: as mean, median, and so on, or by choosing the value from one of the corners directly. The scansforward and scansbackward directives control the direction in which the surface is constructed. The default is scansautomatic and usually doesn’t need to be changed.
Further sub-options apply to certain edge cases when the input data doesn’t fall on a regular grid. See the gnuplot standard reference documentation for details.
So far, we’ve assumed that the color of each surface element was chosen based on the elevation of that surface element, in order to enhance the height perception of the surface. But it’s also possible to map a different quantity than the height to the color spectrum. This is done with an additional parameter to the using directive to the splot command:
set pm3d at s
splot "data" u 1:2:3:4 w pm3d
Here, the first and second columns of the data file are taken as x and y coordinates, the value of the third column is used for the z coordinate (the height) of the surface, and the color is assigned according to the fourth column.
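If you don't have a suitable file at hand, you can try this out with an inline data block. The values in the following sketch are invented: the third column forms a small ridge, while the fourth column increases steadily and has nothing to do with the height.

```gnuplot
# Hypothetical 3x3 grid: x, y, height, color value
# (blank lines separate the scans, as in the grid format)
$data << EOD
0 0 0 1
0 1 1 2
0 2 0 3

1 0 1 4
1 1 2 5
1 2 1 6

2 0 0 7
2 1 1 8
2 2 0 9
EOD

set pm3d at s
unset key
splot $data u 1:2:3:4 w pm3d
```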
There are at least three different ways to use gnuplot palettes to create false-color plots:
All of these approaches have different strengths and weaknesses, which makes them suitable to different tasks. Table D.2 summarizes the most important differences.
with points palette | with pm3d | with image |
---|---|---|
Surface elements can be at arbitrary locations. | Surface elements must form a regular, but possibly distorted, grid. | Surface elements must form a strictly regular grid. |
Each surface element is drawn independently from all others. | Each surface element is determined by the data points at its four corners. | Each surface element is centered on a data point and extends halfway to the four neighboring data points. |
Surface elements can be at arbitrary locations. | Surface elements can be drawn on an elevated surface. | Surface elements must form a flat surface. |
Style is only available for data. | Style works with functions and data. | Style is only available for data. |
Imagine that you oversee a fleet of machines or servers, and you want to visualize the number of defects that have occurred for each device and for each day of operation. The situation is complicated by the fact that not all machines operate every day: depending on the daily load, machines are either active on a given day or not.
This problem has a natural grid-like structure: you care about the number of defects for each machine-day. But because not all machines are active on all days, the grid isn’t complete and has missing cells. One interesting question is this: do defects cluster in any way? The data format is simple enough: for each day that a machine was active, the data set contains a record with the day number, the machine ID, and the number of observed defects (see listing D.3).
# Day Machine Defects
1 1 2
1 3 0
1 4 1
1 5 2
...
Figure D.7 uses gnuplot’s with points style, together with a color palette, to visualize this data set. The selected point size is relatively large, because each point represents a cell in the data grid. Each record is first plotted as a filled (colored) square; then a thin, black empty frame is drawn around it purely for aesthetic reasons. (Try it with and without the frame!) Two clusters of sometimes troublesome machines are clearly visible. (The commands are in the following listing.)
unset key
set xlab "Day"; set ylab "Machine"; set cblab "Defects" offset 1,0
set palette defined (0 'web-green', 0.5 'goldenrod', 0.999 'red', 1 'black')
set cbrange [0:7]
plot [0:32][0:21] "machines" u 1:2:3 w p pt 5 ps 1.75 pal, "" u 1:2 w p pt 4 ps 1.75 lw 0.5 lc black
To get a good result with this data set, it’s necessary to specify cbrange explicitly. Although the number of defects for most of the data points is somewhere between 0 and 7, there is a handful of records that show catastrophic failure, with defect rates up to 50! If you relied on gnuplot’s autoscaling of the cbrange, then almost all the data points would be drawn in green. Fixing the palette using set cbrange [0:7] ensures that the interesting data points are distributed across the entire palette. The palette was designed for this scenario: its uppermost edge is black so that points exceeding the interval defined by cbrange stand out.
One last comment before leaving this section: this approach to creating false-color plots isn’t limited to the with points style, but works with most of the familiar styles. You may want to experiment with the with vectors and with circles styles (section 6.3.5). The with boxxyerrorbars style may come in handy to create rectangles of arbitrary size. For example, the following plot command is for all practical purposes equivalent to the one from listing D.4:
plot [0:32][0:21] "machines" u 1:2:(.35):(0.35):3 w boxxyerrorbars lc palette fs solid, "" u 1:2:(.35):(0.35) w boxxyerrorbars lw 0.5 lc black
Although pm3d mode has so far been discussed only in the context of colored surface plots, it can also be used for false-color plots by placing the viewpoint vertically above the surface. This is what the set pm3d map option does.
For data sets on a regular grid, the with image style is probably more convenient, but there are two instances where you still need with pm3d for false-color plots: when plotting functions and when the points fall on a grid that is locally distorted. Figure D.2 demonstrated two examples of plotting functions using with pm3d, so I don’t need to repeat this here, but I’ll show an example of a distorted grid. Listing D.5 is similar to the file in listing C.2 that was graphed in a surface plot in figure C.9, but with one difference: the positions of the points down the center line are all slightly shifted in an irregular fashion. The splot command can handle this:
set palette defined ( 0 'blue', 0.5 'grey', 1 'red' )
set pm3d map
splot "distorted-grid" u 1:2:3
# x y z
0 -1 10
0 0 10
0 1 10

1 -1 10
1.2 0.2 5
1 1 10

2 -1 10
1.9 -0.2 1
2 1 10

3 -1 10
2.8 0 0
3 1 10
As you look at figure D.8, I’d like to emphasize how the surface elements are spanned by their four corners, and that the color is a combination of the z values at those four corners. You may want to experiment with set pm3d corners2color to see what results you get.
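One way to run this experiment is sketched below. It reuses the palette and data file from the listing above and merely switches the rule for picking the color of each surface element; max is just one of the available choices.

```gnuplot
set palette defined ( 0 'blue', 0.5 'grey', 1 'red' )
# Color each element by the largest z value among its four corners
set pm3d map corners2color max
splot "distorted-grid" u 1:2:3 w pm3d
```

Replace max with min, mean, or c1 and replot to see how strongly the appearance depends on this setting when the grid is as coarse as it is here.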
The with image style is suitable for flat graphs on a regular, rectangular (or at least rhomboidal) grid. The style is very restrictive (the grid has to be strictly regular, with no missing values) but convenient if the necessary conditions are fulfilled. The style performs few internal consistency checks, and it’s relatively easy to get gnuplot (and oneself) confused by trying to use with image for anything other than what it was intended for: flat graphs on strictly regular grids!
The with image style works with all three data formats familiar from splot: grid, matrix, and packed matrix (see section C.4). When using the packed matrix format (which doesn’t include coordinate information), then all the formal requirements are automatically fulfilled—this is the sweet spot for the style. When using the general, non-uniform matrix format, be aware that gnuplot does not evaluate all the coordinates. It merely examines the coordinates of the bottom-left and top-right corners (the origin and its diagonal counterpart) and then chops the coordinate axes into cells of equal size. It isn’t possible to have a non-uniform grid with the with image style, even when using the non-uniform matrix format!
Even more confusion may result when using the grid format, which includes explicit coordinate information for all cells, because gnuplot ignores most of them and again constructs a regular, equally spaced array of cells. The with image style will fail with an error message if any of the cells are missing.
The with image style is strictly for data sets on a regular, rectangular grid of equally spaced cells. If your problem doesn’t fit that description, don’t use this style.
Figure D.9 shows a false-color plot of the original file from listing C.2 (that is, without distortions). Notice that the individual cells are centered on the location of the data points. Because of the large areas of solid color, I used a pastel version of the usual red/white/blue palette. Also, notice that the command is plot, not splot!
set palette def ( 0 '#8080ff', 0.5 '#ffffff', 1 '#ff8080' )
plot "grid" u 1:2:3 w image, "" u 1:2 w p pt 7 lc black
The with image style won’t tolerate missing values: every cell in the grid must exist and contain a value, even if it’s an invalid value (see section 4.3.4). If gnuplot detects an invalid value, with image doesn’t draw the corresponding surface element; the background remains visible (that’s usually what you want).
Furthermore, if your data file is in grid format, you may think it isn’t necessary for the file to be sorted: after all, the grid format contains the locations of all the surface elements explicitly, and hence gnuplot should be able to place each surface element individually. But remember that the with image style doesn’t process the coordinates: instead, it places surface elements in the order in which they’re found in the data file. The data file therefore must be sorted, by horizontal coordinates first and by vertical coordinates second.
If your file doesn’t fulfill these two requirements, then you need to write a short script in the programming language of your choice that reads in the raw data, populates a suitable grid, and then prints the grid, in order, while supplying NaN values for all cells that don’t contain data. Or, you can use the with points style to begin with.
In addition to the with image style described so far, gnuplot also provides the with rgbimage and with rgbalpha styles. Both are similar to the with image style, but they expect each data point to be described not by a single value, but by separate color components. The with rgbimage style expects three values per data point (for the red, green, and blue color components), and the with rgbalpha style expects four (with the addition of an alpha channel for transparency). All values are expected in the range from 0 to 255. See the standard gnuplot reference documentation for more details.
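As a tiny illustration of with rgbimage, the following sketch draws a 2-by-2 image from an inline data block. The coordinates and color values are made up; each record holds x, y, and the red, green, and blue components.

```gnuplot
# Hypothetical data: x, y, red, green, blue (components in 0..255)
$rgb << EOD
0 0 255 0 0
1 0 0 255 0
0 1 0 0 255
1 1 255 255 0
EOD

unset key
plot $rgb u 1:2:3:4:5 w rgbimage
```

Note that no palette is involved here: the colors are taken directly from the data.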
Figure D.10 was created using the with image style from a file containing the x and y coordinates for each pixel and a third value that’s mapped into a color palette. The figure shows a small section of the complex plane near the edge of the Mandelbrot set. The x and y coordinates correspond to the real and imaginary components of a complex number, and the color-coded quantity is the number of iteration steps before the Mandelbrot iteration diverged.[9] The data file was calculated separately.
[9] See the Wikipedia page on the Mandelbrot set for details of the calculation.
This example is interesting not only because the appearance of the Mandelbrot set continues to be fascinating, but also because it poses particular challenges for the design of a good palette. The range of values that needs to be represented through color spans three orders of magnitude (from 10 to 10,000). Moreover, the distribution of values is extremely uneven: all larger values are concentrated in a very narrow area near the border of the Mandelbrot set (the black area in figure D.10).
You can attempt to form a palette by distributing nodes non-uniformly, like so:
set palette defined ( 0 'cyan', 200 'blue', 300 'red', 600 'yellow', 2000 'white', 9999 'white', 10000 'black' )
The problem is that now the colorbox is almost useless: fully 80% of the area is white (from 2,000 to 10,000). It’s better to use logarithmic scales. Remember: set logscale cb doesn’t change the way the palette is constructed from the nodes, but only changes the way numerical values are mapped to colors (that is, how tic marks are distributed across the palette). In order to assign nodes to specific numerical values despite the logarithmic scale, the following listing uses the logarithms of the desired values in the definition of the palette and fixes cbrange explicitly.
set size ratio -1
unset key
set palette defined ( log(15) 'cyan', log(100) 'blue', log(200) 'red', log(500) 'yellow', log(2000) 'white', log(9995) 'white', log(10000) 'black')
set logscale cb
set cbrange [15:10000]
plot "fractal" u 1:2:3 w p pt 5 ps .05 lc pal z
One last question concerns the style to use with the plot command. This example may look like an ideal application of the with image style, because it involves a regular grid without missing cells. But the implementation of with image contains certain optimizations that lead to a loss of accuracy for a figure containing as much fine-grained detail as the Mandelbrot set. I therefore used the with points style, but with a very small point size.
For palettes that are defined through nodes, two things need to be fixed:
In section D.2.5, I gave many examples of colors and color combinations. In this section, I’ll demonstrate a technique you can use to experiment with the position of each node.[10]
[10] I’d like to thank Ethan Merritt for making me aware of this technique.
In section D.2.5, the nodes were distributed more or less uniformly across the spectrum. The need to place the nodes more carefully arises when there’s more than one strong visual gradient in the palette and you want these gradients to coincide with regions of significant change in the data set. Depending on the placement of the nodes, a false-color plot of the same data set can appear quite different—even when the colors for all the nodes remain unchanged. Wouldn’t it be lovely if you could move the nodes back and forth in the palette and have gnuplot simultaneously redraw the graph you’re working on? The following listing shows a tool that does just that.
This script is intentionally simple: it’s intended for palettes of only three nodes. But the principles carry over directly to more complicated situations. You invoke the script using the call command and giving the three colors that you would like to use in your palette:
call "palette-explorer.gp" "blue" "white" "red"
The script creates a palette using the supplied colors and also defines a parameter s that controls the relative position of the central node within the palette. Furthermore, the script creates two key bindings to the two function keys F1 and F2 (the choice of keys is obviously arbitrary; see section 12.4.3 for more information on the bind command). If the plot window has keyboard focus, then with each key press, the parameter s and therefore the position of the middle color are moved by the amount ds, either up or down. Then the palette is re-created, and the plot is redrawn—using the new palette—via replot. The effect needs to be seen to be believed. Try it out with a graph like figure D.6 or figure D.7.
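A script along these lines might be sketched as follows. This is an illustration built from the description above, not a definitive implementation: the initial value of s, the step size ds, and the clamping that keeps s strictly between 0 and 1 are my assumptions, and the sketch presumes a gnuplot version that passes call arguments as ARG1, ARG2, and so on.

```gnuplot
# palette-explorer.gp (sketch): call with three color names
c1 = ARG1; c2 = ARG2; c3 = ARG3

s  = 0.5      # relative position of the middle node (assumed start value)
ds = 0.05     # step per key press (assumed)

set palette defined ( 0 c1, s c2, 1 c3 )

# F1 moves the middle node down, F2 moves it up; s stays inside (0,1)
# so that the node positions remain sorted
bind "F1" "s = (s-ds > 0 ? s-ds : s); set palette defined ( 0 c1, s c2, 1 c3 ); replot"
bind "F2" "s = (s+ds < 1 ? s+ds : s); set palette defined ( 0 c1, s c2, 1 c3 ); replot"
```

Create a plot that uses the palette first, then call the script; the replot command in each binding needs an existing plot to redraw.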
As I said, this script is intentionally simple and is meant to demonstrate the basic principle most clearly. Three generalizations suggest themselves:
Have fun!
It’s difficult to find information on color selection that is based on empirical evidence; many recommendations aren’t much more than personal opinion. At the same time, it’s interesting to observe that there has been real (albeit slow) progress over time. Here are some suggestions:
Several websites make palettes available for download: