Chapter 6. A catalog of styles

This chapter covers

  • Directives for choosing styles
  • Gallery of plotting styles
  • Pointers to additional styles

The next few chapters describe the different ways you can control the appearance of a plot: how to make it look just right. In this chapter we’ll discuss the various ways to display data, and in the next chapter we’ll talk about all the other stuff that goes onto a plot, such as labels, borders, arrows, and similar decorations. Because axes and their labels can provide so much relevant information about a plot, they’re given their own chapter (chapter 8). And finally, in chapter 9, we’ll discuss how you can customize the appearance of all plot elements, in case you don’t like the defaults.

Most of this chapter is taken up by a comprehensive gallery of available plotting styles. Feel free to treat it as a catalog: look at the pictures to see whether there’s anything you like and might want to use right away. As your needs change, you can always come back and check out what other styles are available.

In any case, be sure to read section 6.2, because it explains the mechanics of how to choose plot styles in gnuplot. The remainder of the chapter is optional (at this point).

6.1. Why use different plot styles?

Different types of data call for different display styles. For instance, it makes sense to plot a smooth function with one continuous line but use separate symbols for a sparse data set in which each individual point counts. Experimental data often requires error bars together with the data, whereas counts of discrete events call for histograms. Choosing an appropriate style for the data leads to graphs that are both informative and aesthetically pleasing.

But the choice of plot style isn’t just an aesthetic issue: different graphical representations give a data set context and may even imply specific semantics:

  • Continuous lines indicate a function or a dense data set with little noise.
  • Explicit point symbols emphasize the discrete nature of a sparse data set and make individual data points stand out.
  • Boxes are mostly used for histograms and so imply counts of discrete events.
  • Error bars make the uncertainty in a data set explicit.

Tip

Remember: the style you choose determines not only what people are seeing, but also how they will interpret it!

6.2. Styles and aspects

Gnuplot can create graphs using different overall styles: you’ve already encountered with lines, with points, and with linespoints. In this chapter, you’ll learn about additional styles for drawing boxes, error bars, circles and arrows, and so on. The style determines the overall appearance of the plot. Styles are usually specified using the familiar with keyword.

All graphs are constructed from lines and point symbols. In addition to the overall choice of style, you can also customize many aspects of lines and points: things like the width, color, and dash pattern of lines; and the shape, size, and color of points. These aspects can be set globally or locally for each data set individually.

Unless you specify aspects explicitly, gnuplot will choose a different set of aspects for each new data set you include in a graph. For example, if you create a graph from four data sets, by default each data set will be drawn using a different color (or dash pattern) and point shape. You can override this default sequence locally (as part of the plot command) or redefine the overall sequence globally.

In this section, we’ll first review how to use inline customizations to modify the appearance of graph elements as part of the plot command. Then I’ll give you an overview of the aspects that can be changed this way.

6.2.1. Choosing styles inline through with

Generally, plot styles are chosen inline, as part of the plot command. (There are also ways to set global style preferences—we’ll come back to this in chapter 9.) You already saw inline styles in chapter 2; by giving the with keyword as part of the plot command, you can specify which style to use:

plot "data" u 1:2 with lines, "" u 1:3 with linespoints, \
   "" u 1:4 with points

As usual, keywords can be abbreviated to the shortest unambiguous form, so it’s more convenient to write w l, w linesp (or even: w lp), and so on.
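Written with abbreviations, the previous command might look like this (a sketch that assumes the same file "data" with at least four columns):

```gnuplot
# Same plot as before, using abbreviated keywords:
# u = using, w = with, l = lines, lp = linespoints, p = points
plot "data" u 1:2 w l, "" u 1:3 w lp, "" u 1:4 w p
```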

6.2.2. The default sequence

In graphical analysis, you often want to plot several similar data sets, possibly together with functions, all on the same plot—usually so that they can be compared directly to one another:

plot [0:4][-0.25:1.5] "sequence" u 1:2 w lp, "" u 1:3 w lp, "" u 1:4 w lp, \
                    -log(x)+(x-1) w l

All data sets are plotted using the same overall style (with linespoints), but gnuplot makes sure to change the color (or dash pattern) of the lines used. Gnuplot also uses a different point symbol for each data set (see figure 6.1).

Figure 6.1. Gnuplot automatically chooses a different line style for each curve.

By default, lines and points are chosen automatically from a sequence of available patterns. If there are more data sets in the plot than there are patterns in the sequence, the selection starts again at the beginning.

You can override the automatic selection by explicitly specifying the index of one particular element from the default sequence. For example, if in the previous example you wanted to draw the second data set in the same style as the first, you could say

plot [0:4][-0.25:1.5] "sequence" u 1:2 w lp, "" u 1:3 w lp lt 1, \
                    "" u 1:4 w lp, -log(x)+(x-1) w l

Here, lt is short for linetype (which refers to elements of the default sequence), and lt 1 asks for the first element from the sequence. Fixing a specific line like this doesn’t affect the way the internal counter is incremented, so the ways the fourth column of the data and the function are displayed don’t change.

So far, I’ve only shown you how to select a specific entry from the default sequence. Instead of making such a local change, you can also customize the default sequence globally—or even use different default sequences for different purposes. We’ll discuss how to do that in chapters 9 and 12, but it’s pretty advanced stuff. Most of the time, you’ll find that gnuplot cycles through the default sequence in a way that just works.

6.2.3. Customizing graph elements

You can exercise detailed control over various aspects of graph elements. Table 6.1 lists all commonly used properties. Many are reasonably self-explanatory, but some may not be. In particular, color specifications and customized dash patterns require more detail—we’ll come back to them in chapter 9.

Table 6.1. Modifiers for the appearance of lines, points, and other graph elements

Keyword | Abbreviation | Description

linetype | lt | A line type is a combination of all aspects of both lines and points: line width, color, and dash type, as well as point type, point size, and point interval. If not instructed otherwise, gnuplot will cycle through a predefined sequence of line types for each new curve in a plot.
linestyle | ls | User-defined combination of appearance aspects. Similar to linetype but not eligible for automatic assignment to curves. (See section 9.4.)
linewidth | lw | Numerical multiplier, multiplying the default line width.
linecolor | lc | Color used to draw graphical elements (lines and points). (See section 9.1 for the syntax of color specifications.)
dashtype | dt | Pattern of dots and dashes used to draw the line. (See section 9.2.2 for the syntax of dash-pattern customizations.)
pointtype | pt | Shape or symbol used to display a data point. Chosen from a selection of available symbols. (See section 9.2.1.)
pointsize | ps | Numerical multiplier, multiplying the default point size.
pointinterval | pi | Controls whether only a subset of records is displayed with an explicit symbol when using the linespoints style. (See section 6.3.1.)
textcolor | tc | Color used for textual labels. (See section 9.1 for the syntax of color specifications.)
fillstyle | fs | Closed shapes (boxes, circles, ellipses, candlesticks) can be filled with a color or a pattern. The keyword solid, followed by a density between 0.0 and 1.0, fills the shape with the current color at the specified density; pattern uses the current pattern from the range of available patterns; and empty leaves the shape unfilled. (See section 9.4.4 for more information on fill styles.)
fillcolor | fc | Color used to fill closed shapes. (See section 9.1 for the syntax of color specifications.)
border | bo | Describes the border used for filled areas. Expects any combination of line options as argument. (See the tip at the end of this section for the lineoptions shorthand.)
arrowstyle | as | Selects a separately defined arrow style when using with vectors. (See section 7.3.2 for more information on arrows and section 9.4.3 for information on arrow styles.)

There are several contexts in which the properties from table 6.1 can be specified:

  • Some properties can be set globally. For example, you can use the set pointsize option to adjust the size of point symbols globally. Other properties can be configured globally using set style (see chapter 9).
  • Several terminals allow you to modify the line widths and point size used for the selected terminal (see chapter 10).
  • By far the most common way to specify appearance aspects is inline as part of the plot command. It’s straightforward to do this, as I’ll show next.

Inline customizations

To change any appearance property, you tack the desired specifiers onto the function or data set in the plot command:
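For a single curve, this might look as follows (a minimal sketch; the file name "data" and the chosen line type are assumptions for illustration):

```gnuplot
# Draw the data with lines, using line type 3 from the default sequence
plot "data" u 1:2 w l lt 3
```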

If there are multiple curves in a plot, attach the specifiers to the desired curve:

plot "data" u 1:2 w lp lt 3, sin(x) w l lt 1

This plots the data set using the properties of line type 3 and the sine function with the properties of line type 1.

You can include more than one modifier (but gnuplot will prevent you from including the same modifier multiple times):
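A command with several modifiers attached to one curve could look like this (a sketch; the particular values are arbitrary and chosen only for illustration):

```gnuplot
# Several modifiers on one curve: doubled line width, point type 7,
# and points 1.5 times the default size
plot "data" u 1:2 w lp lw 2 pt 7 ps 1.5
```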

Line types and line styles are special: they’re containers for all available properties. Each line type (or line style) indicates a line width, color, and dash pattern, as well as a point shape and size. By specifying either a line type or a line style, you choose values for all properties at once. Gnuplot allows you to override individual aspects, and doing so is common:
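A command of the kind discussed in the next paragraph could look like this (a reconstruction from that description; the file name "data" and the column selection are assumptions):

```gnuplot
# Take color and point size from line type 4, but override the point shape
plot "data" u 1:2 w p lt 4 pt 3
```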

As you can see, if a line type (or line style) is given, the curve inherits the applicable properties, unless they’re explicitly overridden. In the previous example, the data set was plotted with points (that is, without connecting lines), so the graph inherited point shape, size, and color from the indicated lt 4. But because the point type was selected explicitly, the resulting graph used color and point size from lt 4 but the point shape corresponding to pt 3.

Shorthand: line options

Most styles and decorations introduced in this and the next few chapters are constructed from lines, and the appearance of these lines is usually subject to the appropriate modifiers from table 6.1. In particular, linetype, linestyle, linewidth, linecolor, and dashtype are generally recognized.

To condense the presentation, I won’t list these modifiers individually going forward but will instead refer to them as line options. In command summaries, the string lineoptions is henceforth used as a placeholder for any combination of the aforementioned explicit appearance modifiers. Of course, lineoptions is not a legal gnuplot keyword—it’s just a notational shorthand.[1]

1

That being said, as a symbolic placeholder for all available line modifiers, the lineoptions shorthand is more robust and often more accurate than an explicit list that becomes incomplete whenever gnuplot’s selection of line modifiers grows. The standard gnuplot reference documentation contains many instances where explicit lists of line modifiers haven’t kept up with gnuplot’s increased capabilities and are now incomplete.

Tip

In command summaries, the placeholder lineoptions stands for any combination of the explicit line-appearance modifiers listed earlier. This is just a convenient shorthand; lineoptions is not a legal gnuplot keyword!

6.3. A catalog of plotting styles

Well over two dozen styles are available in gnuplot. Here we look at those most useful for ordinary, two-dimensional data. I’ll provide pointers to additional styles (for more complicated data sets) in section 6.5, toward the end of this chapter.

6.3.1. Core styles: lines and points

There are four styles I consider core styles, because they’re so generally useful: with points, with lines, with linespoints, and with dots. These styles represent data with simple symbols or lines on the plot (see figure 6.2).

Figure 6.2. The four core styles: with points, with lines, with linespoints, and with dots

Points

The points style plots a small symbol for each data point. The symbols aren’t connected to each other. This is the default style for data (see figure 6.2).

The size of the symbol can be changed globally using the set pointsize command. The parameter is a multiplier, defaulting to 1.0:

set pointsize {flt:mult}

You can also change pointsize (abbreviated ps) inline:

plot "data" u 1:2 w points pointsize 3

The point type (or point shape) can be selected using pointtype (or pt). Remember that ps stands for pointsize, not for point style (there is no such thing)! Figure 9.7 shows all commonly available point shapes.

In addition to the selection of available point types, you can also specify a single letter as argument to pointtype. The selection of symbols available depends, of course, on the fonts installed on your system. Modern fonts that provide glyphs for a large number of Unicode code points may offer some interesting possibilities (see figure 6.3)! Note that when you use a character as a symbol, you must change its color using textcolor instead of using linecolor. (See the sidebar in section 5.1 for information on Unicode and how to access characters from extended sets.)

Figure 6.3. Variants of the with linespoints style. Top to bottom: default behavior, using a (regular) character as symbol, omitting every second symbol, and using a character from an extended range of glyphs while omitting two out of every three points. Note that in the last case, the lines have been erased to make room for the characters by giving a negative argument to the pi keyword.

Lines

The lines style doesn’t plot individual data points, only straight lines connecting adjacent points. This is the default style for functions and the preferred style for dense data sets without too much noise.

Many aspects of lines, including their width, color, and dash pattern, can be customized. For instance, to double the width of a single line in a plot, you can use the following inline directive:

plot "data" u 1:2 w l linewidth 2

Linespoints

The linespoints (abbreviated lp) style is a combination of the previous two: each data point is marked with a symbol, and adjacent points are connected with straight lines. This style is mostly useful for sparse data sets.

What has been said about the with points and with lines styles generally applies to with linespoints: the default size of the symbol is controlled by the set pointsize option but can be overruled inline, as can the line type and the point type.

A property that is only available for linespoints is pointinterval (short: pi).[2] If this property is set to a positive integer n, then gnuplot plots a symbol only for every nth data point; all other points are drawn with lines only (see figure 6.3). This can be extremely useful for dense data sets.
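As a short illustration (the file name "data" and the interval of 5 are assumptions), a positive point interval can be given inline like any other modifier:

```gnuplot
# Connect all records with lines, but draw a symbol only at every fifth one
plot "data" u 1:2 w lp pi 5
```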

2

This feature is a relatively recent addition to gnuplot.

If pointinterval is negative, then it has the same effect as just explained, but in addition, a circular region centered at the location of the point is cleared, using the special bgnd color before the symbol is drawn (see figure 6.3). (The background color bgnd is a property of the terminal; use set terminal ... background ... to set it. See chapter 10.) The size of the cleared region is controlled using the set pointintervalbox option:

set pointintervalbox {flt:mult}

The value of pointintervalbox multiplies the point size to find the radius of the region to be cleared. All this is demonstrated in figure 6.3.
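The two settings might be combined roughly as follows (a sketch; the multiplier 2 and the interval -3 are arbitrary example values):

```gnuplot
# Size of the cleared region: a multiplier of 2 applied to the point size
set pointintervalbox 2
# Negative interval: a symbol at every third record, with the line
# erased around it before the symbol is drawn
plot "data" u 1:2 w lp pi -3
```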

Dots

The dots style prints a minimal dot (a single pixel for bitmap terminals) for each data point. This style is occasionally useful for very large, unsorted data sets (such as large scatter plots). Figure 1.2 was drawn using dots.

6.3.2. Indicating uncertainty: styles with error bars or ranges

Sometimes you don’t just want to show a single data point; you also want to indicate a range with it. This may be the experimental uncertainty (the error bar), or it may be the range over which some quantity has changed during the observation interval (this is typical of financial charts). Gnuplot offers several styles that place an indicator for such a range onto the plot. First we’ll look at styles that draw regular error bars (both in vertical and in horizontal directions). Then we’ll go on to discuss styles that let you indicate several ranges at once (but only in the vertical direction).

Styles with error bars

There are two basic styles to show data with error bars in gnuplot: errorbars and errorlines (see figure 6.4). The errorlines style is similar to the linespoints style (a symbol for each data point, adjacent points connected by straight lines), whereas the errorbars style is similar to the points style (unconnected symbols).

Figure 6.4. Different plot styles showing uncertainty in the data. From top to bottom: connected symbols using errorlines, unconnected symbols using errorbars, ranges indicated as boxes using boxxyerrorbars, and errors on top of a histogram using boxerrorbars.

These styles draw error bars in addition to the data. Error bars can be drawn in either the x or y direction or both. To select a direction, prefix the style with x, y, or xy, respectively, as in plot "data" with xyerrorlines. Table 6.2 summarizes all available combinations.

Table 6.2. All possible combinations of errorbars and errorlines styles
 

 | Error bars in x direction | Error bars in y direction | Error bars in both directions

Unconnected symbols | xerrorbars | yerrorbars | xyerrorbars
Connected symbols | xerrorlines | yerrorlines | xyerrorlines

The appearance of both the errorlines and errorbars styles is determined by the current line style. Both styles draw a symbol at the location of the data point—which is unnecessary, because the intersection of the error bars already indicates this position! You can suppress the symbol by specifying pointtype 0, like so: plot "data" u 1:2:3 with yerrorlines pt 0.

The error bars themselves are drawn in the current line style. A tic mark is placed at the ends of each error bar (see figure 6.4). You can control the size of the tic mark using the set bars option:

set bars [ small | large | fullwidth | {flt:mult} ]

The parameter is a multiplier, defaulting to 1.0. The symbolic names small and large stand for the values 0.0 and 1.0, respectively. The value fullwidth is only relevant to histogram styles (more on that in a minute). Finally, you can turn off the tic marks using unset bars.
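The variants described above can be summarized in a few lines (the value 2.5 is an arbitrary example):

```gnuplot
set bars small      # symbolic name for the multiplier 0.0
set bars large      # symbolic name for the multiplier 1.0 (the default)
set bars 2.5        # tic marks 2.5 times the default size
unset bars          # turn the tic marks off
```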

Error bars require additional information. Just the x and y coordinates aren’t enough: you must also provide data about the size of the uncertainties. Usually, this data comes from the data file, in the form of one or two additional columns. If one additional column is given, it’s interpreted as a range dy to be added to and subtracted from the corresponding data value, so the error bar is drawn from (x, y-dy) to (x, y+dy). If two additional columns are supplied, they’re interpreted as the absolute coordinates of the lower and upper ends of the error bar (not the ranges), so error bars are drawn from (x, ylow) to (x, yhigh). Corresponding logic applies to error bars drawn in the x direction.

As usual, the columns to use are indicated with the using directive to plot:
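The following commands sketch the common cases (the file name "data" and the column layout noted in each comment are assumptions about the input file):

```gnuplot
# One extra column: symmetric error dy around each y value
plot "data" u 1:2:3 w yerrorbars

# Two extra columns: absolute lower and upper ends (ylow, yhigh)
plot "data" u 1:2:3:4 w yerrorbars

# Errors in both directions, given as ranges: x, y, dx, dy
plot "data" u 1:2:3:4 w xyerrorbars
```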

Data transformations (see section 3.3) are often useful in this context. Here are some examples:

  • If the input file contains only the variance (instead of the standard deviation, which is usually plotted as error) together with the data, you can apply the necessary square root inline: plot "data" u 1:2:(sqrt($3)) w yerrorb.
  • If you know that the uncertainty in the data is a fixed number (such as 0.1), you can supply it directly: plot "data" u 1:2:(0.1) w yerrorl.
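  • If the data supplied in the file is of the unsupported form (x, y, ylow, yhigh, dx), you can build up the required plot command manually: plot "data" u 1:2:($1-$5):($1+$5):3:4 w xyerrorl. Here the third and fourth entries are the absolute ends of the horizontal error bar, computed from the x value and dx, followed by the absolute ends of the vertical error bar.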

As a final style to visualize data with uncertainty in both directions, there’s boxxyerrorbars. It’s similar to the xyerrorbars style, except that the range of uncertainty is shown as a rectangular box centered at the data point, rather than as a cross of error bars. (If you prefer to use circles or ellipses instead of boxes to indicate a confidence range, you may want to peek ahead at section 6.3.5.)

The last style that uses error bars is boxerrorbars (not to be confused with boxxyerrorbars), which is a combination of the boxes and yerrorbars styles. It’s displayed as a box with a vertical error bar centered at its top. It might be used, for instance, for histograms that have some uncertainty in their counting statistics. The additional values required for the error bar are supplied as the third (or third and fourth) arguments to the using directive. The box width is provided as the last argument to using.
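The two box-based styles might be invoked like this (a sketch; the file name "data" and the column layout in the comments are assumptions):

```gnuplot
# Boxes with an error bar on top: x, y, dy, box width
plot "data" u 1:2:3:4 w boxerrorbars

# Rectangles spanning the uncertainty in both directions: x, y, dx, dy
plot "data" u 1:2:3:4 w boxxyerrorbars
```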

The styles discussed in this section are mostly used to plot data stemming from scientific experiments or calculations, where you want to show the uncertainty in the data clearly. But there are other situations where you may want to indicate a range (or even several ranges) together with the data. Those are the topic of the next section.

Time-series styles

Gnuplot offers two styles that are mostly useful for time-series data, although they can be used for other purposes as well: candlesticks (also known as a bar-and-whiskers plot) and financebars (see figure 6.5). Both can show two ranges in a single (vertical) direction: for instance, the typical band of variation and the highest and lowest values ever. Both are frequently used for financial data (such as stock prices), and I’ll discuss them in those terms. The candlesticks style in particular is quite versatile and can be used to good effect in a variety of situations.

Figure 6.5. Styles for time series: financebars and candlesticks. Filled bars indicate that the closing value is less than the opening value for the current record.

Both styles require five columns of data: the x value followed (in order) by the opening, low, high, and closing prices. As usual, the appropriate columns are selected with the using directive to the plot command. Additional columns may optionally be specified to indicate the box width or the color.

Both styles represent the maximum range (low to high) with a vertical line. They differ in the way the secondary (opening to closing) range is displayed: in the candlesticks style, a box of finite width is overlaid on the vertical line; in the financebars style, tic marks indicate the opening and closing values. The size of the tic marks is controlled by the set bars option familiar from errorbars styles (see figure 6.5).

You can control the details of the candlesticks style using some additional options. If the current fill style is empty and the closing value is less than the opening one, the box is filled with the current color; otherwise, the box is left empty. (If you choose a particular fill style using set style fill, then this style is used to fill the box, independent of the opening and closing values. See section 9.4.4 for more information on set style fill.)

You can change the width of the box either by using the set boxwidth option or by supplying a value in a sixth column. If the width is read from a column, it’s assumed to be given in the same units as the x coordinate. (If boxwidth is unset, the value of set bars is used instead, but this usage is deprecated for candlesticks. For financebars, the set bars option is the only way to change the size of the opening and closing markers.)

To place tic marks at the ends of the vertical line, append the keyword whiskerbars (or whisker) to the plot command. You can control the size of these tic marks independently from the box width by appending a numerical value to the whiskerbars keyword. This value is interpreted as a multiplier giving the length of the tic mark relative to the box width.

A few examples will make this clearer:
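The commands below are a sketch of typical usage; the file name "prices" and its column layout (x, open, low, high, close) are assumptions for illustration:

```gnuplot
# Columns: x, open, low, high, close
plot "prices" u 1:2:3:4:5 w financebars

# Candlesticks with a box half a unit wide (in x axis units)
set boxwidth 0.5 absolute
plot "prices" u 1:2:3:4:5 w candlesticks

# As before, with tic marks on the vertical line that are
# half as long as the box is wide
plot "prices" u 1:2:3:4:5 w candlesticks whiskerbars 0.5
```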

Neither the financebars style nor the candlesticks style connects consecutive entries. If that’s what you want, you’ll have to do so explicitly. Keep in mind that it’s not clear what should be connected in these styles, because they don’t have a concept of a “middle” value. You must supply this information separately.
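One possible choice, and it is only an assumption about what the "middle" should mean, is the average of the opening and closing values. With the hypothetical file "prices" (columns x, open, low, high, close), that could be sketched as:

```gnuplot
# Candlesticks plus a line through the midpoint of
# open (column 2) and close (column 5)
plot "prices" u 1:2:3:4:5 w candlesticks, \
     "" u 1:(($2+$5)/2) w l
```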

6.3.3. Styles with steps and boxes

Box styles, which draw a box of finite width, are sometimes useful for counting statistics or for other data sets where the x coordinates have only discrete values.

Steps

Gnuplot offers three styles to generate steplike graphs consisting only of vertical and horizontal lines (see figure 6.6). The only difference between the three styles is the location of the vertical step:

  • histeps centers each bin around the supplied x value.
  • steps places the vertical step at the end of the bin.
  • fsteps places the vertical step at the front of the bin.

Figure 6.6. The three step styles. The same data set is shown three times (vertically shifted). Individual data points are represented by symbols; the three steps styles are shown with continuous lines. Note how different the same data set can appear, depending on the exact location of the vertical steps.

If in doubt, the histeps style is probably the most useful.
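To see where the steps fall relative to the data, it can help to overlay the points on the step curve (a sketch; the file name "data" is an assumption):

```gnuplot
# Step function centered on each x value, with the raw data points overlaid
plot "data" u 1:2 w histeps, "" u 1:2 w p
```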

Boxes and impulses

In contrast to the step styles from the previous section, the boxes style plots a box centered at the given x coordinate from the x axis (not from the graph border) to the y coordinate (see figure 6.7). The width of the box can be set three ways:

  • Supplied as the third parameter to using.
  • Set globally through the set boxwidth option.
  • Otherwise, boxes are sized automatically to touch adjacent boxes.

Figure 6.7. Box and impulse styles. The widths of boxes can be set globally or for each box individually. The second data set uses a fixed width (enclosed in parentheses in the using directive); the third one reads values for variable box widths from file.

If you supply a third column in the using directive, it’s interpreted as the total width of the box in the same coordinates that are used for the x axis. The set boxwidth option has the following syntax:

set boxwidth [ {flt:size} ] [ absolute | relative ]

The size parameter can either be a measure of the absolute size of the box in x axis coordinates or denote a fraction of the default box size, which is the width of the box if it touches adjacent boxes. If relative mode isn’t stated explicitly, absolute sizing is assumed. You can use a boxwidth of -2 to force automatic sizing of boxes (with adjacent boxes touching each other). The impulses style is similar to the boxes style with boxwidth set to zero. The examples in figure 6.7 clarify this.
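The different ways of setting the box width might be tried out as follows (a sketch; the file name "data", its third column of widths, and the numerical values are assumptions):

```gnuplot
# Boxes at 80% of the automatic (touching) width
set boxwidth 0.8 relative
plot "data" u 1:2 w boxes

# Fixed width of 0.5 x axis units, given as a constant in the using directive
plot "data" u 1:2:(0.5) w boxes

# Individual widths read from the third column
plot "data" u 1:2:3 w boxes

# Back to automatic sizing, with impulses for comparison
set boxwidth -2
plot "data" u 1:2 w boxes, "" u 1:2 w impulses
```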

Boxes can be filled or shaded, according to the global set style fill option (see section 9.4.4) or the inline fillstyle specification. The fillsteps style is a filled equivalent of the steps style.

6.3.4. Filled styles

You can fill the area between two curves in two-dimensional plots with color or patterns using the filledcurves style. The appearance of the filled regions is determined by the settings of the fill style, which is controlled by the set style fill option (see section 9.4.4) or given inline.

Filling the area between two curves requires a data set with at least three columns: for the x coordinate and the y coordinates of the two curves. The two curves constitute the boundaries of the filled area. If the two lines cross each other, you can distinguish the enclosed areas depending on whether the first or the second line is greater than (that is, above) the other. By default, all enclosed areas are shaded, but you can restrict shading to only one of the two kinds of areas by appending the keyword above or below.

This style is only available when plotting data from a file, but you can “fake” it using the "+" pseudofile (see section 4.5). That’s what’s done in the following code snippet (see figure 6.8):

plot [0:3*pi][-1.1:2.1] "+" u 1:(sin($1)):(0.5*sin($1)) w filledc, \
                       "" u 1:(1+sin($1)):(1+0.5*sin($1)) w filledc above, \
                       "" u 1:(2+sin($1)):(2+0.5*sin($1)) w filledc below

Figure 6.8. Shading the area between two curves in a single data set: plot "data" u 1:2:3 w filledcurves. You can select only those areas where one of the curves is above or below the other curve. (See the text for details.)

All three versions are shown: the default (with all enclosed areas filled) and—vertically shifted—the same curves with only the areas above and below filled.

In addition to filling the areas between two explicitly given curves, gnuplot can also handle other boundaries:

  • Fill the area between one curve and one straight line (which may be one of the coordinate axes or a plot boundary).
  • Treat a single curve as a closed polygon, and fill its interior.
  • Specify an additional point that will be included in the construction of the polygon.

I suggest you consult the standard gnuplot reference documentation for the pertinent details.
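As a starting point, a few of these variants are sketched here (the file name "data" is an assumption; the reference documentation lists further keywords):

```gnuplot
# Fill between two curves stored in columns 2 and 3
plot "data" u 1:2:3 w filledcurves

# Fill between a single curve and the bottom x axis
plot "data" u 1:2 w filledcurves x1

# Treat the curve as a closed polygon and fill its interior
plot "data" u 1:2 w filledcurves closed
```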

6.3.5. Beyond lines and points: multivariate visualization

Several styles allow you to encode information by other means than the position on the graph: with labels reads a text string from the input file and places it on the plot. Using pointsize variable, you can change the symbol size according to the values in the data set. Finally, you can place arrows or entire geometric shapes on the plot using the with arrows, with circles, and with ellipses styles.

All these styles have in common that they consist of graphical elements that aren’t simple but that instead have internal degrees of freedom—that’s what enables them to encode additional information beyond their position on the graph. (We’ll come back to multivariate visualization in section 14.2.) At the same time, because they’re more than “lines and points,” the distinction between these styles and some of the decorations discussed in the next chapter is a bit fuzzy—in fact, these styles share many implementation details with the corresponding decorations. Frequently, what applies to one applies to the other, even if it doesn’t say so explicitly in the text!

Tip

The with labels, with arrows, with circles, and with ellipses styles share many properties with the set label, set arrow, and set object decorations.

Labels and point size

Let’s look at an example that demonstrates both the with labels style and the pointsize variable facility. Listing 6.1 shows a short data file containing the additional information required for labels and symbol sizes as additional columns. Given this file, you can generate the plot in figure 6.9 using the following commands:

plot [0:6][0:3.5] "labels" u 1:2:3 w p pt 6 ps var, \
                "" u ($1+0.25):($2-.25):4 w labels

Figure 6.9. Encoding additional information through symbol size or textual labels: pointsize variable and with labels. The corresponding data file is shown in listing 6.1.

Both styles require a third column as part of the using declaration, the contents of which are interpreted as labels or desired symbol sizes, respectively. Variable symbol sizes are most easily recognized if the symbols are circles, which are chosen by pointtype 6 (pt 6 for short). Then follows the pointsize variable (abbreviated ps var) specification. Labels are chosen using with labels, and all the labels are offset a little down and to the right so they don’t overlap with any of the circles.

Listing 6.1. Data for figures 6.9 and 6.10 (file: labels)
# x     y       size    label
1       2.6     3       ABC
2       2.1     6       EFG
3       1.0     2       PQR
4       1.2     1       UVW
5       1.6     4       XYZ

You should exercise some caution when using pointsize variable: it’s often difficult for the observer to judge the size of symbols accurately. Moreover, it’s not necessarily clear to the observer whether the radius or the area of the symbol is proportional to the encoded quantity. I’ll have more to say about visual perception in chapter 14.
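One way to remove at least the second ambiguity is to decide explicitly which quantity should be proportional to the data. The following is a minimal sketch, assuming you want the area of the symbol (rather than its radius) to follow the third column. Because pointsize scales the linear dimensions of a symbol, the square root of the value is passed instead of the value itself.

```gnuplot
# Sketch: make the symbol AREA, not its radius, proportional to column 3.
# Assumes the data file "labels" from listing 6.1 is in the current directory.
plot [0:6][0:3.5] "labels" u 1:2:(sqrt($3)) w p pt 6 ps var
```

Whichever convention you choose, it helps to state it in the caption, because the reader can't infer it from the graph.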

In contrast, the with labels style is frequently useful and very versatile. As mentioned earlier, many of the options of set label carry over to with labels; see section 7.3.3 for details. Here are only a few remarks:

  • You can rotate the label using rotate by.
  • You can change the text font and its size using font "...".
  • You can select the text color with textcolor (tc). Data-dependent text coloring is available through tc var (see section 9.1.5 for more on data-dependent color selection).
  • You can draw a point symbol using point and shift the text label relative to it using offset.
  • If the hypertext directive is included, the text label is usually hidden and is displayed only when you hover with the mouse near the point’s location. (Obviously, this is relevant only for interactive terminals.)
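As a small illustration of the last item, here is a sketch that attaches the labels from listing 6.1 as hypertext. It assumes that an interactive terminal with mouse support (such as wxt or qt) is already active; on other terminals the result may differ.

```gnuplot
# Sketch: the label text (column 4) appears only when the mouse hovers over a point.
# Assumes the data file "labels" from listing 6.1 and an interactive terminal.
plot [0:6][0:3.5] "labels" u 1:2:4 w labels hypertext point pt 7
```

The point keyword is needed here so that there is a visible anchor to hover over.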

Figure 6.10 shows some of these options in action—the data is again that shown in listing 6.1. The plot command is as follows:

plot [0:6][0:3.5] "labels" \
                 u 1:2:(sprintf("{/=%d %s}",5*column(3), strcol(4))):1 \
                 w labels rotate by 30 point pt 5 offset 1,-1 tc var
Figure 6.10. Further capabilities of the with labels style: plotting a point symbol together with the text label, and changing the color, size, and orientation of the label

Mind you, I’m not saying this is an example of a successful visualization! But it does demonstrate many of the concepts just introduced. (The variable text size uses enhanced text mode—I’ll formally introduce this concept in section 10.2.3.)

A final word of warning: in contrast to what’s customary elsewhere, the with labels style directive can’t be abbreviated. Neither with lab nor with label will work!

Shapes and arrows

Gnuplot defines three shapes that can be used as plotting symbols: arrows, circles, and ellipses. What’s different about these plot elements is that they have a nontrivial internal structure that can be used to encode information: the arrow has not only a position (as does any other point symbol) but also a direction and a length. An ellipse has a variable shape and orientation; and circles are interesting, because gnuplot lets you draw arc segments in addition to full circles. (As I pointed out earlier, these styles share many properties with set arrow and set object—see sections 7.3.2 and 7.3.4 for additional details.)

As an example, consider the short data file shown in listing 6.2. In listing 6.3, I use the first two columns as x and y coordinates and display the data in the third and fourth columns as size and angle, respectively. (Compare figure 6.11.)

Figure 6.11. The data from listing 6.2, plotted using with vectors, with circles, and with ellipses. See listing 6.3 for details of the plot command.

Listing 6.2. Data for figure 6.11 and listing 6.3 (file: shapes)
# x     y       size    angle
1       1       0.1     0
2       1.2     0.2     15
3       1.8     0.3     45
4       1.7     0.2     90
5       1.6     0.15    160
6       1.5     0.2     250
7       1.4     0.3     300

Arrows are drawn using the with vectors style. They require four columns in the using directive: x and y position, followed by the offset of the arrow’s tip from its starting position. You can control the way the arrow is drawn either by giving explicit style details inline (see section 7.3.2 to learn how to control an arrow’s appearance) or by referencing a separately defined arrow style (see section 9.4.3 for globally defined styles). With arrowstyle variable, you can also read the index of a predefined arrow style from an additional column.
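The following sketch shows how arrowstyle variable might be used. The two arrow styles and the inline data values are made up for illustration; only the fifth column, which selects the style, matters here.

```gnuplot
# Two hypothetical arrow styles; column 5 of the data refers to them by index.
set style arrow 1 head filled lc rgb "red"
set style arrow 2 nohead lc rgb "blue"

# Illustrative inline data: x  y  dx  dy  arrow-style index
$vec << EOD
1  1   0.5   0.5  1
2  1   0.5  -0.3  2
3  2  -0.4   0.4  1
EOD

# The fifth column is consumed by "arrowstyle variable".
plot [0:4][0:3] $vec u 1:2:3:4:5 w vectors arrowstyle variable
```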

Circles require only two columns, for the x and y positions. If you specify a third column, it’s used for the radius; otherwise the radius is taken from the global set style circle setting. (The radius is always measured in x coordinates.) The fourth and fifth columns, if present, measure the beginning and ending angle of the drawn arc segment (the “pie slice”). Both are measured in degrees, counterclockwise, against the positive x axis. The circle (or circle segment) can be filled using either an inline fillstyle specification or the global set style fill setting. Finally, the color of the entire shape may be read from an additional column, using linecolor variable. Keep in mind that the resulting shape is always a circle, independent of the selected y range. If you want to control the ratio of the horizontal to the vertical axis, you must use with ellipses.
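As a sketch of the fill and color options, the following command draws the data from listing 6.2 as semitransparent filled circles and reuses the first column as a color index. Reusing the x value for the color is an arbitrary choice made only to keep the example short.

```gnuplot
# Sketch, assuming the data file "shapes" from listing 6.2.
# Columns: x, y, radius, and a color index read through "lc var".
plot [0:8][0:3] "shapes" u 1:2:3:1 w circles lc var fs transparent solid 0.4
```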

Ellipses require four columns, for the x and y coordinates of the center and for the length of the major and minor diameters. You can use a fifth column to give the angle of the ellipse with the positive x axis. Fill and color properties are the same as for circles. (There are some subtleties about the units in which the diameters of an ellipse are measured. Check the standard gnuplot reference manual for details.)
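One of those subtleties can be sketched briefly. As far as I can tell from the reference manual, the units keyword selects the axis against which the diameters are measured; treat the exact behavior as something to verify against your gnuplot version.

```gnuplot
# Sketch, assuming the data file "shapes" from listing 6.2.
# With "units xx", both diameters are measured in x-axis units, so the
# proportions of each ellipse do not depend on the selected y range.
plot [0:8][0:3] "shapes" u 1:2:(0.5):($3):4 w ellipses units xx
```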

Let’s now discuss the commands used to create figure 6.11, as shown in listing 6.3. This listing is worth studying in detail, because it demonstrates how you can adapt plot styles to a given purpose through judicious application of inline transformations.

Because the various shape styles expect angles in degrees (not radians), you first need to make sure trigonometric functions will interpret their arguments accordingly by using the set angle option (for more information, see appendix F). The data set is then plotted several times, shifted vertically. Basic trigonometric relations are used to calculate the offsets of the arrow’s tip from the angle and the radius. In the first row of complete circles, the command ignores the angle argument in the fourth column. For both the incomplete circles and the ellipses, the y coordinate is ignored instead, and all shapes are plotted at the same vertical elevation.

Listing 6.3. Commands for figure 6.11 using data from listing 6.2 (file: shapes.gp)
set angle degrees

plot "shapes" u 1:($2+2):(2*$3*cos($4)):(2*$3*sin($4)) w vectors, \
   "" u 1:($2+1):($3) w circles, \
   "" u 1:(1):($3):(0):4 w circles, \
   "" u 1:(0):(0.5):($3):4 w ellipses

6.4. Putting it together

This section combines everything you’ve learned in this chapter to create figure 6.12. Imagine that you’ve been given a set of daily weather observations (the beginning of the file is shown in listing 6.4). It includes the day of the month followed by the temperature (in Celsius), the dew point,[3] the cloud cover (in percent), and visibility (in miles, where 10 miles means unrestricted visibility). The last four columns give the wind speed (in knots), wind direction, air pressure (in millibars), and the pressure’s short-term trend (up or down). That’s a lot of data, and we’d like to present it both compactly and intuitively.

[3] The dew point is the temperature at which water vapor in the air begins to condense. The dew point is always lower than or equal to the actual temperature, and the closeness of those two temperatures is a measure of the humidity in the air.

Figure 6.12. All-in-one visualization of the data set in listing 6.4: temperature and dew point as solid lines, atmospheric pressure and its trend with symbols, visibility with grey bars; wind speed and direction across the top, cloud cover along the bottom. See listing 6.5 for the commands.

Listing 6.4. Incomplete data for figure 6.12 with commands in listing 6.5 (file: weather)
# Day    Temp   Dew     Cloud   Visib   Wind    Direct  Press    Trend
1        14.3   14      20      7       0       0       1007.6   dn
2        14.3   14      80      2.5     7       0       1006.4   up
3        13.8   13.5    70      10      6       150     1006.4   dn
4        14.3   13.9    50      10      8       190     1009.6   --
5        14.3   13.9    10      10      6       170     1012     dn
6        12.4   11.8    80      1.75    6       210     1008.2   up
7        12.4   11.5    90      2.5     3       190     1005.8   dn
...

In figure 6.12, temperature and dew point are shown using continuous lines. Pressure is shown using unconnected symbols (in millibars above 1,000), and the shape of each symbol indicates the most recently observed tendency: light triangles pointing upward for a rising tendency and dark triangles pointing downward for a falling tendency. Visibility (or, rather, restriction of visibility) is shown as grey bars in the background: the shorter the remaining white area at the bottom of the graph, the shorter the visible range. Finally, wind is shown with vectors along the top, and cloud cover is shown using circles along the bottom. The complete commands are given in the following listing.

Listing 6.5. Commands for figure 6.12 using data from listing 6.4 (file: weather.gp)

This set of commands uses many of the concepts introduced previously. The plot command in this example is unusually complicated (including nine rather different quantities!). Here are a few pointers as you step through it.

This example uses the candlesticks style to draw the grey bars that indicate the visible range. This style is convenient because it draws a box, all dimensions of which you can control explicitly. The whiskers have zero length.
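To make the idea concrete, here is a hypothetical sketch of such a bar plot. It is not the command from listing 6.5: the y range of 0 to 20 and the mapping from visibility to bar length are assumptions chosen only for illustration.

```gnuplot
# Hypothetical sketch, assuming the file "weather" from listing 6.4
# (visibility in column 5, where 10 miles means unrestricted).
# candlesticks expects: x, box_min, whisker_min, whisker_high, box_high.
# Repeating the box limits for the whiskers gives them zero length.
# The bar hangs from the top of the graph (y=20) and reaches down to 10+$5,
# so it grows longer as visibility drops below 10 miles.
plot [0:8][0:20] "weather" u 1:(10+$5):(10+$5):(20):(20) \
     w candlesticks lc rgb "grey" fs solid notitle
```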

Gnuplot draws items onto the graph in the order in which they appear in the plot command. For this reason, it’s important that the visibility comes first: the grey bars should be in the background and not obscure any of the other plot elements.

The yerrorlines style comes in handy to draw the temperature and the dew point.
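A minimal sketch of this trick follows, assuming the column layout of listing 6.4. The temperature serves both as the y value and as the upper end of the error bar, and the dew point as the lower end, so each bar spans exactly the gap between the two.

```gnuplot
# Sketch, assuming the file "weather" from listing 6.4.
# Four columns for yerrorlines: x, y, ylow, yhigh.
# y = yhigh = temperature (column 2), ylow = dew point (column 3).
plot [0:8][10:16] "weather" u 1:2:3:2 w yerrorlines
```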

Next come the circle segments for the cloud cover. There is a little math to account for the fact that gnuplot measures angles counterclockwise from the horizontal axis, but angles should increase clockwise from the 12 o’clock position in this graph. The filled pie slices are drawn first; the empty black circles that provide a rim for each symbol are drawn afterward, so they’re visually in front.
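The angle conversion can be sketched as follows. Again, this is a hypothetical reconstruction rather than the command from listing 6.5; the vertical position and radius are arbitrary.

```gnuplot
# Hypothetical sketch, assuming the file "weather" from listing 6.4
# (cloud cover in column 4, in percent).
# A cover of c percent should sweep 3.6*c degrees clockwise from 12 o'clock.
# In gnuplot's convention (counterclockwise from the positive x axis), that
# is the arc from 90-3.6*c to 90 degrees.
# Caveat: at exactly 0 or 100 percent the two angles coincide (modulo 360),
# so those cases may need special handling.
plot [0:8][0:1] "weather" u 1:(0.5):(0.3):(90-3.6*$4):(90) \
        w circles fs solid notitle, \
     "" u 1:(0.5):(0.3) w circles lc rgb "black" fs empty notitle
```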

Two horizontal lines are included to divide the graph visually. This part of the plot command could also have come earlier or later.

Now the wind speed and direction indicators are added. First a circular symbol (pointtype 7) is drawn to indicate the origin of each vector; this is important if the wind speed is so low that the vector has zero length. Next comes the vector, again adjusting the angles to be measured clockwise from the top rather than counterclockwise from the right. The arrowhead is suppressed using nohead: the origin of each arrow is already clearly marked, so there’s no need to clutter the graph by drawing explicit heads.
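For an angle a measured clockwise from the top, the unit vector is (sin a, cos a) rather than the usual (cos a, sin a). The following sketch applies this to the wind data; the scale factor s and the vertical position are assumptions made for illustration, not values from listing 6.5.

```gnuplot
# Hypothetical sketch, assuming the file "weather" from listing 6.4
# (wind speed in column 6, direction in degrees in column 7).
set angles degrees
s = 0.05    # assumed scale: plot units per knot

# First the origin markers, then the headless vectors on top of them.
plot [0:8][0:2] "weather" u 1:(1) w p pt 7 notitle, \
     "" u 1:(1):(s*$6*sin($7)):(s*$6*cos($7)) w vectors nohead notitle
```

Keep in mind that the x and y axes generally have different scales, so the apparent direction on screen also depends on the aspect ratio of the plot.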

Finally, the symbols are drawn for the pressure tendency. This part of the command exploits gnuplot’s ability to use a single character as a plotting symbol. The symbols can be found in the “Geometric Shapes” block of the Unicode standard: the light upward triangle has code point U+25b3, the filled downward triangle corresponds to U+25bc, and the bullseye character has code point U+25ce. The latter was chosen to be as neutral as possible compared to the other two symbols (no angles and of intermediate visual weight, due to the inner circle on an unfilled background).

The sym(s) function consists of two nested conditional operators and converts the strings used in the data file to the characters used in the plot. You need to use the stringcolumn(i) function here—the $9 shortcut doesn’t work in this evaluation context. Also remember that stringcolumn(i) takes a numeric argument: the position of the column from which a value should be taken. (See section 4.4 for more information about stringcolumn(i) and the sidebar in section 5.1 for more information about Unicode.)
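A function of this kind might look like the following sketch. The exact definition in listing 6.5 may differ; in particular, drawing the characters through with labels is an assumption made here.

```gnuplot
# Hypothetical sketch, assuming the file "weather" from listing 6.4
# (pressure in column 8, trend string in column 9) and a UTF-8 capable terminal.
set encoding utf8

# Two nested conditional operators: "up" and "dn" get triangles,
# anything else (such as "--") gets the neutral bullseye.
sym(s) = (s eq "up") ? "△" : ((s eq "dn") ? "▼" : "◎")

# stringcolumn(9) reads column 9 as a string; pressure is shown above 1,000 mbar.
plot [0:8][0:15] "weather" u 1:($8-1000):(sym(stringcolumn(9))) w labels notitle
```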

Figure 6.12 is certainly impressive, and so are the details of listing 6.5—but remember that most graphs (and commands) are much, much simpler! More important is the question of whether figure 6.12 succeeds in its visualization task. Here are some things to think about:

  • The graph packs a lot of information into a relatively compact space. The middle section is shared by three totally different quantities, but because they’re represented using different visual cues (lines, symbols, and filled regions), their interference is reduced. The amount of information encoded in the different symbols used to represent pressure readings seems too high: I find it difficult to get an intuitive sense of the behavior of the air pressure from this graph.
  • The dew point is represented particularly well, I think. The absolute value of the dew point is less relevant than its distance from the temperature, and it’s precisely this distance that is emphasized by the vertical error lines. The continuous lines used for the ambient temperature highlight the actual values.
  • For the visibility data, I wanted to stress the reduction in visibility, represented by the grey bars intruding from above, possibly leaving only a small, unobscured visible region at the bottom of the graph. Although the idea is good in concept, I don’t think it works well in practice: drawing the visual range with lines would have been clearer.
  • Both the wind and the cloud cover visualizations represent their underlying data intuitively: you can, literally, see which way the wind blows. It would have been weird to use with lines to indicate the wind direction. Nevertheless, the visualization of the wind speed and direction has a couple of problems: although you have a relative sense of the wind speed, there’s no indication of its numerical range, and if you’re trying to land a plane, that matters a lot! And if the wind speed is high and the direction is variable, the arrows will have a tendency to obscure each other. (That this isn’t an issue here is solely a consequence of this particular data set.)
  • One last problem: although the example uses good numerical tic marks for the temperature and dew-point values, things are less clear for the visibility and pressure readings. A single axis isn’t enough to represent quantities that have such totally different numerical ranges.

6.5. Other styles

The styles discussed in this chapter are all very general: they can be used to represent all sorts of data in a variety of situations. In addition to these general-purpose styles, gnuplot offers further possibilities for special applications:

  • Appendix E explains boxplots (to represent point distributions), histograms (for counting statistics), and parallel coordinates (a particular kind of multivariate technique).
  • Appendix C and appendix D introduce surface and contour plots, as well as false-color plots and heatmaps. All of these methods apply to cases where you’d like to represent how one dependent quantity depends on two independent variables.
  • Appendix F introduces some special plot types for non-Cartesian coordinates.

6.6. Summary

This chapter is like an illustrated catalog: it demonstrates all of gnuplot’s major plot styles by way of example. As such, I won’t be disappointed if you didn’t read the chapter from start to finish: it’s fine to look at the figures until you find one that does what you need and then read the accompanying section.

The only part of this chapter that you should have read, even on a first pass, is section 6.2, because it explains how styles are chosen and how you can customize them to meet your needs. This is fundamental material that you’ll need basically every time you use gnuplot.

Everything in this chapter exists to visualize data. In contrast, chapter 7 looks at things you may want to put on a plot in addition to the data: labels, arrows, and other decorations.
