Chapter Eight
Pitfall 7: Design Dangers

“Design is not just what it looks like and feels like. Design is how it works.”

Steve Jobs

How We Dress up Data

I don't consider myself an expert designer – far from it – but I did have a unique opportunity in my career to encounter a high volume of very creative data visualizations over an extended period of time. I was asked to head up the global marketing team for the wildly popular and free Tableau Public platform,1 and I held that role for over 5 years as it grew more than twenty-fold. It was a role I'm very grateful for, and one I'll never forget.

What's so interesting about this platform is that it gives data nerds around the world a chance to apply the skills they develop in their day jobs to passion projects about topics that are interesting to them, from baseball team stats to multiple sclerosis dashboards, from sacred text word usage infographics to Santa Claus trackers. The topics range from silly to serious and the level of difficulty ranges from simple to sophisticated, and you can find everything in between those extremes.

And it isn't just corporate data jockeys set free to create without constraint. You'll also find the work of data journalists telling the stories of our time with data, corporate marketers creating engaging content for campaigns, government agencies giving citizens access to public data, and nonprofits bringing attention to their causes through interactive data.

All of these “authors” have two things in common: (1) they're publishing broadly to an audience on the web they don't entirely know and can't talk to en masse ahead of time, and (2) they can't assume that this audience will want to stop and look at anything they've created, or stick around for very long even if they do.

These two facts present an interesting design challenge that results in a need for both clarity and aesthetics. Since the work will potentially be seen by millions, it must be relatively easy to understand in order to accommodate the wide range of data literacy amongst the audience members. That's where clarity comes in. But since a lot of other online content is vying for the attention of those same millions of people, it must capture their attention and engage their imagination. Hence, the need for aesthetics, too.

I've long held the belief that both clarity and aesthetics matter when it comes to creating data graphics for an audience to consume.2

To attempt to define them in greater detail, the term “clarity” is fairly straightforward in this context. When I use that term in reference to data visualization, I'm referring to the speed and effectiveness with which a data visualization imparts to the audience an accurate understanding of some fundamental truth about the real world. This has been scientifically researched in a number of ways, including tests of cognition, eye-tracking studies, and more.

While clarity may be easy to define, aesthetics, unfortunately, will never be so cooperative. Of course the problem with aesthetics is that the overused cliché “beauty is in the eye of the beholder” is very true. What looks beautiful to me here and now will not necessarily look beautiful to you, and may not even look beautiful to me at a different point in time.

Like fashion, certain elements of style come into vogue and then pass out of it again in a cycle that's hard to predict. And so we have a problem when it comes to defining aesthetics. The universal standards of “beautiful” elude us, and always have.

The previous chapter on graphical gaffes touched on both of these aspects, with a focus on clarity – choosing charts and creating them so that people can get a job done or come to an accurate understanding of the way things are. But it also touched on aesthetics – choosing color palettes that are simple and focus the reader's attention.

Design impacts both clarity and aesthetics. As Steve Jobs noted (see the epigraph to this chapter), we can't narrow the field of design to just aesthetics. How we interact with designed things and how they function also matter. Just as I learned in mechanical engineering school, we need to consider form, fit, and function in order to build something worth giving to the world.

So I'd like to break this chapter into two main parts. The first part will deal with pitfalls related to the look and feel of data visualizations. The second part will focus on pitfalls related to how we interact with them.

Pitfall 7A: Confusing Colors

A pitfall that's quite easy to fall into when creating dashboards with multiple charts and graphs is that of using color in ways that confuse people (Figure 8.1). There are many ways to confuse with color, including the oft-maligned red-green encoding that color-vision-deficient readers can't decipher. That's just one of many, though, and I'd like to illustrate three additional versions of the “confusing color” pitfall that I see people like me fall into quite often.

I'll conclude this section by talking about the design goal that I aspire to achieve any time I create a dashboard with multiple views.

Illustration of a Boston Marathon dashboard that uses the same shade for different attributes.

FIGURE 8.1 A Boston Marathon dashboard that uses the same color for different attributes.

Color Pitfall 1: Using the Same Color for Two Different Variables

The example for the first type of this common pitfall comes from a marathon dashboard showing the results of the 2017 Boston Marathon. The data for this race is available online.3

Here are some pluses (things I like) and deltas (things I would change) about this dashboard:

  • Pluses: I like the use of color in the histogram that shows the clear cutoff points, where finishers cross the line in droves immediately before the turning of the hour, especially the fourth hour of the race. This shows how goal-setting can affect the performance of a population, and it's fascinating.
  • Deltas: Notice that the same red hue, though, applies to finishers from Chicago and Portland, and also to people who finished the race between 4 and 5 hours. Likewise, the same orange is used to encode the finishers from New York and Austin, as well as people who finished between 3 and 4 hours after starting. Similarly, teal, blue, and green have multiple meanings. Of course, there's no actual relation between these particular groups, though it may seem like there is at first glance.

To avoid this confusion, I propose using entirely different color schemes for the histogram and the treemap (and not repeating any colors within the treemap itself), or, better yet, not putting these two charts next to each other at all, because they tell completely different stories.
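One way to catch this kind of clash before publishing is to compare the colors assigned in each chart. The following is only a minimal Python sketch of that idea: the hex codes, labels, and the `shared_colors` helper are hypothetical stand-ins, not the actual palette or tooling behind Figure 8.1.

```python
# Hypothetical palettes: the hex codes and labels are illustrative only,
# not the real colors used in Figure 8.1.
histogram_palette = {
    "2-3 hours": "#2a9d8f",
    "3-4 hours": "#f4a261",
    "4-5 hours": "#e63946",
}
treemap_palette = {
    "Chicago": "#e63946",
    "New York": "#f4a261",
    "Boston": "#6a4c93",
}


def shared_colors(palette_a, palette_b):
    """Return {color: (labels in a, labels in b)} for colors used in both palettes."""
    clashes = {}
    for color in set(palette_a.values()) & set(palette_b.values()):
        labels_a = sorted(k for k, v in palette_a.items() if v == color)
        labels_b = sorted(k for k, v in palette_b.items() if v == color)
        clashes[color] = (labels_a, labels_b)
    return clashes


clashes = shared_colors(histogram_palette, treemap_palette)
for color, (labels_a, labels_b) in sorted(clashes.items()):
    print(f"{color} means {labels_a} in one chart and {labels_b} in the other")
```

Any color that shows up in the result carries two unrelated meanings on the same dashboard, which is exactly the situation described above.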

Color Pitfall 2: Using the Same Color Saturation for Different Magnitudes of the Same Variable

Similarly, I've made the mistake of using the same color saturation to effectively create two conflicting color legends for the exact same dashboard. Consider this trivial map that I created with data about mileage of California roads by county to illustrate the point (Figure 8.2).

Illustration of a dashboard with two different sequential palettes using the same shade.

FIGURE 8.2 A dashboard with two different sequential color palettes using the same hue.

Notice that there are two different sequential color legends on the dashboard that use the exact same turquoise color. In the filled map, the fully saturated turquoise color corresponds to a specific county (Los Angeles County) with 21,747 total miles of roads.

In the bar chart, the full turquoise color saturation corresponds to a specific road type (Local roads) with a total of 108,283 miles for the entire state of California. Just eyeing the dashboard in passing, however, the viewer may connect Los Angeles County with local roads and mistakenly think these two marks are connected. Or, the reader may look at the wrong color legend (if both are in fact included) and be misled about how many miles of road the county actually includes, or vice versa.

From a software user's perspective, this pitfall was incredibly easy to fall into because all I had to do was click and drag the “Miles” data field over to the field that determines color of the map, and also do the same in the place where I created and edited the bar chart. These two visualizations are aggregating miles by totally different dimensions – county and road type – but I can easily create a confusing color encoding if I'm not paying close attention.

Is there a way to avoid this type of color pitfall?

Notice that the color encoding on the bar chart is actually redundant. We already know the relative proportions of the miles of different road types by the lengths of their corresponding bars, which is quite effective all by itself. Why also include miles on the color shelf, especially considering the fact that the color would conflict with the choropleth map, where color is totally necessary?

To eliminate this conflicting color scheme, let's remove color from the bars altogether and just leave an outline around them (Figure 8.3).

Color Pitfall 3: Using Too Many Color Encodings on One Dashboard

It's very common to use too many color schemes on a dashboard, especially with big corporate dashboards where the various stakeholders call for everything but the kitchen sink to be added to the view.

Figure 8.4 shows a dashboard I created to illustrate the point – my first dashboard that uses the Sales SuperStore sample data set that comes with Tableau Desktop.

In this dashboard we see not just one red-green color encoding but two, and they have different extremes for the exact same measure (Profit). We also see red and green used in the scatterplot, but now they refer to two of the four different regions (West and South, respectively) instead of different profit levels. Finally, we have another bar chart that uses no color scheme, but each bar is blue – the same blue as the Central region in the scatterplot.

A version of the dashboard using only one sequential shade encoding to avoid confusion.

FIGURE 8.3 A version of the dashboard using only one sequential color encoding to avoid confusion.

You get the point. This isn't what we want to create. I think I broke all of the rules in creating this one.

My Design Aspiration: Only One Color Encoding per Dashboard

This goal isn't always feasible, but as much as possible, I try to include one and only one color scheme on every dashboard I create. The reason is that I find it takes me a lot longer to figure out what's going on in someone else's dashboard when they've used more than one. It's that simple.

This means I often have to make a tough choice: which is the variable (quantitative or categorical) that will be blessed with the one and only one color encoding on the dashboard? It'll become the variable that receives the most attention, so it should be the one that's most related to the primary task the user will perform when using the dashboard.

Illustration depicting a company sales dashboard using a fabricated store data set - sales by product category and quantity and profit by customer.

FIGURE 8.4 A company sales dashboard using a fabricated store data set.

For example, if the dashboard was created for a sales meeting in which the directors of each U.S. sales region talk about what's working well and what's not working well in their respective regions, then the “Region” attribute could very well take the honored place of prominence (Figure 8.5).

Notice that this version of the dashboard doesn't show which cities were unprofitable overall. We have shown profit by product category as a separate bar chart in the bottom left, but the map no longer gives us any information about profit. So this version of the dashboard actually shows less information, but shows it in a way that's easier to understand. Sales executives are typically more focused on sales, aka the “top line,” than profit, or the “bottom line.”

Illustration depicting a redesigned version of the dashboard that limits use to a single-shade encoding.

FIGURE 8.5 A redesigned version of the dashboard that limits use to a single-color encoding.

But if we were to find out the profit by city was critical to the discussion, we'd need to either find a way to add it back to the dashboard, or we'd need to create a second view to handle this part of the discussion.

Trying to cram all of the information that could possibly be needed into a single view is often unnecessary, and it's what results in the type of color confusion we're talking about in this section.
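The one-encoding aspiration can be turned into a rough automated check. The sketch below assumes a made-up dashboard description (a list of dictionaries); the chart names, fields, and domains are hypothetical and do not correspond to any real tool's file format or to the exact values in Figure 8.4.

```python
# Hypothetical dashboard spec: chart names, fields, and domains are invented
# for illustration and do not come from any real tool's file format.
dashboard = [
    {"chart": "map", "color": {"field": "Profit", "domain": (-5000, 5000)}},
    {"chart": "bars by category", "color": {"field": "Profit", "domain": (-20000, 20000)}},
    {"chart": "scatterplot", "color": {"field": "Region", "domain": None}},
    {"chart": "bars by segment", "color": None},
]


def color_encodings(charts):
    """Collect the distinct (field, domain) pairs that are mapped to color."""
    return {
        (c["color"]["field"], c["color"]["domain"])
        for c in charts
        if c["color"] is not None
    }


encodings = color_encodings(dashboard)
if len(encodings) > 1:
    print(f"{len(encodings)} color encodings found; consider keeping only one:")
    for field, domain in sorted(encodings, key=str):
        print(f"  {field} {domain}")
```

Note that the same field with two different domains counts as two encodings here, since a reader has to learn two separate legends for it.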

Pitfall 7B: Omitted Opportunities

You'll recall that back in Chapter 4 (the chapter all about our third pitfall, mathematical miscues) we showed how a very common software default led us to completely miss the fact that one particularly moody and haunting poet didn't publish anything in three different years during the course of his career. We missed this fact because the years were missing from the view since there were no records in the data set with those years.

We showed how to correct this chart setting in order to avoid this pitfall, as shown in Figure 4.8, which I will share again in Figure 8.6.
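The fix referred to here is a chart setting in the software. Outside of a charting tool, the same idea can be sketched by filling in zero counts for years that have no records. This is an illustration only: the years listed are made-up values, not Poe's actual bibliography, and `fill_missing_years` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical publication years, one entry per published work.
# These are made-up values, not Poe's actual bibliography.
publication_years = [1827, 1827, 1829, 1831, 1831, 1831, 1835]

works_per_year = Counter(publication_years)


def fill_missing_years(counts):
    """Return (year, count) pairs for every year in the range, using 0 for gaps."""
    first, last = min(counts), max(counts)
    return [(year, counts.get(year, 0)) for year in range(first, last + 1)]


filled = fill_missing_years(works_per_year)
for year, count in filled:
    print(year, "#" * count)
```

With the gaps filled, years with no published works appear as empty slots on the axis instead of silently disappearing.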

But what I didn't mention in that chapter is that this version of the chart has omitted much more than just a few years on the x-axis. It omits a palpable aesthetic design opportunity.

For those of you familiar with the poetry and stories of Edgar Allan Poe, many are moody and melancholy and some are even downright chilling or haunting. But there's nothing particularly moody or melancholy about this chart at all, and certainly nothing chilling or even remotely haunting. It's a nice, bright blue column chart with divisions for each published work, creating a stacked box feel. It does nothing to evoke the feeling of Edgar Allan Poe's work, and there are no artistic elements that draw in the reader or communicate the subject matter to them in any way.

Chart displaying the works of Edgar Allan Poe, with the missing years depicted in-between.

FIGURE 8.6 Poe's works displayed as columns with missing years shown.

This is another sin of omission. A major one.

Whenever I ask people in my classes how they would change this view to add some artistic flair, they often suggest changing the squares into books so that there would be a stacked book feel. I like that idea, but his published works are of widely varying length, and some of them are incredibly short, so that might be a bit misleading.

What we can do, though, is turn the y-axis upside down so that the squares stack downward, and we can change the color from bright blue to red, evoking a dripping, blood-stained feel (Figure 8.7).

A modified version of the Poe chart depicting the remarkable career (150 works) of Edgar Allan Poe.

FIGURE 8.7 A modified version of the Poe chart that adds aesthetic elements.

We need to be very careful with reversing axes. If the marks were lines instead of bars, the slope would appear to go down at the precise time that the measure increases. So, I very rarely find occasions to invert axes for line charts.

One instance where such a change may be warranted is when showing change in rank over time. Since low numbers like 1 and 2 correspond to high ranks, and high numbers like 99 or 100 correspond to low ranks, inverting the y-axis actually helps show when a particular item or group goes up or down in rank (Figure 8.8).

Wouldn't it be strange if the tenth-ranked skill were at the top of the chart and the top-ranked skill were at the bottom? I would make the argument that this type of chart is one where inverting the axis actually aids cognition and comparisons.
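The arithmetic behind an inverted rank axis is simple enough to sketch. In the snippet below, the skill names, years, and ranks are invented for illustration, and `y_position` and `rank_change` are hypothetical helpers rather than part of any charting library.

```python
# Hypothetical rankings: skill names, years, and ranks are invented for illustration.
ranks_by_year = {
    2016: {"SQL": 1, "Python": 3, "Excel": 2},
    2017: {"SQL": 2, "Python": 1, "Excel": 3},
}


def y_position(rank, lowest_rank):
    """Map a rank to a height so that rank 1 plots highest (an inverted axis)."""
    return lowest_rank - rank + 1


def rank_change(skill, start, end):
    """Positive result means the skill moved up the chart between the two years."""
    lowest = max(ranks_by_year[start].values())
    return (y_position(ranks_by_year[end][skill], lowest)
            - y_position(ranks_by_year[start][skill], lowest))


for skill in sorted(ranks_by_year[2016]):
    print(skill, rank_change(skill, 2016, 2017))
```

Because rank 1 maps to the greatest height, a line that rises on the chart corresponds to an improvement in rank, which matches how most readers expect "going up" to look.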

A bumps chart depicting the analysis of Top Skills ranking in various companies over the course of past five years.

FIGURE 8.8 A “bumps chart” showing ranking of skills over time.

The Poe chart is an example where inverting the axis neither helps nor hinders comprehension dramatically, but it certainly does help with the aesthetic look and feel of the chart. We still get the sense that there are more works published in years where the boxes stack farther downward, just as there would be more blood on the mirror if the drip went all the way to the bottom.

A key point, though, is that I'd hesitate to add artistic flair whenever doing so results in a dramatic reduction in clarity or comprehensibility, or when adding aesthetic elements is likely to totally mislead some of my audience. These are trade-offs to be aware of and to test with potential audience members, even if the test is just a quick-and-dirty one.

In this case, though, we can go farther than simply inverting the y-axis. There is a nice gap in the very center of the chart. It's a gift from the data gods that we can use to great aesthetic advantage. While I don't think we should fill every single white space (quite the contrary, in fact), this one is just begging for an image of the author himself. Luckily for us, a nice oval portrait image of Edgar Allan Poe is available in the public domain and free to use under the Creative Commons license. So is an image of his signature, which is a nice object to replace the text of his name, and very appropriate because these works would have been written by hand, not on a word processor (Figure 8.9).

These images aren't just useless chartjunk. They actually convey information. His image will be recognizable to many (although not all) of our audience. For those who have seen his photo before and are familiar with his face, seeing it again here will convey the subject matter and also bring back to the surface memories and feelings associated with reading his poems in our high school textbooks. There is real value in that experience we have created.

It's so easy to miss opportunities like this, and fall into yet another pitfall of omission. The key to avoiding this pitfall is to allow our creative juices to flow, and to ask ourselves what opportunities exist to add aesthetic components that will enhance the overall experience for our audience.

A modified version of the Poe dashboard depicting the remarkable career of Edgar Allan Poe with his photograph displayed in the center.

FIGURE 8.9 A modified version of the dashboard that adds images to further enhance aesthetic appeal.

Great care must be taken when seeking to avoid this pitfall for two reasons. The first reason was already mentioned: there can be a trade-off between clarity and aesthetics at play, which we will want to approach with caution. The second is that sometimes our audience doesn't want any of these elements whatsoever. There are people for whom these types of visual enhancements are actually annoying, and they will get very irritated with you if you add them.

I'll never forget the time I gave an hour-long presentation to a group of people who worked at a city office in Arizona about how they could apply creative techniques similar to this one to their reports and dashboards. At the end of the presentation, I asked whether there was anyone in the audience who felt that there were zero opportunities in their current role to make use of such creative elements.

One woman raised her hand. She said she worked for the police department and prepared their weekly reports. She felt that any attempt to add creative or aesthetically pleasing elements would backfire spectacularly. I had no interest in getting her fired from her job, but I asked her to think about it as time went along, to challenge that assumption. I'd like to say she added some fancy thing or two to a dashboard and got applauded by all the grouchy chartjunk-hating chiefs of the force, but I haven't talked to her since that presentation. Sorry if that's a disappointing end to the story. For the record, I still think she can do it.

Pitfall 7C: Usability Uh-Ohs

Of course, design is about much more than simply color choice, aesthetic elements, and how something looks. Just like this chapter's epigraph states, it's also about how it works. Think form, fit, and function.

I've been educated and inspired by the best-selling design classic The Design of Everyday Things by user-centered design guru Don Norman.4 You really have to read the entire book, which applies to all types of objects that people design, from chairs to doors to software to organizational structures. It provides thoughtful and practical principles that guide designers to design all of those things well. By “well” he means “products that fit the needs and capabilities of people” (p. 218).

As I read it, it occurred to me that data visualizations are “everyday things” now, too, even richly interactive ones viewed on tablets and phones. That has only become the case in the past decade or so. Yes, examples can be traced back to the early days of the Internet, but the recent explosion of data, software tools, and programming libraries has caused their proliferation.

And I found that point after point, principle after principle in Norman's book applied directly to data visualization. I'd like to call out five points that struck me as particularly relevant to the field of data visualization.

1. Good Visualizations Are Discoverable and Understandable

Norman starts his book describing two important characteristics of all designed products:

  • Discoverability: Is it possible even to figure out what actions are achievable and where and how to perform them?
  • Understanding: What does it all mean? How is the product supposed to be used? What do all the different controls and settings mean?

He talks about common things that are often anything but discoverable and understandable, such as faucets, doors, and stovetops. One of my favorite quotes in the book is about faucets (p. 150):

If you want the faucet to be pushed, make it look as if it should be pushed.

It occurred to me that the typical stovetop design snafu has a direct translation into the world of data visualization. To explain, let's start with the problem with stovetops. Ever turn on the wrong burner? Why? Because you're stupid? No. Because there are often poor mappings between the controls and the burners. The burners are sometimes arranged in a two-by-two grid while the controls can be in a straight line (Figure 8.10).

What does that have to do with data visualization? We often use similar controls – radio buttons, combo boxes, sliders, and so on – to filter and highlight the marks in the view. When there are multiple views in a visualization (a dashboard), there's a similar opportunity to provide clear, or natural, mappings.

FIGURE 8.10 A common stovetop design, with no natural mapping between burners and controls.

Norman gives the following advice for mappings:

  • Best mapping: Controls are mounted directly on the item to be controlled.
  • Second-best mapping: Controls are as close as possible to the object to be controlled.
  • Third-best mapping: Controls are arranged in the same spatial configuration as the objects to be controlled.

Often the software default places the controls on the right-hand side. Here's my attempt (Figure 8.11) to show these options on a generic data dashboard, where the four different views are labeled A, B, C, and D, and the controls that change them are labeled according to the views they modify.

Illustration of a data dashboard depicting an example of default versus natural mapping of filters, indicated by matching letters.

FIGURE 8.11 An example of default versus natural mapping of filters on a data dashboard.

This is a relatively straightforward example, and the job of the designer of a more complex visualization is to make it similarly clear what can be done and how to do it. Designers use things like affordances, signifiers, constraints, and mappings to make it obvious. Note that it takes a lot of effort to make the complex obvious.

2. Don't Blame People for Getting Confused or Making Errors

A fundamental principle that Norman drives home a number of times in the book is that human error usually isn't the fault of humans, but rather of poorly designed systems. Here are two great quotes on the topic (p. 167):

It is not possible to eliminate human error if it is thought of as a personal failure rather than as a sign of poor design of procedures or equipment.

And again on the same page:

If the system lets you make the error, it is badly designed. And if the system induces you to make the error, it is really badly designed. When I turn on the wrong stove burner, it is not due to my lack of knowledge: It is due to poor mapping between controls and burners.

Norman differentiates between two types of errors: slips and mistakes.

  • Slips are when you mean to do one thing, but you do another.
  • Mistakes are when you come up with the wrong goal or plan and then carry it out.

Both types of errors happen when people interact with data visualizations. In the world of mobile, slips are so common – maybe I meant to tap that small icon at the edge of my phone screen, but the phone and app recognized a tap of an adjacent icon instead.

Mistakes are also common. Maybe it made sense to me to filter to a subset of the data to get my answer, but in reality I was misleading myself by introducing a selection bias that wasn't appropriate at all. If someone makes the wrong decision based on misinformation they took from your visualization, that's your problem at least as much as it is theirs, if not more so.

How do you make sure your readers avoid slips and mistakes? Build and test. Iterate. Watch people interact with your visualization. When they screw up, don't blame them or step in and explain what they did wrong and why they should've known better. Write it down and go back to the drawing board. If the person who agreed to test your visualization made that error, don't you think many more likely will? And you won't be there to tell them all what they did wrong. Your only chance to fix the error is to prevent it.

3. Designing for Pleasure and Emotion Is Important

I'm a big believer in this principle. Norman states that “great designers make pleasurable experiences” (p. 10):

Experience is critical for it determines how fondly people remember their interactions. Was the overall experience positive, or was it frustrating and confusing?

How can an experience with a data visualization be pleasurable? In lots of ways. It can make it easy to understand something interesting or important about our world, it can employ good design techniques and artistic elements, it can surprise us with a clever or funny metaphor, or some combination of these and more.

Remember our ridiculous discussion of a hypothetical pie chart with 333 slices in the previous chapter? Well, here's a dashboard about bicycle parking stands in Dublin, Ireland, that contains a pie chart with exactly that many slices, one for each stand, sized by the occupancy, or the number of bikes that can be parked in each one (Figure 8.12).

FIGURE 8.12 A dashboard of bicycle stands in Dublin that uses an outrageous pie chart in a fun way.

You can use it to hover over a slice or a dot on the map to see the occupancy of each stand, and where it ranks in the overall list of 333 stands. Is that such a travesty? I think it's kind of neat.

What about emotion, the “e-word” to which the analytical folks in our midst can sometimes be allergic? Cognition gets a lot of play in the world of data visualization while emotion does not. But these two horses of the chariot that is the human spirit are actually inextricably yoked (p. 47):

Cognition and emotion cannot be separated. Cognitive thoughts lead to emotions: emotions drive cognitive thoughts.

Cognition attempts to make sense of the world: emotion assigns value…. Cognition provides understanding: emotion provides value judgments.

So let's embrace emotions. Some data visualizations make us angry or upset. Some make us laugh out loud. Some are just delightful to interact with. These elements of the experience should be part of the discourse in our field, and not ignored. If we take them into consideration, we'll probably design better stuff.

4. Complexity Is Good, Confusion Is Bad

There's a trend in data visualization to move away from the big, complex dashboards of 2010 and toward “lightweight” and uber-simple individual graphs, and even GIFs. Why? A big part of the reason is that they work better on mobile. Also, we've learned in the past few years that the complexity of those big dashboards isn't always necessary.

This is a great development and I'm all for it, but let's just remember that there was often great value in the rich interaction that's still possible on a larger screen. Instead of abandoning rich interactivity altogether, I believe we should be looking for new and innovative ways to give these advanced capabilities to readers on smaller devices. When those capabilities will help us achieve some goal, we'll be better off. We're not there yet.

After all, it's not the complexity of the detailed, filterable dashboard that's the problem on the smartphone screen – it's that we haven't figured out how to make these capabilities intuitive to a reader on this device yet, and the experience is confusing.

I actually see this as a good thing. Our generation has the chance to figure this out for the generations to come. The growth of the numerical literacy of our population will be well worth the effort.

5. Absolute Precision Isn't Always Necessary

I have to be honest. This one is my hot button. There's a school of thought that says that the visualization type that gives the reader the ability to guess the true proportions of the thing visualized with the greatest accuracy is the only one that can be used. Some go so far as to declare it immoral to choose a visualization type that introduces any more error than another (they all have some error).

I found this great visualization about different encoding channels in Tamara Munzner's book Visualization Analysis and Design (Figure 8.13).

This research shows that all encoding types are imperfect. People won't guess the true proportions with 100% accuracy for any of them. And many times absolute precision just isn't necessary – not for a task they need to perform, and not for a general awareness we're trying to impart.

If we had to pick the one with the highest accuracy all the time, we'd only ever have dot plots, bar charts, and line charts and that's it.

The problem with this line of reasoning is that absolute precision isn't always necessary for the task at hand. Norman uses the example of converting temperature from Celsius to Fahrenheit. If all you need to do is figure out whether you'll need to wear a sweater when you go outside, a shortcut approximate conversion equation is good enough. It doesn't matter whether it's 52°F, 55°F, 55.8°F, or 55.806°F. In all four cases, you're wearing a light sweater.
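To make the sweater example concrete, here is a small Python sketch comparing the exact conversion with a rough one. Which shortcut Norman has in mind is not stated here, so the "double it and add 30" rule is an assumption, as is the 50 to 60°F "light sweater" band used for the decision.

```python
def exact_fahrenheit(celsius):
    """Exact conversion: F = C * 9/5 + 32."""
    return celsius * 9 / 5 + 32


def rough_fahrenheit(celsius):
    """Common mental shortcut (an assumption here): double it and add 30."""
    return celsius * 2 + 30


def needs_light_sweater(fahrenheit, low=50, high=60):
    """Hypothetical decision rule; the 50-60 degree band is illustrative only."""
    return low <= fahrenheit < high


for celsius in (11, 12, 13, 14):
    exact = exact_fahrenheit(celsius)
    rough = rough_fahrenheit(celsius)
    same_call = needs_light_sweater(exact) == needs_light_sweater(rough)
    print(f"{celsius}°C: exact {exact:.1f}°F, rough {rough}°F, same decision: {same_call}")
```

The two formulas agree exactly at 10°C and drift apart by 0.2°F for every degree Celsius away from it, so the shortcut is good enough for the sweater decision in this range but would not be for a task that needs real precision.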

Graph depicting the research results of errors involved with different encoding types.

FIGURE 8.13 Research results showing error involved with different encoding types.

Since there are errors associated with every visualization type, and since we aren't machines or perfect decoders of pixels or ink, then sometimes it's okay that a general understanding is achieved. Many times this means we are free to add interesting chart types that lend flavor and, heaven forbid, even a little bit of fun to the endeavor.

I think that's a good thing. In the Tableau Public role, I certainly saw many people use creativity to great effect. When I started in that role in 2013, I think there were still many people in the business intelligence space who felt that any aesthetic component or artistic flair was inherently evil and to be avoided. By the time I left, this attitude had changed, and it seemed to me, at least, that people creating dashboards both for public consumption and for corporate reporting were feeling empowered to find those aesthetic elements that would take a piece of work to the next level.

We'll see how much longer that trend continues, and whether it will turn out to be a pendulum that swings a little too far.

Notes

  1. https://public.tableau.com/.
  2. https://dataremixed.com/2012/05/data-visualization-clarity-or-aesthetics/.
  3. https://www.kaggle.com/rojour/boston-results#marathon_results_2017.csv.
  4. https://www.amazon.com/Design-Everyday-Things-Revised-Expanded/dp/0465050654/.