CHAPTER 6
Data Literacy

We've seen the roles abstraction and iconicity play with both language and charts. Literacy is a further abstraction, a system for representing words and ideas to preserve thoughts over time. Yet literacy isn't one thing; it's many. The online version of Merriam-Webster’s Dictionary (n.d.) defines being literate in several ways, including the following:

  • 1a: educated or cultured
  • 1b: able to read and write, as in population literacy rates
  • 2c: having knowledge or competency, as in “computer literate”

When we intersect these definitions with data literacy, discussions tend to center somewhere around 1b (“reading and writing”) and 2c (“competency within a specific skill”). For a bit of added spice, some definitions come with a flavor of 1a (“cultured”) added. These slight variations create some confusion. In community discourse, we sometimes try to fold reading and writing (1b) and competence (2c) into the same idea without distinguishing between the terms.

These paradigms affect how we design, deploy, and use visualizations as well as our discourse around data visualization as a profession. Charts are a written language for data. Is data literacy about how to train ourselves and users on the language of charts (1b), or is it a broader paradigm that encompasses the entire ecosystem of data (1a and 2c)? The first represents an individual effort, one we'll call graphicacy in line with Alberto Cairo's (2019) definition of reading and creating charts. Graphicacy plays a key role in data literacy but represents a small part of the overall process.

When we look at the broader paradigm, data literacy becomes less about a specific task and more about navigating a system, much like health literacy. This process, shown in Figure 6.1, encompasses the following elements:

  • Inputs: How the data came to be in the first place
  • Storage: Whether it is stored in a flat (spreadsheet), relational (database), or array format (JSON)
  • Modelling and prep: The changes in shape and aggregation to allow for analysis
  • Defining: The meanings and limitations of the fields and individual members of data
  • Analyzing: The process of understanding the data in visual and aggregated formats
  • Output: The means by which the interpretation is shared, such as in visualizations, dashboards, or other data-informed compositions

Schematic illustration of data literacy circle

FIGURE 6.1 Data literacy circle

Inputs decide how the data is collected. Can users type in a field, select single variables, or check multiples in response to a question? For example, COVID cases may be reported as a date of onset, as a test result, or by presumption. This data is then stored in a spreadsheet, in a database where it's modelled dimensionally, or as a JSON file. A case may be defined differently by country or state: some allow reporting based on symptoms, while others require a test. Graphicacy comes into play with analyzing, when we're making charts and exploring relationships. Lastly, the final analysis is shared as some type of output, whether as a digital composition or printed on paper.
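To make the cycle concrete, here is a minimal sketch in Python that walks a handful of case records through storage, definition, and aggregation. The records, field names, and the basis categories are invented for illustration and do not come from any real reporting system; the point is only that the same inputs yield different counts once the definition of a case changes.

```python
import csv
import io
import json
from collections import Counter

# Hypothetical case records: every field name and value here is invented.
FLAT_CSV = """case_id,state,report_date,basis
1,A,2020-03-01,test
2,A,2020-03-01,symptoms
3,B,2020-03-02,test
4,B,2020-03-02,presumed
"""


def load_flat(text):
    """Storage (flat): parse spreadsheet-style rows into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def to_nested(rows):
    """Storage (JSON-style): regroup the same rows under each state."""
    nested = {}
    for row in rows:
        nested.setdefault(row["state"], []).append(
            {
                "case_id": int(row["case_id"]),
                "report_date": row["report_date"],
                "basis": row["basis"],
            }
        )
    return nested


def count_cases(rows, accepted_bases):
    """Defining and modelling: count cases per state under one case definition."""
    return Counter(row["state"] for row in rows if row["basis"] in accepted_bases)


rows = load_flat(FLAT_CSV)
nested_json = json.dumps(to_nested(rows), indent=2)

# Two definitions of a "case" applied to identical inputs.
test_only = count_cases(rows, {"test"})
broad = count_cases(rows, {"test", "symptoms", "presumed"})

print(dict(test_only))  # {'A': 1, 'B': 1}
print(dict(broad))      # {'A': 2, 'B': 2}
```

A chart built on `test_only` and one built on `broad` would look equally authoritative, which is why the defining step matters as much as the chart itself.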

Navigating Data Literacy

A broader definition significantly affects how we disperse data literacy skills and to what levels. It spells out what we, as a society, expect people to know to navigate a modern and highly abstracted world. Both computers and writing transformed how we navigate the world, the first within the last few decades and the second centuries ago. With COVID-19, we've seen that data visualization is becoming a primary means of understanding the world.

At the beginning of the pandemic, news organizations started reporting on a novel coronavirus in China. The virus quickly spread, with various Centers for Disease Control (CDC) departments and the World Health Organization (WHO) struggling to keep pace with reporting. Johns Hopkins University (JHU) also began tracking the numbers separately and showed markedly higher numbers (Dong et al., 2020). Beyond tracking the pandemic, the JHU dashboard also marked a turning point for data literacy.

Even the pandemic's early rallying cry, “Flatten the curve,” relies on a data-literate culture. Understanding the curve formed by cases over time relies on recognizing the pattern of data and what it represents: a longer, flatter hill means fewer cases at any one time, spread over a wider span of time. Data visualization is a primary means of understanding the COVID-19 pandemic.
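The shape behind the slogan can be sketched with a little arithmetic. The Python below builds two bell-shaped curves of daily cases, one steep and one flat. The totals, peak days, and spreads are made-up numbers, and holding the total equal across both curves is our simplifying assumption rather than an epidemiological claim; it isolates the one thing the chart asks readers to see, which is the height of the hill.

```python
import math


def epidemic_curve(total_cases, peak_day, spread, days):
    """Return daily case counts shaped like a bell curve that sum to total_cases."""
    weights = [
        math.exp(-((day - peak_day) ** 2) / (2 * spread ** 2))
        for day in range(days)
    ]
    scale = total_cases / sum(weights)
    return [weight * scale for weight in weights]


# Illustrative parameters only: same total, different spread over time.
steep = epidemic_curve(total_cases=10_000, peak_day=30, spread=5, days=200)
flat = epidemic_curve(total_cases=10_000, peak_day=60, spread=20, days=200)

print(round(sum(steep)), round(sum(flat)))  # both curves hold the same total
print(max(steep) > max(flat))               # True: the steep curve peaks higher
```

Reading the chart correctly means noticing that the two hills cover the same area while differing in their highest point, a comparison the numbers alone make far less obvious.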

The dashboard itself served as the backbone for news organizations reporting on the pandemic. If sites did not directly use the images, they rewrapped the visuals in a variety of formats and styles with JHU data as the source. Charts, more than words, told the story of the COVID-19 pandemic. Health departments have since made their own iterations of dashboards, expanding from the original focus of cases, deaths, and positivity rates to include vaccination data and other relevant information and calls to action (Patino, 2021).

This shift is no accident. The rise of graphicacy and broader data literacy intersects with the technology that makes it possible and the critical need to understand information in ways where current literacies fall short. Like reading and writing, data literacy must become mainstream to fully democratize information access. To understand the role graphicacy plays in data literacy, let's look at the impact of reading and writing.

The Impact of Writing

In the fifteenth century, Koreans exclusively used Hanja, Chinese characters repurposed for Korean sounds. While the yangban, or aristocratic elites, mastered the numerous borrowed ideographs to read and write, literacy remained out of reach to most social classes. With a population in which only the richest could read, King Sejong, the ruler of Joseon (modern-day Korea), faced a challenge: How do you make a language accessible to the masses?

Sejong knew he needed to capture the sounds of the Korean language. Alphabets are tricky. They attempt to codify the smallest units of sound—phonemes—and allow them to build up to syllables and words. Precision and utility are key. Hangul artistically combines the higher-level syllables and the lower-level phonemes, or individual sounds.

Enter Hangul, a writing system most sources attribute directly to the king that epitomizes functional aesthetics (Kim & diRende, 2014). It is an alphabet with distinct letters for individual phonemes. Unlike most alphabets, Hangul isn't written linearly. Rather, sounds are grouped into syllable blocks systematically, much like a syllabary. The larger units of sound can be scanned and memorized quickly. Figure 6.2 highlights this dual nature: the word Hangul is written with initial consonants (h, g) in dark green, vowels (a, u) in light yellow, and ending consonants (n, l) in blue. Note how the vowels affect the shape.

Schematic illustration of Hangul written to highlight letter and syllable features

FIGURE 6.2 Hangul written to highlight letter and syllable features
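The dual nature of letters and syllable blocks also shows up in how computers encode the script. As a small sketch, Python's standard `unicodedata` module can split each block of the word Hangul into its letters and put them back together. The helper names below are our own, and the Unicode character names it prints follow Unicode's naming conventions, which differ from the romanization used in this chapter.

```python
import unicodedata


def split_syllable_blocks(word):
    """Pair each Hangul syllable block with the letters (jamo) it is built from."""
    return [
        (block, list(unicodedata.normalize("NFD", block)))
        for block in word
    ]


def join_letters(letters):
    """Compose a sequence of jamo back into a single syllable block."""
    return unicodedata.normalize("NFC", "".join(letters))


blocks = split_syllable_blocks("한글")  # the word Hangul, two syllable blocks

for block, letters in blocks:
    # Each block holds an initial consonant, a vowel, and an ending consonant.
    print(block, [unicodedata.name(letter) for letter in letters])

first_block, first_letters = blocks[0]
print(join_letters(first_letters) == first_block)  # True
```

Each block decomposes into three letters in a fixed order of initial consonant, vowel, and ending consonant, which mirrors the color coding in Figure 6.2.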

Hangul is a lesson in functional aesthetics. The script contains featural elements with iconic roots—certain letters mirror key shapes in the mouth, shown on the right in Figure 6.3. These consonants use triangles, squares, and circles as base elements to formulate their design. Letter families use the same placement in the mouth but alter whether the sound is voiced, aspirated, or tensed in that position. For example, t and d have the same placement but differ in voicing. Hangul preserves these relationships and marks them with additive line changes. Figure 6.4 shows how the original consonants build from each other in design with obsolete letters marked in yellow. These basic letters expand systematically by adding a line for aspiration or by duplicating the base letter for tenseness (selected letter families shown in Figure 6.5).

Schematic illustration of featural elements

FIGURE 6.3 Featural elements

Schematic illustration of additive design

FIGURE 6.4 Additive design

From Pae (2018)

Schematic illustration of selected letters (base, aspirated, tense)

FIGURE 6.5 Selected letters (base, aspirated, tense)

Vowels have their own philosophy and harmony rules, using aggregation and yin and yang groupings (shown in Figure 6.6). These rules may help with spelling. Despite its simplicity, or maybe even because of it, the aristocratic class objected to the script and labeled it with contempt for the people most likely to use it: peasants and women. Yet Hangul survived, if precariously, and is celebrated in South Korea with a holiday.

Schematic illustration of Korean vowel harmony

FIGURE 6.6 Korean vowel harmony

Kim, Y., and diRende, S. (2014), Korean Hangeul: A New Kind of Beauty. Ecobook

As a writing system, Hangul is elegant. It supports reading by grouping sound families, systemizing letter creation, and clearly defining syllable boundaries. It works to make the task of reading sounds as transparent and efficient as possible. As we broaden our lens from writing to visualization, we hope to capture Hangul's balance of form and function.

Like King Sejong, we can work to make graphicacy attainable and legible, in addition to supporting the broader cycle of data literacy by clarifying inputs and definitions. Many of the debates we have today around data literacy share a common theme with historical discussions around the value of literacy. With visualization, we worry about who needs to learn to make graphics, how end users will understand what we present, and what levels of data literacy skills to disperse.

Visualizations are abstractions, relying on primary graphicacy skills to fully understand the composition. Dashboards, infographics, and data-driven news articles are rapidly maturing in exposition styles. They too are shifting from compositions where charts play an auxiliary part in communicating information to compositions that charts can drive.

Data Orality

Orality exists in spaces before literacy takes hold. The culture and cognitive tools center around using conversation and verbal expositions to learn. Data orality is where charts are supplemental to the exposition of data. Charts are not intuitively read. Instead, consumers rely on outside narration, expanded supplemental text, and numeracy to navigate what the visualization shows.

Literacy changes both the brain and culture. The brain recycles areas for common tasks into networked reading zones (Wolf & Stoodley, 2008). Areas dedicated to shape awareness also take on the task of recognizing the written word as notable and reading as a dedicated process. The process of learning to read takes years, requiring skills that go beyond recognizing the words on the page to understanding the broader concepts presented by the author (Wolf & Stoodley, 2008). For literate societies, reading eventually becomes a primary means of learning. Books not only educate but create a cultural backdrop and shorthand.

Socrates, the famed Greek philosopher, distrusted the idea of reading for learning (Plato, 1952). Rather, he found memorization and dialog pivotal for understanding, hence the Socratic method commonly used in schools. Writing may supplement, but the primary means of understanding relies on discourse, or orality. Orality isn't about the individual but the broader culture.

The societal impacts of literacy profoundly change exposition styles. Walter Ong (2012) studied the differences between cultures not exposed to writing in contrast to those with literacy as a bedrock institution. Cultures centered around orality—like the ebbing Greek oratory that Socrates cherished—rely on the ability to recall at hand. Works like Homer's Odyssey carry evidence of orality like meter, tempo, and proverbs rarely found after literacy took hold in Greek society. These systems allow recitation of the same story in slightly different patterns. An arc may be exposed slightly early, or a person's thoughts may be phrased differently depending on recitation, without impacting the overall work.

Figure 6.7 shows an exposition style favored by orality. Details form the crux of the exposition. Socrates himself often started his dialogs with an example and then proceeded to expand that example into themes, going back to details to create the setup for the next theme. Each theme added nuance to the story. Themes are drawn to a close to establish both rapport and context (the latter represented with light gray boxes in the figure).

Schematic illustration of orality exposition style

FIGURE 6.7 Orality exposition style

As we look at methods of presenting data, we see patterns that mirror the shift from orality to literacy as a primary means for comprehension. Early data compositions rely on other methods to explain the information shown in charts. Just as the printing press expanded access to literacy (Wolf & Stoodley, 2008), digital advancements have transformed data literacy. Interactivity, animation, and customization options enable a new way of exploring and understanding visualization. Just as early written compositions had oral residue, data visualizations indicate cultures centered around data orality.

Tables as details and experiences: Tables provide granular details and allow consumers to build trust through their own direct experiences. Figure 6.8 shows a tabular version of antibiotic data visualized by designer Will Burtin. The lowest number indicates the most effective antibiotic. Can you find the pattern? Try highlighting.

Schematic illustration of Burtin's antibiotic data

FIGURE 6.8 Burtin's antibiotic data

As with Socrates, oral data cultures are not yet ready to trust charts as a primary means of sensing patterns. Rather, finding these patterns by seeing the data and manually highlighting and reordering it allows the insight to be trusted. Tables such as the one in Figure 6.8 provide the example first and set the tone for understanding any auxiliary charts. Medium also plays a role: static visualizations historically required an exposition more in line with oral styles. We see this trend changing with more newspapers including charts as the primary driver of a story and using text, rather than tables, to clarify.

Outside guidance for themes and context: Works such as Homer's Odyssey and religious texts are often hard for literate societies to navigate as they contain oral residue. Their exposition styles and pacing are unfamiliar. Oral data cultures design with the visualization serving as a supplement. These compositions may provide a variety of graphics on a dashboard or only a single visualization. Ensembles supplement a greater whole that exists outside the dashboard rather than serving as cohesive compositions. Figure 6.9 shows an example of data orality within a dashboard.

Schematic illustration of dashboard with residues of data orality

FIGURE 6.9 Dashboard with residues of data orality

There are four parts to the dashboard shown in Figure 6.9. Each part can exist independent of the others. Someone using this dashboard can easily cut segments and put them into external presentations and documents. Rather than this dashboard being used as a whole unit, the exposition can be rearranged and trimmed with minimal impact to its greater context and meaning.

Technologies like paper and the printing press democratized writing access and literacy (Wolf & Stoodley, 2008). Data visualization software shifted from specialized departments and IT to broader business and academic users. Online tools like Datawrapper allow nearly anyone to create charts rapidly and easily. New mediums require different exposition styles.

Changing Exposition Styles

COVID-19 is rapidly altering how we read and interpret charts. We expect charts to work together, clarify one another, and align to a particular thesis. Essays provide a powerful metaphor for understanding current expectations around data visualizations. Like essay writing, visualizations require anticipating questions and exposing information in a clear manner.

Figure 6.10 shows how literate cultures expose information. Framing context at the beginning concisely prepares consumers for what to expect (shown in light rectangles). Themes are explored at a high level and then exposed in a stair-step manner, with details used as supports; it is the inverse of what orality prizes.

Schematic illustration of literate culture exposition

FIGURE 6.10 Literate culture exposition

Figure 6.10 mirrors the essay writing that gets taught as early as elementary school. Written expositions aren't refined through conversation with others, but with ourselves. A literate culture relies on the reader to personalize the information and the author to provide enough information and clarity to do so. It's a different thought process, one that rewards unique phrasings and the ability to create clearly resonant themes in advance. We bring these models to other literacy paradigms.

Literacy and numeracy work together. Numbers were the first thing we documented, using tallies on various clay tokens and stones to track quantities (Pae, 2020). As these accounting systems matured, they started to incorporate pictures in addition to tallies. These tracking systems set the stage for early literacy development. Interpreting and creating charts relies on both numeracy (the first tier) and literacy (the second tier) to fully grasp what the chart represents. Figure 6.11 shows how graphicacy is a third-tier skill, after numeracy and literacy.

Schematic illustration of graphicacy as a third-tier skill

FIGURE 6.11 Graphicacy as a third-tier skill

Data Literacy Democratization

Moving from orality to literacy requires democratizing access, much like Hangul made reading Korean easier. Charts, too, are an abstraction designed to be read. It's a cultural shift, one that creates shared expectations. Essay writing follows a formula, and it's one that children learn as early as third grade in the United States. Graphicacy has progressed and matured, initially starting with more annotation and progressively reducing it as comfort increased. Look at some of the historical charts in Chapter 3 and you'll see more annotation and lines than we typically provide now.

Democratizing access to visualization requires both an individual shift in abilities (learning to read and write charts) and a sociological shift in communication patterns (from orality to literacy). Practitioner, writer, and Columbia University teacher Allen Hillery (2020) is one of many voices calling for apprenticeships to make data proficiency far more equitable. Underserved communities are often left without resources for advancement. College degree attainment has a direct correlation with parent income: those at the lowest income levels face a vast number of navigational issues in addition to cost. Hillery also shows how making access more equitable helps the final products: they're more likely to address the needs of a wider user base.

Sarah Nell-Rodriguez, an educator and founder of Be Data Lit (https://bedatalit.com/educating-organizations-on-data-literacy/), takes a similar approach with an emphasis on reskilling. In the aftermath of COVID-19, workers faced immense challenges in an economy where so much has moved online. Nell-Rodriguez proposed a modified version of Bloom's theory when it comes to data literacy. Within this paradigm, literacy can be measured as a quantifiable skill. Credentialing of both practitioners and end users serves as a powerful means for pushing the group as a whole. Users have more comfort with more charts and practitioners gain more freedom in the range of charts that can be used.

Numerous software platforms have signed on to this view of data literacy. Qlik has partnered with various entities to create and sponsor the Data Literacy Project (https://thedataliteracyproject.org), featuring leaders across toolsets. It focuses on providing training on visualization with some data shaping skills. Tableau incorporates “data culture” into its literacy approach as a means of democratizing access and knitting data into decision making. Everyone from the top down at a company is encouraged to use data. Most of these approaches overlap with the idea of data literacy as chart-reading and creation, or the reading and writing definition of literacy (1b, per Merriam-Webster). Some embed ideas of being cultured (1a) in addition to composition.

True democratization of data literacy takes into account the entire ecosystem of data. It recognizes the proliferation of charts in our daily lives and works to make them intelligible broadly. Interactive pieces can build in additional clarifiers through details on demand or by teaching the end user how to read the chart.

Defining data literacy beyond graphicacy means it's up to us, the practitioners, to build tools that keep our users from falling off a cliff. It requires understanding our users' culture and proficiencies in charts and working to meet them where they are, while providing tools to understand novel ideas. Beyond training the users, we act as interpreters and provide both linguistic and cultural clarification around the entire cycle of data. It means we can bring a literary craft to our work when the time calls but that we must also be sensitive to users who seek a CliffsNotes guide to our work. The systemic paradigm pushes visualization toward the role of a profession, one that encompasses taking ownership of the message we present with data and acculturating into shared norms.

Summary

Technology and COVID-19 rapidly accelerated the need to understand the abstract language of charts. Yet, data literacy doesn't stop at reading and writing charts but encompasses a broader ecosystem. Data orality exists before data literacy. Without established graphicacy skills, consumers rely on tools beyond the chart to parse the information. Democratizing access relies on recognizing the fundamental need to make data literacy accessible and culturally transformative.

As Steve Jobs said, “When you're a carpenter making a beautiful chest of drawers, you're not going to use a piece of plywood on the back, even though it faces the wall and nobody will ever see it. You'll know it's there, so you're going to use a beautiful piece of wood on the back. For you to sleep well at night, the aesthetic, the quality, has to be carried all the way through.” The next chapter explores how we carry data literacy through data preparation, a task that is often viewed as “behind-the-scenes.”
