Chapter 5

Labeling and visual attributes

Abstract

Network visualizations come to life when they are combined with other data describing the vertices and edges. NodeXL supports vertex and edge labels. Three types of vertex labels can be used including: (1) adjacent labels that appear next to vertices on the graph, (2) shape labels that replace the vertex with the label, and (3) tooltip labels that only appear when mousing over a vertex. Label position, color, font type, etc. can be customized using label options. Labeling best practices are provided. Many visual properties can be mapped onto vertices including color, shape (including labels and images), size, and opacity. Visual properties for edges include color, width, style, and opacity. Features such as Autofill Columns and built-in Excel formulas can be used to enter data into the Visual Properties fields automatically. Best practices for creating meaningful network visualizations are provided. This chapter illustrates these principles using the ABCD Network.

Keywords

Labeling; Visual attributes; Label position; Tooltip; Visual properties; Width; Style; Color; Opacity; Visibility; Autofill columns; Excel formulas

5.1 Introduction

Network data is often accompanied by data or textual information that describes each vertex and edge. Attribute data describing each vertex in a network is often available, particularly for social media datasets. Vertex data often includes information about usernames, profile information (location, biography, hometown), number of friends/followers, number of posts, account creation date, etc. Data about each edge can also include information about the type of edge (reply-to; mention; follow), content of a message (tweet; email content), number of messages, etc. Additional data about each edge and vertex can come from calculated network metrics (Chapter 6), group properties (Chapter 7), or textual content (Chapter 8) as described in later chapters. Numerous insights can be revealed by integrating vertex and edge metadata into network visualizations through the use of labels and visual attributes such as vertex size, color, and opacity or edge width. NodeXL provides a variety of options to add custom labels to vertices and edges, as well as change visual attributes of vertices and edges. Network analysts become artists and communicators as they create custom visualizations aimed at accurately and effectively representing underlying data, while also inspiring viewers.

5.2 Labeling

Labeling vertices and edges is essential to creating readable graphs. Advanced topic: Labeling best practices shows some best practices for effectively using labels.

Advanced topic

Labeling best practices

• Edge labels are very difficult to read and should only be used on small graphs with few edge crossings. Instead, use visual properties, such as color, width, and opacity of edges to differentiate between them.
• Use vertex labels selectively for large graphs. Including a label for each vertex becomes problematic when there are over 50 or so vertices. Label only the most important vertices, or the ones that you are mentioning in related text.
• When labels are essential and the network is not too large, consider making the label into the vertex by using the Shape > Label in visual properties (see Section 5.2.5). This creates a rectangular box around the label to which the edges connect, making the label’s connection to the vertex unmistakable (see Figure 5.6).
• Consider using images in place of labels for vertices where an image can more efficiently convey content (see Section 5.3.2). For example, many Twitter images have logos that convey a company more efficiently than a long company name.
• Additional information, such as demographic information or network metric values, can be added to labels. For example, label a vertex with “Dave (12)” to indicate a name and age.
• Truncate (i.e., shorten) labels when needed since long labels can hide edges and clutter the graph. Just make sure the labels are unique after truncation to avoid confusing one vertex for another.
• Use tooltips (see Section 5.2.3) and custom right-click menus (see Section 5.3.9) to provide details on demand. For example, use a person's name for the label, but have their age show up when the cursor hovers over the vertex. Or link to their user profile via a custom URL that appears when you right-click on the vertex.
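
The advice above about keeping truncated labels unique can be checked mechanically before the labels go into the worksheet. The following Python sketch is illustrative only and is not part of NodeXL; the function name and the sample name list (including "Cameron") are hypothetical.

```python
from collections import defaultdict

def truncation_collisions(names, length):
    """Group names by their truncated label and return only the clashing groups."""
    groups = defaultdict(list)
    for name in names:
        groups[name[:length]].append(name)
    # Keep only labels that would be shared by more than one vertex.
    return {label: members for label, members in groups.items() if len(members) > 1}

# Hypothetical vertex names; "Cameron" is invented to force a clash.
sample_names = ["Camila", "Cameron", "Ava", "Ben"]
collisions = truncation_collisions(sample_names, 3)
print(collisions)
```

If the returned dictionary is not empty, a longer truncation length (or a hand-edited label) would be needed for the names it lists.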

5.2.1 Viewing attribute data in the ABCD network file

This chapter relies on the ABCD network file you created in the last chapter (you can also download it from https://www.smrfoundation.org/nodexl/teaching-with-nodexl/teaching-resources/). Instructions for downloading the file and creating a Trusted Location for it in Windows were provided at the end of Chapter 4. This data file includes additional attribute data describing the edges and vertices, which will be mapped to labels and visual attributes. If you’d rather enter the data manually, the data values are available in Figures 5.1 and 5.2.

Both the Edges (Figure 5.1) and Vertices (Figure 5.2) worksheets have an Other Columns section where columns of data about edges or vertices can be added. Data entered immediately to the right of the rightmost column in the table becomes part of the table, as indicated by the column header turning blue. This is important because only columns within the table are available to other parts of NodeXL that will be introduced later. Any type of data can be entered in these fields, including textual, numerical, and date/time data. Notice that the Years_at_ABCD column includes numerical data, while the other fields are textual.

5.2.2 Labeling vertices

Navigate to the Vertices worksheet of the ABCD network file. Scroll over to the Labels columns. Right-click on the first cell in the Label column, choose Format Cells…, and change it from Text to General. Then enter = [Vertex] into the cell and press Enter so that it references the names found in the Vertex column (i.e., Column A) on the Vertices worksheet. Excel should copy this formula down the complete column so that all rows in the Label column have the same formula. Refresh Graph using the Harel-Koren layout to see the labels appear on the graph, as shown in Figure 5.3. By default, labels show up underneath the vertex. However, this can be adjusted for each vertex using the Label Position column.

Figure 5.3 Sample ABCD network vertices with labels applied and a tooltip shown for Camila.

Excel includes many textual formulas that can be used to help adjust labels. For example, Advanced topic: Useful Excel formulas for labeling explains formulas that can combine data fields into a single label or shorten long text fields. If you are not using formulas to populate the Label column, you could copy-and-paste values, type them in manually, or use Autofill Columns (see Advanced topic: Using autofill columns). The Autofill Columns feature is a core feature in NodeXL that will save considerable time and be used throughout most of the chapters in this book.

Advanced topic

Useful Excel formulas for labeling

One of the great benefits of NodeXL is that it can leverage all of the built-in features of Excel, such as formulas. Below are a few examples of useful formulas for creating meaningful labels. Make sure you reformat the column as General instead of Text before using these formulas. These formulas are based on the data in the ABCD network.

• Shorten names by using = LEFT([Vertex],3). This truncates (i.e., shortens) the names to 3 letters, or whatever number you place at the end of the formula, so that Camila becomes Cam. To visually indicate which names were truncated try using the following formula, which adds an ellipsis after only those that are truncated: = IF(LEN([Vertex]) > 3,LEFT([Vertex],3)&"…", [Vertex]). Camila now becomes Cam… while Ava remains Ava.
• Combine data from different columns by using the & symbol along with quotes to indicate text. For example, =[Vertex]&" "&"("&[Role]&")" results in Ava (Manager), since she has Manager in the Role column.
• Add a * to highlight specific names using = IF. For example, to star all of the names with over 10 years working at ABCD use: = IF([Years_at_ABCD] > 10,[Vertex]&"*",[Vertex]), which will result in Ava and Camila*.
• Other useful formulas to look into include Clean, Trim, Upper, Proper, Lower, Len, Iferror, and the use of "" and & in text strings (as in the prior bullet).
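
For readers who prepare labels outside Excel, the logic of the three formulas above can be mirrored in a few lines of Python. This is a minimal sketch rather than anything NodeXL provides; the function names are hypothetical, and the tenure value passed in the usage lines is a made-up number, not a value read from the ABCD file.

```python
def truncate_label(name, length=3):
    """Mirror of = IF(LEN([Vertex]) > 3, LEFT([Vertex],3)&"…", [Vertex])."""
    return name[:length] + "…" if len(name) > length else name

def label_with_role(name, role):
    """Mirror of =[Vertex]&" "&"("&[Role]&")"."""
    return f"{name} ({role})"

def star_long_tenure(name, years, threshold=10):
    """Mirror of = IF([Years_at_ABCD] > 10, [Vertex]&"*", [Vertex])."""
    return name + "*" if years > threshold else name

print(truncate_label("Camila"))          # truncated, so an ellipsis is appended
print(truncate_label("Ava"))             # short enough, left unchanged
print(label_with_role("Ava", "Manager"))
print(star_long_tenure("Camila", 12))    # 12 is a hypothetical tenure value
```

As in the Excel versions, the ellipsis and the star are only added when the condition holds, so unaffected names pass through unchanged.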

5.2.3 Adding tooltips

To reduce information clutter, some information can be displayed only when the mouse is placed over a specific vertex. This is called a tooltip. For example, in Figure 5.3, the cursor was placed over Camila’s vertex, which prompted her role (Manager) to appear nearby. To add a tooltip, populate the Tooltip column on the Vertices worksheet. This can be done using copy-and-paste, manually, via formulas, or using the Autofill Columns feature as described in Advanced topic: Using autofill columns.

Advanced topic

Using autofill columns

The Autofill Columns feature of NodeXL allows you to automatically populate (or delete) information in the Edges, Vertices, and Groups worksheets. To use the feature, click on the Autofill Columns icon in the Visual Properties section of the NodeXL Ribbon, which opens a new dialog as shown in Figure 5.4. Notice there are three tabs at the top, each of which corresponds to a different worksheet in the NodeXL workbook. On the left of this dialog is a column of all the visual properties, labeling, and positioning attributes. In the second column, choose the field with the data that you want to use to govern this attribute from the drop-down menu. For example, in Figure 5.4 choose Role and click Autofill. This will populate the Vertex Tooltip column of the Vertices worksheet with the data in the Role field and refresh the graph. The Options arrow next to each field allows you to clear the associated worksheet column, or change other properties related to the mapping that are discussed later in this chapter.

Figure 5.4 Autofill Columns dialog showing the Vertices tab. The Vertex Tooltip is mapped to the Role data column.

5.2.4 Formatting, positioning, and truncating labels using label options

Several additional visual attributes of labels can be modified through the Label Options dialog. Click on the Graph Options button on the NodeXL Graph Pane (highlighted in Figure 5.5) to open the Graph Options dialog. Navigate to the Other tab, and click on the Labels… button also highlighted in Figure 5.5. This will open the Label Options dialog shown at the bottom of Figure 5.5 where you can automatically truncate labels, change the font and textual properties of labels, set the default position of labels in comparison to the vertices, and more. These can be adjusted for edge, vertex, or group box labels (discussed in Chapter 7).

Figure 5.5 Label Options dialog (bottom) and pathway to open it (buttons highlighted in yellow).

Advanced topic

Visual property best practices

Not all visual properties are created equal. The human eye perceives some more easily than others. Understanding when to use a specific property or when to combine properties is a mix of science and art. Below are some best practices to keep in mind:

• Avoid using too many visual properties at once, particularly if they all map to different data. Instead, create multiple graphs of the same data and place them side-by-side (with the same layout) for easy comparison.
• Combine shape and color to emphasize distinct categories of vertices. For example, use orange disks for teachers and blue solid squares for students.
• Combine edge width and opacity on weighted edges, making sure the minimum and maximum widths and opacity ranges are set so they are all visible.
• Avoid combining color and opacity, since colors will look different when their opacity changes.
• Use size (or width) for numerical data, since humans can more accurately compare different sizes (or widths) than different gradations of color or opacity.
• Use solid shapes (e.g., disk) by default. Use outline shapes (e.g., circle) only if there are many vertices that are occluded (i.e., hidden by other vertices).
• Avoid using color if the image is likely to be printed in black and white.
• Use colors that those who are colorblind can differentiate.

5.2.5 Label vertex shape

To increase readability, it is often useful to turn the vertex into a label rather than the default disk (i.e., filled-in circle). To do this, navigate to the Shape column on the Vertices worksheet. Place the cursor inside of cell C3 and a drop-down menu option will appear next to the cell on the right. Select the drop-down menu and scroll down to choose Label (see Figure 5.6). Copy this down for all vertices and click Refresh Graph, which will show an updated graph like the one shown in Figure 5.6. When the Shape is set to Label, other visual properties such as color and size still apply to the vertex (see Section 5.3). The background color of the box surrounding the label can be set differently for each vertex using the Label Fill Color column.

Figure 5.6 Results of setting the Vertex Shape to the Label option.

5.2.6 Labeling edges

Edges can also be labeled, though this is less common than labeling vertices, because edge labels are difficult to read on most networks. Typically, other visual properties, such as width or color, can represent the value or type of an edge more effectively than edge labels. However, when data is qualitative or unique, and the network size is small, edge labels can be useful. Adding label text to the Label column on the Edges worksheet is similar to adding it to the Vertices worksheet. You can also customize the color and size of the edge label by entering data into the Label Text Color and Label Font Size columns on the Edges worksheet.

5.3 Visual properties

NodeXL is a sophisticated and flexible network visualization tool, allowing you to map many types of data to a variety of visual properties of a network graph. For example, the color of a vertex may be based on demographic data such as gender or age. Or the size of a vertex may be based on a network metric such as Degree or Betweenness Centrality (see Chapter 6). A combination of different visual attributes can be used to help draw attention to different details. For best practices related to visual properties see Advanced topic: Visual property best practices.

The Vertices worksheet includes a set of columns grouped under Visual Attributes including Color, Shape, Size, Opacity, Image File, and Visibility. Figure 5.7 shows the many visual attributes that can be applied to vertices. Values for each visual attribute can be typed into the spreadsheet manually, populated via a formula, selected from a drop-down that shows up when the cursor is inside of a cell (e.g., the Shape column), selected from the Visual Properties menu ribbon items, or automatically filled in based on the Autofill Columns feature (see Advanced topic: Using autofill columns). Some effects, such as Glow, Drop Shadow, and Selected color are determined in the Graph Options dialog (see Advanced topic: Graph options).

Figure 5.7 Vertex visual property examples.

5.3.1 Vertex color

To make the ABCD network graph more visually meaningful, change the color of the male students to blue and the female students to a custom color. To set Ben’s color, type Blue into the Color column on Ben’s row. You can type in any color from the 140 Cascading Style Sheet color names (a Google search will list them for you). Alternatively, you can choose a color from the color picker available in the NodeXL menu ribbon under the Visual Properties section (see highlighted menu button in Figure 5.8). If you choose Define Custom Colors and pick your own color, the spreadsheet cell will show the color’s three red-green-blue (RGB) values, such as 230, 101, 6 (see Figure 5.8). Choose Refresh Graph to see the changes.

Figure 5.8 Changing vertex color using the color picker accessible via the menu ribbon.

Rather than manually entering colors, you could write an = IF() formula that sets the color in the Color column based on data in the Gender column. This is much faster than manually entering the data, particularly as datasets grow beyond a few dozen vertices. Enter the following formula = IF([Gender] = "Male", "Blue", "230, 101, 6") and copy it to each of the cells in the Color column. Click Refresh Graph to see the changes take effect.

An alternative method to set the vertex color is the Autofill Columns feature (see Advanced topic: Using autofill columns). The Vertex Color Options dialog lets you choose between two types of data: Categories or Numbers. Categorical data has distinct categories, such as the Gender column that includes the categories of Male and Female. If you choose this option you cannot choose the specific colors that are chosen by NodeXL for each category, so using a formula does give you more control than this approach. Alternatively, numerical data can be used. If chosen, the raw numerical data (e.g., the Years_at_ABCD column) maps to a variety of colors that blend two colors selected by the user in the Vertex Color Options dialog.
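
To make the idea of blending two colors concrete, the sketch below linearly interpolates between two RGB colors according to where a value falls in its range. It is an assumption about how such a gradient can be computed, not a description of NodeXL's internal algorithm; the function name and the two endpoint colors are hypothetical, while the range of 1 to 29 years comes from the Years_at_ABCD data discussed in this chapter.

```python
def blend_color(value, lo, hi, color_lo, color_hi):
    """Linearly interpolate between two RGB tuples for a value in [lo, hi]."""
    if hi == lo:
        t = 0.0  # degenerate range: fall back to the low color
    else:
        t = (value - lo) / (hi - lo)
    t = min(1.0, max(0.0, t))  # clamp values outside the range
    return tuple(round(a + t * (b - a)) for a, b in zip(color_lo, color_hi))

# Hypothetical endpoint colors: black for the fewest years, orange for the most.
LOW_COLOR = (0, 0, 0)
HIGH_COLOR = (200, 100, 0)

for years in (1, 15, 29):
    print(years, blend_color(years, 1, 29, LOW_COLOR, HIGH_COLOR))
```

The resulting tuples could be written into the Color column in the same "R, G, B" form that the color picker produces.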

5.3.2 Vertex shape

The Vertex Shape column was first introduced in Section 5.2.5, when we set the Shape of each Vertex to Label. A variety of additional vertex shapes are available: solid shapes (Disk, Solid Square, Solid Diamond, and Solid Triangle), outline shapes (Circle, Square, Diamond, Triangle), and others (Sphere, Label, and Image). The Image shape only works if the Image File field is populated with a valid path name to a file on your computer (e.g., C:\MyImages\Image.jpg) or a URL (e.g., http://www.somesite.com/Image.jpg). Some NodeXL network data importers, such as the Twitter importers, download user images and automatically populate the Image File field so that profile images can be used to represent each vertex. If the URLs become broken links at a later time, a default image with a red X will be shown.

If you have different types of vertices (e.g., students and faculty; wiki pages and wiki editors), you may want to use shape to differentiate between them. This can be done using formulas for categorical data. For numerical data, the Autofill Columns feature can be used to identify shapes automatically based on specific values (e.g., data that is greater than 10 will be a Solid Square, otherwise it will be a Disk).
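
Both kinds of shape assignment described above reduce to simple rules, sketched here in Python for illustration. The function names, the lookup table, and the default fallback are hypothetical choices; only the shape names, the student/faculty example, and the threshold of 10 come from the text.

```python
# Hypothetical lookup from a vertex type to a NodeXL shape name.
SHAPE_BY_TYPE = {"Student": "Disk", "Faculty": "Solid Square"}

def shape_for_category(vertex_type, default="Disk"):
    """Categorical data: look the shape up, falling back to a default."""
    return SHAPE_BY_TYPE.get(vertex_type, default)

def shape_for_number(value, threshold=10):
    """Numerical data: values above the threshold get a Solid Square, others a Disk."""
    return "Solid Square" if value > threshold else "Disk"

print(shape_for_category("Faculty"))
print(shape_for_number(12))
print(shape_for_number(10))
```

Note that a value exactly equal to the threshold stays a Disk, matching the "greater than 10" wording.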

5.3.3 Vertex size

Similar approaches can be used to fill in the data for the Vertex Size column. When working with numerical data, such as the data in the Years_at_ABCD column, it is often useful to use the Autofill Columns feature of NodeXL to map the raw data onto the visual properties (e.g., Size). Open the Autofill Columns dialog, choose Years_at_ABCD from the drop-down menu next to Vertex Size, and then open the Vertex Size Options dialog as shown in Figure 5.9. The options dialog allows you to change details about the mapping of the raw data onto the visual property data. For example, as shown in Figure 5.9, the Vertex Size Options dialog allows you to change the minimum and maximum size of the vertex. Change the maximum vertex size to 50 to increase the difference in sizes between the vertices.

Figure 5.9 Using Autofill Columns to set the vertex size based on Years_at_ABCD data.

By default, a linear mapping is used. For example, Fay has the most years at ABCD (29) and Liu and Matt have the fewest (1). Notice that in the Size column, Fay has the maximum size (50) and Matt has the minimum size (1.5). All other employees are assigned Size values between these extreme values based on a linear mapping. This works well for this network, but for other networks you may want to choose the Ignore Outliers and/or Use a logarithmic mapping options on the Vertex Size Options dialog (see Figure 5.9). Outliers are identified as values that are at least one standard deviation above or below the average value of the raw data. Ignoring them will still include the vertex in the graph, but will not include the vertex’s value when calculating the value of the other vertices. Using a logarithmic mapping is useful when the raw data follows a logarithmic or power-law distribution, which is common in social media participation data (e.g., number of followers or posts). More advanced mappings can be performed using Excel formulas that populate the vertex property field (e.g., Size) based on the raw data field (e.g., Years_at_ABCD).
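
The difference between the linear and logarithmic mappings can be seen in a short calculation. The Python sketch below rescales a raw value into the size range 1.5 to 50 used in this section, with the raw range of 1 to 29 years. The formulas are a plausible reading of "linear" and "logarithmic" mapping, assumed for illustration rather than taken from NodeXL's source, and the function names are hypothetical.

```python
import math

def map_linear(value, src_min, src_max, dst_min, dst_max):
    """Rescale value from [src_min, src_max] to [dst_min, dst_max] proportionally."""
    if src_max == src_min:
        return dst_min  # degenerate range: every vertex gets the minimum
    t = (value - src_min) / (src_max - src_min)
    return dst_min + t * (dst_max - dst_min)

def map_log(value, src_min, src_max, dst_min, dst_max):
    """Apply the same rescaling to the logarithms; all inputs must be positive."""
    return map_linear(math.log(value), math.log(src_min), math.log(src_max),
                      dst_min, dst_max)

for years in (1, 15, 29):
    linear = map_linear(years, 1, 29, 1.5, 50)
    logarithmic = map_log(years, 1, 29, 1.5, 50)
    print(years, round(linear, 2), round(logarithmic, 2))
```

Both mappings agree at the two extremes, but the logarithmic one gives mid-range values a larger size, which spreads out the many small values typical of skewed data.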

5.3.4 Vertex opacity

Vertex Opacity determines the level of transparency (i.e., how see-through) for each vertex. Values can be between 0 (fully transparent) and 100 (fully opaque). The default value is 100. The Autofill Columns options allow you to determine the minimum and maximum value, similar to the Vertex Size Options dialog shown in Figure 5.9.

5.3.5 Vertex visibility

When working with large networks, it is often useful to filter out some vertices so they do not show up in the network. The Visibility column allows you to do so without deleting the information from the Excel spreadsheet. There are four options available:

• Show if in an Edge displays the vertex on the graph if the vertex is connected to another vertex by at least one edge. Otherwise, the vertex row is ignored. This is the default.
• Skip ignores the vertex row and any edges connected to it. It is as if the data is not in the spreadsheet, so graph metrics (see Chapter 6), groups (see Chapter 7), and the graph itself will not use the data present in any “skip” row.
• Hide includes the vertex in calculations for graph metrics and groups, and even uses it to determine the positioning of other vertices in the graph, but does not display it. This is equivalent to setting its opacity (and the opacity of any edges associated with it) to 0.
• Show ensures that the vertex is always included, even if it has no edges connected to it.
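
The four Visibility options differ along two dimensions: whether the vertex takes part in calculations and whether it is drawn. The Python sketch below encodes that reading of the descriptions above as a small decision function. It is an interpretation for illustration only; the function name and the returned tuple layout are hypothetical.

```python
def vertex_status(visibility, has_edge):
    """Return (used_in_calculations, drawn) for one vertex row."""
    if visibility == "Skip":
        return (False, False)   # treated as if the row were absent
    if visibility == "Hide":
        return (True, False)    # counted in metrics and layout, but not displayed
    if visibility == "Show":
        return (True, True)     # always included, even with no edges
    # Default, "Show if in an Edge": the row is used only if it has an edge.
    return (has_edge, has_edge)

for option in ("Show if in an Edge", "Skip", "Hide", "Show"):
    print(option, vertex_status(option, has_edge=False))
```

Running the loop for an isolated vertex shows that only Show keeps it on the graph, while Hide keeps it in the calculations without drawing it.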

5.3.6 Edge visual properties

The Visual Properties columns on the Edges worksheet are slightly different, but work in a similar manner to the Visual Property columns on the Vertices worksheet. Figure 5.10 presents the many edge visual properties available in NodeXL. Color and Opacity work the same way as the corresponding vertex attributes. Style changes the type of line (Solid, Dash, Dot, Dash Dot, and Dash Dot Dot) and is comparable to the Shape column for vertices. It is best used when working with categorical data. Visually, different styles are difficult to differentiate in large networks, so coupling style with distinct colors is often useful. Width determines how wide the edge is and is most comparable to the Size vertex property. The Visibility column affects the visibility of edges and can be set to Show (always show, no matter what), Skip (act as if the edge does not even exist in the dataset), or Hide (do not display on the graph, but otherwise treat it as if it is present). See Chapter 7 for more examples of using the Visibility column to filter out edges or vertices. Additionally, the Graph Options allow you to create Curved edges and Bundled edges (see Advanced topic: Graph options).

Figure 5.10 Edge visual property examples.

Combining Width and Opacity when using numerical data can make differences between edges more distinct. Use the Autofill Columns feature to set the edge Width and Opacity based on the Shared_Connections column as shown in Figure 5.11. This column represents the number of shared friends that each pair of people has. Change the minimum edge opacity to 50 as shown in Figure 5.11. Also change the Edge Width Options to have a minimum of 1.5 and a maximum of 5. This will ensure that each edge is visible, but not too wide. After clicking Autofill, the graph should look similar to the one shown in Figure 5.11.

Figure 5.11 Using Autofill Columns to set the Edge Width and Edge Opacity based on Shared_Connections data on the Edges worksheet.

Advanced topic

Exporting and importing NodeXL options

It can take considerable time to customize graphs so they look the way you desire. NodeXL allows you to import and export options settings (e.g., visual properties, labels, default settings), so you can use the same ones in different workbooks. To do so, find the Options portion of the NodeXL Ribbon. Clicking on Export lets you save the options settings for the current workbook to a .NodeXLOptions file with a name of your choice. Choosing Import allows you to import such options into a workbook. You can also choose Use Current for New, which will use the current workbook's options for all new NodeXL workbooks. Reset All will reset all options to the original defaults.

NodeXL has “recipes.”

5.3.7 Showing the graph legend

A graph legend can be included at the bottom of the image, as is done in Figure 5.12. To view the legend, check the Legend item in the Graph Elements drop-down menu found in the Show/Hide section of the NodeXL Ribbon (see Figure 5.12). Notice that color is not shown in the legend. This is because a formula was used instead of the Autofill Columns feature.

5.3.8 Saving graph images and right-click graph menu

Right-clicking on the graph pane, or a specific vertex in the graph pane, will open up a customized menu as shown on the left-hand side of Figure 5.12. Menu items allow you to select and deselect subsets of vertices (e.g., adjacent vertices, or those that are connected to the selected vertex), edit the visual properties of selected edges or vertices, and adjust the layout.

To save a graph (and legend if you have one showing), choose the Save Image to File option in the menu (as shown in Figure 5.12). You can modify the Image Options, which allows you to change the size of the graph pane in the created image, as well as add or remove a custom header and footer. When you choose Save Image…, you will be prompted to choose a location and image file type. If you plan on printing the image, you may want to export it as an XPS file, which is a vector file format that can be scaled up or down to any size. The other file types are all pixel-based and will not scale well but may be well suited to web and small print contexts.

5.3.9 Graph options

Options used in a current file can be shared across workbooks as described in the Advanced Topic: Exporting and importing NodeXL options.

NodeXL allows you to customize many aspects of the graph pane through the use of the Graph Options dialog available on the menu at the top of the graph pane. There are three tabs in the dialog, each of which is described below.

• Edges tab. Default edge color, size, arrow size, and opacity can be set. Additionally, edges can be made so they curve or are “bundled” (i.e., clustered together when many point to the same node), though bundling edges can slow down graph layouts considerably.
• Vertices tab. Default color, shape, size, size of images, and drop shadow and glow effects can be set (see Figure 5.7). Glow effects slow down graph layouts, but can look nice when using a dark background color (see next bullet).
• Other tab. Background color or a background image can be set. For example, an image of a geographical map can be set as the background and nodes can be overlaid on top. Labels can be customized as described in Section 5.2.4. Additionally, custom right-click menu items can be added. For example, a URL can be provided along with a tip, so that the URL opens when the item is selected in the menu.

5.4 Practitioner's summary

Network visualizations come to life when they are combined with other data describing the vertices and edges. NodeXL supports vertex and edge labels. Three types of vertex labels can be used including: (1) adjacent labels that appear next to vertices on the graph, (2) shape labels that replace the vertex with the label, and (3) tooltip labels that only appear when mousing over a vertex. Label position, color, font type, etc., can be customized using label options. Many visual properties can be mapped onto vertices including color, shape (including labels and images), size, and opacity. Visual properties for edges include color, width, style, and opacity. Features such as Autofill Columns and built-in Excel formulas can be used to enter data into the Visual Properties fields automatically.

5.5 Researcher’s agenda

Creating network visualizations that help people gain insights from networks, particularly large and complex networks, is an active area of research. There is a long history of research on information visualization that identifies the visual properties (e.g., color, distance, size) that humans are most (or least) adept at understanding [1]. Most network visualization tools now allow attribute data to be mapped onto visualized attributes such as size, color, and shape. The combination of network data with attribute data is typically called multivariate network visualization [2], an active area of research given the difficult problems associated with such rich datasets. Researchers are increasingly examining richer visualizations for nodes including images, pie charts, or content-specific visuals such as 3D proteins [3]. Network visualization tools have also begun to integrate traditional node-link visualizations with alternative, complementary visualizations. For example, CyToStruct integrates node-link diagrams with three-dimensional molecular views important for bioinformatics data [4], and NodeTrix integrates node-link diagrams with adjacency matrices that highlight local communities within social networks [5]. Other content-specific network visualizations utilize rich sets of visual attributes or symbols to help represent attribute data, such as in the Interactive Tree of Life (iTOL) viewer [6].
