This chapter dives into human cognition and visual perception to frame the contribution of pre-attentive attributes like size, color, and position and to show how important they are to the storytelling process. It explores how you can use these attributes strategically to direct an audience’s attention and create a visual hierarchy of components that communicates effectively. This chapter provides the framework for curating story arcs and layouts with visualizations in Tableau that the following chapters explore in depth.
It’s been said (by Tableau, actually) that data visualization is one of the most significant technologies of the 21st century. Of course, the act of visually representing data is not limited to the 21st century. Chapter 2 looked at Minard’s flow map of Napoleon’s invasion of Russia, published in 1869. In addition to Minard, many other quintessential examples exist of people’s attempts to visually explore, explain, and communicate data throughout history. Many of these have had significant influence on modern data visualization: Ptolemy’s earliest preserved use of a data table to display astronomical information around 150 CE; Descartes’ 17th-century introduction of the Cartesian coordinate system; Playfair’s invention of the time-series line graph, bar chart, and pie chart in the late 18th and early 19th centuries; Snow’s map of the 1854 London cholera outbreak; Nightingale’s 1858 Coxcomb plot; and Tukey’s box plot just a few decades ago (see Figure 6.1). As far back as we can look, we seem to have always sought visual ways to see and understand data more clearly.
Why? Humans are intrinsically visual creatures. In fact, of all the powerful processing systems hardwired into the human brain, none is more powerful than our visual system. Our brains are built with cognitive and perceptual abilities to visually process complex information. We’ve been learning, remembering, and even writing (cuneiform, one of our earliest writing systems, grew out of pictographs) using the power of pictures for nearly all of recorded human history, so it makes sense that we would apply this same cognitive horsepower to how we interact with data, too.
note
John Medina, a developmental molecular biologist who studies how the mind reacts to and organizes information, has popularized the concept of the picture superiority effect, which recognizes that information learned by viewing pictures is more easily and more frequently recalled than that learned purely through textual or other word-form equivalents, including audio (see Figure 6.2).
note
Read more on our shared visual human history, including a study of visual communication from cave drawings through advanced visualizations, in The Visual Imperative.
The power of pictures isn’t limited to helping us transform data into meaning. Visuals also act as memory magnets in our brains: We embed and retain memories in visual form. We’re extremely good at this visual memorization, too: Research into our visual retention systems dating back to the 1970s has measured our visual memory capacity to be somewhere in the vicinity of 10,000 images with a recognition rate of approximately 83%.
With that kind of retention power, one can easily understand how our visual capacity demands careful attention when working with data. We want to make sure that we are visualizing data both efficiently to leverage our cognitive abilities, and accurately to ensure that we are representing information correctly.
Chapter 2, “The Power of Visual Data Stories,” briefly covered perceptual pop-out and some of the cognitive aspects that make data visualization and visual data stories so powerful. We must discuss many important elements as we dissect the visual properties of visualization and the best ways to capitalize on them. The steps you take to format your analysis and presentation are critically important as they can make or break the visual appeal and effectiveness of any visualization.
As you might have noticed in our previous work in building basic charts and graphs in Tableau, you can format just about every visual element you see on a worksheet: titles and subtitles; typefaces; the color, shape, and size of symbols; and shading, borders, and lines. Further, you can specify format settings for a specific field within an individual worksheet, for entire workbooks, or within dashboards and stories. This chapter provides a closer look at a few of the most important visual design building blocks, and how you can use Tableau functionality to embed these visual cues purposefully and intuitively as you format your visualizations.
In particular, this chapter covers:
Color
Lines
Shapes
Chapter 8, “Storyboarding Frame by Frame,” covers additional formatting for dashboards and stories.
note
Before curating your visualization, you must first understand and explore the data, and represent it using the chart or graph best suited for your data’s story. From there you can apply visual cues that enable you to intuitively and meaningfully communicate to audiences. Chapters 5, “Choosing the Right Visual,” and 9, “Advanced Storytelling Charts,” cover selecting and building the best charts.
Color is one of the most important, most complicated, and most frequently misused elements of data visualization. Used well, color can enhance and clarify a visualization, but used poorly, color can confuse, misrepresent, or obstruct clear communication. Color is such a critical element in representing data visually that Tableau employs color scientists to help design the best color palettes as well as provide deep education on the appropriate use of color. This section covers how to apply color to a visual so that it aligns with the data and your story; however, this is an area in which you should invest more time learning.
All marks in Tableau have a default color, even when no fields are placed on the Color Marks card. For most marks, blue is the default color. For text, gray is the default color. We’ll explore the Color Marks card by looking at how it is used in different types of visualization.
note
The visuals in this section use the Global Superstore dataset available from Tableau.
Tableau applies color depending on the field’s values. For discrete values, or dimensions, Tableau typically uses a categorical palette, whereas for continuous values, or measures, Tableau assigns a quantitative palette. These translate into the three primary ways to encode data with color in data visualization:
Sequential, for continuous, quantitative values
Diverging, for continuous, quantitative values
Categorical, for discrete values
Sequential color encodes a quantitative value from low to high using gradients of a single color and is usable when all values are positive or all values are negative. A great example of this is Sales, which goes from zero to infinity. The map in Figure 6.3 uses a sequential color scale to encode positive sales amounts into each U.S. state.
The automatic sequential color palette used in Tableau is blue. In addition to being able to change the palette itself, you can also adjust the distribution of color by clicking the Color Marks card, which opens the Edit Colors dialog box (see Figure 6.4).
Diverging color encodes a quantitative value but has a midpoint—for example, zero—and numbers on either side of this midpoint—positive and negative—are displayed in a different color with each having its own sequential palette.
Figure 6.5 uses a diverging color palette to display profit by state (top) and profit by product category and subcategory (bottom). Positive profit is colored blue with darker blue reflecting higher profit. Profit could also be negative, and in this visual negative values are encoded in orange with the darker orange reflecting a bigger loss.
Similarly to the sequential color palette, Tableau automatically assigns a color palette for diverging values. The default is an orange-blue diverging color palette (we discuss why not to use a red-green diverging color palette and instead use better palettes to avoid issues of color vision deficiency later in this chapter). As with any color palette, you can adjust the diverging one (see Figure 6.6).
Diverging color palettes are clearly designated in the Tableau color library. Beyond changing the colors themselves, you can also adjust the midpoint. Midpoints are not exclusively zero. A midpoint could also be the average, to show values above or below average, or a target, to show values that exceed or fall short of it.
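To make the difference between sequential and diverging encoding concrete, here is a minimal Python sketch of the arithmetic involved. It is not how Tableau computes its palettes; the RGB endpoints, the neutral midpoint color, and the function names are all assumptions chosen for illustration.

```python
# Illustrative RGB endpoints; these are assumptions, not Tableau's palette values.
LIGHT_BLUE = (222, 235, 247)
DARK_BLUE = (8, 81, 156)
DARK_ORANGE = (217, 95, 14)
NEUTRAL = (240, 240, 240)


def lerp_color(start, end, t):
    """Linearly interpolate between two RGB tuples, clamping t to [0, 1]."""
    t = max(0.0, min(1.0, t))
    return tuple(round(a + (b - a) * t) for a, b in zip(start, end))


def sequential_color(value, vmin, vmax):
    """Map a value in [vmin, vmax] onto a single light-to-dark gradient."""
    span = vmax - vmin
    t = (value - vmin) / span if span > 0 else 0.0
    return lerp_color(LIGHT_BLUE, DARK_BLUE, t)


def diverging_color(value, vmin, vmax, midpoint=0.0):
    """Map a value onto two gradients that meet at the midpoint."""
    if value >= midpoint:
        span = vmax - midpoint
        t = (value - midpoint) / span if span > 0 else 0.0
        return lerp_color(NEUTRAL, DARK_BLUE, t)
    span = midpoint - vmin
    t = (midpoint - value) / span if span > 0 else 0.0
    return lerp_color(NEUTRAL, DARK_ORANGE, t)
```

Passing a different midpoint, such as an average or a target, shifts the point at which the two gradients meet without changing the colors themselves.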
In addition to changing the range of colors, you can also group values into color-coded bins using stepped colors. Use the up and down arrows to specify how many bins to create (see Figure 6.7).
If it makes sense to do so, you can select the Reversed option to reverse the order of colors in the range. For sequential colors, this means darkening the intensity in the lower values rather than the higher. Likewise, for diverging colors, this means swapping the two colors in the palette in addition to reversing the shades within each range.
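Stepped color and the Reversed option can be thought of as simple operations on a bin index. The sketch below assumes equal-width bins across the value range, which may not match how Tableau draws its bin boundaries, and it covers only the sequential case; the function name and parameters are hypothetical.

```python
def stepped_bin(value, vmin, vmax, steps, reverse=False):
    """Return the 0-based color bin for a value, assuming equal-width bins."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    span = vmax - vmin
    # Normalize the value to [0, 1], guarding against an empty range.
    t = (value - vmin) / span if span > 0 else 0.0
    t = max(0.0, min(1.0, t))
    # The maximum value belongs to the last bin rather than a bin of its own.
    index = min(int(t * steps), steps - 1)
    # Reversing flips which end of the range gets the darkest step.
    return steps - 1 - index if reverse else index
```

With five steps over a range of 0 to 100, a value of 50 lands in the middle bin whether or not the order is reversed, while the lowest and highest values trade places.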
Finally, categorical colors encode categories—apples, bananas, and oranges; shoes, socks, and shirts; or in the case of Figure 6.8, furniture, office supplies, and technology—using distinct colors appropriate for fields that have no inherent quantitative order.
By default, Tableau assigns categorical color using the automatic Tableau color palette. As you might expect, you can adjust this to use a color palette of your choosing (see Figure 6.9). To change color values, click the Color Marks card and select Edit. A dialog box opens that allows you to select from the color palettes in Tableau. You can assign an entirely new palette to every item in the field, or you can manually assign a color to each field by selecting and assigning a color swatch to individual items. If you need to manually add a specific color shade to comply with company branding guidelines, you can do that, too.
Beyond advanced color options, additional configuration options not related to the actual colors shown are available in Tableau. These include adjusting opacity, mark borders, and mark halos (see Figure 6.10). The preceding chapter explored some of these functions briefly within the context of curating maps.
Adjusting the opacity can be helpful for looking at dense scatter plots or in maps overlaying a background image. Moving the slider left makes the marks become more transparent. Consider the before and after in Figure 6.11: The map on the top has opacity at 100% whereas the opacity in the map on the bottom is at 50%.
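To see why lower opacity lets a background image or overlapping marks show through, consider the standard alpha-blending formula for a mark drawn over an opaque background. This is a general graphics sketch rather than a description of Tableau’s rendering, and the function name is hypothetical.

```python
def blend(mark_rgb, background_rgb, opacity):
    """Blend a mark color over an opaque background at the given opacity (0 to 1)."""
    if not 0.0 <= opacity <= 1.0:
        raise ValueError("opacity must be between 0 and 1")
    # Each channel is a weighted average of the mark and the background.
    return tuple(
        round(opacity * m + (1.0 - opacity) * b)
        for m, b in zip(mark_rgb, background_rgb)
    )
```

At 100% opacity the result is the mark color alone; at 50% the mark and the background contribute equally, which is why dense clusters of overlapping marks appear darker than isolated ones.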
Tableau automatically displays all marks without borders; however, you can turn these on for all mark types except text, line, and shapes.
Borders can be helpful in distinguishing closely spaced marks. However, because borders make marks narrower, they can also make color-encoded dimensions harder to distinguish, as in a stacked bar chart. View your visualization with and without borders to see whether they add or reduce clarity.
Lastly, mark halos can assist in making marks more visible, particularly on maps, by surrounding each mark with a ring of contrasting color (see Figure 6.12).
We can also use color to highlight data or alert audiences to important insights.
You use a highlight color to highlight one data point or category. For example, if you are tracking profits of product categories over time with a separate line representing each category and you want to highlight the consistently high profits in a certain category, perhaps technology, you can highlight that category by coloring only its line and using gray for the other categories (see Figure 6.13). This allows the audience to clearly see how well this category is doing in comparison to others on the chart.
Calm colors, like blue, are great colors for highlighting.
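The highlighting idea reduces to a simple rule: one category gets a color and everything else gets gray. A minimal sketch, with hypothetical hex values and a hypothetical function name, might look like this:

```python
GRAY = "#BFBFBF"       # hypothetical muted color for context categories
HIGHLIGHT = "#1F77B4"  # hypothetical blue for the highlighted category


def highlight_palette(categories, focus, highlight=HIGHLIGHT, muted=GRAY):
    """Assign the highlight color to one category and gray to all others."""
    return {name: (highlight if name == focus else muted) for name in categories}


palette = highlight_palette(["Furniture", "Office Supplies", "Technology"], "Technology")
```

In Tableau you would make the equivalent assignments manually in the Edit Colors dialog box.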
Similarly to highlighting, you can use alerting colors to draw the audience’s attention to a particular data point. Using the same line chart as in Figure 6.13, rather than highlighting the high profits of technology maybe the goal is to instead alert the audience of the low profits in furniture (see Figure 6.14). In this use case, alerting is done with an alarming or alerting color—like red—to indicate to the audience that something is wrong.
It’s important to note that in Western culture red is often associated with negative values or associations; however, this is not always consistent with color culture in other countries, like China. Bright alerting colors could be red, orange, or yellow.
This note about the cautious use of red in visualization brings us to an important conversation regarding the use of red/green color palettes in visualization. Most of us are familiar with the traffic light palette, where red is stop, green is go or positive, and yellow or orange (or white as a midpoint) means to proceed with caution.
Although this option is available in Tableau, there is rarely, if ever, a valid and compelling reason to use it. Instead, a very compelling reason exists not to use it: color vision deficiency (CVD). For the most part, we all share a common color vision sensory experience. However, as many as 8% of men and 0.5% of women have some kind of color vision deficiency. The most common form is red-green deficiency, which includes deuteranopia, making this palette particularly troublesome. (If the red/green palette must be used, there are techniques to make it more workable for audiences with color vision deficiency, such as using a blue-green rather than a pure green, which helps make the colors distinct enough to be recognized by someone with CVD.)
Take a look at Figure 6.15. The highlight table on the left appears in the familiar red and green to someone without a color vision deficiency, but to someone with deuteranopia, the colors appear as in the image on the right. As you can tell, to someone with red-green color vision deficiency the colors are much less distinguishable, which reduces the potency of the visualization.
To provide an easy solution for navigating the complexity of color vision deficiencies, Tableau has both an orange-blue diverging color palette (for quantitative values) and a color-blind palette (for categorical values); see Figure 6.16.
tip
To experiment with how your own visuals might appear to someone with various types of color vision deficiencies, visit http://www.vischeck.com.
USING TITLES AS COLOR LEGENDS
Chart titles and subtitles provide an interesting opportunity to embed a color legend. Consider the example displayed in Figure 6.17.
Lines serve several purposes in data visualization. They act as guides, they reinforce patterns, they provide direction, and they create shapes. As with any visual element, too many lines, or lines given too much emphasis, can cause distraction or confusion in data visualization. However, used wisely, they can be transformative. Like color, lines should be used sparingly to reduce the amount of ink onscreen so that the data can lead the story.
This section covers how to format lines within individual visualizations in Tableau and make effective use of them as view lines (axis lines, reference lines, and so on), borders, and shaded bands. It also looks at how lines can affect visualizations in the form of gridlines, axis rulers, and panes in Tableau.
To format lines in Tableau worksheets, select the Format menu, then the part of the view that you want to format (see Figure 6.18).
You can also right-click on your sheet and select Format (see Figure 6.19).
Either of these methods opens the Format pane as a new tab in place of the Data pane with icons to help direct formatting of individual elements in the worksheet (see Figure 6.20).
note
In this section, I create visualizations using a Tableau-provided dataset of the most popular male and female baby names in each state for each year from 1910 to 2012. This data is collected from the Social Security Administration.
By default, most chart types in Tableau include gray axis lines, zero lines, drop lines, and borders. If added, reference lines that help you analyze statistical information in the data are also gray by default.
In general, a best practice is to remove as many of these as possible, keeping only those that are necessary to guide audiences through the visualization or highlight important aspects of the data.
Let’s walk through the steps of formatting lines together using a series of simple visualizations.
Grid lines, zero lines, and drop lines connect marks to the axis. You format them using the lines icon on the Format pane and you can adjust them by sheet, row, or column (see Figure 6.21).
Figure 6.22 is the default view of a sorted bar chart reflecting the Top 10 Girls’ Baby Names in the U.S. from 1910 to 2012. (Titles/subtitles have been formatted by double-clicking on the default title line.) We will use this visualization to experiment with formatting lines.
To create this chart and follow along, follow these steps:
1. Filter the data by Gender, selecting Female.
2. Drag Occurrences (SUM) to Columns, and Top Name to Rows.
3. Sort Top Name descending by Occurrences.
4. Select all names after the first ten, right-click, and select Hide.
5. You might also adjust the view to Entire View.
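If it helps to see the logic behind these steps outside of Tableau, the filter, aggregate, sort, and top-ten sequence can be sketched in plain Python. The field names come from the steps above; the sample rows and the function name are made up for illustration and are not taken from the actual dataset.

```python
from collections import defaultdict


def top_names(rows, gender, n=10):
    """Sum Occurrences per Top Name for one gender and return the n largest."""
    totals = defaultdict(int)
    for row in rows:
        if row["Gender"] == gender:                          # step 1: filter
            totals[row["Top Name"]] += row["Occurrences"]    # step 2: SUM
    # Step 3: sort descending by total; step 4: keep only the first n names.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Hypothetical sample rows, not real values from the dataset.
sample = [
    {"Gender": "Female", "Top Name": "Mary", "Occurrences": 500},
    {"Gender": "Female", "Top Name": "Linda", "Occurrences": 300},
    {"Gender": "Female", "Top Name": "Mary", "Occurrences": 250},
    {"Gender": "Male", "Top Name": "James", "Occurrences": 900},
]
result = top_names(sample, "Female", n=2)
```

Hiding the names after the first ten in Tableau corresponds to the final slice in this sketch.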
Selecting which lines to remove is a matter of judgment. Typically, I remove all grid lines. You can turn off zero lines, too, but whether or not you choose to remove this line should depend on how important being above or below zero is in your data.
In this example, I remove all grid lines, zero lines, reference lines, and drop lines. I also remove axis ticks and row axis rulers, but keep column axis rulers for reference and reformat them as dotted lines (see Figure 6.23).
The resulting visualization is much cleaner, with only one line at the bottom of the x-axis, as shown in Figure 6.24.
Borders are the lines that surround visualizations, demarking the table, pane, cells, and headers. You can specify the border style, width, and color for the cell, pane, and header areas. You format these using the grid icon on the Format pane (see Figure 6.25).
Returning to the bar chart, I have added orange row dividers as borders to show how they appear when formatted (see Figure 6.26). Notice that because I changed the format of the axis line to a dotted line, it now appears as colored dots.
Row and column dividers are most commonly used in nested tables, because they serve to visually break up a view and separate data fields, especially when several levels of data exist. Figure 6.27 is the default view of a nested table reflecting the top 10 girls’ and boys’ baby names from 1910 to 2012. (Titles/subtitles have been formatted by double-clicking on the default title line.)
To create this table and follow along, follow these steps:
1. Drag Gender and Top Name to Rows.
2. Drag Occurrences (SUM) to the Text Marks card.
3. Sort Top Name descending by Occurrences.
4. Scroll through the table and select all names after the first ten for each gender, right-click, and select Hide.
Using the Format Borders function, you can modify the style, width, color, and level of the borders that divide each row or each column by using the row and column divider drop-down menus. The level refers to the header level you want to divide by; at the highest level, all fields are divided (as shown earlier in Figure 6.27, which at the highest level is divided by Top Name).
This many lines are unnecessary, even in a table. Thus, at the sheet level, I’ve reformatted the row dividers to a Level 2 heading so that I can differentiate between genders rather than separating each name. At the column level, I’ve also removed the column divider on the right to simplify the table’s appearance (see Figure 6.28).
The resulting table shown in Figure 6.29 is a cleaner, easier-to-read table.
Finally, at the intersection of lines and color, you can use shading to set a background color to the entire visualization or to various areas of importance, like headers, panes, or totals.
Commonly, shading is used for banding, where the background color alternates from row to row or column to column. This technique is useful for tables, as shown earlier in Figure 6.29, because it helps the eye distinguish rows or columns more intuitively than adding superfluous lines does. To format shading and banding, use the paint can icon on the Format pane (see Figure 6.30).
The nested table in Figure 6.29 has row banding by default. If desired, you could change this by sliding the band size to zero (see Figure 6.31).
Explore additional banding options by interacting with the various settings on the Format pane. As a guide, practice with the following:
Pane and header: This affects the color of the bands.
Band size: This affects the thickness of the bands.
Level: As described earlier, if you have nested tables or multiple dimensions, this option allows you to add and format banding at specific levels.
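The effect of the band size setting can be sketched as a small rule that decides which rows receive the band color. This is an illustration only: which row Tableau shades first is an assumption here, and the function name is hypothetical.

```python
def band_flags(row_count, band_size):
    """Return True for each row that would be shaded, alternating every band_size rows."""
    if band_size < 0:
        raise ValueError("band_size cannot be negative")
    if band_size == 0:
        # A band size of zero turns banding off entirely.
        return [False] * row_count
    # Rows are grouped into blocks of band_size; every other block is shaded.
    return [(index // band_size) % 2 == 1 for index in range(row_count)]
```

A band size of 1 alternates every row, a band size of 2 shades rows in pairs, and a band size of 0 removes the banding.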
REMOVING FIELD LABELS AND UNNECESSARY HEADERS
By default, when you create a visualization, Tableau provides both field labels and headers for each axis. Often, this is redundant, especially when you add a title to your chart. Removing unnecessary elements streamlines the visualization for your audience.
As an example, reconsider Figure 6.24. You can see the field labels for Top Name rows, as well as the axis header for Occurrences directly below the count. Does your audience need this duplicated information, or can we trust them to infer the fields without an additional header? If the latter is true, consider right-clicking the field label for Top Name and removing it. Next, to remove the axis header, right-click to Edit Axis and remove the title by erasing text in the field. The resulting visualization is much simpler (see Figure 6.32).
For additional simplification, you can remove the x-axis entirely, and label individual bars instead (see Figure 6.33).
Figure 6.34 shows a before and after view of the original and finished versions of this chart.
tip
Adding back a previously removed header can be a bit of a trick in Tableau. To unhide a header, go through Analysis > Table Layout. You can also unhide any header from the rows or columns by simply right-clicking on the pill. Use the header’s check box to toggle the header’s display on or off for each pill.
Recognizing shapes is one of the time-saving ways that our brains identify patterns. We immediately group similar objects and separate them from those that look different. Some chart types, like packed bubble charts, use shapes (along with size and color) to encode meaning. Additionally, we can use shapes in interesting ways to personalize data stories in Tableau. The two ways to use shapes in Tableau are with the Shape Marks card and custom shapes.
The Shape Marks card feature allows you to assign different shapes to data marks. Dropping a dimension on the Shape Marks card prompts Tableau to assign a unique shape to each member in the field, as well as display a shape legend (see Figure 6.35). Using the Size Marks card allows you to enlarge or reduce the size of each shape mark.
As displayed in Figure 6.35, default shapes in Tableau are unfilled symbols. This palette contains ten unique shapes. If your data has more than ten members, the shapes will repeat.
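The repeating behavior is easy to picture as cycling through a fixed list. In the sketch below, the ten shape names are placeholders rather than Tableau’s official labels, and the function name is hypothetical.

```python
from itertools import cycle

# Placeholder names standing in for a ten-shape default palette.
DEFAULT_SHAPES = [
    "circle", "square", "plus", "times", "asterisk",
    "diamond", "triangle", "down triangle", "left triangle", "right triangle",
]


def assign_shapes(members, shapes=DEFAULT_SHAPES):
    """Pair each member with a shape, reusing shapes once the palette runs out."""
    return dict(zip(members, cycle(shapes)))


members = [f"Member {i}" for i in range(1, 13)]  # twelve hypothetical members
assignment = assign_shapes(members)
```

With twelve members, the eleventh and twelfth reuse the first and second shapes, which is a good signal that a field has too many members to encode by shape alone.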
You can edit this default palette and assign a different palette from the library of shape options within Tableau. Choices include a variety of shape palettes, arrows, weather symbols, and KPI metrics. To edit the shapes palette assigned to your data, click the Shape Marks card and select Edit Shape. A dialog box, similar to the Color dialog box, appears that allows you to select a new palette as well as manually assign shapes to each data item (see Figure 6.36).
If none of the palettes in the Tableau library appeal to you or are suitable for your dataset, you can also add custom shapes into your Tableau environment for use in your workbooks. Custom shapes can add a nice design touch to your visualization, particularly when you are building a narrative or working to create engagement or visual appeal.
This function requires accessing your Tableau Repository on your machine. To add custom shape palettes into the Tableau library, follow these steps:
1. Create your image files. Each shape should be its own file, and most image formats (including .png, .gif, .jpg, .bmp, and .tiff) are acceptable. (Tableau does not support symbols in .emf format.)
2. Copy the shape files to a new folder in the My Tableau Repository>Shapes folder on your computer. The name of the folder will be the name of the new palette in Tableau.
note
If you plan to use color to encode shapes, use a transparent background in your image file (PNG). Otherwise, the entire square of the symbol thumbnail will be colored, rather than just the symbol itself.
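If you add palettes often, the copy step can be scripted. The following is a minimal sketch, assuming you pass in the path to the Shapes folder of your own repository; the function name is hypothetical, and the extension list simply mirrors the formats named in step 1.

```python
import shutil
from pathlib import Path

# Image formats named in step 1; .emf is deliberately excluded.
ALLOWED_EXTENSIONS = {".png", ".gif", ".jpg", ".bmp", ".tiff"}


def install_palette(source_dir, palette_name, shapes_root):
    """Copy supported image files into a new palette folder under shapes_root."""
    target = Path(shapes_root) / palette_name
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(source_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in ALLOWED_EXTENSIONS:
            shutil.copy2(path, target / path.name)
            copied.append(path.name)
    return copied
```

For example, calling install_palette("my_icons", "Hogwarts House Crests", Path.home() / "Documents" / "My Tableau Repository" / "Shapes") would create the palette folder, assuming the repository lives in your Documents folder and that my_icons is a folder of your own image files. Adjust the path if your repository is stored elsewhere.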
Figure 6.37 shows that I have added two new palettes, Harry Potter and Hogwarts House Crests, to my shapes library.
When you return to Tableau, you will see the new palettes included in the Shape Palette library in the Edit Shape dialog box. If you modified the shapes while Tableau was running, you might need to click Reload Shapes (see Figure 6.38).
You can assign these new shapes in the same manner as you do any shape palette within Tableau.
tip
For tips on creating custom shapes best suited for use in Tableau, see the help article at https://www.tableau.com/drive/custom-shapes.
This chapter looked at how to format important visual cues in data visualization that can enhance your data’s story if used wisely. The next chapter covers preparing data for visual analysis and storytelling in Tableau. Chapter 8 reviews how to use additional formatting options in Tableau to build visualizations, dashboards, and story points to present a complete and compelling visual data story.