M. Mucchetti, BigQuery for Data Warehousing, https://doi.org/10.1007/978-1-4842-6186-6_17

17. Dashboards and Visualization

Mark Mucchetti¹

(1) Santa Monica, CA, USA

Accurate and up-to-date visualization of an organization’s systems is foundational to modern business. In a world where data is constantly bombarding us through the plethora of devices we use each day, your business data is literally competing with everything else for attention.

An Internet-focused career requires constant access. We use dozens of online resources daily to help us do our jobs and keep in touch. The nature of human attention means that despite our best intentions, we tend to gravitate toward the most prominent pieces of information. Regardless of how you feel about the effect this has on productivity, you can find a study to support your position. Either way, it’s in the best interest of your data program to meet its users on their terms.

Visualization should make important information apparent with minimal interpretation. Green is good; red is bad. I used to joke that an executive summary report should consist of a single emoji: [emoji image] or [emoji image]. The state of the business is clear at a glance; this generally prompts questions that continue the discussion. In reality, though, your typical stoplight dashboard is basically equivalent.

Dashboards take the visualization one step further and present it in real time. When properly executed, they act like thermometers during fundraisers on public television. Everyone watches the dashboard and responds to every major update. The rest of the time, it sends a constant signal of how things are doing. You can’t improve what you can’t measure, so it seems reasonable that continuous measurement and continuous improvement would be related.

This also prevents issues based on anecdotes or feelings from spiraling out of control. I couldn’t tell you the number of times I received a call about how a user had experienced an error and, consequently, every user must be experiencing it and the sky was falling. I would immediately look to my primary dashboard, where everything was green and within range. I could confidently say that the user was experiencing an isolated incident and move to triage appropriately.

On the flip side, this level of visibility makes it hard to conceal issues. If I arrived at work and any metric was red, my team already had an explanation, even if it was just “we’re looking into it.”

Ultimately, a culture of data comes from a continual stream of things to look at coupled with an awareness of what those things mean. The goal is to be as clear as possible, such that the intent of the message is immediately apprehensible. A dashboard with 400 metrics may be very useful to its creator, but no one else is going to have a clue how to read it.

False positives and false negatives can be equally damaging. Calibrating metrics in a way that indicates the actual scale of the problem is critical, in the same way we’ve treated data accuracy. If people come to learn the presentation of the metric is misleading—even if the underlying metric is accurate—they will begin to tune it out. There’s a theory here around making periodic changes and refinements to encourage a conscious re-evaluation from time to time. That also represents a delicate balance: too much change and people will lose their instinct into reading the data; too little and it will fade to background noise.

In healthcare, this problem is known as “alarm fatigue.”¹ Healthcare professionals hear so many alarms, beeps, and background noises from medical equipment that they start failing to process their meaning. Many alarms are false or disproportionately loud or insistent. Lives are at stake in this example. While your use case may not be as extreme, this illustrates the need to calibrate dashboards to the correct sense of urgency.

The recent popularity of automatic modes like “do not disturb,” “night hours,” or “quiet time” on mobile devices is another indication that we’ve reached notification overload. More ominously, we’ve reached notification homogeneity: a picture of your friend’s dog and an impending natural disaster both notify with the same importance level.

We have the tools to program whatever dashboard design and notification hierarchy we want, but we can’t change the underlying finite resource—the attention of your users. Conveying information visually is a daunting task. As your key resource for a valuable reporting suite is a product manager, your key resource for a valuable dashboarding system is a UI designer. If you don’t have one available, don’t worry; most people are still doing this on their own.

This chapter is about creating worthwhile visualizations and dashboards for your data and deploying them as broadly and pervasively as possible.

Visualizations

A straightforward tactic to convert data into information is to visualize it. This accomplishes two things. Firstly, it shows raw data in a format that is easy to understand at a glance. Secondly, it allows the creator of the visualization to reduce the data into only its most salient characteristics. In other words, you can selectively eliminate data that isn’t relevant to the information you’re presenting.

A Bad Example

I’m a big fan of illustration by counterexample. Figure 17-1 is a terrible visualization.

Figure 17-1. A terrible visualization

Can you spot all the issues?

The tendency to compress as much information as possible into the space can be overwhelming. The same issues plague slide decks, reports, and instruction manuals. It’s not always enjoyable to assemble a piece of IKEA furniture, but compared to a less-reputable manufacturer, the advantages of simplicity are obvious.

Visualization Features

There are many attributes which contribute to a clear and effective visualization. To understand the problems this visualization failed to solve, let’s look at some of those attributes. (Any design resource you have access to will easily be able to do this exercise in their heads and suggest something better, so if you have one, use this section to appreciate how difficult their work can be.)

Chart Type

Choosing the right visual form for your data is the most critical step. Thankfully, this area is well-understood, and lots of guides exist for choosing the right kind of chart. Here are a few basic pointers:

When showing data points in relationship to each other, bar and line charts are appropriate. Line charts show connected values over time (like stocks), while bar charts are good at comparing across categories.
For proportional data, where items add up to a whole, pie charts and their variants are good. If you need to compare multiple proportional series, a donut chart works.²
Showing all pieces of data in relationship to a single set can be done with histograms or scatter plots.
You can also use stacked charts to break categories down into multiple values.

The nice part about using a visualization tool instead of constructing manually is that you can quickly switch among chart types and see which one tells your story the best.

However, keep it simple. If a particular chart type looks impressively complicated, but its purpose is not clear, don’t use it. That being said, it can be fun to write a bulleted list and then style it in every conceivable PowerPoint SmartArt style. Some of these formats don’t even seem to display any type of data particularly well.

Example: A pie chart is used, but the values don’t total to 100% or even to a stated total. A bar chart is used in place of a pie chart. It’s completely unclear what purpose a scatter plot serves here.

Scale

Choose a scale that makes sense for your data. This means choosing the correct minimum and maximum values for each axis, as well as choosing the right kind of scale. The scale must be chosen appropriately so that everything fits in the available space, but also conveys the correct relationships between data points. Linear and logarithmic scales are both common.

We’re innately wired for comparison and need digestible ranges on which to understand magnitude. In casual language, people can say things like “I ate a lot of food,” and we can map it to a roughly equivalent scale of satiety. Many technology data points are at scales which are difficult to comprehend. Let’s say that I ran a “huge” query on BigQuery. What does that mean? No one could guess if it even referred to giga-, tera-, peta-, or exabyte scale. In the same way, the natural question “Compared to what?” arises as any scale is presented.
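To make the “Compared to what?” question concrete, here is a minimal Python sketch that labels a byte count with a readable unit and relates it to a baseline. The function names, the decimal (power-of-1000) units, and the baseline figure are illustrative assumptions, not anything BigQuery itself provides.

```python
def describe_bytes(num_bytes: float) -> str:
    """Express a byte count with a decimal (power-of-1000) unit, e.g. 1500 -> '1.5 kB'."""
    units = ["B", "kB", "MB", "GB", "TB", "PB", "EB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the number is digestible,
        # or at the largest unit we know about.
        if value < 1000 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000


def compared_to(num_bytes: float, baseline_bytes: float) -> str:
    """Answer 'compared to what?' by stating the size relative to a baseline."""
    ratio = num_bytes / baseline_bytes
    return (
        f"{describe_bytes(num_bytes)} "
        f"({ratio:.1f}x the baseline of {describe_bytes(baseline_bytes)})"
    )


# Hypothetical numbers: a "huge" query versus a typical one.
print(compared_to(2_500_000_000_000, 50_000_000_000))
```

A label such as “2.5 TB (50.0x the baseline of 50.0 GB)” gives the viewer both a magnitude and a reference frame, which “huge” alone does not.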

By the same token, if two graphics have a relationship to each other, the scale should be the same between them. Viewers are likely to form comparisons based on the occupied physical space. In some instances, you can use “not to scale,” but that leaves people with no reference frame whatsoever, and it’s still hard to suppress the instinct to compare.

Example: Bars have values that don’t correspond with their height. In some cases, it’s unclear what the scale is intended to represent.

Labeling

Here’s another reminder that naming things is difficult. Placing data in context means supplying accurate and succinct labels.

Place labels near the things they label. If a viewer has to jump back and forth between charts and legends, context is easily lost. Drawing arrows to make this connection means the viewer has to follow the arrow instead. When there are too many labels, the viewer is spending more time reading than looking at the measurements in relation to each other.

Always label axes and include units. If scale is involved, state it clearly (e.g., “sales in millions”). Use horizontal labeling that can be read easily unless you have a compelling reason not to do so. If a specific value is important, label it; if it’s not or if the relative scale is more important, let the axis values do the labeling.

When a single chart clearly has too many labels, decide whether you can eliminate some or if it’s actually better to separate it into two separate charts. If it can be done simply, communicating related messages in the same chart is fine, but don’t confuse clarity with compactness.

As for the actual things being labeled, use consistent language taken from your data glossary. In most cases, the data and axis labels should closely resemble the BigQuery column names they come from. Summary and aggregate labels should be used more to get your point across.

Example: There are essentially no useful labels. Charts are labeled with the kind of chart they are, but nothing about what the chart actually represents. Some labels are missing entirely. Labels aren’t placed near the things they describe.

Simplicity

Show required data only.

In an information-dense format, there’s no space to clarify or explain a statement. If people want lots of details about a subject, they will read a book.

Example: This visualization is not simple.

Relevance

As a corollary, don’t include details that aren’t important to the understanding of the information. Examples include superfluous explanations like “XXX sent an email about this” or any textual redundancy of the chart itself. If the chart is too confusing for the viewer to understand, a giant block of text isn’t going to help. If you find it difficult to explain the data yourself, re-evaluate the message you’re delivering.

Example: There’s a textual description explaining the value for pizza, and the label is placed in an irrelevant spot. Arrows distract the viewer from the data on the chart.

Consistency

Labels, colors, fonts, and data points should be used consistently within the same chart and across charts on the same report or dashboard. Using extra colors or fonts for no reason will confuse the viewer into thinking they mean something.

Remember from the previous chapter that people will assume things that look similar come from the same place. This sort of consistency is also important in visualizations.

Example: In addition to the values being mislabeled, some values have units, while others do not.

Style

As with chart type, don’t overdo it. Resist the urge to complicate a visualization to show off.

If your organization has a style guide, even if it’s primarily for external use, use it in your visualizations. In addition to looking “official,” it will prevent you from using too many colors or fonts if you are constrained to your organizational design language.

On that note, apply the same concepts of simplicity to the visual design. When in doubt, do less. Don’t use lots of different color patterns or more than two or three fonts. Also, consider what form(s) your visualization may appear in. Print design and digital design have different objectives, and the common denominator is fairly narrow. If there’s any chance that your visualization will be seen on paper, in black and white, optimize for that or produce a second visualization.

Design resources are equally intensive; if you have access to a designer, rather than asking them to design individual visualizations, let them help you to establish an organizationally aligned design language. They will know if there already is one and, if not, can help fill the gap in a natural way. This will let you focus on the actual information and its display, not the presentation.

Example: Honestly, it’s just a mess.

Accuracy

Most importantly, display accurate data. Viewers need to trust the underlying data source. They also need to trust the creator of the visualization. Avoid misspellings, references to data which has been omitted from the chart, or inaccurate or out-of-date descriptions.

Example: The description text at the bottom has misspellings. It says the data was collected from three sources, but only two appear on the chart. It suggests you refer to another part of the diagram for more information, but that part isn’t on the diagram.

Example: Several words, including the title, are misspelled. An asterisk indicates a footnote, but none is present.

Transparency

As a corollary to accuracy, don’t violate these rules to bias a viewer into a certain conclusion. Absolutely highlight the relevant data, and show contrast in areas where the emphasis is necessary for understanding. Don’t omit or obfuscate data that doesn’t support your viewpoint.³

If the viewer senses you are attempting to mislead or recognizes that the graphic is concealing important information, they will become angry. That will undermine any legitimate point you have to make.

Example: This diagram isn’t based on real-world data, but the attention drawn to the “high-performing cluster” omits any description of what the low end of the chart might refer to.

Dashboards

Unlike reports, dashboards are a much newer invention and only recently came to take on their current meaning with respect to organizational data. The original term “dashboard” was first attested in the mid-19th century. In that sense, it referred to a literal wooden board that prevented the driver from being covered by mud that had been “dashed up” by the horses (see Figure 17-2). Unlike early reporting, the early dashboard didn’t convey any information at all.

Figure 17-2. A dashboard. Really!

As the years went by, automobiles replaced horses. The dashboard kept serving its original purpose, protecting first against debris and then against scalding motor oil. When cars began to feature simple instrumentation, the dashboard was the natural place to put it, given its convenient position in front of the driver. The name was retained and now primarily connotes the modern meaning: a system of dials and gauges designed to be read quickly and provide instantaneous feedback.

The original auto designers had the right idea—instrumentation needs to be in line of sight and convey basic information rapidly. Also like the original designers, you have a similar responsibility, which is to find your users’ line of sight and place the simplest possible data there. You make do with what you have. Also key to the automotive analogy is that users cannot interact with dashboards; they are read-only. Even in today’s cars, the number of screens you can reach with steering wheel buttons is limited, and most cars will prevent an attempt to do anything complicated during driving.

There are other design cues from modern automotive dashboards that apply here. Large amounts of raw data must be synthesized to provide, say, a blind spot detector. Showing the driver more than an on/off indicator that someone is in their blind spot could be distracting and possibly dangerous. Most of the data on an automotive dashboard is current point in time (speedometer, tachometer, fuel level) or reflects moving averages from the recent past (distance to empty, average fuel economy). Cumulative or running totals can also be useful to assess the overall age of the system (odometer, trip meter). Lastly, there are many alert notifications of danger or required action, but these are only visible when active. There is no “don’t check engine” light—it is only active when action is required.
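The “only visible when active” rule can be pictured with a short Python sketch. The metric names and limits below are hypothetical; the point is only that readings within range produce no output at all, just as there is no “don’t check engine” light.

```python
def active_alerts(readings: dict[str, float], limits: dict[str, float]) -> list[str]:
    """Return alert messages only for readings that exceed their limit."""
    alerts = []
    for name, value in readings.items():
        limit = limits.get(name)
        # Metrics without a configured limit, or within range, stay silent.
        if limit is not None and value > limit:
            alerts.append(f"{name}: {value} exceeds {limit}")
    return alerts


# Hypothetical readings: only the error rate is out of range.
readings = {"error_rate": 0.2, "latency_ms": 120}
limits = {"error_rate": 0.05, "latency_ms": 500}
print(active_alerts(readings, limits))
```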

Car manufacturers have had decades to discover information display best practices. They also reflect a limited system under one person’s direct control. Your task is considerably more challenging (and hopefully less dangerous). Data dashboards must reflect countless systems and health indicators with widely varying ranges. Any data that originates from an end user or process can fluctuate wildly during the course of a day and have unpredictable spikes or lulls at thousands of times average. Users aren’t yet familiar with how to incorporate dashboards into their own work. And the scale and importance of dashboard metrics may be unclear to anyone except the developers of the system.

If these hurdles are overcome, users will ask to create their own dashboards, and they will proliferate throughout other departments. Dashboards are just as effective for all kinds of business data as they are for technology data, system health, and the outside world.

In fact, as we’ll cover, providing business context for data inside technology is an effective way to get your technology team thinking along business lines and prioritizing revenue. This can eliminate schisms between technology KPIs and the rest of the organization in a low-maintenance way.

Visualizations vs. Dashboards

The biggest difference between visualizations and dashboards is the temporal dimension. This creates an emphasis on “recency” in a way that reporting visualizations don’t have.⁴ For example, a common visualization would be showing an organization’s revenue by quarter, perhaps over one or several years. Stock charts may go to five years as static visualizations.

When you place those same metrics on a dashboard, the value of the data point fades exponentially with age. This is the same concept that helped us determine how to expire activity and error logs. In that model, it also suggested that we use moving averages or collapse less recent data into the aggregate as we go.
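One way to express “fades exponentially with age” is a recency-weighted average, sketched below in Python. The seven-day half-life and the function names are assumptions chosen for illustration; pick whatever decay rate matches how quickly your own metric goes stale.

```python
def recency_weight(age_days: float, half_life_days: float = 7.0) -> float:
    """Weight that halves every `half_life_days`; a reading from right now weighs 1.0."""
    return 0.5 ** (age_days / half_life_days)


def recency_weighted_average(
    points: list[tuple[float, float]], half_life_days: float = 7.0
) -> float:
    """Average (age_days, value) pairs so that newer readings dominate."""
    if not points:
        raise ValueError("need at least one data point")
    weights = [recency_weight(age, half_life_days) for age, _ in points]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, points))
    return weighted_sum / sum(weights)


# Hypothetical readings: today's value of 10 and last week's value of 4.
print(recency_weighted_average([(0, 10.0), (7, 4.0)]))
```

With a seven-day half-life, last week’s reading counts half as much as today’s, so the result sits closer to 10 than a plain average would.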

This window depends on the information your dashboard reports, but it rarely goes beyond 30 days or so, with the exception of grand total metrics like “Year-to-Date” (YTD) revenue. Note that a YTD metric still has a recency component—it reflects the past, but it also constantly changes to capture the present. That sort of temporal variation is a way to show scale in dashboards too. Viewers can understand the rate of change by watching the dashboard for a short period of time.

Visualizations can also tolerate a time period where the viewer is “reading into” them. For complex messages, you want the trend to be quickly discerned, but other implications of the visualization can reveal themselves as the viewer obtains additional context.

With dashboards, there’s no subtlety. A spaghetti mess of lines and colors may look impressive (there’s that word again), but you’ve lost the viewer already. If you have a metric that is 5, and it was 3 a few minutes ago, the best way to show it is a giant 5 and a green arrow pointing upward. As an outsider, I may not understand the meaning of the data point at first, but I do know that it’s 5, that it’s rising, and that that’s a good thing.
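As a rough sketch of the “giant 5 and a green arrow” idea, the Python function below reduces a metric to its value, direction, and color. The `higher_is_better` flag is an added assumption, since for metrics such as error counts a rising value should be red rather than green.

```python
def render_tile(
    name: str, current: float, previous: float, higher_is_better: bool = True
) -> str:
    """Reduce a metric to its current value, a direction arrow, and a color."""
    if current == previous:
        return f"{name}: {current} (unchanged)"
    rising = current > previous
    arrow = "▲" if rising else "▼"
    # Green when the metric moved in the desirable direction, red otherwise.
    color = "green" if rising == higher_is_better else "red"
    return f"{name}: {current} {arrow} [{color}]"


print(render_tile("Orders", 5, 3))                          # rising and good
print(render_tile("Errors", 5, 3, higher_is_better=False))  # rising and bad
```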

Dashboard Hierarchy

The biggest advantage that the data warehouse provides is the ability to combine metrics from disconnected areas of the business. Yes, each of your departments can use their internal authoritative tools to produce dashboards with their departmental data. Also, many dashboards will consist of data tailored from a specific department’s logical data mart. And as you democratize, stakeholders in each department can do this for themselves using BigQuery.

Where the centralized warehouse comes in is the ability to create organization-wide dashboards linking together data in innovative fashion. You can break out of individual siloed feedback loops with a single view uniting data. This can produce results that highlight problems or opportunities no department could see on their own. For example, what could it mean if website traffic is at an all-time high, but sales are below average? What if error rates on your mobile application are extremely high, but the call center reports average activity levels? Instead of relying on an email between departments to even identify that these two key metrics aren’t following their normal relationship, you have it at a glance.

In a similar exercise at a higher level of abstraction, you can design a hierarchy of dashboard displays that roughly sketch out what message each dashboard will convey and at what level it must do that. This also gives you a prioritization rubric if you don’t have time or space to do all of them. Have this discussion with your data governance steering committee—they will each have a litany of ideas for metrics from which they would benefit.

Use Cases

Any set of related pieces of information that needs to be monitored on an ongoing basis is a candidate for dashboarding. A single dashboard should provide a meaningful piece of information about a system as easily as possible. A “system” may be a literal system, or it may be a department’s performance or information about the organization at large.

There’s no set level of importance that warrants dashboard creation. It’s probably not useful to visualize areas with low impact to the business or things at such a granular level of detail that they won’t ever change. (For example, if your business operates in a single state and has no immediate plans for expansion, showing a map of the United States with that state lit up and the number “1” isn’t very helpful and will probably actually lower spirits.)

By the same token, any data which can be represented in an easy-to-read form is also a candidate. This usually means following the same rules as visualizations, but as we covered, the existence of the temporal dimension emphasizes certain things and de-emphasizes others. All the rules about clean display of information apply and then some: it should roughly be as simple to the business as a speedometer is to the driver.

Ubiquity

Data everywhere, all the time. Public-facing data in reception. Department data in every department. Every single employee should have line of sight to a dashboard with metrics that are (or should be) relevant to them. Obviously, this won’t always be possible, but it’s a target.

I consulted for a Fortune 500 company that had several dashboards in the main lobby for employees and visitors to notice as they arrived for work. The metrics they had chosen were things like Twitter word clouds and sentiment analysis. They were also totally unfiltered, so negative sentiments weighted just as highly as the positive ones. This meant that every day, every employee had an instinct about how the public was feeling about them that day. That’s what you’re going for in the ubiquity category.

Surprise

The opposite of notification fatigue is “surprise.” This doesn’t mean blocking employees in their cubicles with rolling dashboards while they aren’t looking. Instead, it means creating the right amount of variation that the dashboards remain fresh.

Rotating Views

One way to maintain “surprise” is to use rotating views on dashboards, say four or five. People walking by will catch different views at each glance, and it will take proportionately longer for them to get used to a single board and stop looking at it.
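Most dashboarding tools have their own rotation setting, but the underlying logic is simple enough to sketch in Python. The view names and the 30-second interval here are hypothetical.

```python
def current_view(views: list[str], elapsed_seconds: float, seconds_per_view: float = 30) -> str:
    """Pick which view to show, cycling through the list at a fixed interval."""
    if not views:
        raise ValueError("need at least one view")
    index = int(elapsed_seconds // seconds_per_view) % len(views)
    return views[index]


# Hypothetical set of four rotating views.
views = ["sales", "traffic", "errors", "support"]
print(current_view(views, elapsed_seconds=95))  # fourth 30-second slot
```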

Animation

The board will update on a regular basis and stay interesting, but also consider using relevant graphics or animations for unusual events. For example, you could create a dashboard where a record sales day causes the board to show an animation of raining money. This principle is the same as achievements in video games or the recent trend toward showering your screen in confetti when you complete a desired action. It’s a form of feedback in response to organizational action.

Easter Eggs

Most dashboarding systems don’t support this sort of thing yet, but they should.⁵ I’ve accomplished it with creative use of iframes or browser content plugins. Occasionally, the dashboard should do something really surprising, like a bear dancing on screen, grabbing a metric, and running off with it. A moment later it reappears on the board. It does not repeat.

This sort of secret is almost certainly not worth the effort, unless your full-time job is making dashboards, but it will get people talking.

Related Metrics

If you have a laundry list of metrics people want to see or you feel like you have hundreds of relevant metrics in general, begin to investigate their relationships and see if you can compress some of the redundancy into a more meaningful point of data.

For example, you may want to know how many messages are in your Dataflow queues. If you have multiple steps in the queue, it seems reasonable to list all of them. Instead, consider showing an average of messages in each step or a sum of messages in all steps. If you really need to know each queue individually, consider whether the actual number is important or the relation between the numbers of messages in each queue is the important data. If the latter, you can make all queue counts into a single line chart and place it next to the sum or average.
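The queue example might be compressed along these lines. This Python sketch assumes the per-step message counts have already been fetched from somewhere; the step names are made up, and nothing here calls a real Dataflow API.

```python
def summarize_queues(counts: dict[str, int]) -> dict:
    """Collapse per-step queue depths into a total, an average, and relative shares."""
    if not counts:
        return {"total": 0, "average": 0.0, "share": {}}
    total = sum(counts.values())
    average = total / len(counts)
    # Each step's share of all waiting messages shows where a backlog is forming.
    share = {step: (n / total if total else 0.0) for step, n in counts.items()}
    return {"total": total, "average": average, "share": share}


# Hypothetical pipeline with three steps.
summary = summarize_queues({"parse": 10, "enrich": 30, "write": 60})
print(summary["total"], summary["share"])
```

The total (or average) becomes the single headline number, while the shares feed the small line chart placed next to it.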

In some cases, seemingly unrelated metrics will all tie down to a single thread, for example, the number of orders or transactions. Sometimes it’s appropriate to do a calculation on metrics and show the result, if you know viewers are just going to do it in their heads. For example, showing total revenue, the number of transactions, and the average transaction value would be okay, even though the three numbers form a division problem and move in tandem. Do this if all related numbers are important and viewers would otherwise mentally calculate them every time they look at the board.
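The revenue example amounts to doing the division once so that viewers don’t have to. A minimal Python sketch, with hypothetical amounts and field names:

```python
def revenue_tiles(transaction_amounts: list[float]) -> dict:
    """Compute the three related numbers that would sit side by side on the board."""
    count = len(transaction_amounts)
    total = sum(transaction_amounts)
    # Guard against an empty day so the board shows 0 rather than failing.
    average = total / count if count else 0.0
    return {
        "total_revenue": total,
        "transactions": count,
        "average_transaction_value": average,
    }


# Hypothetical day with three transactions.
print(revenue_tiles([20.0, 30.0, 50.0]))
```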

This one is a delicate balance between showing redundant information and answering the question “Compared to what?” Viewers need a frame of reference, and the entire dashboard should communicate that message. Use your best judgment.

Major Business Events

A great way to get surprise (and also to demonstrate the value of dashboards) is to assemble special views for momentous business events.

If a new system is going live and it’s not even clear what to track yet, pull some high-level information that demonstrates the moment’s message, and style it accordingly. For example, if you are opening a new retail store, the dashboard can be entirely focused on statistics about that store, including a picture, a map, and other superfluous metrics. Obviously if this store is one of many, the dashboard won’t be important for long, but it’s a great way to celebrate and bring awareness to other teams’ success. (And again, to be successful, that data will have to come from your data warehouse…)

Community

Dashboards provoke conversation about the business. Even on an average day, you’ll see a few people stop in front of the dashboard and make a comment about how a certain thing is doing. As you “democratize” dashboards and others start to make them, others may come up with some truly creative ideas. If this is their bridge to adding business value, don’t stop them.

One company made a dashboard to show their employee’s progress in a bike race, complete with maps, graphs, and a live web cam feed from their helmet. Tableau used Game of Thrones as a platform to solicit dashboards of all kinds of data related to the show. These are valuable training exercises to onboard others into your data program. (Obviously, your organization’s code of conduct and behavior applies.)

There has recently been a proliferation of DIY projects around smart mirrors and dashboards for the home too. Commercial products like the Echo Show and Nest Hub are basically piloting this concept into how we run our personal lives as well.

How to Build Dashboards

There’s a significant logistical component to getting your dashboards up and running. Once you have a sense for how many you need and what metrics are most important, you can get to work. Use your hierarchy to do a prioritization exercise and know which metrics are most important for real-time feedback. As I’ll repeat, resist the urge to cram more metrics into less real estate.

After that, it’s a matter of procuring or finding the tools you need.

Hardware

To accomplish the goal of ubiquity, we’re going to need hardware. The best dashboard setup in the world won’t accomplish anything if no one ever sees it. If your organization does not yet understand the value of dashboards, this may actually be the most difficult part of implementing your strategy. Possible barriers include a lack of adequate wall space, of support from facilities and IT for installation, or of budget. Do the best you can; the barriers will begin to fall as organizational leaders begin to notice dashboards. Expect some visitors to stop by and ask, “What is that? You know, right now, that {x} is happening in our business? Is that up-to-date?” followed shortly by “How do I get one?” Then, pull out the sheet of paper on which you’ve written all the current barriers to the dashboard culture, and read it aloud.

This is also a great place for your data governance team to advocate inside their own functional areas. No doubt, they have live metrics they think would inspire or inform their own teams. The spectrum of requests can be astounding: I’ve had people ask for everything from current spend on pay-as-you-go contracts (like…BigQuery) to a running list of employee birthdays. Support everything you can, with the stipulation that the data be sourced from the data warehouse. Two birds with one stone.

Screens

Not to be cliché, but get as many of the largest screens you can. Don’t even worry about the resolution; anything 1080p or higher is fine. The larger the screen, the farther away it can be seen. The more screens, the more metrics you can display. However, keep the dashboard hierarchy in mind: don’t use more screens to show the same density of metrics, only larger. And definitely don’t use more screens to show more total information.

As with any hierarchy of information, the scale moves from largest and most important to smallest and least important. If you have a single, all-powerful business metric, the space, and the money, you might try placing that metric alone on a screen. The effect is overwhelming; and you will probably add supplementary data about the metric, say, recent values or other correlated metrics. But you’ll get your point across. See Figure 17-3.

Figure 17-3. One metric to rule them all

Many people scrounge for old hardware or can secure approval for a single, moderately sized screen to start. Again, resist the temptation to overload that screen; if you have one 32-inch TV, stick to no more than six metrics. Then add to your list of barriers that you could display more if you had a larger screen.

Seriously, this isn’t a boasting or power game, even though some people may take it that way. This is about having business information in view of all stakeholders at all times. By the same token, you can use projectors to get more real estate, but old stock may not be bright enough in typical office lighting conditions.

As a side note, most TVs have power saving settings that let them turn on and off on a schedule or in response to an HDMI-CEC signal. Please use these settings to save energy and only have the screens on during normal business hours.
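If the TV's own scheduler is limited, the attached computer can send the HDMI-CEC signal itself. The following is a minimal sketch, not a definitive setup. It assumes libCEC's cec-client utility is installed on the machine, that the TV answers at CEC logical address 0, and that business hours are weekdays from 08:00 to 18:00. All three are assumptions to adjust for your office.

```python
import subprocess
from datetime import datetime, time

# Assumed schedule: Monday through Friday, 08:00 to 18:00 local time.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(18, 0)


def screen_should_be_on(now: datetime) -> bool:
    """Return True during business hours on a weekday."""
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and BUSINESS_START <= now.time() < BUSINESS_END


def cec_command(turn_on: bool) -> str:
    """Build the cec-client text command for logical address 0 (assumed to be the TV)."""
    return "on 0" if turn_on else "standby 0"


def apply_schedule(now: datetime) -> None:
    """Send the power command once; meant to be called from a scheduled job."""
    subprocess.run(
        ["cec-client", "-s", "-d", "1"],  # -s: single-command mode, -d 1: minimal logging
        input=cec_command(screen_should_be_on(now)),
        text=True,
        check=True,
    )
```

Calling `apply_schedule(datetime.now())` from a job that runs every few minutes keeps the screen in the right state even if someone switches it on or off by hand.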

Computers

Once you have all of the screens, you still need something to show the data. And that means another set of potential obstacles from IT, along the same lines: budget, device security, and setup. Before you can kick off a program like this, you’ll need to ensure that your computers will have broad enough Internet access to reach the dashboards. If your organization already has a CCTV or media distribution system, you may be able to start a project to hop on that. Your requirements are simple: display an HTML5 dynamic web page.

You can scrounge around for hardware here too, with the benefit being that leftover machines in your department probably already have security and wifi clearance. The power and maintenance requirements may be an issue if the machine decides to install system updates at random or the hardware crashes. You may also not like wires running from a wall mount to a machine sitting on the floor.

Any system-on-a-chip (SoC) should have sufficient power to run your dashboards and may also have the benefit of being able to be powered from the 5 V USB ports on the back of an HDTV. These are available from many organizations in x86, ARM, and single-board computer options.

The Raspberry Pi frequently comes up here as the hobbyist device of choice, and it will in fact run a pretty good dashboard. You can purchase as many as you need, as it’s hard to break the bank with a list price of 35 USD. Purchase a model 3B+ or a model 4; built-in wifi and Bluetooth will make it easier to interface with the system remotely.

Finally, some smart TVs have web browsers built in. These browsers sometimes are up-to-date and fast enough to power your dashboards. Note that with the exception of the Amazon Fire Stick, most media stick-based devices don’t run their own browsers without some modification. For example, you can get a Chromecast to load an arbitrary website, but officially the supported pattern is “casting” to your device from your computer or phone, which is not sufficient. In all cases, you want dedicated hardware to display the board. If it’s frequently broken or someone has to turn it on, this will present a self-created barrier to adoption. Whichever device you choose, check it for reliability for the first couple of weeks to make sure it has enough memory and CPU to handle refreshing a web page indefinitely.
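One way to run that reliability check on a Linux-based device such as a Raspberry Pi is to record memory and load at intervals and watch for a downward trend. This sketch assumes the standard Linux `/proc/meminfo` file is available; the function names are hypothetical and chosen only for illustration.

```python
import os
from datetime import datetime


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into a dict of integer values (kB for most keys)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def available_fraction(meminfo: dict) -> float:
    """Fraction of total memory the kernel reports as available."""
    return meminfo["MemAvailable"] / meminfo["MemTotal"]


def snapshot(path: str = "/proc/meminfo") -> dict:
    """Take one reading of available memory and the 1-minute load average."""
    with open(path) as handle:
        meminfo = parse_meminfo(handle.read())
    return {
        "time": datetime.now().isoformat(timespec="seconds"),
        "mem_available_fraction": round(available_fraction(meminfo), 3),
        "load_1min": os.getloadavg()[0],
    }
```

Appending the result of `snapshot()` to a log file every few minutes for the first two weeks should make a slow memory leak in the browser visible long before the screen goes blank.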

Security

If your dashboard runs on a machine that has access to internal systems, you may need to prevent passersby from using an attached keyboard or mouse to gain other unauthorized access. You can solve this problem with networking, only allowing the single address hosting your dashboard to be accessible from the machine. Or, if you have remote access that automatically configures on start, you could take the more extreme action of locking the machine in a case or even epoxying the case and ports shut. You may not need to take any of these steps, but don’t inadvertently create a vulnerability during this process.

Software

There are two components to the software, one being the dashboard display itself and the other being the tool that loads it. As with reporting, the landscape is ever changing, and people’s needs are rapidly evolving. This reminds me of the application lifecycle management days. People were clamoring for task boards and agile management systems, but Jira was still primarily for bug tracking, and many project management systems were heavily waterfall-driven. After a time, as the actual market request coalesced, companies started targeting these users directly. Dashboards are nearly at that point, but no one’s winning yet.

Dashboard Tools

Many traditional reporting and BI companies have leapt over into the dashboard space. From another angle, many DevOps performance monitoring companies like New Relic and Datadog have created dashboarding systems that serve other types of data perfectly well. There are countless open source systems and projects of even smaller scale to tackle this problem.

Many are unattractive or feature limited. Others are great for certain kinds of data, but are harder to integrate with others. In one example, I was trying to integrate data into a dashboard from Segment. Segment is an organization that integrates customer data collection across multiple sources, for example, uniting your scripts and tags across systems like Google Analytics. In fact, they even have their own dashboarding system for visualizing data. I was attempting to tie customer data into system health data. The dashboard tool I was using had a Segment integration, and I thought I could just tie them together. Unfortunately, the metrics collected by the integration were around the health of the Segment service itself: number of Segment events delivered, rejected, retried, and so forth. I couldn’t get at the data I wanted, which was inside the events themselves.

This wasn’t a failure of the dashboard I was using—it was just that the use cases I wanted weren’t compatible. As this area matures, it should get easier to cross-reference all kinds of metrics on the same dashboard to meet your needs.

In the next chapter, we’ll explore doing this with Google Data Studio, more or less as a proxy for any of the hundreds of other tools you might use. The concepts are similar across dashboard providers, and the design language is relatively consistent.

Displaying the Dashboard

Typically you can do this with a web browser. There are a few considerations for running a browser in unattended mode that you’ll want to take into account; the details depend on your hardware and its operating system.

First, as I mentioned earlier, make sure the hardware can handle refreshing a graphics-intensive web page for an indefinite amount of time. Second, make sure you have a script configured to boot the computer directly into the browser with a default web page. Lastly, unless your dashboard is showing public data, you will need authentication to access it. Figure out a way to handle this so that the dashboard isn’t just showing a login prompt all the time.
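As a rough illustration of the second and third points, here is a sketch of a launcher that a startup script could call. It assumes a Chromium-based browser whose binary is named `chromium-browser` (the name varies by distribution), and the profile directory is a placeholder. Reusing a persistent profile means a session you sign in to once can survive reboots, although how long it lasts depends on your dashboard provider.

```python
import subprocess


def kiosk_command(
    url: str,
    browser: str = "chromium-browser",              # assumed binary name
    profile_dir: str = "/home/pi/dashboard-profile",  # placeholder path
) -> list:
    """Build the command line for a full-screen, unattended browser window."""
    return [
        browser,
        "--kiosk",                          # full screen, no address bar or tabs
        "--noerrdialogs",                   # suppress error dialogs that would cover the board
        f"--user-data-dir={profile_dir}",   # persistent profile keeps cookies between boots
        url,
    ]


def launch(url: str) -> subprocess.Popen:
    """Start the browser and return the process handle so a supervisor can watch it."""
    return subprocess.Popen(kiosk_command(url))
```

A startup script would call something like `launch("https://dashboards.example.com/main")`, where the URL stands in for wherever your dashboard is hosted.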

Maintenance

Once the system is up and running, it will be important to avoid decay. Dashboards go out of date as underlying systems change. This is a mini-version of your plan to adapt to long-term change in your data program, with a couple extra wrinkles.

First, anything tied to system performance or health will change regularly as the DevOps team changes infrastructure. While they likely want to manage their own boards, understand what data, if any, is going into BigQuery from Cloud Logging or other application performance monitoring tools that you may be using. It’s totally fine to keep these dashboards and monitors separate if that’s your organizational culture. Depending on the infrastructure, it may not be relevant at all anyway.

On the same basis, when an underlying integration to BigQuery is changing, confirm that the affected data schemas won’t change. This would ideally be part of your data governance decision making, but can get harder to manage with a greater number of integrations.
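One lightweight way to confirm this is to keep a recorded copy of the column names and types each dashboard depends on and compare it against the live table after an integration changes. The sketch below only does the comparison. It assumes you supply both schemas as plain dictionaries, for example built from a query against the dataset's `INFORMATION_SCHEMA.COLUMNS` view; the column names in the usage note are made up.

```python
def schema_drift(expected: dict, actual: dict) -> dict:
    """Compare two {column_name: data_type} mappings and report the differences."""
    expected_cols = set(expected)
    actual_cols = set(actual)
    return {
        # Columns a dashboard relies on that no longer exist
        "removed": sorted(expected_cols - actual_cols),
        # New columns; usually harmless, but worth knowing about
        "added": sorted(actual_cols - expected_cols),
        # Columns present in both whose type has changed
        "retyped": sorted(
            col for col in expected_cols & actual_cols if expected[col] != actual[col]
        ),
    }


def is_breaking(drift: dict) -> bool:
    """Treat removed or retyped columns as breaking; added columns are not."""
    return bool(drift["removed"] or drift["retyped"])
```

For example, comparing `{"order_id": "STRING", "total": "NUMERIC"}` with `{"order_id": "INT64", "total": "NUMERIC", "channel": "STRING"}` reports `order_id` as retyped and `channel` as added, which `is_breaking` flags because of the type change.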

Most importantly and to the points made earlier, ensure that all the dashboards come up as expected every day and continue to display information throughout the day. The idea of making a dashboard to measure dashboard availability has occurred to me, but that is maybe a bit too meta.
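Short of a full meta-dashboard, a small scheduled check can catch the two most common failures: the page is unreachable, or it is showing a login prompt. This sketch is built on assumptions. It only inspects the server's response, so it cannot tell whether the physical screen is on or whether client-side widgets are stuck on spinners, and the `"Sign in"` marker is a guess at what your provider's login page contains.

```python
import urllib.error
import urllib.request


def fetch_page(url: str, timeout: float = 10.0) -> tuple:
    """Return (HTTP status, body text) for the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status
        return err.code, ""


def classify(status: int, body: str, login_marker: str = "Sign in") -> str:
    """Map a response to 'ok', 'login_prompt', or 'down'."""
    if status != 200:
        return "down"
    if login_marker in body:
        return "login_prompt"
    return "ok"


def check_dashboard(url: str, fetch=fetch_page) -> str:
    """Fetch the dashboard and classify the result; network failures count as 'down'."""
    try:
        status, body = fetch(url)
    except OSError:  # covers URLError, timeouts, and refused connections
        return "down"
    return classify(status, body)
```

Running `check_dashboard` against each board's URL first thing in the morning, and alerting on anything other than `"ok"`, covers the "come up as expected every day" requirement without much machinery.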

I anonymized and redacted an actual system health dashboard from a real company in Figure 17-4. It fails to follow most of the principles we discussed. Furthermore, it sat unmaintained for a year before it was replaced with a better one, by which point it looked like the following.

Figure 17-4. A dashboard graveyard: lots of boxes, mostly spinners, and no data

Summary

The next step up from reporting is creating visualizations and dashboards. Visualizations can be challenging to conceptualize and build, but an effective visual language is critical. Choosing the right ways to display your data can dramatically influence if and how stakeholders respond to it. Visualizations have a lot in common with reports and should be used in tandem, but the audience and message may vary. Another effective way of bringing people into your data program is to deploy data dashboards of all kinds everywhere possible. Dashboards and reporting visualizations also have much in common, but dashboards have a strong time-based component and heavily prioritize recency. The design discipline around effective dashboard presentation is not fully mature, but there are a number of good practices to follow. Even when dashboards have been created effectively, the program often fails due to logistical challenges of actually deploying those dashboards. In order to create and maintain a culture of data, the presentation of organizational data must be reliable and protected from deterioration.

In the next and last chapter in this part, we’ll explore an applied version of the concepts we’ve discussed for reporting and visualization. To do this, we’ll use Google Data Studio, which is directly integrated with BigQuery and can be used effectively on top of the data warehouse.

Footnotes

https://nacns.org/wp-content/uploads/2016/11/AF-Introduction.pdf

Björn Rost, the technical consultant for this book, hates pie charts. And I totally get that; the bad visualization example is a great reason why. We recommend this article for a more nuanced discussion: https://medium.com/@clmentviguier/the-hate-of-pie-charts-harms-good-data-visualization-cc7cfed243b6

Unless it’s a sales presentation, in which case, my company, DataCorp, produces the best reporting and visualization solution in the world. Email now to receive our whitepaper on truth in information and a 14-day free trial.

This also illuminates a key design distinction between Chapter 5, in which we covered batch loading, and Chapters 6 and 7, when we looked at real-time streaming. It’s also why whether you need real-time data is a fundamental question in the data charter and governance: your organization might not have thought about it.

You heard it here first.