Chapter 2. Introducing D3
D3—also referred to as D3 or d3.js—is a JavaScript library for
creating data visualizations. But that kind of undersells it.
The abbreviation D3 references the tool’s full name, Data-Driven
Documents. The data is provided by you, and the documents are
web-based documents, meaning anything that can be rendered by a web browser, such as HTML and SVG. D3 does the driving, in the sense that it
connects the data to the documents.
Of course, the name also functions as a clever allusion to the network
of technologies underlying the tool itself: the W3, or World Wide Web,
or, today, simply “the Web.”
D3’s primary author is the brilliant Mike
Bostock, although there are a few other dedicated contributors. The
project is entirely open source and freely available on GitHub.
D3 is released under a BSD license, so you may use, modify, and adapt
the code for noncommercial or commercial use at no cost.
D3’s official home on the Web is d3js.org.
Fundamentally, D3 is an elegant piece of software that facilitates
generation and manipulation of web documents with data. It does this by:
-
Loading data into the browser’s memory
-
Binding data to elements within the document, creating new elements
as needed
-
Transforming those elements by interpreting each element’s bound
datum and setting its visual properties accordingly
-
Transitioning elements between states in response to user input
Learning to use D3 is simply a process of learning the syntax used to
tell it how you want it to load and bind data, and transform and
transition elements.
The transformation step is most important, as this is where the
mapping happens. D3 provides a structure for applying these
transformations, but, as we’ll see, you define the mapping rules. Should larger values make taller bars or brighter circles? Will
clusters be sorted on the x-axis by age or category? What color palette
is used to fill in countries on your world map? All of the visual design
decisions are up to you. You provide the concept, you craft the rules,
and D3 executes it—without telling you what to do. (Yes, it’s like the
opposite of Excel’s pushy “Chart Wizard.”)
Here is a list of things D3 does not do:
-
D3 doesn’t generate predefined or “canned” visualizations for you. This is on purpose. D3 is intended primarily for explanatory visualization work, as opposed to exploratory visualizations. Exploratory tools help you discover significant, meaningful patterns in data. These are tools like Tableau and ggplot2, which help you quickly generate multiple views on the same data set. That’s an essential step, but different from generating an explanatory presentation of the data, a view of the data that highlights what you’ve already discovered. Explanatory views are more constrained and limited, but also focused, and designed to communicate only the important points. D3 excels at this latter step, but is not ideal for the former. (For ideas on other tools, see the section Alternatives later in this chapter.)
-
D3 doesn’t even try to support older browsers. This helps keep the D3 codebase clean and free of hacks to support old versions of Internet Explorer, for example. The philosophy is that by creating more compelling tools and refusing to support older browsers, we encourage more people to upgrade (rather than forestall the process, thereby requiring us to continue to support those browsers, and so on—a vicious cycle). D3 wants us to move forward.
-
D3’s core functionality doesn’t handle bitmap map tiles, such as those provided by Google Maps or Cloudmade. D3 is great with anything vector—SVG images or GeoJSON data—but wasn’t originally intended to work with traditional map tiles. (Bitmap images are made up of pixels, so resizing them larger or smaller is difficult without a loss in quality. Vector images are defined by points, lines, and curves—mathematical equations, really—and can be scaled up or down without a loss in quality.) This is starting to change, with the introduction of the d3.geo.tile plug-in. Prior to this plug-in, geomapping with D3 meant either going all-SVG and avoiding tiles or using D3 to create SVG visuals on top of a base layer of map tiles (which would be managed by another library, like Leaflet or Polymaps—see the section Alternatives later in this chapter). This question of how to integrate bitmap tiles and vector graphics comes up a lot in the D3 community. As of today, there is no super-simple and perfect answer, but I think you can expect to see lots of work done in this area, and possibly the new tile-handling methods integrated into the D3 core at some point in the future.
-
D3 doesn’t hide your original data. Because D3 code is executed on the client side (meaning, in the user’s web browser, as opposed to on the web server), the data you want visualized must be sent to the client. If your data can’t be shared, then don’t use D3. Alternatives include using proprietary tools (like Flash) or prerendering visualizations as static images and sending those to the browser. (If you’re not interested in sharing your data, though, why would you bother visualizing it? The purpose of visualization is to communicate the data, so you might sleep better at night by choosing openness and transparency, rather than having nightmares about data thieves.)
The first web browsers rendered static pages; interactivity was limited
to clicking links. In 1996, Netscape introduced the first browser
with JavaScript, a new scripting language that could be interpreted by the browser while the page was being viewed.
This doesn’t sound as groundbreaking as it turned out to be, but this
enabled web browsers to evolve from merely passive browsers to dynamic
frames for interactive, networked experiences. This shift ultimately
enabled every intrapage interaction we have on the Web today. Without
JavaScript, D3 would never exist, and web-based data visualizations
would be limited to prerendered, noninteractive GIFs. (Yuck. Thank
you, Netscape!)
Jump ahead to 2005, when Jeffrey Heer, Stuart Card, and James Landay introduced prefuse, a toolkit for bringing data visualization to the Web. prefuse (spelled with all lowercase letters) was written in Java, a compiled language, with programs that could run in web browsers via a Java plug-in. (Note that Java is a completely different programming language than JavaScript, despite their similar names.)
prefuse was a breakthrough application—the first to make web-based visualization accessible to less-than-expert programmers. Until prefuse came along, any datavis on the Web was very much a custom affair.
Two years later, Jeff Heer introduced Flare, a similar toolkit, but written in ActionScript, so its visualizations could be viewed on the Web through Adobe’s Flash Player. Flare, like prefuse, relied on a browser plug-in. Flare was a huge improvement, but as web browsers continued to evolve, it was clear that visualizations could be created with native browser technology, no plug-ins required.
By 2009, Jeff Heer had moved to Stanford, where he was advising a graduate student named Michael Bostock. Together, in Stanford’s Vis Group, they created Protovis, a JavaScript-based visualization toolkit that relied exclusively on native browser technologies. (If you have used Protovis, be sure to reference Mike’s introduction to D3 for Protovis users.)
Protovis made generating visualizations simple, even for users without prior programming experience. Yet to achieve this, it created an abstract representation layer. The designer could address this layer using Protovis syntax, but it wasn’t accessible through standard methods, so debugging was difficult.
In 2011, Mike Bostock, Vadim Ogievetsky, and Jeff Heer officially announced D3, the next evolution in web visualization tools. Unlike Protovis, D3 operates directly on the web document itself. This means easier debugging, easier experimentation, and more visual possibilities. The only downside to this approach is a potentially steeper learning curve, but this book will make that as painless as possible. Plus, all the skills you gain while learning about D3 will prove useful even beyond the realm of datavis.
If you’re familiar with any of these groundbreaking tools, you’ll appreciate that D3 descends from a prestigious lineage. And if you have any interest in the philosophy underlying D3’s elegant technical design, I highly recommend Mike, Vadim, and Jeff’s InfoVis paper, which clearly articulates the need for this kind of tool. The paper encapsulates years’ worth of learning and insights made while developing visualization tools.
D3 might not be perfect for every project. Sometimes you just need a quick
chart and you don’t have time to code it from scratch. Or you might need
to support older browsers and can’t rely on recent technologies like
SVG.
For those situations, it’s good to know what other tools are out there.
Here is a brief, noncomprehensive list of D3 alternatives, all of which
use web-standard technologies (mostly JavaScript) and are free to
download and use.
-
DataWrapper
-
A beautiful web service that lets you upload your data and quickly generate a chart that you can republish elsewhere or embed on your site. This service was originally intended for journalists, but it is helpful for everyone. DataWrapper displays interactive charts in current browsers and static images for old ones. (Brilliant!) You can also download all the code and run it on your own server instead of using theirs.
-
Flot
-
A plotting library for jQuery that uses the HTML canvas element and supports older browsers, even all the way back to Internet Explorer 6. It supports limited visual forms (lines, points, bars, areas), but it is easy to use.
-
Google Chart Tools
-
Having evolved from their earlier Image Charts API, Google’s Chart Tools can be used to generate several standard chart types, with support for old versions of IE.
-
gRaphaël
-
A charting library based on Raphaël
(see later in this chapter) that supports older browsers, including IE6. It has more visual flexibility than Flot, and—some might say—it is prettier.
-
Highcharts JS
-
A JavaScript-based charting library with several predesigned themes and chart types. It uses SVG for modern browsers and falls back on VML for old versions of IE, including IE6 and later. The tool is free only for noncommercial use.
-
JavaScript InfoVis Toolkit
-
The JIT provides
several preset visualization styles for your data. It includes lots of examples, but the documentation is pretty technical. The toolkit is great if you like one of the preset styles, but browser support is unclear.
-
jqPlot
-
A plug-in for charting with jQuery. This supports very simple charts and is great if you are okay with the predefined styles. jqPlot supports IE7 and newer.
-
jQuery Sparklines
-
A jQuery plug-in for generating sparklines, typically small bar, line, or area charts used inline with text. Supports most browsers, even back to IE6.
-
Peity
-
A jQuery plug-in for very simple and very tiny bar, line, and pie charts that supports only recent browsers. Did I mention that this makes only very tiny visualizations? +10 cuteness points.
-
Timeline.js
-
A library specifically for
generating interactive timelines. No coding is required; just use the code generator. There is not much room for customization, but hey, timelines are really hard to do well. Timeline.js supports only IE8 and newer.
-
YUI Charts
-
The Charts module for the Yahoo! User Interface Library enables creation of simple charts with a goal of wide browser support.
A “graph” is just data with a networked structure (for example, B is connected
to A, and A is connected to C).
-
Arbor.js
-
A library for graph visualization using
jQuery. Even if you never use this, you should check out how the documentation is presented as a graph, using the tool itself. (It’s so meta.) It uses the HTML canvas, so it works only in IE9 or current browsers, although some workarounds are available.
-
Sigma.js
-
A very lightweight library for graph
visualization. You have to visit this website, move your mouse over the header graphic, and then play with the demos. Sigma.js is beautiful and fast, and it also uses canvas.
I distinguish between mapping (all visualizations are maps) and
geomapping (visualizations that include geographic data, or geodata,
such as traditional maps). D3 has a lot of geomapping functionality, but
you should know about these other tools.
-
Kartograph
-
A JavaScript-and-Python combo for
gorgeous, entirely vector-based mapping by Gregor Aisch with must-see demos. Please go look at them now. I promise you’ve never seen online maps this beautiful. Kartograph works with IE7 and newer.
-
Leaflet
-
A library for tiled maps,
designed for smooth interaction on both desktop and mobile devices. It includes some
support for displaying data layers of SVG on top
of the map tiles. (See Mike’s demo “Using D3 with
Leaflet”.) Leaflet works with IE6 (barely) or IE7 (better!) and of course all current browsers.
-
Modest Maps
-
The granddaddy of tiled map libraries, Modest Maps has been succeeded by Polymaps, but lots of people still love it, as it is lightweight and works with old versions of IE and other browsers. Modest Maps has been adapted for ActionScript, Processing, Python, PHP, Cinder, openFrameworks…yeah, basically everything. File this under “oldie, but goodie.”
-
Polymaps
-
A library for displaying tiled maps, with layers of data on top of the tiles. Polymaps relies on SVG and thus works best with current browsers.
These tools, like D3, provide methods of drawing visual forms, but
without predesigned visual templates. If you enjoy the creative freedom
of starting from scratch, you might enjoy these.
-
Processing.js
-
A native JavaScript
implementation of Processing, the fantastic
programming language for artists and designers new to programming.
Processing is written in Java, so exporting Processing sketches to the
Web traditionally involved clunky Java applets. Thanks to Processing.js,
regular Processing code can run natively, in the browser. It renders using
canvas, so only modern browsers are supported.
-
Paper.js
-
A framework for rendering vector
graphics to canvas. Also, its website is one of the most beautiful on
the Internet, and their demos are unbelievable. (Go play with them now.)
-
Raphaël
-
Another library for drawing vector graphics, popular due to its friendly syntax and support for older browsers.
D3 is not the best at 3D, simply because web browsers are historically
two-dimensional beasts. But with increased support for WebGL, there are
now more opportunities for 3D web experiences.
-
PhiloGL
-
A WebGL framework
specifically for 3D visualization.
-
Three.js
-
A library for generating
any sort of 3D scene you could imagine, produced by Google’s Data Arts
team. You could spend all day exploring the mind-blowing demos on their
site.
When you want to use D3 without actually writing any D3 code, you
can choose one of the many tools built on top of D3!
-
Crossfilter
-
A library for working with large, multivariate datasets, written primarily by Mike Bostock. This is useful for trying to squeeze your “big data” into a relatively small web browser.
-
Cubism
-
A D3 plug-in for visualizing time series data, also written by Mike Bostock. (One of my favorite demos.)
-
Dashku
-
An online tool for data dashboards and widgets updated in real time, by Paul Jensen.
-
dc.js
-
The “dc” is short for dimensional charting, as this library is optimized for exploring large, multidimensional datasets.
-
NVD3
-
Reusable charts with D3. NVD3 offers lots of beautiful examples, with room for visual customizations without requiring as much code as D3 alone.
-
Polychart.js
-
More reusable charts, with a range of chart types available. Polychart.js is free only for noncommercial use.
-
Rickshaw
-
A toolkit for displaying time series data that is also very customizable.
-
Tributary
-
A great tool for experimenting with live coding using D3, by Ian Johnson.