D3 for D3 and Coding Workshop
Overview
Browser developer tools, such as those in Chrome and Firefox, offer numerous ways to explore and experiment with front-end code. The debugger is an especially powerful tool: it not only lets you pause a program to work through bugs, but also lets you explore the inner workings of a library or framework.
Today, we'll go deeper into D3 by using debugger-driven discovery...yes, D3 for D3. We'll focus on grasping one of the most important but initially confusing aspects of the D3 library -- data joins.
We'll devote any time remaining in class to hands-on work for the final project.
Pre-flight setup
Before we dive into the guts of D3, let's do some preliminary setup.
- In the shell, create a new directory, e.g.
mkdir ~/Desktop/d3-under-the-hood/
- Inside the new directory, create a file called index.html with the following HTML:
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>D3 - Under the Hood</title>
    <style>
      svg {
        margin-top: 50px;
        margin-left: 50px;
      }
    </style>
  </head>
  <body>
    <h1>D3 - Under the Hood</h1>
    <svg>
      <circle></circle>
      <circle></circle>
      <circle></circle>
    </svg>
    <script src="d3.v5.js"></script>
    <script type="text/javascript">
      var coordinates = [[0,0], [10,10], [20, 20]];
      // var coordinates = [[0,0], [10,10], [20, 20], [30,30], [40, 40]];
      // circle selection
      var cs = d3.select('svg')
        .selectAll('circle')
        .data(coordinates);
      console.log(cs);
    </script>
  </body>
</html>
- Download a nicely formatted version of the D3 source code. Be sure to save it as d3.v5.js in d3-under-the-hood/
- In a new shell, navigate to the new directory and fire up a test server:
python -m http.server
- Use Firefox (or Chrome) to load the file at http://localhost:8000/
Chrome and Firefox provide similar debugging functionality, but the interfaces vary for certain features (e.g. viewing DOM properties). It can be helpful to use both when using a debugger to explore code.
Data Joins
We know that D3 can create, style and animate page elements based on data points. But how does it work its magic? What are the mechanics that enable D3 to do what it does?
Exploring this question will take us into the DOM and D3's internals, and hopefully illuminate the critical concept of a data join.
Data binding basics
First, we'll use the debugger to explore the mechanics of data binding in the simplest scenario - when the number of data points corresponds exactly to the number of DOM nodes.
Some key questions you should be able to answer after the debugger demonstration in class:
- How can a debugger be used to examine DOM properties?
- How can a debugger be used to set breakpoints and step through source code?
- How and where (in the source code) does D3 attach data to a DOM node?
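As a rough mental model for the last question, the sketch below mimics data binding in plain JavaScript when data and nodes match one-to-one. It is a simplified illustration, not D3's actual source: `bindData` is a hypothetical name and plain objects stand in for DOM nodes. Only the `__data__` property name comes from D3 itself.

```javascript
// Simplified model of data binding (not D3's real code).
// Plain objects stand in for DOM nodes; bindData is a hypothetical name.
function bindData(nodes, data) {
  nodes.forEach(function (node, i) {
    // The datum is stored directly on the node as a JavaScript property.
    node.__data__ = data[i];
  });
  return nodes;
}

const modelCircles = [{ tag: 'circle' }, { tag: 'circle' }, { tag: 'circle' }];
bindData(modelCircles, [[0, 0], [10, 10], [20, 20]]);

console.log(modelCircles[1].__data__); // logs the pair 10, 10
```

On the real page, you can inspect the same property from the browser console once the script has run, e.g. `document.querySelectorAll('circle')[1].__data__`.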
Exercise - Display the points
Try updating the code so that the three circles will display in positions based on their respective coordinates. This will require you to set several attributes on each circle:
- r - set the radius to 5 pixels (SVG circles use the r attribute for the radius)
- fill - pick any color, e.g. blue
- cx - the first number in each coordinate
- cy - the second number in each coordinate
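One possible solution, offered as a sketch rather than the official answer: D3's `attr` accepts either a constant or a function that receives each node's bound datum. The accessor names `getX` and `getY` are made up for this example, and the `typeof d3` guard is only there so the snippet does nothing outside the workshop page.

```javascript
var points = [[0, 0], [10, 10], [20, 20]];

// Hypothetical accessors: pull x and y out of one [x, y] pair.
var getX = function (d) { return d[0]; };
var getY = function (d) { return d[1]; };

// Only runs where D3 and a page are available (e.g. inside index.html).
if (typeof d3 !== 'undefined' && typeof document !== 'undefined') {
  d3.select('svg')
    .selectAll('circle')
    .data(points)
    .attr('r', 5)           // constant: same radius for every circle
    .attr('fill', 'blue')   // constant: same color for every circle
    .attr('cx', getX)       // function: called once per circle with its datum
    .attr('cy', getY);
}
```

Because the first point is the SVG origin, only part of that circle is visible; shifting the coordinates (or the circles) makes all three fully visible.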
When data exceeds nodes
A common idiom in D3 involves declaratively selecting page elements that don't yet exist, and using D3 to create those elements after binding them to data.
For example, in the body element below, there's an svg with one circle element but three data points:
<body>
  <svg>
    <circle></circle>
  </svg>
  <script src="d3.v5.js"></script>
  <script type="text/javascript">
    var coordinates = [[0,0], [10,10], [20, 20]];
    // circle selection
    var cs = d3.select('svg')
      .selectAll('circle')
      .data(coordinates)
      .enter()
      .append('circle');
  </script>
</body>
Use a debugger to step through the source code and answer the following questions:
- Where in the D3 source is the enter "subselection" configured?
- What is contained in the _enter property? Why are there empty slots?
- Can you explain the high-level role of the enter subselection?
- What is the role of .append('circle')?
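To make the idea of placeholders concrete, here is a toy model in plain JavaScript. It is a sketch under stated assumptions, not D3's implementation: `makeEnter` and `appendAll` are invented names and plain objects stand in for DOM nodes. It illustrates why an enter array can have empty slots (positions already covered by existing nodes) and how appending creates the missing elements while carrying each datum along.

```javascript
// Toy model of the enter subselection (not D3's real code).
// makeEnter and appendAll are hypothetical names.
function makeEnter(parent, data) {
  const existing = parent.children.length;
  const enter = new Array(data.length); // slots for existing nodes stay empty
  for (let i = existing; i < data.length; i++) {
    // Placeholder: remembers the parent and the datum awaiting a node.
    enter[i] = { parent: parent, __data__: data[i] };
  }
  return enter;
}

// Creates one child per placeholder and copies the datum onto it.
function appendAll(enter, tag) {
  const created = [];
  enter.forEach(function (placeholder) { // forEach skips empty slots
    const child = { tag: tag, __data__: placeholder.__data__ };
    placeholder.parent.children.push(child);
    created.push(child);
  });
  return created;
}

const svgModel = { children: [{ tag: 'circle' }] }; // one existing circle
const enterModel = makeEnter(svgModel, [[0, 0], [10, 10], [20, 20]]);
const added = appendAll(enterModel, 'circle');

console.log(0 in enterModel);           // false: slot 0 is empty
console.log(svgModel.children.length);  // 3
console.log(added.length);              // 2
```

In the workshop code, note that `cs` ends up holding only the newly appended circles, because `.append('circle')` returns a selection of the elements it created.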
When nodes exceed data
When performing a data join, nodes that have no corresponding data are "sequestered" in the _exit property of the D3 selection.
Dynamic visualizations that rely on data updates after page load are a common use case for exit node manipulation.
For example, the code below contains 5 circles and only 3 data coordinates:
<body>
  <h1>D3 - Under the Hood</h1>
  <svg>
    <circle></circle>
    <circle></circle>
    <circle></circle>
    <circle></circle>
    <circle></circle>
  </svg>
  <script src="d3.v5.js"></script>
  <script type="text/javascript">
    var coordinates = [[0,0], [10,10], [20, 20]];
    // circle selection
    var cs = d3.select('svg')
      .selectAll('circle')
      .data(coordinates);
    //console.log(cs);
  </script>
</body>
Above, the two "excess" circle nodes are placed in the D3 selection's _exit property, where they can be manipulated (e.g. removed from the page).
Try updating the code to add .exit().remove() as below. Then reload the page to see the effect.
var cs = d3.select('svg')
  .selectAll('circle')
  .data(coordinates)
  .exit().remove();
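The effect of removing exit nodes can be pictured with another toy model in plain JavaScript. Again, this is an illustrative sketch and not D3's code: `removeExit` is a hypothetical name, plain objects stand in for DOM nodes, and data is assumed to be matched to nodes by index.

```javascript
// Toy model of exit removal (not D3's real code).
// removeExit is a hypothetical name.
function removeExit(parent, data) {
  // Nodes whose index lies beyond the data have nothing to bind to.
  const exitNodes = parent.children.slice(data.length);
  // Drop those nodes from the parent, keeping the ones that received data.
  parent.children.length = Math.min(parent.children.length, data.length);
  return exitNodes; // the removed nodes
}

const fiveCircles = {
  children: [1, 2, 3, 4, 5].map(function (id) {
    return { tag: 'circle', id: id };
  })
};
const removedNodes = removeExit(fiveCircles, [[0, 0], [10, 10], [20, 20]]);

console.log(fiveCircles.children.length); // 3
console.log(removedNodes.length);         // 2 (the circles with ids 4 and 5)
```

One detail worth checking in the debugger: `.remove()` returns the selection it was called on, so after the chained call `cs` refers to the removed circles rather than the three that remain on the page.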
Summary
By working with a debugger, hopefully we've shed light on some of the
core concepts of data binding in D3.
Some key takeaways:
- You can gain insight into a library or framework by using a debugger to explore source code.
- D3 is first and foremost a tool to manipulate the DOM. Visualization is a focus because of D3's facilities for working with SVG and other DOM elements, but visualization is far from the only focus. For example, it's quite possible and reasonable to use D3 for projects that don't involve visualization in the traditional sense but require DOM manipulation.
- D3 is a thin layer over core web technologies, in particular the DOM.
- D3 closely couples data with the DOM by "joining" data to nodes on a __data__ property. This technique lies at the heart of D3 and makes it deserving of its name (data-driven documents).
- D3 provides an API that lets you create, style and animate nodes based on data stored in the __data__ property.
- D3 creates a 1-to-1 correspondence between a selection of nodes and data passed to those nodes. The enter, update and exit "subselections" allow you to work with data and nodes based on the nature of the join (e.g. if there are no existing DOM nodes in a selection for incoming data).
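The three scenarios covered in this workshop can be summarized with a small calculation. The helper below is hypothetical (`joinSizes` is not part of D3) and assumes data is matched to nodes by index, as in all the examples above.

```javascript
// Hypothetical helper: how many nodes land in each subselection
// when data is matched to nodes by index.
function joinSizes(nodeCount, dataCount) {
  return {
    update: Math.min(nodeCount, dataCount),     // nodes that receive a datum
    enter: Math.max(0, dataCount - nodeCount),  // data still waiting for a node
    exit: Math.max(0, nodeCount - dataCount)    // nodes left without a datum
  };
}

console.log(joinSizes(3, 3)); // 3 update, 0 enter, 0 exit
console.log(joinSizes(1, 3)); // 1 update, 2 enter, 0 exit
console.log(joinSizes(5, 3)); // 3 update, 0 enter, 2 exit
```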
Further reading
For a deeper dive into data joins, check out these two resources: