Here’s a graphic of a really deep oil well by Fuel Fighter via Visual Capitalist. This is rather reminiscent (ahem) of the long, tall graphics by the Washington Post (and the eerily similar one from the Guardian a few days later which they had to admit they had nicked) about flight MH370 at the bottom of the ocean. The WP graphic works because you have to scroll down, and down, and down, and down, and down (wow, that’s deep!), and down, and down (no way), and down before you get to the sea bed. Yes, all the usual references are there, hot air balloons and Burj Khalifas and Barad-Dûrs and what have you, but they don’t matter because it’s the scrolling that does it, giving you GU2 (“Conveying the sense of the scale and complexity of a dataset”) and GU6 (“Attracting attention and stimulating interest.”) The references don’t mean anything to me (or probably you); I may have seen the Burj Khalifa and thought it was amazingly tall, but I have no grasp of how tall and that is what matters: I’d have to have an intuitive feel for what 3 BKs are compared to the height of a jet aircraft, and I don’t have that, so why should I care about the references?
My problem with the Fuel Fighter graphic is that it doesn’t have that same sense of depth. The image file is 796 x 4554 pixels, which is an aspect ratio of 1:17. The WP image (SVG FTW) is 539 x 16030 or 1:30, which is pretty extreme! It feels to me like you’d have to get past 1:20 before it started to have enough impact.
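For anyone who wants to check the arithmetic, here is a minimal sketch of the ratio for the WP graphic and of how tall a 796-pixel-wide image would need to be to reach my 1:20 hunch. The function names are made up for illustration, and the 1:20 threshold is only my gut feeling, not a rule.

```python
def tall_ratio(width_px: int, height_px: int) -> float:
    """Return N for an aspect ratio of 1:N (how many times taller than wide)."""
    return height_px / width_px


def min_height_for_ratio(width_px: int, n: int) -> int:
    """Return the height in pixels at which an image of this width reaches 1:n."""
    return width_px * n


# Washington Post MH370 graphic, dimensions as quoted above
wp = tall_ratio(539, 16030)
print(f"WP graphic: 1:{wp:.1f}")

# Height a 796-pixel-wide image would need to hit the 1:20 hunch
print(f"796 px wide needs {min_height_for_ratio(796, 20)} px to reach 1:20")
```

Swap in the dimensions of any other long, tall graphic to see which side of the threshold it falls on.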
The Washington Post have an article out about the US budget by Kim Soffen and Denise Lu. It’s not long, but it brings in four different graphical formats to tell different aspects of the data story: a bar showing parts of the whole (see, you don’t need a pie for this!), then a line/dot/whatever-you-want-to-call-it chart of the change in relative terms, then a waffle of that change in absolute terms, plus a sparkline of the past.
There’s also a link to full department-specific stories under each graphic. I think this is really good stuff, though I can imagine some design-heads wanting to reduce it further. It shows how you can make a good data-driven story out of not many numbers.
I’m in Rome at the International Workshop on Computational Economics and Econometrics. I gave a seminar on Monday on the ever-popular subject of data visualization. Slides are here. In a few minutes, I’ll be speaking on Inference in Complex Systems, a topic that grew out of the practical experience my colleague Rick Hood and I have had in health and social care research.
Here’s a link to my handout for that: iwcee-handout
In essence, we draw on realist evaluation and mixed-methods research to emphasise understanding the complex system and how the intervention works inside it. Unsurprisingly for regular readers, I try to promote transparency around subjectivities, awareness of philosophy of science, and Bayesian methods.
Easy links: dear-data.com deardata-deliveries.tumblr.com
Procedural notes that can be skipped:
I had previously intended to write something about the shapes employed by Giorgia Lupi and the Accurat studio – and indeed I still will. But that takes some time and it got leapfrogged by Dear Data. This post came at a good time because I didn’t get around to it straight away (we’re now at week 35 of the project) and by the time I did, some other ideas had bubbled up in conversations, focussing my attention on the process of design, critique and refinement (which is getting added to my reading pile for the summer). These ideas are so alien to statisticians that I am not sure any of them will have read this far into this post, but they (we) are the ones that need to up their (our) game in communication. Nobody else will do it for us! The other building block that came along in time was finally finding really nice writing paper and resolving to draft everything by hand from now on, preferably at times when I’m physically away from a computer. It has already proven very productive. People seem to have different approaches that work (like starting with bullet points, or cutting out phrases, or mind maps), but mine is to start writing at sentence one, like Evelyn Waugh, and just carry straight on. There is no draft; why should there be? Finding that technique and place to write is really valuable; don’t devalue it and try to squeeze it into a train journey or between phone calls. It’s the principal way in which you communicate your work, and probably the most overlooked.
I’ve been impressed with this website (constituencyexplorer.org.uk) put together by Jim Ridgway and colleagues at Durham, with input from the House of Commons Library and dataviz guru Alan Smith from the ONS. In part, it is aimed at members of parliament, so they can test their knowledge of facts about constituencies and learn more along the way. But it makes for a fun quiz for residents too. Everything is realised in D3, so it runs everywhere, even on your phone. There are a few features I really like: the clean design, the link between map, list and dotplot in the election results:
… the animation after choosing a value with the slider, highlighting the extra/shortfall icons and the numbers dropping in: nice!
… the simple but quite ambitious help pop-up:
… and the way that the dotplots are always reset to the full width of the variable, so you can’t be misled by small differences appearing bigger than they are. The user has to choose to zoom after seeing the full picture.
All in all, a very nice piece of work. I must declare that I did contribute a few design suggestions in its latter stages of development but I really take no credit for its overall style and functionality. Budding D3 scripters could learn a lot from the source code.
And while we’re on the topic, here is some more innovative electoral dataviz:
And finally, take a moment to ask election candidates to commit to one afternoon of free statistical training, a great initiative from the RSS – and frankly, not much to ask. Unfortunately, none of my local (Croydon Central) would-be lawmakers have been bothered to write back yet. But here are the parties that are most interested in accurate statistics, in descending order (by mashing up this and this):
- National Health Alliance: 4/13
- Pirate Party: 1/6
- Green Party: 47/568
- Labour Party: 51/647
- Plaid Cymru: 3/40
- Liberal Democrats: 47/647
- Ulster Unionist Party: 1/15
- Christian People’s Alliance: 1/17
- Conservative and Unionist Party: 27/647
- Scottish National Party: 2/59
- United Kingdom Independence Party: 15/624
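If you want to check the ordering, here is a rough sketch that turns each fraction above into a proportion and sorts on it. The variable names are mine, and the code treats each entry simply as numerator over denominator without assuming anything more about what the two sources count.

```python
from fractions import Fraction

# (numerator, denominator) for each party, copied from the list above
pledges = {
    "National Health Alliance": (4, 13),
    "Pirate Party": (1, 6),
    "Green Party": (47, 568),
    "Labour Party": (51, 647),
    "Plaid Cymru": (3, 40),
    "Liberal Democrats": (47, 647),
    "Ulster Unionist Party": (1, 15),
    "Christian People’s Alliance": (1, 17),
    "Conservative and Unionist Party": (27, 647),
    "Scottish National Party": (2, 59),
    "United Kingdom Independence Party": (15, 624),
}

# Sort on exact fractions to avoid floating-point ties, largest first
ranked = sorted(pledges.items(), key=lambda kv: Fraction(*kv[1]), reverse=True)

for party, (num, den) in ranked:
    print(f"{party}: {num}/{den} = {num / den:.1%}")
```

With parties this different in size, the proportions at the top rest on very few candidates, so the ranking deserves a pinch of salt.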
Earlier in the week I was bloggin’ about extreme time scales and various uses of spirals in data visualisation. This morning I thought about it a little more and realised the attraction of extreme scales, like the entire lifetime of our planet, or the size of the solar system, is in large part just that it’s fun. I start my own dataviz talks with Gelman & Unwin’s 6 objectives, which I think are helpful in framing the many uses of images (for a statistician, anyway – we were trained that there is only one use of a graph and that is to check for outliers / normality briefly before it is deleted!), although I get the impression (and I would be happy to be corrected by better informed dataviz hipsters (I use the term with only the very mildest form of offense)) that those objectives are generally looked upon with some disdain as Johnnies-come-lately in a design community that has had its own goals for a much longer time. In this application, we are appealing to GU2, “conveying the sense of the scale and complexity of a dataset”. In the original paper, G&U give network graphs as an example, because they convey an overall impression but little or no concrete information, so people like me tend not to approve. I like the data to be retrievable by the viewer. But why not, if it effectively sets the scene?
A couple of unorthodox examples spring to mind: scale reconstructions of the solar system and Stamen+Nasdaq on high-frequency trading. If you wipe out the extremities with a super-log scale then you lose the fun too. (OK, it’s a sitting duck of an ugly example, but still!) Another good one is the Washington Post on Flight MH370.
And then consider two popular visualizations, US Gun Deaths and CarbonVisuals NYC. In each case, they rely on the emotional impact of the sudden acceleration or amplification of values, and they get that in very different ways. As we learnt from Haydn, the impact of the Surprise only really works the first time, but it stays fun for years afterwards.
Last week Andrew Gelman picked up on a couple of graphs of extremely long time periods. Here they are again for your convenience (when one mentions a subject such as climate change, it’s like a magnet for time-wasters, so I’ll spare you from reading through the explosion of comments at Gelman’s blog).
What’s going on in that x-axis?!
Gelman liked the spirals within spirals; not everyone did. It put me in mind of two examples I saw recently when reading Isabel Meirelles’s book “Design For Information” (which is excellent!). The first is not good, in my humble opinion:
“10 years of Wikipedia” is a series of line graphs that are bent round into a spiral. You are supposed to compare the position of the line to the ideal spiral in grey. What this adds above and beyond the area chart on the left is questionable. I find it impossible to see the patterns, and I imagine that is something to do with how our brains perceive position radiating out from a central point.
The better use is when the spiralling is metaphorical. In this image from National Geographic, the number of space exploration missions that have flown by and visited different planets and moons is shown as concentric rings. One gets an immediate feel for the number of rings.