I’ve been stockpiling Opal Fruits, which young people tell me are now called Starburst, in anticipation of today’s election results.
This is like one-tenth of the stash. I don’t want to eat them though. You know what you’re going to get if you knock here at Halloween.
I took the New York Times’ hexbin cartogram, imposed a 6×8 rectangular grid and counted the most common party in each block. There was a little bit of fudging and chopping up the sweets. It is art, no? Here’s the video:
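For anyone who would rather do the counting in code than with sweets, here is a minimal sketch of the "most common party in each block" step. It is not the procedure I followed by hand: the function name `majority_grid`, the (x, y, party) input format, and reading 6×8 as 6 columns by 8 rows are all assumptions made for illustration.

```python
from collections import Counter

def majority_grid(cells, n_rows, n_cols, width, height):
    """Overlay an n_rows x n_cols grid on a map of size width x height and
    return the most common party in each block (None for empty blocks).

    cells: iterable of (x, y, party), one per hexagon (hypothetical format).
    """
    counts = [[Counter() for _ in range(n_cols)] for _ in range(n_rows)]
    for x, y, party in cells:
        # Clamp so that points on the far edge fall in the last block.
        row = min(int(y / height * n_rows), n_rows - 1)
        col = min(int(x / width * n_cols), n_cols - 1)
        counts[row][col][party] += 1
    return [
        [c.most_common(1)[0][0] if c else None for c in row]
        for row in counts
    ]

# Toy input with made-up parties "A", "B", "C" (not real election data).
cells = [(0.5, 0.5, "A"), (0.6, 0.4, "A"), (0.7, 0.2, "B"), (5.5, 7.5, "C")]
grid = majority_grid(cells, n_rows=8, n_cols=6, width=6.0, height=8.0)
```

Ties go to whichever party was counted first in that block, which is the code's equivalent of a little bit of fudging.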
You know how people love maps with little shapes encoding some data? Hexagons, circles, squares? Jigsaw pieces? Opal Fruits?
Rip’t from the pages of the Times Higher Education magazine, some years ago.
Or small multiples?
You know how people love charts made from emojis?
Stick them together and what do you get?
This is by Lazaro Gamio. They’re not standard emojis. Six variables get cut into ordinal categories and mapped to various expressions. You can hover on the page (his page, not mine, ya dummy) for more info. Note that some of the variables don’t change much from state to state. Uninsured, college degrees, those change, but getting enough sleep — not so much. It must be in there because it seems fun to map it to bags under the eyes. But the categorisation effectively standardises the variables so small changes in sleep turn into a lot of visual impact. Anyway, let’s not be too pedantic, it’s fun.
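The standardising effect is easy to see in a small sketch. I don't know how the cut points were actually chosen for the emoji map, so the rank-based binning below is an assumption, the function `ordinal_categories` is hypothetical, and the numbers are invented purely for illustration.

```python
def ordinal_categories(values, n_categories):
    """Assign each value a category 0..n_categories-1 according to its rank.

    Only the ordering of the values matters; how far apart they are is ignored.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    categories = [0] * len(values)
    for rank, i in enumerate(order):
        categories[i] = rank * n_categories // len(values)
    return categories

# Made-up percentages for five imaginary states (not real data).
uninsured = [4.0, 9.0, 13.0, 17.0, 22.0]   # wide spread
sleep = [64.0, 64.5, 65.0, 65.5, 66.0]     # narrow spread

uninsured_cats = ordinal_categories(uninsured, 5)
sleep_cats = ordinal_categories(sleep, 5)
```

Both variables come out as the same five categories, so a two-point range in sleep gets just as much face-pulling as an eighteen-point range in insurance.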
This idea goes back to Herman Chernoff, who always made it clear it wasn’t a totally serious proposal, and has been surprised at its longevity (see his chapter in PPF). Bill Cleveland was pretty down on the idea in his ’85 book:
“not enough attention was paid to graphical perception … visually decoding the quantitative information is just too difficult”
Here’s a graphic of a really deep oil well by Fuel Fighter via Visual Capitalist. This is rather reminiscent (ahem) of the long, tall graphics by the Washington Post (and the eerily similar one from the Guardian a few days later which they had to admit they had nicked) about flight MH370 at the bottom of the ocean. The WP graphic works because you have to scroll down, and down, and down, and down, and down (wow, that’s deep!), and down, and down (no way), and down before you get to the sea bed. Yes, all the usual references are there, hot air balloons and Burj Khalifas and Barad-Dûrs and what have you, but they don’t matter because it’s the scrolling that does it, giving you GU2 (“Conveying the sense of the scale and complexity of a dataset”) and GU6 (“Attracting attention and stimulating interest.”) The references don’t mean anything to me (or probably you); I may have seen the Burj Khalifa and thought it was amazingly tall, but I have no grasp of how tall and that is what matters: I’d have to have an intuitive feel for what 3 BKs are compared to the height of a jet aircraft, and I don’t have that, so why should I care about the references?
My problem with the Fuel Fighter graphic is that it doesn’t have that same sense of depth. The image file is 796 x 4554 pixels, which is an aspect ratio of 1:17. The WP image (SVG FTW) is 539 x 16030 or 1:30, which is pretty extreme! It feels to me like you’d have to get past 1:20 before it started to have enough impact.
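If you want to check how tall a graphic needs to be, the arithmetic is simple enough to sketch. The helper names `tallness` and `min_height_for` are mine, and the 1:20 threshold is only my hunch from above, not an established rule.

```python
import math

def tallness(width_px, height_px):
    """Return n in the aspect ratio 1:n (height divided by width)."""
    return height_px / width_px

def min_height_for(width_px, ratio):
    """Smallest whole-pixel height at which an image of this width reaches 1:ratio."""
    return math.ceil(width_px * ratio)

wp_ratio = tallness(539, 16030)       # the WP graphic, roughly 1:30
needed = min_height_for(796, 20)      # height a 796 px wide image needs for 1:20
```

On that hunch, a 796-pixel-wide image would need to run to 15,920 pixels before the scrolling started to do its work.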
The Washington Post have an article about the US budget out by Kim Soffen and Denise Lu. It’s not long, but brings in four different graphical formats to tell different aspects of the data story. A bar showing parts of the whole (see, you don’t need a pie for this!)
then a line/dot/whatever-you-want-to-call-it chart of the change in relative terms
then a waffle of that change in absolute terms, plus a sparkline of the past.
There’s also a link to full department-specific stories under each graphic. I think this is really good stuff, though I can imagine some design-heads wanting to reduce it further. It shows how you can make a good data-driven story out of not many numbers.
Easy links: dear-data.com deardata-deliveries.tumblr.com
Procedural notes that can be skipped:
I had previously intended to write something about the shapes employed by Giorgia Lupi and the Accurat studio – and indeed I still will. But that takes some time and it got leapfrogged by Dear Data. This post came at a good time because I didn’t get around to it straight away (we’re now at week 35 of the project) and by the time I did, some other ideas had bubbled up in conversations, focussing my attention on the process of design, critique and refinement (which is getting added to my reading pile for the summer). These ideas are so alien to statisticians that I am not sure any of them will have read this far into this post, but they (we) are the ones that need to up their (our) game in communication. Nobody else will do it for us! The other building block that came along in time was finally finding really nice writing paper and resolving to draft everything by hand from now on, preferably in time when I’m physically away from a computer. It has already proven very productive. People seem to have different approaches that work (like starting with bullet points, or cutting out phrases, or mind maps), but mine is to start writing at sentence one, like Evelyn Waugh, and just carry straight on. There is no draft; why should there be? Finding that technique and place to write is really valuable; don’t devalue it and try to squeeze it into a train journey or between phone calls. It’s the principal way in which you communicate your work, and probably the most overlooked.
Earlier in the week I was bloggin’ about extreme time scales and various uses of spirals in data visualisation. This morning I thought about it a little more and realised the attraction of extreme scales, like the entire lifetime of our planet, or the size of the solar system, is in large part just that it’s fun. I start my own dataviz talks with Gelman & Unwin’s 6 objectives, which I think are helpful in framing the many uses of images (for a statistician, anyway – we were trained that there is only one use of a graph and that is to check for outliers / normality briefly before it is deleted!), although I get the impression (and I would be happy to be corrected by better informed dataviz hipsters (I use the term with only the very mildest form of offense)) that those objectives are generally looked upon with some disdain as Johnnies-come-lately in a design community that has had its own goals for a much longer time. In this application, we are appealing to GU2, “conveying the sense of the scale and complexity of a dataset”. In the original paper, G&U give network graphs as an example, because they convey an overall impression but little or no concrete information, so people like me tend not to approve. I like the data to be retrievable by the viewer. But why not, if it effectively sets the scene?
A couple of unorthodox examples spring to mind: scale reconstructions of the solar system and Stamen+Nasdaq on high-frequency trading. If you wipe out the extremities with a super-log scale then you lose the fun too. (OK, it’s a sitting duck of an ugly example, but still!) Another good one is the Washington Post on Flight MH370.
And then consider two popular visualizations, US Gun Deaths and CarbonVisuals NYC. In each case, they rely on the emotional impact of the sudden acceleration or amplification of values, and they get that in very different ways. As we learnt from Haydn, the impact of the Surprise only really works the first time, but it stays fun for years afterwards.
This morning I heard an unusual announcement as I arrived at Balham (“gateway to the South”) railway station. The trains going into London are busiest, the man said, between 8:15 and 8:30, so travelling before or after this would make our journeys quicker and more comfortable.
I immediately thought this would make a good excuse to post this old London Transport poster, with its clever design and charming pictograms (and questionable math):
So, going by the poster, there seemed to be 35,000 people on the tube between 5:00 and 5:30 pm. According to 2009/10’s London Travel Demand Survey, there were about 125,000 (roughly eyeballing Figure 4.1) crammed in cheek by jowl.