Dataviz: good and bad

I’ve made a promise to myself not to blog anything until I get some more data processing tips written up on my website. But I’ll break it just for a quick couple of links. One rocks, the other sucks.

First, an amazing visualization of current wind and weather conditions over the whole world, by Cameron Beccario. Source code here. This brings together a few different trendy tools: the data is automatically scraped, then nicely animated on a planet that you can click and roll around. Very neat JavaScript, but also valuable as communication of quantitative information. Why is it better than just the old synoptic chart? Because it’s engaging, it gets people interested, and because you can see the whole story at a glance; you’re not limited to national boundaries. I think it’s potentially really useful for geography teachers everywhere. Arise, Sir Cameron. The next step would be to have it play the last week’s data as a video. I spotted it at Freakonometrics. The GIF below doesn’t really do it justice, by the way; go click on it.


Second, a graph spotted at Atlantic Cities which worried them because it looked like the whole world wants smaller households fast, and that’s going to cause environmental havoc. It worried me, on the other hand, because it just looked implausible. It’s amazing how complacent analysts* become as soon as they can switch on their stats software and do some fancy stuff. The common sense part of the brain powers down. Mmmm, breakpoints regression. Ooooh, bootstrapped starting values. Here’s a graph! What does it mean? Never mind that, let’s just publish the damn thing!

If you look at the slopes, the developed countries’ breakpoint is about 1893, which makes sense with industrialisation. The developing countries have 1987, which doesn’t make so much sense. It’s not clear from the paper, but it looks like the breakpoint regression was done at country level, without weighting them by population. I’m happy to be corrected on that, but that’s what it looks like. That gives China and Swaziland exactly the same weight in pushing and pulling the line. And, most importantly, look over at the far right of the developing countries – there aren’t many there with data since 1990 (they acknowledge this in the paper), and the ones that are there have smaller household sizes. Is it a trend or is it information bias? Smaller household <– healthier economy –> regular official statistics. This is not rocket science, it’s common sense. Think about what your data might mean! Aaargh.
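To make the weighting point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the numbers are invented toy values, not data from the paper, the helper name `weighted_slope` is mine, and it fits a single straight line rather than the breakpoint regression the authors used. It only shows how much a fitted slope can move depending on whether each country counts once or counts in proportion to its population.

```python
def weighted_slope(xs, ys, ws=None):
    """Slope of the least-squares line of ys on xs, with optional weights."""
    if ws is None:
        ws = [1.0] * len(xs)  # unweighted: every observation counts equally
    total = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    return sxy / sxx


# Invented toy data: (year, mean household size, population in millions).
# One huge country with a flat household size, two tiny ones where it falls.
toy = [
    (1990, 4.0, 1000.0), (2000, 4.0, 1000.0),  # hypothetical large country
    (1990, 6.0, 1.0), (2000, 4.0, 1.0),        # hypothetical small country A
    (1990, 6.0, 1.0), (2000, 4.0, 1.0),        # hypothetical small country B
]
years = [row[0] for row in toy]
sizes = [row[1] for row in toy]
pops = [row[2] for row in toy]

unweighted = weighted_slope(years, sizes)      # each country counts once
weighted = weighted_slope(years, sizes, pops)  # each person counts once

print(f"unweighted slope: {unweighted:.4f} persons per household per year")
print(f"weighted slope:   {weighted:.4f} persons per household per year")
```

In this toy set-up the unweighted line drops steeply because the two tiny countries outvote the big one, while the population-weighted line is almost flat. Which of the two is the right question to ask depends on whether you care about the typical country or the typical person.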



* – by “analysts”, I mean the authors of the paper, not Emily Badger whose writing and keen eye for interesting stats I have admired for some time



  1. This reminds me of a recent article by Andrew Gelman on dataviz vs. statistical graphics. It’s true, modern-day visualizations are engaging and a great way to attract attention. But it worries me how they tend to be perceived as evidence when, in my mind, I see them more as an attractive exercise in univariate screening. What would be nice is more collaboration between dataviz experts and statisticians on turning results into attractive and engaging visualizations.
    I have not read the article cited, but there’s something suspicious about the graph. Perhaps it is the unnatural lack of smoothness? The few data points before 1800? I also wonder how developed vs. developing is defined and whether those definitions were valid in 1800. And what about urban vs. rural, with population moving into cities and hence into smaller dwellings?

    1. I agree it’s univariate exploration, which is fine, but shouldn’t be dressed up as an amazing discovery. Sometimes you only have a limited database to explore, and secondary analysis is fine but interpretation should be cautious. Amusingly, I just published something along those lines with exploratory graphs ( ). But we didn’t do anything as black-and-white as breakpoints, and didn’t try to sell it as a scientific breakthrough!
