ishy squishy line graphs

Recently I’ve seen two examples of line graphs with thick, soft lines. They look really nice, but you have to be careful they don’t obscure the facts. Here’s a good example and later we’ll look at a bad one.


This is from Andrew Sielen’s blog Reality Prose, where he analysed trends in the price and complexity of Lego! It provoked a lot of discussion and reminiscence, as you can imagine. I thought this looked really nice. I asked him how he did it and he said the graphs were just drawn in Excel and given extra thick lines. In Excel 2007 (I can’t vouch for earlier versions) you can tick a box to choose “smoothed lines”. Here’s some random data with the default ugly settings:


and here it is without the markers – which I think makes it clearer:


and then finally with the smoothed option:


The nice thing is that it doesn’t seem to lose any of the detail. It looks to me like a cubic spline, which retains all the original data as knots. Here’s the same thing in R:


and the code is



That looks just like the Excel version, so I guess cubic splines it is. To get the squishy look, just ditch the points and thicken the lines.
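As a quick check on the idea that a cubic spline keeps every original data point as a knot, here is a minimal sketch in plain Python (it is not the R code behind the plots). It fits a natural cubic spline, which is only one of several cubic spline variants and not necessarily the one Excel or R uses, to made-up numbers standing in for the random data, and measures how far the curve misses each point. The function name and the data are purely illustrative.

```python
import bisect


def natural_cubic_spline(xs, ys):
    """Return s(x), the natural cubic spline through the knots (xs, ys).

    xs must be a strictly increasing list with at least two points.
    """
    n = len(xs) - 1                                  # number of intervals
    h = [xs[i + 1] - xs[i] for i in range(n)]        # interval widths

    # Second derivatives m[0..n] at the knots; "natural" means m[0] = m[n] = 0.
    m = [0.0] * (n + 1)
    if n >= 2:
        # Tridiagonal system for the interior second derivatives m[1..n-1].
        diag = [2.0 * (h[k] + h[k + 1]) for k in range(n - 1)]
        rhs = [6.0 * ((ys[k + 2] - ys[k + 1]) / h[k + 1]
                      - (ys[k + 1] - ys[k]) / h[k]) for k in range(n - 1)]
        # Forward elimination (Thomas algorithm).
        for k in range(1, n - 1):
            w = h[k] / diag[k - 1]
            diag[k] -= w * h[k]
            rhs[k] -= w * rhs[k - 1]
        # Back substitution.
        for k in reversed(range(n - 1)):
            m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    def s(x):
        # Locate the interval [xs[i], xs[i+1]] containing x (clamped at the ends).
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 1)
        a, b = xs[i + 1] - x, x - xs[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return s


# Made-up data, standing in for the random series plotted above.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [4.0, 9.0, 3.0, 7.0, 6.0, 1.0, 8.0, 5.0]

spline = natural_cubic_spline(xs, ys)
max_knot_error = max(abs(spline(x) - y) for x, y in zip(xs, ys))
print("largest miss at a knot:", max_knot_error)
print("value halfway between the first two knots:", spline(0.5))
```

The largest miss at a knot should be zero up to floating-point rounding, which is the sense in which the smoothing loses none of the original data: the curve only invents values between the points.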



And of course you could save it as a PDF and edit it in Inkscape if you want to go ker-azy:


You might be expecting me to say nasty things about that final version, but let’s be objective: all the data points are preserved, and the labelling is still intact. Horizontal gridlines would be handy, but they are not difficult to add. The only thing I feel a little uneasy about is the glow effect around the lines. Even so, I don’t think it prevents you from translating the image back into the data, which Robert Kosara suggests is the mark of a good visualization, so it is probably just my personal taste.

Now for the bad example, which comes from that rich vein of style over substance, the Health Service Journal.


The time series at the top is in the squishy style, but seems to have been squished with a graphics package rather than a stats one. Some of the spikes indicating high and low numbers of emergency admissions appear to curl back on themselves, which is impossible in a time series, because each point in time can have only one value, but your graphics software doesn’t know that. I am ignoring the linear regression and whether that is a good idea, and the mysterious 7 or 8 months when the hospital recorded exact compliance with the target; let’s focus on graphics for now.

What makes this different to my (deliberately provocative) Inkscaped line graph is that there’s a lot of detail in the Northwick Park time series, so the squishing makes the lines merge into one blob of ink. Once that happens, you can’t see anything apart from the gradual upward trend. That is a pity, because the second graph suggests a winter effect each year, but we can no longer see that in the time series. Basically, the ability to translate the picture back to data has been lost. Perhaps the series was always so complex that this couldn’t happen anyway, but that is a reason to present statistics instead, like a proper time series analysis.
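To illustrate how a drawing tool can make a line curl back on itself, here is a hedged Python sketch. It assumes the tool smooths the points with a parametric curve (a uniform Catmull-Rom spline, chosen only because it is a common one), treating time as just another coordinate. I don't know what software or method produced the HSJ graphic, and the points below are invented, deliberately unevenly spaced in time.

```python
def catmull_rom(points, samples_per_segment=20):
    """Sample a uniform Catmull-Rom curve through 2-D points.

    The first and last points are repeated so the curve spans every segment.
    """
    padded = [points[0]] + list(points) + [points[-1]]
    curve = []
    for i in range(1, len(padded) - 2):
        p0, p1, p2, p3 = padded[i - 1], padded[i], padded[i + 1], padded[i + 2]
        for step in range(samples_per_segment):
            t = step / samples_per_segment
            # Standard Catmull-Rom polynomial, applied to x and y alike.
            curve.append(tuple(
                0.5 * (2 * b + (c - a) * t
                       + (2 * a - 5 * b + 4 * c - d) * t ** 2
                       + (3 * b - a - 3 * c + d) * t ** 3)
                for a, b, c, d in zip(p0, p1, p2, p3)))
    curve.append(tuple(points[-1]))
    return curve


def doubles_back(curve):
    """True if the x (time) coordinate ever decreases along the curve."""
    return any(b[0] < a[0] for a, b in zip(curve, curve[1:]))


# Invented (time, value) points with a sharp spike and uneven time gaps.
uneven = [(0, 50), (10, 52), (11, 90), (21, 55)]
# The same values at evenly spaced times, for comparison.
even = [(0, 50), (1, 52), (2, 90), (3, 55)]

print("uneven spacing doubles back:", doubles_back(catmull_rom(uneven)))
print("even spacing doubles back:", doubles_back(catmull_rom(even)))
```

A spline fitted as a function of time, as a stats package would do it, cannot do this, because it returns exactly one value for each time. With evenly spaced times this particular curve also keeps time moving forwards, so the sketch only shows that the problem is possible with a parametric curve, not what happened in the HSJ graphic.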

