Tag Archives: infovis

Dataviz of the week, 29/3/17

Here’s a graphic of a really deep oil well by Fuel Fighter via Visual Capitalist. This is rather reminiscent (ahem) of the long, tall graphics by the Washington Post (and the eerily similar one from the Guardian a few days later which they had to admit they had nicked) about flight MH370 at the bottom of the ocean. The WP graphic works because you have to scroll down, and down, and down, and down, and down (wow, that’s deep!), and down, and down (no way), and down before you get to the sea bed. Yes, all the usual references are there, hot air balloons and Burj Khalifas and Barad-Dûrs and what have you, but they don’t matter because it’s the scrolling that does it, giving you GU2 (“Conveying the sense of the scale and complexity of a dataset”) and GU6 (“Attracting attention and stimulating interest.”) The references don’t mean anything to me (or probably you); I may have seen the Burj Khalifa and thought it was amazingly tall, but I have no grasp of how tall and that is what matters: I’d have to have an intuitive feel for what 3 BKs are compared to the height of a jet aircraft, and I don’t have that, so why should I care about the references?

Screen Shot 2017-03-21 at 08.42.38

My problem with the Fuel Fighter graphic is that it doesn’t have that same sense of depth. The image file is 796 x 4554 pixels, which is an aspect ratio of 1:17. The WP image (SVG FTW) is 539 x 16030 or 1:30, which is pretty extreme! It feels to me like you’d have to get past 1:20 before it started to have enough impact.

 


Filed under Visualization

Dataviz of the week, 22/3/17

The Washington Post have an article out about the US budget, by Kim Soffen and Denise Lu. It’s not long, but it brings in four different graphical formats to tell different aspects of the data story. First, a bar showing parts of the whole (see, you don’t need a pie for this!)

Screen Shot 2017-03-21 at 08.21.25

then a line/dot/whatever-you-want-to-call-it chart of the change in relative terms

Screen Shot 2017-03-21 at 08.21.38

then a waffle of that change in absolute terms, plus a sparkline of the past.

Screen Shot 2017-03-21 at 08.21.55

There’s also a link to full department-specific stories under each graphic. I think this is really good stuff, though I can imagine some design-heads wanting to reduce it further. It shows how you can make a good data-driven story out of not many numbers.


Filed under Visualization

We were there when they made Dear Data

 

Easy links: dear-data.com deardata-deliveries.tumblr.com

dd001

dd002

dd003

dd004

dd005

dd006

 

 

Procedural notes that can be skipped:

I had previously intended to write something about the shapes employed by Giorgia Lupi and the Accurat studio – and indeed I still will. But that takes some time and it got leapfrogged by Dear Data. This post came at a good time because I didn’t get around to it straight away (we’re now at week 35 of the project) and by the time I did, some other ideas had bubbled up in conversations, focussing my attention on the process of design, critique and refinement (which is getting added to my reading pile for the summer). These ideas are so alien to statisticians that I am not sure any of them will have read this far into this post, but they (we) are the ones that need to up their (our) game in communication. Nobody else will do it for us! The other building block that came along in time was finally finding really nice writing paper and resolving to draft everything by hand from now on, preferably in time when I’m physically away from a computer. It has already proven very productive. People seem to have different approaches that work (like starting with bullet points, or cutting out phrases, or mind maps), but mine is to start writing at sentence one, like Evelyn Waugh, and just carry straight on. There is no draft; why should there be? Finding that technique and place to write is really valuable; don’t devalue it and try to squeeze it into a train journey or between phone calls. It’s the principal way in which you communicate your work, and probably the most overlooked.

3 Comments

Filed under Visualization

Afterthoughts on extreme…………………………………………………………………………..scales

Earlier in the week I was bloggin’ about extreme time scales and various uses of spirals in data visualisation. This morning I thought about it a little more and realised the attraction of extreme scales, like the entire lifetime of our planet, or the size of the solar system, is in large part just that it’s fun. I start my own dataviz talks with Gelman & Unwin’s 6 objectives, which I think are helpful in framing the many uses of images (for a statistician, anyway – we were trained that there is only one use of a graph and that is to check for outliers / normality briefly before it is deleted!), although I get the impression (and I would be happy to be corrected by better informed dataviz hipsters (I use the term with only the very mildest form of offense)) that those objectives are generally looked upon with some disdain as Johnnies-come-lately in a design community that has had its own goals for a much longer time. In this application, we are appealing to GU2, “conveying the sense of the scale and complexity of a dataset”. In the original paper, G&U give network graphs as an example, because they convey an overall impression but little or no concrete information, so people like me tend not to approve. I like the data to be retrievable by the viewer. But why not, if it effectively sets the scene?

A couple of unorthodox examples spring to mind: scale reconstructions of the solar system and Stamen+Nasdaq on high-frequency trading. If you wipe out the extremities with a super-log scale then you lose the fun too. (OK, it’s a sitting duck of an ugly example, but still!) Another good one is the Washington Post on Flight MH370.
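To see what a log scale does to an extreme range, here is a minimal sketch in base R. The planetary distances are rounded textbook values in astronomical units, supplied purely for illustration; they are not taken from any of the graphics mentioned above.

```r
# Approximate mean distances from the Sun in astronomical units
# (rounded textbook values; an assumption for illustration only)
au <- c(Mercury=0.39, Venus=0.72, Earth=1.00, Mars=1.52,
        Jupiter=5.20, Saturn=9.58, Uranus=19.2, Neptune=30.1)

# Position of each planet as a fraction of the axis length
linear_pos <- (au - min(au)) / (max(au) - min(au))
log_pos <- (log10(au) - min(log10(au))) / (max(log10(au)) - min(log10(au)))

# Share of the axis taken up by the four inner planets
inner_linear <- unname(linear_pos["Mars"])
inner_log <- unname(log_pos["Mars"])

# Draw both versions, one above the other
op <- par(mfrow=c(2,1), mar=c(4,1,2,1))
plot(au, rep(1, length(au)), pch=19, yaxt="n", ylab="",
     xlab="AU (linear scale)")
plot(au, rep(1, length(au)), pch=19, yaxt="n", ylab="", log="x",
     xlab="AU (log scale)")
par(op)
```

On the linear axis the four inner planets are squashed into roughly 4% of the width, which is where the sense of vast emptiness comes from. On the log axis they spread over about 31%, and the fun goes with it.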

And then consider two popular visualizations, US Gun Deaths and CarbonVisuals NYC. In each case, they rely on the emotional impact of the sudden acceleration or amplification of values, and they get that in very different ways. As we learnt from Haydn, the impact of the Surprise only really works the first time, but it stays fun for years afterwards.

1 Comment

Filed under Visualization

Staggering to work

This morning I heard an unusual announcement as I arrived at Balham (“gateway to the South”) railway station. The trains going into London are busiest, the man said, between 8:15 and 8:30, so travelling before or after this would make our journeys quicker and more comfortable.

I immediately thought this would make a good excuse to post this old London Transport poster, with its clever design and charming pictograms (and questionable math):

So, going by the poster, there seemed to be 35,000 people on the tube between 5:00 and 5:30 pm. According to 2009/10’s London Travel Demand Survey, there were about 125,000 (roughly eyeballing Figure 4.1) crammed in cheek by jowl.

 


Filed under Visualization

Visualization lessons from the roadside

At the weekend I read a fascinating newspaper article, unlikely as it may sound, about road signs. (I should stress here, comrades, that the Telegraph in question was bought by my otherwise sensible father-in-law.) It is now online too. The reason it held me transfixed in my armchair on a rainy Sunday morning was the many parallels with data visualization. Although I’d never given much thought to them, of course every aspect of these signs was designed and chosen carefully. The primary objective is safety, and so the information has to be absorbed rapidly as you whizz down some quaint English country lane. You know, like this:


A screen shot from the Yorkshire edition of Grand Theft Auto

As we are often told of how alarmingly little time we have to get a reader’s attention before they move on, some of these ideas are worth considering.

Firstly, the typeface is chosen for its simplicity, and the directional arrows are stylized. Where we used to have more accurate depictions like this:

Good luck, drivers! (c) maljoe at flickr


We would now simplify that a lot, especially as it is actually just a T-junction, which you can only work out once you have absorbed all the information in the sign. So, taking Gelman and Unwin’s advice to always present in more than one format, it might be worth having a simplified depiction of your key message up front, and readers then click through to the detail, to the level that they want.

A limited palette of colors and icons is a good thing too. The road signs went for contrast but usually in data viz you want a limited spectrum, shades of blue for example, with the odd splat of orange picking out some detail. And lots of blank space! Don’t feel the urge to clutter it up. The schoolkids crossing the road used to look like this:

Happy days (c) jp4712 at flickr


Again, reality had to be subsumed to clarity. That’s why I feel anxious about the constant reference to data visualization nowadays. A lot of the time, you don’t want to show people all the data. It’s just too much. That’s why we have stats!
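The limited-palette advice above is easy to try in base R. This is only a sketch: the values and category names are made up, and the particular hex codes for the blues are my own choice rather than any standard.

```r
# Hypothetical values for seven categories (made up for illustration)
values <- c(A=23, B=17, C=35, D=29, E=12, F=41, G=20)

# Shades of blue from light to dark, assigned by rank of the value
blues <- colorRampPalette(c("#c6dbef", "#08519c"))(length(values))
cols <- blues[rank(values, ties.method="first")]

# One splat of orange to pick out the bar you want readers to notice
highlight <- "F"
cols[names(values) == highlight] <- "orange"

# No borders, so the bars sit in plenty of blank space
barplot(values, col=cols, border=NA, las=1)
```

Everything except the highlighted bar stays within one hue, so the orange does the work of saying "look here" without any annotation.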

Jock Kinneir and Margaret Calvert, whose signs first appeared in 1958 and went national in 1963, also took pains to find out whether mixed upper- and lower-case type would be more quickly absorbed than the old upper-case (consider BIRMINGHAM or Birmingham). Excellent, evidence-based work.


Filed under Visualization

Easy pictograms using R

I have been amazed for a while that there is no major stats software offering pictograms. You know the sort of classic infographic I mean:

Isotype’s classic design

Well, I have been working on an R function to help with this. It’s at Github here and below. Here’s an example:

library(png) # provides readPNG()
man<-readPNG("man.png")

pictogram(icon=man,
n=c(12,35,52),
grouplabels=c("dudes","chaps","lads"))

same_icons

Simple, huh? You can also have more than one icon, although it’s up to you to keep them a sensible height:width ratio or ‘aspect’ to avoid distorting impressions.

pictogram(icon=list(man,holly,monster),
n=c(12,35,52),
grouplabels=c("men","holly","monsters"))

different_aspects
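If you want to check the aspects before plotting, something like the following sketch would do. The arrays are stand-ins for icons read with readPNG() and their sizes are invented; the 0.5 cut-off is an arbitrary choice of mine, not anything the pictogram function enforces.

```r
# Stand-ins for icons read with readPNG(): height x width x channels arrays.
# The sizes are made up for illustration.
icons <- list(man=array(0, dim=c(120, 50, 4)),
              holly=array(0, dim=c(80, 90, 4)),
              monster=array(0, dim=c(100, 60, 4)))

# Height:width ratio of each icon (dim[1] is the height, dim[2] the width)
aspects <- sapply(icons, function(z) dim(z)[1] / dim(z)[2])

# Flag icons much squatter than the tallest one; every icon is drawn
# one unit wide, so a squat icon covers far less area per count
too_squat <- names(aspects)[aspects < 0.5 * max(aspects)]

print(round(aspects, 2))
print(too_squat)
```

Any icon that gets flagged is a candidate for cropping or padding before you hand it to the function.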

Edit 30 July 2013: Thank you to Paul Murrell, who wrote to tell me that you can do the same thing with vector images rather than raster using the grImport package and its function grid.symbols(). The advantage of vector images is that they don’t get pixellated and grainy as you zoom in on them.

Also, if you want to know more about how R handles raster images, you should check out Paul’s R Journal article from 2011.

Suggestions? e-mail me or better still, pull them on Github. Happy pictogramming!


# requires image to be read in by readPNG or similar and supplied as "icon"
# requires the reshape package for untable()
# To do: allow for non-integer n
pictogram<-function(icon,n,grouplabels="",
                    hicons=20,vspace=0.5,labprop=0.2,labelcex=1) {
  library(reshape)
  # one icon per group: use a list as supplied, or repeat a single icon
  if(is.list(icon)) {
    licon<-icon
  } else {
    licon<-rep(list(icon),length(n))
  }
  sumn<-sum(n)
  # group number of each individual icon
  group<-untable(df=matrix((1:length(n)),ncol=1),num=n)
  # rows of icons needed by each group
  vicons<-ceiling(n/hicons)
  allv<-sum(vicons)
  # icons in the last row of each group; unlike n%%hicons this is never
  # zero when n is an exact multiple of hicons
  tail<-n-(vicons-1)*hicons
  # dev.size() returns width then height, so this is height/width:
  devaspect<-dev.size(units="px")[2]/dev.size(units="px")[1]
  xlength<-1
  # get dims of all elements of licon, find greatest aspect and set ylength
  # (for a raster array, dim[1] is the height, dim[2] the width)
  getdim<-function(z) {
    aspect<-dim(z)[1]/dim(z)[2]
    return(aspect)
  }
  all.ylengths<-unlist(lapply(licon,getdim))
  ylength<-max(all.ylengths)
  all.ylengths<-untable(df=matrix(all.ylengths,ncol=1),num=n)
  ytop<-allv*ylength
  if(devaspect*hicons<allv) warning("Icons may extend above the top of the graph")
  # vector of icons per row
  iconrow<-as.vector(as.matrix(rbind(rep(hicons,length(vicons)),tail)))
  # vector for how many times to repeat elements of iconrow
  reprow<-as.vector(as.matrix(rbind((vicons-1),rep(1,length(vicons)))))
  perrow<-untable(df=matrix(iconrow,ncol=1),num=reprow)
  # extra vertical gap before each group
  spacing<-NULL
  for (i in 1:(length(n))) {
    spacing<-c(spacing,rep((i-1)*vspace*ylength,n[i]))
  }
  y0<-spacing+(ylength*untable(df=matrix((1:allv)-1,ncol=1),num=perrow))
  y1<-y0+all.ylengths
  # there are more elegant ways to make x0, but for now...
  x0<-NULL
  for (i in 1:(length(perrow))) {
    x0<-c(x0,(0:(perrow[i]-1)))
  }
  x1<-x0+xlength
  # leave room on the left for the group labels
  leftplot<-floor(-(labprop*hicons))
  plot(c(leftplot,hicons),c(0,(devaspect*hicons)),
       type="n",bty="n",ylab="",xlab="",xaxt="n",yaxt="n")
  lines(x=c(0,0),y=c(min(y0)-(ylength/2),max(y1)+(ylength/2)))
  for (i in 1:sumn) {
    rasterImage(image=licon[[group[i]]],xleft=x0[i],xright=x1[i],
                ytop=y1[i],ybottom=y0[i])
  }
  # find positions for labels
  ylabpos<-rep(NA,length(n))
  for (i in 1:length(n)) {
    ylabpos[i]<-(max(y1[group==i])+min(y0[group==i]))/2
  }
  text(x=leftplot/2,y=ylabpos,labels=grouplabels,cex=labelcex)
}
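The row layout is the fiddliest part of the function, so here it is on its own as a rough sketch in base R. The helper icons_per_row is hypothetical and not part of the function above; it just shows how many icons land in each row when every group wraps at hicons, including the case where a count is an exact multiple of hicons.

```r
# Number of icons in each row, for every group, using base R only.
# icons_per_row is a hypothetical helper for illustration.
icons_per_row <- function(n, hicons=20) {
  # only whole, positive counts are handled in this sketch
  stopifnot(all(n >= 1), all(n == round(n)))
  lapply(n, function(k) {
    full <- (k - 1) %/% hicons   # completely filled rows before the last one
    c(rep(hicons, full), k - full * hicons)
  })
}

# The three groups from the example call earlier in the post
rows <- icons_per_row(c(12, 35, 52))

# A count that is an exact multiple of hicons still fills its last row
exact <- icons_per_row(40)

print(rows)
print(exact)
```

Checking the exact-multiple case by hand like this is a cheap way to catch a group silently losing its last row of icons.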

5 Comments

Filed under R