Tag Archives: visualisation

Dataviz of the week, 28/6/17

Jake Conway, Alexander Lex & Nils Gehlenborg have made an R package called UpSetR which, as the name suggests, puts on an iron shirt and chases the devil out of Eart’. The devil in this case being Venn diagrams. Invariably, when people want to count up combinations of stuff, they end up hand-bodging some crappy diagram that isn’t even a real Venn. They use PowerPoint or some other diabolical tool. Now you can do better in R.

Screen Shot 2017-06-27 at 11.27.58
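For anyone wondering what "counting up combinations of stuff" boils down to, here is a minimal sketch of the tally behind an UpSet-style plot: how many items sit in each exact combination of sets. It is in Python rather than R, it does not use or imitate the UpSetR API, and the function name and the respondents are made up for illustration.

```python
from collections import Counter

def combination_counts(memberships):
    """Count how many items fall into each exact combination of sets.

    `memberships` maps an item to the collection of set names it belongs to.
    """
    counts = Counter(frozenset(sets) for sets in memberships.values())
    # Biggest combinations first, which is how such bar charts are usually ordered.
    return sorted(counts.items(), key=lambda kv: (-kv[1], sorted(kv[0])))

# Hypothetical data: which tools each (invented) respondent uses.
respondents = {
    "ann": {"R", "Stata"},
    "bob": {"R"},
    "cat": {"R", "Stata"},
    "dan": {"Stata", "Python"},
    "eve": {"R"},
    "fay": {"R", "Stata", "Python"},
}

for combo, n in combination_counts(respondents):
    print(" & ".join(sorted(combo)), n)
```

Each item is counted once, in the one combination it belongs to exactly, so the counts add up to the number of items. That is the bookkeeping a hand-bodged Venn tends to get wrong.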

Filed under Visualization

Dataviz of the week, 22/6/17

All I’m going to do this week is point you to Andy Kirk’s blog. He’s considering bivariate choropleth maps. You what? Each region has two variables. Maybe one gets encoded as hue and the other saturation. No way. Yes way, and it’s not necessarily the train wreck you’d imagine. Check them out.

Bivariate-choropleth
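To make the hue-and-saturation idea concrete, here is a rough sketch of one way to turn two variables into a single fill colour. It is not taken from Andy Kirk's examples: the function name, the hue range and the minimum saturation are all assumptions picked for illustration, and both variables are assumed to be rescaled to the range 0 to 1 beforehand.

```python
import colorsys

def bivariate_colour(a, b, hue_range=(0.0, 0.66)):
    """Map two variables, each scaled to [0, 1], to an RGB hex colour.

    `a` picks the hue (red through to blue by default), `b` picks the saturation.
    """
    if not (0.0 <= a <= 1.0 and 0.0 <= b <= 1.0):
        raise ValueError("both variables must be scaled to [0, 1]")
    lo, hi = hue_range
    hue = lo + a * (hi - lo)
    # Assumed floor of 0.15 so a low second variable still leaves a faint tint.
    saturation = 0.15 + 0.85 * b
    r, g, bl = colorsys.hsv_to_rgb(hue, saturation, 1.0)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(bl * 255))

print(bivariate_colour(0.0, 1.0))  # low on the first variable, high on the second
print(bivariate_colour(1.0, 0.2))  # high on the first, low on the second
```

In practice the published examples tend to bin each variable into three or so classes and use a small hand-picked palette, which is kinder to the reader than a continuous mapping like this one.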

Of course, by superimposing objects rather than colouring in (because colour is, you know, so beguiling as a visual parameter to mess around with, yet so poorly perceived), people have been doing this for ages. Bertin’s much-quoted and less-read book has many such examples, which mostly fall flat in my view. As he wrote: “It is the designer’s duty … to flirt with ambiguity without succumbing to it”. Bof!

Filed under Visualization

Dataviz of the week, 15/6/17

It’s Clean Air Day in the UK. Air pollution interests me, partly as I worked in medical stats for many years, partly because I don’t want to breathe in a lot of crap, and partly because I don’t want my baby to breathe in a lot of crap. London is really bad, the worst place in Europe. Not Beijing, sure, but really bad, and it’s hard to imagine that Brexit will lead to anything but a relaxation of the rules.

Real World Visuals (formerly CarbonVisuals, who made the amazing mountain of CO2 balls looming over New York) have made a series of simple, elegant but powerful images about volumes of air and what they contain, and about the volumes of air saturated with pollution that one car leaves behind for each kilometre travelled.

C1

The tweet is accidentally poetic as it can’t accommodate more than the first four images, which leaves you on a cliffhanger with the massive stack looming behind the mother and girl. You know what it is but you can’t see its enormity yet.

The crowd visualisation of 9,416 dead Londoners as dots is not bad, though I like physical images of numbers of people, like this classic (adapted from http://www.i-sustain.com/old/CommuterToolkit.htm):

images.washingtonpost.com_

Here’s a picture of apparently 8,000 to 9,000 people marching in Detroit:

636206168967435532-lansing-march

All dead by Christmas. And then some.

You might like to compare and contrast with higher-profile causes of death, like terrorism.

Filed under Visualization

UK election cartogram in the medium of Opal Fruits

I’ve been stockpiling Opal Fruits, which young people tell me are now called Starburst, in anticipation of today’s election results.

opal-fruits-cartogram

This is like one-tenth of the stash. I don’t want to eat them though. You know what you’re going to get if you knock here at Halloween.

I took the New York Times’ hexbin cartogram, imposed a 6×8 rectangular grid and counted the most common party in each block. There was a little bit of fudging and chopping up the sweets. It is art, no? Here’s the video:

Filed under Visualization

Dataviz of the week, 7/6/17

You know how people love maps with little shapes encoding some data? Hexagons, circles, squares? Jigsaw pieces? Opal Fruits?

block choropleth

Rip’t from the pages of the Times Higher Education magazine, some years ago.

Or small multiples?

You know how people love charts made from emojis?

Screen Shot 2017-04-19 at 00.02.51

Stick them together and what do you get?

Awesomesauce.

This is by Lazaro Gamio. They’re not standard emojis. Six variables get cut into ordinal categories and mapped to various expressions. You can hover on the page (his page, not mine, ya dummy) for more info. Note that some of the variables don’t change much from state to state. Uninsured, college degrees, those change, but getting enough sleep — not so much. It must be in there because it seems fun to map it to bags under the eyes. But the categorisation effectively standardises the variables so small changes in sleep turn into a lot of visual impact. Anyway, let’s not be too pedantic, it’s fun.
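The point about categorisation standardising the variables is easy to show with a toy example. This is a sketch under assumptions: I don't know how the categories were actually cut, so it uses a simple rank-based split into equal-sized groups, and the percentages are invented.

```python
def ordinal_categories(values, n_bins=3):
    """Cut values into n_bins ordinal categories of roughly equal size, by rank.

    Ties are broken by position, which is crude but fine for illustration.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    cats = [0] * len(values)
    for rank, i in enumerate(order):
        cats[i] = rank * n_bins // len(values)
    return cats

# Made-up percentages for six hypothetical states.
uninsured = [4.0, 9.0, 13.0, 17.0, 8.0, 15.0]         # varies a lot
enough_sleep = [64.0, 64.5, 65.0, 65.5, 66.0, 66.5]    # barely varies

print(ordinal_categories(uninsured))
print(ordinal_categories(enough_sleep))
```

Both variables end up using every category, so a spread of two and a half percentage points in sleep gets the same visual range as a spread of thirteen in the uninsured rate.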

This idea goes back to Herman Chernoff, who always made it clear it wasn’t a totally serious proposal, and has been surprised at its longevity (see his chapter in PPF). Bill Cleveland was pretty down on the idea in his ’85 book:

“not enough attention was paid to graphical perception … visually decoding the quantitative information is just too difficult”

Filed under Visualization

Dataviz of the week, 31/5/17

A more techy one this week. Ruth Fong and Andrea Vedaldi have a paper on arXiv called “Interpretable explanations of black boxes by meaningful perturbation”. The argument that some modern machine learning (let’s not start that one again) techniques are black boxes, which produce an output without anybody understanding how or why, is a serious concern. If you don’t know how it works, how do you know you can believe it, or apply it outside the bounds of your previous data (in the manner of the disastrous Challenger space shuttle launch)?

HT @poolio for tweeting this, otherwise I’d never have heard about it.

The paper is heavy on the maths but thanks to the visual nature of convolutional neural networks (CNNs), which are high-dimensional non-linear statistical models for classifying images, you can absorb the message very easily. Take the image, put it through the CNN, get a classification. Here, from the paper’s Figure 1, we see this image classified as containing a flute with probability 0.9973:

Screen Shot 2017-05-31 at 11.39.58

Then, they randomly perturb an area of the image and run it again, checking how it has affected the prediction probability. When they find an area that strongly adversely affects the CNN, they conclude that it is here that the CNN is “looking”. Here’s a perturbed image:

Screen Shot 2017-05-31 at 11.40.13

You can see it’s the flute that has been blurred. They then show the impact of different regions in this “learned mask” heatmap:

Screen Shot 2017-05-31 at 11.40.19

(I’m glossing over the computational details quite a lot here because this post is about dataviz.) It rather reminds me of the old days when I was an undergrad and had to calculate a gazillion different types of residuals and influence statistics, many of which were rather heuristic. You could do this kind of thing with all kinds of black boxes (as Fong & Vedaldi suggest by proposing a general theory of “explanations”), as long as there are some dimensions that are structural (x and y position in the case of image data) and others that can get perturbed (RGB values in this case). I think it would be valuable in random forests and boosted trees.
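The perturb-and-rerun idea can be sketched in a few lines for any black box. This is a crude, exhaustive version, not the paper's method (the "learned mask" label suggests they do something more refined than trying every block in turn). The function names, the toy scoring function and the choice to perturb by flattening a block to the image mean are all assumptions for illustration.

```python
def occlusion_map(image, score, patch=2):
    """Return a grid of score drops, one per patch x patch block of `image`.

    `image` is a list of rows of floats; `score` is any black-box function
    that takes such an image and returns a number (say, a class probability).
    """
    height, width = len(image), len(image[0])
    fill = sum(map(sum, image)) / (height * width)  # perturbation: flatten to the image mean
    baseline = score(image)
    drops = []
    for top in range(0, height, patch):
        row = []
        for left in range(0, width, patch):
            perturbed = [r[:] for r in image]  # copy, so the original is untouched
            for y in range(top, min(top + patch, height)):
                for x in range(left, min(left + patch, width)):
                    perturbed[y][x] = fill
            row.append(baseline - score(perturbed))  # a big drop means the box "looks" here
        drops.append(row)
    return drops

# Toy stand-in for a CNN: it only "looks" at the top-left 2x2 corner.
def toy_score(img):
    return sum(img[y][x] for y in range(2) for x in range(2)) / 4.0

image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
heat = occlusion_map(image, toy_score)
for row in heat:
    print(row)
```

The x and y positions are the structural dimensions and the pixel values are the ones that get perturbed. Swapping in rows of a data frame and a fitted random forest for `image` and `score` would need a different notion of "block", but the bookkeeping is the same.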

They also have a cup of coffee where the mask makes sense when artifacts are added (the kind of artifact that is known to mess with CNNs yet not human brains) and a maypole dance that doesn’t so much (and this seems to be powered by the same CNN tendency to spot ovals). This is potentially very informative for refining the CNN structure.

Screen Shot 2017-05-31 at 11.56.38

If you are interested in communicating the robustness of CNNs effectively, you should read this too.

Filed under computing, machine learning, Visualization

Dataviz of the week, 24/5/17

On Twitter, @SirSandGoblin is tracking polls before the UK general election in the medium of cross-stitch.

You just have to look. This is clearly the work of a dataviz genius. I have nothing more to say.

Filed under Visualization