Tag Archives: D3

Dataviz of the week, 5/7/17

This week we look at a clinical trial of treatments for tuberculosis, the PanaCEA MAMS-TB study. I’ve been involved with TB on and off since project-managing and statisticizing the original NICE guideline back in the day. I won’t go into detail on TB treatments but the trial compares various combinations of drugs, and there’s a new candidate drug called SQ109 in the mix. The paper is here (I hope it is not paywalled). You can see the Kaplan-Meier plot on page 44. Without going into detail, these are classic formats for clinical trials looking at time-to-event data. As time goes by, people either get recurrence of the disease or disappear out of the trial, and the numbers at risk go down. You want to be in a group whose curve descends less steeply.

But there are different ways of measuring and counting events, so the authors made an interactive web page showing these as a sensitivity analysis. Hooray!

Screen Shot 2017-06-23 at 12.55.33

It’s a pity Lancet paid such lip service to it, tucked away as a link in the margin of page 45. Boo!

I found the transitions in the table of patients at risk weird – I guess that’s the d3 transition deciding to move the numbers horizontally and it might be clearer to fade them out, remove them, then put them back from scratch. It’s also clear that Mike Bostock never had to deal with step functions in transition. But otherwise a really nice example of how trials can provide more layers of info. 

Leave a comment

Filed under healthcare, JavaScript, Visualization

Dataviz of the week, 17/5/2017

nextstrain.org is a website that offers real-time tracking of pathogens as they evolve (flu, ebola, dengue, all your favourites are here). Data gets pulled in from various monitoring systems worldwide and represented with interactive content in several pretty ways:

Screen Shot 2017-05-16 at 15.24.19Screen Shot 2017-05-16 at 15.25.02Screen Shot 2017-05-16 at 15.24.37Screen Shot 2017-05-16 at 15.25.19Screen Shot 2017-05-16 at 15.25.36

They have their own libraries called fauna, augur and auspice, the last of these doing the dataviz stuff, and as far as I could tell built on D3. I don’t pretend to understand the genetic and genomic work that has to go on to process the raw data but that is clearly substantial.

Leave a comment

Filed under Visualization

Dataviz of the week: 1/3/17

Scribbly states” is not done with felt-tip pens but with some sweet use of D3 Javascript by Noah Veltman. I admire his attention to the little details, making it more human-like and commenting on the situations where it doesn’t work. Turns out if you follow the links, that the method came out of Apple, who patented it way back. Didn’t someone like @inconvergent have a script to make coffee rings? You could chuck that on top for extra authenticity.


Leave a comment

Filed under Uncategorized, Visualization

Dataviz of the week, 17/1/2017

These simple line charts are a lot of fun. Your task is to guess what happened to various stats during the Obama years. Then the truth is revealed. I got the first one amazingly close to the truth, felt pretty smug, then missed all the others by a mile. You might expect a rather partisan message from this left-wing (by American standards) source, but it is quite neutral.


Larry Buchanan, Haeyoun Park and Adam Pearce are the creators. Oh for the good old days when everyone was using d3 for online interactive graphics and the source code was easy to follow. These images don’t have to be interactive, just to have part of the line invisible and then appear. They seem to have made the whole thing in Illustrator and done some ai2html conversion from there. Each2theirown. It seems to me like it would actually take longer to do that than to just get on and code the damn thing from first principles. Drawing the line on top is actually pretty easy to achieve, even I can do that sort of thing, so, like Ken Hom’s hot wok, so can you.

This kind of interactive would be quite nice for teaching stats. And I like the way that the y-axis range changes slightly so as not to give you any clues.


Leave a comment

Filed under Visualization

Noise pollution map of London (part 1)

I’m working on a noise pollution map of central London. Noise is an interesting public health topic, overlooked and of debatable cause and effect but understandable to everyone. To realise it as interactive online content, I get to play around with Mapbox as well as D3 over Leaflet [1] and some novel forms of visualisation, audio delivery and interaction.

The basic idea is that, whenever the need arises to get from A to B, and I could do it by walking, I record the ambient sound and also capture a detailed GPS trail. Then, I process those two sets of data back at bayescamp and run some sweet tricks to make them into the map. I have about 15 hours of walking so far, and am prototyping the code to process the data. The map doesn’t exist yet, but in a future post on this subject, I’ll include a sketch of what it might look like. The map below shows some of my walks (not all). As I collect and process the files, I will update the image here, so it should be close to live.


I’d like it to become crowd-sourced, in the sense that someone else could follow my procedure for data capture, copy the website and add their own data before sharing it back. GitHub feels like the ideal tool for this. Then, the ultimate output is a tool for people to assemble their own noise-pollution data.

As I make gradual progress in my spare time, I’ll blog about it here with the ‘noise pollution’ tag. To start with, I’ll take a look at:

The equipment

Clearly, some kind of portable audio recorder is needed. For several years, when I made the occasional bit of sound art, I used a minidisc recorder [2] but now have a Roland R-05 digital recorder. This has an excellent battery life and enough storage for at least a couple of long walks. At present, you can get one from Amazon for GBP 159. When plugged into USB, it looks and behaves just like a memory stick. I have been saving CD-quality audio in .wav format, mindful that you can always degrade it later, but you can’t come back. That is pretty much the lowest quality the R-05 will capture anyway (barring .mp3 format, and I decided against that in that I don’t want it to dedicate computing power to compressing the sound data), so it occupies as little space on the device as possible. It will tuck away in a jacket pocket easily so there’s no need to be encumbered by kit like you’re Chris Watson.

Pretty much any decent microphone, plus serious wind shielding, would do, but my personal preference is for binaurals, which are worn in the ear like earphones and capture a very realistic stereo image. Mine are Roland CS-10EM which you can get for GBP 76. The wind shielding options are more limited for binaurals than a hand-held mic, because they are so small. I am still using the foam covers that come with the mics (pic below), and wind remains something of a consideration in the procedure of capturing data, which I’ll come back to another time.


On the GPS side, there are loads of options and they can be quite cheap without sacrificing quality. I wanted something small that allowed me to access the data in a generic format, and chose the Canmore GT-730FL. This looks like a USB stick, recharges when plugged in, can happily log (every second!) for about 8 hours on a single charge, and allows you to plug it in and download your trail in CSV or KML format. The precision of the trail was far superior to my mobile phone at the time when I got it, though the difference is less marked now even with a Samsung J5 (J stands for Junior (not really)). There is a single button on the side, which adds a flag to the current location datum when you press it. That flag shows up in KML format in its own field, but is absent from CSV. They cost GBP 37 at present. There are two major drawbacks: the documentation is awful (Remember when you used to get appliances from Japan in the 80s and none of the instructions made sense? Get ready for some nostalgia.) and the data transfer is by virtual serial port, which is straightforward on Windows with the manufacturer’s Canway software but a whole weekend’s worth of StackOverflow and swearing on Linux/OS X. Furthermore, I have not been able to get the software working on anything but an ancient Windows Vista PC (can you imagine the horror). Still, it is worth it to get that trail. There is a nice blog by Peter Dean (click here), which details what to do with the Canmore and its software, and compares it empirically to other products. The Canway software is quite neat in that it shows you a zoomable map of each trail, and is only a couple of clicks away from exporting to CSV or KML.

Having obtained the .kml file for the trail plus starting point, the .csv file for the trail in simpler format, and the .wav file for the sound, the next step is synchronising them, trimming to the relevant parts and then summarising the sound levels. For this, I do a little data-focussed programming, which is the topic for next time.


1 – these are JavaScript libraries that are really useful for flexible representations of data and maps. If you aren’t interested in that part of the process, just ignore them. There will be plenty of other procedural and analytic considerations to come that might tickle you more.

2 – unfairly maligned; I heard someone on the radio say recently that, back around 2000, if you dropped a minidisc on the floor, it was debatable whether it was worth the effort to pick it up


Leave a comment

Filed under Visualization

Visualizing HDI: a fine D3 exemplar

This interactive visualisation of Human Development Index values, by country and over time, was released last week.

For me, it follows in the mould of The State of Obesity, but is much more transparent in how it is constructed when you look at the source code. That makes it a good exemplar — in fact, perhaps the exemplar currently available — for introducing people to the possibility of making interactive dataviz for their research projects.

Oh for those early days of D3, when nobody was terribly fluent with it, and websites would have all the code right there, easy to read and learn from.

That transparency is important, not just for teaching about dataviz, but for the whole community making and innovating interactive data visualisation. Oh for those early days of D3, when nobody was terribly fluent with it, and websites would have all the code right there, easy to read and learn from. Now they are tucked away in obscure links upon links, uglified and mashed up with other JS libraries, familiar to the maker but (probably) not you. There are obvious commercial pressures to tucking the code away somewhere, and you can actually obtain software to obfuscate it deliberately. At the same time, having everything in one file, hard-coded for the task at hand, may be easy to learn from, but it isn’t good practice in any kind of coding culture, so if you want to be respected by your peers and land that next big job, you’d better tuck it all away in reusable super-flexible in-house libraries. And yet, the very availability of simple D3 code was what kick-started the current dataviz boom. Everyone could learn from everyone else really quickly because everything was open source. I don’t like to think that was a short-lived phase in the early part of the technology’s life cycle, but maybe it was…

Anyway, that’s enough wistful nostalgia (I learnt yesterday that I am the median age for British people, so I am trying not to sound like an old duffer). Here’s the things I don’t like about it:

  1. it requires a very wide screen; there’s no responsiveness (remember your audience may not work in a web design studio with a screen as big as their beard)
  2. life expectancy gets smoothed while the other variables don’t – just looks a bit odd
  3. why have colors for continents? Doesn’t it distract from the shading? Don’t we already know which one is which?
  4. Why give up on South Sudan, Somalia (which seems to be bunged together with Somaliland and Puntland in one big “hey it’s somewhere far away, they won’t notice” sort of way) and North Korea? Aren’t these countries’ estimates kind of important in the context, even if they are not very good? Do you really believe the Chinese estimates more than these just because they’re official?

But all in all, these are minor points and a nice amount of grit for the mill of students thinking about it. I commend it to you.

1 Comment

Filed under Visualization

UK election facts clarified with interactive graphics

I’ve been impressed with this website (constituencyexplorer.org.uk) put together by Jim Ridgway and colleagues at Durham, with input from the House of Commons Library and dataviz guru Alan Smith from the ONS. In part, it is aimed at members of parliament, so they can test their knowledge of facts about constituencies and learn more along the way. But it makes for a fun quiz for residents too. Everything is realised in D3, so it runs everywhere, even on your phone. There are a few features I really like: the clean design, the link between map, list and dotplot in the election results:


… the animation after choosing a value with the slider, highlighting the extra/shortfall icons and the numbers dropping in: nice!


… the simple but quite ambitious help pop-up:


… and the way that the dotplots are always reset to the full width of the variable, so you can’t be misled by small differences appearing bigger than they are. The user has to choose to zoom after seeing the full picture.

All in all, a very nice piece of work. I must declare that I did contribute a few design suggestions in its latter stages of development but I really take no credit for its overall style and functionality. Budding D3 scripters could learn a lot from the source code.

And while we’re on the topic, here some more innovative electoral dataviz:



And finally, take a moment to ask election candidates to commit to one afternoon of free statistical training, a great initiative from the RSS – and frankly, not much to ask. Unfortunately, none of my local (Croydon Central) would-be lawmakers have been bothered to write back yet. But here’s the parties that are most interested in accurate statistics, in descending order (by mashing up this and this):

  1. National Health Alliance: 4/13
  2. Pirate Party: 1/6
  3. Green Party: 47/568
  4. Labour Party: 51/647
  5. Plaid Cymru: 3/40
  6. Liberal Democrats: 47/647
  7. Ulster Unionist Party: 1/15
  8. Christian People’s Alliance: 1/17
  9. Conservative and Unionist Party: 27/647
  10. Scottish National Party: 2/59
  11. United Kingdom Independence Party: 15/624

Leave a comment

Filed under Visualization