Tag Archives: interactive

Dataviz of the week, 17/5/2017

nextstrain.org is a website that offers real-time tracking of pathogens as they evolve (flu, ebola, dengue, all your favourites are here). Data gets pulled in from various monitoring systems worldwide and represented with interactive content in several pretty ways:

[Screenshots of nextstrain.org's interactive views]

They have their own libraries, called fauna, augur and auspice; the last of these does the dataviz and, as far as I can tell, is built on D3. I don't pretend to understand the genetic and genomic work that goes into processing the raw data, but it is clearly substantial.


Best dataviz of 2016

I’m going to return to 2014’s approach of dividing best visualisation of data (dataviz!) from visualisation of methods (methodviz!).

In the first category, as soon as I saw Jill Pelto’s watercolour data paintings I was bowled over. Time series of environmental data are superimposed and form familiar but disturbing landscapes. I’m delighted to have a print of Landscape of Change hanging in the living room at Chateau Grant. Pelto studies glaciers and spends a lot of time on intrepid-sounding field trips, so she sees the effects of climate change first hand in a way that the rest of us don’t. There’s a NatGeo article on her work here.


In the methodviz category, Fernanda Viegas, Martin Wattenberg, Shan Carter and Daniel Smilkov made a truly ground-breaking website for Google’s TensorFlow project (open source deep learning software). This shows you how artificial neural networks of the simple feedforward variety work, and allows you to mess about with their design to a certain extent. I was really impressed with how the hardest aspect to communicate — the emergence of non-linear functions of the inputs — is just simple, intuitive and obvious for users. I’m sure it will continue to help people learn about this super-trendy but apparently obscure method for years to come, and it would be great to have more pages like this for algorithmic analytical methods. You can watch them present it here.
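The key trick is easier to see than to describe: each hidden unit applies a squashing non-linearity to a weighted sum of the inputs, and the output layer just recombines those curves. Here is a toy numpy sketch of that structure (illustrative only, with random untrained weights, and no relation to the playground's actual code):

```python
import numpy as np

# Minimal feedforward sketch: two inputs -> four tanh hidden units -> one
# sigmoid output. Weights are random and untrained; this only shows shape.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), rng.normal(size=4)   # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), rng.normal(size=1)   # hidden -> output

def predict(X):
    """Each hidden unit is a non-linear (tanh) function of a weighted
    sum of the inputs; the output layer recombines those curves."""
    H = np.tanh(X @ W1 + b1)                  # the non-linearity emerges here
    return 1 / (1 + np.exp(-(H @ W2 + b2)))  # squash to a 0-1 score

# Score a grid of (x1, x2) points, like the playground's background shading
xs = np.linspace(-2, 2, 5)
grid = np.array([[x1, x2] for x1 in xs for x2 in xs])
print(predict(grid).reshape(5, 5).round(2))
```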



Noise pollution map of London (part 1)

I’m working on a noise pollution map of central London. Noise is an interesting public health topic: overlooked, of debatable cause and effect, but understandable to everyone. To realise it as interactive online content, I get to play around with Mapbox as well as D3 over Leaflet [1] and some novel forms of visualisation, audio delivery and interaction.

The basic idea is that, whenever the need arises to get from A to B, and I could do it by walking, I record the ambient sound and also capture a detailed GPS trail. Then, I process those two sets of data back at bayescamp and run some sweet tricks to make them into the map. I have about 15 hours of walking so far, and am prototyping the code to process the data. The map doesn’t exist yet, but in a future post on this subject, I’ll include a sketch of what it might look like. The map below shows some of my walks (not all). As I collect and process the files, I will update the image here, so it should be close to live.

[Map of the walks recorded so far]

I’d like it to become crowd-sourced, in the sense that someone else could follow my procedure for data capture, copy the website and add their own data before sharing it back. GitHub feels like the ideal tool for this. Then, the ultimate output is a tool for people to assemble their own noise-pollution data.

As I make gradual progress in my spare time, I’ll blog about it here with the ‘noise pollution’ tag. To start with, I’ll take a look at:

The equipment

Clearly, some kind of portable audio recorder is needed. For several years, when I made the occasional bit of sound art, I used a minidisc recorder [2], but now I have a Roland R-05 digital recorder. This has excellent battery life and enough storage for at least a couple of long walks. At present, you can get one from Amazon for GBP 159. When plugged into USB, it looks and behaves just like a memory stick. I have been saving CD-quality audio in .wav format, mindful that you can always degrade audio later but can’t recover quality that was never captured. That is pretty much the lowest quality the R-05 will capture anyway (barring .mp3 format, which I decided against because I don’t want the recorder dedicating computing power to compressing the sound data), so the files occupy as little space on the device as possible. It tucks away in a jacket pocket easily, so there’s no need to be encumbered by kit like you’re Chris Watson.

Pretty much any decent microphone, plus serious wind shielding, would do, but my personal preference is for binaurals, which are worn in the ear like earphones and capture a very realistic stereo image. Mine are Roland CS-10EM, which you can get for GBP 76. The wind shielding options are more limited for binaurals than for a hand-held mic, because they are so small. I am still using the foam covers that come with the mics (pic below), and wind remains something of a consideration in the procedure of capturing data, which I’ll come back to another time.

[Photo of the CS-10EM foam wind covers]

On the GPS side, there are loads of options and they can be quite cheap without sacrificing quality. I wanted something small that allowed me to access the data in a generic format, and chose the Canmore GT-730FL. This looks like a USB stick, recharges when plugged in, can happily log (every second!) for about 8 hours on a single charge, and lets you plug it in and download your trail in CSV or KML format. The precision of the trail was far superior to that of my mobile phone when I got it, though the difference is less marked now, even against a Samsung J5 (J stands for Junior (not really)). There is a single button on the side, which adds a flag to the current location datum when you press it. That flag shows up in KML format in its own field, but is absent from CSV. They cost GBP 37 at present.

There are two major drawbacks: the documentation is awful (remember when you used to get appliances from Japan in the 80s and none of the instructions made sense? Get ready for some nostalgia) and the data transfer is by virtual serial port, which is straightforward on Windows with the manufacturer’s Canway software but a whole weekend’s worth of StackOverflow and swearing on Linux/OS X. Furthermore, I have not been able to get the software working on anything but an ancient Windows Vista PC (can you imagine the horror). Still, it is worth it to get that trail. There is a nice blog post by Peter Dean (click here), which details what to do with the Canmore and its software, and compares it empirically to other products. The Canway software is quite neat in that it shows you a zoomable map of each trail, and is only a couple of clicks away from exporting to CSV or KML.
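For the curious, getting a trail out of the KML export needs nothing beyond the standard library. A minimal sketch, assuming the generic KML layout (whitespace-separated "lon,lat[,alt]" tuples inside coordinates elements) and a made-up filename; the real Canway export may nest things differently:

```python
import xml.etree.ElementTree as ET

# Generic-KML assumption: trail points live in <coordinates> elements as
# whitespace-separated "lon,lat[,alt]" tuples. Adapt to the real export.
KML_NS = "{http://www.opengis.net/kml/2.2}"

def trail_points(path):
    points = []
    for coords in ET.parse(path).getroot().iter(KML_NS + "coordinates"):
        for triple in (coords.text or "").split():
            lon, lat = triple.split(",")[:2]
            points.append((float(lon), float(lat)))
    return points

print(len(trail_points("walk.kml")), "points logged")  # hypothetical file
```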

Having obtained the .kml file for the trail plus starting point, the .csv file for the trail in a simpler format, and the .wav file for the sound, the next step is synchronising them, trimming to the relevant parts and then summarising the sound levels. For this, I do a little data-focussed programming, which is the topic for next time.
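To give a flavour of that summarising step before the full write-up, here is a minimal sketch that collapses a recording into one RMS level per second, ready to join to the once-per-second GPS trail by elapsed time. It assumes 16-bit samples and a made-up filename:

```python
import wave
import numpy as np

def second_levels(path):
    """One RMS level (dB relative to full scale) per second of audio.
    Assumes 16-bit samples, as saved by the R-05 in CD-quality mode."""
    with wave.open(path) as w:
        rate, nch = w.getframerate(), w.getnchannels()
        samples = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)
    per_sec = rate * nch  # stereo samples come interleaved L/R
    n = len(samples) // per_sec
    chunks = samples[: n * per_sec].astype(float).reshape(n, per_sec)
    rms = np.sqrt((chunks ** 2).mean(axis=1))
    return 20 * np.log10(rms / 32768)  # dBFS, so quieter = more negative

print(second_levels("walk.wav")[:10].round(1))  # hypothetical file
```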

Footnotes

1 – these are JavaScript libraries that are really useful for flexible representations of data and maps. If you aren’t interested in that part of the process, just ignore them. There will be plenty of other procedural and analytic considerations to come that might tickle you more.

2 – unfairly maligned; I heard someone on the radio say recently that, back around 2000, if you dropped a minidisc on the floor, it was debatable whether it was worth the effort to pick it up


Visualizing HDI: a fine D3 exemplar

This interactive visualisation of Human Development Index values, by country and over time, was released last week.

For me, it follows in the mould of The State of Obesity, but is much more transparent in how it is constructed when you look at the source code. That makes it a good exemplar — in fact, perhaps the exemplar currently available — for introducing people to the possibility of making interactive dataviz for their research projects.

That transparency is important, not just for teaching about dataviz, but for the whole community making and innovating interactive data visualisation. Oh for those early days of D3, when nobody was terribly fluent with it, and websites would have all the code right there, easy to read and learn from. Now the code is tucked away behind obscure links upon links, uglified and mashed up with other JS libraries familiar to the maker but (probably) not to you. There are obvious commercial pressures to tuck the code away somewhere, and you can actually obtain software to obfuscate it deliberately. At the same time, having everything in one file, hard-coded for the task at hand, may be easy to learn from, but it isn’t good practice in any kind of coding culture, so if you want to be respected by your peers and land that next big job, you’d better tuck it all away in reusable, super-flexible in-house libraries. And yet the very availability of simple D3 code was what kick-started the current dataviz boom. Everyone could learn from everyone else really quickly because everything was open source. I don’t like to think that was a short-lived phase in the early part of the technology’s life cycle, but maybe it was…

Anyway, that’s enough wistful nostalgia (I learnt yesterday that I am the median age for British people, so I am trying not to sound like an old duffer). Here are the things I don’t like about it:

  1. it requires a very wide screen; there’s no responsiveness (remember your audience may not work in a web design studio with a screen as big as their beard)
  2. life expectancy gets smoothed while the other variables don’t, which just looks a bit odd
  3. why have colors for continents? Doesn’t it distract from the shading? Don’t we already know which one is which?
  4. Why give up on South Sudan, Somalia (which seems to be bunged together with Somaliland and Puntland in one big “hey it’s somewhere far away, they won’t notice” sort of way) and North Korea? Aren’t these countries’ estimates kind of important in the context, even if they are not very good? Do you really believe the Chinese estimates more than these just because they’re official?

But all in all, these are minor points and a nice amount of grit for the mill of students thinking about it. I commend it to you.


Look down the datascope

Maarten Lambrechts has a great post over at his blog. It’s all about interactive dataviz, framing it as a datascope which – like a telescope – lets you look deep into the data and see stuff you couldn’t otherwise. You must read it! But just to give you the punchline:

A good datascope

  1. unlocks a big amount of data
  2. for everyone
  3. with intuitive controls
  4. of which changes are immediately represented by changes in the visual output
  5. that respects the basic rules of good data visualization design
  6. and goes beyond what can be done with static images.

Maybe I should add a 7th rule: a facet or view of the datascope should be saveable and shareable.

Thanks to Diego Kuonen for sharing on Twitter.


Visualizing multivariate data – at the RSS

I’ll be talking (and videoing) at the RSS on 11 March. This is in part a re-run of the highly popular dataviz session at the conference last September, though not every speaker could make it, so I’m also giving an overview. You can expect an introduction to interactive online graphics for people more familiar with stats software, and advice on how to get started making your own.

Tuesday 11 March 2014, 2:00pm – 5:00pm
Location: Royal Statistical Society, 12 Errol Street, London, EC1Y 8LX

Programme:
2pm Urska Demsar (St Andrews) Bringing together geovisualisation, time geography and computational ecology: using space-time density of trajectories to visualise dynamics in animal space use over time

2:40pm Duncan Smith (LSE) An Urban Renaissance Achieved? Visualising Urban Form, Dynamics and Sustainability

3:20pm Tea/Coffee/Biscuits

3:50pm Robert Grant (St. George’s) Pretty persuasion: visualisation trends and tools from a statistician’s viewpoint

4:30pm Discussion: “The role of Statisticians in data visualisation research”

5pm Close

Booking with payment required – please book using the relevant booking form.

Registration fees:
£20 RSS Student & Retired Fellows
£22 RSS CStats & GradStats
£25 RSS Fellows
£35 RSS section & student members
£45 None of the above


Data viz comes to Errol Street

That sounds a little unfair. It’s not that the Royal Statistical Society is inimical to visualization, just that they don’t keep an eye on it in the way that a certain zone of the blogosphere does. My own Stat Comp committee’s session on visualization at the conference in Newcastle last month was the most overcrowded room of the whole four days. We brought in double the chairs and still people stood out into the corridor and sat on the floor, which was pleasing but not all that surprising.

Last Thursday they held a joint meeting with the Association for Survey Computing on data viz with speakers from the Guardian Digital Agency and the Office for National Statistics. There were a lot of people in the room, and apparently a waiting list for cancellations to attend.

The speakers from Guardian Digital described a process comprising data – story – chart – design, with quite a long time spent getting the data right and looking for the potential stories. (You have to remember, these guys typically start with an interesting dataset gathered for its own merits, not a study with a pre-specified hypothesis.) If a barrier is encountered at any point, you have to start again to ensure integrity.

They cited the case of Ivan Cash’s infographic of infographics, which excels in all aspects except the data, being based on a small number of cases from a single website. The lack of data integrity makes the whole thing collapse. Not that it was ever intended as more than a bit of fun.

Alan Smith from the ONS had a bunch of interactive graphics to share, some already published, some new. My favourite was the internal migration map of the UK, which reduces an intractable (for human brains) 350×350 transition matrix to a clickable map showing where people move home from and to.
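To see why a click beats scanning the matrix, here is a toy sketch with made-up numbers standing in for the real ONS figures: clicking area i effectively just reads row i (destinations) and column i (origins):

```python
import numpy as np

# Made-up stand-in for the ONS data: M[i, j] = moves from area i to area j
rng = np.random.default_rng(1)
M = rng.poisson(5, size=(350, 350))
np.fill_diagonal(M, 0)  # ignore within-area moves

def flows_for(i, top=5):
    """What clicking area i on the map effectively asks of the matrix."""
    out_to = np.argsort(M[i, :])[::-1][:top]   # row i: top destinations
    in_from = np.argsort(M[:, i])[::-1][:top]  # column i: top origins
    return out_to, in_from

print(flows_for(42))
```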

Two good questions from the audience which I paraphrase:

Q: Where do your data viz people sit within the organisation to achieve this level of integration and output?
A: Part of methodology, so we are seen as working on best practice; early discussion saw some managers suggesting we fall under the IT department because we would be doing things with computers [cue widespread mirth].

Q: Will there still be a future for static images, printed or otherwise, or will everything have to be interactive?
A: Static images remain very important. We can’t expect all information to be absorbed online and through no other medium, and also a lot of animations are really just the precursor to a static image that can be considered in depth [e.g. Every Drone Strike]. But it is quite simple to translate a static image into a clickable one for online consumption using some of the newer JavaScript libraries like D3. The idea of Data Driven Documents should appeal to statisticians. The RSS Centre for Statistical Education have used interactive graphics as a stimulus to learning about statistical thinking, although most interest from academics has come from web design / computer science departments, not statisticians. This should act as a wake-up call to statisticians to get involved and acquire these new skills.
[My emphasis]
