I recently wrote an article for Significance magazine in the series “A life in statistics”, which came out last week; you can read it free here. I spoke to Nathan Yau of FlowingData.com about his experiences and his predictions for the future of infoviz. Then I noticed that Andrew Gelman (Columbia) and Antony Unwin (Augsburg) had written on the topic of graphs vs infoviz for the March 2013 issue of the Journal of Computational and Graphical Statistics, with responses from some prominent viz people and a rejoinder. (In fact, as I write, that issue of the JCGS is not yet out, but you can read all the papers here, courtesy of Robert Kosara.)
Whatever you do with numbers, if you have read this far then you really need to go and look at the JCGS papers: they come from the real experts in the topic and crystallise a lot of debates that we are all having in half-baked ways around our water coolers. If you only have a 10-minute coffee break to spare, sit back and at least read the Significance article, not because I’m such a great writer but because it tries to condense the ideas into 4 pages for a lay audience, and because we all need to be on top of our game in making good visualizations rather than assuming it is someone else’s job. One of my own catchphrases: “You have to communicate as well as calculate”.
If there’s one thing I do feel smug about, it’s quoting Pierre Boulez for perhaps the first time in any stats periodical.
Boulez à la Yau
LSHTM are running a course on multiple imputation and inverse probability weighting, popular techniques for dealing with missing data, on 12-14 June 2013. The instructors are some of the world’s foremost experts, so this is a great chance to learn about these methods.
The course will:
- provide an introduction to the issues raised by missing data, and the associated statistical jargon (missing completely at random, missing at random, missing not at random)
- illustrate the shortcomings of ad-hoc methods for ‘handling’ missing data
- introduce multiple imputation for statistical analysis with missing data
- compare and contrast this with other methods, in particular inverse probability weighting and doubly robust methods, and
- introduce accessible methods for exploring the sensitivity of inference to the missing at random assumption
Through computer practicals using Stata, participants will learn how to apply the statistical methods introduced in the course to realistic datasets.
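To give a flavour of the two methods named above, here is a toy sketch in plain Python. It is nothing like the proper implementations the course will teach (Stata’s built-in routines do this properly, with variance estimation via Rubin’s rules); the function names and the little dataset are my own inventions, and drawing imputations straight from the observed values is a deliberately crude stand-in for a real imputation model. It assumes the data are missing at random.

```python
import random
import statistics

def mi_mean(values, m=20, seed=1):
    """Toy multiple imputation for a mean, assuming MAR.

    Each of the m rounds fills in the missing entries (None) by
    drawing at random from the observed values -- a crude stand-in
    for a proper imputation model -- and the m complete-data means
    are then pooled by simple averaging (Rubin's rule for the
    point estimate; a real analysis would also pool variances).
    """
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    pooled = []
    for _ in range(m):
        completed = [v if v is not None else rng.choice(observed)
                     for v in values]
        pooled.append(statistics.mean(completed))
    return statistics.mean(pooled)

def ipw_mean(values, p_observed):
    """Toy inverse probability weighting: each observed value is
    weighted by 1 / Pr(observed), so units that were likely to be
    missing count for more. The probabilities would normally come
    from a model of the missingness mechanism."""
    pairs = [(v, p) for v, p in zip(values, p_observed) if v is not None]
    return sum(v / p for v, p in pairs) / sum(1 / p for _, p in pairs)

# A made-up sample with two missing entries, and made-up
# probabilities of being observed for each unit.
data = [1.0, 2.0, 3.0, None, 5.0, None]
print(round(mi_mean(data), 2))
print(round(ipw_mean(data, [0.9, 0.9, 0.8, 0.5, 0.8, 0.5]), 2))
```

Both estimators nudge the naive complete-case mean towards what the full sample would have given; the contrast between them, and when each breaks down, is exactly what the course covers.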
Her Majesty’s government have released a short report into the recent consultation on the proposed changes to “strengthen” the NHS Constitution. My concern (see my previous blog post here) arose from the “no top-down reorganisation” of the Health Service; whether one likes it or not, it looks like private sector care providers are coming into the NHS fold. Will the Constitution require these providers to make their anonymised patient data available for research for the common good, or will it be their private property to sell and share as they see fit for commercial gain? Time will tell – but at least it doesn’t look all bad. The report says:
58. Respondents identified several specific areas where the proposed changes to the NHS Constitution could be made clearer. These include:
• that data are shared with and across organisations to facilitate joined-up care;
• how identifiable data are used for non-clinical and research purposes;
• how data are shared with colleagues outside the NHS and with private companies;
• more detail about whether ‘all’ or ‘appropriate’ staff will have access to an individual’s data and how data will be shared with non-NHS colleagues;
• definition of terms such as ‘anonymise’ and ‘relevant professionals’;
• clarification that only the data that are necessary for the other ‘relevant professionals’ purpose will be released;
• access to electronic data; and
• what the difference is between information and data.
I’m really pleased that this message got across, hopefully not just from me as a stats crank but from others concerned that we use our data to make the world a better place. But the devil will be in the detail. I am writing an extended blog entry on this for Radical Statistics in the next week or so.
This is the title of a new exhibition being held from 13 March to 17 April 2013 at the London School of Hygiene & Tropical Medicine in Bloomsbury, London. I love a good map, so I will be sneaking some time off to look at it when I’m in central London during that month.
Update 15 April 2013: last week I was in the area and went to see the exhibition. There were some really good things in there, and with a couple of days left I urge anyone in the area to get along. I was struck by the wonderful hand-drawn maps (sometimes with charts thoughtfully superimposed to good effect) from the first half of the 20th century.
The University of Bristol’s Centre for Multilevel Modelling maintains a Multilevel Gallery of publications that use multilevel statistical models.
Mostly, they use Bristol’s own software MLwiN, Stat-JR, and the Stata interface runmlwin, but even if you are not using these packages, you will find nice examples there of how to present multilevel results. As they say:
We hope that this will prove useful for teachers and learners of multilevel modelling as a source of examples to aid understanding, as well as for those who are currently writing a paper and could benefit from seeing how results from multilevel models are typically presented in their research area.
Periscopic have just released a very fine interactive infographic on the details and sheer numbers of gun murders in the USA. It balances numbers with stories and aesthetic impact. And, a bit like my favourite visualisation of 2012, the video by CarbonVisuals, it is quite scary when it suddenly accelerates. There are lots of options for exploring the data too – a very nice piece of work, and it is worth reading their comments on the experience of making it.