I saw an online advert for Altova MapForce software today while looking up my dentist’s phone number (aren’t those Google ads algorithms clever? Well, yes, sometimes). This looks rather interesting. You have your databases of various formats appear on the screen as lists of variables and then you link them together by joining the arrows to ‘pipe’ the data through various transforming functions and into the final combined format that you want to use.
This is an interface design I hadn’t seen before in the data world, although such things are widely used in audio signal processing. If you are scared of programming and quite a visual sort of person, this could come in handy. But I suspect that, by the time you have learnt to use this, you could have taught yourself a good dose of Python or R, which would give you much greater flexibility.
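For comparison, here is a minimal sketch of the same "pipe two sources through some transforming functions" idea in plain Python. Everything in it is invented for illustration: the `patients` and `visits` records, the field names and the `combine` function are hypothetical and have nothing to do with MapForce itself.

```python
# Two hypothetical sources, standing in for databases of different formats.
patients = [
    {"id": 1, "name": "ann smith", "dob": "1970-03-02"},
    {"id": 2, "name": "bob jones", "dob": "1985-11-30"},
]
visits = [
    {"patient_id": 1, "cost_pence": 12500},
    {"patient_id": 1, "cost_pence": 4000},
    {"patient_id": 2, "cost_pence": 9900},
]


def total_cost_by_patient(rows):
    """Transforming function: sum the visit costs for each patient id."""
    totals = {}
    for row in rows:
        pid = row["patient_id"]
        totals[pid] = totals.get(pid, 0) + row["cost_pence"]
    return totals


def combine(patients, visits):
    """Link the two sources on patient id and reshape into one output format."""
    totals = total_cost_by_patient(visits)
    return [
        {
            "id": p["id"],
            "name": p["name"].title(),          # tidy up capitalisation
            "birth_year": int(p["dob"][:4]),    # keep only the year
            "total_cost_pounds": totals.get(p["id"], 0) / 100,
        }
        for p in patients
    ]


combined = combine(patients, visits)
```

Each arrow you would draw on screen corresponds roughly to one line inside `combine`, which is part of why I suspect the code route is no harder to learn.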
There’s a nice paper just out in BMC Health Services Research by Kristofferson and colleagues where they looked at hospital mortality stats in Norway and counted deaths in three different ways:
- exclude patients transferred between hospitals and count deaths in hospital
- exclude patients transferred between hospitals and count deaths within 30 days wherever they happened
- count patients weighted by the proportion of time in each hospital, and count deaths within 30 days wherever they happened
OK, that’s not every possibility but the point is to test how sensitive a league table would be to changing this definition. The assumption is often made that mortality is the best statistic to fall back on when all else fails, but the notion that a patient is either dead or alive is all very well until you get down to the fine details of how you count these deaths… and then it gets complicated.
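To make the three counting rules concrete, here is a rough sketch on a handful of invented patients. It computes crude, unadjusted rates only; it does not reproduce the paper's modelling, and the record layout, the `mortality` function and its parameters are my own assumptions.

```python
from collections import defaultdict

# Invented records: each patient has stays as (hospital, days), a flag for
# death in hospital, and a flag for death within 30 days wherever it happened.
patients = [
    {"stays": [("A", 5)], "died_in_hospital": True, "died_30d": True},
    {"stays": [("A", 3)], "died_in_hospital": False, "died_30d": True},
    {"stays": [("A", 4)], "died_in_hospital": False, "died_30d": False},
    {"stays": [("A", 2), ("B", 6)], "died_in_hospital": True, "died_30d": True},
    {"stays": [("B", 7)], "died_in_hospital": False, "died_30d": False},
    {"stays": [("B", 1)], "died_in_hospital": False, "died_30d": False},
]


def mortality(patients, outcome, weight_transfers):
    """Crude death rate per hospital.

    outcome: which flag counts as a death.
    weight_transfers: if False, transferred patients are excluded;
    if True, every patient is split across hospitals by share of days.
    """
    deaths = defaultdict(float)
    at_risk = defaultdict(float)
    for p in patients:
        total_days = sum(days for _, days in p["stays"])
        transferred = len({hospital for hospital, _ in p["stays"]}) > 1
        if transferred and not weight_transfers:
            continue
        for hospital, days in p["stays"]:
            weight = days / total_days
            at_risk[hospital] += weight
            deaths[hospital] += weight * p[outcome]
    return {h: deaths[h] / at_risk[h] for h in at_risk}


in_hospital = mortality(patients, "died_in_hospital", weight_transfers=False)
within_30d = mortality(patients, "died_30d", weight_transfers=False)
weighted_30d = mortality(patients, "died_30d", weight_transfers=True)
```

With these made-up numbers, hospital A's rate doubles between the first and second definitions, and hospital B only acquires any deaths at all under the third, because its one death belongs to a transferred patient.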
They found a considerable number of hospitals moving in and out of being “outliers” when the definition of mortality changed. This is no great surprise to anyone who has analysed comparative hospital stats, or has looked into the methodological literature on it. But it remains the case that league tables get a lot of attention both from journalists and bureaucrats.
As further reading I cannot recommend highly enough the book “Performance measurement for health system improvement” and the landmark JRSS paper.
PS: the graphs in the Kristofferson paper are bad: inadequately labelled, ugly and confusing.
Here’s a lovely example of a good graph from the Guardian, 13 October 2012. It’s simple, it’s clear, it’s topical and it tells you something that you can’t easily understand from words or numbers.
Gas and electricity prices to the consumer in Britain have risen steadily while wholesale prices fluctuate but stay in the same ballpark (Guardian 13 October 2012)
Pretty simple graph, huh? But the message is really clear, so why make it more complex? The only grumble one might have is that the vertical axes don’t cover the same range, but even I don’t think that’s compulsory in this case. We are often told that prices have to go up because wholesale prices keep rising. In the most recent increase that sparked (no pun intended) this graph, British Gas said the rises were “unwelcome” and “outside our control”. And indeed they are right – the wholesale prices have gone up quite often… and then gone back down again. In fact, if we believe this graph, they are not very different to where they were six years ago before the credit crunch and recession. The graph confirms what cynical householders seem to suspect, that the price is ratcheted up and only token reductions are made. The profit margin appears to have doubled for electricity and quadrupled for gas. Of course, there may be investment to make in infrastructure out of that profit, but there’s no good complaining about that when you run a privatised industry. If only I’d invested in one of these energy companies… I’d hire someone to do my blogging for me!
Some public lectures coming up soon that may interest anyone with a social/medical/political focus. These are at Conway Hall, Red Lion Square, London.
- Our Public Relations, their Propaganda
- 8 Principles for successful optimists
- Prof David Healy
- The History and Future of Bioethics
- Prof Richard Ashcroft
- Preventative medicine? Are screening tests about science or politics?
- Dr Margaret McCartney
- The Ethics of Open Borders
- Prof Phil Cole
- Invisible England: Holding Therapy Practices in the UK
Thanks to Jay Ginn for posting these on the Radstats list.
Here’s a new paper just out in Nurse Education Today that my colleague (and boss) Prof Vari Drennan wrote with me and Liz Porter of Southampton. Just some simple statistics, but we traced graduates from Southampton with this dual qualification to find out what jobs they had held over time and what had been the drivers for leaving the old job and choosing the new one.
Spreadsheets do funny things to data. Firstly, they do a lot of formatting and sometimes that messes up your numbers. Secondly, they behave like programs but don’t get any of the rigorous checking that we would expect of a chunk of C++ code or the like. Here, should you still need to be told this, is a fine example of how you can end up in deep trouble without even knowing it’s coming. In 2010 it seems the organisation most Brits call MI5 tapped 134 phones by mistake. Oops! The reason was formatting in a spreadsheet:
In 2010 the Security Service reported 1061 errors to my office which can be split into two categories. First, subscriber data was acquired in relation to 134 incorrect telephone numbers. These errors were caused by a formatting fault on an electronic spreadsheet which altered the last three digits of each of the telephone numbers to ‘000’. These unfortunate errors were identified by the Security Service and duly reported, which is again a positive indication that public authorities are self auditing and identifying any conduct which constitutes non-compliance. A degree of unintended collateral intrusion occurred in relation to these 134 requests as the subscriber data acquired had no connection or relevance to any investigation or operation being undertaken by the Security Service. In line with paragraph 6.21 of the Code of Practice the Security Service has now destroyed this material. The technical fault on the spreadsheet has been rectified and all requests are also now checked manually before being sent to the CSPs which will reduce the potential for recurrence of such errors.
– from the 2010 annual report of the Interception of Communications Commissioner.
Remarkable stuff. Thanks to EuSpRIG for cataloguing this and many other horror stories.
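The report doesn't say exactly what the formatting fault was, so the following is only one hypothetical way it could happen: a phone number is treated as a number, and a cell format keeps fewer significant digits than the number has. The function names, the nine-digit limit and the example numbers are all my own assumptions. The second function is the sort of automated check that could sit alongside the manual one the Commissioner describes.

```python
def through_numeric_format(phone: str, sig_digits: int) -> str:
    """Simulate a numeric cell format that keeps only the leading
    sig_digits digits and fills the rest with zeros (hypothetical fault)."""
    n = int(phone)
    drop = max(len(str(n)) - sig_digits, 0)
    return str(n // 10**drop * 10**drop)


def altered_numbers(originals, exported):
    """Return (original, exported) pairs that no longer match."""
    return [(o, e) for o, e in zip(originals, exported) if o != e]


# Invented numbers, kept as text in the original request list.
originals = ["442079460123", "442079460456", "442079460000"]
exported = [through_numeric_format(p, sig_digits=9) for p in originals]
problems = altered_numbers(originals, exported)
```

Here the first two numbers come out ending in ‘000’ and are flagged, while the third happened to end in ‘000’ already and passes. Storing phone numbers as text rather than numbers sidesteps this particular trap.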