Monthly Archives: October 2012

Box-linking GUI for data conversion

I saw an online advert for Altova MapForce software today while looking up my dentist’s phone number (aren’t those Google ads algorithms clever? Well, yes, sometimes). This looks rather interesting. You have your databases of various formats appear on the screen as lists of variables and then you link them together by joining the arrows to ‘pipe’ the data through various transforming functions and into the final combined format that you want to use.

MapForce screenshot

This is an interface design I hadn’t seen before in the data world, although such things are widely used in audio signal processing. If you are scared of programming and quite a visual sort of person, this could come in handy. But I suspect that, by the time you have learnt to use this, you could have taught yourself a good dose of Python or R, which would give you much greater flexibility.
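
For comparison, here is roughly what the same "link the boxes" job looks like in plain Python. It is only a sketch: the two data sources, their field names and the transformations are all invented for illustration and have nothing to do with MapForce itself.

```python
import csv
import io

# Hypothetical source 1: a CSV export of patient details
patients_csv = "id,surname,dob\n1,Smith,1970-02-03\n2,Jones,1985-11-30\n"

# Hypothetical source 2: visit records that arrived in some other format
visits = [
    {"patient_id": "1", "sbp_mmHg": "142"},
    {"patient_id": "2", "sbp_mmHg": "118"},
]

# Index the patients by id so that each visit can be linked to its patient
patients = {row["id"]: row for row in csv.DictReader(io.StringIO(patients_csv))}

# "Pipe" each field through a small transformation into the combined format
combined = [
    {
        "id": v["patient_id"],
        "surname": patients[v["patient_id"]]["surname"].upper(),
        "birth_year": int(patients[v["patient_id"]]["dob"][:4]),
        "sbp_mmHg": int(v["sbp_mmHg"]),
    }
    for v in visits
]

print(combined)
```

Each key in the output dictionary plays the part of one arrow in the GUI: a source field, an optional transforming function, and a destination.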

Filed under Uncategorized

Hospital mortality league tables: another layer of uncertainty

There’s a nice paper just out in BMC Health Services Research by Kristofferson and colleagues where they looked at hospital mortality stats in Norway and counted deaths in three different ways:

  • exclude patients transferred between hospitals and count deaths in hospital
  • exclude patients transferred between hospitals and count deaths within 30 days wherever they happened
  • count patients weighted by the proportion of time in each hospital, and count deaths within 30 days wherever they happened

OK, that’s not every possibility but the point is to test how sensitive a league table would be to changing this definition. The assumption is often made that mortality is the best statistic to fall back on when all else fails, but the notion that a patient is either dead or alive is all very well until you get down to the fine details of how you count these deaths… and then it gets complicated.
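
To make the three definitions concrete, here is a toy sketch in Python. The patient records, the field names and the way I have implemented the weighting (each hospital gets a share of the patient equal to its share of the total days of stay) are my own assumptions for illustration, not the paper's data or code.

```python
from collections import defaultdict

# Hypothetical records: a list of (hospital, days) stays per patient, plus
# flags for death in hospital and death within 30 days wherever it happened.
patients = [
    {"stays": [("A", 5)], "died_in_hospital": True, "died_30d": True},
    {"stays": [("A", 3)], "died_in_hospital": False, "died_30d": True},
    {"stays": [("A", 2), ("B", 6)], "died_in_hospital": False, "died_30d": True},
    {"stays": [("B", 4)], "died_in_hospital": False, "died_30d": False},
]


def mortality(patients, definition):
    """Mortality per hospital under 'in_hospital', '30d_single' or 'weighted'."""
    deaths = defaultdict(float)
    cases = defaultdict(float)
    for p in patients:
        transferred = len({hosp for hosp, _ in p["stays"]}) > 1
        # The first two definitions exclude transferred patients altogether
        if transferred and definition in ("in_hospital", "30d_single"):
            continue
        died = p["died_in_hospital"] if definition == "in_hospital" else p["died_30d"]
        total_days = sum(days for _, days in p["stays"])
        for hosp, days in p["stays"]:
            weight = days / total_days  # 1.0 for a patient who was never transferred
            cases[hosp] += weight
            deaths[hosp] += weight * died
    return {hosp: deaths[hosp] / cases[hosp] for hosp in cases}


for definition in ("in_hospital", "30d_single", "weighted"):
    print(definition, mortality(patients, definition))
```

Even with four made-up patients, hospital A goes from 50% to 100% mortality and hospital B from 0% to about 43% depending on the definition, which is the sensitivity the paper is testing on a national scale.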

They found a considerable number of hospitals moving in and out of being “outliers” when the definition of mortality changed. This is no great surprise to anyone who has analysed comparative hospital stats, or has looked into the methodological literature on it. But it remains the case that league tables get a lot of attention both from journalists and bureaucrats.

As further reading I cannot recommend highly enough the book “Performance measurement for health system improvement” and the landmark JRSS paper.

PS: the graphs in the Kristofferson paper are bad: inadequately labelled, ugly and confusing.

Filed under Uncategorized

Energy profit margins show up in a good graph

Here’s a lovely example of a good graph from the Guardian, 13 October 2012. It’s simple, it’s clear, it’s topical and it tells you something that you can’t easily understand from words or numbers.

Gas and electricity prices, wholesale and retail, rising over time

Gas and electricity prices to the consumer in Britain have risen steadily while wholesale prices fluctuate but stay in the same ballpark (Guardian 13 October 2012)

Pretty simple graph, huh? But the message is really clear, so why make it more complex? The only grumble one might have is that the vertical axes don’t cover the same range, but even I don’t think that’s compulsory in this case. We are often told that prices have to go up because wholesale prices keep rising. In the most recent increase that sparked (no pun intended) this graph, British Gas said the rises were “unwelcome” and “outside our control”. And indeed they are right – the wholesale prices have gone up quite often… and then gone back down again. In fact, if we believe this graph, they are not very different to where they were six years ago before the credit crunch and recession. The graph confirms what cynical householders seem to suspect, that the price is ratcheted up and only token reductions are made. The profit margin appears to have doubled for electricity and quadrupled for gas. Of course, there may be investment to make in infrastructure out of that profit, but there’s no good complaining about that when you run a privatised industry. If only I’d invested in one of these energy companies… I’d hire someone to do my blogging for me!

Filed under Uncategorized

Lectures at Conway Hall, London

Some public lectures coming up soon that may interest anyone with a social/medical/political focus. These are at Conway Hall, Red Lion Square, London.

28/10/2012 Our Public Relations, their Propaganda Graham Bell Jay Ginn
04/11/2012 8 Principles for successful optimists Mark Stevenson Andrew Copson
18/11/2012 Pharmageddon Prof David Healy
25/11/2012 The History and Future of Bioethics Prof Richard Ashcroft
02/12/2012 Preventative medicine? Are screening tests about science or politics? Dr Margaret McCartney
09/12/2012 The Ethics of Open Borders Prof Phil Cole
16/12/2012 Invisible England: Holding Therapy Practices in the UK

Thanks to Jay Ginn for posting these on the Radstats list.

Filed under noticeboard

New paper: career paths of dual-qualified (nurse + health visitor) graduates

Here’s a new paper just out in Nurse Education Today that my colleague (and boss) Prof Vari Drennan wrote with me and Liz Porter of Southampton. Just some simple statistics, but we traced graduates from Southampton with this dual qualification to find out what jobs they had held over time and what had been the drivers for leaving the old job and choosing the new one.

Filed under research

Converting continuous to binary outcomes for meta-analysis

I was intrigued by a paper just out in the International Journal of Epidemiology by da Costa et al. They look into the difficult situation where you are carrying out a meta-analysis and some papers report odds ratios or relative risks for achieving a certain threshold of response to treatment (odds or risk of being a “responder”), while others report mean changes in outcomes. For example, some blood pressure studies might report mean changes in millimetres of mercury (mmHg) while others count how many people got down to the normal range. How does one then combine these studies without having the original data? The authors identify five different techniques for approximating an odds ratio from the continuous outcomes. They go on to compare how the techniques perform on real-life data where they knew both the odds ratio and the mean change, using studies in osteoarthritis of the knee or hip.

These are the five methods:

  • Hasselblad and Hedges (1995): multiply the standardised mean difference and its standard error by 1.81 – that’s the log-odds ratio and its standard error! (On average, if the mean scores follow a logistic distribution in all treatment groups)
  • Cox and Snell (1989): as above but multiply by 1.65 (assumes a normal distribution rather than logistic)
  • Furukawa & Leucht (2011): estimate a control group risk (or find it buried in the paper), then estimate the treatment risk using the SMD and probit transformations
  • Suissa (1991): similar to Furukawa & Leucht but using group-specific means, standard deviations and sample sizes; this should be superior if the group sizes are quite different to each other
  • Kraemer and Kupfer (2006): calculate a risk difference from an estimated area under the curve (AUC), which is just the CDF of the normal distribution at SMD/1.414
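
Here is a rough Python sketch of the simpler conversions, going only on the descriptions in the list above. The function names are mine. The exact constant π/√3 behind the 1.81, the formula risk difference = 2 × AUC − 1, and the sign convention in the probit step (a positive SMD means more responders) are my assumptions rather than anything quoted from the paper. I have left out Suissa's method because it needs the group-specific means, standard deviations and sample sizes.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def hasselblad_hedges(smd, se):
    """Log-odds ratio and its SE, assuming logistic distributions."""
    k = math.pi / math.sqrt(3)  # about 1.81
    return k * smd, k * se


def cox_snell(smd, se):
    """Log-odds ratio and its SE, assuming normal distributions."""
    return 1.65 * smd, 1.65 * se


def probit_treatment_risk(control_risk, smd):
    """Shift the control risk by the SMD on the probit scale.

    Assumes a positive SMD means more responders in the treatment group.
    """
    return STD_NORMAL.cdf(STD_NORMAL.inv_cdf(control_risk) + smd)


def odds_ratio(treatment_risk, control_risk):
    """Odds ratio from two risks."""
    return (treatment_risk / (1 - treatment_risk)) / (control_risk / (1 - control_risk))


def kraemer_kupfer_risk_difference(smd):
    """Risk difference from the AUC, taken as the normal CDF at SMD / sqrt(2)."""
    auc = STD_NORMAL.cdf(smd / math.sqrt(2))
    return 2 * auc - 1


smd, se = 0.5, 0.1  # made-up example values
print(hasselblad_hedges(smd, se))
print(cox_snell(smd, se))
print(odds_ratio(probit_treatment_risk(0.3, smd), 0.3))
print(kraemer_kupfer_risk_difference(smd))
```

The first two functions return values on the log scale, so exponentiate the first element to get an odds ratio. The probit function needs a control group risk from somewhere, which is exactly the awkward part mentioned in the list.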

Their conclusion is that all the methods are good enough except Kraemer & Kupfer, which in fact gave estimated odds ratios significantly different to the true ones, and so they recommend not using the method. I noticed in their Table 2 that the 4 recommended methods all showed an underestimated odds ratio when the baseline risk was less than 20%, although this was not a significant trend for any of them. I wonder how the techniques behave for small risks (0.01% to 1%)… that would be a nice project for somebody to try out.

The moral of the story is a familiar one to many statisticians: David Cox got there first. Seriously though, a simple heuristic method is usually good enough, because our aim is to help people see the pattern in the data, right? Somehow my generation of statisticians are much more fixated on fancy methods that work in every situation and have proven properties (and I am a bit guilty of that too), but it is sobering to remember the lessons of the days before immensely powerful computers on every desk: if you draw a histogram or quantile plot and then just multiply the SMD by 1.65, you will often get the same result.

Filed under noticeboard

A salutary lesson not to trust spreadsheets

Spreadsheets do funny things to data. Firstly, they do a lot of formatting and sometimes that messes up your numbers. Secondly, they behave like programs but don’t get any of the rigorous checking that we would expect of a chunk of C++ code or the like. Here, should you still need to be told this, is a fine example of how you can end up in deep trouble without even knowing it’s coming. In 2010 it seems the organisation most Brits call MI5 tapped 134 phones by mistake. Oops! The reason was formatting in a spreadsheet:

In 2010 the Security Service reported 1061 errors to my office which can be split into two categories.  First, subscriber data was acquired in relation to 134 incorrect telephone numbers.  These errors were caused by a formatting fault on an electronic spreadsheet which altered the last three digits of each of the telephone numbers to ‘000’.  These unfortunate errors were identified by the Security Service and duly reported, which is again a positive indication that public authorities are self auditing and identifying any conduct which constitutes non-compliance.  A degree of unintended collateral intrusion occurred in relation to these 134 requests as the subscriber data acquired had no connection or relevance to any investigation or operation being undertaken by the Security Service.  In line with paragraph 6.21 of the Code of Practice the Security Service has now destroyed this material.  The technical fault on the spreadsheet has been rectified and all requests are also now checked manually before being sent to the CSPs which will reduce the potential for recurrence of such errors.

– from the 2010 annual report of the Interception of Communications Commissioner.

Remarkable stuff. Thanks to EuSpRIG for cataloguing this and many other horror stories.
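
The report doesn't say exactly what the formatting fault was, so the following Python sketch is pure speculation about one way it could happen: a phone number gets treated as a number and shown to a limited number of significant digits. The function name, the nine-digit limit and the phone number (one from the range reserved for TV drama) are all my own inventions.

```python
def as_rounded_number(phone, sig_digits=9):
    """Mimic a numeric cell that keeps only `sig_digits` significant digits.

    Hypothetical mechanism for illustration, not the fault described in the report.
    """
    n = int(phone)
    drop = max(len(str(n)) - sig_digits, 0)
    return str(round(n, -drop))


original = "447700900123"  # made-up number, stored as text
mangled = as_rounded_number(original)
print(mangled)  # 447700900000

# A cheap guard: refuse to send anything that no longer matches its source
if mangled != original:
    print("phone number changed during processing:", original, "->", mangled)
```

The safer habit is to treat phone numbers, postcodes and ID codes as text from start to finish, and to compare what comes out against what went in before acting on it.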

Filed under Uncategorized