This is (perhaps the last) post in my short series of academic endeavours that never got finished and published.
This developed out of my Master's dissertation in the Medical Statistics course at the London School of Hygiene and Tropical Medicine. I was comparing different composite measures of hospital quality, and then I went on to explore ways of assessing and visualising the uncertainty in those measures.
What are composite performance indicators?
In the context of New Public Management, we have a bunch of hospitals (you can substitute schools, prisons, privatised railways or privatised deportation agencies or whatever), and politicians have set some very broad-brush goals for them (perhaps, that they should have low mortality and low re-admission rates, and that they should reduce any debt year-on-year). Some agency or Death Panel (the sort of thing I used to do for a living When We Were Very Young) expands this into some measurable indicators. They might have to prioritise things so that it isn’t too burdensome, and they end up with things like:
- % of patients with fungal toenail infections seen by a fungo-podiatrist within 24 hours of being diagnosed
- number of nurses per patient on the fungal toenail infection ward
- % of patients turning up a second time for their FTI, after you said you’d fixed it
(with apologies to anyone who suffers from fungal infections in the toenail, and feels I am making light of their plight; someone had to take the fall (why not you?))
Great, now we have three numbers, but someone is sure to say that they don't help patients choose a hospital and don't help funders direct the money to the best performers. You might be tempted to make a composite indicator by some mathematical process. It is often as crude as averaging them.
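As a minimal sketch of that crude approach, suppose the three indicators have been rescaled so that higher is better on a 0-1 scale (the names and values below are invented for illustration):

```python
# Hypothetical indicator values for one hospital, each rescaled to 0-1
# so that higher is better (values invented for illustration).
indicators = {
    "seen_within_24h": 0.82,     # process
    "nurses_per_patient": 0.60,  # structure, rescaled
    "readmission_free": 0.91,    # outcome
}

# The crudest composite: an unweighted average of the three.
composite = sum(indicators.values()) / len(indicators)
print(round(composite, 3))  # 0.777
```

Even this tiny example already hides choices: how each input was rescaled, and the fact that equal weights are themselves a weighting decision.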
One more thing I’ll mention here is that, following Donabedian, it is typical to classify indicators as structure (like the 2nd one above, measuring the facilities), process (like the 1st one, measuring whether you do the right things), or outcome (like the 3rd, measuring how the patient is doing after your care).
Sources of uncertainty
The most obvious way in which your composite indicator can give you the wrong answer is that it is assessed on a sample of patients, not all of them. This is sampling error, and we have a lot of statistical theory to tell us how big it might be. But there are other problems too.
Order of averaging
Reeves and colleagues wrote a paper in 2007 which hardly anyone has heard of — but they should have. They explored what happens when you have multiple indicators assessed on multiple patients, as is often the case. Do you summarise the indicators into one number for each patient, and then summarise the patients, or do it the other way round? It turns out that you can get quite different composite scores.
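A toy sketch of why the order matters, using invented data: when some indicators do not apply to every patient (a common situation), averaging within patients first weights the data differently from averaging within indicators first, and the two composites disagree.

```python
import statistics

# Hypothetical pass/fail scores for 3 patients on 3 indicators
# (1 = standard met, 0 = not met, None = indicator not applicable).
scores = [
    [1, 1, None],    # patient A
    [0, None, None], # patient B
    [1, 1, 1],       # patient C
]

def patient_first(scores):
    # Average each patient's applicable indicators, then average the patients.
    per_patient = [statistics.mean(v for v in row if v is not None)
                   for row in scores]
    return statistics.mean(per_patient)

def indicator_first(scores):
    # Average each indicator over the patients it applies to,
    # then average the indicators.
    per_indicator = [statistics.mean(v for v in col if v is not None)
                     for col in zip(*scores)]
    return statistics.mean(per_indicator)

print(round(patient_first(scores), 3))    # 0.667
print(round(indicator_first(scores), 3))  # 0.889
```

Patient B fails their single applicable indicator, which drags the patient-first composite down much harder than the indicator-first one. Neither answer is wrong; they just encode different judgements about what should count.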
Weighting and other calculations
To combine your indicators, you have some formula that takes multiple numbers as input and produces one number as output. That formula might give more weight to one input than another. You could choose weights on the basis of clinical importance, or you could opt for a variance-maximising summary such as the first principal component. You might also introduce implicit weights through steps like dichotomising some of the inputs before averaging them.
That choice obviously affects the composite scores. The tricky thing is that you cannot avoid a judgement about relative weights. Even if you just average the inputs, you will still be giving more weight to some than others: those with higher inter-hospital variance will have a bigger impact on the ranking. There is no value-free composite.
So, I made a poster and it was shown at a visualisation conference at the Open University in Milton Keynes in 2011. And here it is below. I haven’t managed to do anything further on this subject since then. If you would like to take it on, feel free. Get in touch if you want to discuss it.