How to assess quality in primary care

Jim Parle, of the University of Birmingham, and I have an editorial just out in the BMJ responding to the recent Health Foundation report on quality indicators in primary care. There’s a lot one could say about this subject but we had to be brief and engaging. Hopefully the references serve as a springboard for readers who want to dig in more. In brief:

  • We think it’s great that composite indicators received a strongly worded ‘no’; remember that Jeremy Hunt (and probably Jennifer Dixon too) started this process quite keen on global measures of quality that would reduce all the complexity of primary care organisation and care to a single traffic light.
  • We agree that a single port of call would be invaluable. Too much of this information is scattered about online
    • but along with that, there’s a need for standardised methods of analysis and presentation; this is not talked about much but it causes a lot of confusion. At NAGCAE, my role is to keep banging on about this to make analysts learn from the best in the business and to stop spending taxpayers’ money reinventing wheels via expensive private-sector agencies
    • and interactive online content is ideally suited to this, viz ‘State of Obesity’
  • We think they should have talked about the value of accurate communication of uncertainty, arising from many different sources. Consider Elizabeth Ford’s work on GP coding, or George Leckie and Harvey Goldstein on school league tables (googlez-vous).
  • We also think they should have talked about perverse incentives and gaming. It never hurts to remind politicians of Mr Blair’s uncomfortable appearance on Question Time.

Leave a comment

Filed under healthcare

Everything you need to make R Commander locally (packages, dependencies, zip files)

I’ve been installing R Commander on laptops for our students to use in tutorials. It’s tedious to put each one online with my login, download it all, then disable the internet (so they don’t send lewd e-mails to the vice-chancellor from my account, although I could always plead that I had misunderstood the meaning of his job title). I eventually downloaded every package it needs, and I can now install the whole lot from a USB stick. But I couldn’t find a single list of all the Rcmdr dependencies, recursively. Maybe it’s out there, but I didn’t find it. So, here it is. You might find it useful.
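If you want to regenerate that list against a current CRAN snapshot, something along these lines should do it (a sketch only: the repository URL and the destination folder are placeholders, and the exact set of packages will depend on the snapshot you point it at):

cran <- "https://cran.r-project.org"
db   <- available.packages(contriburl = contrib.url(cran))
deps <- tools::package_dependencies("Rcmdr", db = db,
                                    which = c("Depends", "Imports", "LinkingTo"),
                                    recursive = TRUE)$Rcmdr
deps <- intersect(deps, rownames(db))   # drop the base packages that ship with R
dir.create("rcmdr_offline", showWarnings = FALSE)
download.packages(c("Rcmdr", deps), destdir = "rcmdr_offline", repos = cran)
## then, on the offline laptop, point install.packages() at the folder:
## install.packages(list.files("rcmdr_offline", full.names = TRUE), repos = NULL)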

I suppose this is one of my less engaging posts…


Filed under learning, R

Frequentist accuracy of Bayesian estimates

Today I sat in on the Royal Statistical Society’s webinar discussing Brad Efron’s recent paper “Frequentist Accuracy of Bayesian Estimates”. Prof Efron introduced the paper, Andrew Gelman gave a response with three questions, and there were questions from the floor. It all worked rather well in terms of logistics, and I hope the RSS does more online broadcasts of events – or broadcasts of online events. It is quite easy to do this sort of activity now, and you don’t have to do a training course beforehand: it was Prof Efron’s first ever webinar. I think the trick is to make use of existing simple technology and not to attempt to reinvent the wheel, or feel so anxious about it that one hires an expensive agency to take care of the event. I base this on the very positive experience of Harvard Med School’s GCSRT and associated blended learning courses, which built a strong curriculum out of mostly free tools like Facebook, Dropbox and Google Hangouts. Not only do the students get an online learning environment, they learn how to create their own environment for collaboration.

I wanted to write this and post it quickly because the paper is available to all for free until 4 November 2015. The recording of the webinar is on the RSS journals webinar page, and will move to YouTube soon.

The point of this paper is that when we make some estimate and an interval around it by Bayesian methods, we can interpret the interval in a frequentist framework, provided the prior is genuinely informative and informed by prior evidence. That is to say, given that prior, we could reasonably imagine the study and its analysis taking place repeatedly, forever, in identical circumstances. Our interval, whether we call it confidence or credible, would be calculated afresh each time, and 95% of those intervals would contain the true value.
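If you want to see that claim in action, here is a toy simulation of my own (nothing to do with the paper’s examples, and every number below is an arbitrary choice of mine): draw the true value from an empirically grounded prior, draw the data given that value, compute the usual conjugate 95% credible interval, and repeat. Because the prior really is the machine generating the true values, the long-run coverage comes out at about 95%.

## normal-normal conjugate model: x_i ~ N(theta, sigma^2), theta ~ N(m0, s0^2)
set.seed(42)
n <- 20; sigma <- 1
m0 <- 0; s0 <- 2                      # the 'empirical' prior
covered <- replicate(10000, {
  theta <- rnorm(1, m0, s0)           # nature draws the true value from the prior
  x <- rnorm(n, theta, sigma)         # then we collect the data
  post.var  <- 1 / (n / sigma^2 + 1 / s0^2)
  post.mean <- post.var * (sum(x) / sigma^2 + m0 / s0^2)
  ci <- post.mean + c(-1.96, 1.96) * sqrt(post.var)
  ci[1] < theta && theta < ci[2]      # did the credible interval cover it?
})
mean(covered)                         # roughly 0.95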

Here comes a philosophical aside; you can omit it without loss of understanding, as they say. Now, here I confess some confusion regarding what that true value really is, because it seems to reflect both a sample of data and a prior, but for a whole population the prior would cease to influence it. Is it a statistic summarising the posterior distribution, or the population? Does it matter, or am I splitting hairs? Furthermore, is it any more reasonable to imagine indefinite collection of data, influenced by an empirical prior, without the prior ever being updated? Perhaps the datasets have to be collected simultaneously without communication, but even if we admit the plausibility of this (remember that we cannot make inferences about one-off events like elections without bending the rules), we are faced with the problem of subjectivity lurking within. If information is available but is just not known, does that make events ‘random’ in the frequentist sense that they can be described by probability distributions? Most old-school frequentists would say no, because to say yes means that when researcher A learns of researcher B’s data, researcher A’s inferences collapse like a Schrödinger wave function. What if A knows about B but nobody knows that A knows? It could be random for B but not for A: subjectivity, goddam it! Everybody for the lifeboats! This is not to mention the fact that the (human) subject of the research knows even before you ask them. On the other hand, if they say no, then they deny their own frequentist framework, because the very existence of unknown-but-knowable information means their experiments can never be truly repeated, and so there is no such thing as probability (and they might be onto something there… but that’s another story). This, friends, is why frequentism is a steaming pile of poop. But we will leave that aside, because nothing in statistics stands up to philosophical scrutiny except (I think!) personally subjective Bayes, and probably (boom boom!) Jaynes’s probability-as-extended-logic. None of it quite makes sense, but it is useful nonetheless.

So, on to the paper. If you don’t have an empirically informed prior (and not many of us use them), what would the frequentist properties of the resulting intervals even mean? In such circumstances, we have to some extent made a subjective choice of prior. This immediately made me think of Leamer’s fragility analysis, in his paper “Let’s Take The Con Out Of Econometrics” (I’m grateful to Andrew Chesher for pointing me to this very entertaining old paper, something of a classic of econometrics but unknown to statistical me). Efron explores the question via the multivariate gradient of the log-probability of the data with respect to its summary statistic(s). This gives standard errors via the delta method, or you can get a bootstrap instead. There is a very nice motivating example in which a summary of the CDF for a particular participant, in a lasso regression predicting diabetes progression, gets both a Bayesian and a frequentist CI, and the ratio of their widths is explored. It turns out that there is a formula for this ratio yielding a matrix whose eigenvalues give a range of possible values, and this sets bounds on the frequentist-Bayesian disagreement for any given parameter – potentially a useful diagnostic.
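Out of curiosity, here is a rough sketch of that delta-method calculation on a toy normal model of my own (emphatically not the paper’s diabetes example, and all the specifics below are my own arbitrary choices): the gradient of the posterior expectation with respect to the data is the posterior covariance between the parameter summary and the gradient of the log-likelihood with respect to the data, and the frequentist standard error follows by sandwiching that gradient around an estimate of the data covariance.

## x_i ~ N(theta, sigma^2) with sigma known; theta ~ N(0, 100); t(theta) = theta
set.seed(1)
sigma <- 2; n <- 50
x <- rnorm(n, mean = 1, sd = sigma)

## conjugate posterior, sampled directly to stand in for MCMC output
post.var  <- 1 / (n / sigma^2 + 1 / 100)
post.mean <- post.var * sum(x) / sigma^2
theta <- rnorm(10000, post.mean, sqrt(post.var))
t.theta <- theta                      # the posterior summary of interest

## gradient of E[t | x] w.r.t. each x_i = posterior cov(t, d log f / d x_i),
## and d log f / d x_i = (theta - x_i) / sigma^2 in this model
grad <- sapply(x, function(xi) cov(t.theta, (theta - xi) / sigma^2))
V <- diag(sigma^2, n)                 # estimated covariance of the data x
freq.sd <- sqrt(drop(t(grad) %*% V %*% grad))
c(bayes = sqrt(post.var), freq = freq.sd, classical = sigma / sqrt(n))
## with this weak prior all three roughly agree; tighten the prior and they diverge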

Prof Gelman liked the non-ideological nature of the paper (I do too, despite the rant above), but wondered what the practical conclusion was: to trust Bayes because it is close to frequentist under some specific conditions, or to do the comparison with theoretical simulation each time one does a Bayesian analysis (and to use empirical priors… well, that might not work very often). What if one used Weakly Informative Priors (WIPs), falling somewhere between informative and non-informative? Could inference be done on some measure of the width of the prior? Efron liked this idea.

Gelman also liked the “creative incoherence” between the method employed for the bootstrap and the estimator it aims at. This is a bit like a confidence interval for a difference of medians sitting alongside a rank-based test, as discussed in these very pages a few months back. Why is coherence good? Why not be incoherent and see more than the two sides of the coin familiar from asymptotics, test and interval? Why not employ different methods to get complementary answers about a Bayesian model?

Leave a comment

Filed under Bayesian

Using colour in dataviz

One of my little obsessions when drawing any chart is to get the colour scheme defined at the beginning and stick with it. That way, I feel like I can hang out with the cool kids, also known as designers. Of course, they see straight through me, but I feel good about it anyway. Here are my top tips:

  1. Get the logo of any organisation or project you are working with, and suck out the colours. I would just save that image, bring out the GIMP, and use the eye-dropper to get a few colors.
  2. If you don’t have some corporate colour to fall in with, try an app like Color Grab, which uses your phone’s camera to save the colours at the centre. You’re aiming for understated elegance, like design-seeds. If you don’t want to stand up from your computer, get one of the classic Crayola colours.
  3. Use a palette-builder website to give you a scheme of colours that go with what you got from GIMP
  4. Specify those colours in R, CSS, or Stata, and marvel at the results (see the sketch after this list)
  5. In general, use a fairly bland and limited palette throughout, except where you want emphasis; use emphasis sparingly, with only one (or at a stretch two) accent colours
  6. Think very seriously about transparency. It rocks. If you have a hex code like #4682b4 (aka the strangely ubiquitous steelblue), you can just add another byte on the end for the alpha channel, for example: #4682b499. Note that, although most people call it transparency, it’s actually opacity (a higher value means more opaque), and CSS calls it opacity if you specify it separately from the colour code (you can also use the rgba() colour function, in CSS or when setting styles from JavaScript)
  7. Always save your graph in the native format of your data analysis package that made it, and in a vector graphics format like .svg, .pdf or .ps. You can always make raster images like .png later, but you can’t get back from them. (And always save the code that made the image – but you’d do that anyway, right?)
  8. Consider opening up the vector graphics version in Inkscape and messing about with the look there. In fact, if you have Stata or SPSS, you could add transparency in Inkscape.
  9. An unrelated point, but think hard about good fonts too. A nice design could be spoilt by a grim typeface. My current favourite is Transport Heavy, the official UK road sign font. And thanks to HM Government’s open data policies, it is free to download (not every ASCII character is there – but you can substitute the similar free font Swansea for those).
  10. Make a decision about how you’re going to spell the word (colour or color) and stick with it. Only academics get the luxury of messing about.
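Putting tips 4 and 6 together, here is a minimal R sketch; the green and the steelblue come from this post, the grey is just a filler of mine, and adjustcolor() is one of several ways of getting the alpha byte.

## define the palette once, up front, and reuse it everywhere
pal <- c(main = "#acca56", accent = "#4682b4", grey = "#bbbbbb")

## transparency: either append an alpha byte to the hex code yourself...
semi <- paste0(pal["accent"], "99")                       # about 60% opaque
## ...or let R work out the two hex digits for you
semi2 <- grDevices::adjustcolor(pal["accent"], alpha.f = 0.6)

plot(rnorm(200), rnorm(200), pch = 19, col = semi,
     main = "Overplotting tamed by transparency")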

The hexbin cholera map at the top of this blog is a fine old example of this process. I had at that time recently made some charts for a poster for my faculty, and used their lurid green (#acca56), which seemed perfect for a deadly water-borne infection (the Wal-Mart virus came to mind). (Believe it or not, at one point we had an entire wall in the reception area painted acca56, but I think it gave people migraines and had to go.) I looked up a matching palette and allocated its colours to the number of cholera deaths in each hexagon. Pretty grim, which is to say, pretty and grim.

Leave a comment

Filed under Visualization

Statistical demolition

What a remarkable article this is. Nickerson and Brown take apart a recent paper that claimed to show an effect of mindfulness on motivated perception. I am no psychologist and will make no judgment of the plausibility of the finding, but it is heartening (if somewhat scary) to see a thorough statistical critique like this written in the public domain, and indeed published (well done Personality & Individual Differences). If we had more painful demolitions like this, those who dabble in statistics might exercise more caution. I am sympathetic, for the simple reason that they probably find it hard to track down a statistical collaborator, though I am always suspicious of psychologists’ tendency toward gung-ho number-crunching.

It does seem harsh to include the Masters thesis of one of the authors in the demolition, though. Masters students have to follow the advice of a supervisor (not a statistician) and the requirements of the course, and they don’t get a chance to rewrite the thesis after marking, as one does with a doctoral thesis. I think it’s appropriate to look at it for an insight into the evolution of the ideas, but not to hunt for mistakes. It would be a very unusual Masters thesis that did not contain mistakes; there are certainly some in mine.

Leave a comment

Filed under Uncategorized

Showing a distribution over time: how many summary stats?

I saw this nice graph today on Twitter, by Thomas Forth:

but the more I looked at it, the more I felt it was hard to understand the changes over time across the income distribution from the Gini coefficient and the median. People started asking online for other percentiles, so I thought I would smooth each of them from the source data and plot them side by side:


Now, this has the advantage of showing exactly where in society the growth or contraction is, but it loses the engaging element of the wandering nation across economic space (cf Booze Space; where do we end up? washed up on the banks of the Walbrook?), which should not be sneezed at. Being engaging matters in dataviz.

Code (as you know, I’m a nuts ‘n’ bolts guy, so don’t go recommending ggplot2 to me):

uk <- read.csv("uk_income.csv")          # the trimmed-down source data (see below)
uk$Year <- as.numeric(substr(uk$Year, 1, 4))
# columns 4 to 22 hold the percentiles; sm is a smoothed copy of uk, same shape
# (the party colours and the lowess span are my choices; rows are in year order)
redcol <- "#d73027"
bluecol <- "#4575b4"
sm <- uk
sm[, 4:22] <- apply(uk[, 4:22], 2, function(y) lowess(uk$Year, y, f = 0.2)$y)
plot(uk$Year, sm[, 4], type = "n", xlab = "Year", ylab = "2013 GBP",
     ylim = range(unlist(sm[, 4:22])),
     main = "Percentiles of UK income over time",
     sub = "(Colour indicates governing political party)")
lines(uk$Year[1:3], sm[1:3, 4], col = bluecol)     # Macmillan, Douglas-Home
lines(uk$Year[4:10], sm[4:10, 4], col = redcol)    # Wilson I
lines(uk$Year[11:14], sm[11:14, 4], col = bluecol) # Heath
lines(uk$Year[15:19], sm[15:19, 4], col = redcol)  # Wilson II, Callaghan
lines(uk$Year[20:37], sm[20:37, 4], col = bluecol) # Thatcher, Major
lines(uk$Year[38:50], sm[38:50, 4], col = redcol)  # Blair, Brown
lines(uk$Year[51:53], sm[51:53, 4], col = bluecol) # Cameron
for (i in 5:22) {
  lines(uk$Year[1:3], sm[1:3, i], col = bluecol)     # Macmillan, Douglas-Home
  lines(uk$Year[4:10], sm[4:10, i], col = redcol)    # Wilson I
  lines(uk$Year[11:14], sm[11:14, i], col = bluecol) # Heath
  lines(uk$Year[15:19], sm[15:19, i], col = redcol)  # Wilson II, Callaghan
  lines(uk$Year[20:37], sm[20:37, i], col = bluecol) # Thatcher, Major
  lines(uk$Year[38:50], sm[38:50, i], col = redcol)  # Blair, Brown
  lines(uk$Year[51:53], sm[51:53, i], col = bluecol) # Cameron
}

(uk_income.csv is just the trimmed down source data spreadsheet)

Leave a comment

Filed under R, Visualization

The irresistible lure of secondary analysis

The one thing that particularly worries me about the Department of Health, in its various semi-devolved guises, making 40% cuts to non-NHS spending is this: some of the activities I get involved in or advise on rely on accurate data, and they can look beguilingly simple to cut by falling back on existing data sources. The devil, though, is in the detail. It is very hard to draw meaningful conclusions from data that were not collected for the purpose, but when the analytical team or their boss is pressed to give a one-line summary to the politicians, it all sounds hunky dory. The guy holding the purse strings might never know that the simple one-liner is built on flimsy foundations.

Leave a comment

Filed under healthcare