Monthly Archives: June 2014

A must-read for academics and all who measure our outputs

A brief and important submission to HEFCE from Sir David Spiegelhalter. If you need a really brief version:

  • Indicators indicate, they do not measure what you really want to know, like the quality of someone’s work. What do you mean you haven’t written a definition of quality for me yet?
  • As the indicators get simpler (or are simplified into one mega-index or league table), they become all the more amenable to gaming and create more perverse incentives.

Or as I have not yet tired of saying, statistics is no substitute for thinking.

one, two, three, four, five…


Filed under Uncategorized

Meta-analysis methods when studies are not normally distributed

Yesterday I was reading Kontopantelis & Reeves’s 2010 paper “Performance of statistical methods for meta-analysis when true study effects are non-normally distributed: A simulation study”, which compares fixed-effects and a variety of random-effects models under the (entirely realistic) situation where the true study effects do not happen to be drawn from a normal distribution. In theory, the observed effects would be normal if they were just perturbed from a global mean by sampling error, and that leads us to the fixed-effects model. The random-effects model says that there’s other stuff making the inter-study variation even bigger, and the trouble is that by definition you don’t know what that ‘stuff’ is (or you would have modelled it, wouldn’t you?).

The random-effects options they consider (with my irreverent nutshell descriptions in brackets; don’t write and tell me they are over-simplifications, that’s the point) are DerSimonian-Laird (pretend we know the intra-study SD), Biggerstaff-Tweedie (intra-study SDs are themselves drawn from an inverse-chi-squared distribution), Sidik-Jonkman (account for unknown global SD by drawing the means from a t-distribution), “Q-based” (test Cochran’s Q for heterogeneity; if significant, use D-L, if not, use FE), maximum likelihood for both mean and SD, profile likelihood for the mean under unknown SD, and a permutation version of D-L proposed by Follmann & Proschan. Seeing as everyone else has been immortalized in the meta-analysis Hall of Infamy, I’m going to do it to them too. All in all, a pretty exhaustive list. The authors tested the methods in 10,000 simulations with data from a variety of skewed and leptokurtic population distributions, and with different numbers of studies.
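To make the D-L idea concrete, here is a minimal sketch in plain Python (not Stata or R, and not the paper's code). It uses the textbook moment estimator: treat each study's variance as known, compute Cochran's Q around the fixed-effects mean, and turn the excess of Q over its degrees of freedom into a between-study variance. The function name, the dictionary keys and the three made-up studies are my own assumptions for illustration.

```python
import math

def dersimonian_laird(effects, variances):
    """Pool study effects with the DerSimonian-Laird moment estimator.

    effects   -- observed study effects (hypothetical inputs)
    variances -- within-study variances, treated as known
    """
    k = len(effects)
    if k < 2:
        raise ValueError("need at least two studies")
    w = [1.0 / v for v in variances]          # fixed-effects (inverse-variance) weights
    sw = sum(w)
    fe_mean = sum(wi * y for wi, y in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effects mean
    q = sum(wi * (y - fe_mean) ** 2 for wi, y in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)        # between-study variance, truncated at zero
    w_re = [1.0 / (v + tau2) for v in variances]
    re_mean = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    re_se = math.sqrt(1.0 / sum(w_re))
    return {"fe_mean": fe_mean, "q": q, "df": k - 1,
            "tau2": tau2, "re_mean": re_mean, "re_se": re_se}

# Made-up data: three studies with equal within-study variance
result = dersimonian_laird([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(result)
```

The “Q-based” option would add one step: compare `q` against a chi-squared distribution with `df` degrees of freedom, and report `fe_mean` instead of `re_mean` when the test is not significant.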

Conclusion one is that they are all pretty robust to all but the most bizarre deviations from normality. Having said that, the authors offer a graph dividing the world into D-L optimal and profile likelihood optimal, which I found a bit odd because, firstly, DerSimonian-Laird is never true, it’s just an approximation; secondly, profile likelihood requires bespoke, painful programming each time; and thirdly, they just said it doesn’t matter. I rather like the look of Sidik-Jonkman in the tables of results, but that may be a cognitive bias in me that prefers old-skool solutions along the lines of “just log everything and do a t-test, it’ll be about right and then you can go home early” (a strange attitude in one who spends a lot of time doing Bayesian structural equation models). I also like Follmann-Proschan for their auto-correcting permutations, but if a non-permuting method can give me a decent answer, why bother?
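For the curious, a permutation version of D-L can be sketched along these lines, again in plain Python. As I understand the approach, the null hypothesis of no overall effect makes each study effect equally likely to be positive or negative, so you randomly flip the signs, recompute the D-L pooled mean each time, and see how often it is at least as extreme as the one you observed. Using the pooled mean itself as the test statistic, sampling the flips at random rather than enumerating them, and all the names and data below are my assumptions, not the published procedure.

```python
import random

def dl_pooled_mean(effects, variances):
    """DerSimonian-Laird random-effects pooled mean (moment estimator)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fe_mean = sum(wi * y for wi, y in zip(w, effects)) / sw
    q = sum(wi * (y - fe_mean) ** 2 for wi, y in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    return sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)

def sign_flip_p_value(effects, variances, n_perm=2000, seed=1):
    """Two-sided Monte Carlo p-value from random sign flips of the effects."""
    rng = random.Random(seed)                 # fixed seed so runs are repeatable
    observed = abs(dl_pooled_mean(effects, variances))
    hits = 0
    for _ in range(n_perm):
        flipped = [y if rng.random() < 0.5 else -y for y in effects]
        # small tolerance so exact ties count as "at least as extreme"
        if abs(dl_pooled_mean(flipped, variances)) >= observed - 1e-12:
            hits += 1
    # add one to numerator and denominator so the p-value is never zero
    return (hits + 1) / (n_perm + 1)

# Made-up data: eight studies that all point the same way
effects = [0.42, 0.55, 0.48, 0.61, 0.39, 0.52, 0.47, 0.58]
variances = [0.01] * 8
print(sign_flip_p_value(effects, variances))
```

With only a handful of studies there are few distinct sign patterns (2 to the power of the number of studies), so the smallest achievable p-value is limited, which is one practical reason to ask whether the permutations are worth the bother.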

Interestingly, the authors have provided all these methods in an Excel plug-in (I can’t recommend that, but on your head be it) and the Stata package metaan, which I shall be looking into next time I have to smash studies together. In R, you can get profile likelihood (and, I think, Biggerstaff-Tweedie) from the metaLik package, and maximum likelihood, Sidik-Jonkman and a REML estimator too from metafor. Simpler options are in rmeta and some more esoteric ones in meta. However, it still seems to me that the most important thing to do is to look at the individual study effects and try to work out what shape they follow and what factors in the study design and execution could have put them there. This could provide the reader with much richer information than just one mega-result (oops sorry, inadvertently strayed a little too close to Eysenck there) that sweeps the cause of heterogeneity under the carpet.


Filed under R, Stata

stata2leaflet v0.1 is released

Use Stata? Want to make an interactive online map with markers at various locations, colored according to some characteristic, with pop-up information when they’re clicked on? Easy. Head over to my website and download stata2leaflet. It’s in a kind of alpha testing version, so what I really want are your suggestions on making it even better. You’ll see my plans for v0.2 there too.

You have data like this:


You type this:

stata2leaflet mlat mlong mlab, mcolorvar(mcol) replace nocomments ///
title("Here's my new map") ///
caption("Here's some more details")

A file appears which looks like this inside:


And opens in your browser like this (static version because of restrictions – clickety click for the real McCoy):



Filed under Stata, Visualization