Two handy documents for making good UK maps

Everybody loves a good map. Even if you don’t have any reason to make one, your boss will love it when you do, so check this out and get yourself a pay rise (possibly).

First, this set of diagrams via ONS Geographies on Twitter, showing how the different terminologies of UK administrative geography overlap and nest. It looks horrible, like it was made by NSA staffers after their PowerPoint refresher day, but it does the trick, and I haven’t seen this pulled together in one simple way like this before.

Second, this nice presentation from the most recent LondonR meeting, by Simon Hailstone. He shows the value of proper mapping tools inside R with some real-life ambulance service data. Croydon, Romford, Kingston, West End, OK. Heathrow Airport is a bit of a surprise.

Leave a comment

Filed under R, Visualization

Visualising data for clinical audit – videos now online

HQIP held a most enjoyable seminar recently on presenting healthcare quality data to clinicians and the public. It’s all now online on YouTube and I recommend it to anyone working in this field.

It took them hours to do my make-up but it was worth it.

Really, I’d now like to reverse the order of topics in my talk (another version of it popped up at Hertfordshire University Business School), starting with interactivity and trends, then going into chart design and perception a bit more. Stats and software could appear at the end if there’s time. I’ve also decided to ditch the silly pictures and have concrete examples, good and bad, at every stage. We live and learn (mostly).

Leave a comment

Filed under Visualization

A must-read for academics and all who measure our outputs

A brief and important submission to HEFCE from Sir David Spiegelhalter. If you need a really brief version:

  • Indicators indicate, they do not measure what you really want to know, like the quality of someone’s work. What do you mean you haven’t written a definition of quality for me yet?
  • As the indicators get simpler (or are simplified into one mega-index or league table), they become all the more amenable to gaming and create more perverse incentives.

Or as I have not yet tired of saying, statistics is no substitute for thinking.

one, two, three, four, five…

Leave a comment

Filed under Uncategorized

Meta-analysis methods when studies are not normally distributed

Yesterday I was reading Kontopantelis & Reeves’s 2010 paper “Performance of statistical methods for meta-analysis when true study effects are non-normally distributed: a simulation study”, which compares fixed-effects and a variety of random-effects models under the (entirely realistic) situation where the studies do not happen to be drawn from a normal distribution. In theory they would be, if they were just perturbed from a global mean by sampling error, and that leads us to the fixed-effects model; the random-effects model says that there’s other stuff making the inter-study variation even bigger. The trouble is that by definition you don’t know what that ‘stuff’ is (or you would have modelled it – wouldn’t you?).

The random-effects options they consider (with my irreverent nutshell descriptions in brackets; don’t write and tell me they are over-simplifications, that’s the point) are: DerSimonian-Laird (pretend we know the intra-study SD), Biggerstaff-Tweedie (intra-study SDs are themselves drawn from an inverse-chi-squared distribution), Sidik-Jonkman (account for unknown global SD by drawing the means from a t-distribution), “Q-based” (test Cochran’s Q for heterogeneity: if significant, use D-L; if not, use FE), maximum likelihood for both mean and SD, profile likelihood for mean under unknown SD, and a permutation version of D-L proposed by Follmann & Proschan – and seeing as everyone else has been immortalized in the meta-analysis Hall of Infamy, I’m going to do it to them too. All in all, a pretty exhaustive list. They tested the methods in 10,000 simulations, with data drawn from a variety of skewed and leptokurtic population distributions and with different numbers of studies.
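To make the D-L nutshell concrete, here is what that estimator actually computes, sketched in a few lines of Python. The numbers are made up and the variable names are mine; this is a toy illustration of the method-of-moments calculation, not a substitute for a proper meta-analysis package.

```python
# Hand-rolled DerSimonian-Laird random-effects pooling.
# y: study effect estimates; v: their (assumed-known) sampling variances.
y = [0.10, 0.30, 0.35, 0.65, 0.45, 0.15]
v = [0.030, 0.040, 0.015, 0.050, 0.010, 0.020]
k = len(y)

# Fixed-effect (inverse-variance) weights and pooled mean
w = [1 / vi for vi in v]
sw = sum(w)
mu_fe = sum(wi * yi for wi, yi in zip(w, y)) / sw

# Cochran's Q, then the method-of-moments tau^2, truncated at zero
Q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, y))
tau2 = max(0.0, (Q - (k - 1)) / (sw - sum(wi ** 2 for wi in w) / sw))

# Random-effects weights fold the between-study variance tau^2
# into every study's variance, flattening the weights
w_re = [1 / (vi + tau2) for vi in v]
mu_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
se_re = (1 / sum(w_re)) ** 0.5

print(f"Q = {Q:.3f}, tau^2 = {tau2:.4f}")
print(f"RE pooled effect = {mu_re:.3f} (SE {se_re:.3f})")
```

The "pretend we know the intra-study SD" jibe is visible in the second line: the `v` are plugged in as if they were known constants, which is exactly the assumption the fancier methods above try to relax.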

Conclusion one is that they are all pretty robust to all but the most bizarre deviations from normality. Having said that, the authors offer a graph dividing the world into D-L-optimal and profile-likelihood-optimal regions, which I found a bit odd because, firstly, DerSimonian-Laird is never true, it’s just an approximation; secondly, profile likelihood requires bespoke, painful programming each time; and thirdly, they just said it doesn’t matter. I rather like the look of Sidik-Jonkman in the tables of results, but that may be a cognitive bias in me that prefers old-skool solutions along the lines of “just log everything and do a t-test, it’ll be about right and then you can go home early” (a strange attitude in one who spends a lot of time doing Bayesian structural equation models). I also like Follmann-Proschan for their auto-correcting permutations, but if a non-permuting method can give me a decent answer, why bother?

Interestingly, the authors have provided all these methods in an Excel plug-in (I can’t recommend that, but on your head be it) and the Stata package metaan, which I shall be looking into next time I have to smash studies together. In R, you can get profile likelihood (and, I think, Biggerstaff-Tweedie) from the metaLik package, and maximum likelihood, Sidik-Jonkman and a REML estimator too from metafor. Simpler options are in rmeta and some more esoteric ones in meta. However, it still seems to me that the most important thing to do is to look at the individual study effects and try to work out what shape they follow and what factors in the study design and execution could have put them there. This could provide the reader with much richer information than just one mega-result (oops sorry, inadvertently strayed a little too close to Eysenck there) that sweeps the cause of heterogeneity under the carpet.

1 Comment

Filed under R, Stata

stata2leaflet v0.1 is released

Use Stata? Want to make an interactive online map with markers at various locations, colored according to some characteristic, with pop-up information when they’re clicked on? Easy. Head over to my website and download stata2leaflet. It’s in a kind of alpha testing version, so what I really want are your suggestions on making it even better. You’ll see my plans for v0.2 there too.

You have data like this:

[screenshot: example dataset (s2l1)]

You type this:

stata2leaflet mlat mlong mlab, mcolorvar(mcol) replace nocomments ///
title("Here's my new map") ///
caption("Here's some more details")

A file appears which looks like this inside:

[screenshot: the generated HTML file (s2l2)]

And opens in your browser like this (static version because of WordPress.com restrictions – clickety click for the real McCoy):

[screenshot: stata2leaflet example map]

Leave a comment

Filed under Stata, Visualization

Open Data Institute lunchtime lecture on care.data &c &c

Get along to this, it’s sure to be good. The theme: why the government selling your data is not the same thing as the government creating open data for the public good.

Leave a comment

Filed under Uncategorized

care.data: the one must-read summary

The April 2014 issue of Significance, which has just dropped into my pigeon hole, has an excellent editorial summary of what’s been going on with the National Health Service’s care.data proposals to bundle up health data and share it, perhaps with researchers, perhaps with commercial enterprises. If you can only read one thing about care.data, make it this one. All the different aspects are together in one place.

[photo: the Significance editorial (20140529_112522)]

Confused of Tooting

1 Comment

Filed under Uncategorized