Monthly Archives: September 2012

Spreadsheet for calculating confidence intervals

I do most of my teaching in SPSS, often through gritted teeth, because of its blind spot for epidemiological analyses. Students are justifiably demoralised when I tell them you can only get an odds ratio and its confidence interval by clicking on the rather obscure “Cochran’s and Mantel-Haenszel statistics” option under “Crosstabs”… and you can’t get a relative risk at all, nor can you just ask for a confidence interval round a proportion. Gee, who needs these boring stats anyway? They won’t help us “increase revenue, outperform competitors, conduct research and make better decisions” [ – accessed 19 September 2012].

OK, that’s enough ranting. Here’s the solution: buy Stata. But in the meantime, until the shiny CD-ROM arrives from College Station, Texas, you can download this little spreadsheet which I hand out to my students. [Link updated 25 October 2013] It gives you an approximate and an exact CI in each case, and also you can look at the Excel formulas to try and work out what’s going on ‘under the hood’.
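For anyone who would rather read code than Excel formulas, here is a rough sketch in Python of the kind of calculation involved. It is not a transcription of the spreadsheet: I am assuming the usual textbook choices, namely a Wald interval as the approximate CI for a proportion, a Clopper-Pearson interval as the exact one, and log-scale Wald intervals for the odds ratio and relative risk (approximate only). All function names are made up for illustration.

```python
import math

Z95 = 1.959963984540054  # 97.5th percentile of the standard normal


def wald_ci(x, n, z=Z95):
    """Approximate (Wald) CI for a proportion x/n, clipped to [0, 1]."""
    p = x / n
    se = math.sqrt(p * (1 - p) / n)
    return max(0.0, p - z * se), min(1.0, p + z * se)


def prob_more_than(k, n, p):
    """P(X > k) for X ~ Binomial(n, p); this increases with p."""
    cdf = sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))
    return 1.0 - cdf


def solve_p(k, n, target, tol=1e-10):
    """Bisection: find p in [0, 1] such that P(X > k) = target."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prob_more_than(k, n, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Exact (Clopper-Pearson) CI for a proportion x/n."""
    # Lower limit: the p at which P(X >= x) = alpha/2.
    lower = 0.0 if x == 0 else solve_p(x - 1, n, alpha / 2)
    # Upper limit: the p at which P(X <= x) = alpha/2.
    upper = 1.0 if x == n else solve_p(x, n, 1 - alpha / 2)
    return lower, upper


def ratio_ci(a, b, c, d, z=Z95):
    """Odds ratio and relative risk, each with a log-scale Wald CI.

    2x2 table: a, b = exposed with / without the outcome;
               c, d = unexposed with / without the outcome.
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    rel_risk = (a / (a + b)) / (c / (c + d))
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))

    def interval(est, se):
        return est * math.exp(-z * se), est * math.exp(z * se)

    return (odds_ratio, interval(odds_ratio, se_log_or)), (
        rel_risk,
        interval(rel_risk, se_log_rr),
    )


# Example: 10 events out of 50, and a made-up 2x2 table.
print("Wald CI: ", wald_ci(10, 50))
print("Exact CI:", exact_ci(10, 50))
print("OR, RR:  ", ratio_ci(10, 40, 5, 45))
```

The exact interval is found by brute-force bisection on the binomial distribution, which is slow for large n but makes it obvious what “exact” means here. The ratio formulas break down if any cell of the table is zero.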


Filed under learning, SPSS, Stata

Data linkage: one-day conference

At the RSS in London on Friday 16 November 2012: subsidised by the ESRC (what lovely people!) to £35, or £25 for students. Click here for abstracts and booking. As their e-mail says:

Administrative data has many advantages for social science research.  Typically datasets are very large, can cover long time periods, are low cost and regularly updated.  The linkage of administrative data to other data sources provides further power and utility for research.  However, such linkage is generally problematic with legal, governance, access and ownership barriers to overcome.

I’m particularly looking forward to Harvey Goldstein’s talk on probabilistic matching of data…

Filed under learning, noticeboard

Training for analysing routine healthcare databases

This is a huge growth area: computerised health records bring “Big Data” to the world of medical research (even if you don’t believe Big Data is all that big or new). But opportunities to learn the tricks from the experts are rare, and believe me, there are a lot of tricks needed to get reliable findings out. So these three short courses at UCL (Royal Free Medical School) are selling out fast.

  • Course 1: An introduction to Primary care databases – Monday 5 November
  • Course 2: Missing data and new methods for multiple imputation of longitudinal electronic health records – Tuesday and Wednesday 6 – 7 November
  • Course 3: An introduction to Hospital Episode Statistics – Thursday and Friday 8 – 9 November

Trainers include members of the prolific THIN research team and experts in epidemiology, statistics and missing data. In particular, Course 2 is the only chance you’re going to get this year to learn two-fold FCS imputation, a new method for imputing longitudinal data without getting tangled up in horrendous multicollinear imputation models. It is so new that the methods paper is not even out yet, and you can learn from the inventors! They also have a new Stata command, -twofold-, to carry out the imputation.

Filed under advanced, learning, Stata

SPSS and R day in York

On 2 November there are some introductory health-focussed training sessions on SPSS and R software going on in York:

Two sets of two half-day parallel workshops and two tutorial sessions are provisionally planned to be held at the Alcuin Research Resource Centre, University of York, on Friday 2nd November 2012. The R workshops and a tutorial session on ‘Health Applications in SPSS’ will run in parallel from 10am to (approx) 12-45pm and two SPSS workshops and tutorials on ‘First steps in data analysis using R’, from 1-50pm to (approx) 4-30pm. Each delegate may, therefore, attend a morning workshop or tutorial session and/or an afternoon workshop or tutorial session.

Workshop topics and other details including booking forms are at The workshops and tutorials will be taught in an interactive hands-on workshop-style format, with frequent examples. A full set of notes and example files will be given to all workshop attenders. There will also be handouts at the tutorial sessions.

Filed under learning, noticeboard, R, SPSS

Stan: new software for Hamiltonian Monte Carlo

As much as we love Markov Chain Monte Carlo as a flexible method for estimating all sorts of statistical models even when old-fashioned likelihood-based estimators aren’t available, nobody likes waiting till next Christmas for it to converge, having to throw away most of their massively auto-correlated steps on Boxing Day, or scratching his or her head at 2 a.m. when yet another set of initial values fails.

In recent years there has been a flurry of activity devising better algorithms that explore the parameter space efficiently and give you posterior distributions. One such is Hamiltonian Monte Carlo, and now Andrew Gelman and colleagues have released version 1 of new software that provides us with the first off-the-peg tool to try this technique out for ourselves! It is called Stan and its homepage is here. I wonder if that logo is inspired by Professor Gelman’s journey to work every morning… There is also an R interface called RStan with a useful quick start guide here.

Excuse me, does this go to 116th Street?
Yes, with probability 1 as time tends to infinity.

Now, I haven’t tried this out yet but initial reports say “reliable” and “very fast”. These are words I like to hear!
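If you are wondering what Hamiltonian Monte Carlo actually does at each step, here is a minimal sketch in Python. I am assuming a one-dimensional standard normal target so that the gradient is trivial, with a fixed step size and a fixed number of leapfrog steps. The function name hmc_sample and its arguments are invented for illustration and have nothing to do with Stan's own interface; Stan itself does a great deal more, not least tuning these settings for you.

```python
import math
import random


def hmc_sample(log_density, grad_log_density, start, n_samples,
               step_size=0.2, n_leapfrog=10, seed=1):
    """Basic Hamiltonian Monte Carlo for a one-dimensional target.

    step_size and n_leapfrog are hand-picked here (an assumption of
    this sketch), not tuned automatically.
    """
    rng = random.Random(seed)
    q = start
    samples = []
    for _ in range(n_samples):
        # Draw a fresh momentum from a standard normal.
        p = rng.gauss(0.0, 1.0)
        q_new, p_new = q, p

        # Leapfrog integration: half step for momentum, alternating full
        # steps for position and momentum, then a final half step.
        p_new += 0.5 * step_size * grad_log_density(q_new)
        for step in range(n_leapfrog):
            q_new += step_size * p_new
            if step < n_leapfrog - 1:
                p_new += step_size * grad_log_density(q_new)
        p_new += 0.5 * step_size * grad_log_density(q_new)

        # Hamiltonian = potential energy (-log density) + kinetic energy.
        h_old = -log_density(q) + 0.5 * p * p
        h_new = -log_density(q_new) + 0.5 * p_new * p_new

        # Metropolis accept/reject corrects for the integration error.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples


# Target: standard normal, up to an additive constant on the log scale.
draws = hmc_sample(
    log_density=lambda q: -0.5 * q * q,
    grad_log_density=lambda q: -q,
    start=0.0,
    n_samples=5000,
)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print(f"mean = {mean:.3f}, variance = {var:.3f}")
```

The point of using the gradient is that each proposal travels a long way across the posterior while still being accepted most of the time, which is where the reduction in autocorrelation comes from.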

Filed under advanced

Meeting: causal inference in observational data

This promises to be a good gathering in Manchester on 17 October 2012:

14.00 – 14.50: Professor Jonathan Sterne, Head of School of Social and Community Based Medicine, The University of Bristol

Causal inference for dynamic treatment regimens: how analyses of observational data changed international guidelines on when to start antiretroviral therapy

14.50 – 15.40: Dr Rhian Daniel, London School of Hygiene and Tropical Medicine

Causal mediation analysis with multiple causally-ordered mediators

16.00 – 16.50: Professor Kate Tilling, The University of Bristol

Examining associations between gestational weight gain, birthweight and gestational age using multivariate multilevel models

Register by email to  or 0161 275 5764.

Filed under advanced, noticeboard

Systematic review of the prevalence of incontinence in people with dementia

I have a paper now in press with my colleagues Laura Cole and Prof Vari Drennan, as well as Dr Greta Rait and Prof Steve Iliffe of UCL. We review and critique the literature on incontinence problems among people with cognitive impairment and dementia. Clearly this is a terrible and important problem, and with an ageing population, service planners and commissioners need to know its size in the community, yet little has been done to give us good estimates. Hopefully this will be out soon in Neurourology & Urodynamics.

Filed under research