Cochrane reviews made a huge difference to evidence-based medicine by forcing consistent analysis and writing in systematic reviews, but now I find them losing the plot in a rather sad way. I wanted to write a longer critique while still indemnified as a university employee, and only after the publication of a review I have nearly completed with colleagues (all of whom say “never again”). But those two things will not overlap. So, I’ll just point you to some advice on writing a Summary Of Findings table (the only bit most people read) from the Musculoskeletal Group:

- “Fill in the NNT, Absolute risk difference and relative percent change values for each outcome as well as the summary statistics for continuous outcomes in the comments column.”

“Summary”, you say? Well, I’m all for relative + absolute measures, but the NNT is a little controversial nowadays (cf Stephen Senn everywhere) and are all those stats going to have appropriate measures of uncertainty, or will they be presented as gospel truth? With continuous outcomes, we were required to state means, SDs, % difference, and % change in either arm, which seems a bit over the top to me, and, crucially, relies on some pretty bold assumptions about distributions: assumptions that are not necessary elsewhere in the review.
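To make the uncertainty point concrete, here is a minimal sketch of an absolute risk difference and NNT with a simple Wald interval. The event counts are made up for illustration, and the function names are mine, not anything from RevMan or the group's guidance.

```python
import math

def risk_difference_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Absolute risk difference (control minus treatment) with a Wald interval."""
    p_t = events_t / n_t
    p_c = events_c / n_c
    ard = p_c - p_t
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    return ard, ard - z * se, ard + z * se

def nnt_from_ard(ard):
    """NNT is the reciprocal of the absolute risk difference."""
    return math.inf if ard == 0 else 1 / ard

# Hypothetical trial: 30/100 events on treatment, 40/100 on control.
ard, lo, hi = risk_difference_ci(events_t=30, n_t=100, events_c=40, n_c=100)
nnt = nnt_from_ard(ard)

print(f"ARD = {ard:.3f} (95% CI {lo:.3f} to {hi:.3f})")
print(f"NNT point estimate = {nnt:.1f}")
if lo < 0 < hi:
    # The ARD interval spans zero, so the NNT interval is not a simple range:
    # it runs from a benefit bound, through infinity, to a harm bound.
    print(f"NNT(benefit) bound = {1 / hi:.1f}, NNT(harm) bound = {1 / abs(lo):.1f}")
```

A bare NNT in a comments column hides exactly this awkwardness: when the interval for the risk difference includes zero, the interval for the NNT includes infinity.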

- “When different scales are used, standardized values are calculated and the absolute and relative changes are put in terms of the most often used and/or recognized scale.”

I can see the point of this, but it requires a big old assumption about the population mean and standard deviation of the most often used scale, as well as an assumption of normality. Usually, these scales have floor/ceiling effects.
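A rough sketch of why this matters, with every number invented: the back-transformed change is just the standardized difference multiplied by whatever SD you assume for the familiar scale, and a ceiling on that scale shrinks the SD you would actually observe.

```python
import random
import statistics

def smd_to_scale_units(smd, assumed_sd):
    """Re-express a standardized mean difference in units of a named scale."""
    return smd * assumed_sd

POOLED_SMD = -0.3  # hypothetical pooled standardized mean difference

# The same SMD gives different "absolute changes" depending on the assumed SD.
back_transformed = {sd: smd_to_scale_units(POOLED_SMD, sd) for sd in (15.0, 20.0, 25.0)}
for sd, change in back_transformed.items():
    print(f"assumed SD {sd:>4}: change of {change:.1f} points")

# Ceiling effect: a hypothetical 0-100 scale where many people score near the top.
random.seed(1)
latent = [random.gauss(85, 20) for _ in range(10_000)]
observed = [min(100.0, max(0.0, x)) for x in latent]  # scores cannot leave 0-100

latent_sd = statistics.stdev(latent)
observed_sd = statistics.stdev(observed)
print(f"latent SD {latent_sd:.1f}, observed SD after clipping {observed_sd:.1f}")
```

So the "absolute change on the most recognized scale" depends on which SD you borrow, and the SD from a sample squashed against the ceiling is not the SD the normality assumption has in mind.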

- “there are two options for filling in the baseline mean of the control group: of the included trials for a particular outcome, choose the study that is a combination of the most representative study population and has a large weighting in the overall result in RevMan. Enter the baseline mean in the control group of this study. […or…] Use the generic inverse variance method in RevMan to determine the pooled baseline mean. Enter the baseline mean and standard error (SE) of the control group for each trial”

This is an invitation to plug in your favourite trial and make the effect look bigger or smaller than it came out. Who says there is going to be one trial that is most representative and has a precise baseline estimate? There will be fudges and trade-offs aplenty here.
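Here is a small sketch of how much room the two options leave. The trials, baseline means, standard errors and pooled mean difference are all hypothetical, and I am assuming plain fixed-effect inverse-variance weights for the pooled baseline; the guidance quoted above does not say which model to use.

```python
# (label, control-group baseline mean, standard error) for hypothetical trials
trials = [
    ("Trial A", 6.1, 0.30),
    ("Trial B", 4.8, 0.25),
    ("Trial C", 7.0, 0.60),
]
MEAN_DIFFERENCE = -1.2  # hypothetical pooled mean difference on the same scale

def pooled_mean_iv(trials):
    """Fixed-effect inverse-variance pooled mean (weights are 1 / SE^2)."""
    weights = [1 / se ** 2 for _, _, se in trials]
    means = [mean for _, mean, _ in trials]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)

def relative_percent_change(mean_difference, baseline_mean):
    """Mean difference expressed as a percentage of the chosen baseline."""
    return 100 * mean_difference / baseline_mean

# Option 1: pick a single "most representative" trial.
for label, mean, _ in trials:
    print(f"{label}: {relative_percent_change(MEAN_DIFFERENCE, mean):.1f}%")

# Option 2: pool the baselines.
pooled = pooled_mean_iv(trials)
print(f"Pooled baseline {pooled:.2f}: {relative_percent_change(MEAN_DIFFERENCE, pooled):.1f}%")
```

The treatment effect is identical in every row; only the denominator moves, and with it the headline percentage.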

- “Please note that a SoF table should have a maximum of seven most important outcomes.”

Clearly, eight would be completely wrong.

- “Note that NNTs should only be calculated for those outcomes where a statistically significant difference has been demonstrated”

Jesus wept. I honestly can’t believe I have to write this in 2017. Reporting only significant findings allows genuine effects and noise to get through, and the quantity of noise can actually be huge, certainly not 5% of results (cf John Ioannidis on everything being false, and Andrew Gelman on types of error).
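A quick simulation shows what a significance filter does. This is a sketch under made-up conditions (a small true effect, a known standard error, a two-sided 5% threshold), not a model of any actual review.

```python
import random
import statistics

random.seed(2017)

TRUE_EFFECT = 0.1   # hypothetical small true effect
SE = 0.1            # hypothetical standard error of each study's estimate
N_STUDIES = 10_000

# Each simulated study reports the true effect plus sampling noise.
estimates = [random.gauss(TRUE_EFFECT, SE) for _ in range(N_STUDIES)]

# Keep only the "statistically significant" ones, as the guidance asks for NNTs.
significant = [e for e in estimates if abs(e / SE) > 1.96]

share_significant = len(significant) / N_STUDIES
mean_abs_significant = statistics.mean(abs(e) for e in significant)
exaggeration = mean_abs_significant / TRUE_EFFECT           # magnitude error
wrong_sign_share = sum(e < 0 for e in significant) / len(significant)  # sign error

print(f"share significant:        {share_significant:.2f}")
print(f"average exaggeration:     {exaggeration:.1f}x the true effect")
print(f"significant, wrong sign:  {wrong_sign_share:.3f}")
```

Every estimate that clears the threshold here has to be at least 1.96 standard errors from zero, so with a true effect of one standard error the survivors overstate it by construction, and a few of them point the wrong way.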

After we calculated some absolute changes in % terms (all under 10%), reviewers came back and told us that they should all be described as “slight improvement”, the term “slight” being reserved for absolute changes under a certain size. They also recommend using Cohen’s small-medium-large classification quite strictly, in a handy spreadsheet for authors called Codfish. I thought Cohen’s d and his classification had been thrown out long ago in favour of, you know, thinking. This is rather sad, as we see the systematic approach being ossified into a rigid set of rules. I suspect that the really clever methodologists involved in Cochrane are not aware of this, nor would they approve, but it is happening little by little in the specialist groups.

This advice for reviewers is not on their website but needs proper statistical review and revision. We shouldn’t be going backwards in this era of Crisis Of Replication.