Statistical demolition

What a remarkable article this is. Nickerson and Brown take apart a recent paper that claimed to show an effect of mindfulness on motivated perception. I am no psychologist and will make no judgment of the plausibility of the finding, but it is heartening (if somewhat scary) to see thorough statistical critique like this getting written in the public domain, and indeed getting published (well done Personality & Individual Differences). If we had more painful demolitions like this, we might have more caution being exercised by those who dabble in statistics. I am sympathetic, for the simple reason that they probably find it hard to track down a statistical collaborator, though I am always suspicious of psychologists’ tendency toward gung-ho number-crunching.

It does seem harsh to include one of the authors’ Masters thesis in the demolition though. They have to follow the advice of the supervisor (not a statistician), the requirements of the course, and they don’t get a chance to rewrite them after marking, like a doctoral thesis. I think it’s appropriate to look at it for an insight into the evolution of the ideas, but not to find mistakes. It would be a very unusual Masters thesis that did not contain mistakes; there are certainly some in mine.

Leave a comment

Filed under Uncategorized

Showing a distribution over time: how many summary stats?

I saw this nice graph today on Twitter, by Thomas Forth:

but the more I looked at it, the more I felt it was hard to understand the changes over time across the income distribution from the Gini coefficient and the median. People started asking online for other percentiles, so I thought I would smooth each of them from the source data and plot them side by side:


Now, this has the advantage of showing exactly where in society the growth or contraction is, but it loses the engaging element of the wandering nation across economic space (cf Booze Space; where do we end up? washed up on the banks of the Walbrook?), which should not be sneezed at. Something engaging matters in dataviz.

Code (as you know, I’m a nuts ‘n’ bolts guy, so don’t go recommending ggplot2 to me):

uk$Year <- as.numeric(substr(uk$Year,1,4))
main="Percentiles of UK income over time",
sub="(Colour indicates governing political party)",
ylab="2013 GBP",
lines(uk$Year[4:10],sm[4:10,4],col=redcol) # Wilson I
lines(uk$Year[11:14],sm[11:14,4],col=bluecol) # Heath
lines(uk$Year[15:19],sm[15:19,4],col=redcol) # Wilson II, Callaghan
lines(uk$Year[20:37],sm[20:37,4],col=bluecol) # Thatcher, Major
lines(uk$Year[38:50],sm[38:50,4],col=redcol) # Blair, Brown
lines(uk$Year[51:53],sm[51:53,4],col=bluecol) # cameron
for(i in 5:22) {
lines(uk$Year[1:3],sm[1:3,i],col=bluecol) # Macmillan, Douglas-Home
lines(uk$Year[4:10],sm[4:10,i],col=redcol) # Wilson I
lines(uk$Year[11:14],sm[11:14,i],col=bluecol) # Heath
lines(uk$Year[15:19],sm[15:19,i],col=redcol) # Wilson II, Callaghan
lines(uk$Year[20:37],sm[20:37,i],col=bluecol) # Thatcher, Major
lines(uk$Year[38:50],sm[38:50,i],col=redcol) # Blair, Brown
lines(uk$Year[51:53],sm[51:53,i],col=bluecol) # Cam'ron

(uk_income.csv is just the trimmed down source data spreadsheet)

Leave a comment

Filed under R, Visualization

The irresistible lure of secondary analysis

The one thing that particularly worries me about the Department of Health in its various semi-devolved guises making 40% cuts to non-NHS spending is that some of the activities I get involved in or advise on, which rely on accurate data, can appear beguilingly simple to cut by falling back on existing data sources, but the devil is in the detail. It is very hard to draw meaningful conclusions from data that were not collected for that purpose, but when the analytical team or their boss is pressed to give a one-line summary to the politicians, it all sounds hunky dory. The guy holding the purse strings might never know that the simple one-liner is built on flimsy foundations.

Leave a comment

Filed under healthcare

St Swithun’s Day simulator

I got a bit bored (sorry Mike), and wrote this. I didn’t take long (I tell you that not so much to cover my backside as to celebrate the majesty of R). First, I estimated probabilities of a day being rainy if the previous was dry, and of it being rainy if the previous day was rainy. I couldn’t be bothered with any thinking, so I used ABC, which basically is an incredibly simple and intuitive way of finding parameters than match data. I started with 16 rain days in July and 15 in August, in my home town of Winchester (which is also Swithun’s hood) from here, and that led to probabilities that I plugged into a simulator (it’s only AR1, but I’m no meteorologist). That ran 10,000 years of 40-day periods (there’s a sort of run-in of ten days to get to a stable distribution first; it’s basically a Markov chain), and not a single one had rain for 40 days.

It ain’t gon’ rain.

# Estmiate Winchester July/August rainy day transition probabilities
# We'll use Approximate Bayesian Computation
for(i in 1:ldr) {
for(j in 1:lrr) {
for(k in 1:abciter) {
for (m in 2:10) {
for (m in 2:40) {
# I'm going to go with P(rain | dry)=0.32, P(rain | rain)=0.51

# St Swithun's Day simulator

for (i in 1:iter) {
for (j in 2:10) {
for (j in 2:40) {
print(paste("There were ",sum(runs==40),
" instances of St Swithun's Day coming true, over ",
iter," simulated years.",sep=""))


Filed under R

They each wanted to improve education; together, they ruined it

If you have any interest in using data to improve public services like education or healthcare, whether enthusiastic or sceptical, read this article. The story is a familiar one to me but rarely sees the public eye in such careful detail as it does here. The road to hell, as you know, is paved with dashboards, performance indicators and league tables.

Righton Johnson, a lawyer with Balch & Bingham who sat in on interviews, told me that it became clear that most teachers thought they were committing a victimless crime. “They didn’t see the value in the test, so they didn’t see that they were devaluing the kids by cheating,” she said. Unlike recent cheating scandals at Harvard and at Stuyvesant High School, where privileged students were concerned with their own advancement, those who cheated at Parks were never convinced of the importance of the tests; they viewed the cheating as a door they had to pass through in order to focus on issues that seemed more relevant to their students’ lives.

Lewis said, “I know that sometimes when you’re in the fight, and you’re swinging, you want to win so badly that you don’t recognize where your blows land.”

There have been similar stories in the UK news recently, but you can get it all from Parks, so I suggest just reading this and carrying those cautionary ideas around with you.

Leave a comment

Filed under Uncategorized

Can statistics heal?

(A rather florid headline, I’ll admit.)

I was reading this article on survivors of the 2005 London bombings and was struck by the story of someone who had been involved in two acts of terrorism but found it helpful to learn the stats from a disinterested third party:

One patient – so unlucky as to have also been caught up in a previous terrorist attack – was only reassured after being taken to a bookmakers to see how long the odds were of being in a third.

It makes me wonder how statisticians can help provide this sort of information to clinicians caring for people with PTSD and related problems. It is so easy to construct comparisons that are not quite right (horse-riding=ecstasy, for example, or Sally Clark, or lightning = cricket balls), so it is unfair to expect the doctor or psychologist to put them together as required. Perhaps future generations will have a more advanced grasp of statistics from school, but I wouldn’t count on it being able to overcome pathological distortions of risk perception. There may be a role for the quantified-self sort of data here; consider the popularity of pedometers and GPS for knowing how much exercise you really are doing.

Leave a comment

Filed under healthcare

Responses to the BASP psychologists’ p-value ban: the one that got away

When StatsLife collected responses to the BASP p-value ban (see blogs hither and yon), I suggested they contact Ian Hunt, a wise and philosophically minded critical voice in the wilderness of cookbook analysts. I also know that he and I take rather divergent opinions on deduction and induction and such, but I hold his arguments in the highest respect because they are carefully constructed. Alas! he couldn’t send in a response in time, but here it is, reproduced with his kind permission:

How good are the reasons given by the editors of Basic and Applied Social Psychology (BASP) to ban hypothesis tests and p-values?

I argue that BASPs reasoning for banning “statistical inference” is weak.

First, they (the editors) offer a non sequitur: ban p-values because “the state of the art remains uncertain” (2015 editorial) and “there exists no inferential statistical procedure that has elicited widespread agreement” (2014 editorial).  I argue inter-subjective (dis)agreement is not decisive.

Secondly, they imply that good inductive inferences require posterior probabilities. This is contentious, especially since both posteriors and p-values are just deductions.

Thirdly, they plead for larger sample sizes “because as the sample size increases, descriptive statistics become increasingly stable and sampling error is less of a problem.” This is contradictory: the evidence for this claim is best shown by the statistical inferences being banned.

Fourthly, they correctly assume that with a large enough sample size many “significant effects” (or “null rejections” or low p-values or interesting things or whatever) can be identified by looking at canny descriptive statistics and adroitly drawn charts. But I believe p-values ARE descriptive statistics – with which both frequentists and Bayesians can work.

Finally, BASP “welcomes the submission of null effects”.  But without tests and concomitant power profiles the evidential value of a “null effect” is unclear.

BASPs editors appear to conclude that modern statistics is inductive and akin to “the art of discovery” (as David Hand puts it). Fair enough. But I conclude that careful deductive inferences, in the form of hypothesis tests with clear premisses and verifiable mathematics, still have a role in discovering interesting things.

Now, it would be unfair of me to say anything more on this here but I believe you can hear Ian talking on this very subject at this year’s RSS conference, which is in Exeter. Personally, I’ve never been to Exeter, and I don’t think this is going to be the year for it either, but as Southern towns go, I suspect it’s neither as depressing as Weymouth nor as humourlessly overrated as Salisbury. (That counts as enthusiasm round here.) I recommend the conference to you. It’s just about optimal size and always interestingly diverse.

Leave a comment

Filed under Bayesian, learning