Citizen science, R and the Heathrow cursus

In this post I walk through how to map landscape topography at an extremely fine resolution of about an 8-inch grid. You can do this with a mobile phone, a floor sweeper and a little data-processing programming. The total cost is potentially just your travel to and from the site.

The motivating example is a small section of the Heathrow Cursus, which I measured as a test run late last winter. Cursuses are late Stone Age earthworks, of which there are about two dozen in England. They were mostly constructed in the 33rd century BC and, of course, we don’t know why. This particular one, sometimes called the Stanwell Cursus, ran for about 4km through some low-lying boggy land in the Thames valley. Now, its Southern end is under the middle of the village of Stanwell, its Northern end under the junction of the M4 & M25 motorways, and a lot of it is under Heathrow Airport, where it passes directly under Terminal 5. In fact, it is thanks to excavations carried out as a condition for building T5 that we know much about it. There is one remaining section visible on the ground, and this is in parkland called Harmondsworth Moor which is open to the public. Admittedly, there is not much to see but a long, straight causeway about ten meters wide and one meter high. It dips a little in the middle, which may reflect people using it as a path over the damp earth, whether ceremonial or practical, over the millennia, or may reflect some characteristic of its construction. You can read about it in the T5 excavation report here and see my own photos and notes on it here.

There are many minor archaeological sites that are gradually degraded for want of protection from erosion and damage. Throughout England it is quite common for 4,000-year-old chieftains’ burial mounds to be ploughed over each year by farmers. Unfortunately, the Heathrow Cursus has a greater enemy to contend with in the shape of a big old airport. We learnt today that Heathrow Airport will be extended with a new runway and, of the two location options, the favoured one appears to be the one that would obliterate the remaining bit of cursus. However, there is many a slip ’twixt cup and lip.

Early this year, while the sun was low and the weeds dormant in the soil (both essential for seeing little bumps in the landscape), I took a day’s holiday and spent it in muddy fields outside Heathrow Airport while jets roared overhead every 90 seconds. When I lived in Winchester, Hampshire, I had rather compulsively visited the prehistoric bumps in the district on bike rides, but I had never seen this particular relic before, so I didn’t know if anything would be visible. There was little detail about it online, but when I joined the start and end points from the archaeological report on Google Maps, I found that there was a suspicious straight line visible in satellite photos. Getting to Heathrow from my house is a real pain, so as I made my way there I was really hoping that I wouldn’t find myself looking at a flat gigantic mudpie of a field. And I was not disappointed.


Fig 1: Looking down the length of the cursus, away from the airport

I had somehow dreamt up this idea of plotting surfaces with mobile phone sensors, and had intended to do some testing and calibration beforehand, but never got round to it, so I was just launching myself into the deep end that day when I got on the train with my floor sweeper and a rucksack full of sandwiches and rubber bands. I was also wondering at the back of my mind how to explain it to the cops when they swooped on this suspicious character (that didn’t happen, I’m pleased to say; British quirkiness is still tolerated).

So, this is the arrangement. Get a sensor-logging app on your phone and rubber-band the phone firmly onto some kind of sweeper like the one in the photo, the sort with a flat head that can swivel freely in all directions. (This one is designed to have wipes tucked into it and then be swiped over your lino or floorboards.) Be prepared for your phone to get muddy; it will survive. It’s best to have a separate GPS logger in your pocket; you could use the phone for this, but only if its fix is really accurate (it probably isn’t). Each time you put the sweeper and phone down, their orientation is very precisely recorded. I used an Android app called Accelerometer Meter, which logged about every 0.4 seconds, output a csv file and used little battery power (battery drain can apparently be a problem with some apps). Then I set the GPS going and started walking in parallel lines across the cursus. In theory you would work your way along the whole object you want to measure, but this could take a really long time, depending on the resolution you want. The logger files I get are like this one.
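Reading the logger file is just csv parsing. My own processing is in base R (linked below), but here is the same idea sketched in Python; the column names (time, x, y, z) are hypothetical stand-ins for whatever your app actually writes.

```python
import csv
import io

# Hypothetical sample in the style of a sensor-logger csv: a timestamp in
# seconds plus three accelerometer channels. Real apps name columns differently.
sample = """time,x,y,z
0.0,0.1,4.9,8.5
0.4,0.1,5.0,8.4
0.8,0.2,5.1,8.1
"""

def read_log(text):
    """Read a sensor log into a list of (time, y) pairs, keeping only the
    channel that tracks the tilt angle we care about."""
    rows = csv.DictReader(io.StringIO(text))
    return [(float(r["time"]), float(r["y"])) for r in rows]

log = read_log(sample)
print(log[0])  # (0.0, 4.9)
```

In practice you would open the file from disk rather than a string; the point is only that one timestamp column and one tilt column are all the later steps need.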


Fig 2: Leveraging best-in-breed tech (note the mud)

In practice, I didn’t use the GPS location because I knew where I was at each point, but it would be fairly straightforward to synchronise them and bolt on the GPS point closest to each sensor log point.
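That synchronisation amounts to a nearest-timestamp join. A minimal Python sketch (my actual code is base R, and the coordinates here are made up for illustration), assuming both devices’ clocks agree:

```python
import bisect

def nearest_gps(sensor_times, gps_track):
    """Attach to each sensor timestamp the GPS fix closest in time.
    gps_track is a list of (timestamp, lat, lon) tuples sorted by time."""
    gps_times = [t for t, _, _ in gps_track]
    out = []
    for st in sensor_times:
        i = bisect.bisect_left(gps_times, st)
        # consider the fixes just before and just after, pick the closer one
        candidates = [j for j in (i - 1, i) if 0 <= j < len(gps_track)]
        best = min(candidates, key=lambda j: abs(gps_times[j] - st))
        out.append((st,) + gps_track[best][1:])
    return out

# two illustrative fixes five seconds apart, near-ish to Harmondsworth Moor
track = [(0.0, 51.48, -0.49), (5.0, 51.4801, -0.4901)]
print(nearest_gps([0.3, 4.9], track))
```

Since the sensor logs roughly every 0.4 seconds and a walking pace covers perhaps half a metre per second, the nearest fix is always well within GPS error anyway.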

It seems from reading around online that the accelerometer data streams in various devices are not always labelled the same way. This particular app finds the angle we want and labels it y, with units given as meters per second, but it can’t really be that. I’m not entirely sure how to interpret the units, but that doesn’t matter too much because, to use this seriously, you would have to calibrate it against known angles anyway.
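That calibration amounts to taking reference readings on surfaces of known slope, fitting a line through them, and inverting it. A sketch in Python with entirely made-up reference values (the real fit would use several measured inclines, and my own code is base R):

```python
def calibrate(raw_ref, angle_ref):
    """Fit raw = a*angle + b through reference readings taken on slopes of
    known inclination, and return a function mapping raw reading -> angle.
    Least squares done by hand to stay dependency-free."""
    n = len(raw_ref)
    mean_angle = sum(angle_ref) / n
    mean_raw = sum(raw_ref) / n
    a = sum((x - mean_angle) * (y - mean_raw)
            for x, y in zip(angle_ref, raw_ref)) / \
        sum((x - mean_angle) ** 2 for x in angle_ref)
    b = mean_raw - a * mean_angle
    return lambda raw: (raw - b) / a

# hypothetical readings: 0.0 on the flat, 1.7 on a known 10-degree incline
to_deg = calibrate(raw_ref=[0.0, 1.7], angle_ref=[0.0, 10.0])
print(round(to_deg(0.85), 2))  # 5.0
```

A spirit level and a couple of wedges of known angle would be enough to collect the reference points before heading out.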

The R code for everything that follows is here. If we plot the angle from the horizontal against time, we get this:


Fig 3: raw angle data

There is a mixture of spikes and flat regions, arising from the lopsided weight of the phone on the sweeper. Every time I picked it up, the phone flopped forwards until it was laid down at the next step. At first, I thought this was annoying, but now I see it as a strength: it is easy to identify the flops which demarcate steps. If we zoom in on the 50-60 second region, which is just after reaching the top of the cursus and starting to slope gently down into the central dip, we can see the pattern more clearly:


Fig 4: likewise, zoomed in

So, our task is to find the regions which are fairly stable in terms of the angle (steps), throw out the bits in between (flops) and then get an average angle for each of the steps. I took a two-stage approach, first accepting any measure that was less than a given threshold different from the previous measure, and then secondly accepting any run of these that was three or more measures long. Both of these are parameters that could be tweaked.
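The two-stage filter can be sketched in a few lines. This Python version mirrors the approach in my R code (which is linked above); the threshold value is illustrative and the angle units are whatever the app emits:

```python
def find_steps(angles, threshold=0.5, min_run=3):
    """Two-stage step detection: (1) flag each reading that differs from its
    predecessor by less than `threshold`, (2) keep only runs of `min_run` or
    more flagged readings, returning the mean angle of each run (one per step)."""
    # stage 1: stability flags (the first reading has no predecessor)
    stable = [False] + [abs(angles[i] - angles[i - 1]) < threshold
                        for i in range(1, len(angles))]
    # stage 2: collect runs of stable readings and average them
    steps, run = [], []
    for a, ok in zip(angles, stable):
        if ok:
            run.append(a)
        else:
            if len(run) >= min_run:
                steps.append(sum(run) / len(run))
            run = []
    if len(run) >= min_run:
        steps.append(sum(run) / len(run))
    return steps

# a flop (spike), a stable step, another flop, another step
trace = [9.0, 2.0, 2.1, 2.0, 2.05, 8.5, 3.0, 3.1, 3.0, 3.05]
print([round(s, 2) for s in find_steps(trace)])  # [2.05, 3.05]
```

Both `threshold` and `min_run` are the tweakable parameters mentioned above; with readings every 0.4 seconds, `min_run=3` corresponds to the ~1.2-second stable stretches in the figures.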


Fig 5: Selecting measures under a certain change threshold


Fig 6: likewise, zoomed in


Fig 7: selecting stable runs of three or more measures (~1.2 seconds)

Now we reduce this to a series of steps and get the average angle for each. This is then translated to a change in vertical distance, given that the steps have a fairly constant distance along the hypotenuse — the ground surface — and then we can plot what we get out!
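The conversion is simple trigonometry: each step covers a fixed distance L along the ground surface (the hypotenuse), so a step at angle θ rises by L·sin(θ), and the running sum of those rises is the transect profile. A Python sketch, using my roughly 22cm sweeper length and assuming the angles have already been calibrated into degrees:

```python
import math

def profile(step_angles_deg, step_length_m=0.22):
    """Turn per-step slope angles into a height profile. Each step spans a
    fixed distance along the ground, so its rise is step_length * sin(angle);
    the cumulative sum gives height relative to the starting point."""
    heights = [0.0]
    for a in step_angles_deg:
        heights.append(heights[-1] + step_length_m * math.sin(math.radians(a)))
    return heights

# a gentle rise then fall: a toy version of walking up and over the bank
print([round(h, 3) for h in profile([5, 5, 0, -5, -5])])
```

Note the profile is relative, not absolute: it tells you shape, not elevation above sea level, which is all you need to see the dip in the middle of the causeway.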


Fig 8: estimated transect profile (compare with Fig 1 above)

So, in the end, the resolution I got is about a 22cm grid, which is roughly the length of the sweeper head; I was placing it head to toe as I moved along, so that makes sense. A smaller base for the phone would give a finer grid but take longer to cover the ground. This shows that you can bodge together different technologies and do stuff on the cheap, and also that anyone can contribute data on threatened things (citizen science). Don’t assume that you are not up to it: anyone can do this. If you wanted to do this the old-fashioned way, by throwing money at it, the only option would have been to hire surveyors to map it all out, and even then you wouldn’t get such a fine grid. LIDAR scanning would give good resolution but be more expensive still, and perhaps more affected by the tufts of grass and stinging nettles which the citizen scientist can gently press out of the way.

As for the R programming, there is nothing clever in there. Only base R is used, and you will see my characteristic un-R over-use of loops and other clumsy but effective approaches. In fact, if you are using the same app I had, you can pretty much shove your data straight through the same program. For graphics, you would have lots of these transects piled up behind one another, and it would be tempting to do something like my slice density plot, although wireframes, contour plots and heatmaps are options too.

So ends our brief encounter with archaeology in this blog. But there may be more in the future! Will I go back to Heathrow and map out the whole cursus? Probably not, as the best impact I can achieve is by blogging and stirring people up. But you never know… and more likely, I will be pondering Bayesian modelling and Stan for archaeo applications.


On methodological terrorism

Elevator pitch

A psychologist and journal editor has written that outspoken statistical critics on blogs and social media are “methodological terrorists”. Some people are upset by this. I think it is pretty accurate, in that incumbent statistical cultures will not change from their current data-torturing ways until they are scared out of it. In this post, I take a level-headed view of terrorism and revolution and see what we can learn from the early Soviet Union. Disclaimer: don’t hurt anyone, readers.


Some methodologists are increasingly critical in blogs and social media of bad analyses, typically carried out by people who are not statistically trained, and typically arriving at overly definitive headline-grabbing messages. Now, there is discussion about whether we should confine critique to polite scientific journals and conferences, continue with the tweeting and blogging but be nicer, or continue the attack. Most recently, the focus has turned more specifically to scientists who refuse to accept that their work could be wrong after criticism, or even respond that it is irrelevant whether they are wrong or not. I have been mulling this over in light of the term “methodological terrorists” deployed in protest against unpleasant criticism (the irony seems not to have been detected by the author, one Susan Fiske). I got Trotsky off the shelf, then Azrael, and had a little bit of early-USSR reading. I thought about whether this is some kind of revolution, and whether terrorism is in fact an accurate term and maybe even not to be rejected but mined for ideas about forcing change in the system. I also thought back on things I saw while growing up in Apartheid South Africa. I came to these conclusions:

  • Dilettante analyses kill people, and methodologists who care are duty bound to take strong measures to change the damaging norms in scientific culture. There is no logical reason to reject unpleasantness, and in fact it is likely given the imbalance of power in certain scientific fields that intimidation is necessary. This is not something to be scared of; methodological intimidation gave us randomised clinical trials with allocation concealment, and to some extent pre-specified protocols and analysis plans.
  • To avoid any misunderstanding, bodily threats and violence should not be employed; hard-nosed critics who are not convinced by an appeal to their better natures should note that posting mystery powders to social science departments will simply be counter-productive.
  • The current system of a small number of the same people approving funding for studies, doing them, and editing the journals where they are published is arguably corrupt. The subject experts who run it benefit so much from it that they certainly don’t allow dissenting voices on their patch, and, unable to control self-publication on blogs and social media, react forcefully. Journals and conferences are used as an organ of repression, and we should focus on influencing them and not allowing them to be a refuge for irresponsible conduct.
  • We need to stop appearing as venomous cranks with axes to grind. Criticise someone’s work once, and then move on. Don’t keep picking at them. The next generation of experts needs to fear us, not because we are unpleasant, but because we are right.
  • We need every practitioner of dilettante analysis to feel the gaze of potentially career-limiting critics. Criticise more people, more studies and put less venom into each one. (I like the idea of explicitly stating an intention to critique one paper chosen at random from each issue of a chosen journal, and intend to announce my own chosen journal soon.)
  • We need to use the media to further these arguments, or we will never get off the ground. There are just not enough of us. Also, the thrill of appearing in the newspapers or whatever is one of the big drivers of silly interpretations of studies. Journalists love balance, which most scientists shudder at, and let’s not go there now, other than to say that we can use balance to our advantage, by becoming the dissenters. So, when you write your critique in your blog, also send it to the science editor at various news outlets. Make an e-mail list of these poor saps so you can make the whole process as quick as possible for yourselves. And do it fast! They want to report the study today and move on to something else tomorrow.
  • We need to acknowledge and indeed reward efforts to fix the problem. Critiques online should be amended to reflect any effort by the target to make good (although I feel uneasy about deleting things). Everyone makes mistakes. In fact that is the only way science can proceed. It’s what they do next that matters.

I also suggest some classic tactics that the dilettanti may take to suppress further criticism in scientific fields where they hold the balance of power. We need to be alert to these.

Terrorism and positivism

To recap for those who have had better things to focus on recently: there is a so-called crisis of replication. Starting in psychology and social sciences and then spreading elsewhere, big names are getting criticised for their statistical practices, which allegedly lead to headline-grabbing conclusions that evaporate when someone attempts to re-run the study. A more general tendency to blog outspoken criticisms has gained momentum too. Even your mild-mannered author has had a pop at fanciful concoctions of pseudo-science cooked up by politicians, politicians appropriating stats for potentially damaging tautologies, influential scientists who line up a series of logical fallacies to support a political cause, powerful people who wish for the moon with stats, and famous studies with screwed up analyses. I have many more that I don’t blog about but keep for teaching examples.

For a while, the scientists in the firing line kept quiet but now have started fighting back. One publication that is getting a lot of attention is an article in a magazine of the Association for Psychological Science by Susan Fiske. I picture old Susan* rattling this editorial off on her porch in the Hamptons with a Tom Collins or two on the go; the language is emotional and gets more florid as it goes along. She coins a few neat terms: destructo-critics, data police, vigilante critique, but it is “methodological terrorists” that has irked many. The destructo-critics with the best sense of humour, like Eric-Jan Wagenmakers, have enjoyed adopting this new job title. (The closest thing to destruction I’ve seen EJ do is to consume an intemperate amount of sushi. I mean, when Abu Bakr al Baghdadi gives one of his talks to stir up the troops or whatever, does he use a slide with Spongebob Squarepants to illustrate a point? No.) I am too small a fish to be honoured by anyone with such a label, although, unlike Pablo Picasso, I do — fairly — get called an asshole for being outspoken. But I too will add it to my name badge, with all the unseemly bandwagon haste of Dee Dee Ramone putting out a rap album.

* – I don’t know how old Susan is or isn’t. For some reason this scene came to mind and the sheer Salinger-ness of it all changed it into his voice in some terribly mystical way and all, just for your information.

Vis a vis assholes, this article by Jesse Singal (who, I’m sure, is a nice guy) is a great overview, summarising the counter-argument by Tal Yarkoni (likewise) that the fact someone is unpleasant in pointing out that you screwed up doesn’t mean that you, in fact, didn’t screw up. If there’s one really surprising turn in this story, it’s that the methodo-destructos have solidified the logical basis of their actions, and done so quickly, from several independent writers, while the complaining ex-spurts have resorted to emotion and rhetoric.

Now, some people get upset about the terrorism tag, but I’d like to take it seriously and explore it — or explode it, if you will. (Fiske, in the cold light of day, has decided to delete it from her article, in case you’re wondering where it went.) Some terrorism aims to provoke larger scale insurrection or revolution (the good old Cold War variety: PLO, Angry Brigade, Unabomber, N17, etc etc, possibly Isis, but it’s not relevant here anyway), some to consolidate a prior revolution (the Bolshevik variety, which we will explore further), and some is just a chance to lash out or pretend you’re in a real war (the IRA or ETA variety, oddballs claimed by Isis post hoc). We don’t fit any of those, but scientific revolution may be at hand, in the Kuhn sense. I am considering to what extent this recent development constitutes a revolution, which is to say that the previously meek statisticians rise up and impose a new order on their former masters, the subject experts.

Fiske is using the term terrorism here in the fashion of Karl Kautsky (who was, like, y’know, surely there’s no need to be so unpleasant, our ideas are great so let’s win them over with words instead). So, I took a look at Trotsky’s Terrorism And Communism, which was written as a riposte to Kautsky and a defence of Bolshevik terror (and he was, like, don’t be naive, they are perfectly comfortable as they are, and will destructo us if we don’t destructo them first), to consider what we can learn about the term. It certainly seems to be true that we statisticians are more critical than we used to be, or at least some of us are more outspoken and abrasive in doing so. Andrew Gelman (who doesn’t like being called a terrorist) has recently pointed out in his essential-reading blog post, which offers a short history of events leading up to this crisis of replication, that the Neuroskeptic blog was prominent in establishing the destructo genre — principally because the crisis took root first in the field of psychology. I don’t believe that the problems are unique to psychology or even that they are at their worst there; consider the sisyphean labour of medico-destructo Ben Goldacre, or the universal condemnation of John Ioannidis.

I also am aware of a number of older statisticians who feel uneasy about the abrasive confrontational approach (not to mention PR skills) of the younger generation. It does feel as though they regard a statistician’s proper place as a discreet consultant who does their thing and steps back into the shadows afterwards (and I understand that for many, their (self-)employment requires this, so destructo-mania is a luxury for the academic, for the most part). David Colquhoun wrote recently:

“I recall talking to a statistician at a recent meeting of the Royal Statistical Society. He was involved in analysis of clinical trials. I asked why he allowed the paper to make claims of discoveries based on P values close to 0.05. His answer was that if he didn’t allow that he’d lose his job and the clinician would hire a more compliant statistician. That lies at the heart of the problem.”

So, this is not just a methodologist vs subject expert thing. What is it?

Is it that we methodologo-destructos are no longer seen, or see ourselves, as servants of a larger scientific technical process? We surely have insights of a methodological nature to contribute. Perhaps it is because our numbers are greater than before. The advent of Data Science, bringing a probabilistic statistics background into contact with a computer science background, has clearly been a major driving force here. Also, the mass of critical writing about statistical practices has increased (viz Ziliak and McCloskey), along with popular writing on cognitive biases (Gigerenzer vs that other more famous guy whose name eludes me). And finally, there is the internet, so we can self-publish even if we are still just cogs in a bigger science factory. If you’ve been in the same stats job for years, you might not have noticed any of this. Tenure and job longevity correlate with age, so the idea that this is a generation effect might be a red herring.

So, in what sense could we call this revolutionary? It has the characteristic of a lesser power base (the method-people) challenging the norms of a greater power base (the ex-spurts). The meth-heads have pointed out that they are unhappy, and been ignored (or so they feel), and now in some quarters resort to force. The X-Men must feel somewhat threatened because upon finding the statisticians are getting too big for their boots, they too have reacted with force. What does Trotsky have to say about the role terror plays in subduing incumbent powers, and what parallels might there be between bloody armed conflict and online arguments over methodology?

Is there a cause worth fighting for?

Firstly, for me, it is worth noting that bad publicly funded research takes tax money away from other causes and wastes it on vanity projects — and that’s the best-case scenario. The retired professor may console themselves about their life’s work of fanciful claims built by p-hacking from the safety of their little cottage near Aix-en-Provence, but for the most part, their research projects achieved precisely nothing. That’s the good news. The bad news is that sometimes they waste money and also do harm, when they deal with the health and wellbeing of the nation. I have said before: dilettante analyses kill people. In that context, it is not unreasonable to conceive of this as class struggle. If we view it as our civic duty to promote good research, it is also our civic duty not to tolerate bad research. If you saw someone in a lab coat and goggles hurting an old person to get their hands on some money, would you intervene? I hope so, and this situation is no different. There is a corrupt system which you are obliged to end, and you will have to act outside the system to do so. Not by blowing up their offices, as I explain below, but by confronting their work when it is wrong, in the best scientific tradition, and refusing to go away until it is fixed. Fiske said “it’s careers that are getting broken”; yes, that is precisely the objective. Acting out of ignorance, then seeing the light and fixing the problem is one thing; fighting not to change is another, and someone who refuses to learn and improve is not a scientist but a dancer. No offence to actual dancers. The difference between us and Trotsky is that we are not fighting for the survival of one group or the other: we want the great majority of subject experts to continue their work, but to be aware of methodological problems and to behave better in future.

“‘But in that case, in what do your tactics differ from the tactics of tsarism?’ we are asked by the high priests of liberalism and Kautskianism. You do not understand this, holy men? We shall explain to you. The terror of tsarism was directed against the proletariat. The gendarmerie of tsarism throttled the workers who were fighting for the socialist order. Our Extraordinary Commissions shoot landlords, capitalists and generals who are striving to restore the capitalist order. Do you grasp this… distinction? Yes? For us Communists it is quite sufficient.”

What is to be done?

Now we must diverge from Trotsky. We are not rebuilding society after civil war, which is his general excuse for all manner of terrorist acts. We want to reform it without breaking it, and even more delicately without breaking the trust or conversation between the subject experts and the methodologists. But we cannot be nice about it or we will just be ignored. Let’s consider our one real success, perhaps the only one.

Getting ethical review and having randomised, blinded allocation is still grumbled about in my field (healthcare), where doctor still knows best, but they get on and do it. Why? Because they have seen the light? Sometimes, yes, but sometimes because they know it would be career-ending to breach the protocols. If that isn’t terror, I don’t know what is. But not all terror will work. We have to change the culture, probably without removing the experts (see Expert Power… below), although a few will doubtless jump. Let’s be clear and avoid any misunderstandings: actual acts of violence, harassment or physical threat are simply wrong and, anyway, will serve only to make your argument easily dismissed. Who doesn’t want to be the noble scientist persecuted by the barbarians? They already have role models attacked by animal rights people.

We should scare them all right, but in a thoroughly scientific way. It needs to be clear that nobody’s blunders are safe from being called out. We need to go after anyone and everyone, not just the big names. Perhaps a good way to do this is to allocate a methodological destructo-critic to each scientific journal of interest. Then, with each issue, you simply pick a paper at random and blog your critique, without pulling any punches. But crucially, we need to incentivise good behaviour too: congratulate authors of well-conducted studies, and especially importantly, go back and acknowledge when someone you criticised has taken steps to fix the problem.

The role of the ANC in South Africa was pre-revolutionary, in contrast to the Bolsheviks. Notably, Mandela distanced himself and uMkhonto we Sizwe from the word terrorism, promoting “armed struggle” at all times. From his self-defence at the Rivonia trial:

Firstly, we believed that as a result of Government policy, violence by the African people had become inevitable, and that unless responsible leadership was given to canalize and control the feelings of our people, there would be outbreaks of terrorism which would produce an intensity of bitterness and hostility between the various races of this country which is not produced even by war. Secondly, we felt that without violence there would be no way open to the African people to succeed in their struggle against the principle of white supremacy. All lawful modes of expressing opposition to this principle had been closed by legislation, and we were placed in a position in which we had either to accept a permanent state of inferiority, or to defy the Government.

But the violence which we chose to adopt was not terrorism.

The notion of canalizing is perhaps relevant to us today.

Trotsky was keen to defend the censorship of newspapers, again appealing to the war footing which he claimed to be indefinitely extended while the Bolsheviks felt the pressure of their enemies. We have the opposite problem: our business is largely conducted through publications, but these are somewhat incontinent of studies, releasing the good and bad all together (why not just offer to peer-review for them? see below under Peace And Love). How then can we change the behaviour of the publications too? Well, they are commercial enterprises to some extent, and bad publicity about methods hurts them. Publicity is also one of the motivators for scientists who seek an exciting story out of their data by any means. We need to turn publicity against them. We need to give fresh, exciting (whiff of scandal) material to journalists. They are obsessed with balance: if you have someone on the news representing political party A, you must must must get someone from party B to speak against them. They then rather simplistically extend this to science, bringing on all manner of tinfoil-hat cranks and giving them disproportionate coverage. Whatever. Let’s give them something better. Every time they have a story about some professor who found that wearing socks with sandals is an early sign of dementia, we rapidly wheel out someone who can eloquently and scientifically shoot it down. I guarantee you those silly claims will dry up soon. The problem is that journalists want to cover the story today and then forget it, so rapid response is essential — but that’s what the blogging and tweeting destructo-critics are good at!

One thing that worries me is the Dunning-Kruger effect. When you are new to a subject (a recent stats graduate, for example), your new-found confidence is as yet untempered by harsh reality. It would be a pity if inexperienced destructo-critics (trying, perhaps, to build an online presence and land a great job leveraging big data analytics) made asses of themselves and undermined the whole movement. But what can you do?

Expert power and methodological politics

The relationship between experts and methodologists may not be very well represented by the USSR in, say, 1922, but it could more closely follow the 1930s tension between industrial managers (that’s them) and the party in its various guises (that’s us). So I leafed through Jeremy Azrael’s book Managerial Power and Soviet Politics. At the end of the 1920s, managers (who were largely still in post since before the revolution) were routinely accused of sabotage if production did not meet targets. Sabotage could take any form, including just not working hard enough. You can see how scary and frustrating that would be, and doubtless this is how many subject experts feel now. But at the same time, their expertise was needed, and experiments to replace them with soviets were not entirely successful. Indeed, many of them had strong ambitions to up the scale of their work (like subject experts talking a good game on replication but not really knowing how to get there), which made them valuable assets, known as “Americans” for their industrialising zeal. There were many layers of regulation and inspection that had sprung up organically around them, but the most relevant was the presence of Red Directors, who were intended to impose Party priorities but could also come under accusations of sabotage. To be a Red Director might sound like a good career move, but it was not a comfortable situation. Oppressive regimes are most scared by their ‘own’ people.

Broadly, both camps were won over to support Stalin in the years immediately after Lenin’s death with promises of immunity from the sabotage accusations, and recognition of a policy of “one-man management”. But this was not to be. Their position was undermined finally at the 16th Party Congress in 1929, where “one-man management” was rejected. In the years that followed, five-year plans, league tables, performance indicators and purges served to entrench a system of corruption and perverse incentives. The opportunity to build on the skill of the “Americans” had been lost, and was never regained. I think we have to be careful not to go down that route. The crucial difference is that we do not have a shared system for critique and control of scientific analysis (the Party), but rather disparate individuals (guerrillas, if you will), so there is no risk at present of destroying science in quite the apocalyptic way that Fiske envisages. However, I call here for more action, and quite possibly for more organisation of the destructo-critics, and so we need to be careful that any organisation that does emerge doesn’t turn into an ossified set of rules that will stifle quantitative science.

Peace and love, man

Not everyone writing about this has suggested getting tougher. Jeff Leek, pitching himself as the Desmond Tutu of replication, blogged in a more conciliatory tone recently and proposed six steps to improve matters while taking the heat out of the argument. Deborah Mayo picked this up and contrasted it with Gelman’s rejoinder, which has more of a this-is-war attitude. Mayo comes down on Gelman’s side. I agree — who couldn’t — with Leek’s sixth point that statistical analyses should involve statistical experts, but the rest seems naive to me. Journals want meth-heads but can’t find them. X-men who want meth-fuelled collaborators also can’t find them. Other X-men have been told after a six-lecture course in t-tests and SPSS that they now know everything there is to know about stats, so they’re never going to start looking. In fact, they start doing the analyses for all their X-buddies and the whole thing becomes a love-in (you may choose to insert an earthier epithet here). If medical students had a spare day in their timetable, they would spend it memorising more metabolic pathways, not learning stats (and so they should). None of Leek’s apple pie stuff is going to come to pass, in large part because of simple supply and demand because, as you all know by now, statistics is the sexy job of the 21st century. They want to hire one of us, but they can’t find one. We are few, and many of us have already been ruled, found, brought and bound in Santa Clara county or The City, and are saying nothing to anyone, ever.

Professional identity

Another angle I started thinking along was how other professions/jobs would react, or do react, in comparable situations. Let’s consider accountants, stock market traders and dentists. Doubtless there are crooked accountants who help clients evade tax and launder criminal proceeds, just as there are statisticians who comply with requests for repeated hypothesis tests until p<0.05, but for the most part they act as police, simply because the repercussions of default for them as individuals are too great, and the rewards not sufficient to warrant the risk. In contrast, the rogue trader may well accept the risk thanks to the enormous reward. Perhaps it also helps that traders don’t have a shared and valued professional identity. Now, statisticians have a professional identity, but it’s not strong. If it were bolstered with greater promotion of chartered / PStat status in the public eye, and a willingness on the part of stats societies to strike members off, then we might get tougher at this. At present, most young statisticians feel too anxious to attempt anything like this. Now, consider the dentist. You pay them to tell it like it is. If your molar is rotten and has to come out, you want to hear it and have some straight-talking advice on what to do about it. You don’t enjoy hearing the news, but better now than later in agony. That is the service they provide: to tell you the facts, not to be your friend. We need to stop being friends of subject experts and start being their dentists instead. (The dentist analogy deserves a big hairy hat-tip to Mike Monteiro and his excellent keynote talk from interaction15.)
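As an aside, the “repeated hypothesis tests until p<0.05” practice is easy to quantify: under the null hypothesis, p-values are uniform on (0, 1), so with k independent looks at the data the chance of at least one nominally significant result is 1 - 0.95^k. Here is a minimal simulation sketch (the function name and trial count are mine, purely illustrative):

```python
import random

random.seed(1)

def at_least_one_significant(k, alpha=0.05, trials=10_000):
    """Monte Carlo estimate: under the null, p-values are Uniform(0, 1),
    so draw k of them and see how often any dips below alpha."""
    hits = sum(
        any(random.random() < alpha for _ in range(k))
        for _ in range(trials)
    )
    return hits / trials

for k in (1, 5, 20):
    # compare the simulated rate with the analytic 1 - 0.95**k
    print(k, at_least_one_significant(k), round(1 - 0.95 ** k, 2))
```

With twenty looks, that is roughly a two-in-three chance of finding something “significant” when there is no effect anywhere, which is why the compliant statistician in that scenario is not acting as police.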

Counter-revolutionary tactics

It will be interesting to see how this power struggle plays out in the next few years. Will the critics grow in numbers? Will it mature from sniping to more concerted and funded campaigns (like AllTrials)? Is it a generational thing, in which case there really will be a shift across all sciences? Will the dilettanti adopt some classic counter-revolutionary measures to divide-and-conquer? Here are some forms they might take; we need to be vigilant to them.

  • Critical statisticians could be blacklisted in some way so that pliable Quislings appear as collaborators on all manner of studies, giving that gloss of methodological respectability.
  • They could launch a Pledge To Be Nice and pressure all graduates and early career researchers to sign it or be blacklisted. Will it come under the disguise of Safe Places and Trigger Warnings? It would certainly be easy to tag onto that in the USA at present.
  • They could push for collective punishment, identifying a whole institution as having inappropriate ‘checks and balances’ or ‘governance’ if someone speaks out, effectively blacklisting the whole place.
  • They could promote peer punishment, where it is your stated duty to act against methodological terrorism, perhaps by refusing to collaborate ever again with your outspoken colleague. Failure to do so makes you suspect too.
  • Informants can be recruited too by the same logic, and one can compile a database of methodologists. The trouble with this is the lack of external governance: your dad calls his friend who calls his friend and suddenly the guy who upset you is getting a 4am raid by the B.O.S.S.

If you combine these things into a database and an institutional pledge to good conduct, which is cascaded down into people’s job descriptions and objectives, you have a deadly combination. I say it with some trepidation, but it’s better to be forewarned. Don’t think this stuff doesn’t happen.

The good news for us bad guys is that today’s terrorist is tomorrow’s statesman, so I look forward to Andrew Gelman’s keynote talk at the 2036 APA Convention. Our goal in the future is analogous to the Bolsheviks’ (although Trotsky fumbles the distinction (see above)): once everyone understands philosophy of science and statistics, there will be no more interdisciplinary conflict. That’s if we “win”. If the status quo continues and we go back to being meek, there is perpetual oppression of the sort that Colquhoun highlighted, and waste of public resources, and sometimes harm to vulnerable people.

One day I’ll revisit this from the perspective of the Puritans before and after the First English Civil War, but probably not until I’ve retired.


Filed under Uncategorized

I’m writing a dataviz book

Today I am starting work on a major new project, writing a book on data visualisation for the CRC-ASA series on statistical reasoning in science and society. There are several excellent dataviz books out there but I’m excited to be adding something new. This will be a brief, affordable overview that does not assume any previous training in statistics, or design, or coding. A lot of techniques will get described, but rather than just a baffling gallery, I want to make this a tour that shows the reader how to think through the options critically and justify their choices.


Procrastinating by taking a selfie in my secret hideout

The series should be a great collection for just this reason. More people than ever before have to work with data, and not all are experts or intend to be. I was inspired by the popularity of short, simple books on various business topics that you see in airport & railway station bookshops, and hope to provide something like that. I picture as my readers the manager in charge of risk analysis at a credit card company, or starting up a new modeling department in an insurance company, or the charity boss who wants to know what to ask for from the design team so their publications are more compelling (with apologies to any friends who see their own images there). You won’t see this in bookshops for a little while, but I’ll keep you posted on progress.


Filed under learning, noticeboard, Visualization

Futility audit

Theresa May’s “racial disparity audit”, announced on 27 August, is really just a political gesture that works best if it never delivers findings. I’m reminded of the scene in Yes, Minister (or is it The Thick Of It? Or both?) where the protagonists are all in trouble for something and, when the prime minister announces a public inquiry to find out what went wrong, they are delighted. They know that inquiries are the political equivalent of long grass: the intention is that everybody involved will have retired by the time it reports*.


Larry knows better than to look for mice in 300,000 different places.

It’s not entirely clear what is meant by audit here. Not in the accountants’ sense, surely. Something more like clinical audit? Audit, done properly, is pretty cool. Timely information on performance can get fed back to professionals who run public services, and they can use those data to examine potential problems and improve what they do. But when central agencies examine the data and make the call, it is not the same thing. The trouble is that, whatever indicators you measure, indicators can only indicate; it takes understanding of the local context to see whether it really is a problem.

But there’s another, more statistical problem with this plan: it is impossible to deliver all those goals in the announcement from the prime minister’s office:

  • audit to shine a light on how our public services treat people from different backgrounds
  • public will be able to check how their race affects how they are treated on key issues such as health, education and employment, broken down by geographic location, income and gender
  • the audit will show disadvantages suffered by white working class people as well as ethnic minorities
  • the findings from this audit will influence government policy to solve these problems

So that pulls together data from across the country: all providers of health services, all schools and colleges, all employers. There need to be sufficient numbers to break them down into categories by ethnicity (18 categories are used by the Census in England), location at a scale sufficient to influence policy (152 local authorities, presumably), income (deciles, maybe?) and gender (in this context, they probably need more than two; let’s allow four). Also, social class has been dropped into the objectives, so they will need to collect at least three categories there.

This gives about 300,000 combinations. Inside each of these, sufficient data are needed to give precise estimates of fairly rare (one hopes) adverse outcomes. Let’s say maybe 200 people’s data per combination. In total, that is data from some 60,000,000 people, which is just short of the entire UK population, but that includes babies etc, who are not relevant to some of the indicators above. Oh dear. Now, those data need to be collected in a consistent and comparable way, analysed and fed back, including a public-friendly league table from the sounds of it, in timely fashion, say within six months of starting.
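The arithmetic is easy to check in a few lines, taking the category counts above (18 ethnic groups, 152 local authorities, income deciles, four genders, three social classes) and the rough 200-people-per-cell guess at face value:

```python
# Back-of-the-envelope check on the audit's implied sample size.
# All category counts are the working assumptions from the text above.
ethnicity = 18         # Census categories for England
location = 152         # local authorities
income = 10            # income deciles
gender = 4             # allowing more than two in this context
social_class = 3       # added late to the objectives

cells = ethnicity * location * income * gender * social_class
print(cells)           # 328,320 cells: the "about 300,000 combinations"

people_per_cell = 200  # rough n for a usable estimate of a rare outcome
print(cells * people_per_cell)  # about 65.7 million people's data
```

Round the cell count down to 300,000 and you get 60 million; either way, it is essentially the whole country.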

I’m being fast and loose with the required sample size, because there are some efficiency savings through excluding irrelevant combinations, multilevel modeling, assumptions of linearities or conditional independence etc, but it is still hopeless. I suspect then that this was never intended actually to happen, but just to be a sop to critics who regard our current government as representing the interests of white UK citizens only, while throwing some scraps to disenchanted white working class voters who chose Brexit and might now be disappointed that police are not going door to door rounding up Johnny Foreign.

One more concern and then I’ll be done: when politicians ask experts to do something, and everybody says no, they sometimes like to look for trimmed down versions such as a simpler analysis based on previously collected data. After all, it would be embarrassing to admit that you couldn’t do a project. However, that would be a serious mistake because of the inconsistencies and problems in making the extant sources commensurate. I hope any agency or academic department approached says no to this foolish quest.

* – you might like to compare with Nick Bostrom’s criticism of the great number of twenty-year predictions for technology: close enough to be exciting, but still after the predictor’s retirement.

1 Comment

Filed under Uncategorized

email list vs RSS feed vs Twitter vs periodical

A year or two ago, I signed off from my last email list and rather assumed that they were a thing of the past. They were increasingly choked with announcements of self-promoting hype ‘articles’, of the “5 Amazing Things Every Great Data Scientist Does While Taking A Dump” variety. Now, to promote a workshop I’m organising, I find myself back on a couple, and they’re far, far better than they were. In fact, there seem to be things on them that I hadn’t heard about by other means. It’s so hard to keep up with all the cool developments around the data world now, much harder than 10 years ago, and that’s wonderful but also time-consuming and potentially distracting from the kind of Deep Work that we are actually paid to do.

I got into Twitter instead (@robertstats), and that also served as an outlet for many little quick points I wanted to make, that were too small to constitute a blog post. And through Twitter I have learned about more people and ideas than I can even begin to count. But at the same time, that massively cut my blog output, which I regret somewhat, and intend to boost again a bit more.

The third source was other people’s blogs. It feels to me (without any data) that blogs are declining in popularity, but the ones that make a genuine substantive contribution remain active. I used to get RSS feeds of new postings, first through Google and later through the platform that hosts this blog, and I suppose I still do get those feeds, but never look at them. I really mean never! It’s just not immediate in the way that email is, and not compelling in the way that Twitter is. But it’s easy to post to Twitter every time you blog, and you could even set up some kind of bot to do it for you. So, I have to accept that those blogs that are not syndicated in any other way are going to get missed. It’s unfortunate, but you can’t catch everything. The really good ones get tweeted by their readers if nothing else.

The crappy websites full of self-promotion still exist, and perhaps there are even more of them now, but somehow they seem to be controlled better and don’t sneak through. Maybe they fell foul of their own One Deep Learning Trick That Will Change Everything You Know About Everything, and got classified in the trash with 0.01 loss function. For my part, I only follow people who retweet with discretion. There are plenty of data people out there who seem to fire off everything that passes through their own feeds without reading it first, and although you feel you’re missing out on a great party, it’s best to just unfollow them. They won’t notice. And if you look a little deeper, you realise these people often have no Amazing Data Science to show for themselves but a whole lotta tweets; don’t forget what our former Prime Minister said on the subject.

I don’t read magazines on these sorts of subjects, except for Significance, which I am obliged to receive as an RSS (a different kind of RSS there, folks) fellow, and that often has something good. But I have started subscribing to the New York Times (digital). At the time I subscribed, it was far and away the best newspaper in the world for data journalism, dataviz and such, and I think they still have the lead, though they have lost some of their best team members while competitors grew into the field. Nevertheless, I learn quite a lot from it as a well-curated, wide-ranging international newspaper.

So, now I have two carefully chosen mailing lists, which send a daily digest, and I read them maybe once a week, taking no more than 10 seconds (literally) on each email. I get some tables of contents from journals, which are almost never interesting, but have occasional gems, so they get the same rough treatment. I read the paper but probably not as much as I should, and I am (as my homeboy Giovanni Cerulli put it) an avid consumer of Twitter, which signposts me off to all the blogs and publications and websites I might need.

I think the message here is that, as a data person, you need to think carefully about how you curate your own flow of information about new developments. It can easily take up too much of your time and disrupt your powers of concentration, but at the same time you can’t cloister yourself away or you will soon be a dinosaur. Our field is moving faster than ever and it’s a really exciting time to be working in it.

Leave a comment

Filed under Uncategorized

How the REF hurts isolated statisticians

In the UK, universities are rated by expert panels on the basis of their research activities, in a process called the REF (Research Excellence Framework). The resulting league table not only influences prospective students’ choices of where to study, but also the government’s allocation of funding. More money goes to research-active institutions in a ‘back the winner’ approach that aims explicitly to produce a small number of excellent institutions out of the dense (and possibly over-supplied) field that exists at present. The recent publication of the Stern Review into this process has been widely welcomed. I have been involved with institutional rankings, albeit hospitals rather than universities, for a long time, and of all the scoring systems and league tables that could be produced, the REF’s 2014 iteration is as close to a perfectly bad system as could be conceived. It might have been written by a room full of demons pounding at infernal typewriters until a sufficient level of distortion and perversity was achieved. Universities are incentivised to neglect junior researchers and save the money for a last-minute frenzied auction to headhunt established academics nearing retirement. The only thing that counts is a few peer-reviewed papers by a few academics, and despite assurances of holistic, touchy-feely assessment, everybody knows it comes down to some kind of summary statistic of the journal impact factors.

Stern tries to tackle some of that, and I won’t rehash the basics as you can read that elsewhere. I want to focus on the situation that isolated statisticians, in the ASA’s sense of the term, find themselves in. Many statisticians in academia end up ‘isolated’, in that they are the only statistician in another department. Whatever their colleagues’ background, and whatever the job description may say, the isolated statistician exists to some extent as a helpdesk for the colleagues who are lacking in stats skills. I am one such, the only statistician in a faculty of 282 academic staff. Most of my publications are the result of colleagues’ projects, and only occasionally as a result of my own methodological interests. Every university department has to submit its best (as defined by REF) outputs into one particular “unit of assessment”, which in our case is “Allied Health Professions, Dentistry, Nursing and Pharmacy”.

This mapping of departments into units goes largely uncriticised — because it largely doesn’t matter — but it excludes those people like isolated statisticians who don’t belong to the same profession as the rest of the unit. All my applied work with clinical / social worker colleagues, which is the bulk of the day job, can count (and of course, I chip into so many applied projects that I actually look like a superhero in the metric of the REF), but any methodological spin-offs do not, yet they are the bit that really is Statistics, the bit that I would want to be acknowledged if I were looking for a job in a statistics department. I’m not looking for that job, but a lot of young applied jobbing statisticians are. Why is it necessary to have that crude categorisation of whole departments to a unit of assessment? It doesn’t strike me as making the assessment any easier for the REF staff, because they rate the individual submissions and then aggregate them across units. The work-around is to have joint appointments into different university departments, so applied work counts here and methodological there, except that REF would not allow that. You must belong to one unit. This may not matter so much to statisticians, who have the most under-supplied and sexiest job of the new century, because we can always up sticks and head for Silicon Valley or the City, but is it really the intention of the REF to promote professional ghettos free from methodologists throughout academia? We have seen from the psychology crisis of replication what happens when people get A Little Knowledge and only ever talk to others like themselves.

Leave a comment

Filed under Uncategorized

A bird’s eye view of statistics in two hours

Next week I am giving a two-hour talk and discussion for Kingston University researchers and doctoral students, intended as an update on statistics for those who are not active in the field. That’s an interesting and quite challenging mission, not least because it must fit into two hours, with the first hour being an overview for newcomers like PhD students from health and social care disciplines, and the second hour looking at big current topics. I thought I would cover these points in the second half:

  • crisis of replication: what does it mean for researchers, and how is “good practice” likely to change?
  • GAISE, curriculum reform & simulation in teaching
  • data visualization
  • big data
  • machine learning

The first half warrants a revised version of this handout, with the talk then structuring the ideas around three traditions of teaching and learning stats:

  • classical, mathematically grounded stats, exemplified by Snedecor, Fisher, Neyman & Pearson, and many textbooks with either a theoretical or applied focus. Likelihood, and/or combining it with a prior to get a posterior distribution, are the big concepts here.
  • cookbook, exemplified by many popular textbooks out there, especially if their titles make light of statistics as a ‘hard’ subject (you could count Fisher here as the first evangelical writer in 1925, though it is harsh to put him in the same camp as some of these flimsy contemporary textbooks)
  • reformist, exemplified by Tukey in the 70s but consolidated around George Cobb and Joan Garfield’s work for the American Statistical Association. The only books for this are “Statistics: Unlocking the Power of Data” by the Lock family and “Introduction to Statistical Investigations” by Tintle et al.

It’s worth remembering that there are other great thinkers who accept the role of computational thinking and yet insist that you can’t really do statistics without being skilled in mathematics, of whom David Cox springs to mind.


Hiroshige’s Eagle over the 100,000 acre plain of statistics. Note the density plot of some big data in the background.

The topics to interweave with those three traditions are models, sampling distribution versus data distribution, likelihood, significance testing as a historic aid to hand calculation, and Bayesian principles. I’ll put slides on my website when they’re ready.

While I’m on this subject, I’ll tell you about an afternoon meeting at the Royal Statistical Society on 13 October, which I have organised. The topic is making computational thinking part of learning statistics, and we have three great speakers: Helen Drury (Mathematics Mastery) representing the schools perspective, Kari Lock Morgan (Penn State University) representing the university perspective, and Jim Ridgway (University of Durham) considering what the profession should do about the changing face of teaching our subject.

Leave a comment

Filed under learning