Next week I am giving a two-hour talk and discussion for Kingston University researchers and doctoral students, intended as an update on statistics for those who are not active in the field. That’s an interesting and quite challenging mission, not least because it must fit into two hours, with the first hour being an overview for newcomers like PhD students from health and social care disciplines, and the second hour looking at big current topics. I thought I would cover these points in the second half:
- crisis of replication: what does it mean for researchers, and how is “good practice” likely to change?
- GAISE, curriculum reform & simulation in teaching
- data visualization
- big data
- machine learning
The first half warrants a revised version of this handout, with the talk then structuring the ideas around three traditions of teaching and learning stats:
- classical, mathematically grounded stats, exemplified by Snedecor, Fisher, Neyman & Pearson, and many textbooks with either a theoretical or applied focus. Likelihood, and/or combining it with a prior to get a posterior distribution, are the big concepts here.
- cookbook, exemplified by many popular textbooks out there, especially if their titles make light of statistics as a ‘hard’ subject (you could count Fisher here as the first evangelical writer in 1925, though it is harsh to put him in the same camp as some of these flimsy contemporary textbooks)
- reformist, exemplified by Tukey in the 70s but consolidated around George Cobb and Joan Garfield’s work for the American Statistical Association. The only books for this are “Statistics: Unlocking the Power of Data” by the Lock family and “Introduction to Statistical Investigations” by Tintle et al.
It’s worth remembering that there are other great thinkers who accept the role of computational thinking and yet insist that you can’t really do statistics without being skilled in mathematics, of whom David Cox springs to mind.
Hiroshige’s Eagle over the 100,000 acre plain of statistics. Note the density plot of some big data in the background.
The topics to interweave with those three traditions are models, sampling distribution versus data distribution, likelihood, significance testing as a historic aid to hand calculation, and Bayesian principles. I’ll put slides on my website when they’re ready.
While I’m on this subject, I’ll tell you about an afternoon meeting at the Royal Statistical Society on 13 October, which I have organised. The topic is making computational thinking part of learning statistics, and we have three great speakers: Helen Drury (Mathematics Mastery) representing the schools perspective, Kari Lock Morgan (Penn State University) representing the university perspective, and Jim Ridgway (University of Durham) considering what the profession should do about the changing face of teaching our subject.
Lots of stats are being bandied about as we prepare for the famous Brexit vote. Not all of them are good, and there are conflicts of interest everywhere, perceived or real. It is tedious to demolish bad stats over and over, so I will take a different tack that caught my eye today, and that is the application of good, solid, old-fashioned logic. A few years ago, I recall being in a session at the RSS conference, in a room with about 50 people. Ian Hunt asked for a show of hands if the audience had ever studied any logic course at school or college, and mine was the only one to go up. I really enjoyed that course, and the textbook was an old one by Wilfrid Hodges (“Logic”) which has been reprinted a zillion times since it first came out. It is pithy but engaging, a real exemplar of textbook writing at an introductory level. I commend it to all humans. Its benefits last a lifetime.
Let’s apply those skills, cobwebbed since the 1990s, to this webpage and this letter (paywalled) to The Times from the Institute for New Economic Thinking (INET) at Oxford. INET is in part funded by the European Commission (inet.ox.ac.uk/files/publications/INET%20Highlights%20Report%202012-14.pdf page 58) – let’s just put that fact out there and let you make of it what you will – personally, I don’t think it counts for all that much.
Rather surprisingly, they make three arguments and each is unsupported by the data they provide, and also logically fallacious. A tour de force of blundering.*
So, arguments in three parts:
- No 1: “History is clear: things have gone very well for Britain as a member of the EU.”
For this, you can see a chart that shows GDP per capita relative to 1973 going up, and faster than those blasted French, Germans and Americans. Ha! That’ll teach them. Or perhaps we were just in a really bad place in 1973, and were subsequently buoyed up by sales of Pink Floyd’s Dark Side of the Moon.** More importantly though, the fact that we did well while in the EU does not imply we did well because of the EU. Our GDP per capita went up 12.3 times in the 40 years that followed, but China’s went up 43.9 times, which, by the same logic, is clear evidence that we lost out by not having the Cultural Revolution. Damn! This fallacy is called post hoc ergo propter hoc, and is a staple of politicians everywhere. You could succinctly describe it as conflation of subsequence and consequence. Furthermore, even if we did well because of the EU, that still gives only a weak level of confidence in future performance, which is the real decision to be made here.
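To put those two multiples on a more familiar scale, here is a minimal sketch that converts each into an equivalent compound annual growth rate. The 12.3 and 43.9 figures and the 40-year span are taken straight from the paragraph above; the function name `annualised_growth` is my own, and I am assuming nothing about whether the underlying series are real or nominal. It only restates the arithmetic and says nothing about cause.

```python
def annualised_growth(multiple: float, years: int) -> float:
    """Compound annual growth rate implied by a total growth multiple.

    `multiple` is final value divided by starting value (e.g. 12.3),
    `years` is the number of years over which that growth happened.
    """
    if multiple <= 0 or years <= 0:
        raise ValueError("multiple and years must both be positive")
    # Solve (1 + r) ** years == multiple for r.
    return multiple ** (1.0 / years) - 1.0


# Multiples quoted in the text, over the same 40-year period.
quoted_multiples = {"UK": 12.3, "China": 43.9}

for country, multiple in quoted_multiples.items():
    rate = annualised_growth(multiple, 40)
    print(f"{country}: x{multiple} over 40 years is about {rate:.1%} per year")
```

Seen this way, the gap between the two countries is a few percentage points a year compounding over four decades, which is a reminder of how sensitive "relative to 1973" charts are to the starting point and the length of the window.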
- No 2: “Secondly, growth in the UK was more equally shared than in the USA”
I’m not sure what this has got to do with Brexit, other than the unspoken suggestion that if we left, our Government (more right-wing than most EU countries in economic terms, but still verging on Trotskyite from an American perspective), would gradually erode policies that promote equality. INET say this:
“Britain has had the best of both worlds while a member of the EU — not just strong growth, but more equal growth”,
which still has a dose of post hoc about it, but also Weak Analogy: the suggestion is that we’d better stay in because someone outside is less equal than us, and if we leave we are sure to become like them. If that’s not the implication, then it must be irrelevant. There may also be some selective quoting going on here. Why the USA? They have a World Bank Gini coefficient of 41.1 to our 32.6, which means we are not as unequal as them. Note here that South Africa is not quoted as the inequality example, despite being statistically more striking (63.4), because there are well-known historical reasons why we would not become like them. To quote South Africa would be pushing it too far. Quote the USA and you might just get away with it. Norway, famous for staying out of the EU, has 25.9. Just sayin’. (Forget about their gas because inequality is not the same as GDP per capita.)
- No 3: “At present 45% of the UK’s exports go to other EU member countries. In response to the concern that the EU might impose high tariffs or punitive measures if the UK leaves, some Brexiteers have said that we can ‘just trade with Australia and Canada’. These two countries, however, only account for a meagre 2.9% of British exports.”
Well, they would, wouldn’t they. That’s because we’re in the EU, not some Commonwealth trading bloc. The real question is how things might change, not what they are now. So, like no 2, an unwritten implication is being made here about the future. In no 2, it was that everything would change, and here, strangely, it is that nothing will change. Why this difference? Perhaps because it fits the prior beliefs of the authors, or perhaps it is just carelessness (oops). So, if the assumption is true that nothing will change, then we will trade little tomorrow with the same people we traded little with yesterday, which proves (wait for it) that nothing will change! What mastery of the argument, what skill with the pen. This is in fact a nice example of begging the question.
I’ve nothing against them making a strong case for what they believe, and I am delighted to see an attempt to use data to support any such argument, but I think one should not do the public the disservice of misleading them through repeated abuse of both logic and statistics.
* – my wife has told me not to antagonise people online, so I do not say this without first considering whether I am truly justified.
** – this is grotesque and silly hyperbole. But it is at least not post hoc ergo propter hoc, which makes it an improvement on the explanation in the INET letter.