Futility audit

Theresa May’s “racial disparity audit”, announced on 27 August, is really just a political gesture that works best if it never delivers findings. I’m reminded of the scene in Yes, Minister (or is it The Thick Of It? Or both?) where the protagonists are all in trouble for something, and when the prime minister announces a public inquiry to find out what went wrong, they are delighted. They know that inquiries are the political equivalent of long grass, the intention being that everybody involved will have retired by the time the inquiry reports*.

Larry knows better than to look for mice in 300,000 different places.

It’s not entirely clear what is meant by audit here. Not in the accountants’ sense, surely. Something more like clinical audit? Audit, done properly, is pretty cool. Timely information on performance can get fed back to professionals who run public services, and they can use those data to examine potential problems and improve what they do. But when central agencies examine the data and make the call, it is not the same thing. The trouble is that, whatever indicators you measure, indicators can only indicate; it takes understanding of the local context to see whether it really is a problem.

But there’s another, more statistical problem with this plan: it is impossible to deliver all the goals in the announcement from the Prime Minister’s office:

  • audit to shine a light on how our public services treat people from different backgrounds
  • public will be able to check how their race affects how they are treated on key issues such as health, education and employment, broken down by geographic location, income and gender
  • the audit will show disadvantages suffered by white working class people as well as ethnic minorities
  • the findings from this audit will influence government policy to solve these problems

So that pulls together data across the country from all providers of health services, all schools and colleges, and all employers. There need to be sufficient numbers to break them down into categories by ethnicity (18 categories are used by the Census in England), location at a scale fine enough to influence policy (152 local authorities, presumably), income (deciles, maybe?) and gender (in this context, they probably need more than two categories; let’s allow four). Also, social class has been dropped into the objectives, so they will need to collect at least three categories there.

This gives about 300,000 combinations. Inside each of these, sufficient data are needed to give precise estimates of fairly rare (one hopes) adverse outcomes. Let’s say maybe 200 people’s data. In total, that is data from 60,000,000 people, which is just short of the entire UK population, but that population includes babies etc., who are not relevant to some of the indicators above. Oh dear. Now, those data need to be collected in a consistent and comparable way, analysed and fed back, including a public-friendly league table from the sounds of it, in a timely fashion, say within six months of starting.
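If you want to check the arithmetic, here it is as a few lines of Python. The category counts are the assumptions above, the 200 people per stratum is only a guess at what precise estimates of rare outcomes would need, and the figures in the text are rounded down.

```python
# Back-of-the-envelope strata count, using the category counts assumed above.
ethnicity, locations, income, gender, social_class = 18, 152, 10, 4, 3
strata = ethnicity * locations * income * gender * social_class
print(f"combinations: {strata:,}")                           # 328,320 -- 'about 300,000'

people_per_stratum = 200   # rough guess for precise estimates of rare adverse outcomes
print(f"people needed: {strata * people_per_stratum:,}")     # 65,664,000
```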

I’m being fast and loose with the required sample size, because there are some efficiency savings to be had through excluding irrelevant combinations, multilevel modelling, assumptions of linearity or conditional independence etc., but it is still hopeless. I suspect, then, that this was never intended actually to happen, but is just a sop to critics who regard our current government as representing the interests of white UK citizens only, while throwing some scraps to disenchanted white working-class voters who chose Brexit and might now be disappointed that police are not going door to door rounding up Johnny Foreign.

One more concern and then I’ll be done: when politicians ask experts to do something and everybody says no, they sometimes look for trimmed-down versions, such as a simpler analysis based on previously collected data. After all, it would be embarrassing to admit that you couldn’t do a project. However, that would be a serious mistake, because of the inconsistencies and problems involved in making the existing data sources commensurate. I hope any agency or academic department approached says no to this foolish quest.

* – you might like to compare with Nick Bostrom’s criticism of the great number of twenty-year predictions for technology: close enough to be exciting, but still after the predictor’s retirement.

Filed under Uncategorized

email list vs RSS feed vs Twitter vs periodical

A year or two ago, I signed off from my last email list and rather assumed that they were a thing of the past. They were increasingly choked with announcements of self-promoting hype ‘articles’, of the “5 Amazing Things Every Great Data Scientist Does While Taking A Dump” variety. Now, to promote a workshop I’m organising, I find myself back on a couple, and they’re far, far better than they were. In fact, there seem to be things on them that I hadn’t heard about by other means. It’s so hard to keep up with all the cool developments around the data world now, much harder than 10 years ago, and that’s wonderful but also time-consuming and potentially distracting from the kind of Deep Work that we are actually paid to do.

I got into Twitter instead (@robertstats), and that also served as an outlet for the many quick little points I wanted to make that were too small to constitute a blog post. And through Twitter I have learned about more people and ideas than I can even begin to count. But at the same time, it massively cut my blog output, which I regret somewhat and intend to boost again a bit.

The third source was other people’s blogs. It feels to me (without any data) that blogs are declining in popularity, but the ones that make a genuine substantive contribution remain active. I used to get RSS feeds of new postings through Google and then later through WordPress.com (who host this blog), and I suppose I still do get those feeds, but I never look at them. I really mean never! They’re just not immediate in the way that email is, and not compelling in the way that Twitter is. But it’s easy to post to Twitter every time you blog, and you could even set up some kind of bot to do it for you. So I have to accept that blogs that are not syndicated in any other way are going to get missed. It’s unfortunate, but you can’t catch everything. The really good ones get tweeted by their readers if nothing else.

The crappy websites full of self-promotion still exist, and perhaps there are even more of them now, but somehow they seem to be controlled better and don’t sneak through. Maybe they fell foul of their own One Deep Learning Trick That Will Change Everything You Know About Everything, and got classified in the trash with 0.01 loss function. For my part, I only follow people who retweet with discretion. There are plenty of data people out there who seem to fire off everything that passes through their own feeds without reading it first, and although you feel you’re missing out on a great party, it’s best to just unfollow them. They won’t notice. And if you look a little deeper, you realise these people often have no Amazing Data Science to show for themselves but a whole lotta tweets; don’t forget what our former Prime Minister said on the subject.

I don’t read magazines on these sorts of subjects, except for Significance, which I am obliged to receive as an RSS (different kind of RSS there, folks) fellow, and that often has something good. But I have started subscribing to the New York Times (digital). At the time, it was far and away the best newspaper in the world for data journalism, dataviz and such, and I think it still has the lead, though it has lost some of its best team members while competitors grew into the field. Nevertheless, I learn quite a lot from it as a well-curated, wide-ranging international newspaper.

So, now I have two carefully chosen mailing lists, which send a daily digest, and I read them maybe once a week, taking no more than 10 seconds (literally) on each email. I get some tables of contents from journals, which are almost never interesting, but have occasional gems, so they get the same rough treatment. I read the paper but probably not as much as I should, and I am (as my homeboy Giovanni Cerulli put it) an avid consumer of Twitter, which signposts me off to all the blogs and publications and websites I might need.

I think the message here is that, as a data person, you need to think carefully about how you curate your own flow of information about new developments. It can easily take up too much of your time and disrupt your powers of concentration, but at the same time you can’t cloister yourself away or you will soon be a dinosaur. Our field is moving faster than ever and it’s a really exciting time to be working in it.

Filed under Uncategorized

How the REF hurts isolated statisticians

In the UK, universities are rated by expert panels on the basis of their research activities, in a process called the REF (Research Excellence Framework). The resulting league table not only influences prospective students’ choices of where to study, but also the government’s allocation of funding. More money goes to research-active institutions in a ‘back the winner’ approach that aims explicitly to produce a small number of excellent institutions out of the dense (and possibly over-supplied) field that exists at present. The recent publication of the Stern Review into this process has been widely welcomed. I have been involved with institutional rankings, albeit with hospitals rather than universities, for a long time, and of all the scoring systems and league tables that could be produced, the REF’s 2014 iteration is as close to a perfectly bad system as could be conceived. It might have been written by a room full of demons pounding at infernal typewriters until a sufficient level of distortion and perversity was achieved. Universities are incentivised to neglect junior researchers and save the money for a last-minute frenzied auction to headhunt established academics nearing retirement. The only thing that counts is a few peer-reviewed papers by a few academics, and despite assurances of holistic, touchy-feely assessment, everybody knows it comes down to some kind of summary statistic of the journal impact factors.

Stern tries to tackle some of that, and I won’t rehash the basics, as you can read about them elsewhere. I want to focus on the situation that isolated statisticians, in the ASA’s sense of the term, find themselves in. Many statisticians in academia end up ‘isolated’, in that they are the only statistician in a department of some other discipline. Whatever their colleagues’ background, and whatever the job description may say, the isolated statistician exists to some extent as a helpdesk for colleagues who are lacking in stats skills. I am one such, the only statistician in a faculty of 282 academic staff. Most of my publications are the result of colleagues’ projects, and only occasionally of my own methodological interests. Every university department has to submit its best (as defined by the REF) outputs into one particular “unit of assessment”, which in our case is “Allied Health Professions, Dentistry, Nursing and Pharmacy”.

This mapping of departments into units goes largely uncriticised (because it largely doesn’t matter), but it excludes people like isolated statisticians who don’t belong to the same profession as the rest of the unit. All my applied work with clinical / social worker colleagues, which is the bulk of the day job, can count (and of course, I chip in on so many applied projects that I actually look like a superhero by the REF’s metric), but any methodological spin-offs do not, yet they are the bit that really is Statistics, the bit that I would want to be acknowledged if I were looking for a job in a statistics department. I’m not looking for that job, but a lot of young applied jobbing statisticians are. Why is it necessary to have that crude categorisation of whole departments into a single unit of assessment? It doesn’t strike me as making the assessment any easier for the REF staff, because they rate the individual submissions and then aggregate them across units. The work-around would be to have joint appointments in different university departments, so applied work counts here and methodological there, except that the REF does not allow that: you must belong to one unit. This may not matter so much to statisticians, who have the most under-supplied and sexiest job of the new century, because we can always up sticks and head for Silicon Valley or the City, but is it really the intention of the REF to promote professional ghettos free from methodologists throughout academia? We have seen from psychology’s replication crisis what happens when people get A Little Knowledge and only ever talk to others like themselves.

Filed under Uncategorized

A bird’s eye view of statistics in two hours

Next week I am giving a two-hour talk and discussion for Kingston University researchers and doctoral students, with the aim of providing an update on statistics for those who are not active in the field. That’s an interesting and quite challenging mission, not least because it must fit into two hours, with the first hour being an overview for newcomers such as PhD students from health and social care disciplines, and the second hour looking at big current topics. I thought I would cover these points in the second half:

  • crisis of replication: what does it mean for researchers, and how is “good practice” likely to change?
  • GAISE, curriculum reform & simulation in teaching
  • data visualization
  • big data
  • machine learning

The first half warrants a revised version of this handout, with the talk then structuring the ideas around three traditions of teaching and learning stats:

  • classical, mathematically grounded stats, exemplified by Snedecor, Fisher, Neyman & Pearson, and many textbooks with either a theoretical or applied focus. Likelihood, and/or combining it with a prior to get a posterior distribution, are the big concepts here.
  • cookbook, exemplified by many popular textbooks out there, especially if their titles make light of statistics as a ‘hard’ subject (you could count Fisher here as the first evangelical writer in 1925, though it is harsh to put him in the same camp as some of these flimsy contemporary textbooks)
  • reformist, exemplified by Tukey in the 70s but consolidated around George Cobb and Joan Garfield’s work for the American Statistical Association. The only books for this are “Statistics: Unlocking the Power of Data” by the Lock family and “Introduction to Statistical Investigations” by Tintle et al.

It’s worth remembering that there are other great thinkers who accept the role of computational thinking and yet insist that you can’t really do statistics without being skilled in mathematics; David Cox springs to mind.

Hiroshige’s Eagle over the 100,000 acre plain of statistics. Note the density plot of some big data in the background.

The topics to interweave with those three traditions are models, the sampling distribution versus the data distribution, likelihood, significance testing as a historical aid to hand calculation, and Bayesian principles. I’ll put slides on my website when they’re ready.
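As a taster of the simulation flavour that the reformist camp favours, here is a minimal sketch in Python of the data distribution versus the sampling distribution of the mean; the exponential ‘population’ and the sample sizes are made up purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(2016)

# Made-up skewed 'population': exponential with mean 2
data = rng.exponential(scale=2, size=1000)   # one observed sample: the data distribution

# Sampling distribution of the mean: many samples of n = 50, keeping each sample's mean
means = np.array([rng.exponential(scale=2, size=50).mean() for _ in range(10_000)])

print(f"data distribution:     mean {data.mean():.2f}, sd {data.std():.2f}")
print(f"sampling distribution: mean {means.mean():.2f}, sd {means.std():.2f}"
      f" (theory says 2/sqrt(50) = {2 / np.sqrt(50):.2f})")
```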

While I’m on this subject, I’ll tell you about an afternoon meeting at the Royal Statistical Society on 13 October, which I have organised. The topic is making computational thinking part of learning statistics, and we have three great speakers: Helen Drury (Mathematics Mastery) representing the schools perspective, Kari Lock Morgan (Penn State University) representing the university perspective, and Jim Ridgway (University of Durham) considering what the profession should do about the changing face of teaching our subject.

Filed under learning

Logic, stats and Brexit

Lots of stats are being bandied about as we prepare for the famous Brexit vote. Not all of them are good, and there are conflicts of interest everywhere, perceived or real. It is tedious to demolish bad stats over and over, so I will take a different tack that caught my eye today, and that is the application of good, solid, old-fashioned logic. A few years ago, I recall being in a session at the RSS conference, in a room with about 50 people. Ian Hunt asked for a show of hands from anyone in the audience who had ever studied a logic course at school or college, and mine was the only one to go up. I really enjoyed that course, and the textbook was an old one by Wilfrid Hodges (“Logic”), which has been reprinted a zillion times since it first came out. It is pithy but engaging, a real exemplar of textbook writing at an introductory level. I commend it to all humans. Its benefits last a lifetime.

Let’s apply those skills, cobwebbed since the 1990s, to this webpage and this letter (paywalled) to The Times from the Institute for New Economic Thinking (INET) at Oxford. INET is in part funded by the European Commission (inet.ox.ac.uk/files/publications/INET%20Highlights%20Report%202012-14.pdf page 58) – let’s just put that fact out there and let you make of it what you will – personally, I don’t think it counts for all that much.

Rather surprisingly, they make three arguments and each is unsupported by the data they provide, and also logically fallacious. A tour de force of blundering.*

So, arguments in three parts:

  • No 1: “History is clear: things have gone very well for Britain as a member of the EU.”
    For this, you can see a chart that shows GDP per capita relative to 1973 going up, and faster than those blasted French, Germans and Americans. Ha! That’ll teach them. Or perhaps we were just in a really bad place in 1973, and were subsequently buoyed up by sales of Pink Floyd’s Dark Side of the Moon.** More importantly though, the fact that we did well while in the EU does not imply we did well because of the EU. Our GDP per capita went up 12.3 times in the 40 years that followed, but China’s went up 43.9 times, which, by the same logic, is clear evidence that we lost out by not having the Cultural Revolution. Damn! This fallacy is called post hoc ergo propter hoc, and is a staple of politicians everywhere. You could succinctly describe it as conflation of subsequence and consequence. Furthermore, even if we did well because of the EU, that still gives only a weak level of confidence in future performance, which is the real decision to be made here.
  • No 2: “Secondly, growth in the UK was more equally shared than in the USA”
    I’m not sure what this has got to do with Brexit, other than the unspoken suggestion that if we left, our Government (more right-wing than most EU countries in economic terms, but still verging on Trotskyite from an American perspective), would gradually erode policies that promote equality. INET say this:
    “Britain has had the best of both worlds while a member of the EU — not just strong growth, but more equal growth”,
    which still has a dose of post hoc about it, but also Weak Analogy: the suggestion is that we’d better stay in because someone outside is less equal than us, and if we leave we are sure to become like them. If that’s not the implication, then it must be irrelevant. There may also be some selective quoting going on here. Why the USA? They have a World Bank Gini coefficient of 41.1 to our 32.6, which means we are not as unequal as them. Note here that South Africa is not quoted as the inequality example, despite being statistically more striking (63.4), because there are well-known historical reasons why we would not become like them. To quote South Africa would be pushing it too far. Quote the USA and you might just get away with it. Norway, famous for staying out of the EU, has 25.9. Just sayin’. (Forget about their gas, because inequality is not the same as GDP per capita.)
  • No 3: “At present 45% of the UK’s exports go to other EU member countries. In response to the concern that the EU might impose high tariffs or punitive measures if the UK leaves, some Brexiteers have said that we can ‘just trade with Australia and Canada’. These two countries, however, only account for a meagre 2.9% of British exports.”
    Well, they would, wouldn’t they. That’s because we’re in the EU, not some Commonwealth trading bloc. The real question is how things might change, not what they are now. So, like no 2, an unwritten implication is being made here about the future. In no 2, it was that everything would change, and here, strangely, it is that nothing will change. Why this difference? Perhaps because it fits the prior beliefs of the authors, or perhaps it is just carelessness (oops). So, if the assumption is true that nothing will change, then we will trade little tomorrow with the same people we traded little with yesterday, which proves (wait for it) that nothing will change! What mastery of the argument, what skill with the pen. This is in fact a nice example of begging the question.

I’ve nothing against them making a strong case for what they believe, and I am delighted to see an attempt to use data to support any such argument, but I think one should not do the public the disservice of misleading them through repeated abuse of both logic and statistics.

* – my wife has told me not to antagonise people online, so I do not say this without first considering whether I am truly justified.
** – this is grotesque and silly hyperbole. But it is at least not post hoc ergo propter hoc, which makes it an improvement on the explanation in the INET letter.

Filed under Uncategorized

Borderline

Over the years I have gradually become more consistent in calling p-values from 0.02 to 0.08 borderline. There’s no reason for those cutoffs other than personal experience, also known as “making the same mistakes with increasing confidence [npi] over an impressive number of years”. Type 1 errors (which never really happen, but you know what I mean) have uniformly distributed p-values, whereas real effects pile theirs up near zero, so the borderline zone is exactly where the false alarms are going to turn up. Just thought I’d say that.
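In case that’s too cryptic, here is a quick made-up simulation: two-sample t-tests with n = 30 per arm, half the studies truly null and half with a modest real effect, and 50:50 prior odds. Every number here is an assumption for illustration, not an analysis of anything real.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, studies = 30, 20_000

def one_p(effect):
    # one made-up two-sample t-test, n observations per arm
    return stats.ttest_ind(rng.normal(0, 1, n), rng.normal(effect, 1, n)).pvalue

null_p = np.array([one_p(0.0) for _ in range(studies)])  # no real effect: p is uniform, home of the Type 1 errors
real_p = np.array([one_p(0.6) for _ in range(studies)])  # modest real effect (standardised difference 0.6)

def share_null(lo, hi):
    # with 50:50 prior odds, what share of p-values in (lo, hi] came from the true nulls?
    a = ((null_p > lo) & (null_p <= hi)).sum()
    b = ((real_p > lo) & (real_p <= hi)).sum()
    return a / (a + b)

print(f"p in (0.02, 0.08]: {share_null(0.02, 0.08):.0%} from true nulls")   # roughly a fifth
print(f"p <= 0.001:        {share_null(0.00, 0.001):.0%} from true nulls")  # well under 1%
```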

Filed under Uncategorized

Visiting Big Bang Data

I finally got a chance to visit the exhibition Big Bang Data at the Embankment Galleries, Somerset House this week. I had heard good things about it, and of course I am a big fan of Dear Data so I couldn’t pass up the chance to see those postcards in real life.

My GPS trace over 4 years around Somerset House from Google location history, visualised using theopolis.me/location-history-visualizer

Good stuff: it was really quite busy. The audience, almost all younger than me, were obviously enthusiastic and stimulated by it all. And I have to say it was Dear Data that held people’s attention the longest. There is no patronising text helping people bridge art to science or vice versa; I think that’s less necessary nowadays. A broad church from activism to paranoia to fun. Of all the exhibits, I got the Biggest Bang from (awooo) Networks Of London by Ingrid Burrington and Dan Williams. They mapped out the secretive ways in which data move around the physical world in this town. More than any theory, this brings home what a big deal it is (or is perceived to be by Dilbert’s boss), because of the colossal cost of creating and maintaining all of this infrastructure, often for reasons that seem flimsy to us everyday folk, like selling access to stock market data that arrives a few milliseconds before your competitor gets it.

Not so good: I think process matters more here than with a bunch of paintings. Artists don’t like talking about How I Made Elastic Man, but in this setting it would be nice to have some videos with headphones that delved more into how it is done. However… there’s loads of stuff on the website, so go and look at that even if you’re not in London twiddling your thumbs and looking for intellectual fun this weekend. Personally, I knew about or had seen quite a lot of these projects before, but I guess that’s inevitable, and it’s nice to see them in the flesh.

If you want to go, you’d better hurry. It closes on Sunday.

Filed under Visualization