# Explanation and inference with house sparrows

This time I’m going to take a closer look at another of the data visualisations I’ve been filling my spare time with, for fun, not profit. I have two bird feeders in our garden and you can watch the consumption of seeds updated with every top-up at this page. This started when I wrote about Dear Data (viz o’ the year 2015) and recommended playing around with any odd local data you can get your hands on. I thought it would just be a cutesy dataviz exercise but it ended up as a neat microcosm of another issue that has occupied me somewhat this year: inference and explanation.

Briefly, statistical inference says things like “the rate of bird seed consumption is 0.41 cm/day, and if birds throughout suburban England consume at one rate, that rate should lie between 0.33 and 0.50, with 95% probability”, or “the bird seed consumption has changed, by more than is plausibly due to random variation, so some systematic effect is probably at work”. But explanation is different: it is all about why it changed. Explanation doesn’t have to match the statistics. A compelling statistical inference with a minuscule p-value could bug the hell out of you because you just can’t see why it should be in the direction it is, or as strong as it is. Or an unconvincing, borderline statistical inference could cry out to you, “I am a fact. Just the way you hoped I would be!”
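To make the first kind of inference concrete, here is a minimal sketch with made-up numbers (the real feeder data are not reproduced here): given a handful of daily consumption rates, compute the mean and a normal-approximation 95% interval.

```python
import math

# Hypothetical daily seed-level drops in cm/day; values chosen for
# illustration only, not the actual feeder readings.
rates = [0.35, 0.44, 0.39, 0.48, 0.41, 0.36, 0.45, 0.40]

n = len(rates)
mean = sum(rates) / n
# Sample standard deviation and standard error of the mean.
sd = math.sqrt(sum((r - mean) ** 2 for r in rates) / (n - 1))
se = sd / math.sqrt(n)
# Normal-approximation 95% interval; a t quantile would give a
# slightly wider interval for a sample this small.
lo, hi = mean - 1.96 * se, mean + 1.96 * se
print(f"{mean:.2f} cm/day, 95% CI [{lo:.2f}, {hi:.2f}]")
```

The interval says nothing about *why* the rate is what it is; that is exactly the gap between inference and explanation at issue here.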

The problem here is that we try to be systematic in doing our statistical inferences, so that we don’t fall prey to cognitive biases: we pre-specify what we’re going to do and have to raise the bar if we start doing lots of tests. However, there’s no systematic approach like that to explanation. In fact, it’s not at all clear where these explanations come from, apart from thunderbolts of inspiration, and it’s only somewhat understood how we judge a good explanation from a poor one (as ever, I refer you to Peter Lipton’s book Inference To The Best Explanation for further reading).

When you get a great, satisfying explanation, it’s tempting to stop looking, but when you have compelling stats that don’t lead to a nice explanation, you might keep poking at the data, looking for patterns you like better, that suggest just such a nice explanation to you. Then, all the statistical work is no more sound than the explanatory thunderbolts.

Sad to relate, dear Reader, even Robert falls into these traps. On the web page, I wrote an explanation of the pattern of seed consumption, without giving too much thought to it:

> I interpret the pattern along these lines: in mid-summer, the consumption increases massively as all the chicks leave the nest and start learning how to feed themselves. The sparrows in particular move around and feed in flocks of up to 20 birds. Once seeds and berries are available in the country though, it is safer for them to move out there than to stay in the suburbs with prowling cats everywhere. But as the new year arrives, the food runs out and they move back in gradually, still in large flocks, before splitting into small territories to build nests. Cycle of life and all that.

That was based on unsystematic observation before I started collecting the data. My hunches were informed by sketchy information about the habits of house sparrows, gleaned from goodness-knows-where, and they were backed up by the first year of data. I felt pretty smug. On this basis, one would feel confident predicting the future. But then, things started to unravel. The pattern no longer fit and there were multiple competing explanations for this. The data alone could not choose between them. Fundamentally, I realised I was sleepwalking into ignoring one of my own rules: don’t treat complex adaptive systems like physical experiments. An ecosystem — some gardens, parks and terrain vague, plus a bunch of songbirds, raptors, insects, squirrels, humans and cats — is a complex adaptive system, and the same issues beset any research in society. Causal relationships are non-linear, highly interdependent, and there are intelligent agents in the system. This all contributes to the same input producing very different outputs on different occasions, because the rules of the system change.

If it is foolish to declare an explanation on the basis of one year’s data that happen to match prior beliefs, it is equally so to declare an explanation for why 2016 shifted from 2015’s pattern after just a few months. It’s also foolish to say that after March 2016 consumption dropped and that coincided with a new roof being built on our garage, so that is the cause — yet I did just that:

> The sharp drop in March 2016 was the result of work going on to replace the roof on our garage, which introduced scary humans into the garden all day. [emphasis added]

How embarrassing. Yes, it’s a nice explanation, but it doesn’t really have any competitors because there are no other observed causes, and there are no other observed effects either (I don’t sit out there all day taking notes like Thoreau). It’s only likely insofar as it sits in a set of one with no likelier competitors. It’s only lovely insofar as it explains the data, and there’s nothing else to explain. And later in the year, the congruence it enjoys with prior beliefs about causal mechanisms deteriorates: surely those birds would get used to the new roof and come back?

But we are all capable of remarkable mental gymnastics to keep our favourite explanation in the running. My neighbour had a tree cut down in June, so that would upset things further, and would be a permanent shift in the system. It was a good year for insects, following a frost-free winter, so there was less pressure to go feeding in comparatively dangerous gardens. And so on, getting ever more fanciful. The evidence for any of these mechanisms is thin, to say the least. We can’t guard against this mental laxity, because it’s the same process that long ago helped our family members to eat the springbok in the bushes and not get eaten by the corresponding lion, and it’s now hard-wired. But we can at least acknowledge that science* is somewhat subjective, even though we try our best to impose strict lab-notes-style hypothetico-deduction on it (this does not necessarily imply the use of decision-theoretic devices like significance tests; the birdfeeder page only does splines, which are basically non-parametric descriptive statistics), and not pretend to be able to know the Secrett of Nature by way of Experiment.
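Since the birdfeeder page is said to do only splines, here is a hedged sketch of that style of non-parametric smoothing, on fabricated data and assuming SciPy’s `UnivariateSpline` (not necessarily the implementation the page itself uses).

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Fabricated 60 days of consumption rates: a seasonal wiggle plus
# noise, for illustration only.
days = np.arange(60, dtype=float)
rng = np.random.default_rng(0)
rates = 0.4 + 0.1 * np.sin(days / 10) + rng.normal(0.0, 0.05, days.size)

# s is the smoothing factor: larger s trades fidelity to the points
# for a smoother curve. No model, no p-values; purely descriptive.
spline = UnivariateSpline(days, rates, s=0.5)
smoothed = spline(days)
```

The point of the smoother is descriptive: it draws a curve through the noise without committing to any causal story about why the curve bends where it does.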

(More reading: Terry Speed’s article Creativity In Statistics.)

* – while simple physical sciences — of the sort you and I did in high school — might lead to Secretts, life and social sciences certainly don’t, and in fact the modern physical sciences involve pushing instruments to their limits and then statistics comes in to help sift the signal from the noise, so they too are a step removed from claiming to have proven physical laws.

Also, this exercise has something to say about data collection, or appropriation. I made some ground rules that were not very well specified, but in essence, I thought it best to write down how much seed I put in, rather than adjust for why it had come out. Spilled seeds were not to be separated from eaten seeds. But then in July 2015, I found a whole big feeder gone after filling it up in the morning, and ascribed this (on no evidence) to a squirrel. I thought about disregarding the day, but decided not to in the end. For one thing, it turned out in the cold light of analysis to be not so different to other high-consumption days. Maybe it was just especially ravenous sparrows. Or maybe Cyril the Squirrel had been at work all along. Once again, there was no way to choose one explanation from another. Now, the more you learn about the source of the data in all its messy glory, the more you question. But without that information, you wouldn’t. Another subjective, mutable aspect appears, one which is more relevant in this age of readily available and reused data.

All of these bird seed problems also appear in real research and analysis, but there, drawing the wrong conclusions can cause real harm. In each of the cock-ups above, it is the explanation that causes the problem, not the stats.

As I write this, I feel like I keep banging the same subjectivity and explanation drums, but, frankly, I don’t see much evidence of practice changing. I think the replication efforts of recent years in psychology are somewhat helpful, but are limited to fighting on a very narrow front. It probably helps to terrorise researchers more generally regarding poor practices but what we also need is a friendly acceptance of subjectivity and the role explanation plays. Science is hard.