Andrew Gelman has been writing on his excellent blog about how it is the constraint and the unexpected inspiration of real-life, tricky, dirty data problems that lead us to make useful new methods in stats (and probably other methodological fields too).
There is a lot to learn from this post. The motivation for making new methods is important to their success:
We weren’t trying to shave a half a point off the predictive error … we were attacking new problems that we couldn’t solve in any reasonable way using existing methods
and he goes on to describe a situation where his published maps were shown to be faulty and criticised publicly. Far from shrinking embarrassed back into the ivory tower or explaining it away under a lot of esoteric jargon, he improved them and got a better quality result at the end of the process; that quality is the only thing that matters, not the statistician’s ego.
I was also struck by the mention of caring about the results. This is the central issue to me, more important than whether the inspiration or the exemplar data are real or artificial, current or historical, theoretical or practical. It is because we know how important it will be for future carpenters to construct better homes, that we work hard to make a better tool for the job, even if it is a struggle to get anyone to recognise the value of the new tool. (Last year I met Stef van Buuren at a conference and he told me it took 4 years to get any journal editors and reviewers to take fully conditional specification multiple imputation seriously – now it is everywhere.) Of course it is easier to care about real-life applications that make the world a better place, but Prof Gelman knew the implications of his 8 schools problem so well he cared about it too and that shows through.
In response to Gelman’s reference to Watership Down, I will give you the nugget of gold at the heart of JD Salinger’s underrated Franny and Zooey. Franny has come home from college – in what might have been termed, in the Upper East Side circa 1950, a blue funk – at what simpletons and dullards her professors turned out to be. All they want is to score points off one another and put the students down. Nobody really cares about literature, which makes her retreat to the sofa under a blanket and look for some higher meaning in the family’s eclectic spiritual and philosophical book collection. Nothing works until her brother Zooey gets frustrated by her high expectations and tells her about when he and their late brother Seymour were regular contestants on a radio quiz show for children. Even when Zooey was feeling rebellious, Seymour insisted they dress up formally and shine their shoes, though not even the audience in the studio could see their shoes.
He said to shine them anyway. He said to shine them for the Fat Lady. I didn’t know what the hell he was talking about, but he had a very Seymour look on his face, and so I did it. He never did tell me who the Fat Lady was, but I shined my shoes for the Fat Lady every time I ever went on the air again … This terribly clear picture of the Fat Lady formed in my mind. I had her sitting on this porch all day, swatting flies, with her radio going full-blast from morning till night. I figured the heat was terrible, and she probably had cancer, and – I don’t know. Anyway, it seemed goddam clear why Seymour wanted me to shine my shoes… but I’ll tell you a terrible secret – are you listening to me? There isn’t anybody out there who isn’t Seymour’s Fat Lady. [Penguin Books 1957]
See, what you can do with data is a valuable thing, and you might not have much time in which to do it, so I don’t have any time to spare for people who choose to waste their energies entering a data mining competition to see who can best predict next week’s NASDAQ, or crunching numbers for bookmakers or any other parasites upon humanity. The process of making the tool that someone else will use to make the world better is its own reward.
If you crunch numbers, then, like me, I expect you have at some point put a lot of work into a project, got very near the end (publication, sending off the report, whatever) and then found a small error, maybe in your numbers or maybe in the wording and interpretation that accompanies them. Maybe you just found something that made you worry that there might be errors lurking somewhere. Yet, you knew that your non-statistician colleagues would never know, and also that you were extremely tired and just wanted to go home and have a beer. This, then, is the statistical litmus test: I can honestly tell you I have always dragged myself back to the task, sometimes unpicking all my work just to find there was nothing wrong after all. I do have the thought of leaving it and not saying anything – you’d not be human if you didn’t – but I have always gone back. I guess that is because I have always had the pleasure of working on material I care about. If you wake up tomorrow and find you don’t care, that you might fail the litmus test, do yourself a favour and get out of that job – immediately – because we need you to care and the clock is ticking.