People often say that Bland and Altman’s paper, in which they set out the eponymous plot for comparing two methods of measurement in medical statistics, is the most-cited stats paper ever. I thought I would poke around on Google Scholar and see what the citations looked like there.
In terms of total citations, and given all the shortcomings of this as a measure of anything, there are two ahead of B&A, and they needn’t feel cheated, as we’re talking about titans of statistics here. Here are the rankings for the seven papers I could think of checking:
- Cox (1972) Regression models and life-tables: 35,512 citations.
- Dempster, Laird & Rubin (DLR, 1977) Maximum likelihood from incomplete data via the EM algorithm: 34,988
- Bland & Altman (1986) Statistical methods for assessing agreement between two methods of clinical measurement: 27,181
- Geman & Geman (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images: 15,106
- Efron (1979) Bootstrap methods: another look at the jackknife: 9,686
- Tibshirani (1996) Regression shrinkage and selection via the lasso: 8,744
- Nelder & Wedderburn (1972) Generalized linear models: 3,818
Can anyone think of any other landmark papers to look up?
Nelder & Wedderburn invented GLMs, so you’d think they should be pretty darn near the top, but for two things, I suppose. Firstly, the most popular of these models, logistic regression and Poisson regression, are so commonplace that people no longer cite them. Secondly, the book by McCullagh and Nelder (following Robert Wedderburn’s tragic death at the age of 28) attracts most of the citations. Adding up all the variants of it in Google Scholar, you get 24,297 citations, which, added to the paper’s own count, gives 28,115 and would take GLMs up to third place, overtaking B&A. But then that is rather unfair on others with much-cited books, like Little & Rubin or David Cox.
When you look at citations per year since publication, you have to remember that Google Scholar is not measuring the same thing each year. Since it got going, Google have put effort into going back into the archives and getting more books, reports and grey literature on the system. Recent years are going to produce more citations simply because of an inclusion bias, not to mention the fact that a lot more gets written and published each year now (most of it rubbish). But, given all that, B&A come out on top with 1,007 citations per year, DLR second with 972, and Cox third with 866.
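For anyone who wants to redo the per-year sums, here is a rough Python sketch. It assumes the counts were collected in 2013, which the post does not actually state; that is simply the year that reproduces the three figures quoted above. The names `papers`, `COUNT_YEAR` and `citations_per_year` are made up for illustration.

```python
# Total Google Scholar citations as listed in the post: (publication year, count).
papers = {
    "Cox (1972)": (1972, 35512),
    "DLR (1977)": (1977, 34988),
    "Bland & Altman (1986)": (1986, 27181),
    "Geman & Geman (1984)": (1984, 15106),
    "Efron (1979)": (1979, 9686),
    "Tibshirani (1996)": (1996, 8744),
    "Nelder & Wedderburn (1972)": (1972, 3818),
}

# ASSUMPTION: the counts were taken in 2013. The post does not say so;
# this year is inferred because it reproduces the quoted per-year figures.
COUNT_YEAR = 2013


def citations_per_year(pub_year, citations, as_of=COUNT_YEAR):
    """Crude average: total citations divided by years since publication."""
    return citations / (as_of - pub_year)


# Round to whole citations per year, as in the post.
rates = {
    name: round(citations_per_year(year, count))
    for name, (year, count) in papers.items()
}

# Papers ordered from highest to lowest citations per year.
ranking = sorted(rates, key=rates.get, reverse=True)

for name in ranking:
    print(f"{name}: {rates[name]} per year")
```

Under that assumed year, the top three come out as 1007, 972 and 866, matching the figures above. This is only a flat average, so it does nothing to correct for the inclusion bias towards recent years.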