This page is just out from the New York Times, showing trends in the US jobs report and how they could easily be an artefact of sampling error. It’s a magnificent piece of data reporting: clear, punchy and helpfully demystifying. Sampling error is what happens when you don’t have all the data, only a slice of it. You might be unlucky and get a wild over- or under-estimate. If you hear someone say that statistics is a tool to help you make decisions under conditions of uncertainty, or that statistics is a missing data problem, this is typically what they are getting at. Sampling error is not the only type of uncertainty, but it’s the one most amenable to mathematical probing, and it is often the biggest one too.
As a keen D3 hanger-on, I have seen didactic examples which randomise data like this, but it never occurred to me that the technique could actually be used to show uncertainty by simulating the data (rather than some general fuzzy wobbling). I am very impressed with the clarity of the writing too.
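For anyone who wants to see sampling error in action, here is a minimal sketch in Python. It is not the New York Times's method, and every number in it is made up for illustration: a hypothetical "true" monthly change of 150, a spread of 500 between individual responses, and a survey of 100 responses. The function name `simulate_estimates` is mine too. The idea is simply to run the same imaginary survey many times and watch how much the estimate moves around even though the truth never changes.

```python
import random
import statistics


def simulate_estimates(true_mean, sd, sample_size, n_sims, seed=0):
    """Repeat a hypothetical survey n_sims times and return each estimate.

    Each survey draws sample_size values from a normal distribution
    centred on true_mean and reports the sample mean as its estimate.
    The normal distribution is an assumption made for simplicity.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_sims):
        sample = [rng.gauss(true_mean, sd) for _ in range(sample_size)]
        estimates.append(statistics.fmean(sample))
    return estimates


# All of these figures are illustrative assumptions, not real jobs data.
estimates = simulate_estimates(true_mean=150.0, sd=500.0,
                               sample_size=100, n_sims=1000)

# The spread of the estimates is the sampling error made visible.
spread = statistics.stdev(estimates)
print(f"lowest estimate:  {min(estimates):.1f}")
print(f"highest estimate: {max(estimates):.1f}")
print(f"spread (standard deviation of estimates): {spread:.1f}")
```

With these made-up settings, theory says the spread should be close to 500 / sqrt(100) = 50, so individual surveys can land a long way from 150 through bad luck alone. Plotting a handful of the simulated estimates one after another would give something in the spirit of the animated charts, though that is my guess at the approach rather than a description of their code.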