Today I’m sharing a nice little dataset that I think has some good features for teaching. Hope you like it.
I spotted this in the museum in Jasper, Alberta in 2012 and took a photo.
Later, I e-mailed the museum to find out who I should credit for it. We eventually found that it originated some time ago from Parks Canada, so thanks to them, and I suggest you credit them as the source if you use it.
No, I don’t have it in a file. I think working from the typewritten page is quite helpful as it keeps people out of stats software for this. They have to think. If you want to click buttons, there are a gazillion other datasets out there. This is a different kind of exercise.
Here we have the number of scars in tree rings that indicate fires in various years. If you look back in time through a tree’s rings, you can plot when it got damaged by fire but recovered. This could give an idea of the number of fires through the years, but only with some biases. It would be an interesting exercise for students who are getting to grips with the idea of a data-generating process. You could prompt them to think up and justify proposed biases, and hopefully they will agree on stuff like:
- there’s a number of fires each year; we might be able to predict it with things like El Niño/La Niña years, arrival of European settlers and other data sources*
- the most ancient years will have few surviving trees, so more and more fires will get missed as you go back in time.
- this might not be random if the biggest (oldest) trees were more likely to get felled for wood
- there will be a point (perhaps when Jasper became a national park) after which fires in the backwoods are actively prevented and fought, at which point the size of the fires, if not the number, should drop
- the bigger the fire area, the more scars will be left behind; they have to decide to work with number of fires, or size (or both…)
- the variables for size of the fire will be quite unreliable in the old days, but otherwise there should be a good link from number of fires to number of scars
- can we really trust the area burnt in the older years? To 2 decimal places in 1665?
- and other things that are very clever and I haven’t dreamt of
* – once they are done with the data-generating process, if they are confident enough with analysis, you could give them this dataset of Canada-wide forest fires, which I pulled together a few years ago. It’s not without its own quirks, as you’ll see, but they might enjoy using it to corroborate some of their ideas.
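If students struggle to picture how these biases distort the record, a toy simulation can help. The sketch below is not based on the Jasper data at all: every number in it (fire rate, tree counts, mean areas, the `suppression_year` cut-off, the scarring probability) is a made-up assumption chosen only to show the mechanism. It generates "true" fires each year, then records a fire only if at least one tree that survives to the present was scarred by it.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate with Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate(start=1665, end=1971, suppression_year=1910, seed=1):
    """Toy data-generating process; all parameter values are invented.

    Returns one tuple per year: (year, true_fires, recorded_fires, scars).
    """
    rng = random.Random(seed)
    rows = []
    for year in range(start, end + 1):
        # Assumption: the number of sampled trees already alive in a given
        # year grows linearly from 5 (oldest years) to 200 (latest years).
        frac = (year - start) / (end - start)
        trees_alive = int(5 + 195 * frac)

        # Assumption: a constant average of 3 true fires per year.
        true_fires = poisson(rng, 3.0)

        # Assumption: fire suppression shrinks the mean burnt area.
        mean_area = 50.0 if year < suppression_year else 10.0

        recorded, scars = 0, 0
        for _ in range(true_fires):
            area = rng.expovariate(1.0 / mean_area)
            # Assumption: bigger fires scar each tree with higher probability.
            p_scar = 1.0 - math.exp(-area / 2000.0)
            n_scarred = sum(rng.random() < p_scar for _ in range(trees_alive))
            scars += n_scarred
            if n_scarred > 0:  # a fire is seen only if it left a scar
                recorded += 1
        rows.append((year, true_fires, recorded, scars))
    return rows


if __name__ == "__main__":
    rows = simulate()
    for label, chunk in (("first 50 years", rows[:50]), ("last 50 years", rows[-50:])):
        true_total = sum(r[1] for r in chunk)
        seen_total = sum(r[2] for r in chunk)
        print(label, "true fires:", true_total, "recorded:", seen_total)
```

Students could change one assumption at a time (for example, make old trees more likely to be felled) and see how the recorded series drifts away from the true one.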
I would ask them to propose a joint Bayesian model for the number of fires and area burnt over the years, including (if they want) predictions for the future (bearing in mind the data ends at 1971). You could also ask for sketched dataviz in a poster presentation, for example.
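For anyone who wants a starting point to argue with, one possible shape for such a joint model is sketched below. It is only an illustration, not a recommended answer: the symbols ($N_t$ for the true number of fires in year $t$, $A_t$ for area burnt, $S_t$ for scars counted, $m_t$ for the number of sampled trees alive in year $t$, $x_t$ for covariates such as El Niño/La Niña indicators, and $t_s$ for the year suppression began) and all the distributional choices are assumptions introduced here.

```latex
\begin{align*}
N_t &\sim \mathrm{Poisson}(\lambda_t), &
  \log \lambda_t &= \alpha + \beta^\top x_t \\
\log A_t \mid N_t > 0 &\sim \mathrm{Normal}(\mu_t, \sigma_t^2), &
  \mu_t &= \gamma + \delta \, \mathbb{1}[t \ge t_s] \\
S_t \mid A_t, N_t > 0 &\sim \mathrm{Poisson}(\kappa \, m_t \, A_t), &
  S_t &= 0 \text{ when } N_t = 0
\end{align*}
```

Letting $\sigma_t$ grow for older years is one way to express distrust of the early area figures, and the $m_t$ term carries the survivorship bias. Students should feel free to tear this apart and propose something better.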
Finally, I highly recommend a trip to Jasper. What a beautiful part of the world!