This came up on Twitter and lots of people were outraged, as you see in the replies and retweets.
Let’s unpack a couple of things.
- appreciate – it’s not clear what he means by this. It could mean “Many software engineers will never be really good at data science using modern machine learning”, which seems like a tautology (the same goes for estate agents), but see software engineers below. It could mean “Many software engineers will never truly have an intuitive attraction to the elegant mathematical underpinnings of modern machine learning”, and in that case it is true that there is a connection between maths and, er, maths, but that’s not very interesting. Appreciating in this sense is an ivory tower luxury.
- love – lord above, are you trying to fool me in love? I think high-pressure rote learning in the Asian mould would do the trick too. It seems irrelevant.
- as a teen – this is what most people hated about it, the gatekeeping and stereotype-enforcement. It’s clearly bollocks, so let’s not waste time on Someone Said Something Wrong On The Internet. If you want to learn now, here’s my reading page.
- software engineers – if he really is talking about software engineers (isn’t that term, like, a bit 1990s?), then it sounds fair enough despite the inaccuracies and tautologies. Why would they want to or need to have anything to do with modern ML? I’m a statistician, but do enough programming to grasp what it is like to be a day-in, day-out coder. You just grab something that someone wrote — a random forests library perhaps — and plug it in. Why would you appreciate its theory? That’s a waste of time. You don’t go round appreciating the hell out of fibre broadband cables.
- modern machine learning – I don’t know what is meant by this. It’s interesting to me that some things shared by ML and stats, like logistic regression, have strong mathematical underpinnings, which is to say that their asymptotics are understood. Other things that live in ML but not in stats, like deep learning with backprop, are kind of greedy and heuristic, and do not have guaranteed or even understood asymptotics. Depending on what he means by this phrase, there might be nothing to appreciate. If there is something to appreciate, then it might not be that modern: logistic regression was pretty much finished theoretically in the 70s, PCA in the 30s.
- math – this is the really interesting thing. Do you need maths to do data science well? It certainly helps with reading those tortuous theory papers (but they’re not that useful compared to messing about with software). It is not as useful as programming skills (hi, software engineers!). The reason a lot of people get caught out is that they have done some analysis that ran and produced no error messages but led to the wrong answer, and they had no mental tools to spot it. Maths will not give you those tools; you need to think about data and to have got your hands dirty messing around with it. I studied maths and enjoyed it and did pretty well, if I say so myself, but that has been of very little use to me. I’ve forgotten most of it.
If you really do intend to be a methodological stats prof, then you’d better get good with the old x’s and y’s, but otherwise, install R and play.
Perhaps the one really useful skill I acquired is imagining data as points in space, rotating, distorting, projecting. I had to do a lot of that when doing a Masters dissertation project with PCA, MCA, etc. That has genuinely helped me to develop ideas and think about where things are going wrong.
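To make the points-in-space picture concrete, here is a minimal sketch of PCA in two dimensions, written in plain Python rather than R so that it needs no packages. Everything in it is an assumption made for illustration: the toy data, the 30 degree tilt and the helper names (`centre`, `covariance`, `rotate`, `principal_angle`) do not come from the dissertation mentioned above. In 2-D the direction of greatest variance has a closed-form angle, so PCA reduces to: centre the cloud, rotate it until its long axis lies along x, then project by dropping y.

```python
import math
import random

def centre(points):
    """Shift 2-D points so that their mean sits at the origin."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    return [(x - mx, y - my) for x, y in points]

def covariance(points):
    """Return (sxx, syy, sxy) for points that are already centred."""
    n = len(points)
    sxx = sum(x * x for x, _ in points) / (n - 1)
    syy = sum(y * y for _, y in points) / (n - 1)
    sxy = sum(x * y for x, y in points) / (n - 1)
    return sxx, syy, sxy

def rotate(points, angle):
    """Rotate points anticlockwise about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def principal_angle(points):
    """Angle of the direction of greatest variance for centred 2-D points."""
    sxx, syy, sxy = covariance(points)
    return 0.5 * math.atan2(2 * sxy, sxx - syy)

# Toy data (made up): a long thin cloud, tilted by 30 degrees and shifted.
random.seed(1)
cloud = [(random.gauss(0, 3.0), random.gauss(0, 0.5)) for _ in range(200)]
raw = [(x + 10, y - 4) for x, y in rotate(cloud, math.radians(30))]

centred = centre(raw)
angle = principal_angle(centred)      # estimated tilt of the long axis
aligned = rotate(centred, -angle)     # rotate so the long axis lies along x
projected = [x for x, _ in aligned]   # project: keep only the first component

print(f"estimated tilt: {math.degrees(angle):.1f} degrees")
print("covariance after rotation (sxx, syy, sxy):", covariance(aligned))
```

After the rotation the off-diagonal covariance is zero up to rounding and the total variance is unchanged, which is the "rotating" part; throwing away the second coordinate is the "projecting" part. With more than two dimensions you would use an eigendecomposition or SVD instead of a single angle.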
The other important thing to think about is metrics – different ways of quantifying the distance from this data point to that one, because that underpins a lot of stuff that follows, whether stats or ML (notably loss / log-likelihood functions). And I have another blog post on this very topic coming up.
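As a small taste of why the choice of metric matters, here is a hedged sketch in plain Python. The points, the helper names (`euclidean`, `manhattan`, `chebyshev`, `nearest`, `gaussian_nll`) and the unit-variance Gaussian model are all assumptions chosen for illustration, not anything from the forthcoming post. It shows two things: the "nearest" point depends on which distance you pick, and a Gaussian negative log-likelihood with fixed variance is just half the squared Euclidean distance plus a constant.

```python
import math

def euclidean(p, q):
    """Straight-line distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def chebyshev(p, q):
    """Largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(p, q))

def nearest(query, candidates, metric):
    """Return the candidate closest to `query` under the given metric."""
    return min(candidates, key=lambda c: metric(query, c))

def gaussian_nll(y, mu):
    """Negative log-likelihood of y under independent N(mu_i, 1) terms."""
    n = len(y)
    squared = sum((a - b) ** 2 for a, b in zip(y, mu))
    return 0.5 * squared + 0.5 * n * math.log(2 * math.pi)

# Made-up points: which of a and b is closer to the origin?
query, a, b = (0.0, 0.0), (3.0, 3.0), (0.0, 4.0)
for metric in (euclidean, manhattan, chebyshev):
    print(metric.__name__, "picks", nearest(query, [a, b], metric))

# Made-up observations and fitted values: the loss is a distance in disguise.
y, mu = (1.0, 2.0, 4.0), (1.5, 1.5, 3.0)
print("Gaussian NLL:", gaussian_nll(y, mu))
print("half squared distance:", 0.5 * euclidean(y, mu) ** 2)
```

Here the Euclidean and Manhattan distances both pick `b`, while Chebyshev picks `a`, so anything built on nearest neighbours would behave differently depending on the metric. The last two printed numbers differ only by a constant that does not depend on `mu`, which is why minimising squared error and maximising this likelihood lead to the same fit.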