The parts of social services that do child protection in England get inspected by Ofsted on behalf of the Department for Education (DfE). The process is analogous to the Care Quality Commission inspections of healthcare and adult social care providers, and they both give out ratings of ‘Inadequate’, ‘Requires Improvement’, ‘Good’ or ‘Outstanding’. In the health setting, there’s many years’ experience of quantitative quality (or performance) indicators, often through a local process called clinical audit and sometimes nationally. I’ve been involved with clinical audit for many years. One general trend over that time has been away from de novo data collection and towards recycling routinely collected data. Especially in the era of big data, lots of organisations are very excited about Leveraging Big Data Analytics to discover who’s outstanding, who sucks, and how to save lives all over the place. Now, it may not be that simple, but there is definitely merit in using existing data.
This trend is just appearing on the horizon for social care though, because records are less organised and electronic, and because there just hasn’t been that culture of profession-led audit. Into this scene came my colleagues Rick Hood (complex systems thinker) and Ray Jones (now retired professor and general Colossus of UK social care). They wanted to investigate recently open-sourced data on child protection services and asked if I would be interested to join in. I was – and I wanted to consider this question: could routine data replace Ofsted inspections? I suspected not! But I also suspected that question would soon be asked on the cash-strapped corridors of the DfE, and I wanted to head it off with some facts and some proper analysis.
We hired master data wrangler Allie Goldacre, who combed through, tested and verified and combined together the various sources:
- Children in Need census, and its predecessor the Child Protection and Referrals returns
- Children and Family Court Advisory and Support Service records of care proceedings
- DfE’s Children’s Social Work Workforce statistics
- SSDA903 records of looked-after children
- Spending statements from local authorities
- Local authority statistics on child population, deprivation and urban/rural locations.
Just because the data were ‘open’ didn’t mean they were useable. Each set had its own quirks and each local authority had its own problems and definitions in some cases. The data wrangling was painstaking and painful! As it’s all in the public domain, I’m going to add the data and code to my website here, very soon.
Then, we wrote this paper investigating the system and this paper trying to predict ‘Inadequate’ ratings. The second of these took all the predictors in 2012 (the most complete year for data) and tried to predict Inadequates in 2012 or 2013. We used the marvellous glmnet package in R and got down to three predictors:
- Initial assessments within the target of 10 days
- Re-referrals to the service
- The use of agency workers
Together they get 68% of teams right, and that could not be improved on. We concluded that 68% was not good enough to replace inspection, and called it a day.
But lo! Soon afterwards, the DfE announced that they had devised a new Big Data approach to predict Inadequate Ofsted scores, and that (what a coincidence!) it used the same three indicators. Well I never. We were not credited for this, nor indeed had our conclusion (that it’s a stupid idea) sunk in. Could they have just followed a parallel route to ours? Highly unlikely, unless they had an Allie at work on it, and I get no impression of the nuanced understanding of the data that would result from that.
Ray noticed that the magazine Children and Young People Now were running an article on the DfE prediction, and I got in touch. They asked for a comment and we stuck it in here.
A salutary lesson that cash-strapped Gradgrinds, starry eyed with the promises of big data after reading some half-cocked article in Forbes, will clutch at any positive message that suits them and ignore the rest. This is why careful curation of predictive models matters. The consumer is generally not equipped to make the judgements about using them.
A closing aside: Thomas Dinsmore wrote a while back that a fitted model is intellectual property. I think it would be hard to argue that coefficients from an elastic-net regression are mine and mine only, although the distinction may well be in how they are used, and this will appear in courts around the world now that they are viewed as commercially advantageous.