Centrelink’s recent debt recovery woes perfectly illustrate the human side of data modelling. The Department for Human Services issued 169,000 debt notices after automating its processes for matching welfare recipients’ reported income with their tax. Around one in five people are estimated not to owe any money. Over Christmas, stories abounded of people receiving erroneous debt notices up to thousands of dollars that caused real anguish.
Coincidentally, as this all unfolded over the break, one of the books on my reading pile was Weapons of Math Destruction by Cathy O’Neil. She is a mathematician turned quantitative analyst turned data scientist who writes about the bad data models increasingly being used to make decisions that affect our lives.
Guidelines are also being produced to help organisations be more transparent and accountable in how they use data to make decisions. For instance, The Open Data Institute in the UK has developed openness principles designed to build trust in how data is stored and used. Algorithmic transparency is being contemplated as part of the EU Free Flow of Data Initiative and has become a focus of academic study in the US.
However, we cannot rely on regulation alone. Legal, transparent data models can still be ‘bad’ according to O’Neil’s standards. Widely known errors in a model could still cause real harm to people if left unaddressed. An organisation's normal processes might not be accessible or suitable for certain people – the elderly, ill and those with limited literacy – leaving them at risk. It could be a data model within a sensitive policy area, where a higher duty of care exists to ensure data models do not reflect bias. For instance, new proposals to replace passports with facial recognition and fingerprint scanning would need to manage the potential for racial profiling and other issues.
Ethics can help bridge the gap between compliance and our evolving expectations of what fair and reasonable data usage is. O’Neil describes data models as “opinions put down in maths”. Taking an ethics based approach to data driven decision making helps us confront those opinions head on.