Big data: the only way is ethics

Posted: 21 July 2020

Data is everywhere and growing at an exponential rate. According to cloud software firm DOMO, there are forty times more bytes of data than there are stars in the observable universe.1

Insurance companies have access to huge amounts of data, but the sector has not been among the first to exploit the potential of that data.

Data comes from myriad sources (including policy and claims records, social media, credit reference agencies and equipment sensors), but data on its own is useless. It only becomes valuable once it has been processed, analysed or modelled, and then used for a specific purpose. Making sense of so much information and using it in a timely manner (often in near real time) requires specialist infrastructure and skills. The skills required are broad: they include IT, data science, and legal and regulatory expertise.

Companies that can exploit data effectively can reap diverse benefits, ranging from optimised underwriting to tailored proposition development, better customer segmentation and fraud reduction. Big Data presents huge opportunities for companies to gain competitive advantage, but attention must be paid to the following key subjects: consumer privacy, regulation and data ethics.

Consumer privacy and regulation

Many consumers are happy to cede a level of privacy and share personal data if they stand to benefit. A Mulesoft survey revealed that nearly two thirds (62%) of 18- to 34-year-olds would be happy for their insurance provider to use third-party data from social media platforms in return for a more personalised service and better premiums.2 This willingness declined as age increased, and UK consumers were more cautious about the use of third-party data (36%) than those in Singapore (63%) or the U.S. (49%).

As more and more customer data has become available to businesses, it is no surprise that those businesses have developed inconsistent views about how the data should be managed and what level of privacy should be attached to it. To address this issue, and to make it easier for EU consumers to understand and control how their personal data is stored and used, the General Data Protection Regulation (GDPR) came into effect in May 2018.

Customers’ data is not just protected by GDPR. The Equality Act 2010 exists to protect all individuals from discrimination based on characteristics including age, disability and race. Insurers are forbidden from instigating blanket policies whose terms uniformly disadvantage one such group.

The law does accept that certain exceptions may apply; for example, an insurance provider would be allowed to consider a disability as an influencing factor if the insurance risk was greater as a direct result.

Artificial intelligence (AI) can unlock the power of Big Data. It enables companies to make sense of massive data sets, but the use of AI and machine learning algorithms, which provide much deeper insight into Big Data, brings ethical challenges. Users must be aware of bias which may exist in the data and know how to minimise its effects on the models they are building. Although machines do not have human bias, they use data which does, so “AI models can embed human and societal biases and deploy them at scale”.3

One oft-quoted example of an algorithm with embedded bias is the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) software, used by some states in the U.S. criminal justice system to predict the likelihood of offenders reoffending. COMPAS has come under criticism for its alleged bias against minority ethnic groups.4

Insurance, an industry based on risk profiling, must also guard against bias in its data, whether in relation to age, demographic factors, or even arbitrary factors that happen to correlate with risk by chance, with no rational explanation or causal link (for example, drivers called ‘David’ appearing riskier than those named ‘Peter’).
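The chance-correlation point can be illustrated with a small simulation. The Python sketch below is an illustration only, not a description of any insurer's method: the names, the 10% claim probability and the helper functions `simulate_drivers` and `claim_rate_by_name` are all hypothetical. Every simulated driver has exactly the same underlying chance of claiming, yet the observed claim rates still differ from name to name.

```python
import random
from collections import defaultdict


def simulate_drivers(names, drivers_per_name, true_claim_prob, seed):
    """Create (name, made_claim) records where every driver has the SAME claim probability."""
    rng = random.Random(seed)  # seeded so the illustration is repeatable
    return [
        (name, rng.random() < true_claim_prob)
        for name in names
        for _ in range(drivers_per_name)
    ]


def claim_rate_by_name(records):
    """Return {name: share of drivers with that name who made a claim}."""
    totals = defaultdict(int)
    claims = defaultdict(int)
    for name, made_claim in records:
        totals[name] += 1
        claims[name] += int(made_claim)
    return {name: claims[name] / totals[name] for name in totals}


if __name__ == "__main__":
    # Hypothetical names and parameters, chosen purely for illustration.
    names = ["David", "Peter", "Sarah", "Aisha", "Tom", "Mei"]
    records = simulate_drivers(names, drivers_per_name=200, true_claim_prob=0.10, seed=1)
    rates = claim_rate_by_name(records)
    # Print names from highest to lowest observed claim rate.
    for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
        print(f"{name:<6} observed claim rate: {rate:.1%}")
```

Any gap between names in the output comes from sampling noise alone, because the simulation gives a driver's name no influence on risk. A model trained naively on such data could still pick the difference up as a signal.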

Data analysis

AI can help improve data transparency and highlight bias where a process is continually tested, with a view to examining and explaining decisions made. This ‘explainable AI’ can help expose vulnerabilities and flaws and ultimately feed into data ethics models. For now, at least, it seems that a combination of AI and human involvement is the ideal approach.
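One simple form of continual testing is to compare outcomes across groups and flag large gaps for human review. The Python sketch below assumes decisions are logged as (group, approved) pairs. The group labels, the function names and the 0.8 threshold are illustrative assumptions, not a legal or regulatory standard.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return {group: share of applications approved} from (group, approved) pairs."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        approved[group] += int(was_approved)
    return {group: approved[group] / totals[group] for group in totals}


def disparity_ratio(rates):
    """Lowest group approval rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved in any group, so there is no gap to report
    return min(rates.values()) / highest


def needs_review(decisions, threshold=0.8):
    """Flag the decision log for human review when the ratio falls below the threshold.

    The 0.8 default is an illustrative assumption only.
    """
    return disparity_ratio(approval_rates(decisions)) < threshold


if __name__ == "__main__":
    # Hypothetical decision log: 8 of 10 approved in one group, 5 of 10 in the other.
    log = (
        [("group_a", True)] * 8 + [("group_a", False)] * 2
        + [("group_b", True)] * 5 + [("group_b", False)] * 5
    )
    print(approval_rates(log))
    print("Needs human review:", needs_review(log))
```

A flag from a check like this does not prove unlawful discrimination, since a difference may reflect a genuine difference in risk. It identifies decisions that a person should examine and be able to explain.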

The growing use of AI has prompted the UK Government to set up the Centre for Data Ethics and Innovation (CDEI), which advises on best practice for the responsible use of data-driven technology. Further, in April 2019, the European Union High Level Expert Group on Artificial Intelligence presented the Ethics Guidelines for Trustworthy Artificial Intelligence. These guidelines state that machine learning models should be lawful, ethical and robust. This seems likely to be an area in which regulation will continue to evolve.

James Tucker
Smart Technology Manager
Allianz Insurance plc

1 Data Never Sleeps 7.0. DOMO, Inc. July 2019.
2 Mulesoft: Consumer Connectivity Insights 2018.
3 McKinsey:
4 ibid.