Anonymisation is a critical process for life sciences organisations seeking to protect individuals' privacy while handling personal data. However, the specificity and detail of the personal data used in this sector pose significant challenges to effective anonymisation, and have sparked extensive academic and judicial debate.
This article delves into the concept of "anonymised" data within the data protection framework, addresses the complexities of the topic, and offers recommendations for organisations in the life sciences field.
The challenge for the life sciences sector
Life sciences organisations may collect personal data from many sources. They might collect human samples from sponsored studies, receive datasets that have already been processed outside of the organisation, or work with public data.
In many cases it may be possible to combine data held within the organisation, or that is easily accessible from collaborators, to identify an individual from the data. To avoid some of the obligations that come from processing personal data, many organisations would like to be able to treat data that they work with, and disclose to third parties, as anonymised data.
The legal background
Anonymous data falls outside the scope of EU and UK data protection laws, namely the GDPR and UK GDPR, which govern the processing of "personal data." Personal data is defined as information relating to an identifiable person.
Pseudonymous data, by contrast, is still personal data, and subject to data protection laws. The GDPR defines "pseudonymisation" as processing data such that it cannot be attributed to a specific individual without additional information, which must be kept separate and protected.
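As a minimal sketch of what this key separation can look like in practice (the field names, token format, and dataset are illustrative, not a prescribed method):

```python
import secrets

def pseudonymise(records, id_field="patient_id"):
    """Replace a direct identifier with a random token.

    Returns the pseudonymised records plus a separate key table
    (token -> original identifier). Under the GDPR definition, the
    key must be held apart from the dataset and protected.
    """
    key_table = {}          # to be stored separately, under access controls
    pseudonymised = []
    for record in records:
        token = secrets.token_hex(8)
        key_table[token] = record[id_field]
        pseudonymised.append({**record, id_field: token})
    return pseudonymised, key_table

records = [
    {"patient_id": "NHS-123", "age": 34, "diagnosis": "asthma"},
    {"patient_id": "NHS-456", "age": 58, "diagnosis": "diabetes"},
]
pseudo, key = pseudonymise(records)
```

A holder of `pseudo` alone cannot attribute a record to an individual; a holder of both `pseudo` and `key` can, which is why the dataset remains personal data in the latter's hands.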
The crux of determining whether information is anonymous is assessing whether a living individual can be identified using "all the means reasonably likely to be used", taking into account the cost, time, and technology available for identification. The test is binary: whether an individual can be identified using reasonably likely means, not how probable identification is.
One interesting issue that can arise is that the same dataset can be considered personal data or anonymised data depending on who holds it. For instance, if Person A has a pseudonymised dataset and a key to identify individuals, it constitutes personal data for them. However, if Person B holds the same dataset without the key, it constitutes anonymised data for them.
Issues arising
In most circumstances these complexities are unlikely to cause problems, but it is worth drawing attention to them, if only as a reminder that this is not always a straightforward area of law.
For example, an anomaly within a dataset, such as a patient aged 30-40 with dementia, is a valuable data point that cannot easily be excluded without undermining the effectiveness of the research. Yet outliers are far easier to re-identify from limited data than routine cases.
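The vulnerability of outliers can be illustrated by counting how many records share each combination of quasi-identifiers: a record that is unique on those attributes is the one a motivated intruder will find easiest to single out. This is a simplified sketch with illustrative field names, not a formal re-identification risk assessment:

```python
from collections import Counter

def find_unique_records(records, quasi_identifiers):
    """Flag records whose quasi-identifier combination appears only once;
    these outliers carry the highest re-identification risk."""
    combos = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return [r for r in records
            if combos[tuple(r[q] for q in quasi_identifiers)] == 1]

records = [
    {"age_band": "30-40", "condition": "dementia"},   # rare combination
    {"age_band": "70-80", "condition": "dementia"},
    {"age_band": "70-80", "condition": "dementia"},
]
outliers = find_unique_records(records, ["age_band", "condition"])
```

Here the 30-40 dementia patient is flagged as unique, while the two routine cases shield each other.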
Regulatory guidance in the UK
The UK's Information Commissioner's Office (ICO) produced its Anonymisation Code of Practice in 2014, which remains relevant despite being based on the pre-GDPR regime. In May 2021, however, the ICO began drafting new guidance on anonymisation, pseudonymisation and privacy enhancing technologies (PETs), and held a consultation that closed in December 2022. However, development of the aspects of the guidance relating to anonymisation and pseudonymisation is on hold pending the passage of the Data Protection and Digital Information Bill.
Risk-based steps, measures, and mitigations
Organisations can take several steps to mitigate the risks of identification and to comply with regulatory requirements:
- Reduce identifiability: Aim to make identification as remote as possible, using the ICO's "motivated intruder" test as a benchmark. Such an intruder is described as "a person who starts without any prior knowledge but who wishes to identify the individual from whose personal data the anonymised data has been derived".
- Technical and organisational measures: Implement measures such as access controls, secure data transfer methods, and encryption to reduce the risk of identification.
- Privacy-enhancing technologies (PETs): The ICO's June 2023 guidance on PETs includes measures such as differential privacy, synthetic data, encryption, and trusted execution environments.
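To make one of those PETs concrete, the following is a minimal sketch of differential privacy using the Laplace mechanism for a counting query (the epsilon value is illustrative; this is not taken from the ICO guidance):

```python
import random

def dp_count(true_count, epsilon=1.0):
    """Return a differentially private count via the Laplace mechanism.

    A counting query has sensitivity 1, so noise is drawn from
    Laplace(scale = 1/epsilon): smaller epsilon means more noise
    and stronger privacy. The difference of two independent Exp(1)
    draws is a standard way to sample Laplace noise.
    """
    noise = (random.expovariate(1.0) - random.expovariate(1.0)) / epsilon
    return true_count + noise

noisy = dp_count(42, epsilon=0.5)
```

The released figure is close to the true count on average, but no single individual's presence or absence can be confidently inferred from it.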
- Risk assessments: Even where the processing does not strictly mandate a Data Protection Impact Assessment (DPIA), it would be sensible to undertake one. A DPIA offers at least three benefits: it informs decision-making; it helps mitigate risk; and it provides documentation to rely on in the event of subsequent complaints or challenges. Again, the ICO has guidance on undertaking DPIAs.
- Person A/Person B Approach: This approach means that only one party, such as the collaborator, holds the key and the other party, such as the sponsor, does not. In these circumstances the collaborator would hold pseudonymised data, but it would be anonymised in the sponsor's hands.
Mishcon de Reya, with its deep-rooted expertise in working with life sciences organisations, can assist in reviewing and negotiating agreements to ensure responsibilities for data protection are appropriately allocated.