Dutch Data Protection Authority releases guidance on "data scraping"

Posted on 8 August 2024

New guidance (currently only available in Dutch) has been released by the Dutch Data Protection Authority (Dutch DPA) in relation to "data scraping". The Dutch DPA advises that data scrapers with commercial interests cannot rely on "legitimate interests" as the basis for the collection and processing of scraped data. Although it is perhaps unlikely that the Information Commissioner in the UK would take a similar approach, it will be important to track how the issue develops.

Data scraping

Data scraping is where a "bot" or automated computer program captures, on a mass scale, information that is stored online. One of the uses of data scraping in recent years has been to collect data to train AI (Artificial Intelligence) systems and LLMs (Large Language Models), such as those used in ChatGPT.

The Dutch DPA's guidance

For processing of personal data to be lawful under the EU GDPR, one of the bases listed in Article 6(1) must apply. Article 6(1)(f) provides such a basis where processing is necessary for the legitimate interests of the data controller, or any other person, as long as those interests are not overridden by the interests, rights or freedoms of data subjects.

In the Dutch DPA's view, commercial entities cannot rely on legitimate interests to collect and process data that has been scraped; rather, explicit consent is always required from the data subjects.

The Dutch DPA recognises that collecting consent on this scale, from data subjects who may not be contactable, would be impractical. Given this, the Dutch DPA says that there is no legal basis under the EU GDPR for the practice of data scraping by commercial entities.

The Dutch DPA does accept that scraping of personal data for personal, non-commercial purposes may be compliant without consent. However, if that data were then shared on a public repository (such as GitHub), it is unlikely that further processing would be lawful.

The guidance requires certain risk assessments and safeguards to be put in place to protect data subjects in the instances where data scraping is compliant. The Dutch DPA also suggests that if a high risk to privacy is identified when completing a DPIA (Data Protection Impact Assessment) then a controller would have an obligation under Article 36 of the GDPR to consult with the Dutch DPA directly, so that it can assess intended processing and what measures may be needed before any processing starts.

The guidance is particularly critical of data scraping that is used to train AI. This is because it identifies that content on the internet may be incorrect, biased or discriminatory, which may pose a risk to fundamental rights when the AI is later deployed.

The guidance does not address enforcement, which may raise questions, at least for the time being, about what realistic deterrent there is to poor compliance.

Is this indicative of other Regulators' direction of travel?

The approach taken by the Dutch DPA to whether commercial entities can rely on the "legitimate interests" basis is one which has been the subject of some criticism and which has been referred to the Court of Justice of the European Union (CJEU) by the Dutch court in another case. It is certainly not shared by all (if any) other European data protection authorities.

Nonetheless, data scraping is an area of increasing scrutiny for some regulators, and we are aware that the Polish Data Protection Authority fined a Swedish scraper €220,000 in 2019. However, this was in response to a failure to notify individuals that their data had been scraped, as opposed to the scraping itself. Similarly, the Spanish regulator fined Equifax Inc. €1 million in 2021 for failing to inform data subjects, limit their collection to what was necessary, ensure the data's accuracy, and satisfy the balancing test required to rely on the legitimate interest basis for collection and processing.

It is difficult to say whether other data protection authorities and regulators will take similar positions to this guidance from the Dutch DPA (whilst probably not adopting the full "legitimate interests" analysis). It may be that other authorities wait to see what happens following this guidance, and how the CJEU rules on the "legitimate interests" point, before implementing their own.
