The European Data Protection Board (EDPB) has released its ChatGPT Taskforce Report, which highlights privacy issues relevant to any developer and deployer of GenAI solutions.
Here's what you need to know.
Web Scraping and Data Processing:
The report examines the reliance on legitimate interest as the legal basis for collecting and processing personal data to train ChatGPT, and sets out the limits within which, according to the EDPB, that reliance would be acceptable.
According to the EDPB, legitimate interest can in principle serve as the legal basis, but safeguards must be put in place to mitigate undue impact on data subjects, potentially tipping the balancing test in favor of the data controller. Such safeguards include:
- Technical measures to filter data collection;
- Exclusion of certain data categories and sources (e.g., public social media profiles); and
- Deletion or anonymization of personal data before training.
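The safeguards above could be sketched as a pre-training data pipeline. The following is a minimal, hypothetical illustration (the function names, regex patterns, and source blocklist are my assumptions, not taken from the report) of how source exclusion and anonymization might be applied to scraped documents before they enter a training corpus:

```python
import re

# Hypothetical sketch of the EDPB-suggested safeguards: exclusion of
# certain sources (e.g. public social media profiles) and anonymization
# of personal data before training. Patterns and blocklist are examples.

EXCLUDED_SOURCES = {"facebook.com", "instagram.com"}  # assumed blocklist
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def keep_source(url: str) -> bool:
    """Drop documents scraped from excluded sources."""
    return not any(domain in url for domain in EXCLUDED_SOURCES)

def anonymize(text: str) -> str:
    """Redact common personal-data patterns before training."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text

def prepare_corpus(docs):
    """Apply source filtering, then anonymization, to scraped documents."""
    return [anonymize(d["text"]) for d in docs if keep_source(d["url"])]
```

In practice, real pipelines would rely on dedicated PII-detection tooling rather than simple regexes, but the structure, filter first, then redact, reflects the layered mitigation the EDPB describes.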
Critical Issues – Transparency Obligations:
When notifying data subjects isn't feasible (as with scraping), controllers must make information publicly available to protect data subjects' rights.
According to the EDPB, the current status is that
- ChatGPT's data collection methods are not publicly transparent;
- Data subjects can't easily exercise their rights (e.g., right to be forgotten);
- The system still outputs personal information, indicating that data isn't fully anonymized; and
- OpenAI is making deals with platforms like Reddit, which contain personal data, suggesting, according to the EDPB, an ongoing reliance on personal data without sufficient safeguards.
In my view, the EDPB should acknowledge the potential of generative artificial intelligence for our society and find a manageable solution that balances compliance with proper exploitation of the technology. That would be a major change in approach for privacy authorities, which rarely take a business-oriented view. However, with the approval of the AI Act, the EU took a clear stance in favor of the proper usage of AI, and authorities should work with GenAI developers to find feasible solutions.
The current approach may not serve the general interest, and open discussions with AI providers could help find a solution that appropriately balances the interests of all parties involved.
On the topic, you can read the article “The Italian case on ChatGPT benchmarks generative AI's privacy compliance?”.