The European Data Protection Board's (EDPB) Opinion 28/2024 is a landmark effort to clarify how the GDPR applies to AI models.
With organizations increasingly turning to Artificial Intelligence for decision-making, customer service, fraud detection, and personalization, the question of how to reconcile these technologies with stringent data protection laws has never been more pressing.
Previously, companies struggled to fit fast-evolving AI capabilities into a regulatory framework designed before widespread AI adoption. Opinion 28/2024 responds to a request from the Irish Supervisory Authority, addressing four key areas:
- Conditions for considering AI models "anonymous"
- Using legitimate interest as a legal basis
- Handling the aftermath of unlawful data processing during model development
- Emphasizing documentation, accountability, and continuous risk management
This guidance is not limited to Large Language Models or generative AI. Any AI model that is trained on personal data, regardless of complexity or purpose, falls within its scope.
You can watch the episode of our podcast on the EDPB's opinion on AI training below:
You can also listen to the episode as a podcast on Apple Podcasts, Google Podcasts, Spotify and Audible.
And you can read the article below:
AI Models and Anonymity: Meeting a High Standard
A significant point in the EDPB's opinion concerns the criteria under which AI models can be considered truly anonymous. Merely stripping out direct identifiers no longer suffices. The EDPB raises the bar, insisting on a case-by-case analysis and warning that even aggregated data may be susceptible to re-identification attacks like model inversion or membership inference.
Key Takeaways:
- Case-by-case Analysis: Each AI model is unique, and organizations must demonstrate that re-identification risks are not "reasonably likely."
- High Threshold for Anonymity: Techniques like differential privacy can help inject "noise" to prevent extracting personal data, but pseudonymization alone is insufficient.
- ICO's Pragmatic Angle: The UK ICO's position is slightly more pragmatic, urging organizations to implement realistic safeguards and justify potentially risky approaches, such as web scraping, with robust data minimization strategies.
By understanding these requirements, businesses can better align their data-handling practices with the EDPB's strict standards on anonymity.
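To make the differential privacy idea mentioned above concrete, here is a minimal sketch of the classic Laplace mechanism applied to a counting query. The function name and example data are ours, purely for illustration; real deployments use vetted libraries and careful privacy accounting.

```python
import math
import random

def dp_count(records, predicate, epsilon):
    """Return a differentially private count of records matching predicate.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so adding Laplace noise with scale
    1/epsilon yields epsilon-differential privacy for this single query.
    """
    true_count = sum(1 for r in records if predicate(r))
    scale = 1.0 / epsilon
    # Sample Laplace(0, scale) noise via inverse-CDF from a uniform draw.
    u = random.random() - 0.5
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

# Toy example: how many people in a training set are over 40?
ages = [23, 45, 67, 34, 51, 29, 62]
print(dp_count(ages, lambda a: a > 40, epsilon=0.5))  # true count 4, plus noise
```

The smaller the epsilon, the more noise is injected and the stronger the privacy guarantee, at the cost of accuracy; this is exactly the trade-off organizations would need to document when arguing that extraction of personal data is not "reasonably likely."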
Relying on Legitimate Interest: A Strict Balancing Test
The EDPB's opinion clarifies that legitimate interest (Article 6(1)(f) GDPR) is not a "quick fix" for justifying data processing in AI models. Instead, it imposes a three-step test:
- Identify a Legitimate Interest: The interest must be concrete, lawful, and clearly defined.
- Assess Necessity: Is the personal data essential to achieve the stated goal? Could the same result be achieved with less intrusive methods?
- Balancing Test: The rights and freedoms of data subjects must not be overridden. Consider data sensitivity, transparency, and potential discrimination risks.
ICO's Perspective on Legitimate Interest:
The ICO encourages specificity. By clearly defining the interest behind training an AI model, controllers can strengthen their case when balancing organizational goals against individual rights. Detailed documentation, possibly in the form of Data Protection Impact Assessments (DPIAs), is crucial.
Dealing With Unlawfully Processed Data in AI Development
What happens if personal data used to train an AI model was originally obtained unlawfully? The EDPB's opinion outlines several scenarios:
- Same Controller, Tainted Data: If the controller that unlawfully processed data continues to deploy the model, they must re-examine the model's compliance and may need to halt use, retrain it, or apply remedial measures.
- Third-Party Acquisition of the Model: A company purchasing a pre-trained AI model must conduct due diligence. Ignorance of unlawful origins is no defense.
- Anonymization Before Deployment: If the model's personal data are truly anonymized before deployment, the GDPR may cease to apply, provided this anonymization meets the high threshold set by the EDPB.
This framework stresses accountability across the AI supply chain. Every party involved must ensure data legality, not just the initial data collector.
Documentation and Accountability: The Cornerstones of Compliance
Throughout its opinion, the EDPB underscores the importance of documentation and accountability. Robust records are essential to demonstrate compliance at every stage of AI model development.
Documentation Essentials:
- DPIAs: Particularly for high-risk processing scenarios, DPIAs should identify risks, propose mitigations, and highlight safeguards.
- Records of Processing Activities (Article 30 GDPR): Keep detailed logs of data sources, purposes, and protective measures.
- Technical Reports & Vulnerability Assessments: If using differential privacy or other controls, maintain evidence to back up these claims.
A transparent record-keeping strategy not only helps in regulatory audits but also builds trust with users, clients, and partners.
Risk Management and Privacy-Enhancing Techniques for AI Models
A proactive risk management approach lies at the heart of EDPB-compliant AI development. This begins with privacy-by-design principles:
Core Strategies:
- Data Minimization: Only collect what the AI model truly needs.
- Pseudonymization and Differential Privacy: Consider injecting noise, encrypting data, or limiting access to reduce re-identification risks.
- Regular Testing and Updates: Continuously test the model against known attacks and update controls as technology evolves.
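As a concrete, hypothetical illustration of the pseudonymization strategy listed above: a keyed hash (HMAC) replaces direct identifiers with stable pseudonyms, while the secret key is stored separately. The key name and sample data below are our own inventions.

```python
import hashlib
import hmac

def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed-hash pseudonym.

    Unlike a plain hash, an HMAC cannot be reversed by brute-forcing
    common values (e-mail addresses, names) without the secret key.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"),
                    hashlib.sha256).hexdigest()[:16]

key = b"example-key-kept-in-a-separate-store"  # hypothetical key management
alias = pseudonymize("alice@example.com", key)
# The same input and key always yield the same pseudonym, so datasets
# can still be joined; re-identification requires access to the key.
print(alias == pseudonymize("alice@example.com", key))  # True
```

Note the limit the EDPB draws: because the key holder can re-identify individuals, pseudonymized data remains personal data under the GDPR, which is why pseudonymization alone does not achieve anonymity.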
Supervisory authorities will likely want to see tangible evidence that organizations actively mitigate risks. Adopting state-of-the-art techniques and regularly assessing vulnerabilities demonstrates a firm commitment to privacy and security.
Special Categories of Data and Automated Decision-Making
If an AI model processes special categories of data (e.g., health, biometrics), the GDPR imposes even stricter rules. Valid exemptions or explicit consent become critical. The potential harm to individuals is higher, and so are the stakes for compliance.
Automated Decisions Under Article 22 GDPR:
For systems that significantly affect individuals, such as those determining credit eligibility, transparency and human oversight are non-negotiable. Controllers must provide understandable explanations of how the AI model makes decisions and offer meaningful avenues for individuals to challenge or request human intervention.
Meeting these standards may require close collaboration between legal and technical teams to translate complex decision-making logic into clear, intelligible information for data subjects.
Harmonization and Contextual Application of the EDPB Opinion
The EDPB's opinion promotes harmonization across EU/EEA jurisdictions, but it also acknowledges that no two AI models are identical. Each AI model demands a tailored approach, factoring in data types, processing methods, and intended uses.
For businesses, this means there is no one-size-fits-all template. Expect to adjust your compliance strategy model by model, remaining agile as new guidance and regulations (like the upcoming EU AI Act) emerge.
Real-World Example: The GEDI Case
A recent case involving GEDI, a major Italian media group, and its arrangement with OpenAI offers a practical illustration of the EDPB's opinion in action. GEDI intended to share large news archives containing personal data with OpenAI for AI model training.
Regulatory Concerns:
- Legal Basis: Was legitimate interest or another legal basis clearly established?
- Transparency: Could individuals reasonably know their data might be used to train AI?
- Data Subjects' Rights: Were there mechanisms for data subjects to object or understand how their data was processed?
The Italian Data Protection Authority (Garante) raised red flags, showing that even well-established companies must rigorously justify data usage. The GEDI case underscores that vast datasets of seemingly "public" information can still be personal data under the GDPR.
Practical Steps from the EDPB Opinion for Businesses Implementing AI Models
To navigate these complex requirements set out by the EDPB opinion, businesses looking to deploy AI can adopt a robust, proactive approach:
- Perform DPIAs Early and Often: Identify high-risk areas and revisit assessments as models evolve.
- Choose and Document Your Legal Basis Carefully: Legitimate interest requires a thorough balancing test; consent demands transparency and meaningful choice.
- Implement Privacy-Enhancing Techniques: Differential privacy, anonymization, and pseudonymization can reduce re-identification risks.
- Maintain Detailed Documentation: Keep rigorous records of processing activities, justifications, and risk assessments to satisfy EDPB standards.
- Test for Vulnerabilities: Regularly evaluate your AI model against membership inference attacks, model inversion, and other potential exploits.
- Ensure Transparency: Make sure data subjects know how their data is used and what rights they have.
- Stay Informed: Track regulatory updates and evolving guidance to remain compliant and competitive.
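The vulnerability testing recommended above can start with a simple membership-inference probe: an overfitted model tends to be more confident on records it memorized during training, and that confidence gap is exactly what an attacker measures. The toy "model" and names below are purely illustrative.

```python
def toy_confidence(record, memorized):
    # Stand-in for a trained model: overfitted models tend to return
    # near-certain predictions on records seen during training.
    return 0.99 if record in memorized else 0.55

def infer_membership(record, confidence_fn, threshold=0.9):
    # Attacker's guess: "was this record in the training set?"
    # High confidence on a record suggests it was memorized.
    return confidence_fn(record) >= threshold

training_set = {("alice", 34), ("bob", 29)}
conf = lambda r: toy_confidence(r, training_set)

print(infer_membership(("alice", 34), conf))  # True: flagged as a member
print(infer_membership(("carol", 41), conf))  # False: likely not in training
```

In practice, auditors compare the model's confidence or loss distributions on training data versus held-out data; a large gap signals memorization risk and is the kind of tangible evidence supervisory authorities will want to see.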
Lessons from the EDPB Opinion on AI Models
The EDPB's opinion on AI model compliance sets the stage for a future where innovation and data protection must walk hand in hand. By understanding the high bar set for anonymity, the conditions for using legitimate interest, the ramifications of unlawful data processing, and the need for rigorous documentation and risk management, businesses can navigate a complex landscape with confidence.
As AI technologies advance, so will regulatory expectations. Organizations that invest in privacy-by-design, ongoing audits, and clear communication will be better positioned to maintain trust, avoid enforcement actions, and remain at the forefront of responsible AI innovation.
On the same topic, you can read the articles available HERE.