On June 19, 2025, the French data protection authority (CNIL) published two new practical guidelines [1] [2] detailing its recommendations regarding the use of legitimate interest as a legal basis for developing artificial intelligence systems (“AIS”), particularly in cases involving the harvesting of data available online (web scraping). These guidelines, which are part of the CNIL’s ongoing work on the development of AI systems and the creation of databases used for their training, complement the practical guidelines published in 2024 [3].
The CNIL reiterates that legitimate interest will often be the appropriate legal basis for the development of AIS by a private organization. A public body may also rely on this legal basis “only when the activities in question are not strictly necessary for the performance of its specific missions but pertain to other legally implemented activities (such as processing for HR management)”.
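The CNIL’s recommendations also emphasize that relying on legitimate interest requires fulfilling three conditions: the interest pursued must be legitimate, the processing must be necessary to achieve that interest, and the processing must not disproportionately affect the rights and interests of the data subjects.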
If the controller uses web scraping to build training databases, it may also invoke legitimate interest, provided certain conditions are met. Because this practice poses risks to the rights and interests of data subjects, who have no control over the reuse of their data accessible online, its implementation requires particular vigilance.
The CNIL thus specifies that the controller must, in particular:
The CNIL will soon publish further recommendations concerning the status of AI models under the GDPR, security aspects in AIS development, and data annotation practices.
[2] The legal basis of legitimate interest: focus sheet on measures to be taken in the event of data collection through web scraping, CNIL
[3] AI practical fact sheets, CNIL