Many people assume that once a person publishes their image online, that image can be freely re-used by others for new and unrelated purposes. It is true that very few privacy and confidentiality rights protect published images, and intellectual property rights, such as copyright, are only sometimes relevant. However, other important rights and restrictions, including contractual terms, fair use policies and data protection rights, must always be fully considered.

Clearview AI Inc has been fined or otherwise sanctioned by data protection regulators in the UK, Greece, Italy, France and Australia for misusing images and key technologies. In the USA, it has faced class action lawsuits. This techAnalysis examines data scraping, web scraping, facial recognition technology, artificial intelligence and the complexities of re-using the online images of individuals.

Data Scraping, Facial Recognition Technology and Artificial Intelligence

Data scraping is the process of allowing a computer programme to extract data from the output generated from another programme. Web scraping is a popular form of data scraping in which a computer application is used to extract valuable information from a website, including copying the images of individuals.
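As a rough illustration of how little effort web scraping takes, the sketch below pulls image addresses out of a page's HTML using only the Python standard library. The sample page, the class name ImageLinkParser and the function extract_image_urls are invented for this example. It reads a hard-coded string rather than a live website, and it says nothing about whether such copying would be lawful.

```python
from html.parser import HTMLParser


class ImageLinkParser(HTMLParser):
    """Collects the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.image_urls.append(value)


def extract_image_urls(html):
    """Return the image addresses found in an HTML string."""
    parser = ImageLinkParser()
    parser.feed(html)
    return parser.image_urls


# Hypothetical page content, used instead of a real website
SAMPLE_PAGE = """
<html><body>
  <h1>Team</h1>
  <img src="/photos/alice.jpg" alt="Alice">
  <img src="/photos/bob.jpg" alt="Bob">
  <p>No image here.</p>
</body></html>
"""

print(extract_image_urls(SAMPLE_PAGE))
# prints ['/photos/alice.jpg', '/photos/bob.jpg']
```

The same few lines, pointed at millions of pages, are what turn a simple technique into large-scale collection of personal data.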

Facial recognition technologies are technical methods used to identify an individual from a digital image. These technologies rely on personal data and biometric data to identify individuals. In the EU and UK, the General Data Protection Regulation (GDPR) defines biometric data as personal data relating to the physical, physiological or behavioural characteristics of a person that is used to confirm their identity. Biometric data is included in the list of special categories of personal data in the GDPR. These are some of the most sensitive forms of personal data. The collection and use of these data is high-risk processing and requires extra care, attention and, often, explicit consent. Other special category data include racial or ethnic origin, political opinions, religious or philosophical beliefs, trade union membership, genetic data, health, sex life and sexual orientation. Data about criminal convictions and offences also attract similar special treatment.

Artificial Intelligence (AI) is the ability of a computer or computer-controlled robot to perform tasks and analysis in ways that are like those carried out by intelligent human beings. AI includes several techniques such as machine learning and deep learning. AI is often applied to achieve a variety of outcomes including problem-solving, reasoning, knowledge representation, natural language processing, learning, planning, perception, motion and manipulation, social intelligence and general intelligence.

The Story of Clearview AI

Clearview AI is an identity intelligence solutions company that boasts about the accuracy and reliability of its facial recognition technology, which is powered by artificial intelligence. The company’s customers include the police, banks, transportation businesses and governments. Customers can upload a person’s image to the company’s application interface, which then checks for a match against the billions of images in its database. In order to provide its services, the company collected more than 20 billion images of people, together with related data, from publicly available information on the internet and social media platforms globally for its online facial recognition database. This was done without the knowledge or consent of the individuals or the companies that published the facial images online.

Clearview AI has been fined £7.4 million (€8.75 million) by the UK Information Commissioner’s Office and £16.91 million (€20 million) by each of the Greek and Italian data protection regulators for using images of people on its online database in breach of data protection laws. The company has also been ordered to stop collecting and using the personal data it had unlawfully gathered and to delete this information from its systems. Regulators in several countries have therefore found that Clearview AI breached their laws. Large technology companies and social media businesses have started to investigate these practices and take legal action against companies that scrape their data and copy their online information. Often these practices are in breach of the target business’s terms and conditions and fair use policies.

Five reasons why collecting and using images and data collected online breaches data protection laws

1. Failure to collect and use personal data in a fair and transparent way

Data protection laws require personal data to be collected and used fairly, and processed only in ways that individuals would expect. The use of personal data should not lead to unjustified and adverse effects on individuals. It is important to consider the lawfulness and fairness of personal data use before data processing starts. Transparent data collection and use requires clarity, openness and honesty towards the individuals involved, so that they are properly informed and, where necessary, can give their explicit consent.

2. Failure to have a lawful reason for collecting people’s online personal data

It is very important that those who collect and use personal data know and communicate the legal reason for processing data. Gaining the consent of users or those affected is one way of legally processing a person’s information, but there are other acceptable legal routes for data collection, such as:

  • Collecting or using personal data to fulfil a contract;
  • Collecting or using personal data to fulfil a legal obligation;
  • Collecting or processing personal data for public interest tasks or an official function;
  • Collecting or processing personal data for a legitimate personal or business interest or the interests of a third party;
  • Collecting or processing personal data to protect life or a vital interest.
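One way an organisation might keep track of this in practice is a simple record of each processing activity and the legal reason chosen for it. The sketch below is illustrative only. The names LawfulBasis, ProcessingActivity and basis_gaps are invented for this example, and the two checks mirror the point above that the legal reason must be both known and communicated. It is not a substitute for legal analysis of whether the chosen reason is actually valid.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LawfulBasis(Enum):
    """The legal routes listed above, as labels for record keeping."""
    CONSENT = "consent"
    CONTRACT = "contract"
    LEGAL_OBLIGATION = "legal obligation"
    PUBLIC_TASK = "public interest task or official function"
    LEGITIMATE_INTERESTS = "legitimate interests"
    VITAL_INTERESTS = "vital interests"


@dataclass
class ProcessingActivity:
    """Hypothetical record of one use of personal data."""
    name: str
    lawful_basis: Optional[LawfulBasis] = None
    communicated_to_individuals: bool = False


def basis_gaps(activity):
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    if activity.lawful_basis is None:
        problems.append("no lawful basis identified")
    if not activity.communicated_to_individuals:
        problems.append("lawful basis not communicated to individuals")
    return problems


scraping = ProcessingActivity(name="collect public profile photos")
print(basis_gaps(scraping))
# prints ['no lawful basis identified', 'lawful basis not communicated to individuals']

newsletter = ProcessingActivity(
    name="send newsletter",
    lawful_basis=LawfulBasis.CONSENT,
    communicated_to_individuals=True,
)
print(basis_gaps(newsletter))
# prints []
```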

3. Failure to have a process in place so that information is not held indefinitely

If there are no processes in place to establish the length of time for retaining personal data, a data protection regulator could find a breach of data protection law. Data retention is important. Personal data should not be kept for longer than necessary.
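A retention process can be as simple as a schedule of time limits and a routine check against it. The sketch below assumes a hypothetical schedule (RETENTION_PERIODS) and record layout, neither of which comes from any regulator or law. The periods shown are placeholders, since the right limit depends on the purpose of the processing. Records in a category with no agreed limit are flagged as well, because that is exactly the situation in which data ends up being held indefinitely.

```python
from datetime import date, timedelta

# Hypothetical retention schedule: category -> maximum holding period
RETENTION_PERIODS = {
    "profile_image": timedelta(days=365),
    "contact_request": timedelta(days=90),
}


def overdue_records(records, today):
    """Return ids of records held too long or with no retention period set."""
    overdue = []
    for record in records:
        limit = RETENTION_PERIODS.get(record["category"])
        age = today - record["collected_on"]
        if limit is None or age > limit:
            overdue.append(record["id"])
    return overdue


records = [
    {"id": "a1", "category": "profile_image", "collected_on": date(2021, 1, 10)},
    {"id": "b2", "category": "profile_image", "collected_on": date(2022, 3, 1)},
    {"id": "c3", "category": "unknown_export", "collected_on": date(2022, 5, 1)},
]

print(overdue_records(records, today=date(2022, 6, 1)))
# prints ['a1', 'c3']
```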

4. Failure to meet the higher data protection standards for biometric data

When collecting biometric data, or any other form of special category personal data or sensitive information, all parties must ensure that they meet the higher standards for processing these data. Collecting and using these data is called high-risk processing because the potential harm to individuals affected by data misuse could be substantial and severe.

5. Making the process hard for those who wish to object to their information and images being used

If a person wishes to find out whether their image is being used or stored, they should have access to a user-friendly and accessible process. Individuals should be allowed to exercise their data protection rights at any time and at little or no expense.

Advisory: Collecting and Using Online Images of Individuals

There are many issues to consider before collecting images of people from the internet:

  • Ensure that there is compliance with the data protection principles in the EU and UK or similar data privacy legal requirements around the world. Clearly identifying a lawful basis for data collection should be one of the first steps. This includes ensuring that all data extraction or copying is in line with the website or platform’s terms and conditions. One solution could be to get permission from the website owner. However, individuals may still object to the copying and use of their image. Objections by the individuals affected should be fully considered and actioned.
  • Users affected should be properly informed about how their personal data will be used and should be able to exercise their rights to access, rectify or delete the information, as necessary.
  • Working with a Data Protection Officer (DPO) or Data Protection Adviser to complete a Data Protection Impact Assessment (DPIA) or Privacy Impact Assessment (PIA) is crucial. All parties should apply Privacy and Data Protection by Design techniques to reduce data protection risks. If the DPIA identifies risks that cannot be resolved, then businesses and organisations may need to consult their data protection regulators before starting to collect images from the internet.
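Some of a website's wishes about automated copying are published in a machine-readable robots.txt file, and checking it is one small, practical step before any data extraction. The sketch below uses Python's standard urllib.robotparser module on a made-up robots.txt. The file contents, the crawler name example-bot and the example.com addresses are all assumptions for illustration. A robots.txt file is not the same as a website's terms and conditions, and passing this check does not amount to permission or to data protection compliance.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real one is served at /robots.txt
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /photos/
Allow: /press/
"""


def may_fetch(robots_txt, user_agent, url):
    """Report whether the given robots.txt rules allow this crawler to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(may_fetch(SAMPLE_ROBOTS_TXT, "example-bot", "https://example.com/photos/alice.jpg"))
# prints False
print(may_fetch(SAMPLE_ROBOTS_TXT, "example-bot", "https://example.com/press/logo.png"))
# prints True
```

Where the check returns False, or where the terms and conditions prohibit copying, the safer course is to stop and seek permission rather than to proceed.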


Care and attention are needed when collecting and using images from the internet for any new purpose, and especially for facial recognition and artificial intelligence activities. Full legal awareness and proper processes and procedures are essential. Without them, regulators could impose fines and order data to be deleted. This would reduce trust, limit business opportunities, curb innovation, be costly and severely damage reputations.

This techAnalysis is produced in association with Johnson May.

PrivacySolved has years of expertise in UK, EU and global data protection and has worked with the key regulators. We also advise on new technology and artificial intelligence compliance. For advice, support, projects and programmes, contact PrivacySolved:

Telephone:  +44 (0) 207 175 9771 (London)

Telephone:  +353 1 960 9370 (Dublin)

Email: contact@privacysolved.com