Clearview AI, the facial recognition company that’s scraped the web for three billion faceprints and sold them all (or given them away) to 600 police departments so they could identify people within seconds, has received yet more cease-and-desist letters from social media giants.
The first came from Twitter. A few weeks ago, Twitter told Clearview to stop collecting its data and to delete whatever it’s got.
Facebook has also demanded that Clearview stop scraping photos because the action violates its policies, and now Google and YouTube are likewise telling the audacious startup to stop violating their policies against data scraping.
Clearview’s take on all this? Defiance. It has a legal right to scrape data, it says.
In an interview on Wednesday with CBS This Morning, Clearview AI founder and CEO Hoan Ton-That told viewers to trust him. The technology is only to be used by law enforcement, and only to identify potential criminals, he said.
The artificial intelligence (AI) program can identify someone by matching photos of unknown people to their online photos and the sites where they were posted. Ton-That claims that the results are 99.6% accurate.
Besides, he said, it’s his right to collect public photos to feed his facial recognition app:
There is also a First Amendment right to public information. So the way we have built our system is to only take publicly available information and index it that way.
Not everybody agrees. Some people think that their facial images shouldn’t be gobbled up without their consent. In fact, the nation’s strictest biometrics privacy law – the Biometric Information Privacy Act (BIPA) – says doing so is illegal. Clearview is already facing a potential class action lawsuit, filed last month, for allegedly violating that law.
YouTube’s statement on its cease-and-desist letter:
YouTube’s Terms of Service explicitly forbid collecting data that can be used to identify a person. Clearview has publicly admitted to doing exactly that, and in response we sent them a cease and desist letter.
As for Facebook, the company said on Tuesday that it has demanded that Clearview stop scraping photos because the action violates its policies. How Clearview responds to Facebook’s review of its practices could prompt the social media behemoth to take further action, Facebook said. Its statement:
We have serious concerns with Clearview’s practices, which is why we’ve requested information as part of our ongoing review. How they respond will determine the next steps we take.
Clearview: It’s just like Google Search – for faces
Besides claiming First-Amendment protection for access to publicly available data, Ton-That also defended Clearview as being a Google-like search engine:
Google can pull in information from all different websites. If it’s public […] and it can be inside Google’s search engine, it can be in ours as well.
Um, no, Google said, your app isn’t like our search engine at all. There’s a big difference between what we do and the way you’re shanghaiing everybody’s face images without their consent. Its statement:
Most websites want to be included in Google Search, and we give webmasters control over what information from their site is included in our search results, including the option to opt-out entirely. Clearview secretly collected image data of individuals without their consent, and in violation of rules explicitly forbidding them from doing so.
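Google’s statement doesn’t name the mechanism it has in mind, but one widely used way for a site to tell crawlers what they may collect is a robots.txt file. What follows is a minimal sketch, using Python’s standard library, of how a well-behaved crawler might check such a file before fetching a page. The robots.txt contents, the bot names (ExampleFaceBot, OtherBot) and the example.com URLs are all hypothetical, and robots.txt is only advisory: nothing in it technically stops a scraper that chooses to ignore it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is shut out entirely,
# everyone else is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleFaceBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleFaceBot", "OtherBot"):
        allowed = may_fetch(ROBOTS_TXT, agent, "https://example.com/photos/1.jpg")
        print(agent, "allowed" if allowed else "blocked")
```

A crawler that honours these rules would skip the site entirely when running as ExampleFaceBot. Whether a given scraper checks the file at all is up to whoever operates it.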
When is public information not public?
Clearview isn’t the first company to make money off of scraping sites. It’s not the first to wind up in court over it, either.
Back in 2016, hiQ, a San Francisco startup, was marketing two products, both of which depend on whatever data LinkedIn’s 500 million members have made public: Keeper, which identifies employees who might be ripe for being recruited away, and Skills Mapper, which summarizes an employee’s skills.
It, too, was going after public information, grabbing the kind of stuff you or I could get on LinkedIn without having to log in. All you need is a browser and a search engine to find the data that hiQ sucks up, digests, analyzes and sells to companies that want a heads-up when their pivotal employees might have one foot out the door, or that are trying to figure out how their workforce needs to be bolstered or trained.
When is public information not public? When the social media firms that collect it insist that it’s not public.
LinkedIn sent a cease-and-desist letter to hiQ, alleging that it was violating serious anti-hacking and copyright laws: the Computer Fraud and Abuse Act (CFAA), the Digital Millennium Copyright Act (DMCA), and California Penal Code § 502(c). LinkedIn (which had been exploring how to do the same thing that hiQ had achieved) also noted that it had used technology to block hiQ from accessing its data.
A done deal? Not in the eyes of the courts. In September 2019, an appeals court told LinkedIn to back off: no more interfering with hiQ’s profiting from its users’ publicly available data. The court protected the scraping of public data, which sounds like a major legal precedent but is a lot muddier than that. From the Electronic Frontier Foundation (EFF):
While this decision represents an important step to putting limits on using the CFAA to intimidate researchers with the legalese of cease and desist letters, the Ninth Circuit sadly left the door open to other claims, such as trespass to chattels or even copyright infringement, that might allow actors like LinkedIn to limit competition with its products.
And even with this ruling, the CFAA is subject to multiple conflicting interpretations across the federal circuits, making it likely that the Supreme Court will eventually be forced to resolve the meaning of key terms like ‘without authorization.’
Those cases of data scraping pitted the lovers of an open internet against the companies trying to control (and make money from) their own data. During the fight with hiQ, LinkedIn was accused of chilling access to information online. Some said that LinkedIn’s position would impact journalists, researchers, and watchdog organizations who rely on automated tools – including scrapers – to support their work, much of which is protected First Amendment activity.
Muddy as the September ruling was, the EFF hailed it as a win for the right to scrape public data.
But while groups such as the EFF were all for data scraping to get at publicly available data in the case of hiQ, they’re not on Clearview’s side. On Thursday, the EFF said that when it comes to biometrics, companies should be getting informed opt-in consent to collect our faceprints:
We need to require private companies that collect, use, retain, or share information about us – including our face pr…
EFF (@EFF), February 6, 2020
In fact, Clearview is the latest example of why we need laws that ban, or at least pause, law enforcement’s secretive use of facial recognition, according to the EFF’s surveillance litigation director, Jennifer Lynch. She cited numerous cases of what she called law enforcement’s – and Clearview’s own – abuse of facial recognition, stating:
Police abuse of facial recognition technology is not theoretical: it’s happening today. Law enforcement has already used ‘live’ face recognition on public streets and at political protests.