RICK WEISS: Hi, everyone. Rick Weiss here, director of SciLine. Welcome to this latest media briefing. I want to take one minute upfront just for those of you who may not be familiar with SciLine to know who you are working with here. We are a philanthropically funded, entirely free nonprofit service based at the American Association for the Advancement of Science in Washington, D.C., with one overarching goal, which is to help get more research-based evidence into your news stories. We know that journalism is changing fast. We know that not everyone who’s covering science today has a deep science background or has time to find the science sources that would help you get more evidence into your stories.
We offer a variety of free services to help you do that, including our matching service by which you just get in touch with us, tell us what you’re working on and we will help connect you to one or more experts who are not only excellent in their research discipline but have been vetted for their communication skills as well. We also run all-expenses-paid boot camps for journalists to get you up to speed on different areas of topics and provide on our website a variety of fact sheets that are designed just for you – hair-on-fire, on-deadline journalists who just want to get the facts fast. They’re produced in-house. They’re all vetted by outside experts before we post them, so you can trust them and use that information in your stories. We also, of course, run these media briefings, which we will start in just a moment here. They include, typically, three experts who will speak briefly because we know that the most important thing you want to do is ask questions. And then we’ll open it up for Q&A. As Josh mentioned, you can type in your questions at the end of the briefing, and I will be reading them aloud to our experts.
The bios for today’s experts are on the SciLine website, so I’m not going to really spend time to go through them now. I will just mention briefly that the order of events will be that – we will hear first from Alice O’Toole, who is a professor in the school of behavioral and brain sciences at the University of Texas at Dallas, who will give us a little bit of an introduction to facial recognition technology and some of the evolution of it and where we’re at today, some of the issues arising from that technology. Dr. Patrick Grother, next, is a scientist at NIST, the National Institute of Standards and Technology, a division of the U.S. Commerce Department that is all about metrics and has, over the last several years, been doing some incredible work to measure the accuracy of facial recognition and will help us understand what the word accuracy really means in that case. And we’ll talk in particular somewhat about the issue of bias, which is a common subtheme within stories on this technology. And last, Dr. Nita Farahany, professor of law and philosophy at Duke University School of Law, founding director of Duke’s Science & Society and chair of Duke’s bioethics and science master’s program there – science policy master’s program there, who will talk about some of the social, legal, ethical issues that are being raised by this new technology. And with that, let’s just turn it over to Dr. O’Toole. Alice, it’s all yours.
ALICE O’TOOLE: OK, so I’m sharing my screen now. Let me know when you can see that.
RICK WEISS: Looks good.
Facial Recognition: Humans vs. Machines
ALICE O’TOOLE: OK. So I’m going to address a question that we hear asked a lot these days. How accurate is computer-based face recognition? So we actually know a lot about this, primarily from tests that have been done at the National Institute of Standards and Technology, and Dr. Grother’s going to talk about some of that, I’m sure. My perspective is a bit different. My work compares computer-based face recognition to humans, which is a system we all sort of know something about. So what I’m going to try and do in the short time here is first introduce how we make these comparisons. Then we’re going to go back in time 10 years and talk about how machines were 10 years ago vis-a-vis human metrics and then fast-forward to the present, do an update on that. And then I’ll just end with some – just a very quick perspective on how we think about the challenges of recognizing faces of different races by machines and a little bit by humans. OK.
So with that, let’s start with the comparisons. So a subject in my lab would be in an experiment like this. They would see pairs of images of this sort. So some of them are pictures of the same people. Some are pictures of different people. And they would be required here to give an answer as to how sure they are that the person is the same person or different people. And so this would be their rating. So here we see a picture – a pair of pictures of the same person, and so hopefully, they would say they’re the same. So the machine would take an image – take the first image, for example – push it through its processing and produce a representation. And then it would take the second image in the pair, push that through and produce a second representation and suffice to simply measure the similarity of the representations produced by these two – by the two images through the machine. And you have a measure that’s really easily comparable to the measure that humans give, OK? And so you would need to set a threshold or a criterion and say, anything more than this number we’ll call the same person. Anything less than this number we’ll call different people. OK, so we’ve used this technique many, many times in my lab.
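As a rough illustration of the machine side of the comparison described above, here is a minimal Python sketch. It assumes each image has already been turned into a numeric representation. The three short vectors, the use of cosine similarity, and the 0.8 threshold are all hypothetical choices made for illustration, not details given in the briefing.

```python
import math

def cosine_similarity(rep_a, rep_b):
    """Cosine similarity between two equal-length representations."""
    dot = sum(x * y for x, y in zip(rep_a, rep_b))
    norm_a = math.sqrt(sum(x * x for x in rep_a))
    norm_b = math.sqrt(sum(y * y for y in rep_b))
    return dot / (norm_a * norm_b)

def same_person(rep_a, rep_b, threshold=0.8):
    """Anything above the threshold is called 'same person'; anything else, 'different people'."""
    return cosine_similarity(rep_a, rep_b) > threshold

# Hypothetical representations (invented numbers, not real face data).
image_1 = [0.9, 0.1, 0.3]
image_2 = [0.8, 0.2, 0.35]  # built to resemble image_1
image_3 = [0.1, 0.9, 0.2]   # built to differ from image_1

print(same_person(image_1, image_2))  # True
print(same_person(image_1, image_3))  # False
```

Moving the threshold up or down changes which pairs are called the same person, which is why the criterion has to be chosen deliberately.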
Let’s just focus in on a test we did about 10 years ago on humans and algorithms at the time. So what you’re looking at on your screen is six pictures of the same person. And these NIST was able to divide – National Institute of Standards and Technology was able to divide into pictures that were thought to be very – or not very challenging at all for machines circa 10 years ago. And you can see these are very similar images. More challenging images change the illumination here and the expression, and really very challenging images change lots of things all at the same time. So these are quite challenging to compare. So when we did the human-machine comparison 10 years ago, what we found were the good – the pairs of images that were rated to be relatively easy to compare – is that the machine was far better than the humans at these. As the task got a little bit more difficult, the machine was still quite a bit better than humans at this task, and only when we went to these really difficult images was the performance of humans and machines about equal. OK, so let’s update that to 2018. And what I’ll say before I put up that graph is, first of all, over many years of doing these comparisons, we were able to pick out image pairs that are really challenging. And so we used very challenging image pairs in this comparison we did in 2018.
So the other thing I’ll say is circa, you know, 2012, 2013, 2014, a new algorithm came onto the scene. And it has been applied to many tasks in computer vision. And I won’t say anything else about the algorithm except to say that very quickly, performance got quite a bit better than previous generation algorithms. So let me show you the comparison we did recently. And so to orient you to the graph, random performance is at 0.5. Perfect performance is at one. And we tested several groups of human subjects – and I’ll talk about those in a second – plus a bunch of algorithms, four algorithms. OK, so beginning with data on students – so this is our proxy for the standard, normal person. What we see in this graph is every black dot is the performance of a single subject in the task. This red dot is the median performance. And so you can see this is, in fact, a challenging task. Normal people do better than chance, but they’re far from perfect. So then we tested a group of specialists, face – forensic face recognition examiners. These are people who testify in court. They hail from five continents. And we see that these guys are really very good at this task. The median is way up here. There’s lots of people up in this range. And then the new and disturbing thing about these data is that we do have some national professionals who – super-recognizers. You may have heard of these in recent years, yeah?
RICK WEISS: Alice, your audio is going a little bit bad here.
ALICE O’TOOLE: Oh, is that a little better?
RICK WEISS: That’s better.
ALICE O’TOOLE: I’ll sit closer to the camera. OK, so super-recognizers are people with no forensic training but with some talent for face recognition. They actually do very well on this task as well. We also tested a group of fingerprint examiners, and the reason for testing them is they have forensic training but they don’t know anything in particular about faces. And they’re certainly better than the students but not as accurate as the experts on faces. So when we look at the algorithms beginning with an algorithm in 2015, it performed about at the level of students. An algorithm from 2016 hops up to fingerprint examiners. An algorithm from 2017 is really at the level of super-recognizers. And then the latest algorithm we were able to test in this comparison is actually at the level of the best humans. So this was very interesting to us, but honestly, the most interesting result from this set of experiments and simulations was what happened when we combined the human and the computer. So if you allowed the judgments of the humans to be combined with the judgments of the best algorithm here – this one – performance there actually gave us a median very close to perfect.
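The following Python sketch illustrates, under stated assumptions, how a score with chance at 0.5 and perfect at 1 can be computed and how human and machine judgments might be combined. It assumes the accuracy measure is the area under the ROC curve and that fusion is a simple average of two judgments on the same rating scale. Neither detail is confirmed in the briefing, and every number below is invented.

```python
def auc(same_scores, diff_scores):
    """Fraction of (same-person, different-person) pairings in which the
    same-person pair gets the higher score; ties count as half."""
    wins = 0.0
    for s in same_scores:
        for d in diff_scores:
            if s > d:
                wins += 1.0
            elif s == d:
                wins += 0.5
    return wins / (len(same_scores) * len(diff_scores))

def fuse(human, machine):
    """Average two sets of judgments pair by pair (assumes a shared scale)."""
    return [(h + m) / 2 for h, m in zip(human, machine)]

# Hypothetical ratings from -3 (sure different) to +3 (sure same).
human_same, human_diff = [2, -1, 3, 1], [-2, 0, -3, -1]
machine_same, machine_diff = [1, 2, -1, 2], [-1, -2, 0, -3]

print(auc(human_same, human_diff))      # 0.90625
print(auc(machine_same, machine_diff))  # 0.90625
print(auc(fuse(human_same, machine_same),
          fuse(human_diff, machine_diff)))  # 1.0
```

In this toy data the human and the machine each stumble on a different image pair, so averaging lets one cover for the other. That is one plausible reason a combination can beat either alone.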
So the combination of the human and the machine working together was better than the best machine alone and any of the best examiners alone. OK. So that’s our sort of best performance right now. So the last thing I’ll end with are some myths on face recognition accuracy across race. And so I don’t have long here, so I’m just going to put these up, and I guess we can talk a little bit about them in the questions. So the first myth is that face identification would be fair if we eliminated the machines. So as a psychologist – and that’s actually my training. As a psychologist, I can assure you we know for 50 years that humans doing face recognition are not fair. It’s long been known that people are more accurate on faces of their own race. This has been replicated dozens if not over 100 times. So getting rid of the machines doesn’t solve the bias problem. The second myth is that face recognition systems prior to 2015, which is the new generation of algorithms, were fair. And this I know for sure is not true from many, many studies, including several of these in my lab. Every generation of face recognition algorithm since the first paper we wrote on it in 1991 shows some differential performance as a function of the race of a face. The third myth is that race is categorical and we know what those categories are. So in biological terms, certainly, race is not categorical. And it is tempting to think of, you know, faces of, you know, categories that may or may not be representative of people as a whole. There are many individuals in the world of mixed-race descent.
And so my concern in trying to make categories where categories are very much artificial – if you engineer your systems too much to particular categories, you very much risk missing out on or disadvantaging people who are of mixed-race background or people whose category was not selected to be optimized. And the last myth is that one face is as recognizable as any other. And we know, as psychologists, since the late ’70s here, some faces are simply easier to remember and recognize than others. Think Mick Jagger, think Meryl Streep – they have lots of distinguishing features that make them more – quote-unquote, “more unique, more distinctive” than other people. And so it will never be the case that any face recognition system, be it human or machine, will be equally good at every person’s face that we can think about. And then I’d like to put this slide in. There’s lots of published work on this. The ones that – so this summarizes the recent human-machine comparisons. And that’s a big group of people, including people at NIST, people at Maryland and people in my group – a recent paper on demographic challenges and then a lot of these other sources for your reference if they’re helpful. So I should stop sharing correct, right?
RICK WEISS: Right. Thank you, Alice. That was great. And I want to remind everyone that all these slides will be available on our website afterwards. So you’ll be able to check out those references as well. OK. Patrick, you’re up.
Error and Bias in Facial Recognition Algorithms
PATRICK GROTHER: OK. Let me make this full screen. Hopefully you can see that. Hello. This is – I’ll cover some of the same material that Alice just covered and maybe in a slightly different way. The first point on this slide is that I work for the Department of Commerce in a lab. We do not do policy and regulation; we do measurement. In the context of today’s talk, it’s about face recognition performance. Face recognition algorithms, face recognition systems, as they’re deployed, are tasked with two different things. So this slide is on the first one, which is this idea of one-to-one verification. So you get two photographs of me at the top, and if a face recognition system doesn’t successfully match those together, then you would have a false negative. So the system is answering, is it the same person or not? And that decision theory, that decision process occurs in all sorts of domains that we encounter pretty much every day. So a radiologist would look at a CT scan and say cancer or not or appendicitis or not. A soldier might look at some kind of photograph on a battlefield and say, is that a Russian tank or not? Pharmaceutical purity might be assessed – is it a fake drug or not? Biometrics facial recognition is saying, is it the same face or not?
So the top part is – would be a false negative if it didn’t match those two. At the bottom, we’ve got two different individuals – in this case, sisters. And if they – if a system puts those two together and says it is the same person, that’s the other kind of mistake, the type 1 error, which is a false positive. So the key point about this slide is in any kind of decision theory, you end up with false negatives and false positives. Now, there are systems fielded like this in border control and in your cellphone, perhaps, that make these kind of decisions every day using face recognition. The larger marketplace segment for face recognition is these so-called one-to-many identification systems, where in the top row here, a search photograph – in this case, me – is searched against the gallery of photographs, which in this case is quite small, but in real-world operations extends to the tens or hundreds of millions. And the idea is that you should find me – the needle in the haystack. If you don’t find me, then that would be a false negative. You didn’t find something when you should have.
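To make the one-to-many case concrete, here is a small Python sketch of a gallery search. The gallery, the two-number representations, the distance-based similarity function and the thresholds are all hypothetical. Real galleries, as noted above, run to tens or hundreds of millions of photographs.

```python
import math

def similarity(rep_a, rep_b):
    """Map Euclidean distance to a score in (0, 1]; identical vectors score 1."""
    return 1.0 / (1.0 + math.dist(rep_a, rep_b))

def search(probe, gallery, threshold):
    """Return gallery names scoring at or above the threshold, best match first."""
    hits = [(similarity(probe, rep), name) for name, rep in gallery.items()]
    hits = [hit for hit in hits if hit[0] >= threshold]
    return [name for _, name in sorted(hits, reverse=True)]

# Hypothetical gallery of enrolled people (invented numbers).
gallery = {
    "person_a": [0.9, 0.1],
    "person_b": [0.2, 0.8],
    "person_c": [0.5, 0.5],
}
enrolled_probe = [0.85, 0.15]  # a new photo of person_a
outsider_probe = [0.0, 0.0]    # someone who is not in the gallery

print(search(enrolled_probe, gallery, 0.9))   # ['person_a']: the needle is found
print(search(enrolled_probe, gallery, 0.95))  # []: a false negative
print(search(outsider_probe, gallery, 0.5))   # three names returned: false positives
```

The same search produces a correct hit, a false negative or false positives depending only on where the threshold sits and on whether the person searched is in the gallery at all.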
The police successfully used such software in the investigation of the Annapolis newspaper office shooting, which is – I think, is an ongoing case. And the suspect didn’t carry documents. They searched his face against a Maryland database, and in that case, they got the right answer. So it wasn’t a false negative. The other kind of error that you can get from one-to-many identification systems is when you search somebody against the database that they’re not in. So if you want to search me against, say, a French passport database, I’m not in there. If the system returned anything, that would be a false positive, and that is a mistake. Both of these kinds of mistakes occur – false negatives and false positives. There was a case, I think, in 2017 – oh, in 2011, sorry – in Massachusetts, where now sort of an older system mistakenly matched somebody to a fraud list in the driver’s license domain and made a false positive. So false positives do occur. When we get to a reporting of face recognition – one of the bullets on this slide is talking about, to be – to ensure that we talk about false positives and false negatives. Before we get to that, we’ll sort of back up a little bit and say if we’re talking about face recognition, are we actually talking about recognition or about some other application, like classification? If you’re trying to guess the age or gender of somebody, that’s a classification task; it’s not recognition.
There’s been some confusion on that in the press over the last couple of years. How is face recognition being used? Sometimes there’s confusion. Is it verification, this one-to-one task? Is it identification? Or is it actually something else? Another point that we sometimes see is that blanket statements are usually wrong. So if somebody says face recognition doesn’t work or face recognition does work, well, that needs to be qualified, because algorithms vary in their capabilities. It’s a buyer beware circumstance. These are not commoditized technologies. Accuracy also varies by the kind of images used. If we turn off the lights in the room, quality will degrade. Recognition performance will degrade. Accuracy also varies by demographic group, and I’ll cover that again in a second. So what we shouldn’t do is average across multiple algorithms. Typically, systems are fielded with one algorithm on board, so we should talk about that algorithm. False negatives, false positives – there are other kinds of errors that we talk about – failed detection, failed quality assessment. That’s happened in operations. Quoting one number is not usually enough.
Way back in 2002, The New York Times covered a report that we’d written that said that accuracy was 52%. Because of the trade-off between false positives and false negatives, it’s usually not sufficient to talk about one number. You have to report two and to differentiate between false positives and false negatives. Another key point is talking about the impact of an error. So we can say face recognition, say, is inaccurate in a certain case, but what are the impacts of that? If we talk about that CAT scan – the sort of the medical domain, a false negative on a CT scan might go to somebody’s cancer being missed. A false positive would go to some kind of, you know, worry and resolution of some, you know, case for the patient. So the impact is heavily application-dependent. That applies in biometrics as well as medicine. And false positives tend to have radically different implications than do false negatives. The last point on this slide is that magnitude matters. Some errors are really quite small. Are they small enough? And that is application-dependent. The last line I’ve got here is that there’s been a massive expansion of the industry. There is a vibrant sort of developer community vying for supremacy in this. Japan, China, U.S. and Russia are the foremost developers leading contemporary algorithms. They’re very accurate on high-quality images, but you can always degrade images. The algorithms tolerate poor-quality images, but only so far. Some applications are sensitive to false negatives, and that quality things – the quality maintenance – becomes important.
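The trade-off between false positives and false negatives mentioned above can be sketched in a few lines of Python. The similarity scores are invented for illustration. The point is only that one set of comparisons yields different pairs of error rates as the threshold moves, which is why a single accuracy number says little.

```python
def error_rates(same_scores, diff_scores, threshold):
    """Return (false negative rate, false positive rate) at a given threshold.

    same_scores: similarity scores for pairs that truly show the same person.
    diff_scores: similarity scores for pairs that truly show different people.
    """
    false_neg = sum(1 for s in same_scores if s < threshold)
    false_pos = sum(1 for s in diff_scores if s >= threshold)
    return false_neg / len(same_scores), false_pos / len(diff_scores)

# Hypothetical scores (not measurements from any real system).
same_scores = [0.95, 0.9, 0.8, 0.7, 0.55]
diff_scores = [0.75, 0.6, 0.4, 0.3, 0.1]

for threshold in (0.5, 0.65, 0.85):
    fnr, fpr = error_rates(same_scores, diff_scores, threshold)
    print(f"threshold={threshold}: false negatives={fnr:.0%}, false positives={fpr:.0%}")
```

With these toy scores, a loose threshold gives no false negatives but 40% false positives, and a strict one gives the reverse pattern. Which point is acceptable depends on the application.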
Many cameras are still being used that don’t know what a face is. They’re not aware of the signal that they’re looking for. That’s unfortunate. In the demographic realm, this is a newsworthy thing. We wrote a report December of last year that showed briefly that false positives are much more significant and much larger in magnitude than false negative variations between demographics. So we see higher false positives in women and the old and the young. We see large variations in false positives by country of origin, which I’ve used the word race here, but it’s by country of origin – where those people were born. And sort of a forgotten demographic is that twins are not separable by most contemporary algorithms. They cause false positives. So the algorithm matters. Better accuracy overall will give you smaller inequities. Some Chinese algorithms don’t exhibit the same bias against Asian faces that Caucasian-, Western-developed algorithms do. Some one-to-many algorithms – the search algorithms do mitigate differentials like this. So the watch – the key sort of takeaway is that any user or prospective user – they should know their algorithm. And with that, I’ll stop.
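One way to picture a demographic differential in false positives is to fix a single threshold and count false positives separately for each group. The Python sketch below does that with invented group labels and scores. It is not NIST's method or data, only an illustration of the idea.

```python
from collections import defaultdict

def false_positive_rate_by_group(comparisons, threshold):
    """comparisons: (group, score) tuples for pairs known to show different people.
    Returns the share of each group's pairs wrongly called a match."""
    totals = defaultdict(int)
    false_pos = defaultdict(int)
    for group, score in comparisons:
        totals[group] += 1
        if score >= threshold:
            false_pos[group] += 1
    return {group: false_pos[group] / totals[group] for group in totals}

# Hypothetical different-person comparisons (labels and scores are invented).
comparisons = [
    ("group_x", 0.2), ("group_x", 0.3), ("group_x", 0.7), ("group_x", 0.4),
    ("group_y", 0.1), ("group_y", 0.2), ("group_y", 0.3), ("group_y", 0.5),
]

print(false_positive_rate_by_group(comparisons, threshold=0.6))
# {'group_x': 0.25, 'group_y': 0.0}
```

A gap between groups at the same threshold is the kind of differential described above. Running the same check on a different algorithm could give a different gap, which is the sense in which users should know their algorithm.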
RICK WEISS: Thank you, Patrick. Lots of things we can dig into there in the Q&A. And I’ll remind people here that even in advance of the end of this, if you want to start entering any questions, feel free to use the Q&A box at the bottom – I believe at the bottom of your screen – to get those into the queue. And we’ll turn it over now to Dr. Nita Farahany. Nita.
The Legal and Ethical Implications of Facial Recognition Technology
NITA FARAHANY: Yes, hi. Let me just make sure you guys can see. I want that one. I want slideshow. All right.
RICK WEISS: Perfect.
NITA FARAHANY: Is that OK? Great. So just briefly, I’m going to touch on some of the ethical, legal and social implications of facial recognition technology, which obviously has been a big source of conversation in recent days as Clearview Technology has become kind of front and center in the news after the New York Times article that was there on it. I want to talk just about three broad categories. One is this idea of policing by consent and why facial recognition technology seems to run at odds with that for people, the unintended consequences of facial recognition technology, and then just spend a little bit of time on the patchwork of laws and regulations because they’re really kind of all over the map – no pun intended – about the way in which people, states, cities, police departments, et cetera are actually approaching the issues. So to begin with, this concept of policing by consent – so the idea is that the power of the police, at least within this country, in the United States, comes from the people. Constitutionally, the people consent to the use of force, the use of surveillance, all in the name of public safety.
And so police have to negotiate daily the tightrope between regard for liberty and the use of coercive powers, all while maintaining the trust of the people and public support. And as we see areas in which that kind of tightrope that they don’t walk well, that’s where you see riots and pushback and problems because of concerns about the legitimacy of the police force. Facial recognition technology has – for many people, it seems to cross the boundaries of policing by consent. And in particular, we expect, for example, that any one of us could be a suspect if we have committed a crime or we might end up in a lineup if we’ve been convicted of a crime and could be hauled in for that purpose. But you don’t expect to be hauled into the police station and to be part of a lineup unless there’s some reason to suspect that you committed a crime.
And the idea of being able to search a database in which every single one of us may be present instead of having a lineup in which we’re brought in based on suspicion makes all of us part of suspicionless searches in a virtual lineup all day long. And there are significant concerns over the deployment of this technology not only because it brings all of us into these virtual lineups but because of some of the things that my co-panelists are talking about, like algorithmic bias, its accuracy and reliability, especially in matching diverse facial characteristics, together with this idea of intruding into the public sphere. That is, people’s private lives suddenly become part of their public sphere even though they intended only for it to be part of their private lives. Much of this is also happening without transparency. For example, Maine is one of two states to have a specific law which was inspired by Cold War-era secrets that says that officials neither have to confirm nor deny the use of digital technologies that may help solve crimes. And so when there have been questions posed to police – Maine State Police – they have responded by saying they refuse to answer and they don’t need to confirm nor deny those. That’s problematic for this idea of policing by consent.
The second is this broad concern about unintended consequences. And here, there are the concerns that it’ll be used in ways well beyond just finding a high-priority suspect, so, for example, chilling effects. For people who may be in demonstrations, they’re afraid to be in demonstrations for fear of being recognized, for fear of that leading to retaliation simply for exercising their First Amendment rights, or fear of stigmatization, when facial recognition technology is used in public spaces everywhere that a person goes, such as to visit a mental health professional, to a pharmacy or a clinic – are all recorded by facial recognition technology, in addition to recording the people who they associate with, revealing, for example, that a person is LGBT even though they haven’t come out formally to anybody because of the association of the people with whom they keep company. The combination of location plus association makes it so troubling for people. So the fact that GPS technology enables the tracking at all times, plus the use of these public cameras means that you get not just where a person is but who they’re with and what they’re doing in those places. It also leads to some interesting consequences, like driving the digital world to the physical world.
For example, a lot of sex workers have moved their activities online, where it’s a safer space because it allows them to be compensated in ways that exist within a digital medium rather than a physical medium. As facial recognition technology is used for dragnets in populations such as that, it’s driving them back into the physical world where they’re less safe. So it can have unintended safety consequences for individuals, including for immigrant populations who are afraid that, for example, databases of IDs are being used to scan for and identify people who are illegally within the country, using it for much broader purposes for dragnet purposes for identifying those individuals. It may be used in settings such as education and employment, leading to persecution of individuals. It’s already deployed in a number of educational settings to try to identify people who shouldn’t be there.
But the worry is that that creates an even greater chilling effect of association and other types of things in those settings. So the last area I want to talk on briefly is to talk about the patchwork of laws and regulations, with each of these photographs representing a different way in which this is being regulated. So for states, there are 21 states and D.C. which allow federal investigators to scan driver license photos. The FBI has access to more than 641 million faces across local state and federal databases. And while many states are enabling this, three states have gone forward to ban facial recognition technology used in police body cameras, like Oregon, New Hampshire and, most recently, California. Cities and city-level restrictions may exist in San Francisco and Oakland, Calif. Brookline, Cambridge, Northampton and Somerville, Mass., have all banned the use of facial recognition technology by city agencies. The city council in Portland, Ore., has proposed going a step further, banning the technology in both public and private sectors. Consent may play a part in a number of these different states. So Texas law, like Illinois, requires individuals or companies who collect biometric data to inform individuals before capturing the biometric identifier for them. And some states are capturing facial recognition technology under their privacy laws or their data protection laws. Like GDPR, for example, in Europe would apply to it, but also the more recent California privacy law includes biometric information and limitations on its collection.
Some states have just biometric-based laws where they specifically call out Biometric Information Privacy Act, for example, or other types of biometric limitations which say specific – create specific rules about how they can be used. Law enforcement agencies in many other cities have also taken a stance on technology. Some, like the Seattle Police Department, have stopped using facial recognition technology. Locally where I am, just a few days ago, Raleigh decided to discontinue its relationship with Clearview after the public outcry about it. Also, a number of companies are issuing terms of service violations, claiming things like cease and desist rules apply for breach of contract. The final thing I want to touch on just briefly to introduce it so that we can talk about it more in Q&A is the role that the Fourth Amendment of the U.S. Constitution plays in regulating this. Traditionally, the Fourth Amendment has said that things that are identifying features of an individual that they voluntarily expose to the public, like their face, are not things that are subject to search. That means that the police don’t have to have a warrant to search your face because you voluntarily expose it to the public all day long, so you don’t have a privacy interest in your face. That is the existing law, which means that searching things like a facial recognition database may not be considered a search for Fourth Amendment purposes and therefore not violating a person’s Fourth Amendment rights.
But that will be an interesting area to kind of follow and watch because that was true about our location as well, where it used to be the case that tracking your location or following you because you voluntarily go from place to place isn’t something subject to Fourth Amendment protections until GPS technology enabled attaching a small device to a vehicle following a person 24 hours a day, seven days a week such that the U.S. Supreme Court said this difference of degree, this ubiquitousness (ph) of the ability to follow a person changes the Fourth Amendment analysis such that now GPS technology, when done 24 hours a day, seven days a week following you from public to private places is now a Fourth Amendment privacy interest and does require the police to actually get a warrant before attaching a GPS device to your car. That may be what we see happens with the broadening scope of facial recognition technology, but we’re not there yet. So as of today, there’s a big question mark about how the U.S. Supreme Court and how other courts would address the question of the applicability of the Fourth Amendment to facial recognition scans. And I look forward to your questions. Thanks.
RICK WEISS: Thank you, Nita – really interesting, another issue the Founding Fathers could never have imagined. And we’ll see how the courts decide to pull that together in the years ahead. OK, we’re going to go to Q&A here. Again, if you have questions, please type them into your Q&A box, and we will start reading those to the group here.
Are there ways to determine where and when facial recognition is being used in a community?
RICK WEISS: I have one right off the bat from Jill Draper, a freelancer who asks that – various media sources have encountered government officials being secretive and sometimes lying about their use of facial recognition. The ACLU is suing federal agencies over this. Are there any ways to determine where or when facial recognition is being used in one’s community?
NITA FARAHANY: Yeah. So, I mean, the problem is – the answer is yes, but the difficulty is the timeline of your story and getting the answers. So for longer investigative journalism pieces, it’s possible to submit FOIA requests, and I’ll include an example that Rick can share with you guys. There was an interesting exchange with the Durham Police Department here in North Carolina where somebody submitted – a journalist submitted a FOIA request. The request was not worded carefully enough, and so the police were able to respond and say, we have nothing responsive to your request. When the language of the request was changed slightly such that it would encompass any kind of contract negotiated, pending or being contemplated and discussed, it captured a broader set of communications that were then released to the journalist. So really, I think the best check and the only check is all of you being willing to place media pressure on federal agencies, on states, on public entities that are subject to disclosure laws and requesting that they disclose them. As I mentioned, Maine is one of two states that has these nondisclosure laws, but every other state is subject to disclosure of the use of this technology by police forces, including the federal government.
How are police departments using the 2015 NIST report on facial recognition technology?
RICK WEISS: Thank you. Question here directed to Chris – sorry – to Dr. Grother. Christopher Damien from the Desert Sun says, I had a municipal police department point me at NIST’s 2015 publication “Data Format for the Interchange of Fingerprint, Facial and Other Biometric Information” when I asked how he advises businesses on which security cameras to install. Is that the intended purpose of this NIST publication? If not, what is it intended to be used for? Does this reference mean that the police department is using security camera footage with any particular type of biometric analysis technology?
PATRICK GROTHER: That standard is something that we developed in consultation with industry and other government agencies over many years going back to the mid-1980s. And it really is a biometric data interchange format originally for fingerprints, and it was extended later to have faces and tattoos and latent fingerprints and some other things – DNA. That standard doesn’t, as far as I know, regulate anything to do with video surveillance. There’s no format within that standard for video cameras or video interchange formats. So it’s mostly to do with bread-and-butter routine fingerprinting and mug shot capture. Yeah.
Do some cameras have a greater potential for use in facial recognition than others?
RICK WEISS: Maybe I’ll just add to that question because you made a reference earlier to the fact that some police departments or other institutions are using cameras that still cannot recognize faces. Is that a built-in software feature in some cameras and not others – that they have the potential to do a better job with face recognition than others?
PATRICK GROTHER: So you’re familiar with, you know, most contemporary cellphones and point-and-shoot cameras are aware of what they’re looking at. And they’re trying to take a nice photograph for social media or for whatever. They’re not really trying to take a photograph for biometrics. So contemporary face recognition algorithms are using low resolution. And they want frontal, sort of passport-style photos, but quite low resolution. And there are potential benefits for doing a better job with cameras that are able to collect at higher resolution and potentially to distinguish between twins, which contemporary algorithms and systems don’t do.
What do scientists know about the ability of other species to recognize human faces?
RICK WEISS: Question here for Dr. O’Toole – what do scientists know about the ability of other species to recognize human faces? We heard about humans and machines.
ALICE O’TOOLE: They know that in a lot of cases – I mean, it’s always a question of degree. So face recognition has been shown in other species. And certainly, there’s a fair bit of data on macaques and so on about their ability to distinguish faces by – faces of other animals of their species and human faces. But it’s always a question of degree. I mean, human recognition ability is remarkable in the sense that I can show a picture of someone you know from the front, from a side view, laughing, smiling, 10 years ago, and we are able to do a great deal more generalization. It is possible other species are able to do this, but those tests have not really been done.
What is the legal level of privacy people can expect when out in public?
RICK WEISS: Question here from The New York Times – the Supreme Court cited one of my stories in the Jeep GPS case that Nita mentioned as per the expectation of privacy. Is there any analogous expectation of privacy when walking down a street?
NITA FARAHANY: Meaning, do you have an expectation of privacy similar to the Carpenter case? That’s how I’ll interpret that. And in general, the answer is no, right? So what was unique and interesting about the Carpenter case and what was unique and interesting about the decision by the court to say that the Fourth Amendment applies is to say that a difference in degree can change what has been the traditional standard. The traditional standard is, anything you do in public in plain view is not subject to the protections of the Fourth Amendment of the U.S. Constitution. And it’s in this kind of strange legalistic way, which is, it’s not a search if it’s something that you can see in plain view. So it’s a very – of course it’s a search in common parlance, but it’s not a search according to the Supreme Court. And the Carpenter case said, no, no, it is a search if it happens all the time.
If you’re just walking down the street and a picture is captured of you in a single instance and not everywhere you go and it’s not complete surveillance 24/7, it may not rise to the level of difference in degree that the court said made GPS tracking a search for Fourth Amendment purposes. I think the more widespread the technology becomes, the more likely it will violate an expectation of privacy, which is a little counterintuitive because the more common something becomes, the less we expect it to be private. But in this instance, if everywhere you go at all times is being tracked and everyone you associate with is at all times being photographed, I think the average person would feel like that’s quite intrusive. That’s how it rises to the level of a Fourth Amendment challenge. And I think even this court, a conservative court, would be likely to say that under the Carpenter precedents that that’s a Fourth Amendment violation. But it remains to be seen.
Are there scenarios where surveillance using facial recognition technology constitutes a Fourth Amendment violation?
RICK WEISS: Is there any sense from the litigation so far what it would take to hit a threshold like that? Would you have to be – you know, some camera on you at least 10 hours of every waking day to hit that line of, yeah, you are just always being tracked?
NITA FARAHANY: I suspect it’s not so much time as it was the GPS. I think the 24/7 – part of it was time. But part of it is that people were being followed into private places, when they were home or when they weren’t, which is traditionally part of your private domain. It’s – the most sensitive place is the home, traditionally. And so when the police come into your home through GPS tracking, it violates this public-private divide. And I think when facial recognition technology goes from street corners to more private locations – you know, every time you walk into a mental health clinic, every time you go into a place that you traditionally thought you had a more private right of access – I think it’s – when that transition occurs is when I think we’re more likely to see courts saying this really does violate the Fourth Amendment.
What is being done to address face-recognition algorithm biases related to gender and race?
RICK WEISS: Thank you. A question here from Tasha Williams (ph), a freelance writer. What is being done to acknowledge and address algorithm biases related to gender and race? How does the status quo impact already marginalized communities? Maybe, Alice and Patrick, you might want to each address that.
ALICE O’TOOLE: I can just say it’s a very active area of research to be able to address these issues in the algorithms, for sure. That said, I – from what I know, but maybe Patrick will agree with me, I don’t think an easy solution is around the corner. When models were really simple, people thought it was just a question of training the networks with the right proportion of Category A and Category B and Category C, and then you’d get equal performance. I think most people now think that with these more complicated algorithms, that’s going to be maybe part of it, but it is not going to be the entire story. So the newer algorithms require quite a bit of training with very, very large numbers of identities and pictures of identities, and sometimes it has been difficult to get those kinds of datasets to do the training in equal measure. But there’s also a sense, probably, that there – we may have to do other things to address the problem. I agree with what Patrick said at the end of his talk. I mean, right now I think the biggest stress has to be on being sure you’re testing the algorithms for performance that is appropriate for the venue you intend to use them. So I’ll let Patrick talk as well.
PATRICK GROTHER: Yeah, I’ll agree with that. The – it’s a research issue. The report that we wrote sort of put some sunlight on the various performance variations that the developer community wouldn’t have been aware of or were not – certainly not uniformly aware of. And a number of them have sort of started to work on this to try and mitigate differentials by race, by age and by sex. It’s a tough problem. Though the one thing that our report did show is that the Chinese developers, one of them disclosed that they’d used up to 500 million images from a Chinese dating website in their development. Their algorithms don’t show these higher false positive rates in a Chinese population, which is what many Western-developed algorithms do. So that is some kind of existence proof that training can matter and can help. But right now it’s – there’s a problem, and it needs to be addressed.
ALICE O’TOOLE: If I can just add one more thing along the lines of the myths I was talking about is that, for example, if you were able to make your algorithm equal on males and females, what does that mean for transgender individuals or for people in transition? You have no assurance that people who do not typically fit exactly the stereotypical characteristics of male, female, or, you know, Vietnamese or Japanese or Chinese or Ethiopian. You know, the middle ground is what I’m very concerned that we will overengineer to the center of a category and disadvantage people who don’t fit that category very well.
Are U.S. companies required to submit facial recognition algorithms and information to NIST for assessment?
RICK WEISS: And Patrick, just to end, even if training – training set size can help, my understanding is that some of the biggest players have not actually submitted their materials to NIST for assessment, right? Some of the Googles or others in the world.
PATRICK GROTHER: That’s correct. For various reasons, you know, participation in the evaluations that we do is entirely voluntary. It’s open worldwide. And we do this free, but each developer has got their own reasons to participate or not to participate. You mentioned Google. I’m sure they’re capable of doing their own tests internally. They’re not really trying to sell biometrics in the commercial marketplace. But again, that would be a question for them.
ALICE O’TOOLE: That said, you should probably say that the NIST – that your report, I believe, if I have it right, tested some 188 algorithms. Do I have that number right? Maybe it was a hundred and…
PATRICK GROTHER: It goes up every day (laughter).
ALICE O’TOOLE: It goes up every day. So a lot of people, a lot of companies and academicians voluntarily participate to know how well they do. I mean, those are really the gold standard tests. So they want to know where they are.
RICK WEISS: Right. Nobody measures like NIST, so that’s – it’s a wonderful service.
What personal steps can people take to protect their privacy against unwanted facial recognition?
RICK WEISS: Question from Marie – sorry, Maria Temming from Science News. I’m concerned about facial recognition technology invading my privacy. Are there any steps that I personally can take, besides pushing for legal restrictions, to protect myself? For example, are there any studies that show whether making a strategic alteration to my appearance can confuse algorithms? Likewise, have researchers developed any adversarial algorithms that can scramble facial recognition systems if they’re being used in an unauthorized way?
ALICE O’TOOLE: Patrick, that’s probably you.
PATRICK GROTHER: Yeah, there’s – you can casually degrade your sort of facial presentation to a camera, and that will be effective, but it’s something that’s difficult to do 24/7. So if a persistent surveillance gets a good photo of you, then a facial recognition is very likely to succeed. The better approach to doing this is a set of technologies for – that would require some regulation on the use. But it would use cameras, say, in a department store, but it would replace people’s faces with fake faces. And the purpose in doing that is that, yes, you could follow people, you could count people, but you wouldn’t actually identify people. You wouldn’t expose the real face of an individual to a government or to somebody. So these de-identification technologies that do subtle damage to a face so that you can’t be identified later, they are worthy of evaluation. And there is a commercial marketplace for that.
RICK WEISS: I bet. Nita or Alice, anything to add there?
NITA FARAHANY: Yeah, which is – so the one thing I would just encourage you to think about as you all write about the privacy concerns of individuals is to help people articulate what the fears are because I think people in general are afraid of surveillance technologies. And I laid out some of them – right? – some of the unintended consequences, which are the potential for misuse in certain settings like employment settings to hire or fire, make decisions like that or in societies in which there is less freedom of expression, freedom of association and the likelihood of persecution by being identified and so the chilling effects that that could have, both in those societies and also in our society if you have fear of association and implications for that.
But that’s where, ideally, we can start to encourage people to articulate, what are the fears we have? Is it really the virtual searches that are being – that are occurring by the police department, which may not harm an individual, right? So to be hauled into a police station is a very different experience than virtually being searched and never being aware that you are part of a search set. Is it the discrimination and misuse? – in which case we should be advocating for people and for legislators and for states to come up with policies that safeguard against those misuses. Is it the lack of transparency? Then we should be encouraging transparency of the types of uses to which it’s being put. I think there can be legitimate uses of this technology, and narrowly used and clearly identified and made transparent, it may not be as concerning for people. And that’s where I think your role of helping people understand, like, what is it that they’re afraid of and what is it that people should be advocating for states and police departments and cities and other governments to be doing can really help.
RICK WEISS: That’s very helpful.
RICK WEISS: Trying to mute my mic.
NITA FARAHANY: With the police in the background just for sound effect (laughter).
RICK WEISS: I’m pretty sure they’re after me now.
NITA FARAHANY: Yes.
Do machines find certain faces more easily recognizable than others?
RICK WEISS: Great. I want to have maybe time for one last question here, and this is one referencing Dr. O’Toole’s statement earlier that some people I think will never forget this part of your presentation, that Mick Jagger and Meryl Streep seem so easy. So the question is, you know, is it the same for machines? Do the machines find those same faces that we find easy to recognize to be easy to recognize, or is it something else altogether?
ALICE O’TOOLE: So that’s a really good question. These are tests that we did on algorithms years ago and, yes, definitely they’re – the ones that were tested years ago do find certain faces more recognizable than other faces, and you can see that by, you know, the similarity scores the machines produce. Either they’re totally certain it’s the same person because there’s a match there that is very informative, diagnostic of the identity. So think the really – you know, the thick lips of Mick Jagger or whatever. The newer algorithms, I think, are still not well understood. They’re very large computational engines that do sometimes hundreds of millions of nonlinear local computations and produce a result that is very accurate or supports very accurate face recognition. But we’re not quite sure about what the representation is. And that’s also a very active area of research. Those machines are designed to kind of mimic the human visual system, which does in fact compute with hundreds of millions of neurons that go between the retina and higher levels of cortex where face recognition seems to be done. And we don’t really understand the nature of the code we use. And I think we’re now at a level where these machines are complicated enough that it will take some study to figure exactly what it is about faces the machines are remembering and whether it’s the same types of things we as humans hold on to in distinctive faces.
RICK WEISS: Patrick, anything to add there?
PATRICK GROTHER: No. It is a research issue to try and understand how current face recognition algorithms work. It’s worth noting that algorithms vary in their capability, but they also vary in their sensitivity to demographics – we talked about that – but also in their sensitivity to other things. Like, if you’re not looking at the camera, do you get more false positives? If you’ve got a very low-resolution image, do you get more false positives? And these kind of sensitivities, it’s incumbent on sort of end users to understand those as best they can.
Are there currently any U.S. laws that prevent the use of this technology in school admissions, hiring decisions, loan applications, etc.?
RICK WEISS: Wonderful. One last question – New York Times again, John Quinn – are there any laws in the United States to prevent the use of this technology in school admissions, employment, loan applications, et cetera – Nita?
NITA FARAHANY: Yes and no. So the states that have placed bans on facial recognition technologies, those can apply to those settings as well. The states that have biometric laws and that have general privacy laws, like California, specifically call out biometrics and the use of biometrics and the limitations of biometrics. But you know, as for really specific legislation that says it would be misused in the following settings – you know, educational settings, employment settings – for use discrimination, etc., there is no state that calls it out specifically right now. It’s just if it falls within the general categories of these other sets of privacy legislation. It’s likely that if federal legislation does develop – and there’s been some proposals both specific to facial recognition technology as well as the general privacy conversation that’s happening at the federal level – it’s likely that there would be specific mention of biometrics and what different institutions could do with biometrics. And presumably, some of those institutions would be covered by those types of legislation. But as of today, the answer is, unless it’s within those general privacy laws, no.
What are the key takeaways reporters should keep in mind when covering facial recognition technology in the news?
RICK WEISS: This has been so interesting, and we’re just about out of time. I want to just give each speaker, you know, half a minute to make a final point, reiterate a take-home point. You know, if there’s one or two things you want our reporter class to walk away with today, what would you like to leave them thinking about? I’ll start with you, Dr. O’Toole.
ALICE O’TOOLE: So I guess I would say it – from my perspective, it’s been very interesting to see how much more accurate algorithms have gotten over the last number of years, to the point where I definitely – in the tests that we’ve done, there are a number of cases – if I could look at the two images and say, you know, I trust the machine over the human in that case if the quality of the image is excellent and the quality of the training data and so on. So they have gotten very good. That said, there’s still a lot of questions about how these machines operate and understanding, you know, none of them is perfect, so they still make mistakes. And understanding these mistakes is super important. I think we should also remember that humans make mistakes as well. And humans make mistakes of all different kinds. And you know, the best of face recognition accuracy would be to combine the good things that humans can do that machines don’t do well and vice versa. And so think of it more as a tool than as something – especially in law enforcement – than as something anybody wants operating alone, either the human or the machine. So…
PATRICK GROTHER: I would second that. Yeah, I would second that. The errors coming out of face recognition systems invariably end up in the lap of a human. And so the ability of humans to adjudicate pairs of images is an important topic and subject to some of the same kind of reporting of metrics that the automated algorithms are. So to – my takeaway point is that coverage of face recognition accuracy can be improved by talking about false positives and false negatives, about if one-to-one systems – one-to-many systems and to talk about sort of the domain of use. How are systems being used? Is it for access control? Is it for video surveillance? Is it for recording your immigration status when you get on a plane? Some specificity into the various different applications and how they’re used and the impact of errors is important.
RICK WEISS: Great take on the accuracy. Really, relevance of it has everything to do with how it’s being used. Nita, to wrap up.
NITA FARAHANY: So I mean, I think it’s a great point to build off of, which is – even assuming a world in which you have perfectly accurate technology – which we’re far from – it can, in certain settings, be perfectly wrong to use it. And it’s trying to figure out, what are the appropriate uses, and what do we as a society consider to be misuses of technology? So facial recognition technology, for many people, has raised significant concerns. For many of the average public, hearing that Clearview is scraping images that they have online and that they have used for very different purposes than the one that is now being applied to is really concerning. I hope that as you all write stories about this, that you help people both to understand the limitations of the technology as it exists today but also, in a world in which those limitations become fewer and fewer and the accuracy improves, that there will still be significant ethical, legal and social implications to this technology, which we need to evaluate to decide, what are the uses to which we as a society will be comfortable for it being applied?
RICK WEISS: Fantastic. I want to thank our three panelists today for wonderful presentations and answers to questions. I want to remind everyone – all the reporters before you log off – when you log off, you will get a prompt for a very short, 30-second, three-question survey. It’s so helpful to us to hear your quick responses to those three questions. I hope you’ll take a moment to do that for us and help us keep these briefings as beneficial and useful to you as possible. Please follow SciLine at @realsciline on Twitter and go to our website sciline.org and see what else we can do to help you in your jobs as reporters to put as much scientific evidence into your stories as possible. And with that, we’ll end this briefing. And thank you very much for attending.