We Need to Teach AI to not be Racist or Sexist, Because Humanity is Both

Reading Time: 12 minutes.
Bender from Futurama reading a book

Are your machines learning?

Tell me if you’ve heard this one before: “Do as I say, not as I do.” Your parents likely told you that at some point. Perhaps they just said a curse word they don’t want you repeating, or you caught them speeding while they were teaching you to drive. If you are a parent, or if you ever become one, you’ll likely use the phrase often as well. Humanity has become a parent to something new: artificial intelligence (AI), and we’ll need that phrase if we want our AI to be remotely objective.

Through machine learning (ML), a type of AI, we’ve created something that can parrot our ideas back at us, recognize patterns in faces, language, and data, and help us make decisions. The problem is, we base these machines on a dataset that includes all of humanity, unfiltered. We’re an extremely flawed bunch, as a species. We exhibit racism, sexism, and xenophobia in alarming quantities, and many choose to ignore that. Ignoring these problems has led us to pass them on to our children and—in this case—our AI. Now we’re struggling to control our own technology, fighting to keep it from exhibiting the same sexism and racism already present in humanity. Unfortunately, so far we’re losing that fight.


Machine Learning for Non-Programmers

Arrows flowing in a circle. Text reads Humans > Create > Data > Fed into > Algorithms > Discover > Patterns > Returned to > Humans

Humans create data, which is fed into man-made algorithms; the algorithms detect patterns and return them to people.

Artificial intelligence, machine learning, deep learning, neural networks: it can all seem a bit confusing, right? I’m a software engineer, and I still sometimes have to stop and ask myself, wait, do I really know how this works? However, you don’t need to be a computer scientist, mathematician, or data scientist to understand how machine learning works on a basic level.

This is the part where someone normally shows you a neural network: data flowing through layers of little boxes. Unless you’re a developer, “parameters” and “returns” won’t make much sense yet. Simply put, parameters go into a function, which returns results. So think of a math problem: if you add one and two together, it will always return three. The parameters are one and two, and the returned item is three. The act of adding those two values is an algorithm: a process that is the same every time and produces the same result for the same input.

1 + 2 = 3. One and two are parameters, the act of addition and returning 3 is the algorithm.

Addition is a simple algorithm.
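If it helps to see that as code, here’s a minimal sketch of the same idea in Python (the function name add is just for illustration): parameters go in, a result comes back, and the same inputs always produce the same output.

```python
# Parameters in, result out: a tiny algorithm wrapped in a function.
def add(a, b):
    """Takes two parameters and returns their sum."""
    return a + b

print(add(1, 2))  # Always prints 3 for the same inputs
```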

Making a peanut butter and jelly sandwich is an algorithm. Take two pieces of bread, apply peanut butter to one side of one slice, jelly to one side of the other slice, and slap them together. Delicious algorithms! The parameters are bread, peanut butter, jelly, and a butter knife to apply everything, and the algorithm is the act of putting it all together.

Machine Learning as a PB&J Sandwich

Machine learning is that on a grand scale. We pass in huge amounts of data (the peanut butter and jelly), sometimes labeled to help the algorithm, and it finds patterns in this data (putting the sandwich together). It can then take those patterns and perform more computations on them, which is called “deep learning.” This is where the idea of a neural network comes in. One program passes the results of an algorithm to another program, and so on, until a result is passed back to humans (that’s when we eat the sandwiches). We design the algorithms, we create the data, and they give us the results as recognizable patterns.

Basically, data goes in, we do something with that data, and patterns emerge.
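To make that concrete, here’s a minimal sketch of the “data in, patterns out” idea, assuming the scikit-learn library is available; the data points are made up, and real systems are far more elaborate.

```python
# Toy "data in, patterns out" pipeline: hand the algorithm some points,
# and it hands back the groups (patterns) it found in them.
from sklearn.cluster import KMeans

data = [[1, 1], [1, 2], [2, 1],   # one cluster of nearby points
        [8, 8], [9, 8], [8, 9]]   # another cluster far away

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)
print(model.labels_)  # e.g. [0 0 0 1 1 1] -- the pattern returned to humans
```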

Has AI Exhibited Racism and Sexism?

Microsoft’s Bot Tay

Microsoft's Tay chatbot on Twitter.

Microsoft’s Tay started off so kind.

AI doesn’t know the difference between “good” and “bad” human behavior. Take Tay, a Microsoft chat bot created to imitate an inquisitive teenage girl and interact with users on Twitter. Tay was an experiment in deep learning. It had basic conversational skills, and it would learn from conversations with people to make those skills better.

Tay tweets calling for attacking women and Jewish people

Tay turned to hate very quickly. These were actually among the tamer examples.

Within 24 hours of being activated, Tay became a racist, antisemitic, misogynistic bot. What happened? The alt-right and trolls flooded Tay with racist, Holocaust-denying, and misogynistic conversations. The bot then reflected the same bad behavior it was shown. It did not know that racism, sexism, and antisemitism are terrible things; it was just imitating the humans in its data set: those found on Twitter.

Tay is a perfect example of what can happen to a machine learning algorithm when presented with bad data. The AI does not know anything of the world but the data it is fed. If the data has bias of any kind, it will be reflected in the AI. Algorithms do not sanitize bias out of data, and they can sometimes make it worse by amplifying biased patterns.
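As a toy illustration (this is not how Tay actually worked; the “model” here is nothing but a word counter), consider a pretend chat bot that replies with whatever it saw most often in its training conversations:

```python
# A pretend chat bot that parrots the most common phrase in its data set.
from collections import Counter

def train(conversations):
    # "Learning" here is just counting what humans said.
    return Counter(conversations)

def reply(model):
    # The bot repeats whatever appeared most often in its training data.
    return model.most_common(1)[0][0]

polite_data = ["have a nice day"] * 5 + ["robots are cool"] * 3
troll_data  = ["<hateful phrase>"] * 50 + ["have a nice day"] * 5

print(reply(train(polite_data)))  # "have a nice day"
print(reply(train(troll_data)))   # "<hateful phrase>" -- garbage in, garbage out
```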

Job Offers

Now Hiring sign

Women need not apply?

Let’s say you want to take people out of your hiring process to make hiring completely objective. You work at a tech company, and you want to grow. You might make the false assumption that an algorithm could help you find the perfect candidate without racism or sexism. So you feed an algorithm the skills, interests, and abilities of your current workforce, tell it to prefer the employees who have gotten promotions and performed well, and then give it your pool of applicants.

Due to previous biases in hiring and promotion, it will suggest you hire more successful men with the same hobbies, backgrounds, and résumé details as the people you currently employ. You’ll end up with candidates who look exactly like your predominantly white, young, straight, and male workforce.
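Here’s a hedged sketch of how that plays out, with entirely made-up data and assuming scikit-learn is available. No real hiring tool is this simple, but the failure mode is the same: a model trained on a biased history learns that “looks like the current workforce” predicts success.

```python
# A toy hiring model trained on biased historical outcomes.
from sklearn.tree import DecisionTreeClassifier

# Features: [is_male, years_experience]; label: 1 = hired and promoted.
past_candidates = [[1, 3], [1, 5], [1, 2], [1, 4], [0, 5], [0, 4], [0, 6]]
past_outcomes   = [1,      1,      1,      1,      0,      0,      0]  # the biased history

model = DecisionTreeClassifier(random_state=0).fit(past_candidates, past_outcomes)

# Two equally experienced applicants; only the gender flag differs.
print(model.predict([[1, 5], [0, 5]]))  # [1 0] -- the old bias, now automated
```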

Office Space Meme: "If you could just upload your resume to apply and then enter every single detail on your resume into our online form that would be great."

Computers prefer forms

This is already a tool used by companies. Corporations often get hundreds of applicants and use algorithms to sort through the pile. Have you ever wondered why many companies have you fill out a form with your information after submitting your resume, which already contains all of that information? It’s because an algorithm will sort the applications long before a human ever looks at them.

This has also happened with job listings. A Carnegie Mellon University study showed that a woman browsing the internet is less likely to see ads for high-paying jobs than a man is. This is thanks to bias in Google’s advertising platform, as well as bias in the data fed to that algorithm by companies looking for applicants. This is a perfect example of a place where AI is already working to enforce strict gender roles and stereotypes, fighting to keep women out of the workplace and out of STEM.

Sentencing

Vernon Prater, white, low risk (3). Brisha Borden, black, high risk (8).

Brisha had four juvenile misdemeanors and never committed another crime. Vernon had two armed robberies and one attempted armed robbery, and later committed one grand theft.

This is a particularly dangerous area of AI. What if the rest of your life could be decided by a machine, and that machine was racially biased? A ProPublica study looked into a popular risk assessment tool from Northpointe used in courtrooms across the country. The algorithm is only meant to predict recidivism, that is, whether or not a person is likely to commit another crime and end up back in prison. However, courts across the country are using it for sentencing, for deciding whether someone goes to prison or gets parole, in parole hearings, and for alternatives to prison time. It’s being used to define the future of someone’s life, and it’s racially biased.

The ProPublica study found that Northpointe’s software was roughly equally inaccurate for white and black defendants. However, when it was wrong about white defendants, it tended to predict a lower chance that they’d commit another crime; when it was wrong about black defendants, it tended to falsely predict a higher chance that they’d commit another crime. The end result is devastating: black people may spend longer behind bars and may not be offered parole as often as white people.
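Here’s a minimal sketch of the kind of check ProPublica ran, with invented numbers: overall accuracy can look similar for both groups while the direction of the errors differs by race.

```python
# Compare false positive rates by group: people labeled "high risk"
# who never went on to commit another crime. All numbers are invented.
def false_positive_rate(records):
    did_not_reoffend = [r for r in records if not r["reoffended"]]
    wrongly_flagged = [r for r in did_not_reoffend if r["high_risk"]]
    return len(wrongly_flagged) / len(did_not_reoffend)

def make(records):
    return [{"high_risk": p, "reoffended": r} for p, r in records]

white = make([(0, 0)] * 70 + [(1, 0)] * 20 + [(1, 1)] * 10)
black = make([(0, 0)] * 50 + [(1, 0)] * 40 + [(1, 1)] * 10)

print(round(false_positive_rate(white), 2))  # 0.22 -- wrongly flagged, white
print(round(false_positive_rate(black), 2))  # 0.44 -- double that rate, black
```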

Northpointe didn’t design their algorithm to be racist. However, it fails to account for systemic racism in the job market, in housing, and in policing. As a result, while its creators had the best of intentions, the software enforces and compounds decades of racism onto one person.

Policing Areas

Bar chart of arrest rates for black people and white people. Though they use marijuana equally, black people are over 3 times more likely to be arrested for it.

White people and black people smoke marijuana at the same rate. However, black people are far more likely to be arrested for it. This shows bias in policing. Decades of institutionalized racism and confirmation bias combine to create a racist system where those perpetuating it may not even realize they’re doing it.

Let’s examine a math teacher. This math teacher believes that boys are better at math than girls. He thinks this because, when he asks the class a question, he only picks boys to answer. The first boy might not always have the answer, but the second or third boy he picks always has it. Since he can always get a correct answer from a boy, he believes that boys are good at math. Obviously you can see the problem here. By never asking girls, he’s never testing their ability; he’s only confirming his long-held bias that boys are better at math. So how does that apply to policing?

The Rookie Cop

Yes, this is from a comedy show. Watch it anyway.

Let’s examine a new police officer. A rookie cop might be told to “look out for neighborhood ‘A,’ pay special attention to it.” What he doesn’t know is that, long ago, neighborhood ‘A’ was intentionally set up as a minority neighborhood through redlining, and racist cops over-patrolled it. The rookie cop might not carry that racism, but he notices that, when he patrols this area, he makes many arrests. He doesn’t pay as much attention to neighborhood ‘B,’ but the arrest rate could be similar there, if he patrolled it.

The rookie cop could also represent a new algorithm. Police fed it information on previous arrests, and it predicted where crimes would occur. Technology like this is used in police departments across the United States, and it carries with it the racism of the past. Drug deals frequently go down in suburbs (the opioid crisis proves this); however, police have not patrolled these areas as heavily, because the racism of the past told them they didn’t have to. Suburbs were for white people. Once again, the systemic racism of the past and present is compounded in an algorithm that knows nothing of that racism, and therefore it continues the patterns of racism it’s shown.
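Here’s a toy simulation of that feedback loop, with invented numbers and no resemblance to any real product: two neighborhoods with identical underlying crime rates, where patrols are always sent wherever past arrests were highest.

```python
# Predictive-policing feedback loop, in miniature.
import random

random.seed(0)
true_crime_rate = {"A": 0.1, "B": 0.1}   # identical in reality
arrests = {"A": 30, "B": 10}             # "A" was over-patrolled in the past

for week in range(10):
    # The "algorithm": send most patrols where past arrests are highest.
    hot_spot = max(arrests, key=arrests.get)
    patrols = {"A": 10, "B": 10}
    patrols[hot_spot] = 90

    # You only catch crime where you're looking.
    for hood in ("A", "B"):
        arrests[hood] += sum(random.random() < true_crime_rate[hood]
                             for _ in range(patrols[hood]))

print(arrests)  # Neighborhood "A" pulls further ahead every single week
```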

Facial Recognition

Recently, Amazon’s Rekognition facial recognition software flagged 28 members of Congress as people matching mug shots. Had this been a real police lookup rather than an ACLU study, those people could have been arrested based on that alone. Of course, the matches were false positives. Guess what? Those false positives were split along racial lines. Despite making up only 20% of Congress, people of color made up 40% of the false positives. Yup, even facial recognition is racist and sexist. Because the AI had been trained with more white male faces than faces of women or people of color, we ended up with facial recognition that struggles to identify women or people of color, but excels with white men.

Word Association

Man is to woman as boss is to…? The AI responded “housewife.” That’s because, frankly, there’s still a lot of sexism in our society, and, if you’ve ever tried to play an online game as a woman, you’ll know how often the insult “go back to the kitchen” gets thrown at us.

A Stanford study created an internet-trained AI. They found that it was not only sexist, but also racist and ageist. The AI was trained by giving it words and having it find their context online. It would then find sentences containing those words, names, and descriptors, and derive meaning for them from the surrounding context.

After training, the AI connected white-sounding names with positive words like “love” and “happy,” while black-sounding names were given negative connotations, like “hatred” and “failure.” The AI would also return “illegal” when asked to find a word similar to “Mexican.” The AI even concluded that young people were more pleasant than old people.
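To show the mechanism (not the actual study), here’s a sketch using tiny hand-made word vectors and NumPy; real word embeddings are trained on billions of words of internet text, which is exactly where the bias comes from.

```python
# Analogy-by-arithmetic on toy word vectors: "man is to boss as woman is to ...?"
import numpy as np

vectors = {
    "man":       np.array([1.0, 0.0, 0.9]),
    "woman":     np.array([0.0, 1.0, 0.9]),
    "boss":      np.array([1.0, 0.1, 0.8]),
    "housewife": np.array([0.1, 1.0, 0.7]),
    "engineer":  np.array([0.9, 0.2, 0.9]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def closest(target, exclude):
    return max((w for w in vectors if w not in exclude),
               key=lambda w: cosine(vectors[w], target))

query = vectors["boss"] - vectors["man"] + vectors["woman"]
print(closest(query, exclude={"man", "woman", "boss"}))  # "housewife" in this toy space
```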

How AI Becomes Racist and Sexist

Limited Data Sets

In the example of facial recognition software, the biases of the tech industry are obvious. Walk through a tech company’s office. What do you see? Sure, plenty of mechanical keyboards, multi-monitor desks, Pop figurines and desk toys. What else? Men. It’s mostly young, white men.

When those guys get together to feed data into their new facial recognition program, what are they going to test and train it with? Likely each other. They’ll use their own faces, their family photos, their friends, and their coworkers. Those faces will mostly belong to white men. They won’t account for darker skin tones or makeup, because the men weren’t thinking about those things when they made their software. They didn’t realize that a woman’s skin tone can depend on the makeup she’s wearing, or that cameras often underexpose dark skin on automatic settings (who do you think made those settings?), leaving details out of photos. A limited data set and a homogeneous view of an incredibly diverse populace lead to sexist and racist facial recognition AI.
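One cheap guardrail is simply tallying who is actually in the training set before training anything. A minimal sketch, with invented labels and counts:

```python
# Sanity-check the demographic balance of a face data set before training.
from collections import Counter

training_faces = (
    [("light", "man")]   * 800 +
    [("light", "woman")] * 120 +
    [("dark",  "man")]   * 60 +
    [("dark",  "woman")] * 20
)

counts = Counter(training_faces)
total = len(training_faces)
for group, n in counts.most_common():
    print(group, f"{n / total:.0%}")
# ('light', 'man') 80% ... ('dark', 'woman') 2%: a recipe for a model
# that excels with white men and fails nearly everyone else.
```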

Naïve Researchers and Twitter

Twitter logo giving Nazi salute

Twitter is rife with Nazis, the Alt-Right, TERFs, and other hate speech. In the U.S., Twitter does little to stop it.

Another issue is naïveté. Straight white men often don’t know that Twitter is a wasteland of harassment against women, people of color, and LGBTQ people. They just see a social network where people chat with each other, share news stories, and discuss important matters. They don’t realize that mentioning your sexuality or race, or simply being an outspoken woman, can make you a target for hate speech, harassment, doxxing, and character assassination.

This is what happened with Tay, Microsoft’s chat bot. Ask a target of hate speech, such as a woman, LGBTQ person, or person of color, if Twitter should be used to train an artificial intelligence in human speech, and they’d tell you “no way!” Victims of hate speech on these platforms know it’s far more common than those who don’t regularly receive it realize. Someone like this might recommend Tay receive its training from teen literature. Books like the Harry Potter series, The Hunger Games, or coming-of-age stories would be better choices. Literature is less likely to be riddled with hate speech, and can be carefully selected to remove these biases. Then test it on a social network with more accountability, like verified Facebook profiles in the U.S. that include people’s real names and places of employment. The Microsoft team wasn’t diverse enough, and was therefore naïve to the toxic content present on Twitter.

The Black Box Code

In software, we refer to a black box as a piece of code or system that hides its algorithms. We pass data to it, and get a result, but we don’t know what it did to that data to arrive at the result. Many companies, from Amazon to Northpointe, sell their software without revealing any information on how their algorithms work. There’s a very good reason for this: it’s impossible to profit off of something that a software engineer can easily duplicate for free. Therefore, they protect their proprietary code.

However, proprietary algorithms, while profitable, are difficult to audit. Say you’re a police department looking to use an algorithm to help you protect your streets. How can you be sure what you’re getting won’t have racial bias if you can’t see how it works? You conduct a study: you use the software in the real world, analyze the results, and see if the software has hidden bias. The problem with this? There’s a human cost to those studies. People are imprisoned who perhaps shouldn’t be, areas are patrolled that shouldn’t be, and biases are reinforced. Lives are affected.

Saving AI from Humans

The title of this post sums up the issue succinctly: we need to teach AI not to be racist or sexist, because humanity is both. How do we go about doing that? We cannot erase the racism and sexism in humanity’s past, and we have much work to do to prevent it from tainting our future. But we can make AI aware of these biases in an attempt to scrub them from its data, and we can also ensure that the data we feed our little AI children doesn’t turn them into racist, sexist monsters. Humanity isn’t perfect, but we have the recipe to make our AI more objective than us, and that, in time, will improve our own views as well.

Diversity in STEM

This is an easy goal. More diversity in STEM (Science, Technology, Engineering, and Mathematics) means fewer mistakes made from lack of life experience. We, as humans, live just one life. We cannot possibly contain all the knowledge of living as every other person. However, there are broad characteristics that group us and make some of us similar, and our lives are shaped by those similarities, giving us different experiences and views. Contributing those views to a project means making the project work better for more people. Diversity of thought and experience improves any project, and it certainly improves AI. Diverse teams can recognize biases in software before it reaches the public, because they’ll see its biased effects on themselves.

Acknowledge Biases

Whenever I talk to someone about small, inadvertent occurrences of sexism, racism, or homophobia, I’m always quick to point out that it’s okay if they’ve broken their newly learned rules of decency in the past. It’s okay if they mess up a little while trying to do better. Change is hard, and it takes a long time. But if we recognize small biases, and work to change them, we can eventually eliminate them. The first step is acknowledging bias. The next step is a bit more difficult: doing something about it.

The Center for Policing Equity (CPE) is an organization I first heard about during Google I/O this year, where the focus was on machine learning. They use a police department’s own data to find places where it’s over-policing, and to show when there is a racial root to that bias. Most officers genuinely do want to do the right thing and remove bias from the badge, but it can be difficult to prove it’s happening, and even more difficult to find solutions. CPE works with departments to help them discover their problem areas and improve. They’re using machine learning to find and eliminate racial bias, rather than perpetuate it. We can use AI to do great things.

Make Data Race-Aware

Part of the problem is that we build these algorithms with no knowledge of racism. However, we can make them aware of systemic oppression. Much in the way affirmative action helps level the playing field for those who have had to work against systemic racism, we could adjust this data to do the same.

Note Boston’s Roxbury, a predominantly black neighborhood.

Take this map of 1-day Amazon deliveries in Boston. Roxbury is a predominantly black neighborhood. Something in Amazon’s black box algorithm told them not to offer 1-day delivery to this area. Critics quickly pointed out that this was racist. Amazon swore that it wasn’t, but wouldn’t reveal why their algorithm told them to avoid the area. They’ve since corrected it.

Chart showing black people in major cities are less likely to have 1-day delivery from Amazon. Amazon’s AI made these decisions.

What could have prevented this? U.S. census data. Amazon could have boosted areas based on their racial makeup to counteract decades of systemic racism against those regions. This would allow them to counter whatever secret sauce is compounding racist outcomes. The same goes for NYC’s Bronx or Chicago’s South Side. If Amazon had made their algorithms aware of humanity’s extremely racist past and ongoing oppression, they could have balanced those algorithms before they affected people’s lives.
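Here’s a hedged sketch of what “race-aware” could look like in practice, with invented numbers and hypothetical field names (this is not Amazon’s actual system): audit the roll-out list against census data and override exclusions that fall hardest on predominantly black neighborhoods.

```python
# Audit a black-box roll-out decision against census demographics.
neighborhoods = [
    {"name": "Roxbury",      "pct_black": 0.52, "serve": False},
    {"name": "Back Bay",     "pct_black": 0.04, "serve": True},
    {"name": "South Side",   "pct_black": 0.93, "serve": False},
    {"name": "Lincoln Park", "pct_black": 0.05, "serve": True},
]

for hood in neighborhoods:
    # If the black box excludes a predominantly black neighborhood,
    # don't accept that silently: flag it and override pending review.
    if not hood["serve"] and hood["pct_black"] > 0.5:
        hood["serve"] = True
        print(f"Overriding exclusion of {hood['name']} pending review")
```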

Question Uses of Data

Finally, we should question large data grabs and the indiscriminate use of huge amounts of information. Data scientists could use this information in the future without context. It’s why Google’s unprecedented data tracking is unnerving. We can’t be sure Google’s uses of this data will always benefit consumers because, frankly, they don’t always benefit us now. We should always question why data is being collected and how it’s being used.

Algorithms Carry Bias: Question Them

“In 2009, for instance, San Francisco police handcuffed a woman and held her at gunpoint after a license-plate reader misidentified her car. All they had to do to avoid the confrontation was to look at the plate themselves, or notice that the make, model, and color didn’t match. Instead, they trusted the machine.”

– Brian Barrett, Wired.

Algorithms do not sanitize bias from data; they compound it. We must always scrutinize the data we put into algorithms, the methods we use to find patterns in that data, and the results those algorithms produce. If we blindly trust them, we end up making decisions that are unintentionally racist, sexist, ageist, or otherwise discriminatory. If someone you didn’t know passed you on the street and told you to tackle someone and hold them down, you would want to know why. You’d want to verify their explanation. We must do the same with our machine learning algorithms and the data we’re pumping into them. If we don’t, we will only amplify and magnify the discriminatory attitudes of our past.


Sources/Further Reading: