Microsoft Copilot Depicts Pro-Choice Prompts as Demons

[Image: A doodle of a robot holding a heavily blurred drawing, claiming, "I made this."]

Cassandra was a priestess in Greek mythology. She was blessed with true foresight, able to make perfectly accurate predictions, but cursed so that no one would ever believe her. There's a certain self-satisfaction in patting yourself on the back for being right about something when so many others didn't believe your assessment. I've certainly known that feeling myself, working in tech. But any schadenfreude you get from it is short-lived. Watching others suffer, and resent you for having said the right thing earlier, is an awful feeling. You can't help but torture yourself: "What if I had made my argument better? Would things be better now? Could I have stopped this?"

I wonder how AI ethics experts are feeling this week.

Last week, Google’s AI spit out historically inaccurate images, such as racially diverse Nazi soldiers. The issue comes from stripping information out of incoming data. Intentionally removing race, like a white person who claims they “don’t see race” instead of addressing racism in society and the systems we’ve built, removes an important part of the information and context around an image. When you blindly pull in the entire internet, whitewash the data, and don’t curate out problematic images, you get AI that spits out racially diverse Nazis.

Microsoft had another problem this week, an even simpler one. It, too, could have been solved by listening to AI experts instead of business people rushing to make money from AI. Microsoft didn’t properly curate the data and images they ingested into their datasets. As a result, when you ask Copilot to generate abortion rights images, you get images of demons and violence.

It’s obvious what happened here. Microsoft and their partner, OpenAI, gobbled up the internet, anti-abortion propaganda and all. As a result, the AI doesn’t know how to make unbiased pro-choice images.

And the Cassandras of AI weep.

Demons of Bodily Autonomy, Other Horrors

Shane Jones is a principal software engineering manager at Microsoft. He’s not on a dedicated Copilot red team, that is, a team that tries to break a product to find vulnerabilities, but he volunteered to red-team the tool anyway. What he found, through some rather simple prompts, was troubling.

A request for images related to “pro-choice” would show violence, including demons and monsters, and at least one example of “a demon with sharp teeth about to eat an infant.” This is what happens when a company doesn’t curate the images it ingests into its datasets. Microsoft’s Copilot uses OpenAI’s DALL-E 3, trained on sources that OpenAI refuses to name. It’s clear, however, that they ingested a large number of images from far-right propaganda sources, rather than anything related to science or protests for women’s rights.

Microsoft’s generated images also featured “demeaning,” sexualized women. A request for a “car accident,” for example, showed gruesome, graphic imagery that, for no reason related to the prompt, also featured sexualized women in various states of bodily harm. Where did it get these images? It’s the kind of thing you’d find on sites that fetishize violence against women. Where on earth did they get that data? OpenAI won’t say. AI cannot produce anything new, so whatever OpenAI chose to put in their model is what we’re seeing now. Yes, chose. OpenAI and everyone else in the AI industry have been warned to carefully curate their datasets, or they’ll get these kinds of disturbing results. They chose to ignore those warnings and build massive models out of uncurated or barely curated data anyway.

OpenAI also clearly used copyrighted data in the production of their models. Microsoft’s Designer was able to reproduce images of Disney characters, including one of the most popular Disney princesses, Frozen’s Elsa. Prompts asking for images of Elsa in Gaza, wearing an Israeli Defense Forces uniform, or set against the destruction of Palestine produced detailed images as requested, and the results were reportedly easy to reproduce later with Microsoft’s tool.

Microsoft claims they want to “minimize the potential for stereotyping, demeaning, or erasing identified demographic groups, including marginalized groups,” but with images like these, it’s clearly not a priority.

When interviewing Jones, CNBC was able to reproduce the disturbing images with similar prompts. Since the story broke, however, Microsoft has blocked requests for generated images similar to the ones Jones flagged. But they knew about these problems for months.

Microsoft Engineer Blows Whistle

Shane Jones reported his findings to Microsoft in December. Microsoft told him to take his complaints to OpenAI. He obliged, but received no response. Jones then took his concerns public, posting an open letter to OpenAI on LinkedIn shortly after Twitter was flooded with Taylor Swift deepfake porn. Instead of taking action, Microsoft’s legal department demanded Jones take the letter down. He complied, but then sent letters to lawmakers and the FTC. In the letter received by FTC Chairperson Lina Khan, Jones says Microsoft has known about these issues since October and has chosen to keep the model and tool publicly available in the months since.

Despite many setbacks, Microsoft still claims Copilot is safe to use. Jones, and, seemingly, quite a few others, disagree. A tool that could show graphic and sexual images to children certainly doesn’t seem safe.

“This is really not a safe model. Over the last three months, I have repeatedly urged Microsoft to remove Copilot Designer from public use until better safeguards could be put in place.”

– Shane Jones, Microsoft Employee and Whistleblower

Microsoft has responded to inquiries from various news organizations with the same message:

“We are committed to addressing any and all concerns employees have in accordance with our company policies and appreciate the employee’s effort in studying and testing our latest technology to further enhance its safety.”

Jones, it seems, disagrees, stating he had “taken extraordinary efforts to try to raise this issue internally.” According to Jones, Microsoft knew for months that their Copilot Designer tool could produce violent and sexist imagery when the prompt did not call for it, but chose to do nothing. They could have taken the model down, as Google did, and applied filters to the input and output. They did neither until news of the issue became public knowledge.
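Neither Microsoft nor OpenAI has said publicly how Copilot Designer’s safeguards are built, but the input and output filtering Jones asked for isn’t exotic engineering. Here’s a minimal sketch of the idea in Python, assuming a hypothetical blocklist and a placeholder safety classifier; every name in it is illustrative, not Microsoft’s actual pipeline:

```python
# Hypothetical sketch of input/output filtering around an image generator.
# Nothing here reflects Microsoft's or OpenAI's real implementation;
# names, terms, and checks are illustrative placeholders only.

BLOCKED_TERMS = {"gore", "dismember"}  # illustrative, not a real policy list


def filter_prompt(prompt: str) -> bool:
    """Input-side filter: return True if the prompt should be rejected."""
    lowered = prompt.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def moderate_image(image_bytes: bytes) -> bool:
    """Output-side filter: placeholder for a safety classifier that would
    score generated images for violence or sexual content. Always passes
    in this sketch."""
    return True


def generate_image(prompt: str) -> bytes:
    """Stand-in for the actual text-to-image model call."""
    return b"...image bytes..."


def safe_generate(prompt: str) -> bytes | None:
    """Wrap generation with checks on both sides of the model."""
    if filter_prompt(prompt):       # refuse the request before generation
        return None
    image = generate_image(prompt)
    if not moderate_image(image):   # drop unsafe results after generation
        return None
    return image


if __name__ == "__main__":
    print(safe_generate("a peaceful pro-choice rally"))  # passes both filters
```

The point isn’t that a keyword list would have caught Jones’s prompts; it’s that layering even simple checks on both sides of the model is a well-known, cheap mitigation, and Microsoft reportedly shipped without enough of them.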

This is how AI is being developed: ignoring issues, ingesting propaganda, violence, sexualized images of women, and depictions of violence against women. Garbage in, garbage out, and nearly every major player in AI is doing the same thing. There may be a reckoning for AI in the future, as regulators step in to make sure it’s safe to use. Perhaps companies will need to spend less time lobbying and convincing lawmakers and users that their tools are safe, and more time actually making them safe before a crackdown.

Or, let it happen. Large companies often foolishly enter a new space and pour too much money into doing something the wrong way, then lose out as the market moves to newer competitors. They don’t look forward; they look to appease shareholders today, only to disappoint them later. Just look at the mobile space, where Microsoft, once king of the desktop OS, couldn’t get Windows Mobile off the ground. Or Palm, which failed even though its WebOS is behind many of the innovations we see in mobile operating systems today. Or BlackBerry, which didn’t evolve fast enough. Giants fall when they don’t listen to the needs of consumers, and no one needs AI that suggests violence.

