The Taylor Swift deepfake debacle was frustratingly preventable

Taylor Swift performs onstage during the 2018 American Music Awards at Microsoft Theater on October 9, 2018 in Los Angeles, California.
Image Credits: Kevin Winter / Getty Images

You know you’ve screwed up when you’ve simultaneously angered the White House, the TIME Person of the Year and pop culture’s most rabid fanbase. That’s what happened last week to X, the Elon Musk-owned platform formerly called Twitter, when AI-generated, pornographic deepfake images of Taylor Swift went viral.

One of the most widespread posts of the nonconsensual, explicit deepfakes was viewed more than 45 million times, with hundreds of thousands of likes. That doesn’t even factor in all the accounts that reshared the images in separate posts — once an image has been circulated that widely, it’s basically impossible to remove.

X lacks the infrastructure to identify abusive content quickly and at scale. Even in the Twitter days, this issue was difficult to remedy, but it’s become much worse since Musk gutted so much of Twitter’s staff, including the majority of its trust and safety teams. So, Taylor Swift’s massive and passionate fanbase took matters into their own hands, flooding search results for queries like “taylor swift ai” and “taylor swift deepfake” to make it more difficult for users to find the abusive images. As the White House’s press secretary called on Congress to do something, X simply banned the search term “taylor swift” for a few days. When users searched the musician’s name, they would see a notice that an error had occurred.

This content moderation failure became a national news story, since Taylor Swift is Taylor Swift. But if social platforms can’t protect one of the most famous women in the world, who can they protect?

“If you have what happened to Taylor Swift happen to you, as it’s been happening to so many people, you’re likely not going to have the same amount of support based on clout, which means you won’t have access to these really important communities of care,” Dr. Carolina Are, a fellow at Northumbria University’s Centre for Digital Citizens in the U.K., told TechCrunch. “And these communities of care are what most users are having to resort to in these situations, which really shows you the failure of content moderation.”

Banning the search term “taylor swift” is like putting a piece of Scotch tape on a burst pipe. There are many obvious workarounds, like how TikTok users search for “seggs” instead of sex. The search block was something that X could implement to make it look like they’re doing something, but it doesn’t stop people from just searching “t swift” instead. Copia Institute and Techdirt founder Mike Masnick called the effort “a sledge hammer version of trust & safety.”

“Platforms suck when it comes to giving women, non-binary people and queer people agency over their bodies, so they replicate offline systems of abuse and patriarchy,” Are said. “If your moderation systems are incapable of reacting in a crisis, or if your moderation systems are incapable of reacting to users’ needs when they’re reporting that something is wrong, we have a problem.”

So, what should X have done to prevent the Taylor Swift fiasco?

Are asks these questions as part of her research, and proposes that social platforms need a complete overhaul of how they handle content moderation. Recently, she conducted a series of roundtable discussions with 45 internet users from around the world who are affected by censorship and abuse, with the aim of issuing recommendations to platforms about how to enact change.

One recommendation is for social media platforms to be more transparent with individual users about decisions regarding their account or their reports about other accounts.

“You have no access to a case record, even though platforms do have access to that material — they just don’t want to make it public,” Are said. “I think when it comes to abuse, people need a more personalized, contextual and speedy response that involves, if not face-to-face help, at least direct communication.”

X announced this week that it would hire 100 content moderators to work out of a new “Trust and Safety” center in Austin, Texas. But under Musk’s purview, the platform has not set a strong precedent for protecting marginalized users from abuse. It can also be challenging to take Musk at face value, as the mogul has a long track record of failing to deliver on his promises. When he first bought Twitter, Musk declared he would form a content moderation council before making major decisions. This did not happen.

In the case of AI-generated deepfakes, the onus is not just on social platforms. It’s also on the companies that create consumer-facing generative AI products.

According to an investigation by 404 Media, the abusive depictions of Swift came from a Telegram group devoted to creating nonconsensual, explicit deepfakes. The members of the group often use Microsoft Designer, which draws from OpenAI’s DALL-E 3 to generate images based on inputted prompts. In a loophole that Microsoft has since addressed, users could generate images of celebrities by writing prompts like “taylor ‘singer’ swift” or “jennifer ‘actor’ aniston.”

A principal software engineering lead at Microsoft, Shane Jones, wrote a letter to the Washington state attorney general stating that he found vulnerabilities in DALL-E 3 in December, which made it possible to “bypass some of the guardrails that are designed to prevent the model from creating and distributing harmful images.”

Jones alerted Microsoft and OpenAI to the vulnerabilities, but after two weeks, he had received no indication that the issues were being addressed. So, he posted an open letter on LinkedIn to urge OpenAI to suspend the availability of DALL-E 3. Jones alerted Microsoft to his letter, but he was swiftly asked to take it down.

“We need to hold companies accountable for the safety of their products and their responsibility to disclose known risks to the public,” Jones wrote in his letter to the state attorney general. “Concerned employees, like myself, should not be intimidated into staying silent.”

OpenAI told TechCrunch that it immediately investigated Jones’ report and found that the technique he outlined did not bypass its safety systems.

“In the underlying DALL-E 3 model, we’ve worked to filter the most explicit content from its training data including graphic sexual and violent content, and have developed robust image classifiers that steer the model away from generating harmful images,” a spokesperson from OpenAI said. “We’ve also implemented additional safeguards for our products, ChatGPT and the DALL-E API – including declining requests that ask for a public figure by name.”

OpenAI added that it uses external red teaming to test products for misuse. It’s still not confirmed whether Microsoft’s program is responsible for the explicit Swift deepfakes, but the fact stands that as of last week, both journalists and bad actors on Telegram were able to use this software to generate images of celebrities.

Jones disputes OpenAI’s claims. He told TechCrunch, “I am only now learning that OpenAI believes this vulnerability does not bypass their safeguards. This morning, I ran another test using the same prompts I reported in December and without exploiting the vulnerability, OpenAI’s safeguards blocked the prompts on 100% of the tests. When testing with the vulnerability, the safeguards failed 78% of the time, which is a consistent failure rate with earlier tests. The vulnerability still exists.”

As the world’s most influential companies bet big on AI, platforms need to take a proactive approach to regulate abusive content — but even in an era when making celebrity deepfakes wasn’t so easy, violative behavior easily evaded moderation.

“It really shows you that platforms are unreliable,” Are said. “Marginalized communities have to trust their followers and fellow users more than the people that are technically in charge of our safety online.”

Updated, 1/30/24 at 10:30 PM ET, with comment from OpenAI
Updated, 1/31/24 at 6:10 PM ET, with additional comment from Shane Jones
