OpenAI Promises to ‘Watermark’ Generated Images, It Won’t Work

A robot holding an AI-generated picture of a cat. It reads "I made this."

Well, I made the robot and the frame. The cat was made with DALL-E, by OpenAI.

It feels as though it wasn't that long ago that I was arguing that generative AI companies needed to watermark their creations. My comments were driven by a deluge of deepfake porn of Taylor Swift that flooded Twitter. Watermarking could hold the companies that generated it, as well as the people who made it, responsible. This week, OpenAI promised to do just that. Watermarking could help track down people who make non-consensual deepfake porn of others, as well as prevent the creation and distribution of digitally generated child sexual abuse material (CSAM). It will be essential as we move forward with AI. Generative AI isn't going away, and we need to keep the worst of humanity's problems from plaguing what could be a useful technology, if done properly.

However, OpenAI's watermarks won't be enough, and they know it. The new watermarks will be a part of images generated with OpenAI's DALL-E 3, the very software at the heart of Microsoft's Designer, which may have been behind a number of the Taylor Swift deepfake porn images that spread across Twitter last week. Watermarks could have tracked down who created those images and where they came from. However, the watermarks OpenAI has planned for its service aren't nearly capable of preventing misuse. They can be removed with simple software, or even just by taking a screenshot and sharing that instead of the original. This isn't at all the method I had hoped AI companies would go for. I wanted them to embed the data in the image, not alongside it. Steganography allows an image to hide data inside itself, in plain sight. While data hidden with steganography could also be removed, doing so would be more difficult. Watermarking will never be enough, but it's the right first step towards a safer future, one with AI. OpenAI has the right idea, but not the right tools on hand. Call me hopeful, but perhaps that's why they're discussing it: to make watermarking better. Or maybe they just want to shoo away legislators after the Taylor Swift fiasco.

OpenAI’s New Watermarks

GUI of Content Credentials Verify, screenshot

DALL-E’s watermarks flag their images as AI-generated. Source: OpenAI

OpenAI announced they'll add watermarks to the images they generate with DALL-E 3. However, the watermarks… aren't watermarks. A watermark, in the traditional sense, is a marking on a piece of paper that helps verify a document is legitimately from the source it claims to be from. It can also sit over stock photos so an image can be identified as having come from that service. OpenAI won't modify the image at all. Instead, they'll add information to the image's metadata, signaling that it was generated by AI.

Metadata is information stored in a file that describes the file rather than making up its contents. For example, an image's metadata could include the camera that took the photo, the author who made the image, the location, notes, pixel dimensions, the color profile, and other information. In this case, the metadata would indicate that the image was generated by AI, not created by a human. It will tell you what AI tool was used to create it, but, seemingly, not who created it.
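
To make that concrete, here's a minimal sketch, using Python and the Pillow library, of how you'd peek at the metadata riding along with an image file. The filename is hypothetical, and C2PA credentials are stored in their own format-specific container rather than as simple text chunks or EXIF tags, but the principle is the same: this information sits alongside the pixels, not inside them.

```python
from PIL import Image  # Pillow

img = Image.open("dalle_output.png")  # hypothetical file

# Format-specific metadata (e.g. PNG text chunks) ends up in img.info
for key, value in img.info.items():
    print(f"{key}: {value}")

# EXIF tags, more common in JPEGs: camera, author, GPS location, etc.
for tag_id, value in img.getexif().items():
    print(f"EXIF {tag_id}: {value}")
```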

The same generated image from OpenAI's example, but now the tool doesn't know it's AI

With just a screenshot, you can easily fool Content Credentials Verify (screenshot via CCR).

You can bypass metadata with apps that strip it from a file, or simply by taking a screenshot. Most services already strip metadata from uploaded files so other people can't download them and find your location; if you upload a generated image to a social network, the network itself will likely remove the metadata. I tested this using the very image OpenAI used to show off their watermark. A screenshot of it not only fooled Content Credentials Verify, the tool couldn't even find possible matches, which would surface other copies of the image that still carry the watermarking and could suggest it's AI-generated. This image had already been uploaded to their service; by all means, they should have been able to find a match, but they couldn't.
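
As a rough illustration, here's a sketch with Pillow (filenames hypothetical) of what a screenshot or a site's re-encode effectively does: copy only the pixels into a fresh file and leave everything else behind.

```python
from PIL import Image

original = Image.open("dalle_output.png")  # hypothetical file
print("metadata before:", list(original.info.keys()))

# Copy only the raw pixels into a brand-new image -- roughly what a
# screenshot or a social network's re-encode does. Nothing from
# original.info comes along for the ride.
stripped = Image.new(original.mode, original.size)
stripped.putdata(list(original.getdata()))
stripped.save("stripped.png")

print("metadata after:", list(Image.open("stripped.png").info.keys()))
```

The pixels are identical; only the label is gone.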

Metadata watermarking is a poor method for verifying genuine art, and this appears to be a nascent implementation at that. It's just not up to the task yet.

Twitter spread deepfake porn of Taylor Swift. That's likely what kicked off this initiative. However, OpenAI's measures wouldn't have changed anything there. Not only would the people who created it not have been foolish enough to include the metadata when they uploaded it, but Twitter would have removed the metadata as part of the upload process anyway. OpenAI's watermarking looks to appease those mad about deepfake porn, without having the power to actually prevent it.

Maybe That’s the Point

Now for a cynical angle. OpenAI admits that their plan is full of flaws and that it would be easy to bypass this metadata, yet they still think it would lead to trustworthy data. I disagree. If you find their metadata on an image, that image most likely wasn't made by a human, though anyone could add the metadata to a real photo if, for some reason, they wanted it to look AI-generated. It's basically a text file attached to an image; you can just edit it. OpenAI says they're following the Coalition for Content Provenance and Authenticity (C2PA) standards. You can use their Content Credentials Verify tool to check an image's metadata for signs of AI creation.
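
To show how lightweight this kind of labeling is, here's a sketch with Pillow that attaches an arbitrary "this is AI" text tag to an ordinary photo. The tag names and filenames are made up for illustration; a full C2PA manifest is a structured, signed blob rather than a bare text chunk, but it lives in the same easily edited, easily discarded layer of the file.

```python
from PIL import Image
from PIL.PngImagePlugin import PngInfo

photo = Image.open("ordinary_photo.png")  # hypothetical: a real, human-made photo

meta = PngInfo()
# Made-up tags for illustration -- not a real C2PA manifest
meta.add_text("ai_generated", "true")
meta.add_text("generator", "SomeImageModel")

photo.save("mislabeled.png", pnginfo=meta)
print(Image.open("mislabeled.png").info)  # the photo now claims to be AI-made
```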

When I read how bad these standards are, I thought it would be funny if they were made by companies with investments in AI. Metadata watermarks? Who would that protect? If I didn't know better, I'd say they were made weak on purpose, to create confusion about AI-generated images. It would be like a wolf building the fence that protects the sheep: low enough for a wolf to jump over, with a latch only on the outside, at about wolf-nose height.

Well, oddly enough, the C2PA consists of a number of companies, including Microsoft, the lead backer of OpenAI's for-profit wing; Adobe, which has AI in its photo-editing products; Intel, which makes the processors AI runs on… and you're probably getting the point. Many of the companies creating these ridiculously low standards are the ones trying to sell AI products. That doesn't necessarily mean anything nefarious, but don't expect companies to invest heavily in standards that would make it easier to hold them accountable, at a cost to their own profit margins.

But what would good standards look like?

What We Actually Need

An image of a tree and an image of a cat. The cat was actually hidden in the image of the trees. Text in image reads: "Image of a tree with a steganographically hidden image. The hidden image is revealed by removing all but the two least significant bits of each color component and a subsequent normalization. The hidden image is shown below." "Image of a cat extracted from the tree image above."

Images/screenshot via Wikipedia, Cyp, CC3.0

Once again, steganography. With steganography, you can hide information inside the image itself. Rather than something like an attached text file, this would be information hidden in the pixels. Computers are very good at this. A simple bit flip, just a slight difference in a color at a specific position, can be a piece of data. The millions of pixels in an image can easily carry authorship data: a simple hashed ID representing the company behind the model that generated the image, the person who asked for the image to be generated, and a unique identifier for the image itself, which could link to copyright data. Make the image itself proof of where it came from and who's to credit—or blame—for it.
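
Here's a minimal sketch of the idea in Python with Pillow: a least-significant-bit scheme that hides a short, length-prefixed ID string in the pixels themselves. The filenames and ID format are hypothetical, and a production system would use something far more robust (and cryptographically signed), but it shows how a computer can tuck authorship data into an image without visibly changing it.

```python
from PIL import Image  # Pillow

def embed_id(image_path: str, payload: bytes, out_path: str) -> None:
    """Hide a short, length-prefixed payload in the pixels' least-significant bits."""
    img = Image.open(image_path).convert("RGB")
    flat = [channel for pixel in img.getdata() for channel in pixel]

    # Length-prefix the payload so the reader knows when to stop
    data = len(payload).to_bytes(4, "big") + payload
    bits = [(byte >> i) & 1 for byte in data for i in range(7, -1, -1)]
    if len(bits) > len(flat):
        raise ValueError("image too small for payload")

    for i, bit in enumerate(bits):
        flat[i] = (flat[i] & ~1) | bit  # overwrite only the lowest bit

    out = Image.new("RGB", img.size)
    out.putdata([tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)])
    out.save(out_path)  # save losslessly (e.g. PNG) so the bits survive

def extract_id(image_path: str) -> bytes:
    """Read the length prefix, then the payload, back out of the pixel LSBs."""
    img = Image.open(image_path).convert("RGB")
    bits = [channel & 1 for pixel in img.getdata() for channel in pixel]

    def read_bytes(start_bit: int, count: int) -> bytes:
        out = bytearray()
        for b in range(count):
            byte = 0
            for i in range(8):
                byte = (byte << 1) | bits[start_bit + b * 8 + i]
            out.append(byte)
        return bytes(out)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)

# Hypothetical usage: tag a generated image with model, user, and image IDs
embed_id("generated.png", b"model=example-gen-3;user=1234;img=abcd", "tagged.png")
print(extract_id("tagged.png"))  # b'model=example-gen-3;user=1234;img=abcd'
```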

You've likely already seen a crude version of this. AI can generate images with a hidden pattern: look at the full-sized image and it's some ordinary scene, but squint or view the thumbnail and it becomes something else. Now imagine that, but in a layer at the pixel level that a computer can "see" instead of a human. That's a huge oversimplification, but the practice, hiding data in plain sight, is the same.

Steganography is an old technique, dating back to ancient Greece. It's even been referenced in popular media, like the Doctor Who special Spyfall. It's a well-known technology, and AI-generated images are a perfect use for it.

This technique could be defeated by re-saving an image with lossy compression, making small tweaks to its pixels, or resizing it. The latter, though, could be addressed by repeating the embedded data through the image in a lattice pattern, so that even if the image is shrunk, the oversimplified "four pixels become one" downscaling would still potentially preserve the same data. This wouldn't be foolproof, but it would make the digital watermark far more difficult to remove. Applied to deepfaked videos, it would force someone to go through every frame to remove the watermarks, which could change position in each frame, making it nearly impossible to do without the help of, most likely, more AI. It would add a significant step to removing a real watermark, increasing the chances of a mistake that leads to someone getting caught. Steganography can even add a watermark to audio, which, when viewed in a waveform visualizer, could display information. This could prevent deepfake calls and help track down who is making the now-illegal political deepfake calls. Hiding information in plain sight is easy for a computer.
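
As a rough illustration of the repetition idea, the sketch below (Python with Pillow, hypothetical names again) tiles the same length-prefixed payload end-to-end across every pixel's least-significant bits, so any reasonably large region of the image carries a full copy. Real resize-resistant watermarks work in the frequency domain rather than on raw bits, and a reader would need a sync marker to find where each copy starts, but the redundancy principle is the same.

```python
from itertools import cycle
from PIL import Image

def embed_repeating(image_path: str, payload: bytes, out_path: str) -> None:
    """Tile one payload across the whole image so crops still contain a copy."""
    img = Image.open(image_path).convert("RGB")
    flat = [channel for pixel in img.getdata() for channel in pixel]

    # Length-prefix the payload, then repeat its bits end-to-end
    data = len(payload).to_bytes(2, "big") + payload
    bits = cycle((byte >> i) & 1 for byte in data for i in range(7, -1, -1))

    # Every color channel in the image carries one bit of some copy
    flat = [(channel & ~1) | next(bits) for channel in flat]

    out = Image.new("RGB", img.size)
    out.putdata([tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)])
    out.save(out_path)  # lossless format, so the bits survive saving

# Hypothetical usage
embed_repeating("generated.png", b"model=example-gen-3;img=abcd", "tagged.png")
```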

There’s no perfect solution, but there are many better than the ones proposed by C2PA and OpenAI. Even OpenAI admits their current solution isn’t foolproof. I hope that’s because they have a better solution in mind.

Telling the difference between real art and the fake images AI generates using—often without permission—the work of real artists is important. It allows us to separate truth from fiction online, and it protects people from having their image used for deepfaked porn. As the internet becomes flooded with fakes, we'll need a quick way to tell who's generating the images, what tools they're using, and what went into each generated image. It'll be important for protecting children from CSAM, celebrities and politicians from deepfaked images and videos, and all of us from digitally generated lies.

