Is trying to watermark AI images a losing battle?

A new paper pokes holes in the ways companies are trying to tag generated visuals.

Francis Scialabba


With a flood of sophisticated AI imagery loosening the internet’s already-shaky grip on media reality, one of the most-discussed possible fixes is the tried-and-true watermark.

Concerned parties from Google to the White House have floated the idea of embedding signifiers in AI-generated images, whether perceptible to the human eye or not, as a way to differentiate them from unaltered photos and art.

But a new preprint paper from researchers at the University of Maryland casts some doubt on that endeavor. The team tested how easy it was to fool various watermarking techniques as well as how the introduction of false positives—real images incorrectly watermarked as AI-generated—could muddy the waters.

After laying out the many ways watermarking can fail, the paper concluded that “based on our results, designing a robust watermark is a challenging, but not necessarily impossible task.” In a conversation with Tech Brew, however, co-author Soheil Feizi, an associate professor of computer science at the University of Maryland, was perhaps even more pessimistic.

“I don’t believe that by just looking at an image we will be able to tell if this is AI-generated or real, especially with the current advances that we have in pixel image models,” Feizi said. “So this problem becomes increasingly more difficult.”

Marking momentum

Conclusions like these, however, don’t seem to have put much of a damper on enthusiasm around watermarking as a guardrail in the tech and policy world.

At a meeting at the White House in July, top companies in the space like OpenAI, Anthropic, Google, and Meta committed to developing ways for users to tell when content is AI-generated “such as a watermarking system,” though academics at the time were skeptical of the agreements as a whole.

More recently, Google DeepMind unveiled SynthID, a watermarking system that embeds a signal imperceptible to the human eye directly into the pixels of an image as it's being created.

Feizi and his co-authors classified systems like SynthID as “low-perturbation” techniques, or those where the watermark is invisible to the naked eye. The problem with these methods, according to the paper, is that if the images are subjected to a certain type of tampering, there is a fundamental trade-off between the number of AI-generated images that can slip through undetected and the portion of real images falsely tagged as AI creations.
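The trade-off can be illustrated with a toy model (all names and numbers here are illustrative stand-ins, not figures from the paper): imagine a detector that scores each image for watermark strength, and an attack that adds noise until the score distributions of watermarked and real images overlap. Any detection threshold then trades missed AI images against falsely flagged real ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy watermark-strength scores (illustrative only): watermarked AI images
# score high, unwatermarked real images score low.
watermarked = rng.normal(loc=1.0, scale=0.2, size=10_000)
real = rng.normal(loc=0.0, scale=0.2, size=10_000)

def error_rates(threshold, attack_noise=0.0):
    """Fraction of AI images that evade detection, and fraction of real
    images falsely flagged, after an attack that perturbs every score."""
    w = watermarked + rng.normal(0, attack_noise, watermarked.size)
    r = real + rng.normal(0, attack_noise, real.size)
    missed_ai = np.mean(w < threshold)   # AI images slipping through
    false_pos = np.mean(r >= threshold)  # real images tagged as AI
    return missed_ai, false_pos

# With no attack, a mid-range threshold separates the populations cleanly.
print(error_rates(0.5, attack_noise=0.0))
# Under heavy noise the distributions overlap: lowering the threshold to
# catch more AI images necessarily flags more real ones, and vice versa.
print(error_rates(0.5, attack_noise=1.0))
print(error_rates(0.0, attack_noise=1.0))
```

Once the attack blurs the two populations together, no threshold escapes the trade-off: every gain on one error rate is paid for on the other.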


This tampering, used in the experiments detailed in the University of Maryland paper, is what the authors call a "diffusion purification attack": adding noise to an image and then denoising it with an AI diffusion model, the same type of tech at play in most modern image-generation systems.
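In sketch form, the attack looks like the following (a minimal toy, not the paper's implementation: a simple neighborhood-averaging filter stands in for the learned denoiser, where a real attack would run the noisy image through a pretrained diffusion model):

```python
import numpy as np

def denoise(img):
    # Crude stand-in for a diffusion model's learned denoiser: average each
    # pixel with its four neighbors (edge pixels are replicated).
    p = np.pad(img, 1, mode="edge")
    return (p[1:-1, 1:-1] + p[:-2, 1:-1] + p[2:, 1:-1]
            + p[1:-1, :-2] + p[1:-1, 2:]) / 5.0

def diffusion_purification_attack(image, noise_std=0.2, denoise_passes=5, seed=1):
    """Two-step sketch of the attack: drown the watermark in noise, then
    reconstruct a clean-looking image by repeated denoising."""
    noisy = image + np.random.default_rng(seed).normal(0, noise_std, image.shape)
    purified = noisy
    for _ in range(denoise_passes):
        purified = denoise(purified)
    return np.clip(purified, 0.0, 1.0)

# A smooth toy "photo" plus a faint high-frequency watermark pattern.
x = np.linspace(0, 1, 64)
base = 0.2 + 0.6 * np.outer(x, x)                        # smooth gradient
watermark = 0.02 * np.sign(np.sin(40 * x))[:, None] * np.ones((1, 64))
attacked = diffusion_purification_attack(base + watermark)
```

The noise step scrambles the fine-grained watermark pattern; the denoising passes recover an image that still looks like the original, leaving much less of the mark behind for a detector to find.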

“What we show theoretically, is that [low-perturbation techniques] will never be able to become a reliable approach,” Feizi said. “Basically, results suggest that reliable methods won’t be possible even in the future.”

Then there are the actually visible watermarks—or “high-perturbation” techniques, in the technical parlance. Getting in the mindset of potential bad actors, Feizi and his team designed attacks that could successfully remove those watermarks.

They were also able to convincingly fake this type of watermark, meaning nefarious parties could stamp real, potentially obscene images so they appear to be the product of an image-generation model, something the paper pointed out could damage developers' reputations.

If viewers can’t trust that a watermark reliably signals whether an image is AI-generated, the mark becomes effectively useless, Feizi said. “Some people argue that even though watermarking is not reliable, it provides ‘some information,’” Feizi said. “But the argument here is that it actually provides zero information.”

“Be skeptical of the content”

This paper isn’t the first to question the feasibility of watermarking systems. Another recent paper from UC Santa Barbara and Carnegie Mellon University demonstrated that imperceptible watermarks can be removed.

Feizi also co-authored a paper in June arguing that watermarking AI-generated text is “not reliable in practical scenarios.” At the time, companies like OpenAI were discussing the method as a way to mark language generated by programs like ChatGPT with subtle patterns.

But the development of effective guardrails is taking on more urgency as the US barrels toward a presidential election that experts worry could be plagued by all sorts of new AI-generated misinformation.

“It is a bit scary to think of the implications,” Feizi said. “We’ve got to make sure that we educate people to be skeptical of the content—either text or images or videos that may be released…in terms of their authenticity.”
