Interpreting Structured Perturbations in Image Protection Methods for Diffusion Models

Martin, Michael R., Chan, Garrick, Ma, Kwan-Liu

arXiv.org Artificial Intelligence

Recent image protection mechanisms such as Glaze and Nightshade introduce imperceptible, adversarially designed perturbations intended to disrupt downstream text-to-image generative models. While their empirical effectiveness has been demonstrated, the internal structure, detectability, and representational behavior of these perturbations remain poorly understood. In this study, we present a systematic explainable AI analysis of image protection perturbations using a unified framework that integrates white-box feature-space inspection and black-box signal-level probing. Through latent-space clustering, feature-channel activation analysis, occlusion-based spatial sensitivity mapping, and frequency-domain spectral characterization, we reveal that modern protection mechanisms operate as structured, low-entropy perturbations that remain tightly coupled to underlying image content across representational, spatial, and spectral domains in all evaluated cases. We show that protected images preserve content-driven feature organization with protection-specific substructure rather than inducing global representational drift. Detectability is governed by interacting effects of perturbation entropy, spatial deployment, and frequency alignment, as revealed through combined synthetic and spectral analyses, with sequential protection amplifying detectable structure rather than suppressing it. Frequency-domain analysis further demonstrates that Glaze and Nightshade redistribute energy along dominant image-aligned frequency axes rather than introducing spectrally diffuse noise. These results suggest that contemporary image protection operates through structured feature-level deformation rather than semantic dislocation, providing mechanistic insight into why protection signals remain visually subtle yet consistently detectable. This work advances the interpretability of adversarial image protection and informs the design of future defenses and detection strategies for generative AI systems.
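The frequency-domain probe the abstract mentions can be pictured with a short sketch. The following is a minimal illustration, not the authors' pipeline: it assumes access to an aligned original/protected image pair of the same size (the file names are hypothetical), computes the radially averaged log-power spectrum of each, and uses the correlation between the residual's spectrum and the original's as a crude score of how "image-aligned" the perturbation's energy is. Spectrally diffuse noise would show a flat residual profile and low correlation; a structured, content-coupled perturbation would not.

```python
import numpy as np
from PIL import Image

def radial_spectrum(img_gray: np.ndarray) -> np.ndarray:
    """Radially averaged log-power spectrum of a grayscale image."""
    f = np.fft.fftshift(np.fft.fft2(img_gray))
    power = np.log1p(np.abs(f) ** 2)
    h, w = power.shape
    cy, cx = h // 2, w // 2
    y, x = np.indices(power.shape)
    r = np.hypot(y - cy, x - cx).astype(int)
    # Average power over rings of equal radius around the DC component.
    sums = np.bincount(r.ravel(), weights=power.ravel())
    counts = np.bincount(r.ravel())
    return sums / np.maximum(counts, 1)

def load_gray(path: str) -> np.ndarray:
    return np.asarray(Image.open(path).convert("L"), dtype=np.float64) / 255.0

original = load_gray("original.png")    # hypothetical file paths
protected = load_gray("protected.png")  # same image after protection
residual = protected - original         # the perturbation itself

# Correlation of the two radial profiles: one crude alignment measure.
alignment = np.corrcoef(radial_spectrum(original),
                        radial_spectrum(residual))[0, 1]
print(f"radial-spectrum correlation: {alignment:.3f}")
```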


This tool strips away anti-AI protections from digital art

MIT Technology Review

To be clear, the researchers behind LightShed aren't trying to steal artists' work. They just don't want people to get a false sense of security. "You will not be sure if companies have methods to delete these poisons but will never tell you," says Hanna Foerster, a PhD student at the University of Cambridge and the lead author of a paper on the work. And if they do, it may be too late to fix the problem. AI models work, in part, by implicitly creating boundaries between what they perceive as different categories of images.
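The boundary intuition in that last sentence is easy to see in a toy example. The sketch below illustrates only the general idea, not LightShed or Nightshade themselves: it fits a linear classifier to two synthetic clusters standing in for image categories, then adds a handful of mislabeled "poisoned" points and prints how the learned boundary weights shift.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Two synthetic feature clusters standing in for "cat" and "dog" images.
cats = rng.normal(loc=[-2.0, 0.0], scale=1.0, size=(200, 2))
dogs = rng.normal(loc=[+2.0, 0.0], scale=1.0, size=(200, 2))
X = np.vstack([cats, dogs])
y = np.array([0] * 200 + [1] * 200)

clean = LogisticRegression().fit(X, y)

# Poison: dog-like points labeled "cat", mimicking how mislabeled
# training pairs drag the implicit category boundary.
poison = rng.normal(loc=[+2.0, 0.0], scale=0.5, size=(40, 2))
Xp = np.vstack([X, poison])
yp = np.concatenate([y, np.zeros(40, dtype=int)])

poisoned = LogisticRegression().fit(Xp, yp)

print("clean boundary:   ", clean.coef_[0], clean.intercept_)
print("poisoned boundary:", poisoned.coef_[0], poisoned.intercept_)
```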


The AI lab waging a guerrilla war over exploitative AI

MIT Technology Review

On the call, artists shared details of how they had been hurt by the generative AI boom, which was then brand new. At that moment, AI was suddenly everywhere. The tech community was buzzing over image-generating AI models, such as Midjourney, Stable Diffusion, and OpenAI's DALL-E 2, which could follow simple word prompts to depict fantasylands or whimsical chairs made of avocados. But these artists saw this technological wonder as a new kind of theft. They felt the models were effectively stealing and replacing their work.


2024 Innovator of the Year: Shawn Shan builds tools to help artists fight back against exploitative AI

MIT Technology Review

Now artists are fighting back. And some of the most powerful tools they have were built by Shawn Shan, 26, a PhD student in computer science at the University of Chicago (and MIT Technology Review's 2024 Innovator of the Year). Shan got his start in AI security and privacy as an undergraduate there and participated in a project that built Fawkes, a tool to protect faces from facial recognition technology. But it was conversations with artists who had been hurt by the generative AI boom that propelled him into the middle of one of the biggest fights in the field. Soon after learning about the impact on artists, Shan and his advisors Ben Zhao (who made our Innovators Under 35 list in 2006) and Heather Zheng (who was on the 2005 list) decided to build a tool to help. They gathered input from more than a thousand artists to learn what they needed and how they would use any protective technology.


Poisoning Data to Protect It

Communications of the ACM

After they released a tool designed to foil facial recognition systems in 2020, computer scientist Ben Zhao and his colleagues at the University of Chicago received a confusing email. Their solution, Fawkes, subtly alters the pixels in digital portraits, rendering images incomprehensible to automated facial recognition systems. So when an artist emailed Zhao to ask whether Fawkes might be used to protect her work, he did not see the connection. Then news of revolutionary generative artificial intelligence (AI) solutions like Midjourney and Dall-E began to spread. Digital illustrations, photographs, and other visual works had been scraped from the Internet to train various generative models without the consent of the creators.
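Fawkes's core idea, as described by its authors, is feature-space cloaking: perturb a portrait so a recognition model's embedding of it drifts toward a decoy identity while the pixel changes stay too small to notice. A minimal PyTorch sketch of that idea follows; the function, its arguments, and the `extractor` model are illustrative assumptions, not the Fawkes code.

```python
import torch
import torch.nn.functional as F

def cloak(image, target_feat, extractor, budget=0.03, steps=100, lr=0.01):
    """Feature-space cloaking sketch (assumption: `extractor` is any
    differentiable face-feature model; this is not the Fawkes release)."""
    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        feat = extractor((image + delta).clamp(0, 1))
        # Pull the cloaked embedding toward the decoy identity.
        loss = 1 - F.cosine_similarity(feat, target_feat, dim=-1).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Project back into the imperceptibility budget (L-infinity ball).
        with torch.no_grad():
            delta.clamp_(-budget, budget)
    return (image + delta).clamp(0, 1).detach()
```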


Poison pill tool could break AI systems stealing unauthorized data, allowing artists to safeguard their works

FOX News

AI image generators Midjourney and Stable Diffusion trained their models on the works of countless artists without their permission or compensation, an artist says. A new image protection tool was designed to poison AI programs trained on unauthorized data, giving creators a way to safeguard their pieces and harm systems they say are stealing their works. Nightshade, a new tool from a University of Chicago team, embeds data in an image's pixels that damages AI image generators scouring the web for training pictures, causing them to malfunction. An AI program might interpret a Nightshade-protected image of a dog as a cat, for example, or a photo of a car as a cow, according to the team's research.
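The dog-as-cat behavior described above comes from optimizing the image so that the model's own encoder "sees" the wrong concept. Here is a hedged sketch of that mechanism, illustrative only: `encoder` stands in for whatever differentiable image encoder an attacker targets, and none of this is the released Nightshade tool.

```python
import torch
import torch.nn.functional as F

def poison(dog_img, cat_anchor, encoder, budget=0.05, steps=200, lr=0.005):
    """Concept-poisoning sketch: perturb a dog photo so `encoder` embeds
    it like a cat anchor image, under a small L-infinity pixel budget."""
    with torch.no_grad():
        target = encoder(cat_anchor)  # where we want the dog image to land
    delta = torch.zeros_like(dog_img, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        emb = encoder((dog_img + delta).clamp(0, 1))
        loss = F.mse_loss(emb, target)  # pull the embedding toward "cat"
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-budget, budget)  # keep the change imperceptible
    return (dog_img + delta).clamp(0, 1).detach()
```

A model trained on many such image-label pairs gradually learns the wrong association, which is why the effect compounds across scraped training data rather than depending on any single image.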


New tool lets artists fight AI image bots by hiding corrupt data in plain sight

Engadget

From Hollywood strikes to digital portraits, AI's potential to steal creatives' work and how to stop it has dominated the tech conversation in 2023. The latest effort to protect artists and their creations is Nightshade, a tool allowing artists to add undetectable pixels into their work that could corrupt an AI's training data, the MIT Technology Review reports. University of Chicago professor Ben Zhao and his team created Nightshade, which is currently being peer reviewed, in an effort to put some of the power back in artists' hands. They tested it on recent Stable Diffusion models and an AI they personally built from scratch. Nightshade essentially works as a poison, altering how a machine-learning model produces content and what that finished product looks like.


This new tool could give artists an edge over AI

MIT Technology Review

Some have united in protest against the tech sector's common practice of indiscriminately scraping their visual work off the internet to train their models. Artists have staged protests on popular art platforms such as DeviantArt and Art Station, or left the platforms entirely. Right now, there is total power asymmetry between rich and influential technology companies and artists, says Ben Zhao, a computer science professor at the University of Chicago. "The training companies can do whatever the heck they want," Zhao says. But a new tool developed by Zhao's lab might change that power dynamic.


This new data poisoning tool lets artists fight back against generative AI

MIT Technology Review

Meta, Google, Stability AI, and OpenAI did not respond to MIT Technology Review's request for comment on how they might respond. Zhao's team also developed Glaze, a tool that allows artists to "mask" their own personal style to prevent it from being scraped by AI companies. It works in a similar way to Nightshade: by changing the pixels of images in subtle ways that are invisible to the human eye but manipulate machine-learning models to interpret the image as something different from what it actually shows. The team intends to integrate Nightshade into Glaze, and artists can choose whether they want to use the data-poisoning tool or not. The team is also making Nightshade open source, which would allow others to tinker with it and make their own versions.
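Schematically, style masking of this kind can be written as a constrained optimization: find a small perturbation that drags a feature extractor's view of the image toward a decoy style while a perceptual metric keeps the change invisible. The formulation below is our rough paraphrase of that idea, not Glaze's exact loss: here $\Phi$ is a feature extractor, $\Omega(x, T)$ a style transfer of image $x$ into decoy style $T$, and $p$ a perceptual budget.

```latex
% Schematic style-cloaking objective (a paraphrase, not the exact Glaze loss)
\min_{\delta}\;
\bigl\lVert \Phi(x + \delta) - \Phi\bigl(\Omega(x, T)\bigr) \bigr\rVert_2^2
\qquad \text{subject to} \qquad
\mathrm{LPIPS}(x,\, x + \delta) \le p
```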


Laura Ingraham torches woke 'Transformers': 'Shoving' pronouns down kids' 'throats' with 'corrosive lies'

FOX News

Fox News host Laura Ingraham reacts to a segment from the children's show "Transformers" in which two animated characters revealed that they are non-binary. During Friday's edition of "The Ingraham Angle," host Laura Ingraham spotlighted the children's show "Transformers: EarthSpark" for its decision to roll out non-binary characters that go by they/them and she/they pronouns. "As a parent, you often ask yourself, What do my kids need? A loving family, a roof over their heads, a life grounded in faith and freedom. But you probably never thought what they really, really need is a non-binary robot. But that's exactly what Paramount thinks they need, because in Transformers: EarthSpark, that's what they're giving them," Ingraham began during Friday's show.