nudity
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Asia > South Korea > Seoul > Seoul (0.04)
- North America > United States > Minnesota (0.04)
- North America > United States > Michigan (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Security & Privacy (1.00)
- Law (0.67)
Elon Musk's Grok 'Undressing' Problem Isn't Fixed
X has placed more restrictions on Grok's ability to generate explicit AI images, but tests show that the updates have created a patchwork of limitations that fails to fully address the issue. Elon Musk's X has introduced new restrictions stopping people from editing and generating images of real people in bikinis or other "revealing clothing." The change in policy on Wednesday night follows global outrage at Grok being used to generate thousands of harmful, non-consensual "undressing" photos of women and sexualized images of apparent minors on X. However, while some safety measures appear to have finally been introduced to Grok's image generation on X, the standalone Grok app and website still seem able to generate "undress"-style images and pornographic content, according to multiple tests by researchers, WIRED, and other journalists. Other users, meanwhile, say they're no longer able to create images and videos as they once were.
- North America > United States > Minnesota (0.05)
- North America > United States > California (0.05)
- Europe > United Kingdom (0.05)
- (12 more...)
- Information Technology > Security & Privacy (0.47)
- Media > News (0.35)
Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models
Machine unlearning, also known as concept erasing, has been developed to address the risk that diffusion models (DMs) generate harmful content. However, these techniques remain vulnerable to adversarial prompt attacks, which can prompt DMs post-unlearning to regenerate undesired images containing concepts (such as nudity) meant to be erased. This work aims to enhance the robustness of concept erasing by integrating the principle of adversarial training (AT) into machine unlearning, resulting in the robust unlearning framework referred to as AdvUnlearn. However, achieving this effectively and efficiently is highly nontrivial. First, we find that a straightforward implementation of AT compromises DMs' image-generation quality post-unlearning. To address this, we develop a utility-retaining regularization on an additional retain set, optimizing the trade-off between concept-erasure robustness and model utility in AdvUnlearn.
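The structure described in the abstract — an inner adversarial prompt attack wrapped in an outer erasure objective with a utility-retaining regularizer — can be sketched in miniature. Everything below is illustrative: the "model" is a dot product on toy prompt embeddings, and `attack_prompt`, `advunlearn_loss`, and all parameters are hypothetical stand-ins, not the paper's actual code or notation.

```python
def dot(a, b):
    """Toy 'model': the concept response is a dot product between
    model weights w and a prompt embedding x."""
    return sum(x * y for x, y in zip(a, b))

def attack_prompt(w, prompt, eps=0.5):
    """Inner AT step: nudge each coordinate of the concept prompt in the
    direction that increases the model's residual concept response
    (a one-step, sign-based stand-in for iterative prompt attacks)."""
    return [p + eps * (1.0 if wi > 0 else -1.0) for p, wi in zip(prompt, w)]

def advunlearn_loss(w, concept_prompt, retain_pairs, lam=1.0, eps=0.5):
    """Outer objective: erase the concept on the *attacked* prompt
    (robust erasure), plus a utility-retaining regularizer that keeps
    the model's outputs on benign retain prompts unchanged."""
    adv = attack_prompt(w, concept_prompt, eps)
    erase = dot(w, adv) ** 2  # drive the attacked concept response to 0
    retain = sum((dot(w, p) - t) ** 2 for p, t in retain_pairs) / len(retain_pairs)
    return erase + lam * retain

w = [1.0, -1.0]               # toy model weights
concept = [1.0, 1.0]          # prompt embedding for the erased concept
retain = [([1.0, 0.0], 1.0)]  # (benign prompt, original output) pairs
loss = advunlearn_loss(w, concept, retain)
```

With `eps=0` the concept prompt here already scores zero and the erase term vanishes; the adversarial perturbation is what exposes, and lets training penalize, the residual concept response — the same reason plain unlearning without AT is fragile.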
Memory Self-Regeneration: Uncovering Hidden Knowledge in Unlearned Models
Polowczyk, Agnieszka, Polowczyk, Alicja, Waczyńska, Joanna, Borycki, Piotr, Spurek, Przemysław
The impressive capability of modern text-to-image models to generate realistic visuals comes with a serious drawback: they can be misused to create harmful, deceptive, or unlawful content. This has accelerated the push for machine unlearning, a new field that seeks to selectively remove specific knowledge from a model without causing a drop in its overall performance. However, it turns out that actually forgetting a given concept is an extremely difficult task. Models exposed to adversarial-prompt attacks retain the ability to generate so-called unlearned concepts, which can be not only harmful but also illegal. In this paper, we examine models' ability to forget and recall knowledge, introducing the Memory Self-Regeneration task. Furthermore, we present the MemoRa strategy, a regenerative approach that supports the effective recovery of previously lost knowledge. Moreover, we propose that robustness in knowledge retrieval is a crucial yet underexplored evaluation measure for developing more robust and effective unlearning techniques. Finally, we demonstrate that forgetting occurs in two distinct ways: short-term, where concepts can be quickly recalled, and long-term, where recovery is more challenging. Code is available at https://gmum.github.io/MemoRa/.
- North America > United States > Virginia (0.04)
- Asia (0.04)
- Information Technology (0.67)
- Health & Medicine > Therapeutic Area > Neurology (0.46)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Erasing Undesirable Concepts in Diffusion Models with Adversarial Preservation
Diffusion models excel at generating visually striking content from text but can inadvertently produce undesirable or harmful content when trained on unfiltered internet data. A practical solution is to selectively remove target concepts from the model, but this may impact the remaining concepts. Prior approaches have tried to balance this by introducing a loss term to preserve neutral content or a regularization term to minimize changes in the model parameters, yet resolving this trade-off remains challenging.
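The two prior-work trade-off strategies the abstract mentions — a loss term preserving neutral outputs, and a regularizer limiting parameter drift — can be written as one combined toy objective. The function and all inputs below are hypothetical flat-list stand-ins for real diffusion-model predictions and weights, not code from any of the cited papers.

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def erase_with_preservation(pred_concept, neutral_target,
                            pred_neutral, neutral_before,
                            params, params_before,
                            alpha=1.0, beta=0.0):
    """Concept-erasure objective with both prior-work trade-off terms:
    alpha weights a loss keeping neutral-content outputs unchanged,
    beta weights a regularizer limiting parameter drift."""
    erase = mse(pred_concept, neutral_target)     # remove the target concept
    keep_out = mse(pred_neutral, neutral_before)  # preserve neutral outputs
    keep_par = mse(params, params_before)         # preserve model parameters
    return erase + alpha * keep_out + beta * keep_par
```

Raising `alpha` or `beta` protects unrelated concepts at the cost of weaker erasure, and vice versa; the abstract's point is that no fixed weighting resolves this tension cleanly, which motivates the paper's adversarial-preservation alternative.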
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Asia > South Korea > Seoul > Seoul (0.04)
- North America > United States > Minnesota (0.04)
- North America > United States > Michigan (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Security & Privacy (1.00)
- Transportation (0.69)
- Law (0.67)