Gandikota, Rohit
SliderSpace: Decomposing the Visual Capabilities of Diffusion Models
Gandikota, Rohit, Wu, Zongze, Zhang, Richard, Bau, David, Shechtman, Eli, Kolkin, Nick
We present SliderSpace, a framework for automatically decomposing the visual capabilities of diffusion models into controllable and human-understandable directions. Unlike existing control methods that require a user to specify attributes for each edit direction individually, SliderSpace discovers multiple interpretable and diverse directions simultaneously from a single text prompt. Each direction is trained as a low-rank adaptor, enabling compositional control and the discovery of surprising possibilities in the model's latent space. Through extensive experiments on state-of-the-art diffusion models, we demonstrate SliderSpace's effectiveness across three applications: concept decomposition, artistic style exploration, and diversity enhancement. Our quantitative evaluation shows that SliderSpace-discovered directions decompose the visual structure of the model's knowledge effectively, offering insights into the latent capabilities encoded within diffusion models. User studies further validate that our method produces more diverse and useful variations compared to baselines. Our code, data, and trained weights are available at https://sliderspace.baulab.info
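For illustration, here is a minimal sketch (assuming PyTorch; not the authors' released implementation) of how a single SliderSpace-style direction could be realized as a low-rank adaptor whose strength is exposed as a scalar slider at inference time:

```python
# Hedged sketch: a low-rank "slider" wrapping a frozen projection layer.
import torch
import torch.nn as nn

class LowRankSlider(nn.Module):
    """Adds a scaled low-rank update on top of a frozen linear layer."""
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)          # keep pretrained weights frozen
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)       # start as an identity edit
        self.scale = 0.0                     # slider strength, set by the user

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))

# Usage: wrap a projection inside a diffusion backbone, train the low-rank
# weights for one discovered direction, then sweep `scale` to apply the edit.
layer = LowRankSlider(nn.Linear(320, 320), rank=4)
layer.scale = 1.5
out = layer(torch.randn(2, 320))
```

Several such adaptors, each trained for a different discovered direction, can be attached to the same layers and their scales swept independently, which is what enables the compositional control described above.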
Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities
Che, Zora, Casper, Stephen, Kirk, Robert, Satheesh, Anirudh, Slocum, Stewart, McKinney, Lev E, Gandikota, Rohit, Ewart, Aidan, Rosati, Domenic, Wu, Zichu, Cai, Zikui, Chughtai, Bilal, Gal, Yarin, Huang, Furong, Hadfield-Menell, Dylan
Evaluations of large language model (LLM) risks and capabilities are increasingly being incorporated into AI risk management and governance frameworks. Currently, most risk evaluations are conducted by designing inputs that elicit harmful behaviors from the system. However, a fundamental limitation of this approach is that the harmfulness of the behaviors identified during any particular evaluation can only lower bound the model's worst-possible-case behavior. As a complementary method for eliciting harmful behaviors, we propose evaluating LLMs with model tampering attacks, which allow for modifications to latent activations or weights. We pit state-of-the-art techniques for removing harmful LLM capabilities against a suite of 5 input-space and 6 model tampering attacks. In addition to benchmarking these methods against each other, we show that (1) model resilience to capability elicitation attacks lies on a low-dimensional robustness subspace; (2) the attack success rate of model tampering attacks can empirically predict and offer conservative estimates for the success of held-out input-space attacks; and (3) state-of-the-art unlearning methods can easily be undone within 16 steps of fine-tuning. Together, these results highlight the difficulty of removing harmful LLM capabilities and show that model tampering attacks enable substantially more rigorous evaluations than input-space attacks alone. We release models at https://huggingface.co/LLM-GAT
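As a hedged sketch of the fine-tuning flavour of model tampering (assuming Hugging Face Transformers; the model name and probe text below are placeholders, not the paper's evaluation suite):

```python
# Illustrative only: a short fine-tuning run used as a tampering attack to
# probe whether a "removed" capability can be recovered in a few steps.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"   # placeholder; the paper evaluates safety-tuned open-weight LLMs
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
probe_text = "hypothetical prompt probing the supposedly removed capability"

for step in range(16):   # the abstract reports unlearning being undone within 16 steps
    batch = tok(probe_text, return_tensors="pt")
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```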
Unified Concept Editing in Diffusion Models
Gandikota, Rohit, Orgad, Hadas, Belinkov, Yonatan, Materzyńska, Joanna, Bau, David
Text-to-image models suffer from various safety issues that may limit their suitability for deployment. Previous methods have separately addressed individual issues of bias, copyright, and offensive content in text-to-image models. However, in the real world, all of these issues appear simultaneously in the same model. We present a method that tackles all issues with a single approach. Our method, Unified Concept Editing (UCE), edits the model without training, using a closed-form solution, and scales seamlessly to concurrent edits on text-conditional diffusion models. We demonstrate scalable simultaneous debiasing, style erasure, and content moderation by editing text-to-image projections, and we present extensive experiments demonstrating improved efficacy and scalability over prior work. Our code is available at https://unified.baulab.info
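A simplified sketch of a closed-form projection edit in the spirit of UCE (the exact objective and preservation terms are in the paper and code at https://unified.baulab.info; the function below is an illustrative ridge least-squares variant, not the released implementation):

```python
# Hedged sketch: edit a text-to-image projection W so that edited concept
# embeddings map to new targets while preserved concepts keep their outputs.
import torch

def closed_form_edit(W_old, C_edit, V_target, C_keep, reg=1e-2):
    """
    W_old:    (d_out, d_in) pretrained projection
    C_edit:   (n_e, d_in)   embeddings of concepts to edit
    V_target: (n_e, d_out)  desired outputs for those concepts
    C_keep:   (n_k, d_in)   embeddings whose outputs must be preserved
    reg:      small ridge term (assumption, for numerical stability)
    """
    d_in = W_old.shape[1]
    A = V_target.T @ C_edit + W_old @ C_keep.T @ C_keep
    B = C_edit.T @ C_edit + C_keep.T @ C_keep + reg * torch.eye(d_in)
    return A @ torch.linalg.inv(B)   # new (d_out, d_in) projection

W_new = closed_form_edit(torch.randn(64, 32), torch.randn(3, 32),
                         torch.randn(3, 64), torch.randn(10, 32))
```

Because the update is closed-form, many concept edits can be folded into the same projection at once, which is what makes concurrent edits scale without any gradient-based training.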
Erasing Conceptual Knowledge from Language Models
Gandikota, Rohit, Feucht, Sheridan, Marks, Samuel, Bau, David
Concept erasure in language models has traditionally lacked a comprehensive evaluation framework, leading to incomplete assessments of the effectiveness of erasure methods. We propose an evaluation paradigm centered on three critical criteria: innocence (complete knowledge removal), seamlessness (maintaining conditional fluent generation), and specificity (preserving unrelated task performance). Our evaluation metrics naturally motivate the development of Erasure of Language Memory (ELM), a new method designed to address all three dimensions. ELM employs targeted low-rank updates to alter output distributions for erased concepts while preserving overall model capabilities, including fluency when prompted for an erased concept. We demonstrate ELM's efficacy on biosecurity, cybersecurity, and literary domain erasure tasks. Comparative analysis shows that ELM achieves superior performance across our proposed metrics, including near-random scores on erased topic assessments, generation fluency, maintained accuracy on unrelated benchmarks, and robustness under adversarial attacks. Our code, data, and trained models are available at https://elm.baulab.info
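As an illustration of the erase-versus-preserve trade-off the abstract describes, here is a hedged sketch of an ELM-style training objective; the loss terms and weighting are assumptions for exposition, not the authors' implementation:

```python
# Illustrative two-term objective: push the edited model's distribution toward
# a target on erased-concept prompts, keep it close to the original elsewhere.
import torch
import torch.nn.functional as F

def elm_style_loss(logits_edited_erase, logits_target_erase,
                   logits_edited_retain, logits_original_retain,
                   retain_weight=1.0):
    # Erasure term: match the desired distribution on erased-concept prompts.
    erase_loss = F.kl_div(F.log_softmax(logits_edited_erase, dim=-1),
                          F.softmax(logits_target_erase, dim=-1),
                          reduction="batchmean")
    # Specificity term: stay close to the original model on unrelated prompts.
    retain_loss = F.kl_div(F.log_softmax(logits_edited_retain, dim=-1),
                           F.softmax(logits_original_retain, dim=-1),
                           reduction="batchmean")
    return erase_loss + retain_weight * retain_loss

loss = elm_style_loss(torch.randn(4, 50257), torch.randn(4, 50257),
                      torch.randn(4, 50257), torch.randn(4, 50257))
```

In ELM the trainable parameters would be targeted low-rank updates rather than the full model, which keeps the edit localized and the rest of the model's behavior intact.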
DC-Art-GAN: Stable Procedural Content Generation using DC-GANs for Digital Art
Gandikota, Rohit, Brown, Nik Bear
Digital art is an artistic practice that uses digital technologies as part of the generative or creative process. With the advent of digital currency and NFTs (Non-Fungible Tokens), the demand for digital art is growing aggressively. In this manuscript, we advocate using deep generative networks with adversarial training for stable and varied art generation. The work mainly focuses on the Deep Convolutional Generative Adversarial Network (DC-GAN) and explores techniques to address common pitfalls in GAN training. We compare various DC-GAN architectures and designs to arrive at a recommendable design choice for stable and realistic generation. The main goal of the work is to generate realistic images that do not exist in reality but are synthesised from random noise by the proposed model. We provide visual results of generated animal face images (some showing a blend of species), along with recommendations for training, architecture, and design choices. We also show how preprocessing of the training images plays a massive role in GAN training.
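For reference, a minimal DC-GAN generator in PyTorch (an illustrative re-creation, not the exact architecture studied in the paper): transposed convolutions with batch normalization map a noise vector to a 64x64 RGB image.

```python
# Hedged sketch of a standard DC-GAN generator; layer widths are assumptions.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, z_dim=100, base=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, base * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(base * 8), nn.ReLU(True),           # 4x4
            nn.ConvTranspose2d(base * 8, base * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base * 4), nn.ReLU(True),           # 8x8
            nn.ConvTranspose2d(base * 4, base * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base * 2), nn.ReLU(True),           # 16x16
            nn.ConvTranspose2d(base * 2, base, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base), nn.ReLU(True),               # 32x32
            nn.ConvTranspose2d(base, 3, 4, 2, 1, bias=False),
            nn.Tanh(),                                         # 64x64 RGB in [-1, 1]
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1))

fake = Generator()(torch.randn(8, 100))   # -> (8, 3, 64, 64)
```

Pairing this generator with a mirrored convolutional discriminator and normalizing training images to [-1, 1] (to match the Tanh output) reflects the kind of preprocessing and design choices the abstract highlights as important for stable GAN training.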