
Collaborating Authors: Torr, Philip


Attacking Multimodal OS Agents with Malicious Image Patches

arXiv.org Artificial Intelligence

Recent advances in operating system (OS) agents enable vision-language models to interact directly with the graphical user interface of an OS. These multimodal OS agents autonomously perform computer-based tasks in response to a single prompt via application programming interfaces (APIs). Such APIs typically support low-level operations, including mouse clicks, keyboard inputs, and screenshot captures. We introduce a novel attack vector: malicious image patches (MIPs) that have been adversarially perturbed so that, when captured in a screenshot, they cause an OS agent to perform harmful actions by exploiting specific APIs. For instance, MIPs embedded in desktop backgrounds or shared on social media can redirect an agent to a malicious website, enabling further exploitation. These MIPs generalise across different user requests and screen layouts, and remain effective for multiple OS agents. The existence of such attacks highlights critical security vulnerabilities in OS agents, which should be carefully addressed before their widespread adoption.
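
The abstract describes MIPs as image patches adversarially perturbed so that, once captured in a screenshot, they steer the agent toward an attacker-chosen action. A minimal sketch of such an optimisation loop is given below; the surrogate model, the target action index, and the fixed top-left patch placement are all assumptions for illustration, not the paper's actual pipeline.

```python
# Hedged sketch of patch optimisation in the spirit of the MIP attack above.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical surrogate: maps a screenshot to logits over discrete agent actions.
surrogate = nn.Sequential(
    nn.Conv2d(3, 8, 5, stride=4), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
surrogate.eval()

screenshot = torch.rand(1, 3, 224, 224)       # stand-in for a captured screenshot
target_action = torch.tensor([3])             # assumed index of the harmful action
patch = torch.zeros(1, 3, 32, 32, requires_grad=True)
eps, alpha, steps = 16 / 255, 2 / 255, 200

for _ in range(steps):
    # Place the bounded patch perturbation in the top-left corner of the screen.
    perturbation = F.pad(patch, (0, 224 - 32, 0, 224 - 32))
    x = (screenshot + perturbation).clamp(0, 1)
    loss = F.cross_entropy(surrogate(x), target_action)
    loss.backward()
    with torch.no_grad():                     # PGD step toward the target action
        patch -= alpha * patch.grad.sign()
        patch.clamp_(-eps, eps)
    patch.grad = None
```

In the paper the target is a real vision-language OS agent reached through its screenshot API rather than a toy classifier; the sketch only illustrates the projected-gradient structure such an attack would take.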


Mixture of Experts Made Intrinsically Interpretable

arXiv.org Artificial Intelligence

Neurons in large language models often exhibit polysemanticity, simultaneously encoding multiple unrelated concepts and obscuring interpretability. Instead of relying on post-hoc methods, we present MoE-X, a Mixture-of-Experts (MoE) language model designed to be intrinsically interpretable. Our approach is motivated by the observation that, in language models, wider networks with sparse activations are more likely to capture interpretable factors. However, directly training such large sparse networks is computationally prohibitive. MoE architectures offer a scalable alternative by activating only a subset of experts for any given input, inherently aligning with interpretability objectives. In MoE-X, we establish this connection by rewriting the MoE layer as an equivalent sparse, large MLP. This approach enables efficient scaling of the hidden size while maintaining sparsity. To further enhance interpretability, we enforce sparse activation within each expert and redesign the routing mechanism to prioritize experts with the highest activation sparsity. These designs ensure that only the most salient features are routed and processed by the experts. We evaluate MoE-X on chess and natural language tasks, showing that it achieves performance comparable to dense models while significantly improving interpretability. MoE-X achieves a perplexity better than GPT-2, with interpretability surpassing even sparse autoencoder (SAE)-based approaches.
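
The routing idea above, prioritising experts whose hidden activations are sparsest for a given token, can be sketched as follows. The scoring rule, tensor shapes, and gating details are illustrative assumptions rather than the paper's exact formulation.

```python
# Hedged sketch of a sparsity-aware MoE routing step in the spirit of MoE-X.
import torch
import torch.nn as nn

d_model, d_hidden, n_experts, top_k = 64, 256, 8, 2
x = torch.randn(16, d_model)                            # a batch of token activations

# Each expert is a ReLU MLP; ReLU makes per-expert activation sparsity meaningful.
w_in = nn.Parameter(torch.randn(n_experts, d_model, d_hidden) * 0.02)
w_out = nn.Parameter(torch.randn(n_experts, d_hidden, d_model) * 0.02)

h = torch.relu(torch.einsum("bd,edh->beh", x, w_in))    # (batch, expert, hidden)

# Score experts by how sparse their hidden activations would be for this token:
# fewer non-zero units -> higher score -> more likely to be routed.
sparsity_score = -(h > 0).float().mean(dim=-1)          # (batch, expert)
topk_scores, topk_idx = sparsity_score.topk(top_k, dim=-1)
gates = torch.softmax(topk_scores, dim=-1)              # (batch, top_k)

# Combine the selected experts' outputs.
y = torch.zeros_like(x)
for slot in range(top_k):
    idx = topk_idx[:, slot]                             # chosen expert per token
    h_sel = h[torch.arange(x.size(0)), idx]             # (batch, hidden)
    y = y + gates[:, slot:slot + 1] * torch.einsum("bh,bhd->bd", h_sel, w_out[idx])
print(y.shape)  # torch.Size([16, 64])
```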


AnnoCaseLaw: A Richly-Annotated Dataset For Benchmarking Explainable Legal Judgment Prediction

arXiv.org Artificial Intelligence

Legal systems worldwide continue to struggle with overwhelming caseloads, limited judicial resources, and growing complexities in legal proceedings. Artificial intelligence (AI) offers a promising solution, with Legal Judgment Prediction (LJP) -- the practice of predicting a court's decision from the case facts -- emerging as a key research area. However, existing datasets often formulate the task of LJP unrealistically, not reflecting its true difficulty. They also lack high-quality annotation essential for legal reasoning and explainability. To address these shortcomings, we introduce AnnoCaseLaw, a first-of-its-kind dataset of 471 meticulously annotated U.S. Appeals Court negligence cases. Each case is enriched with comprehensive, expert-labeled annotations that highlight key components of judicial decision making, along with relevant legal concepts. Our dataset lays the groundwork for more human-aligned, explainable LJP models. We define three legally relevant tasks: (1) judgment prediction; (2) concept identification; and (3) automated case annotation, and establish a performance baseline using industry-leading large language models (LLMs). Our results demonstrate that LJP remains a formidable task, with application of legal precedent proving particularly difficult. Code and data are available at https://github.com/anonymouspolar1/annocaselaw.
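
As a rough illustration of task (1), a baseline of the kind mentioned above could prompt an LLM with the case facts and parse a verdict label. The JSON record, label set, and prompt below are hypothetical placeholders, not the dataset's actual schema or the paper's baseline code.

```python
# Hedged sketch of an LLM judgment-prediction baseline on a hypothetical case record.
import json

case = json.loads("""{
  "facts": "Plaintiff slipped on an unmarked wet floor in defendant's store.",
  "annotations": {"duty": "...", "breach": "...", "causation": "..."}
}""")

PROMPT = (
    "You are assisting with legal judgment prediction on a U.S. appeals court "
    "negligence case. Based only on the facts below, answer with one label: "
    "AFFIRMED or REVERSED.\n\nFacts:\n{facts}\n\nAnswer:"
)

def predict_judgment(case_record, llm_call):
    """llm_call is any function mapping a prompt string to the model's reply."""
    reply = llm_call(PROMPT.format(facts=case_record["facts"]))
    return "AFFIRMED" if "AFFIRM" in reply.upper() else "REVERSED"

# Example with a stub model (always affirms) standing in for a real LLM API.
print(predict_judgment(case, lambda prompt: "AFFIRMED"))
```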


Do Sparse Autoencoders Generalize? A Case Study of Answerability

arXiv.org Artificial Intelligence

Sparse autoencoders (SAEs) have emerged as a promising approach in language model interpretability, offering unsupervised extraction of sparse features. For interpretability methods to succeed, they must identify abstract features across domains, and these features can often manifest differently in each context. We examine this through "answerability", a model's ability to recognize answerable questions. We extensively evaluate SAE feature generalization across diverse answerability datasets for Gemma 2 SAEs. Our analysis reveals that residual stream probes outperform SAE features within domains, but generalization performance differs sharply. SAE features demonstrate inconsistent transfer ability, and residual stream probes similarly show high variance out of distribution. Overall, this demonstrates the need for quantitative methods to predict feature generalization in SAE-based interpretability.
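
The comparison described above, linear probes on residual-stream activations versus probes on SAE feature activations, can be sketched as below. The activations and SAE encoder here are random stand-ins; in practice they would come from Gemma 2 hidden states and a pretrained SAE.

```python
# Hedged sketch of the two probing setups for an answerability classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d_model, d_sae = 1000, 512, 4096

resid = rng.normal(size=(n, d_model))        # stand-in residual-stream activations
labels = rng.integers(0, 2, size=n)          # 1 = question is answerable

# Stand-in SAE encoder: ReLU(x W_enc) yields sparse feature activations.
W_enc = rng.normal(size=(d_model, d_sae)) * 0.02
sae_feats = np.maximum(resid @ W_enc, 0.0)

def in_domain_accuracy(X, y):
    split = int(0.8 * len(y))
    probe = LogisticRegression(max_iter=1000).fit(X[:split], y[:split])
    return probe.score(X[split:], y[split:])

print("residual-stream probe:", in_domain_accuracy(resid, labels))
print("SAE-feature probe:    ", in_domain_accuracy(sae_feats, labels))
```

Generalization would then be measured by training each probe on one answerability dataset and scoring it on the others, which is where the paper reports the sharp differences.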


Implicit Neural Representations for Chemical Reaction Paths

arXiv.org Artificial Intelligence

We show that neural networks can be optimized to represent minimum energy paths as continuous functions, offering a flexible alternative to discrete path-search methods like Nudged Elastic Band (NEB). Our approach parameterizes reaction paths with a network trained on a loss function that discards tangential energy gradients and enables instant estimation of the transition state. We first validate the method on two-dimensional potentials and then demonstrate its advantages over NEB on challenging atomistic systems where (i) poor initial guesses yield unphysical paths, (ii) multiple competing paths exist, or (iii) the reaction follows a complex multi-step mechanism. Results highlight the versatility of the method -- for instance, a simple adjustment to the sampling strategy during optimization can help escape local-minimum solutions. Finally, in a low-dimensional setting, we demonstrate that a single neural network can learn from existing paths and generalize to unseen systems, showing promise for a universal reaction path representation.
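
The core idea, a neural path p(t) with fixed endpoints trained on a loss that discards the tangential component of the energy gradient, can be sketched on a toy two-dimensional potential. The potential, network size, and finite-difference tangent below are assumptions for illustration, not the paper's setup.

```python
# Hedged sketch of an implicit neural reaction-path representation on a 2D toy potential.
import torch
import torch.nn as nn

def energy(xy):                                # simple 2D double-well potential
    x, y = xy[:, 0], xy[:, 1]
    return (x ** 2 - 1) ** 2 + 2.0 * y ** 2

a = torch.tensor([-1.0, 0.0])                  # reactant minimum
b = torch.tensor([1.0, 0.0])                   # product minimum
net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def path(t):                                   # endpoints pinned by construction
    return (1 - t) * a + t * b + t * (1 - t) * net(t)

for step in range(2000):
    t = torch.rand(128, 1)
    p = path(t)
    grad_E = torch.autograd.grad(energy(p).sum(), p, create_graph=True)[0]
    # Finite-difference tangent of the path, then keep only the component of the
    # energy gradient perpendicular to it; minimising this drives p(t) to the MEP.
    tangent = (path(t + 1e-3) - path(t - 1e-3)) / 2e-3
    tau = tangent / tangent.norm(dim=1, keepdim=True)
    g_perp = grad_E - (grad_E * tau).sum(dim=1, keepdim=True) * tau
    loss = (g_perp ** 2).sum(dim=1).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

with torch.no_grad():                          # transition state ~ path energy maximum
    ts = torch.linspace(0, 1, 201).unsqueeze(1)
    p_ts = path(ts)
    print("estimated transition state:", p_ts[energy(p_ts).argmax()])
```

Because the path is a continuous function of t, the transition-state estimate comes from a dense sweep over t rather than from a fixed set of images as in NEB.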


PSyDUCK: Training-Free Steganography for Latent Diffusion

arXiv.org Artificial Intelligence

Recent advances in AI-generated steganography highlight its potential for safeguarding the privacy of vulnerable democratic actors, including aid workers, journalists, and whistleblowers operating in oppressive regimes. In this work, we address current limitations and establish the foundations for large-throughput generative steganography. We introduce a novel approach that enables secure and efficient steganography within latent diffusion models. We show empirically that our methods perform well across a variety of open-source latent diffusion models, particularly in generative image and video tasks.


Vision-Language Models Do Not Understand Negation

arXiv.org Artificial Intelligence

Many practical vision-language applications require models that understand negation, e.g., when using natural language to retrieve images which contain certain objects but not others. Despite advancements in vision-language models (VLMs) through large-scale training, their ability to comprehend negation remains underexplored. This study addresses the question: how well do current VLMs understand negation? We introduce NegBench, a new benchmark designed to evaluate negation understanding across 18 task variations and 79k examples spanning image, video, and medical datasets. The benchmark consists of two core tasks designed to evaluate negation understanding in diverse multimodal settings: Retrieval with Negation and Multiple Choice Questions with Negated Captions. Our evaluation reveals that modern VLMs struggle significantly with negation, often performing at chance level. To address these shortcomings, we explore a data-centric approach wherein we finetune CLIP models on large-scale synthetic datasets containing millions of negated captions. We show that this approach can result in a 10% increase in recall on negated queries and a 40% boost in accuracy on multiple-choice questions with negated captions.
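
The failure mode described above can be illustrated by scoring an image against an affirmative caption and its negated counterpart with an off-the-shelf CLIP model; this is a generic illustration, not the NegBench evaluation harness.

```python
# Hedged sketch of a single negation test with a public CLIP checkpoint.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.new("RGB", (224, 224), color="white")    # stand-in for a real photo
captions = [
    "a photo of a dog",
    "a photo that does not contain a dog",             # negated caption
]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image          # (1, 2) similarity scores
probs = logits.softmax(dim=-1)
print({c: round(p.item(), 3) for c, p in zip(captions, probs[0])})
# A negation-blind model tends to score both captions similarly, keying on the
# word "dog" rather than on "does not contain".
```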


Open Problems in Machine Unlearning for AI Safety

arXiv.org Artificial Intelligence

As AI systems become more capable, widely deployed, and increasingly autonomous in critical areas such as cybersecurity, biological research, and healthcare, ensuring their safety and alignment with human values is paramount. Machine unlearning -- the ability to selectively forget or suppress specific types of knowledge -- has shown promise for privacy and data removal tasks, which has been the primary focus of existing research. More recently, its potential application to AI safety has gained attention. In this paper, we identify key limitations that prevent unlearning from serving as a comprehensive solution for AI safety, particularly in managing dual-use knowledge in sensitive domains like cybersecurity and chemical, biological, radiological, and nuclear (CBRN) safety. In these contexts, information can be both beneficial and harmful, and models may combine seemingly harmless information for harmful purposes -- unlearning this information could strongly affect beneficial uses. We provide an overview of inherent constraints and open problems, including the broader side effects of unlearning dangerous knowledge, as well as previously unexplored tensions between unlearning and existing safety mechanisms. Finally, we investigate challenges related to evaluation, robustness, and the preservation of safety features during unlearning. By mapping these limitations and open challenges, we aim to guide future research toward realistic applications of unlearning within a broader AI safety framework, acknowledging its limitations and highlighting areas where alternative approaches may be required.


Towards Multi-Modal Animal Pose Estimation: A Survey and In-Depth Analysis

arXiv.org Artificial Intelligence

Animal pose estimation (APE) aims to locate the animal body parts using a diverse array of sensor and modality inputs (e.g. RGB cameras, LiDAR, infrared, IMU, acoustic and language cues), which is crucial for research across neuroscience, biomechanics, and veterinary medicine. Based on a review of 176 papers published since 2011, APE methods are categorised by their input sensor and modality types, output forms, learning paradigms, experimental setup, and application domains, with detailed analyses of current trends, challenges, and future directions in single- and multi-modality APE systems. The analysis also highlights the transition between human and animal pose estimation, and how innovations in APE can reciprocally enrich human pose estimation and the broader machine learning paradigm. Additionally, 2D and 3D APE datasets and evaluation metrics based on different sensors and modalities are provided. A regularly updated project page is available here: https://github.com/ChennyDeng/MM-APE.


Responsible AI Governance: A Response to UN Interim Report on Governing AI for Humanity

arXiv.org Artificial Intelligence

This report presents a comprehensive response to the United Nations' Interim Report on Governing Artificial Intelligence (AI) for Humanity. It emphasizes the transformative potential of AI in achieving the Sustainable Development Goals (SDGs) while acknowledging the need for robust governance to mitigate associated risks. The response highlights opportunities for promoting equitable, secure, and inclusive AI ecosystems, which should be supported by investments in infrastructure and multi-stakeholder collaborations across jurisdictions. It also underscores challenges, including societal inequalities exacerbated by AI, ethical concerns, and environmental impacts. Recommendations advocate for legally binding norms, transparency, and multi-layered data governance models, alongside fostering AI literacy and capacity-building initiatives. Internationally, the report calls for harmonising AI governance frameworks with established laws, human rights standards, and regulatory approaches. The report concludes with actionable principles for fostering responsible AI governance through collaboration among governments, industry, academia, and civil society, ensuring the development of AI aligns with universal human values and the public good.