AITopics | Buesser, Beat

Buesser, Beat

Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs

Zizzo, Giulio, Cornacchia, Giandomenico, Fraser, Kieran, Hameed, Muhammad Zaid, Rawat, Ambrish, Buesser, Beat, Purcell, Mark, Chen, Pin-Yu, Sattigeri, Prasanna, Varshney, Kush

arXiv.org Artificial Intelligence, Feb-21-2025

As large language models (LLMs) become integrated into everyday applications, ensuring their robustness and security is increasingly critical. In particular, LLMs can be manipulated into unsafe behaviour by prompts known as jailbreaks. The variety of jailbreak styles is growing, necessitating the use of external defences known as guardrails. While many jailbreak defences have been proposed, not all defences are able to handle new out-of-distribution attacks due to the narrow segment of jailbreaks used to align them. Moreover, the lack of systematisation around defences has created significant gaps in their practical application. In this work, we perform systematic benchmarking across 15 different defences, considering a broad swathe of malicious and benign datasets. We find that there is significant performance variation depending on the style of jailbreak a defence is subject to. Additionally, we show that based on current datasets available for evaluation, simple baselines can display competitive out-of-distribution performance compared to many state-of-the-art defences. Code is available at https://github.com/IBM/Adversarial-Prompt-Evaluation.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2502.15427

Country: Europe > Denmark (0.14)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (0.68)
Media > News (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Towards Assurance of LLM Adversarial Robustness using Ontology-Driven Argumentation

Momcilovic, Tomas Bueno, Buesser, Beat, Zizzo, Giulio, Purcell, Mark, Balta, Dian

arXiv.org Artificial Intelligence, Oct-10-2024

Despite the impressive adaptability of large language models (LLMs), challenges remain in ensuring their security, transparency, and interpretability. Given their susceptibility to adversarial attacks, LLMs need to be defended with an evolving combination of adversarial training and guardrails. However, managing the implicit and heterogeneous knowledge for continuously assuring robustness is difficult. We introduce a novel approach for assurance of the adversarial robustness of LLMs based on formal argumentation. Using ontologies for formalization, we structure state-of-the-art attacks and defenses, facilitating the creation of a human-readable assurance case, and a machine-readable representation. We demonstrate its application with examples in English language and code translation tasks, and provide implications for theory and practice, by targeting engineers, data scientists, users, and auditors.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2410.07962

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Middle East > Malta (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology > Security & Privacy (0.91)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Knowledge-Augmented Reasoning for EUAIA Compliance and Adversarial Robustness of LLMs

Momcilovic, Tomas Bueno, Balta, Dian, Buesser, Beat, Zizzo, Giulio, Purcell, Mark

arXiv.org Artificial Intelligence, Oct-4-2024

The EU AI Act (EUAIA) introduces requirements for AI systems which intersect with the processes required to establish adversarial robustness. However, given the ambiguous language of regulation and the dynamicity of adversarial attacks, developers of systems with highly complex models such as LLMs may find their effort to be duplicated without the assurance of having achieved either compliance or robustness. This paper presents a functional architecture that focuses on bridging the two properties, by introducing components with clear reference to their source. Taking the detection layer recommended by the literature, and the reporting layer required by the law, we aim to support developers and auditors with a reasoning layer based on knowledge augmentation (rules, assurance cases, contextual mappings). Our findings demonstrate a novel direction for ensuring LLMs deployed in the EU are both compliant and adversarially robust, which underpin trustworthiness.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2410.09078

Country:

North America > United States (0.28)
Europe > Switzerland (0.28)

Genre: Research Report > New Finding (0.69)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.92)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Developing Assurance Cases for Adversarial Robustness and Regulatory Compliance in LLMs

Momcilovic, Tomas Bueno, Balta, Dian, Buesser, Beat, Zizzo, Giulio, Purcell, Mark

arXiv.org Artificial Intelligence, Oct-4-2024

This paper presents an approach to developing assurance cases for adversarial robustness and regulatory compliance in large language models (LLMs). Focusing on both natural and code language tasks, we explore the vulnerabilities these models face, including adversarial attacks based on jailbreaking, heuristics, and randomization. We propose a layered framework incorporating guardrails at various stages of LLM deployment, aimed at mitigating these attacks and ensuring compliance with the EU AI Act. Our approach includes a meta-layer for dynamic risk management and reasoning, crucial for addressing the evolving nature of LLM vulnerabilities. We illustrate our method with two exemplary assurance cases, highlighting how different contexts demand tailored strategies to ensure robust and compliant AI systems.

guardrail, large language model, natural language, (15 more...)

arXiv.org Artificial Intelligence

2410.05304

Country:

Europe > Germany (0.28)
Europe > Switzerland (0.28)

Genre: Research Report (0.41)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Towards Assuring EU AI Act Compliance and Adversarial Robustness of LLMs

Momcilovic, Tomas Bueno, Buesser, Beat, Zizzo, Giulio, Purcell, Mark, Balta, Dian

arXiv.org Artificial Intelligence, Oct-4-2024

Large language models are prone to misuse and vulnerable to security threats, raising significant safety and security concerns. The European Union's Artificial Intelligence Act seeks to enforce AI robustness in certain contexts, but faces implementation challenges due to the lack of standards, complexity of LLMs and emerging security vulnerabilities. Our research introduces a framework using ontologies, assurance cases, and factsheets to support engineers and stakeholders in understanding and documenting AI system compliance and security regarding adversarial robustness. This approach aims to ensure that LLMs adhere to regulatory standards and are equipped to counter potential threats.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2410.05306

Country:

Europe > Germany (0.47)
Europe > Switzerland (0.28)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > Europe Government (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI

Rawat, Ambrish, Schoepf, Stefan, Zizzo, Giulio, Cornacchia, Giandomenico, Hameed, Muhammad Zaid, Fraser, Kieran, Miehling, Erik, Buesser, Beat, Daly, Elizabeth M., Purcell, Mark, Sattigeri, Prasanna, Chen, Pin-Yu, Varshney, Kush R.

arXiv.org Artificial Intelligence, Sep-23-2024

As generative AI, particularly large language models (LLMs), becomes increasingly integrated into production applications, new attack surfaces and vulnerabilities emerge and put a focus on adversarial threats in natural language and multi-modal systems. Red-teaming has gained importance in proactively identifying weaknesses in these systems, while blue-teaming works to protect against such adversarial attacks. Despite growing academic interest in adversarial risks for generative AI, there is limited guidance tailored for practitioners to assess and mitigate these challenges in real-world environments. To address this, our contributions include: (1) a practical examination of red- and blue-teaming strategies for securing generative AI, (2) identification of key challenges and open questions in defense development and evaluation, and (3) the Attack Atlas, an intuitive framework that brings a practical approach to analyzing single-turn input attacks, placing it at the forefront for practitioners. This work aims to bridge the gap between academic insights and practical security measures for the protection of generative AI systems.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2409.15398

Country: North America > Mexico > Mexico City (0.14)

Genre: Research Report (0.40)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.86)

Boundary Adversarial Examples Against Adversarial Overfitting

Hameed, Muhammad Zaid, Buesser, Beat

arXiv.org Artificial Intelligence, Nov-25-2022

Standard adversarial training approaches suffer from robust overfitting where the robust accuracy decreases when models are adversarially trained for too long. The origin of this problem is still unclear and conflicting explanations have been reported, i.e., memorization effects induced by large loss data or because of small loss data and growing differences in loss distribution of training samples as the adversarial training progresses. Consequently, several mitigation approaches including early stopping, temporal ensembling and weight perturbations on small loss data have been proposed to mitigate the effect of robust overfitting. However, a side effect of these strategies is a larger reduction in clean accuracy compared to standard adversarial training. In this paper, we investigate if these mitigation approaches are complementary to each other in improving adversarial training performance. We further propose the use of helper adversarial examples that can be obtained with minimal cost in the adversarial example generation, and show how they increase the clean accuracy in the existing approaches without compromising the robust accuracy.

accuracy, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2211.14088

Country:

Europe (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.64)

Industry: Information Technology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Neural Feature Learning From Relational Database

Lam, Hoang Thanh, Minh, Tran Ngoc, Sinn, Mathieu, Buesser, Beat, Wistuba, Martin

arXiv.org Artificial Intelligence, Jun-17-2018

Feature engineering is one of the most important but most tedious tasks in data science. This work studies automation of feature learning from relational databases. We first prove theoretically that finding the optimal features from relational data for predictive tasks is NP-hard. We propose an efficient rule-based approach based on heuristics and a deep neural network to automatically learn appropriate features from relational data. We benchmark our approaches in ensembles in past Kaggle competitions. Our new approach wins late medals and beats the state-of-the-art solutions with significant margins. To the best of our knowledge, this is the first time an automated data science system could win medals in Kaggle competitions with complex relational databases.

deep learning, neural network, transformation, (19 more...)

arXiv.org Artificial Intelligence

1801.05372

Country: North America > United States (0.68)

Genre: Research Report > Promising Solution (0.34)

Industry: Education (0.46)

Technology:

Information Technology > Databases (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AI Meets Chemistry

Kishimoto, Akihiro (IBM Research) | Buesser, Beat (IBM Research) | Botea, Adi (IBM Research)

AAAI Conferences, Feb-8-2018

We argue that chemistry should be the next grand challenge for Artificial Intelligence. The AI research community and humanity would benefit tremendously from focusing AI research on chemistry on a regular basis, as a benchmark as well as a real-world application domain. To support our position, we review the importance of chemical compound discovery and synthesis planning and discuss the properties of search spaces in a chemistry problem. Knowledge acquired in domains such as two-player board games or single-player puzzles places the AI community in a good position to solve critical problems in the chemistry domain. Yet, we show that searching in chemistry problems poses significant additional challenges that will have to be addressed. Finally, we envision how several AI areas like Natural Language Processing, Machine Learning, planning and search, are relevant for chemistry.

chemistry, chess, deep learning, (22 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country: North America > United States (0.14)

Genre:

Overview (0.48)
Research Report (0.47)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Leisure & Entertainment > Games > Chess (0.31)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
