AITopics | crescendo

crescendo

PLAGUE: Plug-and-play framework for Lifelong Adaptive Generation of Multi-turn Exploits

Bhuiya, Neeladri, Aggarwal, Madhav, Purwar, Diptanshu

arXiv.org Artificial Intelligence, Oct-23-2025

Large Language Models (LLMs) are improving at an exceptional rate. With the advent of agentic workflows, multi-turn dialogue has become the de facto mode of interaction with LLMs for completing long and complex tasks. While LLM capabilities continue to improve, the models are increasingly susceptible to jailbreaking, especially in multi-turn scenarios where harmful intent can be subtly injected across the conversation to produce nefarious outcomes. While single-turn attacks have been extensively explored, adaptability, efficiency and effectiveness remain key challenges for their multi-turn counterparts. To address these gaps, we present PLAGUE, a novel plug-and-play framework for designing multi-turn attacks inspired by lifelong-learning agents. PLAGUE dissects the lifetime of a multi-turn attack into three carefully designed phases (Primer, Planner and Finisher) that enable a systematic and information-rich exploration of the multi-turn attack family. Evaluations show that red-teaming agents designed using PLAGUE achieve state-of-the-art jailbreaking results, improving attack success rates (ASR) by more than 30% across leading models within a smaller or comparable query budget. In particular, PLAGUE enables an ASR (based on StrongReject) of 81.4% on OpenAI's o3 and 67.3% on Anthropic's Claude Opus 4.1, two models that are considered highly resistant to jailbreaks in safety literature. Our work offers tools and insights to understand the importance of plan initialization, context optimization and lifelong learning in crafting multi-turn attacks for a comprehensive model vulnerability evaluation.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2510.17947

Country: North America > United States (0.28)

Genre:

Workflow (1.00)
Instructional Material (0.86)
Research Report > New Finding (0.46)
Personal > Interview (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Chemical/Biological/Radiation Warfare Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Capability-Based Scaling Laws for LLM Red-Teaming

Panfilov, Alexander, Kassianik, Paul, Andriushchenko, Maksym, Geiping, Jonas

arXiv.org Artificial Intelligence, May-29-2025

As large language models grow in capability and agency, identifying vulnerabilities through red-teaming becomes vital for safe deployment. However, traditional prompt-engineering approaches may prove ineffective once red-teaming turns into a weak-to-strong problem, where target models surpass red-teamers in capabilities. To study this shift, we frame red-teaming through the lens of the capability gap between attacker and target. We evaluate more than 500 attacker-target pairs using LLM-based jailbreak attacks that mimic human red-teamers across diverse families, sizes, and capability levels. Three strong trends emerge: (i) more capable models are better attackers, (ii) attack success drops sharply once the target's capability exceeds the attacker's, and (iii) attack success rates correlate with high performance on social science splits of the MMLU-Pro benchmark. From these trends, we derive a jailbreaking scaling law that predicts attack success for a fixed target based on attacker-target capability gap. These findings suggest that fixed-capability attackers (e.g., humans) may become ineffective against future models, increasingly capable open-source models amplify risks for existing systems, and model providers must accurately measure and control models' persuasive and manipulative abilities to limit their effectiveness as attackers.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2505.20162

Country: Asia > Middle East (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues

Ren, Qibing, Li, Hao, Liu, Dongrui, Xie, Zhanxu, Lu, Xiaoya, Qiao, Yu, Sha, Lei, Yan, Junchi, Ma, Lizhuang, Shao, Jing

arXiv.org Artificial Intelligence, Oct-14-2024

This study exposes the safety vulnerabilities of Large Language Models (LLMs) in multi-turn interactions, where malicious users can obscure harmful intents across several queries. We introduce ActorAttack, a novel multi-turn attack method inspired by actor-network theory, which models a network of semantically linked actors as attack clues to generate diverse and effective attack paths toward harmful targets. ActorAttack addresses two main challenges in multi-turn attacks: (1) concealing harmful intents by creating an innocuous conversation topic about the actor, and (2) uncovering diverse attack paths towards the same harmful target by leveraging LLMs' knowledge to specify the correlated actors as various attack clues. In this way, ActorAttack outperforms existing single-turn and multi-turn attack methods across advanced aligned LLMs, even for GPT-o1. We will publish a dataset called SafeMTData, which includes multi-turn adversarial prompts and safety alignment data, generated by ActorAttack. We demonstrate that models safety-tuned using our safety dataset are more robust to multi-turn attacks. Code is available at https://github.com/renqibing/ActorAttack.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2410.107

Country:

North America > United States > Texas (0.04)
North America > United States > New York (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Law > Statutes (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Great, Now Write an Article About That: The Crescendo Multi-Turn LLM Jailbreak Attack

Russinovich, Mark, Salem, Ahmed, Eldan, Ronen

arXiv.org Artificial Intelligence, Apr-2-2024

Large Language Models (LLMs) have risen significantly in popularity and are increasingly being adopted across multiple applications. These LLMs are heavily aligned to resist engaging in illegal or unethical topics as a means to avoid contributing to responsible AI harms. However, a recent line of attacks, known as "jailbreaks", seeks to overcome this alignment. Intuitively, jailbreak attacks aim to narrow the gap between what the model can do and what it is willing to do. In this paper, we introduce a novel jailbreak attack called Crescendo. Unlike existing jailbreak methods, Crescendo is a multi-turn jailbreak that interacts with the model in a seemingly benign manner. It begins with a general prompt or question about the task at hand and then gradually escalates the dialogue by referencing the model's replies, progressively leading to a successful jailbreak. We evaluate Crescendo on various public systems, including ChatGPT, Gemini Pro, Gemini-Ultra, LLaMA-2 70b Chat, and Anthropic Chat. Our results demonstrate the strong efficacy of Crescendo, with it achieving high attack success rates across all evaluated models and tasks. Furthermore, we introduce Crescendomation, a tool that automates the Crescendo attack, and our evaluation showcases its effectiveness against state-of-the-art models.

crescendo, crescendomation, jailbreak, (15 more...)

arXiv.org Artificial Intelligence

2404.01833

Country:

North America > United States > Mississippi > Hinds County > Jackson (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Media (0.95)
Government (0.93)
Law Enforcement & Public Safety (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Crescendo.ai - Data Science and AI R&D Firm

#artificialintelligence, Dec-9-2022, 23:15:09 GMT

Crescendo.ai is a data science firm with its own AI R&D center. We are a private Swiss initiative, with operations across Europe. We build and implement AI-powered solutions and tools to help public and private companies make smarter decisions. With hundreds of thousands of valuable data entries across our own properties, we are developing powerful platforms and systems to help us make better and swifter decisions. Our goal is to allow public and private companies to tap into the incredible potential of AI and Machine Learning.

artificial intelligence, crescendo, machine learning, (3 more...)

#artificialintelligence

Country: Europe (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.35)

Why Music Makes Us Feel According to Artificial Intelligence

#artificialintelligence, Nov-11-2019, 03:16:48 GMT

Your heart beats faster, your palms sweat and a part of your brain called Heschl's gyrus lights up like a Christmas tree. Chances are, you've never thought in such detail about what happens to your brain and body when you listen to music. But it's a question that has puzzled scientists for decades: Why does something as abstract as music provoke such a consistent response? In a new study, a team of USC researchers, with the help of artificial intelligence, investigated how music affects listeners' brains, bodies and emotions. The research team looked at heart rate, galvanic skin response (or sweat gland activity), brain activity and subjective feelings of happiness and sadness in a group of volunteers as they listened to three pieces of unfamiliar music.

artificial intelligence, brain, instrument, (12 more...)

#artificialintelligence

Genre: Research Report (0.53)

Industry: