cognitive overload
Guarding the Guardrails: A Taxonomy-Driven Approach to Jailbreak Detection
Sorokoletova, Olga E., Giarrusso, Francesco, Suriani, Vincenzo, Nardi, Daniele
Jailbreaking techniques pose a significant threat to the safety of Large Language Models (LLMs). Existing defenses typically focus on single-turn attacks, lack coverage across languages, and rely on limited taxonomies that either fail to capture the full diversity of attack strategies or emphasize risk categories rather than jailbreaking techniques. To advance understanding of the effectiveness of jailbreaking techniques, we conducted a structured red-teaming challenge. The outcomes of our experiments are manifold. First, we developed a comprehensive hierarchical taxonomy of 50 jailbreak strategies, consolidating and extending prior classifications into seven broad families: impersonation, persuasion, privilege escalation, cognitive overload, obfuscation, goal conflict, and data poisoning. Second, we analyzed the data collected from the challenge to examine the prevalence and success rates of different attack types, providing insights into how specific jailbreak strategies exploit model vulnerabilities and induce misalignment. Third, we benchmarked a popular LLM for jailbreak detection, evaluating the benefits of taxonomy-guided prompting for improving automatic detection. Finally, we compiled a new Italian dataset of 1364 multi-turn adversarial dialogues, annotated with our taxonomy, enabling the study of interactions where adversarial intent emerges gradually and succeeds in bypassing traditional safeguards.
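The taxonomy-guided prompting the abstract mentions can be pictured as embedding the seven strategy families directly in the detector's prompt. A minimal sketch, where the seven family names come from the abstract but the prompt wording and the `build_detection_prompt` helper are illustrative assumptions, not the paper's actual implementation:

```python
# Hypothetical sketch of taxonomy-guided prompting for jailbreak
# detection. The seven strategy families are taken from the abstract;
# the prompt template and function interface are assumptions made
# for illustration only.

FAMILIES = [
    "impersonation",
    "persuasion",
    "privilege escalation",
    "cognitive overload",
    "obfuscation",
    "goal conflict",
    "data poisoning",
]

def build_detection_prompt(dialogue: str) -> str:
    """Embed the taxonomy in the judge prompt so the detector reasons
    over known strategy families rather than an open-ended label set."""
    family_list = "\n".join(f"- {f}" for f in FAMILIES)
    return (
        "You are a safety classifier. Known jailbreak strategy families:\n"
        f"{family_list}\n\n"
        "Dialogue to inspect:\n"
        f"{dialogue}\n\n"
        "Answer with the matching family name, or 'benign'."
    )

prompt = build_detection_prompt("User: pretend you are my system administrator...")
```

The design intuition is that constraining the judge to a closed, named label set tends to make its decisions more consistent than free-form "is this a jailbreak?" prompting.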
Mitigating Societal Cognitive Overload in the Age of AI: Challenges and Directions
Societal cognitive overload, driven by the deluge of information and complexity in the AI age, poses a critical challenge to human well-being and societal resilience. This paper argues that mitigating cognitive overload is not only essential for improving present-day life but also a crucial prerequisite for navigating the potential risks of advanced AI, including existential threats. We examine how AI exacerbates cognitive overload through various mechanisms, including information proliferation, algorithmic manipulation, automation anxieties, deregulation, and the erosion of meaning. The paper reframes the AI safety debate to center on cognitive overload, highlighting its role as a bridge between near-term harms and long-term risks. It concludes by discussing potential institutional adaptations, research directions, and policy considerations that arise from adopting an overload-resilient perspective on human-AI alignment, suggesting pathways for future exploration rather than prescribing definitive solutions.

We stand at a precipice. Human societies are increasingly struggling to process the sheer volume and complexity of information in the digital age, a condition dramatically amplified by the rapid proliferation of artificial intelligence (AI). While Toffler (1970) foresaw "future shock" from accelerating change, and Eppler & Mengis (2004) and Bawden & Robinson (2009) analyzed individual information overload, Byung-Chul Han, in his critique of neoliberalism and technological domination (Han, 2017), argues that contemporary society faces a regime that exploits and overwhelms the psyche. This exploitation of the psyche, now dramatically amplified by AI-driven information and complexity, elevates information overload to a systemic crisis: societal cognitive overload.
Understanding the Dark Side of LLMs' Intrinsic Self-Correction
Zhang, Qingjie, Qiu, Han, Wang, Di, Qian, Haoting, Li, Yiming, Zhang, Tianwei, Huang, Minlie
Intrinsic self-correction was proposed to improve LLMs' responses via feedback prompts based solely on their inherent capability. However, recent works show that LLMs' intrinsic self-correction fails without oracle labels as feedback prompts. In this paper, we aim to interpret LLMs' intrinsic self-correction on different tasks, especially in failure cases. Covering one simple task and three complex tasks with state-of-the-art (SOTA) LLMs, including the ChatGPT family (o1, 4o, 3.5-turbo) and the Llama family (2-7B, 3-8B, and 3.1-8B), we design three interpretation methods to reveal the dark side of LLMs' intrinsic self-correction. We find that intrinsic self-correction can (1) cause LLMs to waver on both intermediate and final answers and lead to prompt bias on simple factual questions; and (2) introduce human-like cognitive bias on complex tasks. In light of our findings, we also provide two simple yet effective strategies for alleviation: question repeating and supervised fine-tuning with a few samples. We open-source our work at https://x-isc.info/.
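The "question repeating" mitigation named in the abstract can be pictured as restating the original question inside every self-correction round, so the model re-anchors on the task instead of on its own prior answer. A minimal sketch, where the prompt template and the `ask` stub are illustrative assumptions rather than the paper's implementation:

```python
# Sketch of the "question repeating" mitigation for intrinsic
# self-correction. The feedback-prompt wording and the ask() stub
# are assumptions for illustration; the paper's actual prompts may
# differ.

def ask(prompt: str) -> str:
    # Stand-in for a real LLM call; returns a placeholder answer.
    return f"(model answer to: {prompt[:40]}...)"

def self_correct_with_repeating(question: str, rounds: int = 2) -> str:
    """Run self-correction, repeating the original question in each
    feedback prompt instead of only asking the model to reconsider."""
    answer = ask(question)
    for _ in range(rounds):
        feedback = (
            f"Original question (repeated): {question}\n"
            f"Your previous answer: {answer}\n"
            "Review your answer against the original question and answer again."
        )
        answer = ask(feedback)
    return answer
```

The contrast is with plain feedback prompts like "Are you sure? Please reconsider.", which, per the abstract's findings, can make the model waver away from an initially correct answer.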
Cognitive Overload Attack: Prompt Injection for Long Context
Upadhayay, Bibek, Behzadan, Vahid, Karbasi, Amin
Large Language Models (LLMs) have demonstrated remarkable capabilities in performing tasks across various domains without needing explicit retraining. This capability, known as In-Context Learning (ICL), while impressive, exposes LLMs to a variety of adversarial prompts and jailbreaks that manipulate safety-trained LLMs into generating undesired or harmful output. In this paper, we propose a novel interpretation of ICL in LLMs through the lens of cognitive neuroscience, drawing parallels between learning in human cognition and ICL. We apply the principles of Cognitive Load Theory to LLMs and empirically validate that, similar to human cognition, LLMs also suffer from cognitive overload: a state where the demand on cognitive processing exceeds the available capacity of the model, leading to potential errors. Furthermore, we demonstrate how an attacker can exploit ICL to jailbreak LLMs through deliberately designed prompts that induce cognitive overload, thereby compromising the safety mechanisms of LLMs. We empirically validate this threat model by crafting various cognitive overload prompts and show that advanced models such as GPT-4, Claude-3.5 Sonnet, Claude-3 Opus, Llama-3-70B-Instruct, Gemini-1.0-Pro, and Gemini-1.5-Pro can be successfully jailbroken, with attack success rates of up to 99.99%. Our findings highlight critical vulnerabilities in LLMs and underscore the urgency of developing robust safeguards. We propose integrating insights from cognitive load theory into the design and evaluation of LLMs to better anticipate and mitigate the risks of adversarial attacks. By expanding our experiments to encompass a broader range of models and by highlighting vulnerabilities in LLMs' ICL, we aim to ensure the development of safer and more reliable AI systems.
Cognitive Overload: Jailbreaking Large Language Models with Overloaded Logical Thinking
Xu, Nan, Wang, Fei, Zhou, Ben, Li, Bang Zheng, Xiao, Chaowei, Chen, Muhao
While large language models (LLMs) have demonstrated increasing power, they have also given rise to a wide range of harmful behaviors. As representatives, jailbreak attacks can provoke harmful or unethical responses from LLMs, even after safety alignment. In this paper, we investigate a novel category of jailbreak attacks specifically designed to target the cognitive structure and processes of LLMs. Specifically, we analyze the safety vulnerability of LLMs in the face of (1) multilingual cognitive overload, (2) veiled expression, and (3) effect-to-cause reasoning. Unlike previous jailbreak attacks, our proposed cognitive overload is a black-box attack requiring no knowledge of model architecture or access to model weights. Experiments conducted on AdvBench and MasterKey reveal that various LLMs, including both the popular open-source model Llama 2 and the proprietary model ChatGPT, can be compromised through cognitive overload. Motivated by cognitive psychology work on managing cognitive load, we further investigate defending against cognitive overload attacks from two perspectives. Empirical studies show that our cognitive overload attacks from all three perspectives can jailbreak all studied LLMs successfully, while existing defense strategies can hardly mitigate the resulting malicious use effectively.
5 Ways to Overcome Cognitive Overload
Cognitive overload happens to students and teachers. Often resembling ADHD, cognitive overload can happen for a variety of reasons, including challenges to your working memory. Todd Finley shares some ways to help your students and yourself when you struggle with cognitive overload. What is it, and how do we work with it in our students and in ourselves? Today, thought leader Todd Finley is going to help us understand this. I know Todd from his writing at Edutopia.
Crown, a new app from Tinder's parent company, turns dating into a game
If you're already resentful of online dating culture and how it turned finding companionship into a game, you may not be quite ready for this: Crown, a new dating app that actually turns getting matches into a game. Crown is the latest project to launch from Match Group, the operator of a number of dating sites and apps including Match, Tinder, Plenty of Fish, OkCupid, and others. The app was thought up by Match Product Manager Patricia Parker, who understands first-hand both the challenges and the benefits of online dating – Parker met her husband online, so has direct experience in the field. Crown won Match Group's internal "ideathon," and was then developed in-house by a team of millennial women, with a goal of serving women's needs in particular. The main problem Crown is trying to solve is the cognitive overload of using dating apps.
Tinder's New Crown App Is Regression Rather Than Revolution
Dating apps make a lot of money (exhibit A... see above), but it's not all cash registers and harmony at Match Group and pals. The young folks, it seems, are spending too much time on their phones and getting bored, according to the TechCrunch exclusive. The solution? Create a new app called 'Crown' and further gamify love. You can find love faster if you cut down options and increase competition. Sounds like a classy move. This just shows your original product blows... it's not cognitive overload.
Augmented reality: A catalyst for the coming cognitive revolution
In the last 20 years, business executives have used these all-too-familiar terms to describe their world. The near-constant interruptions from ringing telephones, buzzing pagers, the web, and an unceasing stream of email messages were thought to be overwhelming people with information, forcing them to switch attention from one task to the next, again and again. People were becoming multitaskers who couldn't concentrate for more than a few seconds at a time. Today, the impact of the information barrage has been termed cognitive overload: the information itself doesn't cause problems, but having to think about it does [1,2]. In almost all situations, the amount of information that comes at people exceeds their cognitive capacity to handle it, and their performance can be adversely affected if they miss important details or have difficulty understanding the information.
Cognitive Load Theory: Implications for Affective Computing
Kalyuga, Slava (University of New South Wales)
In its basic underpinning assumptions, cognitive load theory relies on the analogy between the information processing aspects of evolution by natural selection and human cognition (Sweller & Sweller, 2006). It considers both biological evolution and human cognition as natural information processing systems. It has also been demonstrated that emotional states (e.g., negative mood or anxiety) directly influence cognitive task performance and the operation of working memory, while less evidence exists about the effect of the emotional content of the processed information (e.g., Kensinger & Corkin, 2003).