

A Representation Engineering Perspective on the Effectiveness of Multi-Turn Jailbreaks

Bullwinkel, Blake, Russinovich, Mark, Salem, Ahmed, Zanella-Beguelin, Santiago, Jones, Daniel, Severi, Giorgio, Kim, Eugenia, Hines, Keegan, Minnich, Amanda, Zunger, Yonatan, Kumar, Ram Shankar Siva

arXiv.org Artificial Intelligence

Recent research has demonstrated that state-of-the-art LLMs and defenses remain susceptible to multi-turn jailbreak attacks. These attacks require only closed-box model access and are often easy to perform manually, posing a significant threat to the safe and secure deployment of LLM-based systems. We study the effectiveness of the Crescendo multi-turn jailbreak at the level of intermediate model representations and find that safety-aligned LMs often represent Crescendo responses as more benign than harmful, especially as the number of conversation turns increases. Our analysis indicates that at each turn, Crescendo prompts tend to keep model outputs in a "benign" region of representation space, effectively tricking the model into fulfilling harmful requests. Further, our results help explain why single-turn jailbreak defenses like circuit breakers are generally ineffective against multi-turn attacks, motivating the development of mitigations that address this generalization gap.


SEAR: A Multimodal Dataset for Analyzing AR-LLM-Driven Social Engineering Behaviors

Yu, Tianlong, Ye, Chenghang, Yang, Zheyu, Zhou, Ziyi, Tang, Cui, Tao, Zui, Zhang, Jun, Wang, Kailong, Zhou, Liting, Yang, Yang, Bi, Ting

arXiv.org Artificial Intelligence

The SEAR Dataset is a novel multimodal resource designed to study the emerging threat of social engineering (SE) attacks orchestrated through augmented reality (AR) and multimodal large language models (LLMs). This dataset captures 180 annotated conversations across 60 participants in simulated adversarial scenarios, including meetings, classes and networking events. It comprises synchronized AR-captured visual/audio cues (e.g., facial expressions, vocal tones), environmental context, and curated social media profiles, alongside subjective metrics such as trust ratings and susceptibility assessments. Key findings reveal SEAR's alarming efficacy in eliciting compliance (e.g., 93.3% phishing link clicks, 85% call acceptance) and hijacking trust (76.7% post-interaction trust surge). The dataset supports research in detecting AR-driven SE attacks, designing defensive frameworks, and understanding multimodal adversarial manipulation. Rigorous ethical safeguards, including anonymization and IRB compliance, ensure responsible use. The SEAR dataset is available at https://github.com/INSLabCN/SEAR-Dataset.


On the Feasibility of Using MultiModal LLMs to Execute AR Social Engineering Attacks

Bi, Ting, Ye, Chenghang, Yang, Zheyu, Zhou, Ziyi, Tang, Cui, Zhang, Jun, Tao, Zui, Wang, Kailong, Zhou, Liting, Yang, Yang, Yu, Tianlong

arXiv.org Artificial Intelligence

Augmented Reality (AR) and Multimodal Large Language Models (LLMs) are rapidly evolving, providing unprecedented capabilities for human-computer interaction. However, their integration introduces a new attack surface for social engineering. In this paper, we systematically investigate, for the first time, the feasibility of orchestrating AR-driven Social Engineering attacks using Multimodal LLMs, via our proposed SEAR framework, which operates through three key phases: (1) AR-based social context synthesis, which fuses Multimodal inputs (visual, auditory and environmental cues); (2) role-based Multimodal RAG (Retrieval-Augmented Generation), which dynamically retrieves and integrates contextual data while preserving character differentiation; and (3) ReInteract social engineering agents, which execute adaptive multiphase attack strategies through inference interaction loops. To verify SEAR, we conducted an IRB-approved study with 60 participants in three experimental configurations (unassisted, AR+LLM, and full SEAR pipeline), compiling a new dataset of 180 annotated conversations in simulated social scenarios. Our results show that SEAR is highly effective at eliciting high-risk behaviors (e.g., 93.3% of participants susceptible to email phishing). The framework was particularly effective in building trust, with 85% of targets willing to accept an attacker's call after an interaction. We also identified notable limitations, such as interactions being perceived as "occasionally artificial" due to authenticity gaps. This work provides proof-of-concept for AR-LLM driven social engineering attacks and insights for developing defensive countermeasures against next-generation augmented reality threats.


Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation

Wang, Xinglin, Li, Yiwei, Feng, Shaoxiong, Yuan, Peiwen, Pan, Boyuan, Wang, Heda, Hu, Yao, Li, Kan

arXiv.org Artificial Intelligence

Self-consistency (SC), leveraging multiple samples from LLMs, shows significant gains on various reasoning tasks but struggles with free-form generation due to the difficulty of aggregating answers. Its variants, UCS and USC, rely on sample selection or voting mechanisms to improve output quality. These methods, however, face limitations due to their inability to fully utilize the nuanced consensus knowledge present within multiple candidate samples, often resulting in suboptimal outputs. We propose Fine-Grained Self-Consistency (FSC) to address these limitations by extracting and integrating segment-level commonalities from candidate samples, enhancing the performance of LLMs both in open-ended and reasoning tasks. Based on this, we present two additional strategies: candidate filtering, which enhances overall quality by identifying highly similar candidate sets, and merging, which reduces input token requirements by combining similar samples. The effectiveness of FSC is demonstrated through extensive experiments on various tasks, including summarization, code generation, and mathematical reasoning, using GPT-3.5-turbo and GPT-4. The results indicate significant improvements over baseline methods, showcasing the potential of FSC to optimize output quality by effectively synthesizing fine-grained consensus knowledge from multiple samples.


Efficient RL via Disentangled Environment and Agent Representations

Gmelin, Kevin, Bahl, Shikhar, Mendonca, Russell, Pathak, Deepak

arXiv.org Artificial Intelligence

Agents that are aware of the separation between themselves and their environments can leverage this understanding to form effective representations of visual input. We propose an approach for learning such structured representations for RL algorithms, using visual knowledge of the agent, such as its shape or mask, which is often inexpensive to obtain. This is incorporated into the RL objective using a simple auxiliary loss. We show that our method, Structured Environment-Agent Representations, outperforms state-of-the-art model-free approaches over 18 different challenging visual simulation environments spanning 5 different robots. Website at https://sear-rl.github.io/


CloudFactory raises $65 million to prep and process data sets

#artificialintelligence

AI and machine learning algorithms require data. But the bulk of that data is of no use if it isn't first labeled by human annotators. This predicament has given rise to a cottage industry of startups, including Scale AI, which recently raised $100 million for its extensive suite of data labeling services. That's not to mention Mighty AI, Hive, Appen, and Alegion, which together occupy a data annotation tools segment that's anticipated to be worth $1.6 billion by 2025. CloudFactory is yet another vying for attention.


Sears' bankruptcy underscores the need for tech innovation in retail

#artificialintelligence

The demise of Sears provides a perfect cautionary tale. By all accounts, the former retail giant -- which was in a "death spiral for well over a decade" -- is paying a steep price for its failure to innovate in new technology. Yet, the means for Sears to do just that was available. Business technology has been advancing at a rapid pace over the past decade. Deep Learning and Artificial Intelligence (AI), along with an unprecedented availability of data, have made it possible to extract business insights like never before.


Artificial Intelligence: It's Only Scary if You Let it Control You LBBOnline

#artificialintelligence

A lot of press about artificial intelligence (AI) focuses on its scary side. Many paint the vision of robots, smart houses (think back to the 1999 Disney movie "Smart House," in which a cyborg maid takes over) and artificial beings crafting our experiences. This fear is not all hype, as Microsoft's AI-powered Twitter bot showed when she went haywire, tweeting inappropriate and aggressive responses that were egged on by other Twitter users. As with any new technology, though, you can manage it with an appropriate level of checks and balances to ensure ethical standards are being met. Part of the reason people are scared of AI is that they don't understand it.


Waymo leads the self-driving car race, Fox scores Thursday Night Football, and more trending news

#artificialintelligence

The news professionals are talking about now, curated by LinkedIn's editors. Waymo's self-driving cars logged the most miles of all driverless vehicle companies in California, according to a report from the state's DMV. Waymo drove 352,545 miles in the year ending in November 2017 (roughly 50% less than the year prior, due to shifting much of its fleet to Phoenix). In second place, GM's Cruise division logged 131,676 miles. "This is still Waymo (née Google) and GM's party, and everyone else is playing catch-up," says The Verge.


Alexa, turn up my Kenmore AC; Sears cuts a deal with Amazon

Boston Herald

Sears will begin selling its appliances on Amazon.com. The announcement Thursday sent shares of Sears soaring more than 18 percent at the opening bell. The tie-up with the internet behemoth could give shares of the storied retailer one of its biggest one-day percentage gains ever. Sears, which also owns Kmart, said that its Kenmore Smart appliances will be fully integrated with Amazon's Alexa, allowing users to control things like air conditioners through voice commands. "The launch of Kenmore products on Amazon.com will significantly expand the distribution and availability of the Kenmore brand in the U.S.," Chairman and CEO Edward Lampert said in a company release.