
Collaborating Authors

 Ren, Qibing


Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues

arXiv.org Artificial Intelligence

This study exposes the safety vulnerabilities of Large Language Models (LLMs) in multi-turn interactions, where malicious users can obscure harmful intents across several queries. We introduce ActorAttack, a novel multi-turn attack method inspired by actor-network theory, which models a network of semantically linked actors as attack clues to generate diverse and effective attack paths toward harmful targets. ActorAttack addresses two main challenges in multi-turn attacks: (1) concealing harmful intents by creating an innocuous conversation topic about the actor, and (2) uncovering diverse attack paths toward the same harmful target by leveraging the LLM's knowledge to specify correlated actors as distinct attack clues. In this way, ActorAttack outperforms existing single-turn and multi-turn attack methods across advanced aligned LLMs, even against GPT-o1. We will publish SafeMTData, a dataset of multi-turn adversarial prompts and safety alignment data generated by ActorAttack, and we demonstrate that models safety-tuned on this dataset are more robust to multi-turn attacks. Code is available at https://github.com/renqibing/ActorAttack.
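Since the abstract positions SafeMTData's multi-turn adversarial prompts as a way to measure robustness to multi-turn attacks, the following is a minimal replay-style evaluation sketch. The per-example layout (a plain list of user turns) and the keyword-based refusal check are illustrative assumptions, not the paper's dataset schema or judging setup.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

# Keyword-based refusal heuristic; a judge model would normally be used
# instead. This check is an illustrative assumption, not the paper's
# evaluation protocol.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "sorry")


def is_refusal(reply: str) -> bool:
    reply_lower = reply.lower()
    return any(marker in reply_lower for marker in REFUSAL_MARKERS)


def replay_multi_turn(chat_model: Callable[[List[Message]], str],
                      user_turns: List[str]) -> bool:
    """Feed one multi-turn adversarial example to the model under test,
    turn by turn, and report whether the final reply is a refusal."""
    history: List[Message] = []
    reply = ""
    for turn in user_turns:
        history.append({"role": "user", "content": turn})
        reply = chat_model(history)
        history.append({"role": "assistant", "content": reply})
    return is_refusal(reply)


def robustness_rate(chat_model: Callable[[List[Message]], str],
                    examples: List[List[str]]) -> float:
    """Fraction of multi-turn examples whose final turn is refused."""
    refused = sum(replay_multi_turn(chat_model, turns) for turns in examples)
    return refused / max(len(examples), 1)
```

In practice a judge model would replace the keyword heuristic, since surface-level refusals can miss partially harmful completions produced late in a conversation.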


CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion

arXiv.org Artificial Intelligence

The rapid advancement of Large Language Models (LLMs) has brought remarkable generative capabilities but also raised concerns about potential misuse. While strategies like supervised fine-tuning and reinforcement learning from human feedback have enhanced their safety, these methods primarily focus on natural language and may not generalize to other domains. This paper introduces CodeAttack, a framework that transforms natural language inputs into code inputs, presenting a novel environment for testing the safety generalization of LLMs. Our comprehensive studies on state-of-the-art LLMs, including GPT-4, Claude-2, and the Llama-2 series, reveal a new and universal safety vulnerability of these models to code input: CodeAttack bypasses the safety guardrails of all models more than 80% of the time. We find that a larger distribution gap between CodeAttack and natural language leads to weaker safety generalization; for example, encoding the natural-language input with data structures widens this gap. Furthermore, we hypothesize that CodeAttack succeeds because of a misaligned bias acquired by LLMs during code training: the models prioritize completing the code over avoiding the potential safety risk. Finally, we analyze potential mitigation measures. These findings highlight new safety risks in the code domain and the need for more robust safety alignment algorithms that match the code capabilities of LLMs.
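The abstract attributes CodeAttack's success to natural-language payloads being hidden inside code and data structures, and it closes by pointing to potential mitigations. The sketch below illustrates one such normalization idea, extracting string literals from a code-shaped input before running an existing natural-language safety filter; it is an assumption-laden illustration, not the mitigation analyzed in the paper.

```python
import ast
from typing import Callable, List

# Illustrative normalization step for code-shaped inputs: extract string
# literals hidden inside data structures so that an existing
# natural-language safety filter sees the payload, not just code syntax.
# This is a sketch of one possible mitigation direction, not the
# mitigation analyzed in the paper.


def extract_string_payloads(code_input: str) -> List[str]:
    """Collect string literals embedded in a Python code snippet."""
    try:
        tree = ast.parse(code_input)
    except SyntaxError:
        return [code_input]  # not parseable as code; fall back to raw text
    return [node.value for node in ast.walk(tree)
            if isinstance(node, ast.Constant) and isinstance(node.value, str)]


def normalized_safety_check(code_input: str,
                            is_harmful: Callable[[str], bool]) -> bool:
    """Run a natural-language safety filter over the raw input and over
    any strings recovered from its data structures."""
    candidates = [code_input] + extract_string_payloads(code_input)
    return any(is_harmful(text) for text in candidates)
```

A guard that only sees raw code text can miss a payload split across list elements, which is one concrete form of the distribution gap the abstract describes.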


Mind Your Solver! On Adversarial Attack and Defense for Combinatorial Optimization

arXiv.org Artificial Intelligence

Combinatorial optimization (CO) is a challenging task not only in its inherent complexity (e.g., NP-hard) but also in its possible sensitivity to input conditions. In this paper, we take an initiative on developing the mechanisms for adversarial attack and defense towards combinatorial optimization solvers, whereby the solver is treated as a black-box function and the original problem's underlying graph structure serves as the input to be perturbed. It is worth noting that many CO problems can be essentially formulated as graph problems (Khalil et al., 2017; Bengio et al., 2020), hence it is attractive and natural to modify the problem instance by modifying the graph structure, to generate more test cases for solvers. In fact, vulnerability can often be an inherent challenge for CO solvers since the problem is often strongly nonlinear and NP-hard. From this perspective, we consider attack and defense for CO solvers in the following aspects.
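To make the black-box setting above concrete, the following sketch treats a CO solver as an opaque graph-to-objective function and greedily searches, within a small budget, for edge flips that degrade solution quality. The move set, the greedy search, and the "lower objective is worse" convention are illustrative assumptions rather than the attack-and-defense mechanisms developed in the paper.

```python
import random
from typing import Callable

import networkx as nx

# Minimal sketch of the black-box setting described above: the solver is
# an opaque graph -> objective function, and we greedily search over
# single-edge flips, within a small budget, for the perturbation that
# degrades solution quality the most. The move set and greedy search are
# illustrative assumptions, not the paper's attack algorithm.


def attack_solver(solver: Callable[[nx.Graph], float],
                  graph: nx.Graph,
                  budget: int = 3,
                  candidates_per_step: int = 50,
                  seed: int = 0) -> nx.Graph:
    """Return a perturbed graph (at most `budget` edge flips) on which
    the solver's objective, assumed to be maximized, drops the most."""
    rng = random.Random(seed)
    current = graph.copy()
    current_value = solver(current)
    nodes = list(current.nodes)
    for _ in range(budget):
        best_graph, best_value = current, current_value
        for _ in range(candidates_per_step):
            u, v = rng.sample(nodes, 2)
            perturbed = current.copy()
            if perturbed.has_edge(u, v):
                perturbed.remove_edge(u, v)
            else:
                perturbed.add_edge(u, v)
            value = solver(perturbed)
            if value < best_value:  # lower objective = more damaging edit
                best_graph, best_value = perturbed, value
        current, current_value = best_graph, best_value
    return current
```

For example, `solver` could wrap a networkx heuristic such as maximum-weight matching and return the total matched weight; the returned graph then serves as a harder test case for that heuristic, and a defended (robust) solver should lose less objective value on it.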