AITopics | Zhang, Shudong

Collaborating Authors

Zhang, Shudong

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

GlitchMiner: Mining Glitch Tokens in Large Language Models via Gradient-based Discrete Optimization

Wu, Zihui, Gao, Haichang, Wang, Ping, Zhang, Shudong, Liu, Zhaoxiang, Lian, Shiguo

arXiv.org Artificial IntelligenceNov-9-2024

Glitch tokens in Large Language Models (LLMs) can trigger unpredictable behaviors, threatening model reliability and safety. Existing detection methods rely on predefined patterns, limiting their adaptability across diverse LLM architectures. We propose GlitchMiner, a gradient-based discrete optimization framework that efficiently identifies glitch tokens by introducing entropy as a measure of prediction uncertainty and employing a local search strategy to explore the token space. Experiments across multiple LLM architectures demonstrate that GlitchMiner outperforms existing methods in detection accuracy and adaptability, achieving over 10% average efficiency improvement. This method enhances vulnerability assessment in LLMs, contributing to the development of more robust and reliable applications.

large language model, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2410.15052

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

ART: Automatic Red-teaming for Text-to-Image Models to Protect Benign Users

Li, Guanlin, Chen, Kangjie, Zhang, Shudong, Zhang, Jie, Zhang, Tianwei

arXiv.org Artificial IntelligenceJun-17-2024

Large-scale pre-trained generative models are taking the world by storm, due to their abilities in generating creative content. Meanwhile, safeguards for these generative models are developed, to protect users' rights and safety, most of which are designed for large language models. Existing methods primarily focus on jailbreak and adversarial attacks, which mainly evaluate the model's safety under malicious prompts. Recent work found that manually crafted safe prompts can unintentionally trigger unsafe generations. To further systematically evaluate the safety risks of text-to-image models, we propose a novel Automatic Red-Teaming framework, ART. Our method leverages both vision language model and large language model to establish a connection between unsafe generations and their prompts, thereby more efficiently identifying the model's vulnerabilities. With our comprehensive experiments, we reveal the toxicity of the popular open-source text-to-image models. The experiments also validate the effectiveness, adaptability, and great diversity of ART. Additionally, we introduce three large-scale red-teaming datasets for studying the safety risks associated with text-to-image models. Datasets and models can be found in https://github.com/GuanlinLee/ART.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2405.1936

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.34)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law > Criminal Law (0.94)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.67)
Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)

Add feedback

Lower Difficulty and Better Robustness: A Bregman Divergence Perspective for Adversarial Training

Wu, Zihui, Gao, Haichang, Zhou, Bingqian, Guo, Xiaoyan, Zhang, Shudong

arXiv.org Artificial IntelligenceJan-5-2024

In this paper, we investigate on improving the adversarial robustness obtained in adversarial training (AT) via reducing the difficulty of optimization. To better study this problem, we build a novel Bregman divergence perspective for AT, in which AT can be viewed as the sliding process of the training data points on the negative entropy curve. Based on this perspective, we analyze the learning objectives of two typical AT methods, i.e., PGD-AT and TRADES, and we find that the optimization process of TRADES is easier than PGD-AT for that TRADES separates PGD-AT. In addition, we discuss the function of entropy in TRADES, and we find that models with high entropy can be better robustness learners. Inspired by the above findings, we propose two methods, i.e., FAIT and MER, which can both not only reduce the difficulty of optimization under the 10-step PGD adversaries, but also provide better robustness. Our work suggests that reducing the difficulty of optimization under the 10-step PGD adversaries is a promising approach for enhancing the adversarial robustness in AT.

artificial intelligence, machine learning, robustness, (16 more...)

arXiv.org Artificial Intelligence

2208.12511

Country:

Europe (0.67)
North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback