AITopics | Wang, Kailong

Wang, Kailong

Detecting LLM Fact-conflicting Hallucinations Enhanced by Temporal-logic-based Reasoning

Li, Ningke, Song, Yahui, Wang, Kailong, Li, Yuekang, Shi, Ling, Liu, Yi, Wang, Haoyu

arXiv.org Artificial Intelligence, Feb-18-2025

Abstract: Large language models (LLMs) face the challenge of hallucinations - outputs that seem coherent but are actually incorrect. A particularly damaging type is fact-conflicting hallucination (FCH), where generated content contradicts established facts. Addressing FCH presents three main challenges: 1) Automatically constructing and maintaining large-scale benchmark datasets is difficult and resource-intensive; 2) Generating complex and efficient test cases that the LLM has not been trained on - especially those involving intricate temporal features - is challenging, yet crucial for eliciting hallucinations; and 3) Validating the reasoning behind LLM outputs is inherently difficult, particularly with complex logical relationships, as it requires transparency in the model's decision-making process. LLMs are tested using these cases through template-based prompts, which require them to generate both answers and reasoning steps. To validate the reasoning, we propose two semantic-aware oracles that compare the semantic structure of LLM outputs to the ground truths. Key insights reveal that LLMs struggle with out-of-distribution knowledge and logical reasoning. These findings highlight the importance of continued efforts to detect and mitigate hallucinations in LLMs.

Large Language Models (LLMs) have revolutionized language processing, demonstrating impressive text generation and comprehension capabilities with diverse applications. However, despite their growing use, LLMs face significant security and privacy challenges [1], [2], [3], [4], [5], which affect their overall effectiveness and reliability. A critical issue is the phenomenon of hallucination, where LLMs generate outputs that are coherent but factually incorrect or irrelevant. This tendency to produce misleading information compromises the safety and usability of LLM-based systems. This paper focuses on fact-conflicting hallucination (FCH), the most prominent form of hallucination in LLMs.

FCH occurs when LLMs generate content that directly contradicts established facts. For instance, as illustrated in Figure 1, an LLM incorrectly asserts that "Haruki Murakami won the Nobel Prize in Literature in 2016", whereas the fact is that "Haruki Murakami has not won the Nobel Prize, though he has received numerous other literary awards". Such inaccuracies can cause significant user confusion and undermine the trust and reliability that are crucial for LLM applications.

N. Li, K. Wang, and H. Wang are with Huazhong University of Science and Technology, China. Song is with the National University of Singapore, Singapore. Li is with the University of New South Wales, Australia.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.13416

Country:

North America > United States (1.00)
Oceania > Australia > New South Wales (0.24)
Asia > Singapore > Central Region > Singapore (0.24)

Genre: Personal > Honors (1.00)

Industry:

Leisure & Entertainment (0.92)
Information Technology (0.66)
Media > Television (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Model-Editing-Based Jailbreak against Safety-aligned Large Language Models

Li, Yuxi, Zhang, Zhibo, Wang, Kailong, Shi, Ling, Wang, Haoyu

arXiv.org Artificial Intelligence, Dec-11-2024

Large Language Models (LLMs) have transformed numerous fields by enabling advanced natural language interactions but remain susceptible to critical vulnerabilities, particularly jailbreak attacks. Current jailbreak techniques, while effective, often depend on input modifications, making them detectable and limiting their stealth and scalability. This paper presents Targeted Model Editing (TME), a novel white-box approach that bypasses safety filters by minimally altering internal model structures while preserving the model's intended functionalities. TME identifies and removes safety-critical transformations (SCTs) embedded in model matrices, enabling malicious queries to bypass restrictions without input modifications. By analyzing distinct activation patterns between safe and unsafe queries, TME isolates and approximates SCTs through an optimization process. Implemented in the D-LLM framework, our method achieves an average Attack Success Rate (ASR) of 84.86% on four mainstream open-source LLMs, maintaining high performance. Unlike existing methods, D-LLM eliminates the need for specific triggers or harmful response collections, offering a stealthier and more effective jailbreak strategy. This work reveals a covert and robust threat vector in LLM security and emphasizes the need for stronger safeguards in model safety alignment.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2412.08201

Country:

Asia (1.00)
North America > United States (0.68)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

The Fusion of Large Language Models and Formal Methods for Trustworthy AI Agents: A Roadmap

Zhang, Yedi, Cai, Yufan, Zuo, Xinyue, Luan, Xiaokun, Wang, Kailong, Hou, Zhe, Zhang, Yifan, Wei, Zhiyuan, Sun, Meng, Sun, Jun, Sun, Jing, Dong, Jin Song

arXiv.org Artificial Intelligence, Dec-9-2024

Large Language Models (LLMs) have emerged as a transformative AI paradigm, profoundly influencing daily life through their exceptional language understanding and contextual generation capabilities. Despite their remarkable performance, LLMs face a critical challenge: the propensity to produce unreliable outputs due to the inherent limitations of their learning-based nature. Formal methods (FMs), on the other hand, are a well-established computation paradigm that provides mathematically rigorous techniques for modeling, specifying, and verifying the correctness of systems. FMs have been extensively applied in mission-critical software engineering, embedded systems, and cybersecurity. However, the primary challenge impeding the deployment of FMs in real-world settings lies in their steep learning curves, the absence of user-friendly interfaces, and issues with efficiency and adaptability. This position paper outlines a roadmap for advancing the next generation of trustworthy AI systems by leveraging the mutual enhancement of LLMs and FMs. First, we illustrate how FMs, including reasoning and certification techniques, can help LLMs generate more reliable and formally certified outputs. Subsequently, we highlight how the advanced learning capabilities and adaptability of LLMs can significantly enhance the usability, efficiency, and scalability of existing FM tools. Finally, we show that unifying these two computation paradigms -- integrating the flexibility and intelligence of LLMs with the rigorous reasoning abilities of FMs -- has transformative potential for the development of trustworthy AI software systems. We acknowledge that this integration has the potential to enhance both the trustworthiness and efficiency of software engineering practices while fostering the development of intelligent FM tools capable of addressing complex yet real-world challenges.

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2412.06512

Country:

Asia (1.00)
North America > United States (0.68)
North America > Canada (0.67)

Genre:

Research Report (1.00)
Workflow (0.68)

Industry:

Information Technology > Security & Privacy (0.48)
Transportation (0.46)
Government > Military (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Lockpicking LLMs: A Logit-Based Jailbreak Using Token-level Manipulation

Li, Yuxi, Liu, Yi, Li, Yuekang, Shi, Ling, Deng, Gelei, Chen, Shengquan, Wang, Kailong

arXiv.org Artificial Intelligence, Jun-19-2024

Large language models (LLMs) have transformed the field of natural language processing, but they remain susceptible to jailbreaking attacks that exploit their capabilities to generate unintended and potentially harmful content. Existing token-level jailbreaking techniques, while effective, face scalability and efficiency challenges, especially as models undergo frequent updates and incorporate advanced defensive measures. In this paper, we introduce JailMine, an innovative token-level manipulation approach that addresses these limitations effectively. JailMine employs an automated "mining" process to elicit malicious responses from LLMs by strategically selecting affirmative outputs and iteratively reducing the likelihood of rejection. Through rigorous testing across multiple well-known LLMs and datasets, we demonstrate JailMine's effectiveness and efficiency, achieving a significant average reduction of 86% in time consumed while maintaining high success rates averaging 95%, even in the face of evolving defensive strategies. Our work contributes to the ongoing effort to assess and mitigate the vulnerability of LLMs to jailbreaking attacks, underscoring the importance of continued vigilance and proactive measures to enhance the security and reliability of these powerful language models.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2405.13068

Country: North America > United States (0.93)

Genre:

Instructional Material (0.97)
Research Report > New Finding (0.93)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Glitch Tokens in Large Language Models: Categorization Taxonomy and Effective Detection

Li, Yuxi, Liu, Yi, Deng, Gelei, Zhang, Ying, Song, Wenjia, Shi, Ling, Wang, Kailong, Li, Yuekang, Liu, Yang, Wang, Haoyu

arXiv.org Artificial Intelligence, Apr-19-2024

With the expanding application of Large Language Models (LLMs) in various domains, it becomes imperative to comprehensively investigate their unforeseen behaviors and consequent outcomes. In this study, we introduce and systematically explore the phenomenon of "glitch tokens", which are anomalous tokens produced by established tokenizers and could potentially compromise the models' quality of response. Specifically, we experiment on seven popular LLMs utilizing three distinct tokenizers and involving a total of 182,517 tokens. We present categorizations of the identified glitch tokens and of the symptoms exhibited by LLMs when interacting with glitch tokens. Based on our observation that glitch tokens tend to cluster in the embedding space, we propose GlitchHunter, a novel iterative clustering-based technique, for efficient glitch token detection. The evaluation shows that our approach notably outperforms three baseline methods on eight open-source LLMs. To the best of our knowledge, we present the first comprehensive study on glitch tokens. Our new detection technique further provides valuable insights into mitigating tokenization-related errors in LLMs.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2404.09894

Country: North America > United States > Oregon (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)

Beyond Fidelity: Explaining Vulnerability Localization of Learning-based Detectors

Cheng, Baijun, Zhao, Shengming, Wang, Kailong, Wang, Meizhen, Bai, Guangdong, Feng, Ruitao, Guo, Yao, Ma, Lei, Wang, Haoyu

arXiv.org Artificial Intelligence, Jan-5-2024

Vulnerability detectors based on deep learning (DL) models have proven their effectiveness in recent years. However, the shroud of opacity surrounding the decision-making process of these detectors makes their predictions difficult for security analysts to comprehend. To address this, various explanation approaches have been proposed to explain the predictions by highlighting important features; such approaches have proven effective in other domains such as computer vision and natural language processing. Unfortunately, an in-depth evaluation of vulnerability-critical features, such as fine-grained vulnerability-related code lines, learned and understood by these explanation approaches remains lacking. In this study, we first evaluate the performance of ten explanation approaches for vulnerability detectors based on graph and sequence representations, measured by two quantitative metrics: fidelity and vulnerability line coverage rate. Our results show that fidelity alone is not sufficient for evaluating these approaches, as fidelity incurs significant fluctuations across different datasets and detectors. We subsequently check the precision of the vulnerability-related code lines reported by the explanation approaches, and find poor accuracy in this task among all of them. This can be attributed to the inefficiency of explainers in selecting important features and the presence of irrelevant artifacts learned by DL-based detectors.
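
The abstract names fidelity, vulnerability line coverage rate, and line-level precision without defining them. The sketch below shows one plausible reading of each, purely as an illustration: the function names, the top-k cutoff, and the "probability drop" form of fidelity are assumptions made here and are not taken from the paper.

```python
from typing import Sequence, Set


def line_coverage_rate(reported: Sequence[int], vulnerable: Set[int], k: int) -> float:
    """Share of ground-truth vulnerable lines found among the top-k reported lines.

    Assumed definition (hypothetical): `reported` is the explainer's ranking of
    line numbers, most important first.
    """
    if not vulnerable:
        return 0.0
    top_k = set(reported[:k])
    return len(top_k & vulnerable) / len(vulnerable)


def line_precision(reported: Sequence[int], vulnerable: Set[int], k: int) -> float:
    """Share of the top-k reported lines that are actually vulnerable."""
    top_k = list(reported[:k])
    if not top_k:
        return 0.0
    hits = sum(1 for line in top_k if line in vulnerable)
    return hits / len(top_k)


def fidelity(prob_full: float, prob_masked: float) -> float:
    """Assumed fidelity: drop in the detector's predicted probability once the
    features highlighted by the explainer are masked out of the input."""
    return prob_full - prob_masked


# Illustrative data only: an explainer ranks five lines, three lines are truly vulnerable.
reported_lines = [12, 40, 13, 7, 88]
vulnerable_lines = {12, 13, 14}

coverage_at_3 = line_coverage_rate(reported_lines, vulnerable_lines, k=3)
precision_at_5 = line_precision(reported_lines, vulnerable_lines, k=5)
fidelity_score = fidelity(prob_full=0.9, prob_masked=0.4)
```

Under these assumed definitions an explainer can score well on fidelity while its highlighted lines rarely match the vulnerable ones, which is the kind of gap the abstract reports.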

detector, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2401.02686

Country:

Oceania > Australia (0.28)
North America > Canada > Alberta (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Digger: Detecting Copyright Content Mis-usage in Large Language Model Training

Li, Haodong, Deng, Gelei, Liu, Yi, Wang, Kailong, Li, Yuekang, Zhang, Tianwei, Liu, Yang, Xu, Guoai, Xu, Guosheng, Wang, Haoyu

arXiv.org Artificial Intelligence, Jan-1-2024

Pre-training, which utilizes extensive and varied datasets, is a critical factor in the success of Large Language Models (LLMs) across numerous applications. However, the detailed makeup of these datasets is often not disclosed, leading to concerns about data security and potential misuse. This is particularly relevant when copyrighted material, still under legal protection, is used inappropriately, either intentionally or unintentionally, infringing on the rights of the authors. In this paper, we introduce a detailed framework designed to detect and assess the presence of content from potentially copyrighted books within the training datasets of LLMs. This framework also provides a confidence estimation for the likelihood of each content sample's inclusion. To validate our approach, we conduct a series of simulated experiments, the results of which affirm the framework's effectiveness in identifying and addressing instances of content misuse in LLM training processes. Furthermore, we investigate the presence of recognizable quotes from famous literary works within these datasets. The outcomes of our study have significant implications for ensuring the ethical use of copyrighted materials in the development of LLMs, highlighting the need for more transparent and responsible data management practices in this field.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2401.00676

Country: Asia > China (0.46)

Genre: Research Report > New Finding (0.93)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Large Language Models for Software Engineering: A Systematic Literature Review

Hou, Xinyi, Zhao, Yanjie, Liu, Yue, Yang, Zhou, Wang, Kailong, Li, Li, Luo, Xiapu, Lo, David, Grundy, John, Wang, Haoyu

arXiv.org Artificial Intelligence, Sep-12-2023

Large Language Models (LLMs) have significantly impacted numerous domains, including Software Engineering (SE). Many recent publications have explored LLMs applied to various SE tasks. Nevertheless, a comprehensive understanding of the application, effects, and possible limitations of LLMs on SE is still in its early stages. To bridge this gap, we conducted a systematic literature review on LLM4SE, with a particular focus on understanding how LLMs can be exploited to optimize processes and outcomes. We collect and analyze 229 research papers from 2017 to 2023 to answer four key research questions (RQs). In RQ1, we categorize different LLMs that have been employed in SE tasks, characterizing their distinctive features and uses. In RQ2, we analyze the methods used in data collection, preprocessing, and application, highlighting the role of well-curated datasets for successful LLM4SE implementation. RQ3 investigates the strategies employed to optimize and evaluate the performance of LLMs in SE. Finally, RQ4 examines the specific SE tasks where LLMs have shown success to date, illustrating their practical contributions to the field. From the answers to these RQs, we discuss the current state of the art and trends, identifying gaps in existing research and flagging promising areas for future study.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2308.10620

Country:

Asia > China (0.46)
North America > United States > Pennsylvania (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > Promising Solution (0.92)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Prompt Injection attack against LLM-integrated Applications

Liu, Yi, Deng, Gelei, Li, Yuekang, Wang, Kailong, Zhang, Tianwei, Liu, Yepang, Wang, Haoyu, Zheng, Yan, Liu, Yang

arXiv.org Artificial Intelligence, Jun-8-2023

Large Language Models (LLMs), renowned for their superior proficiency in language comprehension and generation, stimulate a vibrant ecosystem of applications around them. However, their extensive assimilation into various services introduces significant security risks. This study deconstructs the complexities and implications of prompt injection attacks on actual LLM-integrated applications. Initially, we conduct an exploratory analysis on ten commercial applications, highlighting the constraints of current attack strategies in practice. Prompted by these limitations, we subsequently formulate HouYi, a novel black-box prompt injection attack technique, which draws inspiration from traditional web injection attacks. HouYi is compartmentalized into three crucial elements: a seamlessly-incorporated pre-constructed prompt, an injection prompt inducing context partition, and a malicious payload designed to fulfill the attack objectives. Leveraging HouYi, we unveil previously unknown and severe attack outcomes, such as unrestricted arbitrary LLM usage and uncomplicated application prompt theft. We deploy HouYi on 36 actual LLM-integrated applications and discern 31 applications susceptible to prompt injection. 10 vendors have validated our discoveries, including Notion, which has the potential to impact millions of users. Our investigation illuminates both the possible risks of prompt injection attacks and the possible tactics for mitigation.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2306.05499

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)