AITopics | Lao, Yingjie

Collaborating Authors

Lao, Yingjie

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

UniGuardian: A Unified Defense for Detecting Prompt Injection, Backdoor Attacks and Adversarial Attacks in Large Language Models

Lin, Huawei, Lao, Yingjie, Geng, Tong, Yu, Tan, Zhao, Weijie

arXiv.org Artificial IntelligenceFeb-18-2025

Large Language Models (LLMs) are vulnerable to attacks like prompt injection, backdoor attacks, and adversarial attacks, which manipulate prompts or models to generate harmful outputs. In this paper, departing from traditional deep learning attack paradigms, we explore their intrinsic relationship and collectively term them Prompt Trigger Attacks (PTA). This raises a key question: Can we determine if a prompt is benign or poisoned? To address this, we propose UniGuardian, the first unified defense mechanism designed to detect prompt injection, backdoor attacks, and adversarial attacks in LLMs. Additionally, we introduce a single-forward strategy to optimize the detection pipeline, enabling simultaneous attack detection and text generation within a single forward pass. Our experiments confirm that UniGuardian accurately and efficiently identifies malicious prompts in LLMs.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2502.13141

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Research Report (0.64)
Overview (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Online Gradient Boosting Decision Tree: In-Place Updates for Efficient Adding/Deleting Data

Lin, Huawei, Chung, Jun Woo, Lao, Yingjie, Zhao, Weijie

arXiv.org Machine LearningFeb-3-2025

Gradient Boosting Decision Tree (GBDT) is one of the most popular machine learning models in various applications. However, in the traditional settings, all data should be simultaneously accessed in the training procedure: it does not allow to add or delete any data instances after training. In this paper, we propose an efficient online learning framework for GBDT supporting both incremental and decremental learning. To the best of our knowledge, this is the first work that considers an in-place unified incremental and decremental learning on GBDT. To reduce the learning cost, we present a collection of optimizations for our framework, so that it can add or delete a small fraction of data on the fly. We theoretically show the relationship between the hyper-parameters of the proposed optimizations, which enables trading off accuracy and cost on incremental and decremental learning. The backdoor attack results show that our framework can successfully inject and remove backdoor in a well-trained model using incremental and decremental learning, and the empirical results on public datasets confirm the effectiveness and efficiency of our proposed online learning framework and optimizations.

artificial intelligence, decision tree learning, machine learning, (19 more...)

arXiv.org Machine Learning

2502.01634

Country:

Europe (1.00)
Asia (0.67)
North America > United States > California (0.46)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models

Han, Yuning, Zhao, Bingyin, Chu, Rui, Luo, Feng, Sikdar, Biplab, Lao, Yingjie

arXiv.org Artificial IntelligenceDec-31-2024

Recent studies show that diffusion models (DMs) are vulnerable to backdoor attacks. Existing backdoor attacks impose unconcealed triggers (e.g., a gray box and eyeglasses) that contain evident patterns, rendering remarkable attack effects yet easy detection upon human inspection and defensive algorithms. While it is possible to improve stealthiness by reducing the strength of the backdoor, doing so can significantly compromise its generality and effectiveness. In this paper, we propose UIBDiffusion, the universal imperceptible backdoor attack for diffusion models, which allows us to achieve superior attack and generation performance while evading state-of-the-art defenses. We propose a novel trigger generation approach based on universal adversarial perturbations (UAPs) and reveal that such perturbations, which are initially devised for fooling pre-trained discriminative models, can be adapted as potent imperceptible backdoor triggers for DMs. We evaluate UIBDiffusion on multiple types of DMs with different kinds of samplers across various datasets and targets. Experimental results demonstrate that UIBDiffusion brings three advantages: 1) Universality, the imperceptible trigger is universal (i.e., image and model agnostic) where a single trigger is effective to any images and all diffusion models with different samplers; 2) Utility, it achieves comparable generation quality (e.g., FID) and even better attack success rate (i.e., ASR) at low poison rates compared to the prior works; and 3) Undetectability, UIBDiffusion is plausible to human perception and can bypass Elijah and TERD, the SOTA defenses against backdoors for DMs. We will release our backdoor triggers and code.

artificial intelligence, machine learning, uibdiffusion, (14 more...)

arXiv.org Artificial Intelligence

2412.11441

Country:

North America > Canada (0.68)
Europe > Austria > Vienna (0.14)
North America > United States > California (0.14)
(2 more...)

Genre: Research Report > New Finding (0.86)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

DMin: Scalable Training Data Influence Estimation for Diffusion Models

Lin, Huawei, Lao, Yingjie, Zhao, Weijie

arXiv.org Artificial IntelligenceDec-11-2024

Identifying the training data samples that most influence a generated image is a critical task in understanding diffusion models, yet existing influence estimation methods are constrained to small-scale or LoRA-tuned models due to computational limitations. As diffusion models scale up, these methods become impractical. To address this challenge, we propose DMin (Diffusion Model influence), a scalable framework for estimating the influence of each training data sample on a given generated image. By leveraging efficient gradient compression and retrieval techniques, DMin reduces storage requirements from 339.39 TB to only 726 MB and retrieves the top-k most influential training samples in under 1 second, all while maintaining performance. Our empirical results demonstrate DMin is both effective in identifying influential training samples and efficient in terms of computational and storage requirements.

artificial intelligence, diffusion model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2412.08637

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Hawaii (0.14)
North America > United States > Maryland (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Theoretical Corrections and the Leveraging of Reinforcement Learning to Enhance Triangle Attack

Meng, Nicole, Manicke, Caleb, Chen, David, Lao, Yingjie, Ding, Caiwen, Hong, Pengyu, Mahmood, Kaleel

arXiv.org Artificial IntelligenceNov-18-2024

Adversarial examples represent a serious issue for the application of machine learning models in many sensitive domains. For generating adversarial examples, decision based black-box attacks are one of the most practical techniques as they only require query access to the model. One of the most recently proposed state-of-the-art decision based black-box attacks is Triangle Attack (TA). In this paper, we offer a high-level description of TA and explain potential theoretical limitations. We then propose a new decision based black-box attack, Triangle Attack with Reinforcement Learning (TARL). Our new attack addresses the limits of TA by leveraging reinforcement learning. This creates an attack that can achieve similar, if not better, attack accuracy than TA with half as many queries on state-of-the-art classifiers and defenses across ImageNet and CIFAR-10.

adversarial example, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2411.12071

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Understanding the Robustness of Randomized Feature Defense Against Query-Based Adversarial Attacks

Nguyen, Quang H., Lao, Yingjie, Pham, Tung, Wong, Kok-Seng, Doan, Khoa D.

arXiv.org Artificial IntelligenceSep-30-2023

Recent works have shown that deep neural networks are vulnerable to adversarial examples that find samples close to the original image but can make the model misclassify. Even with access only to the model's output, an attacker can employ black-box attacks to generate such adversarial examples. In this work, we propose a simple and lightweight defense against black-box attacks by adding random noise to hidden features at intermediate layers of the model at inference time. Our theoretical analysis confirms that this method effectively enhances the model's resilience against both score-based and decision-based black-box attacks. Importantly, our defense does not necessitate adversarial training and has minimal impact on accuracy, rendering it applicable to any pre-trained model. Our analysis also reveals the significance of selectively adding noise to different parts of the model based on the gradient of the adversarial objective function, which can be varied during the attack. We demonstrate the robustness of our defense against multiple black-box attacks through extensive empirical experiments involving diverse models with various architectures.

artificial intelligence, machine learning, robustness, (19 more...)

arXiv.org Artificial Intelligence

2310.00567

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Class-Oriented Poisoning Attack

Zhao, Bingyin, Lao, Yingjie

arXiv.org Machine LearningJul-31-2020

Poisoning attacks on machine learning systems compromise the model performance by deliberately injecting malicious samples in the training dataset to influence the training process. Prior works focus on either availability attacks (i.e., lowering the overall model accuracy) or integrity attacks (i.e., enabling specific instance based backdoor). In this paper, we advance the adversarial objectives of the availability attacks to a per-class basis, which we refer to as class-oriented poisoning attacks. We demonstrate that the proposed attack is capable of forcing the corrupted model to predict in two specific ways: (i) classify unseen new images to a targeted "supplanter" class, and (ii) misclassify images from a "victim" class while maintaining the classification accuracy on other non-victim classes. To maximize the adversarial effect, we propose a gradient-based framework that manipulates the logits to retain/eliminate the desired/undesired feature information in the generated poisoning images. Using newly defined metrics at the class level, we illustrate the effectiveness of the proposed class-oriented poisoning attacks on various models (e.g., LeNet-5, Vgg-9, and ResNet-50) over a wide range of datasets (e.g., MNIST, CIFAR-10, and ImageNet-ILSVRC2012).

deep learning, neural network, poisoning attack, (17 more...)

arXiv.org Machine Learning

2008.00047

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Transportation (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Hardware Trojan Attacks on Neural Networks

Clements, Joseph, Lao, Yingjie

arXiv.org Machine LearningJun-14-2018

With the rising popularity of machine learning and the ever increasing demand for computational power, there is a growing need for hardware optimized implementations of neural networks and other machine learning models. As the technology evolves, it is also plausible that machine learning or artificial intelligence will soon become consumer electronic products and military equipment, in the form of well-trained models. Unfortunately, the modern fabless business model of manufacturing hardware, while economic, leads to deficiencies in security through the supply chain. In this paper, we illuminate these security issues by introducing hardware Trojan attacks on neural networks, expanding the current taxonomy of neural network security to incorporate attacks of this nature. To aid in this, we develop a novel framework for inserting malicious hardware Trojans in the implementation of a neural network classifier. We evaluate the capabilities of the adversary in this setting by implementing the attack algorithm on convolutional neural networks while controlling a variety of parameters available to the adversary. Our experimental results show that the proposed algorithm could effectively classify a selected input trigger as a specified class on the MNIST dataset by injecting hardware Trojans into $0.03\%$, on average, of neurons in the 5th hidden layer of arbitrary 7-layer convolutional neural networks, while undetectable under the test data. Finally, we discuss the potential defenses to protect neural networks against hardware Trojan attacks.

deep learning, hardware trojan, neural network, (18 more...)

arXiv.org Machine Learning

1806.05768

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback