Wang, Haoyu
Haptic Repurposing with GenAI
Wang, Haoyu
Mixed Reality aims to merge the digital and physical worlds to create immersive human-computer interactions. Despite notable advancements, the absence of realistic haptic feedback often breaks the immersive experience by creating a disconnect between visual and tactile perceptions. This paper introduces Haptic Repurposing with GenAI, an innovative approach to enhancing MR interactions by transforming any physical object into an adaptive haptic interface for AI-generated virtual assets. Utilizing state-of-the-art generative AI models, the system captures both 2D and 3D features of physical objects and, through user-directed prompts, generates corresponding virtual objects that preserve the physical form of the originals. Through model-based object tracking, the system dynamically anchors virtual assets to physical props in real time, allowing objects to visually morph into any user-specified virtual object. This paper details the system's development, presents findings from usability studies that validate its effectiveness, and explores its potential to significantly enhance interactive MR environments. The hope is that this work lays a foundation for further research into AI-driven spatial transformation in immersive and haptic technologies.
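The abstract gives no implementation details; the following is a minimal sketch of the real-time anchoring step it describes, assuming the model-based tracker returns a 4x4 world-from-prop pose each frame. The tracker and renderer handles named in the comments are hypothetical placeholders, not the authors' system.

```python
# Minimal sketch of anchoring an AI-generated virtual asset to a tracked
# physical prop. Assumption: the tracker provides a 4x4 homogeneous pose.
import numpy as np

def anchor_virtual_asset(tracked_pose: np.ndarray,
                         offset: np.ndarray = np.eye(4)) -> np.ndarray:
    """Return the world pose of the virtual asset rigidly attached to the prop.

    tracked_pose: 4x4 transform (world <- physical prop) from model-based tracking
    offset:       4x4 transform (prop <- virtual asset origin), calibrated once
                  so the generated mesh overlays the physical object.
    """
    return tracked_pose @ offset

# Per-frame loop (get_tracked_pose / set_virtual_pose are hypothetical handles):
# while running:
#     T_prop = get_tracked_pose("prop_01")
#     set_virtual_pose("generated_asset", anchor_virtual_asset(T_prop))
```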
PRSA: PRompt Stealing Attacks against Large Language Models
Yang, Yong, Li, Changjiang, Jiang, Yi, Chen, Xi, Wang, Haoyu, Zhang, Xuhong, Wang, Zonghui, Ji, Shouling
In recent years, "prompt as a service" has greatly enhanced the utility of large language models (LLMs) by enabling them to perform various downstream tasks efficiently without fine-tuning. This has also increased the commercial value of prompts. However, the potential risk of leakage in these commercialized prompts remains largely underexplored. In this paper, we introduce a novel attack framework, PRSA, designed for prompt stealing attacks against LLMs. The main idea of PRSA is to infer the intent behind a prompt by analyzing its input-output content, enabling the generation of a surrogate prompt that replicates the original's functionality. Specifically, PRSA mainly consists of two key phases: prompt mutation and prompt pruning. In the mutation phase, we propose a prompt attention algorithm based on output difference. The algorithm facilitates the generation of effective surrogate prompts by learning key factors that influence the accurate inference of prompt intent. During the pruning phase, we employ a two-step related word identification strategy to detect and mask words that are highly related to the input, thus improving the generalizability of the surrogate prompts. We verify the actual threat of PRSA through evaluation in both real-world settings, non-interactive and interactive prompt services. The results strongly confirm the PRSA's effectiveness and generalizability. We have reported these findings to prompt service providers and actively collaborate with them to implement defensive measures.
CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs
Wang, Haoyu, Liu, Bei, Shao, Hang, Xiao, Bo, Zeng, Ke, Wan, Guanglu, Qian, Yanmin
Parameter quantization for Large Language Models (LLMs) has attracted increasing attention recently as a way to reduce memory costs and improve computational efficiency. Although early approaches have been widely adopted, existing methods suffer from poor performance in low-bit (such as 2 to 3 bit) scenarios. In this paper, we present a novel and effective Column-Level Adaptive weight Quantization (CLAQ) framework that introduces three types of adaptive strategies for LLM quantization. First, a K-Means clustering based algorithm is proposed that allows dynamic generation of quantization centroids for each column of a parameter matrix. Second, we design an outlier-guided adaptive precision search strategy that can dynamically assign varying bit-widths to different columns. Finally, a dynamic outlier reservation scheme is developed to retain some parameters in their original floating-point precision in exchange for boosted model performance. Experiments on various mainstream open-source LLMs including LLaMA-1, LLaMA-2 and Yi demonstrate that our method achieves state-of-the-art results across different bit settings, especially in extremely low-bit scenarios. Code is available at https://github.com/fayuge/CLAQ.
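As a rough illustration of the first component only (per-column K-Means centroids), the sketch below quantizes each column of a weight matrix to 2^bits centroids. The outlier-guided precision search, outlier reservation, and any layer-wise calibration used in practice are omitted; this is not the released CLAQ code.

```python
# Minimal sketch of column-wise K-Means weight quantization (assumed toy setup).
import numpy as np
from sklearn.cluster import KMeans

def quantize_columns(weight: np.ndarray, bits: int = 2, seed: int = 0) -> np.ndarray:
    """Quantize each column of `weight` to 2**bits K-Means centroids."""
    n_levels = 2 ** bits
    quantized = np.empty_like(weight)
    for j in range(weight.shape[1]):
        col = weight[:, j].reshape(-1, 1)
        km = KMeans(n_clusters=n_levels, n_init=10, random_state=seed).fit(col)
        quantized[:, j] = km.cluster_centers_[km.labels_, 0]
    return quantized

W = np.random.randn(256, 64).astype(np.float32)   # stand-in for a weight matrix
W_q = quantize_columns(W, bits=2)
print("mean abs quantization error:", np.abs(W - W_q).mean())
```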
LIDAO: Towards Limited Interventions for Debiasing (Large) Language Models
Liu, Tianci, Wang, Haoyu, Wang, Shiyang, Cheng, Yu, Gao, Jing
Large language models (LLMs) have achieved impressive performance on various natural language generation tasks. Nonetheless, they can generate negative and harmful content biased against certain demographic groups (e.g., females), raising severe fairness concerns. As remedies, prior works intervened in the generation by removing attitude or demographic information, inevitably degrading generation quality and resulting in notable fairness-fluency trade-offs. However, it remains under-explored to what extent fluency has to be affected in order to achieve a desired level of fairness. In this work, we conduct the first formal study from an information-theoretic perspective. We show that previous approaches are excessive for debiasing and propose LIDAO, a general framework that provably debiases a (L)LM at better fluency. We further robustify LIDAO in adversarial scenarios, where a carefully crafted prompt may stimulate LLMs with instruction-following abilities to generate texts whose fairness issues appear only when the prompt is also taken into account. Experiments on three LMs ranging from 0.7B to 7B parameters demonstrate the superiority of our method.
DL2Fence: Integrating Deep Learning and Frame Fusion for Enhanced Detection and Localization of Refined Denial-of-Service in Large-Scale NoCs
Wang, Haoyu, Halak, Basel, Ren, Jianjie, Atamli, Ahmad
This study introduces a refined Flooding-Injection-Rate-adjustable Denial-of-Service (DoS) model for Networks-on-Chip (NoCs) and, more importantly, presents DL2Fence, a novel framework utilizing Deep Learning (DL) and Frame Fusion (2F) for DoS detection and localization. Two Convolutional Neural Network models, for classification and segmentation, were developed to detect and localize DoS, respectively. The framework achieves detection and localization accuracies of 95.8% and 91.7%, and precision rates of 98.5% and 99.3%, in a 16x16 mesh NoC. Its hardware overhead decreases notably, by 76.3%, when scaling from 8x8 to 16x16 NoCs, and it requires 42.4% less hardware than state-of-the-art methods. This advancement demonstrates DL2Fence's effectiveness in balancing outstanding detection performance in large-scale NoCs with extremely low hardware overhead.
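The paper's exact architectures are not reproduced here; the sketch below only illustrates the two-CNN split it describes, with one classifier flagging a DoS in a fused traffic frame and one segmenter localizing affected routers. The input channels and layer sizes are illustrative assumptions for a 16x16 mesh.

```python
# Minimal PyTorch sketch of the classification + segmentation pairing described
# above (assumed shapes: fused traffic frames over a 16x16 mesh NoC).
import torch
import torch.nn as nn

class DoSDetector(nn.Module):          # frame-level classification: benign vs DoS
    def __init__(self, in_ch: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 2))
    def forward(self, x):
        return self.net(x)             # logits: [benign, DoS]

class DoSLocalizer(nn.Module):         # per-router segmentation of the attack
    def __init__(self, in_ch: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1))       # one logit per router in the mesh
    def forward(self, x):
        return self.net(x)

frames = torch.randn(8, 3, 16, 16)     # batch of fused traffic frames
print(DoSDetector()(frames).shape, DoSLocalizer()(frames).shape)
```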
Effective In-Context Example Selection through Data Compression
Sun, Zhongxiang, Zhang, Kepu, Wang, Haoyu, Zhang, Xiao, Xu, Jun
In-context learning has been extensively validated in large language models. However, the mechanism and selection strategy for in-context examples, a crucial ingredient of this approach, lack systematic and in-depth research. In this paper, we propose a data compression approach to the selection of in-context examples. We introduce a two-stage method that can effectively choose relevant examples and retain sufficient information about the training dataset within the in-context examples. Our method yields an average improvement of 5.90% across five different real-world datasets using four language models.
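The paper's two-stage method is not reproduced here; the sketch below is a compression-based stand-in that scores candidate examples with normalized compression distance (gzip) for relevance to the query and then greedily keeps a diverse subset. It only conveys how compression can proxy information retention; thresholds and staging are illustrative assumptions.

```python
# Toy compression-based in-context example selector (assumed two-stage scheme).
import gzip

def compressed_len(text: str) -> int:
    return len(gzip.compress(text.encode("utf-8")))

def ncd(a: str, b: str) -> float:
    """Normalized compression distance between two strings."""
    ca, cb, cab = compressed_len(a), compressed_len(b), compressed_len(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)

def select_examples(query: str, pool: list[str], k: int = 4) -> list[str]:
    # Stage 1: rank candidates by how little new information they add w.r.t.
    # the query (lower NCD = more relevant).
    ranked = sorted(pool, key=lambda ex: ncd(query, ex))
    # Stage 2: greedily keep examples that still differ from those already
    # chosen, so the set covers the training data rather than repeating itself.
    selected: list[str] = []
    for ex in ranked:
        if not selected or min(ncd(ex, s) for s in selected) > 0.2:
            selected.append(ex)
        if len(selected) == k:
            break
    return selected
```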
GPT Store Mining and Analysis
Su, Dongxun, Zhao, Yanjie, Hou, Xinyi, Wang, Shenao, Wang, Haoyu
As a pivotal extension of the renowned ChatGPT, the GPT Store serves as a dynamic marketplace for various Generative Pre-trained Transformer (GPT) models, shaping the frontier of conversational AI. This paper presents an in-depth measurement study of the GPT Store, with a focus on the categorization of GPTs by topic, factors influencing GPT popularity, and the potential security risks. Our investigation starts with assessing the categorization of GPTs in the GPT Store, analyzing how they are organized by topics, and evaluating the effectiveness of the classification system. We then examine the factors that affect the popularity of specific GPTs, looking ...
The development of Large Language Models (LLMs) has been a transformative force in human life, reshaping interactions, enhancing communication, and influencing decision-making processes. A notable manifestation of this impact is ChatGPT, which, since its inception, has garnered widespread popularity, evidenced by its millions of active users and its profound integration into various sectors such as education, business, and entertainment [17]. This surge in popularity not only highlights the effectiveness of ChatGPT in understanding and generating human-like text but also underscores the growing public interest in AI-driven solutions.
ExACT: An End-to-End Autonomous Excavator System Using Action Chunking With Transformers
Chen, Liangliang, Jin, Shiyu, Wang, Haoyu, Zhang, Liangjun
Excavators are crucial for diverse tasks such as construction and mining, and autonomous excavator systems enhance safety and efficiency, address labor shortages, and improve human working conditions. Unlike existing modularized approaches, this paper introduces ExACT, an end-to-end autonomous excavator system that processes raw LiDAR data, camera data, and joint positions to control excavator valves directly. Utilizing the Action Chunking with Transformers (ACT) architecture, ExACT employs imitation learning to take observations from multi-modal sensors as inputs and generate actionable sequences. In our experiments, we build a simulator based on captured real-world data to model the relations between excavator valve states and joint velocities. With only a few human-operated demonstration trajectories, ExACT demonstrates the capability of completing different excavation tasks, including reaching, digging, and dumping, through imitation learning in validations with the simulator. To the best of our knowledge, ExACT represents the first step towards building an end-to-end autonomous excavator system via imitation learning with a minimal set of human demonstrations. A video of this work can be accessed at https://youtu.be/NmzR_Rf-aEk.
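As a rough sketch of the action-chunking idea, the snippet below shows a transformer-decoder policy that maps fused observation tokens to a chunk of future actions via learned queries. The dimensions, token counts, and the omission of the CVAE components used in ACT are all simplifying assumptions; this is not the ExACT system.

```python
# Minimal action-chunking policy sketch: observation tokens in, a chunk of
# future valve commands out (assumed dimensions, illustrative only).
import torch
import torch.nn as nn

class ChunkPolicy(nn.Module):
    def __init__(self, obs_dim=64, act_dim=8, chunk=16, d_model=128):
        super().__init__()
        self.obs_proj = nn.Linear(obs_dim, d_model)
        self.queries = nn.Parameter(torch.randn(chunk, d_model))  # learned action queries
        layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, act_dim)

    def forward(self, obs_tokens):                    # (B, n_tokens, obs_dim)
        memory = self.obs_proj(obs_tokens)
        tgt = self.queries.unsqueeze(0).expand(obs_tokens.size(0), -1, -1)
        return self.head(self.decoder(tgt, memory))   # (B, chunk, act_dim)

obs = torch.randn(2, 10, 64)    # e.g., LiDAR/camera/joint features as 10 tokens
actions = ChunkPolicy()(obs)
print(actions.shape)            # torch.Size([2, 16, 8])
```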
Wall-Climbing Performance of Gecko-inspired Robot with Soft Feet and Digits enhanced by Gravity Compensation
Wang, Bingcheng, Weng, Zhiyuan, Wang, Haoyu, Wang, Shuangjie, Wang, Zhouyi, Dai, Zhendong, Jusufi, Ardian
Gravitational forces can induce deviations in body posture from desired configurations in multi-legged arboreal robot locomotion with low leg stiffness, affecting the contact angle between the swing leg's end-effector and the climbing surface during the gait cycle. The relationship between desired and actual foot positions is investigated here in a leg-stiffness-enhanced model under external forces, focusing on the challenge of unreliable end-effector attachment to climbing surfaces in such robots. Inspired by the difference in ceiling-attachment postures of dead and living geckos, feedforward compensation by the stance-phase legs is the key to solving this problem. A feedforward gravity compensation (FGC) strategy, complemented by leg coordination, is proposed to correct gravity-influenced body posture and improve adhesion stability by reducing body inclination. The efficacy of this strategy is validated using a quadrupedal climbing robot, EF-I, as the experimental platform. Experimental validation on an inverted surface (ceiling walking) highlights the benefits of the FGC strategy, demonstrating its role in enhancing stability and ensuring reliable end-effector attachment without external assistance. In the experiments, robots without FGC completed only 3 out of 10 trials, while robots with FGC achieved a 100% success rate on the same trials. The speed was also substantially greater with FGC, reaching 9.2 mm/s in the trot gait. This underscores the potential of the proposed FGC strategy in overcoming the challenges associated with inconsistent end-effector attachment in robots with low leg stiffness, thereby facilitating stable locomotion even in an inverted body attitude.
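The controller is not detailed in the abstract; the snippet below is a minimal static sketch of the feedforward gravity-compensation idea, mapping the gravity force carried by a stance leg through the leg Jacobian to feedforward joint torques (tau = J^T F). The Jacobian values and mass share are illustrative placeholders, not EF-I parameters.

```python
# Static sketch of feedforward gravity compensation for one stance leg
# (assumed point-contact model; numbers are placeholders).
import numpy as np

def gravity_compensation_torque(jacobian: np.ndarray, mass: float,
                                g_vec=np.array([0.0, 0.0, -9.81])) -> np.ndarray:
    """Joint torques cancelling the gravity force mapped through the leg Jacobian.

    jacobian: 3 x n_joints translational Jacobian of the stance foot
    mass:     body mass share carried by this leg (kg)
    """
    f_gravity = mass * g_vec                 # gravity load on this leg (N)
    return jacobian.T @ (-f_gravity)         # feedforward joint torques (N*m)

J = np.array([[0.10, 0.05, 0.00],            # toy 3-joint leg Jacobian (m)
              [0.00, 0.08, 0.04],
              [0.02, 0.00, 0.06]])
print(gravity_compensation_torque(J, mass=0.5))
```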
Glitch Tokens in Large Language Models: Categorization Taxonomy and Effective Detection
Li, Yuxi, Liu, Yi, Deng, Gelei, Zhang, Ying, Song, Wenjia, Shi, Ling, Wang, Kailong, Li, Yuekang, Liu, Yang, Wang, Haoyu
With the expanding application of Large Language Models (LLMs) in various domains, it becomes imperative to comprehensively investigate their unforeseen behaviors and consequent outcomes. In this study, we introduce and systematically explore the phenomenon of "glitch tokens": anomalous tokens produced by established tokenizers that can potentially compromise a model's quality of response. Specifically, we experiment on seven popular LLMs utilizing three distinct tokenizers and involving a total of 182,517 tokens. We present categorizations of the identified glitch tokens and the symptoms exhibited by LLMs when interacting with them. Based on our observation that glitch tokens tend to cluster in the embedding space, we propose GlitchHunter, a novel iterative clustering-based technique for efficient glitch token detection. The evaluation shows that our approach notably outperforms three baseline methods on eight open-source LLMs. To the best of our knowledge, we present the first comprehensive study on glitch tokens, and our detection technique provides valuable insights into mitigating tokenization-related errors in LLMs.
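GlitchHunter's actual algorithm is not reproduced here; the toy sketch below only conveys the clustering intuition from the abstract: cluster the token-embedding matrix, probe a few representatives per cluster with a caller-supplied repeat check, and flag clusters whose representatives fail as glitch-token candidates. The cluster count, probe budget, and `can_repeat` check are assumptions.

```python
# Toy sketch of embedding-space clustering for glitch-token candidate mining.
import numpy as np
from sklearn.cluster import KMeans

def glitch_candidates(embeddings: np.ndarray, tokens: list[str],
                      can_repeat, n_clusters: int = 50, probes: int = 3):
    """Return tokens from clusters whose probed members all fail the repeat check.

    embeddings: (vocab, dim) token embedding matrix
    can_repeat: callable(token) -> bool, True if the LLM can echo the token back
    """
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=0).fit_predict(embeddings)
    candidates = []
    for c in range(n_clusters):
        idx = np.where(labels == c)[0]
        probe_idx = idx[:probes]                       # probe a few representatives
        if not any(can_repeat(tokens[i]) for i in probe_idx):
            candidates.extend(tokens[i] for i in idx)  # whole cluster is suspect
    return candidates
```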