Qian, Chen
Abdominal Undulation with Compliant Mechanism Improves Flight Performance of Biomimetic Robotic Butterfly
Lian, Xuyi, Luo, Mingyu, Lin, Te, Qian, Chen, Li, Tiefeng
Abstract-- This paper presents the design, modeling, and experimental validation of a biomimetic robotic butterfly (BRB) that integrates a compliant mechanism to achieve coupled wing-abdomen motion. Drawing inspiration from the natural flight dynamics of butterflies, a theoretical model is developed to investigate the impact of abdominal undulation on flight performance. To validate the model, motion capture experiments are conducted on three configurations: a BRB without an abdomen, with a fixed abdomen, and with an undulating abdomen. Flapping-wing aerial vehicles (FWAVs) have demonstrated advantages in maneuverability, energy efficiency, and adaptability, making them ideal for a range of potential applications, and significant progress has been made over the past decades in designing bio-inspired FWAVs. Because the butterfly wings attached to the thorax have a relatively high moment of inertia, aerodynamic and inertial forces cause the thorax to pitch in sync with the wingbeats; in forward flight, the abdomen swings in response to these thoracic oscillations [13], [14], [15].
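The abstract states the wing-thorax-abdomen coupling only qualitatively, and the paper's theoretical model is not reproduced here. As a toy illustration of why wingbeat-induced thoracic pitch makes the abdomen swing, the sketch below drives a damped abdomen "pendulum" with a sinusoidal thoracic pitch angle; the model form and every parameter are assumptions for illustration only.

```python
import math

def abdomen_response(f_wing=10.0, amp=0.2, zeta=0.1, f_n=6.0,
                     t_end=1.0, dt=1e-3):
    """Toy model (not the paper's): abdomen as a damped pendulum hinged
    to the thorax, forced by the wingbeat-induced thoracic pitch."""
    theta, omega = 0.0, 0.0          # abdomen angle and angular velocity
    wn = 2 * math.pi * f_n           # assumed abdomen natural frequency
    history = []
    t = 0.0
    while t < t_end:
        drive = amp * math.sin(2 * math.pi * f_wing * t)  # thoracic pitch
        alpha = -2 * zeta * wn * omega - wn**2 * (theta - drive)
        omega += alpha * dt          # semi-implicit Euler integration
        theta += omega * dt
        history.append((t, theta))
        t += dt
    return history

print(abdomen_response()[-1])        # abdomen angle after 1 s of flapping
```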
Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering
Si, Shuzheng, Zhao, Haozhe, Chen, Gang, Gao, Cheng, Bai, Yuzhuo, Wang, Zhitong, An, Kaikai, Luo, Kangyang, Qian, Chen, Qi, Fanchao, Chang, Baobao, Sun, Maosong
Training LLMs on data containing unfamiliar knowledge during the instruction-tuning stage can encourage hallucinations. To address this challenge, we introduce NOVA, a novel framework designed to identify high-quality data that aligns well with the LLM's learned knowledge to reduce hallucinations. NOVA includes Internal Consistency Probing (ICP) and Semantic Equivalence Identification (SEI) to measure how familiar the LLM is with instruction data. Specifically, ICP evaluates the LLM's understanding of a given instruction by calculating a tailored consistency score among multiple self-generated responses. SEI further assesses the LLM's familiarity with the target response by comparing it to the generated responses, using the proposed semantic clustering and a well-designed voting strategy. Finally, to ensure the quality of selected samples, we introduce an expert-aligned reward model that considers characteristics beyond familiarity alone. By accounting for data quality and avoiding unfamiliar data, we can use the selected data to effectively align LLMs to follow instructions and hallucinate less.
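The abstract does not give NOVA's scoring functions explicitly; the sketch below shows one plausible shape of ICP and SEI under stated assumptions, with a Jaccard word-overlap stand-in for the real semantic-equivalence criterion and arbitrarily chosen thresholds.

```python
from itertools import combinations

def semantic_sim(a: str, b: str) -> float:
    """Placeholder similarity in [0, 1]; NOVA's actual criterion
    (semantic clustering over model outputs) is not reproduced here."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(1, len(ta | tb))   # Jaccard stand-in

def icp_score(responses: list[str]) -> float:
    """ICP (sketch): mean pairwise agreement among the model's own
    sampled responses to the same instruction."""
    pairs = list(combinations(responses, 2))
    return sum(semantic_sim(a, b) for a, b in pairs) / max(1, len(pairs))

def sei_score(target: str, responses: list[str], tau: float = 0.5) -> float:
    """SEI (sketch): fraction of sampled responses that 'vote' for the
    target response as semantically equivalent."""
    votes = sum(1 for r in responses if semantic_sim(target, r) >= tau)
    return votes / max(1, len(responses))

def keep_sample(responses, target, icp_thresh=0.4, sei_thresh=0.5):
    """Keep a sample only if the model is familiar with both the
    instruction and the target response (thresholds are illustrative)."""
    return icp_score(responses) >= icp_thresh and \
           sei_score(target, responses) >= sei_thresh
```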
Token Cleaning: Fine-Grained Data Selection for LLM Supervised Fine-Tuning
Pang, Jinlong, Di, Na, Zhu, Zhaowei, Wei, Jiaheng, Cheng, Hao, Qian, Chen, Liu, Yang
Recent studies show that in supervised fine-tuning (SFT) of large language models (LLMs), data quality matters more than quantity. While most data cleaning methods concentrate on filtering entire samples, the quality of individual tokens within a sample can vary significantly. After pre-training, even in high-quality samples, patterns or phrases that are not task-related can be redundant or uninformative. Continuing to fine-tune on these patterns may offer limited benefit and even degrade downstream task performance. In this paper, we investigate token quality from a noisy-label perspective and propose a generic token cleaning pipeline for SFT tasks. Our method filters out uninformative tokens while preserving those carrying key task-specific information. Specifically, we first evaluate token quality by examining the influence of model updates on each token, then apply a threshold-based separation. The token influence can be measured in a single pass with a fixed reference model or iteratively with self-evolving reference models. The benefits and limitations of both methods are analyzed theoretically by error upper bounds. Extensive experiments show that our framework consistently improves performance across multiple downstream tasks.
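One concrete way to instantiate "the influence of model updates on each token" is the per-token log-likelihood gain of a reference model over the base model, followed by the threshold-based separation described above. The estimator and threshold below are illustrative assumptions, not necessarily the paper's exact formulation.

```python
def token_influence(logp_ref: list[float], logp_base: list[float]) -> list[float]:
    """Influence sketch: how much more likely the reference model finds
    each token than the base model. Positive => the token carries
    task-specific signal worth fine-tuning on."""
    return [r - b for r, b in zip(logp_ref, logp_base)]

def clean_labels(influence: list[float], threshold: float = 0.0) -> list[int]:
    """Threshold-based separation: mask (0) uninformative tokens so the
    SFT loss is computed only on kept (1) tokens."""
    return [1 if s > threshold else 0 for s in influence]

# Example: only the token whose likelihood improved survives cleaning.
mask = clean_labels(token_influence([-1.2, -0.1, -3.0], [-1.0, -2.5, -2.9]))
print(mask)  # [0, 1, 0]
```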
JammingSnake: A follow-the-leader continuum robot with variable stiffness based on fiber jamming
Qian, Chen, Liu, Tangyou, Wu, Liao
Follow-the-leader (FTL) motion is essential for continuum robots operating in fragile and confined environments. It allows the robot to exert minimal force on its surroundings, reducing the risk of damage. This paper presents a novel design of a snake-like robot capable of achieving FTL motion by integrating fiber jamming modules (FJMs). The proposed robot can dynamically adjust its stiffness during propagation and interaction with the environment. An algorithm is developed to independently control the tendon and FJM insertion movements, allowing the robot to maintain its shape while minimizing the forces exerted on surrounding structures. To validate the proposed design, comparative tests were conducted between a traditional tendon-driven robot and the novel design under different configurations. The results demonstrate that our design relies significantly less on contact with the surroundings to maintain its shape. This highlights its potential for safer and more effective operations in delicate environments, such as minimally invasive surgery (MIS) or industrial in-situ inspection.
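As background for what the controller must achieve, here is the kinematic ideal of FTL motion: after each insertion step the body occupies exactly the points the tip has already traced. How JammingSnake realizes this physically, steering the soft tip with tendons while fiber jamming locks the proximal shape, is not modeled in this sketch.

```python
def follow_the_leader(path: list[tuple[float, float]], n_segments: int):
    """Kinematic FTL sketch: at every insertion step the n_segments-long
    body conforms to the most recently traced stretch of the path, so
    each segment follows exactly where the tip has been."""
    shapes = []
    for step in range(n_segments, len(path) + 1):
        shapes.append(path[step - n_segments:step])
    return shapes

# A planar path with a bend; the body never cuts the corner.
path = [(0, 0), (1, 0), (2, 0), (2.5, 0.5), (3, 1), (3, 2)]
for body in follow_the_leader(path, n_segments=3):
    print(body)
```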
Adaptive Parameter-Efficient Federated Fine-Tuning on Heterogeneous Devices
Liu, Jun, Liao, Yunming, Xu, Hongli, Xu, Yang, Liu, Jianchun, Qian, Chen
Federated fine-tuning (FedFT) has been proposed to fine-tune pre-trained language models in a distributed manner. However, there are two critical challenges for efficient FedFT in practical applications, i.e., resource constraints and system heterogeneity. Existing works rely on parameter-efficient fine-tuning methods, e.g., low-rank adaptation (LoRA), but with major limitations. Herein, based on the inherent characteristics of FedFT, we observe that LoRA layers with higher ranks added close to the output help to save resource consumption while achieving comparable fine-tuning performance. We then propose a novel LoRA-based FedFT framework, termed LEGEND, which addresses the difficulty of determining the number of LoRA layers (called LoRA depth) and the rank of each LoRA layer (called rank distribution). We analyze the coupled relationship between LoRA depth and rank distribution, and design an efficient LoRA configuration algorithm for heterogeneous devices, thereby promoting fine-tuning efficiency. Extensive experiments are conducted on a physical platform with 80 commercial devices. The results show that LEGEND achieves a speedup of 1.5-2.8$\times$ and saves communication costs by about 42.3% when reaching the target accuracy, compared to advanced solutions.
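To make the observation concrete, here is a hypothetical configuration helper that adapts only the `depth` layers closest to the output and splits a total rank budget in proportion to layer position. The allocation rule is illustrative, not LEGEND's actual algorithm.

```python
def legend_style_config(n_layers: int, depth: int, rank_budget: int) -> dict[int, int]:
    """Sketch of a LoRA configuration in the spirit of the observation
    above: adapt only the `depth` layers nearest the output, assigning
    higher ranks to later layers (proportional split, illustrative)."""
    adapted = range(n_layers - depth, n_layers)
    weights = [i - (n_layers - depth) + 1 for i in adapted]   # 1..depth
    total = sum(weights)
    return {i: max(1, round(rank_budget * w / total))
            for i, w in zip(adapted, weights)}

# 24-layer model, adapt the last 6 layers, total rank budget of 48.
print(legend_style_config(24, depth=6, rank_budget=48))
# {18: 2, 19: 5, 20: 7, 21: 9, 22: 11, 23: 14}
```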
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Dang, Yunkai, Huang, Kaichen, Huo, Jiahao, Yan, Yibo, Huang, Sirui, Liu, Dongrui, Gao, Mengxi, Zhang, Jie, Qian, Chen, Wang, Kun, Liu, Yong, Shao, Jing, Xiong, Hui, Hu, Xuming
The rapid development of Artificial Intelligence (AI) has revolutionized numerous fields, with large language models (LLMs) and computer vision (CV) systems driving advancements in natural language understanding and visual processing, respectively. The convergence of these technologies has catalyzed the rise of multimodal AI, enabling richer, cross-modal understanding that spans text, vision, audio, and video modalities. Multimodal large language models (MLLMs), in particular, have emerged as a powerful framework, demonstrating impressive capabilities in tasks like image-text generation, visual question answering, and cross-modal retrieval. Despite these advancements, the complexity and scale of MLLMs introduce significant challenges in interpretability and explainability, essential for establishing transparency, trustworthiness, and reliability in high-stakes applications. This paper provides a comprehensive survey on the interpretability and explainability of MLLMs, proposing a novel framework that categorizes existing research across three perspectives: (I) Data, (II) Model, (III) Training \& Inference. We systematically analyze interpretability from token-level to embedding-level representations, assess approaches related to both architecture analysis and design, and explore training and inference strategies that enhance transparency. By comparing various methodologies, we identify their strengths and limitations and propose future research directions to address unresolved challenges in multimodal explainability. This survey offers a foundational resource for advancing interpretability and transparency in MLLMs, guiding researchers and practitioners toward developing more accountable and robust multimodal AI systems.
Quantum Hamiltonian Descent for Graph Partition
Cheng, Jinglei, Zhou, Ruilin, Gan, Yuhang, Qian, Chen, Liu, Junyu
We introduce Quantum Hamiltonian Descent (QHD) as a novel approach to solving the graph partition problem. By reformulating graph partition as a Quadratic Unconstrained Binary Optimization (QUBO) problem, we leverage QHD's quantum-inspired dynamics to identify optimal community structures. Our method implements a multi-level refinement strategy that alternates between QUBO formulation and QHD optimization to iteratively improve partition quality. Experimental results demonstrate that our QHD-based approach achieves superior modularity scores (up to 5.49\% improvement) with reduced computational overhead compared to traditional optimization methods. This work establishes QHD as an effective quantum-inspired framework for tackling graph partition challenges in large-scale networks.
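To make the QUBO reformulation concrete: for a 2-way partition with spins $s_i = 2x_i - 1$ and modularity matrix $B_{ij} = A_{ij} - k_i k_j / 2m$, modularity is proportional to $s^\top B s$, and since every row of $B$ sums to zero, maximizing it over binary $x$ reduces to minimizing $x^\top (-B) x$. The sketch below builds this QUBO; a brute-force search stands in for the QHD optimizer, which is not shown.

```python
import numpy as np

def modularity_qubo(adj: np.ndarray) -> np.ndarray:
    """Encode 2-way modularity maximization as a QUBO (sketch). With
    s = 2x - 1, the linear terms vanish because B's rows sum to zero,
    so minimizing x^T (-B) x over binary x maximizes modularity."""
    k = adj.sum(axis=1)
    m = adj.sum() / 2.0
    B = adj - np.outer(k, k) / (2.0 * m)
    return -B

def qubo_energy(Q: np.ndarray, bits: list[int]) -> float:
    x = np.array(bits, dtype=float)
    return float(x @ Q @ x)

# Two triangles joined by one edge: the optimum splits them apart.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
Q = modularity_qubo(A)
best = min(range(64),
           key=lambda b: qubo_energy(Q, [(b >> i) & 1 for i in range(6)]))
print([(best >> i) & 1 for i in range(6)])  # [1, 1, 1, 0, 0, 0]
```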
GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration
Li, Xin, Chu, Qizhi, Chen, Yubin, Liu, Yang, Liu, Yaoqi, Yu, Zekai, Chen, Weize, Qian, Chen, Shi, Chuan, Yang, Cheng
Graphs are widely used for modeling relational data in real-world scenarios, such as social networks and urban computing. Existing LLM-based graph analysis approaches either integrate graph neural networks (GNNs) for specific machine learning tasks, limiting their transferability, or rely solely on LLMs' internal reasoning ability, resulting in suboptimal performance. To address these limitations, we take advantage of recent advances in LLM-based agents, which have shown the capability to utilize external knowledge or tools for problem solving. By simulating human problem-solving strategies such as analogy and collaboration, we propose a multi-agent system based on LLMs, named GraphTeam, for graph analysis. GraphTeam consists of five LLM-based agents from three modules, and agents with different specialties can collaborate to address complex problems. Specifically, (1) input-output normalization module: the question agent extracts and refines four key arguments from the original question, facilitating problem understanding, and the answer agent organizes the results to meet the output requirement; (2) external knowledge retrieval module: we first build a knowledge base consisting of relevant documentation and experience information, and then the search agent retrieves the most relevant entries for each question; (3) problem-solving module: given the information retrieved by the search agent, the coding agent uses established algorithms via programming to generate solutions, and if the coding agent fails, the reasoning agent directly computes the results without programming. Extensive experiments on six graph analysis benchmarks demonstrate that GraphTeam achieves state-of-the-art performance with an average 25.85% improvement over the best baseline in terms of accuracy. The code and data are available at https://github.com/BUPT-GAMMA/GraphTeam.
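The three-module pipeline reads naturally as a short orchestration loop. In the sketch below each agent is reduced to a stub function so that the control flow, including the fallback from the coding agent to the reasoning agent, is runnable; in the real system each agent is an LLM call and the search step queries an actual knowledge base.

```python
def question_agent(question: str) -> dict:
    return {"task": question, "args": {}}        # extract key arguments

def search_agent(parsed: dict, kb: list[str]) -> list[str]:
    return kb[:1]                                # retrieve relevant entries

def coding_agent(parsed: dict, docs: list[str]) -> str | None:
    # Real system: write and execute a program; may return None on failure.
    return "result-from-generated-program"

def reasoning_agent(parsed: dict) -> str:
    return "result-from-direct-reasoning"        # fallback, no programming

def answer_agent(result: str) -> str:
    return f"Answer: {result}"                   # meet output requirements

def graphteam(question: str, kb: list[str]) -> str:
    parsed = question_agent(question)            # (1) normalization
    docs = search_agent(parsed, kb)              # (2) knowledge retrieval
    result = coding_agent(parsed, docs) or reasoning_agent(parsed)  # (3)
    return answer_agent(result)

print(graphteam("How many connected components does G have?",
                ["networkx documentation"]))
```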
DEAN: Deactivating the Coupled Neurons to Mitigate Fairness-Privacy Conflicts in Large Language Models
Qian, Chen, Liu, Dongrui, Zhang, Jie, Liu, Yong, Shao, Jing
Ensuring awareness of fairness and privacy in Large Language Models (LLMs) is critical. Interestingly, we discover a counter-intuitive trade-off phenomenon: enhancing an LLM's privacy awareness through Supervised Fine-Tuning (SFT) on thousands of samples significantly decreases its fairness awareness. To address this issue, inspired by information theory, we introduce a training-free method to DEActivate the fairness and privacy coupled Neurons (DEAN), which theoretically and empirically decreases the mutual information between fairness and privacy awareness. Extensive experimental results demonstrate that DEAN eliminates the trade-off phenomenon and significantly improves LLMs' fairness and privacy awareness simultaneously, e.g., improving Qwen-2-7B-Instruct's fairness awareness by 12.2% and privacy awareness by 14.0%. More crucially, DEAN remains robust and effective with limited annotated data or even when only malicious fine-tuning data is available, whereas SFT methods may fail to perform properly in such scenarios. We hope this study provides valuable insights into concurrently addressing fairness and privacy concerns in LLMs and can be integrated into comprehensive frameworks to develop more ethical and responsible AI systems. Our code is available at https://github.com/ChnQ/DEAN. In recent years, as LLMs have increasingly permeated sensitive areas such as healthcare, finance, and education (Li et al., 2023b; Yuan et al., 2023; Al-Smadi, 2023), concerns regarding their fairness and privacy implications have become critically important (Liu et al., 2023; Sun et al., 2024a). For instance, when queried for sensitive information such as a social security number, we would expect the LLM to refuse to provide it.
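A minimal sketch of the deactivation step, assuming per-neuron importance scores for fairness and for privacy are already computed: neurons ranking in the top fraction for both are treated as "coupled" and zeroed out, with no training involved. How DEAN derives these scores and its information-theoretic justification are not reproduced here.

```python
import numpy as np

def coupled_neuron_mask(fair_scores: np.ndarray,
                        priv_scores: np.ndarray,
                        top_pct: float = 0.01) -> np.ndarray:
    """Sketch of the DEAN idea: deactivate neurons that rank highly for
    BOTH fairness- and privacy-related importance. The scores are taken
    as inputs; DEAN's actual scoring criterion is not shown."""
    k = max(1, int(len(fair_scores) * top_pct))
    fair_top = set(np.argsort(fair_scores)[-k:])
    priv_top = set(np.argsort(priv_scores)[-k:])
    coupled = fair_top & priv_top                # neurons tied to both
    mask = np.ones_like(fair_scores)
    mask[list(coupled)] = 0.0                    # training-free: zero them
    return mask

rng = np.random.default_rng(0)
mask = coupled_neuron_mask(rng.random(1000), rng.random(1000), top_pct=0.05)
print(int((mask == 0).sum()), "neurons deactivated")
```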
Can Large Language Models Analyze Graphs like Professionals? A Benchmark, Datasets and Models
Li, Xin, Chen, Weize, Chu, Qizhi, Li, Haopeng, Sun, Zhaojun, Li, Ran, Qian, Chen, Wei, Yiwei, Liu, Zhiyuan, Shi, Chuan, Sun, Maosong, Yang, Cheng
The need to analyze graphs is ubiquitous across various fields, from social networks to biological research and recommendation systems. Therefore, enabling large language models (LLMs) to process graphs is an important step toward more advanced general intelligence. However, current LLM benchmarks on graph analysis require models to reason directly over prompts describing the graph topology, and are thus limited to small graphs with only a few dozen nodes. In contrast, human experts typically write programs based on popular libraries for task solving, and can thus handle graphs of different scales. To this end, a question naturally arises: can LLMs analyze graphs like professionals? In this paper, we introduce ProGraph, a manually crafted benchmark containing 3 categories of graph tasks. The benchmark expects solutions based on programming instead of direct reasoning over raw inputs. Our findings reveal that the performance of current LLMs is unsatisfactory, with the best model achieving only 36% accuracy. To bridge this gap, we propose the LLM4Graph datasets, which include crawled documents and auto-generated code based on 6 widely used graph libraries. By augmenting closed-source LLMs with document retrieval and fine-tuning open-source ones on the code, we show 11-32% absolute improvements in their accuracy. Our results underscore that the capabilities of LLMs in handling structured data are still under-explored, and show the effectiveness of LLM4Graph in enhancing LLMs' proficiency in graph analysis. The benchmark, datasets and enhanced open-source models are available at https://github.com/BUPT-GAMMA/ProGraph.
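For contrast with direct reasoning over a textual edge list, here is the kind of program-based solution ProGraph expects; networkx is one widely used graph library such solutions can build on, and the specific task here is illustrative.

```python
import networkx as nx

# Instead of reasoning over an edge list in the prompt, the model writes
# code against a library; this scales to graphs far larger than any
# context window could hold.
edges = [(0, 1), (1, 2), (3, 4)]
G = nx.Graph(edges)

print(nx.number_connected_components(G))   # 2
print(nx.shortest_path(G, 0, 2))           # [0, 1, 2]
```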