Chen, Liang
GT-SNT: A Linear-Time Transformer for Large-Scale Graphs via Spiking Node Tokenization
Zhang, Huizhe, Li, Jintang, Zhu, Yuchang, Zhong, Huazhen, Chen, Liang
Graph Transformers (GTs), which integrate message passing and self-attention mechanisms simultaneously, have achieved promising empirical results in graph prediction tasks. However, the design of scalable and topology-aware node tok-enization has lagged behind other modalities. This gap becomes critical as the quadratic complexity of full attention renders them impractical on large-scale graphs. Recently, Spiking Neural Networks (SNNs), as brain-inspired models, provided an energy-saving scheme to convert input intensity into discrete spike-based representations through event-driven spiking neurons. Inspired by these characteristics, we propose a linear-time Graph Transformer with Spiking Node Tokenization (GT -SNT) for node classification. By integrating multi-step feature propagation with SNNs, spiking node tokenization generates compact, locality-aware spike count embeddings as node tokens to avoid predefined code-books and their utilization issues. The codebook guided self-attention leverages these tokens to perform node-to-token attention for linear-time global context aggregation. In experiments, we compare GT -SNT with other state-of-the-art baselines on node classification datasets ranging from small to large. Experimental results show that GT -SNT achieves comparable performances on most datasets and reaches up to 130 faster inference speed compared to other GTs.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Asia > China > Fujian Province > Xiamen (0.04)
Falcon: A Comprehensive Chinese Text-to-SQL Benchmark for Enterprise-Grade Evaluation
Luo, Wenzhen, Guan, Wei, Yao, Yifan, Pan, Yimin, Wang, Feng, Yu, Zhipeng, Wen, Zhe, Chen, Liang, Zhuang, Yihong
We introduce Falcon, a cross-domain Chinese text-to-SQL benchmark grounded in an enterprise-compatible dialect (MaxCompute/Hive). It contains 600 Chinese questions over 28 databases; 77% require multi-table reasoning and over half touch more than four tables. Each example is annotated along SQL-computation features and Chinese semantics. For evaluation, we release a robust execution comparator and an automated evaluation pipeline, under which all current state-of-the-art large-scale models (including Deepseek) achieve accuracies of at most 50%. Major errors originate from two sources: (1) schema linking in large enterprise landscapes - hundreds of tables, denormalized fields, ambiguous column names, implicit foreign-key relations and domain-specific synonyms that make correct join/column selection difficult; and (2) mapping concise, colloquial Chinese into the exact operators and predicates required for analytics - e.g., choosing the correct aggregation and group-by keys, expressing time windows and granularities, applying unit conversions, handling NULLs and data-quality rules, and formulating nested or windowed subqueries. Falcon therefore targets Chinese-specific semantics and enterprise dialects (abbreviations, business jargon, fuzzy entity references) and provides a reproducible middle ground before full production deployment by using realistic enterprise schemas, query templates, an execution comparator, and an automated evaluation pipeline for end-to-end validation.
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
An Ensembled Penalized Federated Learning Framework for Falling People Detection
Rao, Sizhe, Zhang, Runqiu, Saha, Sajal, Chen, Liang
Abstract--Falls among elderly and disabled individuals remain a leading cause of injury and mortality worldwide, necessitating robust, accurate, and privacy-aware fall detection systems. Traditional fall detection approaches, whether centralized or point-wise, often struggle with key challenges such as limited gener-alizability, data privacy concerns, and variability in individual movement behaviors. T o address these limitations, we propose EPFL--an Ensembled Penalized Federated Learning framework that integrates continual learning, personalized modeling, and a novel Specialized Weighted Aggregation (SW A) strategy. EPFL leverages wearable sensor data to capture sequential motion patterns while preserving user privacy through homomorphic encryption and federated training. Unlike existing federated models, EPFL incorporates both penalized local training and ensemble-based inference to improve inter-client consistency and adaptability to behavioral differences. Extensive experiments on a benchmark fall detection dataset demonstrate the effectiveness of our approach, achieving a Recall of 88.31% and an F1-score of 89.94%, significantly outperforming both centralized and baseline models. This work presents a scalable, secure, and accurate solution for real-world fall detection in healthcare settings, with strong potential for continuous improvement via its adaptive feedback mechanism. Due to changes in traditional family structures, the number of older individuals living alone has significantly increased over the past few decades [1]. According to the report from World Health Organization (WHO) [2], falls are the second leading cause of unintentional injury deaths worldwide, with particularly high morbidity among individuals aged 60 and older. Resulting in severe injuries, including fractures, head trauma, and even death, falls can significantly decline the quality of life of older adults [3]. Considering this, the need for effective monitoring and fall detection systems has been raised by this change aiming to ensure the safety of seniors. Falls can have long-term impacts on individuals, including significant disability-adjusted life years (DAL Ys) and high financial costs. According to the report [2], falls cause over 38 million DAL Ys lost annually worldwide. In Canada, a 20% reduction in falls could save approximately US$120 million each year. Considering the severe injuries, potential fatalities and other additional costs resulting from sudden falls [4], fall detection is a critical research area, especially for the elderly and individuals with disabilities.
- North America > Canada > British Columbia > Regional District of Fraser–Fort George > Prince George (0.04)
- Europe > Spain > Andalusia > Málaga Province > Málaga (0.04)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
Beyond Two-Stage Training: Cooperative SFT and RL for LLM Reasoning
Chen, Liang, Han, Xueting, Shen, Li, Bai, Jing, Wong, Kam-Fai
Reinforcement learning (RL) has proven effective in incentivizing the reasoning abilities of large language models (LLMs), but suffers from severe efficiency challenges due to its trial-and-error nature. While the common practice employs supervised fine-tuning (SFT) as a warm-up stage for RL, this decoupled two-stage approach suffers from catastrophic forgetting: second-stage RL gradually loses SFT-acquired behaviors and inefficiently explores new patterns. This study introduces a novel method for learning reasoning models that employs bilevel optimization to facilitate better cooperation between these training paradigms. By conditioning the SFT objective on the optimal RL policy, our approach enables SFT to meta-learn how to guide RL's optimization process. During training, the lower level performs RL updates while simultaneously receiving SFT supervision, and the upper level explicitly maximizes the cooperative gain-the performance advantage of joint SFT-RL training over RL alone. Empirical evaluations on five reasoning benchmarks demonstrate that our method consistently outperforms baselines and achieves a better balance between effectiveness and efficiency.
- North America > United States (0.04)
- Asia > Middle East > Saudi Arabia > Asir Province > Abha (0.04)
- Asia > China > Hong Kong (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Mitigating Overthinking through Reasoning Shaping
Song, Feifan, Wei, Shaohang, Gao, Bofei, Wang, Yejie, Luo, Wen, Li, Wei, Yao, Linli, Xiong, Weimin, Chen, Liang, Liu, Tianyu, Wang, Houfeng
Large reasoning models (LRMs) boosted by Reinforcement Learning from Verifier Reward (RLVR) have shown great power in problem solving, yet they often cause overthinking: excessive, meandering reasoning that inflates computational cost. Prior designs of penalization in RLVR manage to reduce token consumption while often harming model performance, which arises from the oversimplicity of token-level supervision. In this paper, we argue that the granularity of supervision plays a crucial role in balancing efficiency and accuracy, and propose Group Relative Segment Penalization (GRSP), a step-level method to regularize reasoning. Since preliminary analyses show that reasoning segments are strongly correlated with token consumption and model performance, we design a length-aware weighting mechanism across segment clusters. Extensive experiments demonstrate that GRSP achieves superior token efficiency without heavily compromising accuracy, especially the advantages with harder problems. Moreover, GRSP stabilizes RL training and scales effectively across model sizes.
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.46)
SGNNBench: A Holistic Evaluation of Spiking Graph Neural Network on Large-scale Graph
Zhang, Huizhe, Li, Jintang, Zhu, Yuchang, Chen, Liang, Kuang, Li
Graph Neural Networks (GNNs) are exemplary deep models designed for graph data. Message passing mechanism enables GNNs to effectively capture graph topology and push the performance boundaries across various graph tasks. However, the trend of developing such complex machinery for graph representation learning has become unsustainable on large-scale graphs. The computational and time overhead make it imperative to develop more energy-efficient GNNs to cope with the explosive growth of real-world graphs. Spiking Graph Neural Networks (SGNNs), which integrate biologically plausible learning via unique spike-based neurons, have emerged as a promising energy-efficient alternative. Different layers communicate with sparse and binary spikes, which facilitates computation and storage of intermediate graph representations. Despite the proliferation of SGNNs proposed in recent years, there is no systematic benchmark to explore the basic design principles of these brain-inspired networks on the graph data. To bridge this gap, we present SGNNBench to quantify progress in the field of SGNNs. Specifically, SGNNBench conducts an in-depth investigation of SGNNs from multiple perspectives, including effectiveness, energy efficiency, and architectural design. We comprehensively evaluate 9 state-of-the-art SGNNs across 18 datasets. Regarding efficiency, we empirically compare these baselines w.r.t model size, memory usage, and theoretical energy consumption to reveal the often-overlooked energy bottlenecks of SGNNs. Besides, we elaborately investigate the design space of SGNNs to promote the development of a general SGNN paradigm.
- Asia > China > Guangdong Province > Guangzhou (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Hunan Province > Changsha (0.04)
Sparse Training Scheme for Multimodal LLM
Shi, Kean, Chen, Liang, Zhao, Haozhe, Chang, Baobao
Multimodal Large Language Models (MLLMs) have demonstrated outstanding performance across a variety of domains. However, training MLLMs is often inefficient due to the significantly longer input sequences introduced by multimodal data and the low utilization of inter-layer computations. To address this challenge, we shift the focus to the training process itself and propose a novel training-efficient framework based on sparse representations, termed the Sparse Training Scheme (STS). This scheme consists of two key components: the Visual Token Compressor, which reduces the information load by compressing visual tokens, and the Layer Dynamic Skipper, which mitigates the computational overhead by dynamically skipping unnecessary layers in the language model during both forward and backward passes. Our approach is broadly applicable to diverse MLLM architectures and has been extensively evaluated on multiple benchmarks, demonstrating its effectiveness and efficiency.
Vulnerability-Aware Alignment: Mitigating Uneven Forgetting in Harmful Fine-Tuning
Chen, Liang, Han, Xueting, Shen, Li, Bai, Jing, Wong, Kam-Fai
Harmful fine-tuning (HFT), performed directly on open-source LLMs or through Fine-tuning-as-a-Service, breaks safety alignment and poses significant threats. Existing methods aim to mitigate HFT risks by learning robust representation on alignment data or making harmful data unlearnable, but they treat each data sample equally, leaving data vulnerability patterns understudied. In this work, we reveal that certain subsets of alignment data are consistently more prone to forgetting during HFT across different fine-tuning tasks. Inspired by these findings, we propose Vulnerability-Aware Alignment (VAA), which estimates data vulnerability, partitions data into "vulnerable" and "invulnerable" groups, and encourages balanced learning using a group distributionally robust optimization (Group DRO) framework. Specifically, VAA learns an adversarial sampler that samples examples from the currently underperforming group and then applies group-dependent adversarial perturbations to the data during training, aiming to encourage a balanced learning process across groups. Experiments across four fine-tuning tasks demonstrate that VAA significantly reduces harmful scores while preserving downstream task performance, outperforming state-of-the-art baselines.
- Asia > China > Hong Kong (0.04)
- North America > Canada (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- (3 more...)
What Matters in LLM-generated Data: Diversity and Its Effect on Model Fine-Tuning
Zhu, Yuchang, Zhong, Huazhen, Lin, Qunshu, Wei, Haotong, Sun, Xiaolong, Yu, Zixuan, Liu, Minghao, Zheng, Zibin, Chen, Liang
With the remarkable generative capabilities of large language models (LLMs), using LLM-generated data to train downstream models has emerged as a promising approach to mitigate data scarcity in specific domains and reduce time-consuming annotations. However, recent studies have highlighted a critical issue: iterative training on self-generated data results in model collapse, where model performance degrades over time. Despite extensive research on the implications of LLM-generated data, these works often neglect the importance of data diversity, a key factor in data quality. In this work, we aim to understand the implications of the diversity of LLM-generated data on downstream model performance. Specifically, we explore how varying levels of diversity in LLM-generated data affect downstream model performance. Additionally, we investigate the performance of models trained on data that mixes different proportions of LLM-generated data, which we refer to as synthetic data. Our experimental results show that, with minimal distribution shift, moderately diverse LLM-generated data can enhance model performance in scenarios with insufficient labeled data, whereas highly diverse generated data has a negative impact. We hope our empirical findings will offer valuable guidance for future studies on LLMs as data generators.
SaGIF: Improving Individual Fairness in Graph Neural Networks via Similarity Encoding
Zhu, Yuchang, Li, Jintang, Zhang, Huizhe, Chen, Liang, Zheng, Zibin
Individual fairness (IF) in graph neural networks (GNNs), which emphasizes the need for similar individuals should receive similar outcomes from GNNs, has been a critical issue. Despite its importance, research in this area has been largely unexplored in terms of (1) a clear understanding of what induces individual unfairness in GNNs and (2) a comprehensive consideration of identifying similar individuals. To bridge these gaps, we conduct a preliminary analysis to explore the underlying reason for individual unfairness and observe correlations between IF and similarity consistency, a concept introduced to evaluate the discrepancy in identifying similar individuals based on graph structure versus node features. Inspired by our observations, we introduce two metrics to assess individual similarity from two distinct perspectives: topology fusion and feature fusion. Building upon these metrics, we propose Similarity-aware GNNs for Individual Fairness, named SaGIF. The key insight behind SaGIF is the integration of individual similarities by independently learning similarity representations, leading to an improvement of IF in GNNs. Our experiments on several real-world datasets validate the effectiveness of our proposed metrics and SaGIF. Specifically, SaGIF consistently outperforms state-of-the-art IF methods while maintaining utility performance. Code is available at: https://github.com/ZzoomD/SaGIF.
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Asia > China > Guangdong Province > Zhuhai (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)