Education
Efficient Online Continual Learning in Sensor-Based Human Activity Recognition
Zhang, Yao, Clayton, Souza Leite, Xiao, Yu
Abstract: Machine learning models for sensor-based human activity recognition (HAR) are expected to adapt post-deployment to recognize new activities and different ways of performing existing ones. To address this need, Online Continual Learning (OCL) mechanisms have been proposed, allowing models to update their knowledge incrementally as new data become available while preserving previously acquired information. However, existing OCL approaches for sensor-based HAR are computationally intensive and require extensive labeled samples to represent new changes. Recently, pre-trained model-based (PTM-based) OCL approaches have shown significant improvements in performance and efficiency for computer vision applications. These methods achieve strong generalization capabilities by pre-training complex models on large datasets, followed by fine-tuning on downstream tasks for continual learning. However, applying PTM-based OCL approaches to sensor-based HAR poses significant challenges due to the inherent heterogeneity of HAR datasets and the scarcity of labeled data in post-deployment scenarios. This paper introduces PTRN-HAR, the first successful application of PTM-based OCL to sensor-based HAR. This extractor is then frozen during the streaming stage. Furthermore, it replaces the conventional dense classification layer with a relation module network. Our design not only significantly reduces the resource consumption required for model training while maintaining high performance, but also improves data efficiency by reducing the amount of labeled data needed for effective continual learning. Experiments on three public datasets demonstrate these gains and show that the design outperforms the state of the art. The code can be found here: https://anonymous.4open.science/r/PTRN-HAR-AF60/
The recognition of human activities using wearable sensors such as an Inertial Measurement Unit (IMU) has many practical applications in smart homes [1], healthcare [2], and manufacturing [3].
HAR is typically achieved using machine learning models trained on sensor data from a predefined set of activity classes collected from a selected group of subjects. After deployment, these models often need to evolve to recognize new activities or adapt to the distribution shift caused by changes in users' activity patterns due to aging, disease, or simply a distinct manner of performing the same activity [4].
Weightless Neural Networks for Continuously Trainable Personalized Recommendation Systems
Latif, Rafayel, Behera, Satwik, Al-Ebrahim, Ali
Conventional recommenders, while deeply effective, rely on large distributed systems pre-trained on aggregate user data. Incorporating new data necessitates large training cycles, which makes these systems slow to adapt to real-time user feedback, and they often lack transparency in recommendation rationale. We explore the performance of smaller personal models trained on per-user data using weightless neural networks (WNNs), an alternative to neural backpropagation that enables continuous learning by using neural networks as a state machine rather than a system with pretrained weights. We contrast our approach against a classic weighted system, also on a per-user level, and standard collaborative filtering, achieving competitive levels of accuracy on a subset of the MovieLens dataset. We close with a discussion of how weightless systems can be developed to augment centralized systems to achieve higher subjective accuracy through recommenders more directly tunable by end-users.
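To make the idea of a weightless network concrete, the following is a minimal sketch of a WiSARD-style RAM classifier, one well-known family of WNNs. It is an illustration under assumptions, not the authors' model: the class names, the binary feature encoding, and the tuple size are all hypothetical. Training only writes observed bit patterns into lookup tables, so a new example can be absorbed at any time without gradient updates.

```python
import random


class Discriminator:
    """One class's set of RAM nodes (hypothetical name, WiSARD-style)."""

    def __init__(self, input_bits, tuple_size, seed=0):
        rng = random.Random(seed)
        order = list(range(input_bits))
        rng.shuffle(order)  # fixed pseudo-random assignment of input bits to RAM nodes
        self.groups = [order[i:i + tuple_size] for i in range(0, input_bits, tuple_size)]
        self.rams = [set() for _ in self.groups]  # each RAM stores the addresses it has seen

    def _addresses(self, bits):
        return [tuple(bits[i] for i in group) for group in self.groups]

    def train(self, bits):
        # Learning is a memory write: no weights, no backpropagation.
        for ram, addr in zip(self.rams, self._addresses(bits)):
            ram.add(addr)

    def score(self, bits):
        # Count how many RAM nodes recognise their slice of the input.
        return sum(addr in ram for ram, addr in zip(self.rams, self._addresses(bits)))


class WeightlessClassifier:
    def __init__(self, input_bits, tuple_size=2):
        self.input_bits = input_bits
        self.tuple_size = tuple_size
        self.discriminators = {}

    def train(self, bits, label):
        if label not in self.discriminators:
            self.discriminators[label] = Discriminator(self.input_bits, self.tuple_size)
        self.discriminators[label].train(bits)

    def predict(self, bits):
        return max(self.discriminators, key=lambda lbl: self.discriminators[lbl].score(bits))


# Toy per-user model: 8 binary item features (an assumed encoding), two feedback labels.
model = WeightlessClassifier(input_bits=8)
model.train([1, 1, 1, 1, 0, 0, 0, 0], "like")
model.train([0, 0, 0, 0, 1, 1, 1, 1], "dislike")
prediction = model.predict([1, 1, 1, 0, 0, 0, 0, 0])
print(prediction)
```

Because each call to `train` is an independent write, feedback from a single user interaction can update the model immediately, which is the property the abstract contrasts with large retraining cycles.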
GUARD: Guideline Upholding Test through Adaptive Role-play and Jailbreak Diagnostics for LLMs
Jin, Haibo, Chen, Ruoxi, Zhang, Peiyan, Zhou, Andy, Wang, Haohan
As Large Language Models become increasingly integral to various domains, their potential to generate harmful responses has prompted significant societal and regulatory concerns. In response, governments have issued ethics guidelines to promote the development of trustworthy AI. However, these guidelines are typically high-level demands for developers and testers, leaving a gap in translating them into actionable testing questions to verify LLM compliance. To address this challenge, we introduce GUARD (Guideline Upholding Test through Adaptive Role-play and Jailbreak Diagnostics), a testing method designed to operationalize guidelines into specific guideline-violating questions that assess LLM adherence. To implement this, GUARD uses automated generation of guideline-violating questions based on government-issued guidelines, thereby testing whether responses comply with these guidelines. When responses directly violate guidelines, GUARD reports inconsistencies. Furthermore, for responses that do not directly violate guidelines, GUARD integrates the concept of "jailbreaks" into diagnostics, named GUARD-JD, which creates scenarios that provoke unethical or guideline-violating responses, effectively identifying potential scenarios that could bypass built-in safety mechanisms. Our method culminates in a compliance report, delineating the extent of adherence and highlighting any violations. We have empirically validated the effectiveness of GUARD on seven LLMs, including Vicuna-13B, LongChat-7B, Llama2-7B, Llama-3-8B, GPT-3.5, GPT-4, GPT-4o, and Claude-3.7, by testing compliance under three government-issued guidelines and conducting jailbreak diagnostics. Additionally, GUARD-JD can transfer jailbreak diagnostics to vision-language models, demonstrating its usage in promoting reliable LLM-based applications.
Information Science Principles of Machine Learning: A Causal Chain Meta-Framework Based on Formalized Information Mapping
This paper addresses the current lack of a unified formal framework in machine learning theory, as well as the absence of robust theoretical foundations for interpretability and ethical safety assurance. We first construct a formal information model, employing sets of well-formed formulas (WFFs) to explicitly define the ontological states and carrier mappings for the core components of machine learning. By introducing learnable and processable predicates, as well as learning and processing functions, we analyze the logical inference and constraint rules underlying causal chains in models, thereby establishing the Machine Learning Theory Meta-Framework (MLT-MF). Building upon this framework, we propose universal definitions for model interpretability and ethical safety, and rigorously prove and validate four key theorems: the equivalence between model interpretability and information existence, the constructive formulation of ethical safety assurance and two types of total variation distance (TVD) upper bounds. This work overcomes the limitations of previous fragmented approaches, providing a unified theoretical foundation from an information science perspective to systematically address the critical challenges currently facing machine learning.
Unified Humanoid Fall-Safety Policy from a Few Demonstrations
Xu, Zhengjie, Li, Ye, Lin, Kwan-yee, Yu, Stella X.
Figure: Our method enables humanoids to fall safely and rise promptly. Snapshots show real-world deployment on the Unitree G1: When suddenly destabilized, the robot redirects into a side fall with arm buffering, then reorients and rises, demonstrating adaptive and resilient recovery.
Abstract: Falling is an inherent risk of humanoid mobility. Maintaining stability is thus a primary safety focus in robot control and learning, yet no existing approach fully averts loss of balance. When instability does occur, prior work addresses only isolated aspects of falling: avoiding falls, choreographing a controlled descent, or standing up afterward. Consequently, humanoid robots lack integrated strategies for impact mitigation and prompt recovery when real falls defy these scripts. We aim to go beyond keeping balance to make the entire fall-and-recovery process safe and autonomous: prevent falls when possible, reduce impact when unavoidable, and stand up when fallen. By fusing sparse human demonstrations with reinforcement learning and an adaptive diffusion-based memory of safe reactions, we learn adaptive whole-body behaviors that unify fall prevention, impact mitigation, and rapid recovery in one policy. Experiments in simulation and on a Unitree G1 demonstrate robust sim-to-real transfer, lower impact forces, and consistently fast recovery across diverse disturbances, pointing toward safer, more resilient humanoids in real environments. Videos are available at https://firm2025.github.io/.
Where there are legs, there will be stumbles. Even the most carefully trained humanoids, built for agile locomotion and intelligent navigation planning, are bound to be jolted off balance by a stray push, a loose stone, or an unexpected gust.
C3PO: Optimized Large Language Model Cascades with Probabilistic Cost Constraints for Reasoning
Valkanas, Antonios, Pal, Soumyasundar, Rumiantsev, Pavel, Zhang, Yingxue, Coates, Mark
Large language models (LLMs) have achieved impressive results on complex reasoning tasks, but their high inference cost remains a major barrier to real-world deployment. A promising solution is to use cascaded inference, where small, cheap models handle easy queries, and only the hardest examples are escalated to more powerful models. However, existing cascade methods typically rely on supervised training with labeled data, offer no theoretical generalization guarantees, and provide limited control over test-time computational cost. We introduce C3PO (Cost Controlled Cascaded Prediction Optimization), a self-supervised framework for optimizing LLM cascades under probabilistic cost constraints. By focusing on minimizing regret with respect to the most powerful model (MPM), C3PO avoids the need for labeled data by constructing a cascade using only unlabeled model outputs. It leverages conformal prediction to bound the probability that inference cost exceeds a user-specified budget. We provide theoretical guarantees on both cost control and generalization error, and show that our optimization procedure is effective even with small calibration sets. Empirically, C3PO achieves state-of-the-art performance across a diverse set of reasoning benchmarks including GSM8K, MATH-500, BigBench-Hard and AIME, outperforming strong LLM cascading baselines in both accuracy and cost-efficiency. Our results demonstrate that principled, label-free cascade optimization can enable scalable LLM deployment.
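The two ingredients named in the abstract, cascaded inference and a conformal bound on cost, can be illustrated with a short generic sketch. This is not C3PO's optimization procedure: the function names, the confidence-threshold escalation rule, and the toy models are assumptions made for illustration. The bound uses the standard split-conformal quantile, which holds when calibration and test queries are exchangeable.

```python
import math


def run_cascade(query, models, thresholds):
    """Query models from cheapest to most powerful and stop at the first confident one.

    models: list of (callable, cost) pairs; each callable returns (answer, confidence).
    thresholds: one confidence threshold per model except the last, which always answers.
    Returns the chosen answer and the total cost spent on this query.
    """
    total_cost = 0.0
    for index, (model, cost) in enumerate(models):
        answer, confidence = model(query)
        total_cost += cost
        is_last = index == len(models) - 1
        if is_last or confidence >= thresholds[index]:
            return answer, total_cost


def conformal_cost_bound(calibration_costs, alpha):
    """Value q such that a new query's cost exceeds q with probability at most alpha.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest calibration cost; if that rank
    exceeds n, no finite bound is available from this calibration set.
    """
    n = len(calibration_costs)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(calibration_costs)[rank - 1]


# Hypothetical stand-ins for a cheap and a powerful model.
def small_model(query):
    return "4", (0.9 if len(query) < 10 else 0.3)


def large_model(query):
    return "42", 1.0


models = [(small_model, 1.0), (large_model, 10.0)]
easy = run_cascade("2+2", models, thresholds=[0.5])
hard = run_cascade("a much harder question", models, thresholds=[0.5])

# Costs observed on ten unlabeled calibration queries (made-up numbers).
calibration_costs = [1.0] * 8 + [11.0] * 2
bound_loose = conformal_cost_bound(calibration_costs, alpha=0.3)
bound_tight = conformal_cost_bound(calibration_costs, alpha=0.1)
```

In a budgeted setting one would compare such a bound against the user-specified budget and adjust the thresholds until it fits; how C3PO performs that search is not described in the abstract and is not reproduced here.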
LoReTTA: A Low Resource Framework To Poison Continuous Time Dynamic Graphs
Pal, Himanshu, Bachina, Venkata Sai Pranav, Gangwal, Ankit, Sharma, Charu
Temporal Graph Neural Networks (TGNNs) are increasingly used in high-stakes domains, such as financial forecasting, recommendation systems, and fraud detection. However, their susceptibility to poisoning attacks poses a critical security risk. We introduce LoReTTA (Low Resource Two-phase Temporal Attack), a novel adversarial framework on Continuous-Time Dynamic Graphs, which degrades TGNN performance by an average of 29.47% across 4 widely used benchmark datasets and 4 State-of-the-Art (SotA) models. LoReTTA operates through a two-stage approach: (1) sparsify the graph by removing high-impact edges using any of the 16 tested temporal importance metrics, (2) strategically replace removed edges with adversarial negatives via LoReTTA's novel degree-preserving negative sampling algorithm. Our plug-and-play design eliminates the need for expensive surrogate models while adhering to realistic unnoticeability constraints. LoReTTA degrades performance by up to 42.0% on MOOC, 31.5% on Wikipedia, 28.8% on UCI, and 15.6% on Enron. LoReTTA outperforms 11 attack baselines, remains undetectable to 4 leading anomaly detection systems, and is robust to 4 SotA adversarial defense training methods, establishing its effectiveness, unnoticeability, and robustness.
RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments
Zeng, Zhiyuan, Ivison, Hamish, Wang, Yiping, Yuan, Lifan, Li, Shuyue Stella, Ye, Zhuorui, Li, Siting, He, Jacqueline, Zhou, Runlong, Chen, Tong, Zhao, Chenyang, Tsvetkov, Yulia, Du, Simon Shaolei, Jaques, Natasha, Peng, Hao, Koh, Pang Wei, Hajishirzi, Hannaneh
We introduce Reinforcement Learning (RL) with Adaptive Verifiable Environments (RLVE), an approach using verifiable environments that procedurally generate problems and provide algorithmically verifiable rewards, to scale up RL for language models (LMs). RLVE enables each verifiable environment to dynamically adapt its problem difficulty distribution to the policy model's capabilities as training progresses. In contrast, static data distributions often lead to vanishing learning signals when problems are either too easy or too hard for the policy. To implement RLVE, we create RLVE-Gym, a large-scale suite of 400 verifiable environments carefully developed through manual environment engineering. Using RLVE-Gym, we show that environment scaling, i.e., expanding the collection of training environments, consistently improves generalizable reasoning capabilities. RLVE with joint training across all 400 environments in RLVE-Gym yields a 3.37% absolute average improvement across six reasoning benchmarks, starting from one of the strongest 1.5B reasoning LMs. By comparison, continuing this LM's original RL training yields only a 0.49% average absolute gain despite using over 3x more compute. We release our code publicly.
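As a rough illustration of what an adaptive verifiable environment could look like, the sketch below procedurally generates addition problems, checks answers algorithmically, and widens its difficulty range once the policy succeeds reliably at the current hardest level. The class name, the promotion rule, and its parameters (`window`, `promote_at`) are assumptions for illustration and are not taken from RLVE-Gym.

```python
import random


class AdaptiveAdditionEnv:
    """Hypothetical verifiable environment whose difficulty follows the policy."""

    def __init__(self, window=20, promote_at=0.9, seed=0):
        self.rng = random.Random(seed)
        self.max_difficulty = 1  # problems use between 1 and max_difficulty digits
        self.window = window
        self.promote_at = promote_at
        self.recent = []  # rewards observed at the current hardest level

    def generate(self):
        digits = self.rng.randint(1, self.max_difficulty)
        low, high = 10 ** (digits - 1), 10 ** digits - 1
        a, b = self.rng.randint(low, high), self.rng.randint(low, high)
        return {"prompt": f"{a} + {b} = ?", "answer": a + b, "digits": digits}

    def verify(self, problem, response):
        # Algorithmic check: the reward is 1.0 only for an exactly correct answer.
        reward = 1.0 if response.strip() == str(problem["answer"]) else 0.0
        if problem["digits"] == self.max_difficulty:
            self.recent.append(reward)
            if len(self.recent) >= self.window:
                # Promote only when the hardest level no longer challenges the policy.
                if sum(self.recent) / len(self.recent) >= self.promote_at:
                    self.max_difficulty += 1
                self.recent = []
        return reward


# A perfect "policy" is promoted after one full window; a failing one is not.
strong_env = AdaptiveAdditionEnv()
for _ in range(20):
    problem = strong_env.generate()
    strong_env.verify(problem, str(problem["answer"]))

weak_env = AdaptiveAdditionEnv()
for _ in range(20):
    problem = weak_env.generate()
    weak_env.verify(problem, "wrong")
```

Keeping the sampled difficulty near the edge of what the policy can solve is one way to avoid the vanishing learning signal that the abstract attributes to static distributions, where every problem is either too easy or too hard.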
MG-HGNN: A Heterogeneous GNN Framework for Indoor Wi-Fi Fingerprint-Based Localization
Wang, Yibu, Zhang, Zhaoxin, Li, Ning, Zhao, Xinlong, Zhao, Dong, Zhao, Tianzi
Abstract: Received signal strength indicator (RSSI) is the primary representation of Wi-Fi fingerprints and serves as a crucial tool for indoor localization. However, existing RSSI-based positioning methods often suffer from reduced accuracy due to environmental complexity and challenges in processing multi-source information. To address these issues, we propose a novel multi-graph heterogeneous GNN framework (MG-HGNN) to enhance spatial awareness and improve positioning performance. In this framework, two graph construction branches perform node and edge embedding, respectively, to generate informative graphs. Subsequently, a heterogeneous graph neural network is employed for graph representation learning, enabling accurate positioning. The MG-HGNN framework introduces the following key innovations: 1) multi-type task-directed graph construction that combines label estimation and feature encoding for richer graph information; 2) a heterogeneous GNN structure that enhances the performance of conventional GNN models. Evaluations on the UJIIndoorLoc and UTSIndoorLoc public datasets demonstrate that MG-HGNN not only achieves superior performance compared to several state-of-the-art methods, but also provides a novel perspective for enhancing GNN-based localization methods. Ablation studies further confirm the rationality and effectiveness of the proposed framework.
Index Terms: Fingerprint-based localization, graph neural network, heterogeneous network, received signal strength indicator (RSSI).
Indoor localization technologies aim to estimate the position of mobile users or devices in indoor environments where satellite-based systems such as GPS are ineffective [1]. Over the past decade, a variety of wireless indoor localization techniques have been developed based on different sensing modalities, including Bluetooth Low Energy (BLE) [2], Ultra Wideband (UWB) [3], Radio Frequency Identification (RFID) [4], magnetic field sensing [5], and Wi-Fi [6], [7].
Among them, Wi-Fi-based localization has attracted a lot of attention due to the ubiquity of Wi-Fi infrastructure, low deployment cost, and compatibility with existing mobile devices without requiring additional hardware [1].
This work has been submitted to the IEEE for possible publication. This work is supported by the National Key Research and Development Program of China [Grant No. 2024QY1103], the Shandong Provincial Natural Science Foundation, China [Grant No. ZR2024QF138]. (Corresponding Yibu Wang, Zhaoxin Zhang, Ning Li, and Tianzi Zhao are with the School of Computer Science and Technology, Harbin Institute of Technology, China (e-mail: 24b903081@stu.hit.edu.cn; Xinlong Zhao is with the China Mineral Resources Group Big Data Co., Ltd, China (e-mail: xinlong.zhao@qq.com).
Designing Beyond Language: Sociotechnical Barriers in AI Health Technologies for Limited English Proficiency
Huang, Michelle, Rodriguez, Violeta J., Saha, Koustuv, August, Tal
Limited English proficiency (LEP) patients in the U.S. face systemic barriers to healthcare beyond language and interpreter access, encompassing procedural and institutional constraints. AI advances may support communication and care through on-demand translation and visit preparation, but also risk exacerbating existing inequalities. We conducted storyboard-driven interviews with 14 patient navigators to explore how AI could shape care experiences for Spanish-speaking LEP individuals. We identified tensions around linguistic and cultural misunderstandings, privacy concerns, and opportunities and risks for AI to augment care workflows. Participants highlighted structural factors that can undermine trust in AI systems, including sensitive information disclosure, unstable technology access, and low digital literacy. While AI tools can potentially alleviate social barriers and institutional constraints, there are risks of misinformation and uprooting human camaraderie. Our findings contribute design considerations for AI that support LEP patients and care teams via rapport-building, education, and language support, and minimizing disruptions to existing practices.