To Learn or Not to Learn, That is the Question -- A Feature-Task Dual Learning Model of Perceptual Learning

Neural Information Processing Systems

Perceptual learning refers to the process through which practice improves participants' performance in perceiving sensory stimuli. Two seemingly conflicting phenomena, specificity and transfer, have been widely observed in perceptual learning. Here, we propose a dual-learning model to reconcile these two phenomena. The model consists of two learning processes. One is task-based learning, which is fast and enables the brain to adapt to a task rapidly by using existing feature representations.


RoboScape-R: Unified Reward-Observation World Models for Generalizable Robotics Training via RL

Tang, Yinzhou, Shang, Yu, Chen, Yinuo, Wei, Bingwen, Zhang, Xin, Yu, Shu'ang, Shi, Liangzhi, Yu, Chao, Gao, Chen, Wu, Wei, Li, Yong

arXiv.org Artificial Intelligence

Achieving generalizable embodied policies remains a key challenge. Traditional policy learning paradigms, including both Imitation Learning (IL) and Reinforcement Learning (RL), struggle to cultivate generalizability across diverse scenarios: IL policies often overfit to specific expert trajectories, while RL suffers from the inherent lack of a unified, general reward signal necessary for effective multi-scene generalization. We posit that the world model is uniquely capable of serving as a universal environment proxy to address this limitation. However, current world models primarily focus on predicting observations and still rely on task-specific, handcrafted reward functions, thereby failing to provide a truly general training environment. To address this problem, we propose RoboScape-R, a framework leveraging the world model as a versatile, general-purpose proxy for the embodied environment within the RL paradigm. We introduce a novel world model-based general reward mechanism that generates "endogenous" rewards derived from the model's intrinsic understanding of real-world state transition dynamics. Extensive experiments demonstrate that RoboScape-R effectively addresses the limitations of traditional RL methods by providing an efficient and general training environment that substantially enhances the generalization capability of embodied policies. Our approach offers critical insights into using the world model as an online training strategy and achieves an average 37.5% performance improvement over baselines in out-of-domain scenarios.
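The abstract's "endogenous" reward, derived from the world model's grasp of state-transition dynamics, can be sketched as a reward proportional to how predictable a transition is under the model. This is a minimal illustration, not the paper's actual mechanism; the function name and vector inputs are hypothetical.

```python
import math

def endogenous_reward(predicted_next, actual_next):
    """Hypothetical endogenous reward: the negative prediction error of a
    world model on a state transition, so transitions the model judges
    plausible score higher. Illustrative only."""
    err = math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next)))
    return -err

# A transition the world model predicts well earns a higher reward
good = endogenous_reward([0.1, 0.2], [0.1, 0.2])
bad = endogenous_reward([0.1, 0.2], [0.9, -0.5])
```

Any policy trained against such a signal is rewarded for producing transitions the world model considers consistent with real-world dynamics, which is what makes the reward task-agnostic.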


Sensing and Understanding the World over Air: A Large Multimodal Model for Mobile Networks

Duan, Zhuoran, Wei, Yuhao, Nan, Guoshun, Wang, Zijun, Yan, Yan, Xiong, Lihua, Ran, Yuhan, Zhang, Ji, Li, Jian, Cui, Qimei, Tao, Xiaofeng, Quek, Tony Q. S.

arXiv.org Artificial Intelligence

Large models (LMs), such as ChatGPT, have made a significant impact across diverse domains and hold great potential to facilitate the evolution of network intelligence. Wireless-native multi-modal large models (WMLMs) can sense and understand the physical world through multi-modal data, serving as a key enabler that integrates communication, sensing, and intelligence, and can thus boost various smart services for billions of users. However, research on WMLMs remains in its infancy, and the construction of domain-specific multi-modal large models for wireless networks is still underexplored. In this paper, we outline the key characteristics of WMLMs and summarize existing methods, on the basis of which a wireless-native multimodal training paradigm is proposed. Specifically, we construct a GPT-style WMLM and train it on a real-world large-scale dataset, leveraging wireless signals as an anchor modality for contrastive learning. Our approach demonstrates outstanding performance compared with existing small-scale models and large multi-modal models, validating the feasibility of using wireless signals as a universal modality and highlighting WMLMs' potential to emerge as a new paradigm for future wireless networks. The advent of large AI models (LMs) such as ChatGPT has propelled network intelligence into a new evolutionary phase. These remarkable enablers are poised to revolutionize future wireless networks through their advanced performance and generalization capability.
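Using one modality as a contrastive "anchor" typically means pairing its embeddings with those of every other modality and training with an InfoNCE-style objective, where matching pairs in a batch are positives and all other pairs are negatives. The sketch below shows that objective in miniature; the function name, toy embeddings, and temperature value are assumptions for illustration, not the paper's implementation.

```python
import math

def info_nce(anchor, other, temperature=0.1):
    """InfoNCE-style loss between an anchor-modality batch (e.g.
    wireless-signal embeddings) and another modality's embeddings.
    Row i of each batch is a matching pair; other rows are negatives."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    def normalize(u):
        n = math.sqrt(dot(u, u))
        return [a / n for a in u]
    a = [normalize(x) for x in anchor]
    b = [normalize(x) for x in other]
    loss = 0.0
    for i in range(len(a)):
        logits = [dot(a[i], b[j]) / temperature for j in range(len(b))]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_z - logits[i]  # cross-entropy with target = index i
    return loss / len(a)

# Correctly paired batches should score a lower loss than mismatched ones
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Anchoring every modality to wireless signals this way pulls all modalities into a shared embedding space without needing pairwise data between every modality combination.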



Guided Graph Compression for Quantum Graph Neural Networks

Casals, Mikel, Belis, Vasilis, Combarro, Elias F., Alarcón, Eduard, Vallecorsa, Sofia, Grossi, Michele

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) are effective for processing graph-structured data but face challenges with large graphs due to high memory requirements and inefficient sparse matrix operations on GPUs. Quantum Computing (QC) offers a promising avenue to address these issues and inspires new algorithmic approaches. In particular, Quantum Graph Neural Networks (QGNNs) have been explored in recent literature. However, current quantum hardware limits the dimension of the data that can be effectively encoded. Existing approaches either simplify datasets manually or use artificial graph datasets. This work introduces the Guided Graph Compression (GGC) framework, which uses a graph autoencoder to reduce both the number of nodes and the dimensionality of node features. The compression is guided to enhance the performance of a downstream classification task, which can be performed with either a quantum or a classical classifier. The framework is evaluated on the Jet Tagging task, a classification problem of fundamental importance in high energy physics that involves distinguishing particle jets initiated by quarks from those initiated by gluons. GGC is compared against using the autoencoder as a standalone preprocessing step and against a baseline classical GNN classifier. Our numerical results demonstrate that GGC outperforms both alternatives, while also facilitating the testing of novel QGNN ansatzes on realistic datasets.
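The compression target, fewer nodes and lower-dimensional features, can be illustrated with a toy pooling step. This is only a stand-in for the learned graph autoencoder: the clustering and truncation below are hypothetical placeholders for what GGC's encoder would learn under classification guidance.

```python
def compress_graph(node_feats, clusters, out_dim):
    """Toy stand-in for a graph autoencoder's compression step: merge the
    nodes in each cluster by averaging their features, then keep only the
    first out_dim feature dimensions. A learned, guided encoder would
    replace both hand-coded steps."""
    pooled = []
    for cluster in clusters:
        members = [node_feats[i] for i in cluster]
        mean = [sum(col) / len(members) for col in zip(*members)]
        pooled.append(mean[:out_dim])
    return pooled

# 4 nodes with 3-d features -> 2 nodes with 2-d features,
# small enough to encode on near-term quantum hardware
feats = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [0.0, 0.0, 0.0], [2.0, 2.0, 2.0]]
small = compress_graph(feats, clusters=[[0, 1], [2, 3]], out_dim=2)
# small == [[2.0, 2.0], [1.0, 1.0]]
```

The "guidance" in GGC amounts to training such a compressor jointly with the downstream classifier, so the retained structure is the part most useful for telling quark jets from gluon jets.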


Schrödinger Bridge Mamba for One-Step Speech Enhancement

Yang, Jing, Wang, Sirui, Wu, Chao, Fan, Fan

arXiv.org Artificial Intelligence

We propose Schrödinger Bridge Mamba (SBM), a new training-inference framework motivated by the inherent compatibility between the Schrödinger Bridge (SB) training paradigm and the selective state-space model Mamba. Experiments on a joint denoising and dereverberation task using four benchmark datasets demonstrate that SBM, with only 1-step inference, outperforms strong baselines with 1-step or iterative inference and achieves the best real-time factor (RTF). Beyond speech enhancement, we discuss the integration of the SB paradigm and the selective state-space model architecture based on their underlying alignment, which indicates a promising direction for exploring new deep generative models potentially applicable to a broad range of generative tasks. Deep generative models have been increasingly employed for speech enhancement (SE). By learning the underlying distribution of clean audio given its degraded counterpart, generative models can generate high-quality speech from low-quality inputs degraded by noise, reverberation, clipping, bandwidth limitation, or a mixture of these artifacts.
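Schrödinger Bridge training for enhancement typically supervises the model on stochastic interpolations between a clean signal and its degraded counterpart. A common simplified form is a Brownian-bridge sample whose variance vanishes at both endpoints; the coefficients below are that textbook simplification, not SBM's exact parameterization.

```python
import random

def bridge_sample(x0, x1, t, sigma=0.5):
    """Illustrative SB-style training sample: a Brownian-bridge interpolation
    between clean speech x0 and degraded speech x1 at time t in [0, 1].
    The noise scale sigma * sqrt(t * (1 - t)) is zero at both endpoints,
    so the bridge is pinned to x0 at t=0 and x1 at t=1."""
    std = sigma * (t * (1.0 - t)) ** 0.5
    return [(1.0 - t) * a + t * b + std * random.gauss(0.0, 1.0)
            for a, b in zip(x0, x1)]
```

Because the bridge prescribes the full path between the two endpoints, a model trained on it can be queried once to jump from the degraded endpoint toward the clean one, which is what makes 1-step inference plausible.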


Personalized Learning Path Planning with Goal-Driven Learner State Modeling

Lim, Joy Jia Yin, He, Ye, Yu, Jifan, Cong, Xin, Zhang-Li, Daniel, Liu, Zhiyuan, Liu, Huiqin, Hou, Lei, Li, Juanzi, Xu, Bin

arXiv.org Artificial Intelligence

Personalized Learning Path Planning (PLPP) aims to design adaptive learning paths that align with individual goals. While large language models (LLMs) show potential for personalizing learning experiences, existing approaches often lack mechanisms for goal-aligned planning. We introduce Pxplore, a novel framework for PLPP that integrates a reinforcement-based training paradigm and an LLM-driven educational architecture. We design a structured learner state model and an automated reward function that transforms abstract objectives into computable signals. We train the policy by combining supervised fine-tuning (SFT) and Group Relative Policy Optimization (GRPO), and deploy it within a real-world learning platform. Extensive experiments validate Pxplore's effectiveness in producing coherent, personalized, and goal-driven learning paths. We release our code and dataset to facilitate future research.
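GRPO's core move is to score each sampled response (here, a candidate learning path) against the other samples in its group rather than against a learned value function. A minimal sketch of that group-relative advantage, with hypothetical reward values:

```python
import math

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: normalize each sample's reward by its group's
    mean and standard deviation, so no critic network is needed. The
    reward values passed in are hypothetical."""
    n = len(rewards)
    mean = sum(rewards) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / n)
    return [(r - mean) / (std + eps) for r in rewards]

# Four candidate learning paths scored by the automated reward function:
# above-average paths get positive advantage, below-average negative
adv = group_relative_advantages([0.2, 0.8, 0.5, 0.5])
```

In a pipeline like the one described, the automated reward function would supply the scalar rewards, and these advantages would weight the policy-gradient update after the SFT stage.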