Pan, Wei
TAR: Teacher-Aligned Representations via Contrastive Learning for Quadrupedal Locomotion
Mousa, Amr, Karavis, Neil, Caprio, Michele, Pan, Wei, Allmendinger, Richard
Quadrupedal locomotion via Reinforcement Learning (RL) is commonly addressed using the teacher-student paradigm, where a privileged teacher guides a proprioceptive student policy. However, key challenges, such as representation misalignment between the privileged teacher and the proprioceptive-only student, covariate shift induced by behavioral cloning, and the lack of deployable adaptation, lead to poor generalization in real-world scenarios. We propose Teacher-Aligned Representations via Contrastive Learning (TAR), a framework that leverages privileged information with self-supervised contrastive learning to bridge this gap. By aligning representations to a privileged teacher in simulation via contrastive objectives, our student policy learns structured latent spaces and exhibits robust generalization to Out-of-Distribution (OOD) scenarios, surpassing the fully privileged "Teacher". TAR reaches peak performance twice as fast as state-of-the-art baselines and generalizes 40% better on average in OOD scenarios than existing methods. Open-source code and videos are available at https://ammousa.github.io/TARLoco/.
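A minimal sketch of the alignment step described in the abstract: an InfoNCE-style contrastive loss that pulls student embeddings toward the teacher embeddings of the same simulated transitions. The encoder architectures, observation dimensions, and temperature below are illustrative assumptions, not the paper's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative encoders: the teacher sees privileged state (e.g. 96-D),
# the student sees proprioception only (e.g. 48-D); sizes are assumptions.
teacher_enc = nn.Sequential(nn.Linear(96, 128), nn.ELU(), nn.Linear(128, 32))
student_enc = nn.Sequential(nn.Linear(48, 128), nn.ELU(), nn.Linear(128, 32))

def contrastive_alignment_loss(priv_obs, prop_obs, temperature=0.1):
    """InfoNCE: pull the student embedding toward the teacher embedding of
    the same transition, push it away from the rest of the batch."""
    z_t = F.normalize(teacher_enc(priv_obs), dim=-1)
    z_s = F.normalize(student_enc(prop_obs), dim=-1)
    logits = z_s @ z_t.t() / temperature        # (B, B) similarity matrix
    labels = torch.arange(z_s.size(0))          # positives on the diagonal
    return F.cross_entropy(logits, labels)

# Toy usage on a batch of paired simulation observations.
loss = contrastive_alignment_loss(torch.randn(256, 96), torch.randn(256, 48))
loss.backward()
```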
Explosive Jumping with Rigid and Articulated Soft Quadrupeds via Example Guided Reinforcement Learning
Apostolides, Georgios, Pan, Wei, Kober, Jens, Della Santina, Cosimo, Ding, Jiatao
Achieving controlled jumping behaviour for a quadruped robot is a challenging task, especially when introducing passive compliance into the mechanical design. This study addresses this challenge via imitation-based deep reinforcement learning with a progressive training process. To start, we learn the jumping skill by mimicking a coarse jumping example generated by model-based trajectory optimization. Subsequently, we generalize the learned policy to broader situations, including various distances in both forward and lateral directions, and then pursue robust jumping on unknown ground unevenness. In addition, with minimal reward tuning, we learn the jumping policy for a quadruped with parallel elasticity. Results show that using the proposed method, i) the robot learns versatile jumps from only a single demonstration, ii) the robot with parallel compliance reduces the landing error by 11.1%, saves energy cost by 15.2%, and reduces the peak torque by 15.8%, compared to the rigid robot without parallel elasticity, and iii) the robot can perform jumps of variable distances with robustness against ground unevenness (height perturbations of up to 4 cm) using only proprioceptive perception.
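A minimal sketch of the example-guided reward idea: a Gaussian tracking reward toward a single demonstrated trajectory. The reference here is a synthetic parabola standing in for the trajectory-optimization output, and the reward shape and width are assumptions.

```python
import numpy as np

# Hypothetical reference: base height over one jump phase, standing in for a
# demonstration produced by model-based trajectory optimization.
T = 100
phase = np.linspace(0.0, 1.0, T)
ref_height = 0.30 + 0.25 * (1.0 - (2.0 * phase - 1.0) ** 2)

def imitation_reward(step, base_height, sigma=0.05):
    """Gaussian tracking reward: near 1 when the measured base height matches
    the demonstrated height at this phase of the jump, decaying with error."""
    err = base_height - ref_height[step]
    return float(np.exp(-(err / sigma) ** 2))

# A robot at 0.54 m near the demonstrated apex (0.55 m) scores close to 1.
print(imitation_reward(step=50, base_height=0.54))
```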
Safe Distributed Learning-Enhanced Predictive Control for Multiple Quadrupedal Robots
Zhan, Weishu, Liang, Zheng, Song, Hongyu, Pan, Wei
Quadrupedal robots exhibit remarkable adaptability in unstructured environments, making them well-suited for formation control in real-world applications. However, maintaining stable formations while ensuring collision-free navigation presents significant challenges due to dynamic obstacles, communication constraints, and the complexity of legged locomotion. This paper proposes a distributed model predictive control framework for multi-quadruped formation control, integrating Control Lyapunov Functions to ensure formation stability and Control Barrier Functions for decentralized safety enforcement. To address the challenge of dynamically changing team structures, we introduce Scale-Adaptive Permutation-Invariant Encoding (SAPIE), which enables robust feature encoding of neighboring robots while preserving permutation invariance. Additionally, we develop a low-latency Data Distribution Service-based communication protocol and an event-triggered deadlock resolution mechanism to enhance real-time coordination and prevent motion stagnation in constrained spaces. Our framework is validated through high-fidelity simulations in NVIDIA Omniverse Isaac Sim and real-world experiments using our custom quadrupedal robotic system, XG. Results demonstrate stable formation control, real-time feasibility, and effective collision avoidance, validating its potential for large-scale deployment.
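A sketch of the permutation-invariance property at the heart of a SAPIE-style neighbor encoder: a shared per-neighbor MLP followed by symmetric pooling, so the encoding is unchanged under reordering and copes with a varying number of neighbors. Layer sizes and the 6-D neighbor feature (e.g. relative position and velocity) are assumptions, and the scale-adaptive part of SAPIE is not modeled here.

```python
import torch
import torch.nn as nn

class PermInvariantEncoder(nn.Module):
    def __init__(self, feat_dim=6, hidden=64, out_dim=32):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, out_dim))

    def forward(self, neighbors):          # (N, feat_dim), N may vary
        h = self.phi(neighbors)            # shared weights for every neighbor
        return h.mean(dim=0)               # symmetric pooling -> (out_dim,)

enc = PermInvariantEncoder()
x = torch.randn(5, 6)                      # features of 5 neighboring robots
perm = x[torch.randperm(5)]
assert torch.allclose(enc(x), enc(perm), atol=1e-6)  # order does not matter
```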
Adaptive Teaming in Multi-Drone Pursuit: Simulation, Training, and Deployment
Li, Yang, Chen, Junfan, Xue, Feng, Qiu, Jiabin, Li, Wenbin, Zhang, Qingrui, Wen, Ying, Pan, Wei
Adaptive teaming, the ability to collaborate with unseen teammates without prior coordination, remains an underexplored challenge in multi-robot collaboration. This paper focuses on adaptive teaming in multi-drone cooperative pursuit, a critical task with real-world applications such as border surveillance, search-and-rescue, and counter-terrorism. We first define and formalize the Adaptive Teaming in Multi-Drone Pursuit (AT-MDP) problem and introduce the AT-MDP framework, which integrates simulation, algorithm training, and real-world deployment. The AT-MDP framework provides a flexible experiment configurator and interface for simulation, a distributed training framework with an extensive algorithm zoo (including two newly proposed baseline methods) and an unseen drone zoo for evaluating adaptive teaming, as well as a real-world deployment system that utilizes edge computing and Crazyflie drones. To the best of our knowledge, the AT-MDP framework is the first adaptive framework for continuous-action decision-making in complex real-world drone tasks, enabling multiple drones to coordinate effectively with unseen teammates. Extensive experiments in four multi-drone pursuit environments of increasing difficulty confirm the effectiveness of the AT-MDP framework, while real-world deployments further validate its feasibility in physical systems. Videos and code are available at https://sites.google.com/view/at-mdp.
SpikingSoft: A Spiking Neuron Controller for Bio-inspired Locomotion with Soft Snake Robots
Zhang, Chuhan, Wang, Cong, Pan, Wei, Della Santina, Cosimo
Inspired by the dynamic coupling of moto-neurons and physical elasticity in animals, this work explores the possibility of generating locomotion gaits by utilizing physical oscillations in a soft snake by means of a low-level spiking neural mechanism. To achieve this goal, we introduce the Double Threshold Spiking neuron model with adjustable thresholds to generate varied output patterns. This neuron model can excite the natural dynamics of soft robotic snakes, and it enables distinct movements, such as turning or moving forward, by simply altering the neural thresholds. Finally, we demonstrate that our approach, termed SpikingSoft, naturally pairs and integrates with reinforcement learning. The high-level agent only needs to adjust the two thresholds to generate complex movement patterns, thus strongly simplifying the learning of reactive locomotion. Simulation results demonstrate that the proposed architecture significantly enhances the performance of the soft snake robot, enabling it to achieve target objectives with a 21.6% increase in success rate, a 29% reduction in time to reach the target, and smoother movements compared to vanilla reinforcement learning controllers or a Central Pattern Generator controller acting in torque space.
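A minimal sketch of a two-threshold leaky integrate-and-fire unit in the spirit of the Double Threshold Spiking model: crossing the upper threshold emits a positive spike, crossing the lower threshold emits a negative one, and the membrane resets. All parameter values are illustrative assumptions.

```python
import numpy as np

def double_threshold_neuron(inputs, theta_hi=1.0, theta_lo=-1.0, leak=0.9):
    """Leaky integration with two adjustable firing thresholds."""
    v, spikes = 0.0, []
    for x in inputs:
        v = leak * v + x                  # leaky integration of input current
        if v >= theta_hi:
            spikes.append(+1); v = 0.0    # positive spike and reset
        elif v <= theta_lo:
            spikes.append(-1); v = 0.0    # negative spike and reset
        else:
            spikes.append(0)
    return spikes

# Shifting the thresholds reshapes the spike pattern, which is what a
# high-level RL agent would exploit to switch between movement patterns.
drive = np.sin(np.linspace(0, 8 * np.pi, 200))
print(sum(s != 0 for s in double_threshold_neuron(drive, theta_hi=0.5)))
```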
Toward Scalable Multirobot Control: Fast Policy Learning in Distributed MPC
Zhang, Xinglong, Pan, Wei, Li, Cong, Xu, Xin, Wang, Xiangke, Zhang, Ronghua, Hu, Dewen
Distributed model predictive control (DMPC) is promising in achieving optimal cooperative control in multirobot systems (MRS). However, real-time DMPC implementation relies on numerical optimization tools to periodically calculate local control sequences online. This process is computationally demanding and lacks scalability for large-scale, nonlinear MRS. This article proposes a novel distributed learning-based predictive control (DLPC) framework for scalable multirobot control. Unlike conventional DMPC methods that calculate open-loop control sequences, our approach centers around a computationally fast and efficient distributed policy learning algorithm that generates explicit closed-loop DMPC policies for MRS without using numerical solvers. The policy learning is executed incrementally and forward in time in each prediction interval through an online distributed actor-critic implementation. The control policies are successively updated in a receding-horizon manner, enabling fast and efficient policy learning with a closed-loop stability guarantee. The learned control policies can be deployed online to MRS with varying robot scales, enhancing scalability and transferability for large-scale MRS. Furthermore, we extend our methodology to address the multirobot safe learning challenge through a force field-inspired policy learning approach. We validate our approach's effectiveness, scalability, and efficiency through extensive experiments on cooperative tasks of large-scale wheeled robots and multirotor drones. Our results demonstrate the rapid learning and deployment of DMPC policies for MRS with scales up to 10,000 units.
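The central mechanism is learning an explicit feedback policy from rollouts over the prediction horizon instead of calling a numerical solver each interval. The sketch below illustrates that idea in the simplest setting: a single toy double-integrator robot whose linear feedback gain is improved from horizon costs by a zeroth-order (two-point) gradient estimate. This is a generic solver-free stand-in, not the paper's distributed actor-critic algorithm, and all dynamics, costs, and rates are assumptions.

```python
import numpy as np

# Toy double-integrator "robot"; dynamics, costs, and rates are assumptions.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), 0.1 * np.eye(1)

def horizon_cost(K, x0, T=30):
    """Accumulated quadratic cost of the explicit policy u = -K x over one
    prediction horizon, evaluated by rollout instead of a numerical solver."""
    x, c = x0.copy(), 0.0
    for _ in range(T):
        u = -K @ x
        c += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return c

K = np.zeros((1, 2))                               # explicit feedback policy
rng = np.random.default_rng(0)
for _ in range(500):
    x0 = rng.standard_normal(2)
    eps = rng.standard_normal((1, 2))
    jp, jm = horizon_cost(K + 0.05 * eps, x0), horizon_cost(K - 0.05 * eps, x0)
    step = np.clip((jp - jm) / 0.1, -50.0, 50.0)   # two-point gradient estimate
    K -= 1e-3 * step * eps                         # incremental policy update
print("learned feedback gain:", K)
```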
LoKO: Low-Rank Kalman Optimizer for Online Fine-Tuning of Large Models
Abdi, Hossein, Sun, Mingfei, Zhang, Andi, Kaski, Samuel, Pan, Wei
Training large models with millions or even billions of parameters from scratch incurs substantial computational costs. Parameter Efficient Fine-Tuning (PEFT) methods, particularly Low-Rank Adaptation (LoRA), address this challenge by adapting only a reduced number of parameters to specific tasks with gradient-based optimizers. In this paper, we cast PEFT as an optimal filtering/state estimation problem and present the Low-Rank Kalman Optimizer (LoKO) to estimate the optimal trainable parameters in an online manner. We leverage the low-rank decomposition in LoRA to significantly reduce matrix sizes in Kalman iterations and further capitalize on a diagonal approximation of the covariance matrix to effectively decrease computational complexity from quadratic to linear in the number of trainable parameters. Moreover, we discovered that the initialization of the covariance matrix within the Kalman algorithm and the accurate estimation of the observation noise covariance are key to this formulation, and we propose robust approaches that work well across a vast range of well-established computer vision and language models. Our results show that LoKO converges with fewer iterations and yields better-performing models compared to commonly used optimizers with LoRA in both image classification and language tasks. Our study opens up the possibility of leveraging the Kalman filter as an effective optimizer for the online fine-tuning of large models.
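A minimal sketch of the core trick: a Kalman filter with a diagonal covariance used as an online optimizer, which makes each update linear in the number of trainable parameters. It is shown here on a toy linear-regression stream rather than LoRA factors, and the initial covariance and observation-noise values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
w_true = rng.standard_normal(8)
w = np.zeros(8)                   # trainable parameters (the filter's state)
p = np.ones(8)                    # diagonal of the covariance matrix P
R = 0.1                           # observation-noise variance estimate

for step in range(500):
    x = rng.standard_normal(8)             # input features
    y = x @ w_true + 0.1 * rng.standard_normal()
    h = x                                  # Jacobian of the prediction w.r.t. w
    s = (h * p) @ h + R                    # innovation variance (scalar)
    k = p * h / s                          # Kalman gain, O(n) with diagonal P
    w += k * (y - x @ w)                   # state update = parameter update
    p = (1.0 - k * h) * p                  # diagonal covariance update

print("parameter error:", np.linalg.norm(w - w_true))
```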
Modular Adaptive Aerial Manipulation under Unknown Dynamic Coupling Forces
Yadav, Rishabh Dev, Dantu, Swati, Pan, Wei, Sun, Sihao, Roy, Spandan, Baldi, Simone
Successful aerial manipulation largely depends on how effectively a controller can tackle the coupling dynamic forces between the aerial vehicle and the manipulator. However, this control problem has remained largely unsolved, as existing control approaches either require precise knowledge of the aerial vehicle/manipulator inertial couplings or neglect the state-dependent uncertainties, especially those arising during the interaction phase. This work proposes an adaptive control solution to overcome this long-standing control challenge without any a priori knowledge of the coupling dynamic terms. Additionally, in contrast to existing adaptive control solutions, the proposed control framework is modular: it allows independent tuning of the adaptive gains for the vehicle position sub-dynamics, the vehicle attitude sub-dynamics, and the manipulator sub-dynamics. Stability of the closed loop under the proposed scheme is derived analytically, and real-time experiments validate the effectiveness of the proposed scheme over state-of-the-art approaches.
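As a loose illustration of the modular idea, the toy sketch below runs the same scalar adaptive law on three independent "subsystems", each with its own adaptation rate, compensating an unknown bounded coupling force without a model of it. The dynamics, sliding variable, and gains are invented for illustration and are not the paper's controller.

```python
import numpy as np

def run_subsystem(gamma, steps=4000, dt=0.005):
    """Toy scalar subsystem x_ddot = u + d with an unknown bounded coupling
    force d; the adaptive gain rho grows with |s| at module-specific rate gamma."""
    x, v, rho = 1.0, 0.0, 0.0
    for k in range(steps):
        d = 2.0 * np.sin(0.05 * k)               # unknown coupling disturbance
        s = v + x                                # sliding variable
        u = -5.0 * s - rho * np.tanh(s / 0.05)   # smooth switching term
        rho += dt * gamma * abs(s)               # adaptive law, model-free
        v += dt * (u + d)
        x += dt * v
    return abs(x), rho

# Each module tunes its own adaptation rate independently of the others.
for name, gamma in [("position", 2.0), ("attitude", 20.0), ("manipulator", 5.0)]:
    err, rho = run_subsystem(gamma)
    print(f"{name:12s} final |x| = {err:.4f}, adapted gain = {rho:.2f}")
```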
Spatial-aware decision-making with ring attractors in reinforcement learning systems
Saura, Marcos Negre, Allmendinger, Richard, Papamarkou, Theodore, Pan, Wei
This paper explores the integration of ring attractors (Kim et al., 2017), a mathematical model inspired by neural circuit dynamics, into the reinforcement learning (RL) action selection process. Ring attractors, as specialized brain-inspired structures that encode spatial information and uncertainty, offer a biologically plausible mechanism to improve learning speed and predictive performance. They do so by explicitly encoding the action space, facilitating the organization of neural activity, and enabling the distribution of spatial representations across the neural network in the context of deep RL. The application of ring attractors in the RL action selection process involves mapping actions to specific locations on the ring and decoding the selected action based on neural activity. We investigate the application of ring attractors both by building them as exogenous models and by integrating them into a deep RL policy algorithm. Our results show a significant improvement over state-of-the-art models on the Atari 100k benchmark, with our integrated approach yielding a 53% performance increase over selected baselines. This integration improves spatial awareness in action selection and provides a mechanism for uncertainty-aware decision making, leading to more accurate and efficient learning in environments with spatial structure.
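A minimal sketch of the encode-and-decode step described above, assuming a discrete action set, a von Mises activity bump per action, and population-vector decoding; the recurrent attractor dynamics and the deep RL integration are omitted, and all values are illustrative.

```python
import numpy as np

n_actions, n_neurons = 8, 64
angles = np.linspace(0, 2 * np.pi, n_neurons, endpoint=False)
action_angles = np.linspace(0, 2 * np.pi, n_actions, endpoint=False)

def ring_activity(q_values, kappa=4.0):
    """Each action injects a von-Mises bump of activity at its ring location,
    scaled by its value estimate (recurrent dynamics are abstracted away)."""
    bumps = np.exp(kappa * np.cos(angles[None, :] - action_angles[:, None]))
    return q_values @ bumps                      # (n_neurons,) population code

def decode_action(activity):
    """Population-vector decoding: the angular mean of the bump picks the action."""
    mean_angle = np.angle(np.sum(activity * np.exp(1j * angles)))
    diffs = np.angle(np.exp(1j * (action_angles - mean_angle)))
    return int(np.argmin(np.abs(diffs)))

# Two neighboring high-value actions form one bump between their locations,
# so the decoded choice reflects spatial structure, not just a hard argmax.
q = np.array([0.1, 0.2, 1.6, 1.2, 0.3, 0.1, 0.0, 0.2])
print(decode_action(ring_activity(q)))          # -> 2, pulled slightly toward 3
```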
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
Liu, Tenglong, Li, Yang, Lan, Yixing, Gao, Hao, Pan, Wei, Xu, Xin
In offline reinforcement learning, the challenge of out-of-distribution (OOD) actions is pronounced. To address this, existing methods often constrain the learned policy through policy regularization. However, these methods often suffer from unnecessary conservativeness, hampering policy improvement. This occurs due to the indiscriminate use of all actions from the behavior policy that generates the offline dataset as constraints. The problem becomes particularly noticeable when the quality of the dataset is suboptimal. Thus, we propose Adaptive Advantage-guided Policy Regularization (A2PR), obtaining high-advantage actions from an augmented behavior policy combined with a VAE to guide the learned policy. A2PR can select high-advantage actions that differ from those present in the dataset, while still effectively maintaining conservatism toward OOD actions. This is achieved by harnessing the VAE's capacity to generate samples matching the distribution of the data points. We theoretically prove that the improvement of the behavior policy is guaranteed. In addition, A2PR effectively mitigates value overestimation with a bounded performance gap. Empirically, we conduct a series of experiments on the D4RL benchmark, where A2PR demonstrates state-of-the-art performance. Furthermore, experimental results on additional suboptimal mixed datasets reveal that A2PR exhibits superior performance. Code is available at https://github.com/ltlhuuu/A2PR.
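A minimal sketch of advantage-guided regularization in the spirit of A2PR: the behavior-cloning constraint is applied only toward whichever candidate action (from the dataset or from a generator, a VAE in the paper) has the higher advantage. The networks, loss weighting, and stand-in generator below are assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def a2pr_policy_loss(policy, critic, value_fn, states, dataset_actions,
                     generated_actions, bc_weight=1.0):
    """Q-improvement plus behavior cloning toward the candidate action
    (dataset vs. generated) with the higher advantage A = Q - V."""
    with torch.no_grad():
        adv_data = (critic(states, dataset_actions) - value_fn(states)).squeeze(-1)
        adv_gen = (critic(states, generated_actions) - value_fn(states)).squeeze(-1)
        pick_gen = (adv_gen > adv_data).float().unsqueeze(-1)
        target = pick_gen * generated_actions + (1 - pick_gen) * dataset_actions
    pi_actions = policy(states)
    q_term = -critic(states, pi_actions).mean()     # push toward higher Q
    bc_term = F.mse_loss(pi_actions, target)        # stay near high-advantage data
    return q_term + bc_weight * bc_term

# Toy usage with stand-in networks (the paper's generator is a VAE).
policy = torch.nn.Linear(4, 2)
critic = lambda s, a: s.sum(-1, keepdim=True) + a.sum(-1, keepdim=True)
value_fn = lambda s: s.sum(-1, keepdim=True)
s, a_data, a_gen = torch.randn(32, 4), torch.randn(32, 2), torch.randn(32, 2)
a2pr_policy_loss(policy, critic, value_fn, s, a_data, a_gen).backward()
```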