
Collaborating Authors

Zhou, Hongyi


Chunking the Critic: A Transformer-based Soft Actor-Critic with N-Step Returns

arXiv.org Artificial Intelligence

Unlike traditional methods that focus on evaluating single state-action pairs or apply action chunking in the actor network, this approach feeds chunked actions directly into the critic. Leveraging the Transformer's strength in processing sequential data, the proposed architecture achieves more robust value estimation. Empirical evaluations demonstrate that this method leads to efficient and stable training, particularly excelling in environments with sparse rewards or multi-phase tasks.

Contributions:
1. We present a novel critic architecture for SAC that leverages Transformers to process sequential information, resulting in more accurate value estimation. Context: Transformer-Based Critic Network.
2. We introduce a method for incorporating N-step returns into the critic network in a stable and efficient manner, effectively mitigating the variance and importance-sampling challenges commonly associated with N-step returns. Context: Stable Integration of N-Step Returns.
3. We shift action chunking from the actor to the critic, demonstrating that enhanced temporal reasoning at the critic level, beyond traditional actor-side exploration, drives performance improvements in sparse and multi-phase tasks. Context: Unlike previous approaches that focus on actor-side chunking for exploration, our Transformer-based critic network produces a smooth value surface that is highly responsive to dataset variations, eliminating the need for additional exploration enhancements.
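The abstract does not spell out the N-step target itself. As a minimal, generic sketch (function name and hyperparameters are illustrative, not taken from the paper), a discounted N-step return with a bootstrapped critic value looks like:

```python
def n_step_return(rewards, bootstrap_value, gamma=0.99, n=5):
    """Discounted n-step return: sum_{k<n} gamma^k * r_k + gamma^n * V(s_n).

    `rewards` holds the first n rewards after the current state and
    `bootstrap_value` is the critic's estimate at the n-th next state.
    """
    assert len(rewards) == n
    g = 0.0
    for k, r in enumerate(rewards):
        g += (gamma ** k) * r
    return g + (gamma ** n) * bootstrap_value
```

For example, `n_step_return([1.0, 1.0, 1.0], 0.0, gamma=0.5, n=3)` yields 1 + 0.5 + 0.25 = 1.75. Evaluating the whole action chunk in the critic is what lets such multi-step targets be used without the per-step importance corrections that off-policy N-step methods usually require, which is the stability issue the second contribution addresses.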


IRIS: An Immersive Robot Interaction System

arXiv.org Artificial Intelligence

This paper introduces IRIS, an Immersive Robot Interaction System leveraging Extended Reality (XR), designed for robot data collection and interaction across multiple simulators, benchmarks, and real-world scenarios. While existing XR-based data collection systems provide efficient and intuitive solutions for large-scale data collection, they are often challenging to reproduce and reuse. This limitation arises because current systems are highly tailored to simulator-specific use cases and environments. IRIS is a novel, easily extendable framework that already supports multiple simulators, benchmarks, and even headsets. Furthermore, IRIS is able to include additional information from real-world sensors, such as point clouds captured through depth cameras. A unified scene specification is generated directly from simulators or real-world sensors and transmitted to XR headsets, creating identical scenes in XR. This specification allows IRIS to support any of the objects, assets, and robots provided by the simulators. In addition, IRIS introduces shared spatial anchors and a robust communication protocol that links simulations between multiple XR headsets. This feature enables multiple XR headsets to share a synchronized scene, facilitating collaborative and multi-user data collection. IRIS can be deployed on any device that supports the Unity Framework, encompassing the vast majority of commercially available headsets. In this work, IRIS was deployed and tested on the Meta Quest 3 and the HoloLens 2. IRIS showcased its versatility across a wide range of real-world and simulated scenarios, using current popular robot simulators such as MuJoCo, IsaacSim, CoppeliaSim, and Genesis. In addition, a user study evaluates IRIS on a data collection task for the LIBERO benchmark. The study shows that IRIS significantly outperforms the baseline in both objective and subjective metrics.
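The paper's unified scene specification is not reproduced in the abstract. Purely as an illustration of the idea, a simulator-agnostic scene description serialized for transmission to a headset could be a plain JSON document; every field name below is hypothetical, not IRIS's actual wire format:

```python
import json

# Hypothetical, simulator-agnostic scene entry: one rigid body with a
# mesh asset reference and pose, plus one robot with joint positions.
# Field names are illustrative only.
scene_spec = {
    "objects": [
        {"name": "table", "asset": "assets/table.obj",
         "position": [0.0, 0.0, 0.4], "quaternion": [1, 0, 0, 0]},
    ],
    "robots": [
        {"name": "panda", "joint_positions": [0.0] * 7},
    ],
}

payload = json.dumps(scene_spec)   # serialize for transmission to the headset
restored = json.loads(payload)     # headset-side parse recreates the scene
```

A text-based spec like this is what makes the same scene reproducible across simulators and headsets: any backend that can emit the document, and any frontend that can parse it, interoperates.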


X-IL: Exploring the Design Space of Imitation Learning Policies

arXiv.org Artificial Intelligence

Designing modern imitation learning (IL) policies requires making numerous decisions, including the selection of feature encoding, architecture, policy representation, and more. As the field rapidly advances, the range of available options continues to grow, creating a vast and largely unexplored design space for IL policies. In this work, we present X-IL, an accessible open-source framework designed to systematically explore this design space. The framework's modular design enables seamless swapping of policy components, such as backbones (e.g., Transformer, Mamba, xLSTM) and policy optimization techniques (e.g., Score-matching, Flow-matching). This flexibility facilitates comprehensive experimentation and has led to the discovery of novel policy configurations that outperform existing methods on recent robot learning benchmarks. Our experiments demonstrate not only significant performance gains but also provide valuable insights into the strengths and weaknesses of various design choices. This study serves as both a practical reference for practitioners and a foundation for guiding future research in imitation learning.
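The seamless swapping of policy components described above is typically realized with a small component registry keyed by config strings. A minimal sketch follows; the class and function names are hypothetical, not X-IL's actual API:

```python
BACKBONES = {}

def register_backbone(name):
    """Decorator that records a backbone class under a string key."""
    def wrap(cls):
        BACKBONES[name] = cls
        return cls
    return wrap

@register_backbone("transformer")
class TransformerBackbone:
    def __init__(self, dim):
        self.dim = dim

@register_backbone("xlstm")
class XLSTMBackbone:
    def __init__(self, dim):
        self.dim = dim

def build_policy(backbone: str, dim: int = 256):
    """Instantiate a policy backbone purely from a config string,
    so sweeping the design space is a matter of changing one field."""
    return BACKBONES[backbone](dim)
```

With this pattern, a design-space sweep over backbones (Transformer, Mamba, xLSTM) or policy representations reduces to iterating over registry keys in a config file.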


Towards Fusing Point Cloud and Visual Representations for Imitation Learning

arXiv.org Artificial Intelligence

Learning for manipulation requires using policies that have access to rich sensory information such as point clouds or RGB images. Point clouds efficiently capture geometric structures, making them essential for manipulation tasks in imitation learning. In contrast, RGB images provide rich texture and semantic information that can be crucial for certain tasks. Existing approaches for fusing both modalities assign 2D image features to point clouds. However, such approaches often lose global contextual information from the original images. In this work, we propose a novel imitation learning method that effectively combines the strengths of both point cloud and RGB modalities. Our method conditions the point-cloud encoder on global and local image tokens using adaptive layer norm conditioning, leveraging the beneficial properties of both modalities. Through extensive experiments on the challenging RoboCasa benchmark, we demonstrate the limitations of relying on either modality alone and show that our method achieves state-of-the-art performance across all tasks.
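Adaptive layer-norm conditioning, as named above, modulates normalized point-cloud features with a scale and shift regressed from the image tokens. A minimal numpy sketch (the weight matrices here are stand-ins for learned projections, and the pooling of image tokens into one vector is an assumption):

```python
import numpy as np

def ada_layer_norm(x, cond, w_scale, w_shift, eps=1e-5):
    """Adaptive LayerNorm: normalize x per token, then modulate with a
    scale/shift predicted from the conditioning vector `cond`.

    x:       (tokens, dim)     point-cloud features
    cond:    (cond_dim,)       pooled image-token embedding
    w_scale: (cond_dim, dim)   stand-in for a learned projection
    w_shift: (cond_dim, dim)   stand-in for a learned projection
    """
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mu) / np.sqrt(var + eps)
    scale = cond @ w_scale          # (dim,) per-feature scale
    shift = cond @ w_shift          # (dim,) per-feature shift
    return (1.0 + scale) * x_hat + shift
```

Because the scale and shift apply to every point token, the global image context reaches the entire point cloud, which is exactly what per-point feature painting tends to lose.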


Uniform-in-time weak propagation of chaos for consensus-based optimization

arXiv.org Artificial Intelligence

We study the uniform-in-time weak propagation of chaos for the consensus-based optimization (CBO) method on a bounded search domain. We apply the methodology for studying long-time behaviors of interacting particle systems developed in the work of Delarue and Tse (ArXiv:2104.14973). Our work shows that the weak error has order $O(N^{-1})$ uniformly in time, where $N$ denotes the number of particles. The main strategy behind the proofs is the decomposition of the weak error using the linearized Fokker-Planck equations and the exponential decay of their Sobolev norms. Consequently, our result leads to the joint convergence of the empirical distribution of the CBO particle system to the Dirac-delta distribution at the global minimizer in population size and running time in Wasserstein-type metrics.
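The particle system under study is the standard CBO dynamics; the paper analyzes its mean-field limit rather than the algorithm itself. For readers unfamiliar with CBO, a minimal numpy sketch of the usual Euler-Maruyama discretization (all hyperparameters illustrative) is:

```python
import numpy as np

def cbo_step(x, f, lam=1.0, sigma=0.3, alpha=30.0, dt=0.1, rng=None):
    """One CBO update: drift toward the Gibbs-weighted consensus point,
    plus noise scaled by each particle's distance from consensus."""
    if rng is None:
        rng = np.random.default_rng(0)
    fx = f(x)
    w = np.exp(-alpha * (fx - fx.min()))   # stabilized Gibbs weights
    m = (w * x).sum() / w.sum()            # consensus point
    noise = rng.standard_normal(x.shape)
    x_new = x - lam * dt * (x - m) + sigma * np.sqrt(dt) * np.abs(x - m) * noise
    return x_new, m

# Minimize f(x) = (x - 1)^2 with N = 50 particles on [-3, 3].
f = lambda x: (x - 1.0) ** 2
rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=50)
for _ in range(200):
    x, m = cbo_step(x, f, rng=rng)
```

As the number of particles N grows, the empirical distribution of such a system approaches the mean-field law; the paper's contribution is that this weak error is O(1/N) uniformly in time, so the approximation does not degrade over long horizons.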


BMP: Bridging the Gap between B-Spline and Movement Primitives

arXiv.org Artificial Intelligence

This work introduces B-spline Movement Primitives (BMPs), a new Movement Primitive (MP) variant that leverages B-splines for motion representation. B-splines are a well-known concept in motion planning due to their ability to generate complex, smooth trajectories with only a few control points while satisfying boundary conditions, i.e., passing through a specified desired position with a desired velocity. However, current usages of B-splines tend to ignore the higher-order statistics of trajectory distributions, which limits their use in imitation learning (IL) and reinforcement learning (RL), where modeling trajectory distributions is essential. In contrast, MPs are commonly used in IL and RL for their capacity to capture trajectory likelihoods and correlations. However, MPs are limited in their ability to satisfy boundary conditions and usually need extra terms in their learning objectives to satisfy velocity constraints. By reformulating B-splines as MPs, represented through basis functions and weight parameters, BMPs combine the strengths of both approaches, allowing B-splines to capture higher-order statistics while retaining their ability to satisfy boundary conditions. Empirical results in IL and RL demonstrate that BMPs broaden the applicability of B-splines in robot learning and offer greater expressiveness compared to existing MP variants.
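Viewing a B-spline as an MP means writing the trajectory as basis functions times weights, y(t) = sum_i N_i(t) w_i. A minimal Cox-de Boor sketch with a clamped knot vector (so the curve meets its endpoint boundary conditions), not taken from the paper's implementation:

```python
import numpy as np

def bspline_basis(i, p, t, knots):
    """Cox-de Boor recursion for the i-th degree-p B-spline basis at t."""
    if p == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    val = 0.0
    d = knots[i + p] - knots[i]
    if d > 0:
        val += (t - knots[i]) / d * bspline_basis(i, p - 1, t, knots)
    d = knots[i + p + 1] - knots[i + 1]
    if d > 0:
        val += (knots[i + p + 1] - t) / d * bspline_basis(i + 1, p - 1, t, knots)
    return val

def bspline_traj(t, weights, p=3):
    """Trajectory value at t in [0, 1) as a weighted basis combination."""
    n = len(weights)
    # Clamped knots: the curve starts at weights[0] and ends at weights[-1],
    # which is how the boundary conditions are met by construction.
    knots = np.concatenate([np.zeros(p), np.linspace(0, 1, n - p + 1), np.ones(p)])
    return sum(bspline_basis(i, p, t, weights and knots) * weights[i] for i in range(n)) if False else sum(bspline_basis(i, p, t, knots) * weights[i] for i in range(n))
```

Once the trajectory is linear in the weights, placing a Gaussian over the weight vector induces a distribution over trajectories with full covariance, which is the MP-style higher-order statistics the abstract refers to.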


A Retrospective on the Robot Air Hockey Challenge: Benchmarking Robust, Reliable, and Safe Learning Techniques for Real-world Robotics

arXiv.org Artificial Intelligence

Machine learning methods have had a groundbreaking impact in many application domains, but their application on real robotic platforms is still limited. Despite the many challenges associated with combining machine learning technology with robotics, robot learning remains one of the most promising directions for enhancing the capabilities of robots. When deploying learning-based approaches on real robots, extra effort is required to address the challenges posed by various real-world factors. To investigate the key factors influencing real-world deployment and to encourage original solutions from different researchers, we organized the Robot Air Hockey Challenge at the NeurIPS 2023 conference. We selected the air hockey task as a benchmark, encompassing low-level robotics problems and high-level tactics. Different from other machine learning-centric benchmarks, participants needed to tackle practical challenges in robotics, such as the sim-to-real gap, low-level control issues, safety problems, real-time requirements, and the limited availability of real-world data. Furthermore, we focused on a dynamic environment, removing the typical assumption of quasi-static motions made by other real-world benchmarks. The competition's results show that solutions combining learning-based approaches with prior knowledge outperform those relying solely on data when real-world deployment is challenging. Our ablation study reveals which real-world factors may be overlooked when building a learning-based solution. The successful real-world air hockey deployment of the best-performing agents sets the foundation for future competitions and follow-up research directions.


HIRO: Heuristics Informed Robot Online Path Planning Using Pre-computed Deterministic Roadmaps

arXiv.org Artificial Intelligence

Dividing robot environments into static and dynamic elements, we use the static part for initializing a deterministic roadmap, which provides a lower bound of the final path cost as informed heuristics for fast path-finding. These heuristics guide a search tree to explore the roadmap during runtime. The search tree examines edges using fuzzy collision checking against the dynamic environment. Finally, the search tree exploits knowledge fed back from the fuzzy collision checking module and updates the lower bound for the path cost. As we demonstrate in real-world experiments, the closed loop formed by these three components significantly accelerates the planning procedure. An additional backtracking step ensures the feasibility of the resulting paths. Experiments in simulation and the real world show that HIRO can find collision-free paths considerably faster than baseline methods with and without prior knowledge of the environment.
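The core idea above, precomputing exact costs on the static roadmap and using them as an admissible lower bound while the online search skips dynamically blocked edges, can be sketched as follows (the graph, names, and blocking mechanism are illustrative, not HIRO's implementation):

```python
import heapq

def dijkstra(adj, goal):
    """Exact cost-to-goal on the static roadmap. Since dynamic obstacles
    can only remove edges, these costs lower-bound the true online cost,
    i.e. they are an admissible heuristic."""
    dist = {goal: 0.0}
    pq = [(0.0, goal)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def astar(adj, start, goal, h, blocked=frozenset()):
    """A* over the roadmap guided by the precomputed lower bound `h`,
    skipping edges currently blocked by dynamic obstacles."""
    pq = [(h.get(start, float("inf")), 0.0, start, [start])]
    seen = set()
    while pq:
        _, g, u, path = heapq.heappop(pq)
        if u == goal:
            return g, path
        if u in seen:
            continue
        seen.add(u)
        for v, w in adj[u]:
            if (u, v) in blocked or (v, u) in blocked:
                continue
            heapq.heappush(pq, (g + w + h.get(v, float("inf")), g + w, v, path + [v]))
    return None

# Toy undirected roadmap: A-B (1), B-C (1), A-C (3).
adj = {"A": [("B", 1.0), ("C", 3.0)],
       "B": [("A", 1.0), ("C", 1.0)],
       "C": [("B", 1.0), ("A", 3.0)]}
h = dijkstra(adj, "C")                 # static lower bounds, computed offline
cost, path = astar(adj, "A", "C", h)   # online query with no obstacles
```

When the collision checker later reports that an edge is blocked, re-running the online search with that edge in `blocked` reuses the same precomputed heuristic, which is where the speedup over planning from scratch comes from.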


TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning

arXiv.org Artificial Intelligence

This work introduces a novel off-policy Reinforcement Learning (RL) algorithm that utilizes a transformer-based architecture for predicting the state-action values for a sequence of actions. These returns are effectively used to update the policy, which predicts a smooth trajectory instead of a single action at each decision step. Predicting a whole trajectory of actions is commonly done in episodic RL (ERL) (Kober & Peters, 2008) and differs conceptually from conventional step-based RL (SRL) methods like PPO (Schulman et al., 2017) and SAC (Haarnoja et al., 2018a), where an action is sampled at each time step. The action selection concept in ERL is promising, as shown in recent works in RL (Otto et al., 2022; Li et al., 2024). Similar insights have emerged in the field of Imitation Learning, where predicting action sequences instead of single actions has led to great success (Zhao et al., 2023; Reuss et al., 2024). Additionally, decision-making in ERL aligns with human decision-making, where a person generally does not decide at every single time step but rather performs a whole sequence of actions to complete a task, for instance, swinging an arm to play tennis without overthinking each per-step movement. Episodic RL is a distinct family of RL that emphasizes the maximization of returns over entire episodes, typically lasting several seconds, rather than optimizing the intermediate states during environment interactions (Whitley et al., 1993; Igel, 2003; Peters & Schaal, 2008). Unlike SRL, ERL shifts the solution search from per-step actions to a parameterized trajectory space, leveraging techniques like Movement Primitives (MPs) (Schaal, 2006; Paraschos et al., 2013) for generating action sequences. This approach enables a broader exploration horizon (Kober & Peters, 2008), captures temporal and degrees of freedom (DoF) correlations (Li et al., 2024), and ensures smooth transitions between re-planning phases (Otto et al., 2023).


Variational Distillation of Diffusion Policies into Mixture of Experts

arXiv.org Artificial Intelligence

This work introduces Variational Diffusion Distillation (VDD), a novel method that distills denoising diffusion policies into Mixtures of Experts (MoE) through variational inference. Diffusion Models are the current state-of-the-art in generative modeling due to their exceptional ability to accurately learn and represent complex, multi-modal distributions. This ability allows Diffusion Models to replicate the inherent diversity in human behavior, making them the preferred models in behavior learning such as Learning from Human Demonstrations (LfD). However, diffusion models come with some drawbacks, including the intractability of likelihoods and long inference times due to their iterative sampling process. The inference times, in particular, pose a significant challenge to real-time applications such as robot control. In contrast, MoEs effectively address the aforementioned issues while retaining the ability to represent complex distributions, but are notoriously difficult to train. VDD is the first method that distills pre-trained diffusion models into MoE models, and hence combines the expressiveness of Diffusion Models with the benefits of Mixture Models. Specifically, VDD leverages a decompositional upper bound of the variational objective that allows each expert to be trained separately, resulting in a robust optimization scheme for MoEs. Across nine complex behavior learning tasks, VDD demonstrates that it is able to: i) accurately distill complex distributions learned by the diffusion model, ii) outperform existing state-of-the-art distillation methods, and iii) surpass conventional methods for training MoEs.
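VDD's actual variational updates cannot be reconstructed from the abstract. As a loose illustration of the per-expert decomposition idea only, here is a plain EM fit of a two-expert Gaussian mixture to samples from a bimodal "teacher"; EM stands in for, and is not, VDD's objective:

```python
import numpy as np

def fit_moe(samples, n_experts=2, iters=50, var=0.25):
    """Fit a 1-D mixture of fixed-variance Gaussian 'experts' to teacher
    samples. Note the M-step: each expert is updated independently from
    its own responsibilities, mirroring the separate per-expert training
    that VDD's decompositional upper bound enables."""
    means = np.array([samples.min(), samples.max()], dtype=float)
    weights = np.full(n_experts, 1.0 / n_experts)
    for _ in range(iters):
        # E-step: responsibility of each expert for each sample.
        logp = -0.5 * (samples[:, None] - means[None, :]) ** 2 / var
        logp += np.log(weights)[None, :]
        r = np.exp(logp - logp.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: independent per-expert updates.
        means = (r * samples[:, None]).sum(axis=0) / r.sum(axis=0)
        weights = r.mean(axis=0)
    return means, weights

# Teacher: bimodal behavior with modes at -2 and +2 (e.g. two valid ways
# to solve the same manipulation task).
rng = np.random.default_rng(1)
teacher = np.concatenate([rng.normal(-2, 0.3, 500), rng.normal(2, 0.3, 500)])
means, weights = fit_moe(teacher)
```

After fitting, each expert covers one behavior mode, and sampling the mixture is a single forward pass, which is the inference-time advantage over iterative diffusion sampling that motivates the distillation.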