Zeng, Wenjun
Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning
Wang, Qi, Zhang, Zhipeng, Xie, Baao, Jin, Xin, Wang, Yunbo, Wang, Shiyu, Zheng, Liaomo, Yang, Xiaokang, Zeng, Wenjun
Training visual reinforcement learning (RL) in practical scenarios presents a significant challenge, $\textit{i.e.,}$ RL agents suffer from low sample efficiency in environments with variations. While various approaches have attempted to alleviate this issue through disentangled representation learning, these methods usually start learning from scratch without prior knowledge of the world. This paper, in contrast, tries to learn and understand underlying semantic variations from distracting videos via offline-to-online latent distillation and flexible disentanglement constraints. To enable effective cross-domain semantic knowledge transfer, we introduce an interpretable model-based RL framework, dubbed Disentangled World Models (DisWM). Specifically, we pretrain an action-free video prediction model offline with disentanglement regularization to extract semantic knowledge from distracting videos. The disentanglement capability of the pretrained model is then transferred to the world model through latent distillation. For finetuning in the online environment, we exploit the knowledge of the pretrained model and introduce a disentanglement constraint to the world model. During the adaptation phase, the incorporation of actions and rewards from online environment interactions enriches the diversity of the data, which in turn strengthens disentangled representation learning. Experimental results validate the superiority of our approach on various benchmarks.
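As a rough illustration of the offline-to-online latent distillation idea described above, the sketch below distills the latents of a frozen, pretrained teacher encoder into a world-model encoder while keeping a beta-VAE-style factorization penalty as the disentanglement constraint. The module names, architecture, and loss weighting are assumptions made for illustration, not the paper's implementation.

```python
# Hypothetical sketch of offline-to-online latent distillation (names and loss
# choices are assumptions, not the paper's exact formulation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps 64x64 RGB frames to a factored latent (mean, logvar)."""
    def __init__(self, latent_dim=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Linear(64 * 16 * 16, 2 * latent_dim)

    def forward(self, x):
        mu, logvar = self.head(self.conv(x)).chunk(2, dim=-1)
        return mu, logvar

def distillation_loss(teacher, student, frames, beta=4.0):
    """Distill the frozen teacher's latents into the world-model encoder and
    keep a beta-VAE-style factorization (disentanglement) constraint."""
    with torch.no_grad():
        t_mu, _ = teacher(frames)                      # teacher latents (frozen)
    s_mu, s_logvar = student(frames)
    distill = F.mse_loss(s_mu, t_mu)                   # latent distillation term
    kl = -0.5 * torch.mean(1 + s_logvar - s_mu.pow(2) - s_logvar.exp())
    return distill + beta * kl                         # disentanglement constraint

teacher, student = Encoder(), Encoder()
frames = torch.rand(8, 3, 64, 64)
loss = distillation_loss(teacher, student, frames)
loss.backward()
```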
ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning
Yuan, Mingqi, Li, Bo, Jin, Xin, Zeng, Wenjun
Hyperparameter optimization (HPO) is a billion-dollar problem in machine learning, as it significantly impacts training efficiency and model performance. However, achieving efficient and robust HPO in deep reinforcement learning (RL) is consistently challenging due to its high non-stationarity and computational cost. To tackle this problem, existing approaches attempt to adapt common HPO techniques (e.g., population-based training or Bayesian optimization) to the RL scenario. However, they remain sample-inefficient and computationally expensive, which hinders their use in a wide range of applications. In this paper, we propose ULTHO, an ultra-lightweight yet powerful framework for fast HPO in deep RL within single runs. Specifically, we formulate the HPO process as a multi-armed bandit with clustered arms (MABC) and link it directly to long-term return optimization. ULTHO also provides a quantified and statistical perspective to filter HPs efficiently. We test ULTHO on benchmarks including ALE, Procgen, MiniGrid, and PyBullet. Extensive experiments demonstrate that ULTHO can achieve superior performance with a simple architecture, contributing to the development of advanced and automated RL systems.
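The MABC formulation can be pictured as a bandit that scores candidate hyperparameter values inside each cluster and is rewarded by the agent's long-term return. The snippet below is a minimal sketch under these assumptions (UCB scoring within clusters, incremental mean updates, episodic return as feedback); it is not the exact ULTHO algorithm.

```python
# Minimal sketch of a multi-armed bandit with clustered arms (MABC) for
# hyperparameter selection; the UCB rule and return-based feedback are
# illustrative assumptions, not ULTHO's precise procedure.
import math, random

class ClusteredUCB:
    def __init__(self, clusters, c=2.0):
        # clusters: dict mapping cluster name -> list of candidate HP values
        self.clusters = clusters
        self.c = c
        self.counts = {k: [0] * len(v) for k, v in clusters.items()}
        self.values = {k: [0.0] * len(v) for k, v in clusters.items()}
        self.t = 0

    def select(self):
        """Pick one HP value per cluster using a UCB score inside each cluster."""
        self.t += 1
        choice = {}
        for k, arms in self.clusters.items():
            scores = []
            for i in range(len(arms)):
                if self.counts[k][i] == 0:
                    scores.append(float("inf"))
                else:
                    bonus = self.c * math.sqrt(math.log(self.t) / self.counts[k][i])
                    scores.append(self.values[k][i] + bonus)
            choice[k] = max(range(len(arms)), key=scores.__getitem__)
        return choice

    def update(self, choice, episodic_return):
        """Use the long-term return as bandit feedback for the chosen arms."""
        for k, i in choice.items():
            self.counts[k][i] += 1
            n = self.counts[k][i]
            self.values[k][i] += (episodic_return - self.values[k][i]) / n

bandit = ClusteredUCB({"lr": [1e-4, 3e-4, 1e-3], "clip": [0.1, 0.2, 0.3]})
for _ in range(20):
    picked = bandit.select()
    ret = random.random()           # placeholder for the agent's episodic return
    bandit.update(picked, ret)
```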
Adaptive Data Exploitation in Deep Reinforcement Learning
Yuan, Mingqi, Li, Bo, Jin, Xin, Zeng, Wenjun
We introduce ADEPT: Adaptive Data ExPloiTation, a simple yet powerful framework to enhance **data efficiency** and **generalization** in deep reinforcement learning (RL). Specifically, ADEPT adaptively manages the use of sampled data across different learning stages via multi-armed bandit (MAB) algorithms, optimizing data utilization while mitigating overfitting. Moreover, ADEPT can significantly reduce computational overhead and accelerate a wide range of RL algorithms. We test ADEPT on benchmarks including Procgen, MiniGrid, and PyBullet. Extensive simulations demonstrate that ADEPT can achieve superior performance with remarkable computational efficiency, offering a practical solution to data-efficient RL. Our code is available at https://github.com/yuanmingqi/ADEPT.
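To make the bandit-driven data-exploitation idea concrete, the toy sketch below uses an epsilon-greedy bandit to choose how many update epochs to run on each batch of freshly sampled data, with performance improvement as feedback. The arm set, feedback signal, and exploration schedule are assumptions for illustration, not ADEPT's exact design.

```python
# Illustrative sketch of adaptively choosing how aggressively to exploit
# freshly sampled data (e.g., update epochs per rollout) with an epsilon-greedy
# bandit; not ADEPT's actual implementation.
import random

ARMS = [1, 2, 4, 8]                      # candidate update epochs per rollout
counts, values = [0] * len(ARMS), [0.0] * len(ARMS)

def select(eps=0.1):
    if random.random() < eps or 0 in counts:
        return random.randrange(len(ARMS))
    return max(range(len(ARMS)), key=values.__getitem__)

def update(arm, improvement):
    # improvement: change in evaluation return after this round of updates
    counts[arm] += 1
    values[arm] += (improvement - values[arm]) / counts[arm]

for step in range(50):
    arm = select()
    epochs = ARMS[arm]                   # exploit the sampled data `epochs` times
    improvement = random.gauss(0.0, 1.0) # placeholder performance feedback
    update(arm, improvement)
```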
Deep Reinforcement Learning with Hybrid Intrinsic Reward Model
Yuan, Mingqi, Li, Bo, Jin, Xin, Zeng, Wenjun
Intrinsic reward shaping has emerged as a prevalent approach to solving hard-exploration and sparse-reward environments in reinforcement learning (RL). While single intrinsic rewards, such as curiosity-driven or novelty-based methods, have shown effectiveness, they often limit the diversity and efficiency of exploration. Moreover, the potential and principles of combining multiple intrinsic rewards remain insufficiently explored. To address this gap, we introduce HIRE (Hybrid Intrinsic REward), a flexible and elegant framework for creating hybrid intrinsic rewards through deliberate fusion strategies. With HIRE, we conduct a systematic analysis of the application of hybrid intrinsic rewards in both general and unsupervised RL across multiple benchmarks. Extensive experiments demonstrate that HIRE can significantly enhance exploration efficiency and diversity, as well as skill acquisition in complex and dynamic settings.
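The core idea of fusing several intrinsic reward sources can be sketched as a single function that combines per-source signals under different strategies. The fusion strategies shown below (weighted sum, product, maximum) are generic examples, not necessarily the exact set studied in HIRE.

```python
# Minimal sketch of fusing multiple intrinsic reward signals; the strategies
# are illustrative, not the paper's definitive fusion scheme.
import numpy as np

def fuse_intrinsic(rewards, strategy="sum", weights=None):
    """rewards: (num_sources, batch) array of per-source intrinsic rewards."""
    rewards = np.asarray(rewards, dtype=np.float64)
    if weights is None:
        weights = np.ones(rewards.shape[0]) / rewards.shape[0]
    if strategy == "sum":        # weighted mixture of sources
        return np.tensordot(weights, rewards, axes=1)
    if strategy == "product":    # sources must agree for the reward to stay large
        return np.prod(rewards, axis=0)
    if strategy == "max":        # the most "surprised" source dominates
        return np.max(rewards, axis=0)
    raise ValueError(f"unknown fusion strategy: {strategy}")

curiosity = np.random.rand(4)    # e.g., prediction-error-based signal
novelty   = np.random.rand(4)    # e.g., episodic state novelty
r_int = fuse_intrinsic([curiosity, novelty], strategy="sum")
```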
Proactive Agents for Multi-Turn Text-to-Image Generation Under Uncertainty
Hahn, Meera, Zeng, Wenjun, Kannen, Nithish, Galt, Rich, Badola, Kartikeya, Kim, Been, Wang, Zi
User prompts for generative AI models are often underspecified, leading to sub-optimal responses. This problem is particularly evident in text-to-image (T2I) generation, where users commonly struggle to articulate their precise intent. This disconnect between the user's vision and the model's interpretation often forces users to painstakingly and repeatedly refine their prompts. To address this, we propose a design for proactive T2I agents equipped with an interface to (1) actively ask clarification questions when uncertain, and (2) present their understanding of user intent as an understandable belief graph that a user can edit. We build simple prototypes for such agents and verify their effectiveness through both human studies and automated evaluation. We observed that at least 90% of human subjects found these agents and their belief graphs helpful for their T2I workflow. Moreover, we develop a scalable automated evaluation approach using two agents: one holds a ground-truth image, and the other tries to ask as few questions as possible to align with it. On DesignBench, a benchmark we created for artists and designers, the COCO dataset (Lin et al., 2014), and ImageInWords (Garg et al., 2024), we observed that these T2I agents were able to ask informative questions and elicit crucial information, achieving successful alignment with at least 2 times higher VQAScore (Lin et al., 2024) than standard single-turn T2I generation. Demo: https://github.com/google-deepmind/proactive_t2i_agents.
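The two-agent evaluation loop can be pictured as follows: a questioner agent queries an oracle that holds the ground-truth description until it is confident, then generates and is scored. The sketch below uses stub functions and a toy stopping rule purely to show the control flow; it is not the paper's implementation, and the scoring function stands in for VQAScore.

```python
# Schematic two-agent evaluation loop: an "oracle" holds a ground-truth
# description, a T2I agent asks clarification questions, then is scored.
# All function bodies are stand-ins, not the actual system.
def ask_question(belief):
    return None if belief["confident"] else "What style should the image have?"

def answer_from_ground_truth(question, ground_truth):
    return f"(answer derived from: {ground_truth})"

def update_belief(belief, question, answer):
    belief["facts"].append((question, answer))
    belief["confident"] = len(belief["facts"]) >= 2   # toy stopping rule
    return belief

def generate_and_score(belief, ground_truth):
    return 0.5 + 0.25 * len(belief["facts"])          # placeholder for VQAScore

ground_truth = "a watercolor painting of a red lighthouse at dusk"
belief = {"facts": [], "confident": False}
while (q := ask_question(belief)) is not None:
    a = answer_from_ground_truth(q, ground_truth)
    belief = update_belief(belief, q, a)
score = generate_and_score(belief, ground_truth)
print(f"alignment score after {len(belief['facts'])} questions: {score}")
```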
Open-World Reinforcement Learning over Long Short-Term Imagination
Li, Jiajian, Wang, Qi, Wang, Yunbo, Jin, Xin, Li, Yang, Zeng, Wenjun, Yang, Xiaokang
Training visual reinforcement learning agents in a high-dimensional open world presents significant challenges. While various model-based methods have improved sample efficiency by learning interactive world models, these agents tend to be "short-sighted", as they are typically trained on short snippets of imagined experiences. We argue that the primary obstacle in open-world decision-making is improving the efficiency of off-policy exploration across an extensive state space. In this paper, we present LS-Imagine, which extends the imagination horizon within a limited number of state transition steps, enabling the agent to explore behaviors that potentially lead to promising long-term feedback. The foundation of our approach is to build a long short-term world model. To achieve this, we simulate goal-conditioned jumpy state transitions and compute corresponding affordance maps by zooming in on specific areas within single images. This facilitates the integration of direct long-term values into behavior learning. Our method demonstrates significant improvements over state-of-the-art techniques in MineDojo.
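One way to picture mixing short-term steps with "jumpy" long-term transitions is to let every imagined transition carry the number of environment steps it spans, so long-horizon feedback is discounted by the elapsed interval rather than the rollout index. The toy sketch below illustrates that interval-aware return computation; the variable names and return rule are assumptions, not LS-Imagine's exact objective.

```python
# Toy sketch of interval-aware returns over a mixed rollout of one-step and
# "jumpy" imagined transitions.
def mixed_horizon_return(rewards, skips, gamma=0.99):
    """rewards[i]: imagined reward of transition i; skips[i]: env steps it spans."""
    ret, elapsed = 0.0, 0
    for r, dt in zip(rewards, skips):
        ret += (gamma ** elapsed) * r
        elapsed += dt
    return ret

# Three short-term steps followed by one jumpy transition spanning 50 steps.
rewards = [0.1, 0.1, 0.1, 5.0]
skips = [1, 1, 1, 50]
print(mixed_horizon_return(rewards, skips))
```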
HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects
Lv, Xintao, Xu, Liang, Yan, Yichao, Jin, Xin, Xu, Congsheng, Wu, Shuwen, Liu, Yifan, Li, Lincheng, Bi, Mengxiao, Zeng, Wenjun, Yang, Xiaokang
Generating human-object interactions (HOIs) is critical with the tremendous advances of digital avatars. Existing datasets are typically limited to humans interacting with a single object, neglecting the ubiquitous manipulation of multiple objects. Thus, we propose HIMO, a large-scale MoCap dataset of full-body humans interacting with multiple objects, containing 3.3K 4D HOI sequences and 4.08M 3D HOI frames. We also annotate HIMO with detailed textual descriptions and temporal segments, benchmarking two novel tasks of HOI synthesis conditioned on either the whole text prompt or the segmented text prompts for fine-grained timeline control. To address these novel tasks, we propose a dual-branch conditional diffusion model with a mutual interaction module for HOI synthesis. Besides, an auto-regressive generation pipeline is also designed to obtain smooth transitions between HOI segments. Experimental results demonstrate our method's generalization ability to unseen object geometries and temporal compositions.
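A "mutual interaction" block between two branches can be sketched as bidirectional cross-attention: the human-motion branch attends to the object branch and vice versa, with residual connections. The layer sizes and wiring below are assumptions made for illustration, not the paper's architecture.

```python
# Hypothetical sketch of a mutual interaction block built from cross-attention.
import torch
import torch.nn as nn

class MutualInteraction(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.h2o = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.o2h = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, human, obj):
        # Each branch queries the other and keeps a residual connection.
        h_out, _ = self.o2h(human, obj, obj)    # human attends to objects
        o_out, _ = self.h2o(obj, human, human)  # objects attend to human
        return human + h_out, obj + o_out

block = MutualInteraction()
human_tokens = torch.rand(2, 120, 256)   # (batch, frames, features)
object_tokens = torch.rand(2, 120, 256)
human_tokens, object_tokens = block(human_tokens, object_tokens)
```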
Real-Time Dynamic Robot-Assisted Hand-Object Interaction via Motion Primitives
Yuan, Mingqi, Wang, Huijiang, Chu, Kai-Fung, Iida, Fumiya, Li, Bo, Zeng, Wenjun
Advances in artificial intelligence (AI) have been propelling the evolution of human-robot interaction (HRI) technologies. However, significant challenges remain in achieving seamless interactions, particularly in tasks requiring physical contact with humans. These challenges arise from the need for accurate real-time perception of human actions, adaptive control algorithms for robots, and the effective coordination between human and robotic movements. In this paper, we propose an approach to enhancing physical HRI with a focus on dynamic robot-assisted hand-object interaction (HOI). Our methodology integrates hand pose estimation, adaptive robot control, and motion primitives to facilitate human-robot collaboration. Specifically, we employ a transformer-based algorithm to perform real-time 3D modeling of human hands from single RGB images, based on which a motion primitives model (MPM) is designed to translate human hand motions into robotic actions. The robot's action implementation is dynamically fine-tuned using the continuously updated 3D hand models. Experimental validations, including a ring-wearing task, demonstrate the system's effectiveness in adapting to real-time movements and assisting in precise task execution.
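The mapping from estimated hand motion to robot actions can be pictured as classifying keypoint displacements into a small set of primitives, each bound to a robot command. The primitive set, thresholds, and action names below are illustrative placeholders, not the paper's MPM.

```python
# Simplified sketch of a motion-primitives mapping from hand motion to robot
# actions; all names and thresholds are hypothetical.
import numpy as np

PRIMITIVE_ACTIONS = {
    "approach": "move_end_effector_forward",
    "retreat":  "move_end_effector_backward",
    "hold":     "maintain_pose",
}

def classify_primitive(prev_wrist, curr_wrist, thresh=0.01):
    """prev_wrist/curr_wrist: 3D wrist positions from the hand-pose estimator."""
    delta = np.asarray(curr_wrist) - np.asarray(prev_wrist)
    if np.linalg.norm(delta) < thresh:
        return "hold"
    return "approach" if delta[2] > 0 else "retreat"   # sign of depth motion

prev, curr = [0.0, 0.0, 0.30], [0.0, 0.0, 0.33]
primitive = classify_primitive(prev, curr)
print(primitive, "->", PRIMITIVE_ACTIONS[primitive])
```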
RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning
Yuan, Mingqi, Castanyer, Roger Creus, Li, Bo, Jin, Xin, Berseth, Glen, Zeng, Wenjun
Extrinsic rewards can effectively guide reinforcement learning (RL) agents in specific tasks. However, extrinsic rewards frequently fall short in complex environments due to the significant human effort needed for their design and annotation. This limitation underscores the necessity for intrinsic rewards, which offer auxiliary and dense signals and can enable agents to learn in an unsupervised manner. Although various intrinsic reward formulations have been proposed, their implementation and optimization details are insufficiently explored and lack standardization, thereby hindering research progress. To address this gap, we introduce RLeXplore, a unified, highly modularized, and plug-and-play framework offering reliable implementations of eight state-of-the-art intrinsic reward algorithms. Furthermore, we conduct an in-depth study that identifies critical implementation details and establishes well-justified standard practices in intrinsically-motivated RL.
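To illustrate what "plug-and-play" means in this context, the sketch below defines a shared interface that intrinsic reward modules implement so they can be swapped into an RL loop and mixed with the extrinsic reward. This is explicitly not RLeXplore's actual API, just a hypothetical minimal interface with a toy novelty signal.

```python
# Generic illustration of a plug-and-play intrinsic reward interface
# (hypothetical; not RLeXplore's API).
import numpy as np

class IntrinsicReward:
    def compute(self, obs, next_obs, action):
        raise NotImplementedError

class RandomProjectionNovelty(IntrinsicReward):
    """Toy novelty signal: distance between random projections of observations."""
    def __init__(self, obs_dim, feat_dim=16, seed=0):
        rng = np.random.default_rng(seed)
        self.proj = rng.normal(size=(obs_dim, feat_dim))

    def compute(self, obs, next_obs, action):
        return float(np.linalg.norm((next_obs - obs) @ self.proj))

def total_reward(obs, next_obs, action, extrinsic, module, beta=0.01):
    """Mix the extrinsic reward with the module's intrinsic bonus."""
    return extrinsic + beta * module.compute(obs, next_obs, action)

module = RandomProjectionNovelty(obs_dim=4)
obs, next_obs = np.zeros(4), np.ones(4)
r = total_reward(obs, next_obs, action=0, extrinsic=1.0, module=module)
```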
ReGenNet: Towards Human Action-Reaction Synthesis
Xu, Liang, Zhou, Yizhou, Yan, Yichao, Jin, Xin, Zhu, Wenhan, Rao, Fengyun, Yang, Xiaokang, Zeng, Wenjun
Generative models for static scenes and objects have been widely studied, while the dynamic human action-reaction synthesis for ubiquitous causal human-human interactions, i.e., generating human reactions given the action sequence of another person as conditions, is less explored. Human-human interactions can be regarded as asymmetric, with actors and reactors in atomic interaction periods. In this paper, we comprehensively analyze the asymmetric, dynamic, synchronous, and detailed nature of human-human interactions and propose the first multi-setting human action-reaction synthesis benchmark to generate human reactions conditioned on given human actions. To begin with, we propose to annotate the actor-reactor order of the interaction sequences for the NTU120, InterHuman, and Chi3D datasets. Based on them, a diffusion-based generative model with a Transformer decoder architecture, called ReGenNet, together with an explicit distance-based interaction loss, is proposed to predict human reactions in an online manner, where the future states of actors are unavailable to reactors.
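One plausible reading of an explicit distance-based interaction loss is to supervise the predicted reactor joints through their distances to the actor's joints, so relative placement between the two people is penalized directly. The sketch below illustrates that idea; the exact pairing and weighting are assumptions, not the paper's formulation.

```python
# Minimal sketch of a distance-based interaction loss between actor and reactor
# joint positions; illustrative, not the paper's exact loss.
import torch

def interaction_loss(pred_reactor, gt_reactor, actor):
    """All tensors: (batch, frames, joints, 3) joint positions."""
    # Pairwise distances between every reactor joint and every actor joint.
    d_pred = torch.cdist(pred_reactor.flatten(0, 1), actor.flatten(0, 1))
    d_gt = torch.cdist(gt_reactor.flatten(0, 1), actor.flatten(0, 1))
    return torch.mean(torch.abs(d_pred - d_gt))

b, t, j = 2, 16, 22
pred = torch.rand(b, t, j, 3, requires_grad=True)
gt, actor = torch.rand(b, t, j, 3), torch.rand(b, t, j, 3)
loss = interaction_loss(pred, gt, actor)
loss.backward()
```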