Wu, Lisheng
Goal Exploration via Adaptive Skill Distribution for Goal-Conditioned Reinforcement Learning
Wu, Lisheng, Chen, Ke
Exploration efficiency poses a significant challenge in goal-conditioned reinforcement learning (GCRL) tasks, particularly those with long horizons and sparse rewards. A primary limitation on exploration efficiency is the agent's inability to leverage environmental structural patterns. In this study, we introduce a novel framework, GEASD, designed to capture these patterns through an adaptive skill distribution during the learning process. This distribution optimizes the local entropy of achieved goals within a contextual horizon, enhancing goal-spreading behaviors and facilitating deep exploration in states containing familiar structural patterns. Our experiments reveal marked improvements in exploration efficiency using the adaptive skill distribution compared to a uniform skill distribution. Additionally, the learned skill distribution demonstrates robust generalization capabilities, achieving substantial exploration progress in unseen tasks containing similar local structures.
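As a rough illustration of the adaptive skill distribution described above, the sketch below keeps a running estimate of the local goal-entropy gain each skill produces and samples skills from a softmax over those gains. This is a minimal sketch under our own assumptions, not the GEASD implementation; the names (SkillSelector, local_entropy) are illustrative.

```python
import numpy as np

def local_entropy(goals, bins=10):
    """Shannon entropy of achieved goals, discretized into a histogram."""
    hist, _ = np.histogramdd(np.asarray(goals, dtype=float), bins=bins)
    p = hist.ravel() / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

class SkillSelector:
    """Adaptive distribution over a fixed set of pre-defined skills."""

    def __init__(self, n_skills, temperature=1.0):
        self.n_skills = n_skills
        self.temperature = temperature
        self.gain = np.zeros(n_skills)  # running entropy-gain estimate per skill

    def distribution(self):
        # Softmax over estimated entropy gains: skills that spread the
        # achieved goals more get sampled more often.
        z = self.gain / self.temperature
        z -= z.max()
        p = np.exp(z)
        return p / p.sum()

    def sample(self, rng):
        return int(rng.choice(self.n_skills, p=self.distribution()))

    def update(self, skill, goals_before, goals_after, lr=0.1):
        # Credit the executed skill with the change in local goal entropy.
        delta = local_entropy(goals_after) - local_entropy(goals_before)
        self.gain[skill] += lr * (delta - self.gain[skill])
```

In this toy version, the temperature controls how strongly the agent favours skills that have recently spread its achieved goals over a uniform choice.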
Goal Exploration Augmentation via Pre-trained Skills for Sparse-Reward Long-Horizon Goal-Conditioned Reinforcement Learning
Wu, Lisheng, Chen, Ke
Reinforcement learning (RL) often struggles to accomplish a sparse-reward long-horizon task in a complex environment. Goal-conditioned reinforcement learning (GCRL) has been employed to tackle this difficult problem via a curriculum of easy-to-reach sub-goals. In GCRL, exploring novel sub-goals is essential for the agent to ultimately find the pathway to the desired goal. How to explore novel sub-goals efficiently is one of the most challenging issues in GCRL. Several goal exploration methods have been proposed to address this issue but still struggle to find the desired goals efficiently. In this paper, we propose a novel learning objective that optimizes the entropy of both achieved and new goals to be explored, for more efficient goal exploration in sub-goal-selection-based GCRL. To optimize this objective, we first explore and exploit the frequently occurring goal-transition patterns mined from environments similar to the current task to compose skills via skill learning. The pre-trained skills are then applied to goal exploration. Evaluation on a variety of sparse-reward long-horizon benchmark tasks suggests that incorporating our method into several state-of-the-art GCRL baselines significantly boosts their exploration efficiency while improving or maintaining their performance. The source code is available at: https://github.com/GEAPS/GEAPS.
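The learning objective above combines the entropy of achieved goals with that of the goals to be explored. The sketch below renders one plausible form of such an objective using a k-nearest-neighbour entropy estimate; it is an assumption-laden illustration, not the released GEAPS code (see the repository linked above), and knn_entropy and exploration_objective are hypothetical names.

```python
import numpy as np

def knn_entropy(points, k=5):
    """Kozachenko-Leonenko-style entropy estimate (up to additive constants)."""
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    assert n > k, "need more points than neighbours"
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)           # exclude self-distances
    kth = np.sort(dists, axis=1)[:, k - 1]    # distance to the k-th neighbour
    return float(d * np.log(np.maximum(kth, 1e-12)).mean())

def exploration_objective(achieved_goals, new_goals, alpha=0.5):
    """Weighted entropy of achieved goals and of achieved-plus-new goals."""
    joint = np.concatenate([achieved_goals, new_goals], axis=0)
    return alpha * knn_entropy(achieved_goals) + (1.0 - alpha) * knn_entropy(joint)
```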
Bias Resilient Multi-Step Off-Policy Goal-Conditioned Reinforcement Learning
Wu, Lisheng, Chen, Ke
In goal-conditioned reinforcement learning (GCRL), sparse rewards present significant challenges, often obstructing efficient learning. Although multi-step GCRL can boost this efficiency, it can also lead to off-policy biases in target values. This paper examines these biases in depth, distinguishing two categories: "shooting" and "shifting". Recognizing that certain behavior policies can hasten policy refinement, we present solutions designed to capitalize on the positive aspects of these biases while minimizing their drawbacks, enabling the use of larger step sizes to speed up GCRL. An empirical study demonstrates that our approach ensures resilient and robust improvement, even in ten-step learning scenarios, leading to superior learning efficiency and performance that generally surpass the baseline and several state-of-the-art multi-step GCRL benchmarks.
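For context, the sketch below shows the vanilla n-step target that multi-step GCRL builds on; on our reading, the "shooting" and "shifting" biases discussed above enter because both the summed rewards and the bootstrap state come from a behavior policy that may differ from the current one. The bias handling itself is the paper's contribution and is not reproduced here.

```python
def n_step_target(rewards, bootstrap_q, gamma=0.99):
    """Discounted n-step return bootstrapped with Q(s_{t+n}, a, g).

    rewards = [r_t, ..., r_{t+n-1}] collected by the behavior policy;
    bootstrap_q is the critic's value at the state reached after n steps.
    """
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g + gamma ** len(rewards) * bootstrap_q
```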
Multi-View Reinforcement Learning
Li, Minne, Wu, Lisheng, Wang, Jun, Ammar, Haitham Bou
This paper is concerned with multi-view reinforcement learning (MVRL), which allows for decision making when agents share common dynamics but adhere to different observation models. We define the MVRL framework by extending partially observable Markov decision processes (POMDPs) to support more than one observation model and propose two solution methods through observation augmentation and cross-view policy transfer. We empirically evaluate our method and demonstrate its effectiveness in a variety of environments. Specifically, we show reductions in sample complexities and computational time for acquiring policies that handle multi-view environments.
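One of the two solution methods named above is observation augmentation. A minimal reading of that idea, sketched below under our own assumptions, is to tag each view's observation with a view indicator so a single policy can condition on which observation model produced its input; augment_observation is an illustrative name, not the paper's API.

```python
import numpy as np

def augment_observation(obs, view_id, n_views):
    """Append a one-hot view indicator so one policy can condition on the view."""
    one_hot = np.zeros(n_views, dtype=float)
    one_hot[view_id] = 1.0
    return np.concatenate([np.ravel(obs).astype(float), one_hot])
```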
Learning Shared Dynamics with Meta-World Models
Wu, Lisheng, Li, Minne, Wang, Jun
Humans possess consciousness, the ability to perceive events and objects: a mental model of the world, developed from the most impoverished of visual stimuli, that enables humans to make rapid decisions and take actions. Although the spatial and temporal aspects of different scenes are generally diverse, the underlying physics works the same way across environments, so learning an abstract description of these shared physical dynamics helps humans understand the world. In this paper, we explore building this mental world with neural network models through multi-task learning, namely the meta-world model. We show through extensive experiments that our proposed meta-world models successfully capture the common dynamics over compact representations of visually different environments from Atari games. We also demonstrate that agents equipped with our meta-world model possess the ability of visual self-recognition, i.e., they recognize themselves in the reflected, mirrored environment derived from the classic mirror self-recognition test (MSR).
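A minimal PyTorch sketch of the shared-dynamics idea follows: per-environment encoders map visually different observations into a common latent space, and a single shared network predicts the next latent. The architecture and names (MetaWorldModel, latent_dim) are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class MetaWorldModel(nn.Module):
    """Per-environment encoders feeding one shared latent dynamics model."""

    def __init__(self, obs_dims, action_dim, latent_dim=64):
        super().__init__()
        # One encoder per environment maps its observation into a shared space.
        self.encoders = nn.ModuleList(
            nn.Sequential(nn.Linear(d, latent_dim), nn.ReLU()) for d in obs_dims
        )
        # The dynamics core is shared across all environments.
        self.dynamics = nn.Sequential(
            nn.Linear(latent_dim + action_dim, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, latent_dim),
        )

    def forward(self, obs, action, env_id):
        z = self.encoders[env_id](obs)
        return self.dynamics(torch.cat([z, action], dim=-1))
```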
Learning Multi-agent Implicit Communication Through Actions: A Case Study in Contract Bridge, a Collaborative Imperfect-Information Game
Tian, Zheng, Zou, Shihao, Warr, Tim, Wu, Lisheng, Wang, Jun
In situations where explicit communication is limited, a human collaborator is typically able to learn to: (i) infer the meaning behind their partner's actions and (ii) balance between taking actions that are exploitative given their current understanding of the state and those that convey private information about the state to their partner. The first component of this learning process has been well studied in multi-agent systems, whereas the second, which is equally crucial for a successful collaboration, has not. In this work, we complete the learning process and introduce our novel algorithm, Policy-Belief-Iteration ("P-BIT"), which mimics both components mentioned above. A belief module models the other agent's private information by observing their actions, whilst a policy module makes use of the inferred private information to return a distribution over actions. The two are mutually reinforced with an EM-like algorithm, and a novel auxiliary reward encourages information exchange through actions. We evaluate our approach on the non-competitive bidding problem from contract bridge and show that, through self-play, agents are able to collaborate effectively with implicit communication, with P-BIT outperforming several meaningful baselines.
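The auxiliary reward above encourages information exchange through actions. One simple way to realize such a signal, sketched below purely as an assumption rather than the paper's definition, is to reward an action by how much it increases the partner's belief in the actor's true private state; belief_auxiliary_reward is a hypothetical name.

```python
import numpy as np

def belief_auxiliary_reward(belief_before, belief_after, true_state, eps=1e-12):
    """Log-probability gain the partner's belief assigns to the true private state."""
    return float(np.log(belief_after[true_state] + eps)
                 - np.log(belief_before[true_state] + eps))
```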