
A Review for Deep Reinforcement Learning in Atari: Benchmarks, Challenges, and Solutions

Fan, Jiajun

arXiv.org Artificial Intelligence

The Arcade Learning Environment (ALE) was proposed as an evaluation platform for empirically assessing the generality of agents across dozens of Atari 2600 games. ALE offers a variety of challenging problems and has drawn significant attention from the deep reinforcement learning (RL) community. From Deep Q-Networks (DQN) to Agent57, RL agents seem to achieve superhuman performance in ALE. However, is this the case? In this paper, to explore this question, we first review the current evaluation metrics in the Atari benchmarks and then reveal that the current criteria for claiming superhuman performance are inappropriate: they underestimate human performance relative to what is possible. To address these problems and promote the development of RL research, we propose a novel Atari benchmark based on human world records (HWR), which sets a higher bar for RL agents in both final performance and learning efficiency. Furthermore, we summarize the state-of-the-art (SOTA) methods on the Atari benchmarks and provide benchmark results over the new evaluation metrics based on human world records. From these new benchmark results, we conclude that at least four open challenges hinder RL agents from achieving superhuman performance. Finally, we discuss some promising directions for addressing these challenges.
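To make the contrast between the two evaluation criteria concrete, here is a minimal sketch of the two per-game normalizations the abstract refers to. The standard human normalized score (HNS) formula is well established; the HWR-based variant simply swaps the average-human baseline for the human world record, and the function names and example numbers below are illustrative placeholders, not the paper's actual definitions or data.

```python
# Minimal sketch of the two per-game normalizations discussed above.
# human_normalized_score follows the standard HNS definition; the
# HWR-based variant (hypothetical name) replaces the average-human
# baseline with the human world record, raising the bar for
# "superhuman" claims.

def human_normalized_score(agent: float, random: float, human: float) -> float:
    """Standard HNS: 1.0 means parity with the average human tester."""
    return (agent - random) / (human - random)

def hwr_normalized_score(agent: float, random: float, world_record: float) -> float:
    """HWR-based score: 1.0 means parity with the human world record."""
    return (agent - random) / (world_record - random)

if __name__ == "__main__":
    # Placeholder baselines (not the paper's numbers): an agent can look
    # superhuman under HNS yet fall far short of the world record.
    agent, rnd, human, record = 12_000.0, 150.0, 7_000.0, 1_000_000.0
    print(f"HNS:   {human_normalized_score(agent, rnd, human):.2f}")   # > 1.0
    print(f"HWRNS: {hwr_normalized_score(agent, rnd, record):.4f}")    # << 1.0
```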


GDI: Rethinking What Makes Reinforcement Learning Different From Supervised Learning

Fan, Jiajun, Xiao, Changnan, Huang, Yue

arXiv.org Artificial Intelligence

The Deep Q-Network (DQN) first opened the door to deep reinforcement learning (DRL) by combining deep learning (DL) with reinforcement learning (RL), and it observed that the distribution of the acquired data changes during training. Because this property can destabilize training, DQN proposed effective methods to mitigate its downsides. Instead of focusing on the unfavourable aspects, we find it critical that RL can narrow the gap between the estimated data distribution and the ground-truth data distribution, whereas supervised learning (SL) fails to do so. From this new perspective, we extend the basic paradigm of RL, Generalized Policy Iteration (GPI), into a more general version called Generalized Data Distribution Iteration (GDI). Many RL algorithms and techniques can be unified under the GDI paradigm, of which GPI is a special case. We provide a theoretical proof of why GDI is better than GPI and of how it works. Several practical algorithms based on GDI have been proposed to verify its effectiveness and generality. Empirical experiments demonstrate our state-of-the-art (SOTA) performance on the Arcade Learning Environment (ALE), where our algorithm achieves a 9620.98% mean human normalized score (HNS), a 1146.39% median HNS, and 22 human world record breakthroughs (HWRB) using only 200M training frames. Our work aims to lead RL research into the journey of conquering human world records and to seek truly superhuman agents in both performance and efficiency.
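The abstract describes GDI as GPI plus an extra iteration step over the data-generating distribution. The schematic sketch below is our reading of that structure only: the callables and names (sample_data, update_distribution, and so on) are illustrative assumptions, not the authors' API, and the loop is meant to show where the distribution step sits relative to the familiar evaluation/improvement steps of GPI.

```python
# Schematic sketch of the GDI loop as described above, contrasted with
# plain GPI. GPI alternates policy evaluation and policy improvement;
# GDI additionally iterates the distribution that generates the data.
# All names here are illustrative, not the paper's implementation.
from typing import Callable, List, Tuple

Transition = Tuple[object, object, float, object]  # (state, action, reward, next_state)

def gdi_loop(
    sample_data: Callable[[dict], List[Transition]],    # act under the current data distribution
    evaluate: Callable[[List[Transition]], dict],       # GPI step: policy evaluation
    improve: Callable[[dict], dict],                    # GPI step: policy improvement
    update_distribution: Callable[[dict, dict], dict],  # extra GDI step: move the
                                                        # data-generating distribution toward
                                                        # the ground-truth data distribution
    distribution: dict,
    iterations: int = 100,
) -> dict:
    policy: dict = {}
    for _ in range(iterations):
        batch = sample_data(distribution)       # data comes from an evolving distribution
        values = evaluate(batch)                # evaluation, as in GPI
        policy = improve(values)                # improvement, as in GPI
        distribution = update_distribution(distribution, values)  # the GDI addition
    return policy
```

Dropping the update_distribution step recovers ordinary GPI with a fixed behavior distribution, which is the sense in which GPI is a special case of GDI.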