AITopics | atari 2600

These counts are then used to compute a reward bonus according to the classic count-based exploration theory. We find that simple hash functions can achieve surprisingly good results on many challenging tasks. Furthermore, we show that a domain-dependent learned hash code may further improve these results.
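The bonus computation described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the class name `HashCounter`, the random-projection (SimHash-style) hash, the bonus form `beta / sqrt(count)`, and the parameters `n_bits`, `beta`, and `seed` are all assumptions made for the example.

```python
import math
import random
from collections import defaultdict


class HashCounter:
    """Counts visits to hashed states and returns a count-based bonus.

    Hypothetical helper: the hash function and bonus form are assumptions.
    """

    def __init__(self, state_dim, n_bits=8, beta=0.1, seed=0):
        rng = random.Random(seed)
        # Fixed random projection: one Gaussian row per hash bit.
        self.rows = [[rng.gauss(0.0, 1.0) for _ in range(state_dim)]
                     for _ in range(n_bits)]
        self.beta = beta
        self.counts = defaultdict(int)

    def hash_state(self, state):
        # The sign of each projection gives one bit of the hash code.
        return tuple(
            1 if sum(w * x for w, x in zip(row, state)) >= 0 else 0
            for row in self.rows
        )

    def bonus(self, state):
        # Increment the count for this code, then return beta / sqrt(count).
        code = self.hash_state(state)
        self.counts[code] += 1
        return self.beta / math.sqrt(self.counts[code])


counter = HashCounter(state_dim=3)
first = counter.bonus([1.0, 0.0, 0.0])
second = counter.bonus([1.0, 0.0, 0.0])
```

Here the bonus shrinks each time the same hash code is revisited, so rarely seen states earn more exploration reward than familiar ones. Fewer hash bits merge more states into one bucket; more bits keep them apart.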

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Belgium > Flanders (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Reconciling λ-Returns with Experience Replay

Brett Daley, Christopher Amato

Neural Information Processing Systems, Oct-3-2025, 08:18:50 GMT

A unique benefit to this approach is that each transition's TD error can be

arxiv preprint arxiv, learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Canada (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems, Oct-3-2025, 00:07:09 GMT

This paper studies a number of variations on the topic of training a deep network using data generated by a Monte-Carlo Tree Search (MCTS) agent. The paper focuses on the Atari 2600 platform and is motivated by the observation that, while MCTS performs extremely well on Atari 2600 games, it is also too computationally expensive to be used in a realistic setting. The authors provide empirical results on a number of Atari 2600 games.

cc paperinformation reviewerinstruction, deep network, trajectory, (10 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.47)

Industry: Leisure & Entertainment > Games (0.53)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

over the local

Neural Information Processing Systems, Aug-15-2025, 13:29:03 GMT

We provide a visual overview of all eight environments considered in Figure 7. CartPole

executor, graph, transition model, (13 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.71)

Adaptive Action Duration with Contextual Bandits for Deep Reinforcement Learning in Dynamic Environments

Verma, Abhishek; V, Nallarasan; Ravindran, Balaraman

arXiv.org Artificial Intelligence, Jul-2-2025

Deep Reinforcement Learning (DRL) has achieved remarkable success in complex sequential decision-making tasks, such as playing Atari 2600 games and mastering board games. A critical yet underexplored aspect of DRL is the temporal scale of action execution. We propose a novel paradigm that integrates contextual bandits with DRL to adaptively select action durations, enhancing policy flexibility and computational efficiency. Our approach augments a Deep Q-Network (DQN) with a contextual bandit module that learns to choose optimal action repetition rates based on state contexts. Experiments on Atari 2600 games demonstrate significant performance improvements over static duration baselines, highlighting the efficacy of adaptive temporal abstractions in DRL. This paradigm offers a scalable solution for real-time applications like gaming and robotics, where dynamic action durations are critical.
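The bandit module described in this abstract could look roughly like the sketch below. The abstract does not specify the bandit algorithm, so everything here is an assumption: the class name `DurationBandit`, the epsilon-greedy selection rule, the per-arm linear reward model, the candidate durations `(1, 2, 4, 8)`, and the learning rate are all chosen only for illustration.

```python
import random


class DurationBandit:
    """Epsilon-greedy linear contextual bandit over candidate repeat counts.

    Hypothetical sketch: not the authors' module.
    """

    def __init__(self, context_dim, durations=(1, 2, 4, 8), epsilon=0.1,
                 lr=0.05, seed=0):
        self.durations = durations
        self.epsilon = epsilon
        self.lr = lr
        self.rng = random.Random(seed)
        # One linear reward model per candidate duration.
        self.weights = [[0.0] * context_dim for _ in durations]

    def predict(self, arm, context):
        return sum(w * x for w, x in zip(self.weights[arm], context))

    def select(self, context):
        # Explore with probability epsilon, otherwise pick the best predicted arm.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.durations))
        scores = [self.predict(a, context) for a in range(len(self.durations))]
        return scores.index(max(scores))

    def update(self, arm, context, reward):
        # One SGD step on the squared error between predicted and observed reward.
        error = reward - self.predict(arm, context)
        self.weights[arm] = [w + self.lr * error * x
                             for w, x in zip(self.weights[arm], context)]
```

In a full agent, the DQN would pick the action, `select` would pick how many frames to repeat it, and the reward accumulated over those frames would be passed to `update`. How the context vector is built from the state is left open here.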

artificial intelligence, deep reinforcement learning, machine learning, (11 more...)

arXiv.org Artificial Intelligence

2507.0003

Country: Asia > India > Tamil Nadu (0.15)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games > Computer Games (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

ChatGPT gets 'wrecked' by a simple 1977 Atari chess program

PCWorld, Jun-11-2025, 16:12:37 GMT

Despite ever-growing interest in AI tools and assistants, it's worth remembering that they're still quite limited with numerous shortcomings. They are not as smart as they might seem on the surface. Case in point, ChatGPT is pretty useless when it comes to playing chess. As reported by Futurism, ChatGPT lost a chess game against the classic Atari 2600 gaming console. Robert Caruso, an engineer at Citrix, organised the game between the AI and a simple 1977 chess program released for the Atari 2600.

chess program, large language model, machine learning, (9 more...)

PCWorld

Industry: Leisure & Entertainment > Games > Chess (1.00)

Technology: