AITopics | exploration game


exploration game


If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Playing hard exploration games by watching YouTube

Yusuf Aytar, Tobias Pfaff, David Budden, Thomas Paine, Ziyu Wang, Nando de Freitas

Neural Information Processing Systems | Nov-20-2025, 23:26:03 GMT

Deep reinforcement learning methods traditionally struggle with tasks where environment rewards are particularly sparse.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Review for NeurIPS paper: Non-Crossing Quantile Regression for Distributional Reinforcement Learning

Neural Information Processing Systems | Jan-27-2025, 20:55:50 GMT

Weaknesses:
- Baseline algorithm: While all quantile-based distributional RL algorithms suffer from the crossing quantile issue, QR-DQN is the least affected one since its quantiles are uniformly fixed. IQN[1], which uses randomly sampled quantiles, and FQF[2], which optimizes over chosen quantiles for better distribution approximation, are both expected to suffer much more from crossing quantiles than QR-DQN. While it may be non-trivial to adapt the NC architecture to IQN since the quantiles are randomly sampled, it shouldn't be hard to adapt it to FQF. Besides, IQN and FQF have both achieved much higher scores than QR-DQN, hence I believe implementing the NC architecture on IQN and FQF would greatly strengthen the empirical validation. Can the authors explain why only 49 out of 57 games are used for evaluation?
- Number of quantiles: I believe that N 100 quantiles is a reasonable choice.
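To make the "crossing quantile" issue in this review concrete: quantile estimates should be non-decreasing as the quantile level rises, and a crossing is any place where that ordering is violated. The following is a minimal sketch under stated assumptions. The function names and the example numbers are hypothetical, and the sort-based repair is a naive post-hoc fix shown only for contrast; it is not the NC architecture the reviewed paper proposes.

```python
from typing import List, Sequence


def count_crossings(quantile_values: Sequence[float]) -> int:
    """Count adjacent pairs where a higher quantile level has a lower estimate."""
    return sum(
        1
        for lower, upper in zip(quantile_values, quantile_values[1:])
        if upper < lower
    )


def sort_repair(quantile_values: Sequence[float]) -> List[float]:
    """Naive post-hoc repair: sort the estimates so they are non-decreasing."""
    return sorted(quantile_values)


# Hypothetical estimates for quantile levels 0.1, 0.3, 0.5, 0.7, 0.9.
estimates = [1.0, 2.5, 2.0, 3.0, 2.8]
crossings = count_crossings(estimates)  # two adjacent pairs are out of order
repaired = sort_repair(estimates)       # monotone after sorting
```

Sorting restores monotonicity only after the fact, whereas the review discusses an architecture that avoids crossings by construction.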

distributional reinforcement learning, non-crossing quantile regression, quantile, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.78)


Ensuring AI works with the right dose of curiosity

#artificialintelligence | Nov-10-2022, 17:14:55 GMT

Friday night has rolled around, and you're trying to pick a restaurant for dinner. Should you visit your most beloved watering hole or try a new establishment, in the hopes of discovering something superior? Potentially, but that curiosity comes with a risk: If you explore the new option, the food could be worse. On the flip side, if you stick with what you know works well, you won't grow out of your narrow pathway. Curiosity drives artificial intelligence to explore the world, now in boundless use cases -- autonomous navigation, robotic decision-making, optimizing health outcomes, and more.

agent, algorithm, curiosity, (13 more...)

#artificialintelligence

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.40)
North America > United States > California (0.05)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)

Industry:

Health & Medicine (0.70)
Government > Military (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (0.36)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.32)


The Joy of Walking in Games

WIRED | Aug-27-2021, 12:00:00 GMT

When the world locked down, I chose to walk. In a world that was slowly closing in, wandering the vast landscapes of walking simulator games felt like a release. I immersed myself in the lives of others: people who were on journeys of their own, my outer and inner worlds blending into one. I haven't been alone, either. People turned to video games in droves during the pandemic, and game companies posted record profits.

exploration game, simulator, yaughton, (5 more...)

WIRED

Country:

Europe > United Kingdom > England > Shropshire (0.06)
North America > United States > Wyoming (0.05)
Europe > United Kingdom > Scotland (0.05)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology: Information Technology > Artificial Intelligence (0.35)


Temporal Difference Uncertainties as a Signal for Exploration

Flennerhag, Sebastian, Wang, Jane X., Sprechmann, Pablo, Visin, Francesco, Galashov, Alexandre, Kapturowski, Steven, Borsa, Diana L., Heess, Nicolas, Barreto, Andre, Pascanu, Razvan

arXiv.org Artificial Intelligence | Oct-5-2020

An effective approach to exploration in reinforcement learning is to rely on an agent's uncertainty over the optimal policy, which can yield near-optimal exploration strategies in tabular settings. However, in non-tabular settings that involve function approximators, obtaining accurate uncertainty estimates is almost as challenging a problem. In this paper, we highlight that value estimates are easily biased and temporally inconsistent. In light of this, we propose a novel method for estimating uncertainty over the value function that relies on inducing a distribution over temporal difference errors. This exploration signal controls for state-action transitions so as to isolate uncertainty in value that is due to uncertainty over the agent's parameters. Instead, we incorporate it as an intrinsic reward and treat exploration as a separate learning problem, induced by the agent's temporal difference uncertainties. We introduce a distinct exploration policy that learns to collect data with high estimated uncertainty, which gives rise to a "curriculum" that smoothly changes throughout learning and vanishes in the limit of perfect value estimates. We evaluate our method on hard-exploration tasks, including Deep Sea and Atari 2600 environments, and find that our proposed form of exploration facilitates both diverse and deep exploration.

Striking the right balance between exploration and exploitation is fundamental to the reinforcement learning problem. A common approach is to derive exploration from the policy being learned. Dithering strategies, such as ɛ-greedy exploration, render a reward-maximising policy stochastic around its reward-maximising behaviour (Williams & Peng, 1991). Other methods encourage higher entropy in the policy (Ziebart et al., 2008), introduce an intrinsic reward (Singh et al., 2005), or drive exploration by sampling from the agent's belief over the MDP (Strens, 2000). While greedy or entropy-maximising policies cannot facilitate temporally extended exploration (Osband et al., 2013; 2016a), the efficacy of intrinsic rewards depends crucially on how they relate to the extrinsic reward that comes from the environment (Burda et al., 2018a).
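The idea in the abstract above, using the spread of temporal difference errors as an intrinsic reward, can be illustrated with a rough sketch. It assumes, purely for illustration, that parameter uncertainty is represented by a small ensemble of value estimates evaluated on the same transition. The ensemble, the function names, and the use of a standard deviation are all assumptions made here and are not taken from the paper.

```python
import statistics
from typing import List, Sequence


def td_errors(
    q_sa: Sequence[float],
    q_next_max: Sequence[float],
    reward: float,
    gamma: float = 0.99,
) -> List[float]:
    """One TD error per ensemble member: r + gamma * max_a' Q(s', a') - Q(s, a)."""
    return [reward + gamma * q_next - q for q, q_next in zip(q_sa, q_next_max)]


def intrinsic_reward(
    q_sa: Sequence[float],
    q_next_max: Sequence[float],
    reward: float,
    gamma: float = 0.99,
) -> float:
    """Population standard deviation of the TD errors across the ensemble.

    Every member is scored on the same (s, a, r, s') transition, so the
    spread reflects disagreement between members, not transition noise.
    """
    return statistics.pstdev(td_errors(q_sa, q_next_max, reward, gamma))


# Hypothetical ensemble of three members that disagree about Q(s, a).
bonus = intrinsic_reward(q_sa=[0.0, 1.0, 2.0], q_next_max=[1.0, 1.0, 1.0], reward=0.0)
```

When all members agree, the bonus is zero, which matches the abstract's remark that the signal vanishes in the limit of perfect value estimates.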

artificial intelligence, exploration, upstream oil & gas, (17 more...)

arXiv.org Artificial Intelligence

2010.02255

Country: North America > Canada > Alberta (0.14)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports (0.68)
Energy > Oil & Gas > Upstream (0.54)
Education > Focused Education > Special Education (0.44)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


Playing hard exploration games by watching YouTube

Aytar, Yusuf, Pfaff, Tobias, Budden, David, Paine, Thomas, Wang, Ziyu, Freitas, Nando de

Neural Information Processing Systems | Dec-31-2018

Deep reinforcement learning methods traditionally struggle with tasks where environment rewards are particularly sparse. One successful method of guiding exploration in these domains is to imitate trajectories provided by a human demonstrator. However, these demonstrations are typically collected under artificial conditions, i.e. with access to the agent’s exact environment setup and the demonstrator’s action and reward trajectories. Here we propose a method that overcomes these limitations in two stages. First, we learn to map unaligned videos from multiple sources to a common representation using self-supervised objectives constructed over both time and modality (i.e. vision and sound). Second, we embed a single YouTube video in this representation to learn a reward function that encourages an agent to imitate human gameplay. This method of one-shot imitation allows our agent to convincingly exceed human-level performance on the infamously hard exploration games Montezuma’s Revenge, Pitfall! and Private Eye for the first time, even if the agent is not presented with any environment rewards.
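The second stage described above, turning an embedded demonstration video into a reward signal, might look roughly like the sketch below. It assumes the demonstration has already been embedded as an ordered list of "checkpoint" vectors, and that the agent earns a bonus when its current embedding is close to the next unvisited checkpoint. The checkpoint scheme, the cosine-similarity test, the threshold, and all names are illustrative assumptions rather than details confirmed by the abstract.

```python
import math
from typing import Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def imitation_reward(
    agent_embedding: Sequence[float],
    checkpoints: Sequence[Sequence[float]],
    next_index: int,
    threshold: float = 0.9,  # hypothetical similarity threshold
    bonus: float = 1.0,      # hypothetical reward per checkpoint
) -> Tuple[float, int]:
    """Return (reward, updated index of the next checkpoint to reach)."""
    if next_index >= len(checkpoints):
        return 0.0, next_index  # demonstration already fully followed
    if cosine_similarity(agent_embedding, checkpoints[next_index]) >= threshold:
        return bonus, next_index + 1
    return 0.0, next_index


# Hypothetical two-checkpoint demonstration in a 2-D embedding space.
demo = [[1.0, 0.0], [0.0, 1.0]]
reward, index = imitation_reward([1.0, 0.0], demo, next_index=0)
```

Because the reward depends only on embeddings, a sketch like this needs no environment rewards, which is the property the abstract highlights.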

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)



The Most Promising Indie Games That Showed Up at E3, From 'Sable' to 'NeoCab'

WIRED | Jun-18-2018, 14:17:15 GMT

E3 is widely considered a conference for big games, and understandably so; the largest publishers in the industry dominate the event, debuting trailers and news for the most expensive and expansive videogames they could possibly produce. But it's not impossible to find compelling independent games at the show, either: here are our picks for five that you'll want to keep your eyes on in the months to come. NeoCab is a game about the emotional labor of the gig economy, in a moody cyberpunk futurescape. You play one of the last human cab drivers, competing against an army of automated cars. The narrative forces you to balance the emotional health of your protagonist with the brutal needs of the job, as you struggle to barely--just barely--eke out a living.

artificial intelligence, neocab, promising indie game, (7 more...)

WIRED

Industry: Leisure & Entertainment > Games > Computer Games (0.92)

Technology: Information Technology > Artificial Intelligence > Games > Computer Games (0.42)


Fair Information Sharing for Treasure Hunting

Chen, Yiling (Harvard University) | Nissim, Kobbi (Ben-Gurion University and Harvard University) | Waggoner, Bo (Harvard University)

AAAI Conferences | Mar-6-2015

In a search task, a group of agents compete to be the first to find the solution. Each agent has different private information to incorporate into its search. This problem is inspired by settings such as scientific research, Bitcoin hash inversion, or hunting for some buried treasure. A social planner such as a funding agency, mining pool, or pirate captain might like to convince the agents to collaborate, share their information, and greatly reduce the cost of searching. However, this cooperation is in tension with the individuals' competitive desire to each be the first to win the search. The planner's proposal should incentivize truthful information sharing, reduce the total cost of searching, and satisfy fairness properties that preserve the spirit of the competition. We design contract-based mechanisms for information sharing without money. The planner solicits the agents' information and assigns search locations to the agents, who may then search only within their assignments. Truthful reporting of information to the mechanism maximizes an agent's chance to win the search. Epsilon-voluntary participation is satisfied for large search spaces. In order to formalize the planner's goals of fairness and reduced search cost, we propose a simplified, simulated game as a benchmark and quantify fairness and search cost relative to this benchmark scenario. The game is also used to implement our mechanisms. Finally, we extend to the case where coalitions of agents may participate in the mechanism, forming larger coalitions recursively.

agent, artificial intelligence, mechanism, (17 more...)

AAAI Conferences

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country: North America > United States (0.46)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Communications > Collaboration (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
