Collaborating Authors

 Şimşek, Özgür


Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent

arXiv.org Artificial Intelligence

Understanding the mechanisms behind decisions taken by large foundation models in sequential decision-making tasks is critical to ensuring that such systems operate transparently and safely. In this work, we perform exploratory analysis of the Video PreTraining (VPT) Minecraft-playing agent, one of the largest open-source vision-based agents. We aim to illuminate its reasoning mechanisms by applying various interpretability techniques. First, we analyze the attention mechanism while the agent solves its training task: crafting a diamond pickaxe. The agent pays attention to the last four frames and to several keyframes further back in its six-second memory. This is a possible mechanism for maintaining coherence in a task that takes 3 to 10 minutes, despite the short memory span. Second, we perform various interventions, which help us uncover a worrying case of goal misgeneralization: VPT mistakenly identifies a villager wearing brown clothes as a tree trunk when the villager stands stationary under green tree leaves, and punches it to death.
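
This kind of attention analysis can be approximated with standard tooling. Below is a minimal sketch, assuming a PyTorch-style causal attention layer over per-frame embeddings; the buffer length, dimensions, and layer are stand-ins, not VPT's actual architecture or API.

    # Minimal sketch: inspecting which past frames a causal attention layer
    # attends to. All shapes and names are hypothetical, not VPT's real API.
    import torch

    T, D = 120, 64                      # ~6 s of frames at 20 fps, embed dim
    frames = torch.randn(1, T, D)       # stand-in for per-frame embeddings

    attn = torch.nn.MultiheadAttention(embed_dim=D, num_heads=4,
                                       batch_first=True)
    mask = torch.ones(T, T).triu(1).bool()      # causal: hide future frames

    with torch.no_grad():
        _, weights = attn(frames, frames, frames, attn_mask=mask,
                          need_weights=True)    # (1, T, T), head-averaged

    # Attention paid by the current (last) frame to each frame in memory.
    current = weights[0, -1]                    # (T,)
    top = torch.topk(current, k=8)
    for rank, (w, t) in enumerate(zip(top.values, top.indices), 1):
        print(f"{rank}. frame t-{T - 1 - int(t)}: weight {float(w):.3f}")

In a VPT-like agent, one would expect the mass to concentrate on the most recent frames plus a handful of older keyframes, matching the pattern described above.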


Colour versus Shape Goal Misgeneralization in Reinforcement Learning: A Case Study

arXiv.org Artificial Intelligence

We explore the colour versus shape goal misgeneralization originally demonstrated by Di Langosco et al. (2022) in the Procgen Maze environment, where, given an ambiguous choice, agents appear to prefer generalizing on colour rather than shape. After training over 1,000 agents in a simplified version of the environment and evaluating them on over 10 million episodes, we conclude that the behaviour can be attributed to the agents learning to detect the goal object through a specific colour channel, a choice that is arbitrary. Additionally, we show how, due to underspecification, these preferences can change when the agents are retrained using exactly the same procedure but a different random seed. Finally, we demonstrate the existence of outliers in out-of-distribution behaviour that arise from the training random seed alone.
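
A simple way to probe for this kind of channel dependence is to permute the colour channels of an observation and measure how much the policy's output moves. The sketch below uses a stand-in policy and illustrative shapes; the real experiment would load a trained Procgen agent.

    # Hypothetical probe: does the policy's output key on one colour channel?
    import numpy as np

    rng = np.random.default_rng(0)

    def policy_logits(obs):
        # Stand-in policy that (by construction) reads only the red channel.
        return np.array([obs[..., 0].mean(), -obs[..., 0].mean()])

    obs = rng.random((64, 64, 3))               # fake RGB observation
    base = policy_logits(obs)

    for name, perm in [("swap R<->G", [1, 0, 2]),
                       ("swap R<->B", [2, 1, 0]),
                       ("swap G<->B", [0, 2, 1])]:
        shifted = policy_logits(obs[..., perm])
        print(f"{name}: logit change {np.abs(shifted - base).max():.4f}")

    # Large changes only for permutations touching one particular channel
    # suggest the goal detector is tied to that channel.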


Creating Multi-Level Skill Hierarchies in Reinforcement Learning

arXiv.org Artificial Intelligence

What is a useful skill hierarchy for an autonomous agent? We propose an answer based on the graphical structure of an agent's interaction with its environment. Our approach uses hierarchical graph partitioning to expose the structure of the graph at varying timescales, producing a skill hierarchy with multiple levels of abstraction. At each level of the hierarchy, skills move the agent between regions of the state space that are well connected within themselves but weakly connected to each other. We illustrate the utility of the proposed skill hierarchy in a wide variety of domains in the context of reinforcement learning.
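
As a rough illustration of the idea, the sketch below recursively partitions a toy interaction graph with a modularity-based community detector from networkx; the partitioner is a stand-in, and the paper's actual method may differ.

    # Sketch of a multi-level skill hierarchy via recursive graph partitioning.
    import networkx as nx
    from networkx.algorithms.community import greedy_modularity_communities

    def hierarchy(graph, depth=0, max_depth=2, min_size=4):
        """Recursively partition the state-interaction graph.

        Each level yields regions that are densely connected internally;
        skills would move the agent between sibling regions.
        """
        print("  " * depth + f"region with {graph.number_of_nodes()} states")
        if depth == max_depth or graph.number_of_nodes() <= min_size:
            return
        for community in greedy_modularity_communities(graph):
            if len(community) > 1:
                hierarchy(graph.subgraph(community),
                          depth + 1, max_depth, min_size)

    # Toy interaction graph: two dense "rooms" joined by a short corridor.
    g = nx.barbell_graph(8, 2)
    hierarchy(g)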


Explaining Reinforcement Learning with Shapley Values

arXiv.org Artificial Intelligence

For reinforcement learning systems to be widely adopted, their users must understand and trust them. We present a theoretical analysis of explaining reinforcement learning using Shapley values, following a principled approach from game theory for identifying the contribution of individual players to the outcome of a cooperative game. We call this general framework Shapley Values for Explaining Reinforcement Learning (SVERL). Our analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning. We then develop an approach that uses Shapley values to explain agent performance. In a variety of domains, SVERL produces meaningful explanations that match and supplement human intuition.
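
For intuition, the following sketch computes exact Shapley values over a small feature set. The characteristic function v() is a toy stand-in for "agent performance when only the coalition's features are observed"; SVERL's precise treatment of unobserved features differs.

    # Exact Shapley values over a small feature set.
    from itertools import combinations
    from math import factorial

    FEATURES = ["x", "y", "has_key"]

    def v(coalition):
        # Toy performance function: knowing has_key helps a lot,
        # and x is useless without y.
        score = 0.0
        if "has_key" in coalition:
            score += 0.6
        if "x" in coalition and "y" in coalition:
            score += 0.4
        return score

    def shapley(feature):
        n = len(FEATURES)
        others = [f for f in FEATURES if f != feature]
        total = 0.0
        for k in range(n):
            for coalition in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (v(set(coalition) | {feature})
                                   - v(set(coalition)))
        return total

    for f in FEATURES:
        print(f, round(shapley(f), 3))
    # Values sum to v(all) - v(empty) = 1.0 (efficiency axiom).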


Resource-Constrained Station-Keeping for Helium Balloons using Reinforcement Learning

arXiv.org Artificial Intelligence

High-altitude balloons have proved useful for ecological aerial surveys, atmospheric monitoring, and communication relays. However, due to weight and power constraints, there is a need to investigate alternative modes of propulsion for navigating in the stratosphere. Very recently, reinforcement learning has been proposed as a control scheme for keeping the balloon in the region of a fixed location, facilitated by diverse opposing wind fields at different altitudes. Although air-pump-based station-keeping has been explored, there is no research on the control problem for balloons actuated by venting and ballasting, a mode commonly used as a low-cost alternative. We show how reinforcement learning can be used for this type of balloon. Specifically, we use the soft actor-critic algorithm, which on average station-keeps within 50 km for 25% of the flight, consistent with the state of the art. Furthermore, we show that the proposed controller effectively minimises the consumption of resources, thereby supporting long-duration flights. We frame the controller as a continuous-control reinforcement learning problem, which allows for a more diverse range of trajectories than current state-of-the-art work using discrete action spaces. Continuous control also lets us make use of larger ascent rates that are not possible with air pumps. The desired ascent rate is decoupled into a desired altitude and a time-factor, providing a more transparent policy than the low-level control commands used in previous works. Finally, by applying the equations of motion, we establish thresholds for venting and ballasting that keep actions physically feasible and prevent the agent from exploiting the environment.
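
A minimal sketch of the decoupled action interface described above, with all constants illustrative rather than taken from the paper: the policy's (desired altitude, time-factor) output is converted to an ascent rate, clipped to feasibility thresholds, and mapped onto the two actuators.

    # Hypothetical sketch of the decoupled action interface.
    # All constants are illustrative, not the paper's values.
    import numpy as np

    MAX_DESCENT = -3.0   # m/s, limited by how fast helium can be vented
    MAX_ASCENT = 2.0     # m/s, limited by how much ballast can be dropped

    def ascent_rate_command(current_alt, desired_alt, time_factor):
        """Convert (desired altitude, time-factor) into a feasible rate."""
        raw_rate = (desired_alt - current_alt) / max(time_factor, 1.0)
        return float(np.clip(raw_rate, MAX_DESCENT, MAX_ASCENT))

    def actuate(rate):
        """Map a signed ascent rate onto the two actuators."""
        if rate < 0:
            return {"vent": abs(rate) / abs(MAX_DESCENT), "ballast": 0.0}
        return {"vent": 0.0, "ballast": rate / MAX_ASCENT}

    rate = ascent_rate_command(current_alt=18_000, desired_alt=17_500,
                               time_factor=300)
    print(rate, actuate(rate))   # -1.67 m/s -> vent at ~56% of maximum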


The Game of Tetris in Machine Learning

arXiv.org Artificial Intelligence

The game of Tetris is an important benchmark for research in artificial intelligence and machine learning. This paper provides a historical account of the algorithmic developments in Tetris and discusses open challenges. Handcrafted controllers, genetic algorithms, and reinforcement learning have all contributed to good solutions. However, existing solutions fall far short of what can be achieved by expert players playing without time pressure. Further study of the game has the potential to contribute to important areas of research, including feature discovery, autonomous learning of action hierarchies, and sample-efficient reinforcement learning.


Learning From Small Samples: An Analysis of Simple Decision Heuristics

Neural Information Processing Systems

Simple decision heuristics are models of human and animal behavior that use few pieces of information, perhaps only a single piece, and integrate the pieces in simple ways, for example, by considering them sequentially, one at a time, or by giving them equal weight. We focus on three families of heuristics: single-cue decision making, lexicographic decision making, and tallying. It is unknown how quickly these heuristics can be learned from experience. We show, analytically and empirically, that substantial progress in learning can be made with just a few training samples. When training samples are very few, tallying performs substantially better than the alternative methods tested. Our empirical analysis is the most extensive to date, employing 63 natural data sets on diverse subjects.
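
The three heuristic families are simple enough to state in a few lines each. The sketch below implements them for paired comparison over binary cues, with cue directions assumed known and illustrative random data.

    # Three heuristic families for choosing between two objects A and B,
    # each described by binary cues. Data here is illustrative.
    import numpy as np

    rng = np.random.default_rng(1)
    a = rng.integers(0, 2, size=10)     # cue profile of object A
    b = rng.integers(0, 2, size=10)     # cue profile of object B

    def single_cue(a, b, cue_index=0):
        """Decide on one cue; 0 means the cue does not discriminate."""
        return int(np.sign(a[cue_index] - b[cue_index]))

    def lexicographic(a, b, order):
        """Consider cues one at a time; first discriminating cue decides."""
        for i in order:
            if a[i] != b[i]:
                return int(np.sign(a[i] - b[i]))
        return 0

    def tallying(a, b):
        """Give all cues equal weight; count which object wins more cues."""
        return int(np.sign((a - b).sum()))

    print("single cue:   ", single_cue(a, b))   # +1 -> A, -1 -> B, 0 -> guess
    print("lexicographic:", lexicographic(a, b, order=range(10)))
    print("tallying:     ", tallying(a, b))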


Linear decision rule as aspiration for simple decision heuristics

Neural Information Processing Systems

Many attempts to understand the success of simple decision heuristics have examined heuristics as an approximation to a linear decision rule. This research has identified three environmental structures that aid heuristics: dominance, cumulative dominance, and noncompensatoriness. Here, we further develop these ideas and examine their empirical relevance in 51 natural environments. We find that all three structures are prevalent, making it possible for some simple rules to reach the accuracy levels of the linear decision rule using less information.
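
Noncompensatoriness, for example, can be checked directly: with weights sorted in decreasing order, each weight must exceed the sum of all weights that follow it, in which case (with binary cues) a lexicographic rule over the same cue order reproduces the linear rule's choices. A minimal check:

    # Noncompensatory weights: sorted in decreasing order, each weight
    # exceeds the sum of all weights after it, so no later cues can
    # overturn an earlier discriminating cue.
    def noncompensatory(weights):
        w = sorted(weights, reverse=True)
        return all(w[i] > sum(w[i + 1:]) for i in range(len(w) - 1))

    print(noncompensatory([8, 4, 2, 1]))   # True: 8 > 7, 4 > 3, 2 > 1
    print(noncompensatory([5, 4, 2, 1]))   # False: 5 < 7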


Skill Characterization Based on Betweenness

Neural Information Processing Systems

We present a characterization of a useful class of skills based on a graphical representation of an agent's interaction with its environment. Our characterization uses betweenness, a measure of centrality on graphs. It captures and generalizes (at least intuitively) the bottleneck concept, which has inspired many of the existing skill-discovery algorithms. Our characterization may be used directly to form a set of skills suitable for a given task. More importantly, it serves as a useful guide for developing incremental skill-discovery algorithms that do not rely on knowing or representing the interaction graph in its entirety.
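
The following sketch illustrates the characterization on a toy two-rooms graph, ranking states by betweenness with networkx and treating the top-scoring states as skill targets; the graph layout is illustrative.

    # Rank states by betweenness centrality on a toy interaction graph
    # and treat the top-scoring states as subgoal candidates for skills.
    import networkx as nx

    g = nx.barbell_graph(10, 1)   # two dense "rooms" joined by one state
    scores = nx.betweenness_centrality(g)

    candidates = sorted(scores, key=scores.get, reverse=True)[:3]
    for state in candidates:
        print(f"state {state}: betweenness {scores[state]:.3f}")

    # The corridor state scores highest: it is the bottleneck that every
    # room-to-room trajectory must pass through, matching the intuition
    # the characterization formalizes.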