AITopics | bottleneck state

bottleneck state

Successor-Predecessor Intrinsic Exploration

Yu, Changmin, Burgess, Neil

Neural Information Processing Systems | Feb-17-2026, 17:12:30 GMT

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Leisure & Entertainment > Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

e6f2b968c4ee8ba260cd7077e39590dd-Paper-Conference.pdf

Neural Information Processing Systems | Oct-9-2025, 10:19:55 GMT

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Leisure & Entertainment > Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Inferring Implicit Goals Across Differing Task Models

Tulli, Silvia, Vasileiou, Stylianos Loukas, Chetouani, Mohamed, Sreedharan, Sarath

arXiv.org Artificial Intelligence | Jan-29-2025

[...] value-aligned behavior is to not only account for the specified user objectives but also any implicit or unspecified user requirements. The existence of such implicit requirements could be particularly common in settings where the user's understanding of the task model may differ from the agent's estimate of the model. Under this scenario, the user may incorrectly expect some agent behavior to be inevitable or guaranteed. This paper addresses such expectation mismatch in the presence of differing models by capturing the possibility of unspecified [...]

This should be all well and good, provided the human bottleneck states are also bottleneck states for the agent. Otherwise, the agent must make an effort to figure out what the user's underlying subgoals may be. To see how such problems may arise, consider an agent tasked with guiding a tourist to a famous art museum. The tourist simply says, "Get me a plan to get to the art museum," unaware of the city's metro system and expecting an above-ground route passing certain landmarks. The agent, however, might plan a route using the metro system. For the agent's metro route, bottlenecks might include entering the metro, making transfers, and exiting at the correct station.
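The tourist example can be made concrete with a small sketch. It assumes, purely for illustration, that a bottleneck state is a state that every route from start to goal must pass through, and it tests this by removing one state at a time from a deterministic graph. The graphs, state names, and the functions `reachable` and `bottleneck_states` are hypothetical and are not taken from the paper, which may define and compute bottlenecks differently.

```python
from collections import deque


def reachable(graph, start, goal, blocked=None):
    """Breadth-first search: is `goal` reachable from `start` without visiting `blocked`?"""
    if start == blocked or goal == blocked:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph.get(node, ()):
            if nxt != blocked and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def bottleneck_states(graph, start, goal):
    """Intermediate states whose removal disconnects `start` from `goal` (illustrative definition)."""
    if not reachable(graph, start, goal):
        return []
    nodes = set(graph) | {n for successors in graph.values() for n in successors}
    return sorted(
        n for n in nodes - {start, goal}
        if not reachable(graph, start, goal, blocked=n)
    )


# Hypothetical toy city: the agent's model includes a metro, the tourist's does not.
agent_model = {
    "hotel": ["metro_entrance", "old_town"],
    "metro_entrance": ["transfer_station"],
    "transfer_station": ["museum_station"],
    "museum_station": ["museum"],
    "old_town": ["cathedral"],
    "cathedral": ["museum"],
}
user_model = {
    "hotel": ["old_town"],
    "old_town": ["cathedral"],
    "cathedral": ["museum"],
}

agent_bottlenecks = bottleneck_states(agent_model, "hotel", "museum")
user_bottlenecks = bottleneck_states(user_model, "hotel", "museum")
print("agent:", agent_bottlenecks)
print("user: ", user_bottlenecks)
```

In this toy setup the tourist's model makes the landmarks unavoidable, while the agent's model has no unavoidable intermediate state because the metro offers a second route. That gap is the kind of expectation mismatch the abstract describes.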

bottleneck state, implicit subgoal, subgoal, (16 more...)

arXiv.org Artificial Intelligence

2501.17704

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Colorado (0.04)
North America > United States > California (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Subgoal Discovery Using a Free Energy Paradigm and State Aggregations

Mesbah, Amirhossein, Hosseini, Reshad, Shariatpanahi, Seyed Pooya, Ahmadabadi, Majid Nili

arXiv.org Artificial Intelligence | Dec-21-2024

Reinforcement learning (RL) plays a major role in solving complex sequential decision-making tasks. Hierarchical and goal-conditioned RL are promising methods for dealing with two major problems in RL, namely sample inefficiency and difficulties in reward shaping. These methods tackle the mentioned problems by decomposing a task into simpler subtasks and temporally abstracting a task in the action space. One of the key components for task decomposition of these methods is subgoal discovery. We can use the subgoal states to define hierarchies of actions and also use them in decomposing complex tasks. Under the assumption that subgoal states are more unpredictable, we propose a free energy paradigm to discover them. This is achieved by using free energy to select between two spaces, the main space and an aggregation space. The model changes from neighboring states to a given state show the unpredictability of that state, and are therefore used in this paper for subgoal discovery. Our empirical results on navigation tasks like grid-world environments show that our proposed method can be applied for subgoal discovery without prior knowledge of the task. Our proposed method is also robust to the stochasticity of environments.

data mining, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2412.16687

Genre: Research Report (1.00)

Industry:

Education (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

JUICER: Data-Efficient Imitation Learning for Robotic Assembly

Ankile, Lars, Simeonov, Anthony, Shenfeld, Idan, Agrawal, Pulkit

arXiv.org Artificial Intelligence | Apr-9-2024

While learning from demonstrations is powerful for acquiring visuomotor policies, high-performance imitation without large demonstration datasets remains challenging for tasks requiring precise, long-horizon manipulation. This paper proposes a pipeline for improving imitation learning performance with a small human demonstration budget. We apply our approach to assembly tasks that require precisely grasping, reorienting, and inserting multiple parts over long horizons and multiple task phases. Our pipeline combines expressive policy architectures and various techniques for dataset expansion and simulation-based data augmentation. These help expand dataset support and supervise the model with locally corrective actions near bottleneck regions requiring high precision. We demonstrate our pipeline on four furniture assembly tasks in simulation, enabling a manipulator to assemble up to five parts over nearly 2500 time steps directly from RGB images, outperforming imitation and data augmentation baselines. Project website: https://imitation-juicer.github.io/.

arxiv, demonstration, learning, (16 more...)

arXiv.org Artificial Intelligence

2404.03729

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York > Richmond County > New York City (0.04)
North America > United States > New York > Queens County > New York City (0.04)
(7 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Successor-Predecessor Intrinsic Exploration

Yu, Changmin, Burgess, Neil, Sahani, Maneesh, Gershman, Samuel J.

arXiv.org Artificial Intelligence | Jan-25-2024

Exploration is essential in reinforcement learning, particularly in environments where external rewards are sparse. Here we focus on exploration with intrinsic rewards, where the agent transiently augments the external rewards with self-generated intrinsic rewards. Although the study of intrinsic rewards has a long history, existing methods focus on composing the intrinsic reward based on measures of future prospects of states, ignoring the information contained in the retrospective structure of transition sequences. Here we argue that the agent can utilise retrospective information to generate explorative behaviour with structure-awareness, facilitating efficient exploration based on global instead of local information. We propose Successor-Predecessor Intrinsic Exploration (SPIE), an exploration algorithm based on a novel intrinsic reward combining prospective and retrospective information. We show that SPIE yields more efficient and ethologically plausible exploratory behaviour in environments with sparse rewards and bottleneck states than competing methods. We also implement SPIE in deep reinforcement learning agents, and show that the resulting agent achieves stronger empirical performance than existing methods on sparse-reward Atari games.
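As a rough illustration of "prospective" versus "retrospective" information, the sketch below computes a successor representation (SR) for a uniform random walk on a hypothetical two-room layout joined by a single door state. Rows of the SR matrix describe where the agent is expected to go from a state. Column sums describe how often a state is reached from elsewhere. The layout, the function name, and the use of column sums are assumptions made for this example. The way SPIE actually combines the two kinds of information into an intrinsic reward is defined in the paper and is not reproduced here.

```python
def successor_representation(P, gamma=0.9, iters=500):
    """Iterate M <- I + gamma * P @ M, which converges to (I - gamma * P)^-1 for gamma < 1."""
    n = len(P)
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iters):
        M = [
            [
                (1.0 if i == j else 0.0)
                + gamma * sum(P[i][k] * M[k][j] for k in range(n))
                for j in range(n)
            ]
            for i in range(n)
        ]
    return M


# Hypothetical two-room layout: states 0 and 1 (room A), 2 (door), 3 and 4 (room B).
neighbours = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3, 4], 3: [2, 4], 4: [2, 3]}
n = len(neighbours)

# Uniform random-walk transition matrix over neighbouring states.
P = [
    [1.0 / len(neighbours[i]) if j in neighbours[i] else 0.0 for j in range(n)]
    for i in range(n)
]

M = successor_representation(P)

# Prospective view: discounted expected future visits to each state, starting from state 0.
prospective = M[0]

# Retrospective view: discounted expected visits to state j, summed over all start states.
retrospective = [sum(M[i][j] for i in range(n)) for j in range(n)]

most_visited = max(range(n), key=lambda j: retrospective[j])
print("prospective from state 0:", [round(x, 2) for x in prospective])
print("retrospective column sums:", [round(x, 2) for x in retrospective])
print("state reached most often:", most_visited)
```

In this layout the door state has the largest column sum because every crossing between rooms passes through it, which gives one intuition for why retrospective structure can flag bottleneck states.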

exploration, information, intrinsic reward, (16 more...)

arXiv.org Artificial Intelligence

2305.15277

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games > Computer Games (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

MO2: Model-Based Offline Options

Salter, Sasha, Wulfmeier, Markus, Tirumala, Dhruva, Heess, Nicolas, Riedmiller, Martin, Hadsell, Raia, Rao, Dushyant

arXiv.org Artificial Intelligence | Sep-5-2022

The ability to discover useful behaviours from past experience and transfer them to new tasks is considered a core component of natural embodied intelligence. Inspired by neuroscience, discovering behaviours that switch at bottleneck states has long been sought after for inducing plans of minimum description length across tasks. Prior approaches have either only supported online, on-policy, bottleneck state discovery, limiting sample-efficiency, or discrete state-action domains, restricting applicability. To address this, we introduce Model-Based Offline Options (MO2), an offline hindsight framework supporting sample-efficient bottleneck option discovery over continuous state-action spaces. Once bottleneck options are learnt offline over source domains, they are transferred online to improve exploration and value estimation on the transfer domain. Our experiments show that on complex long-horizon continuous control tasks with sparse, delayed rewards, MO2's properties are essential and lead to performance exceeding recent option learning methods. Additional ablations further demonstrate the impact on option predictability and credit assignment.

abstraction, arxiv preprint arxiv, equation, (13 more...)

arXiv.org Artificial Intelligence

2209.01947

Country: Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.82)

Industry: Education > Educational Setting (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Resource-rational Task Decomposition to Minimize Planning Costs

Correa, Carlos G., Ho, Mark K., Callaway, Fred, Griffiths, Thomas L.

arXiv.org Artificial Intelligence | Jul-27-2020

People often plan hierarchically. That is, rather than planning over a monolithic representation of a task, they decompose the task into simpler subtasks and then plan to accomplish those. Although much work explores how people decompose tasks, there is less analysis of why people decompose tasks in the way they do. Here, we address this question by formalizing task decomposition as a resource-rational representation problem. Specifically, we propose that people decompose tasks in a manner that facilitates efficient use of limited cognitive resources given the structure of the environment and their own planning algorithms. Using this model, we replicate several existing findings. Our account provides a normative explanation for how people identify subtasks as well as a framework for studying how people reason, plan, and act using resource-rational representations.

artificial intelligence, machine learning, planning & scheduling, (17 more...)

arXiv.org Artificial Intelligence

2007.13862

Country:

Asia > Vietnam > Hanoi > Hanoi (0.05)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.88)

How to Avoid Being Eaten by a Grue: Structured Exploration Strategies for Textual Worlds

Ammanabrolu, Prithviraj, Tien, Ethan, Hausknecht, Matthew, Riedl, Mark O.

arXiv.org Artificial Intelligence | Jun-12-2020

Text-based games are long puzzles or quests, characterized by a sequence of sparse and potentially deceptive rewards. They provide an ideal platform to develop agents that perceive and act upon the world using a combinatorially sized natural language state-action space. Standard Reinforcement Learning agents are poorly equipped to effectively explore such spaces and often struggle to overcome bottlenecks: states that agents are unable to pass through simply because they do not see the right action sequence enough times to be sufficiently reinforced. We introduce Q*BERT, an agent that learns to build a knowledge graph of the world by answering questions, which leads to greater sample efficiency. To overcome bottlenecks, we further introduce MC!Q*BERT, an agent that uses a knowledge-graph-based intrinsic motivation to detect bottlenecks and a novel exploration strategy to efficiently learn a chain of policy modules to overcome them. We present an ablation study and results demonstrating how our method outperforms the current state-of-the-art on nine text games, including the popular game, Zork, where, for the first time, a learning agent gets past the bottleneck where the player is eaten by a Grue.

machine learning, natural language, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2006.07409

Country: North America > United States > Texas > Travis County > Austin (0.04)

Genre:

Workflow (0.88)
Research Report (0.70)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Hierarchical model-based policy optimization: from actions to action sequences and back

McNamee, Daniel

arXiv.org Artificial Intelligence | Nov-28-2019

We develop a normative framework for hierarchical model-based policy optimization based on applying second-order methods in the space of all possible state-action paths. The resulting natural path gradient performs policy updates in a manner which is sensitive to the long-range correlational structure of the induced stationary state-action densities. We demonstrate that the natural path gradient can be computed exactly given an environment dynamics model and depends on expressions akin to higher-order successor representations. In simulation, we show that the prioritization of local policy updates in the resulting policy flow indeed reflects the intuitive state-space hierarchy in several toy problems.

action preference, gradient, policy optimization, (16 more...)

arXiv.org Artificial Intelligence

1912.01448

Country:

Asia > Vietnam > Hanoi > Hanoi (0.05)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
