AITopics | subgoal generator

Algorithm pseudocode fragment: s ← env.RESET(); solution ← [ ] (list of actions); actions ← PLANNER(s); for 1 Lp do; function UPDATE(path, leaf); W(leaf, i) ← r + γ V(s′); N(leaf, i) ← 1; Q(leaf, i) ← W(leaf, i)

Neural Information Processing Systems, Apr-24-2026, 11:50:34 GMT

A.1 MCTS-kSubS algorithm. In Algorithm 4 we present a general MCTS solver based on AlphaZero. The solver repeatedly queries the planner for a list of actions and executes them one by one. The baseline planner returns only a single action at a time, whereas MCTS-kSubS returns around k actions, enough to reach the desired subgoal (the number of actions depends on the subgoal distance, which in practice does not always equal k). MCTS-kSubS operates on a high-level subgoal graph: nodes are subgoals proposed by the generator (see Algorithm 3), and edges are lists of actions describing how to move from one subgoal to another (computed by the low-level conditional policy in Algorithm 2). The graph structure is represented by the tree variable. For every subgoal, it keeps up to C3 best nearby subgoals (according to generator scores), along with the corresponding list of actions and the sum of rewards obtained while moving from the parent to the child subgoal. Most of the MCTS implementation is shared between MCTS-kSubS and the AlphaZero baseline, as we can treat the behavioral-cloning policy as a subgoal generator with k = 1. The differences between MCTS-kSubS and the baseline are encapsulated in the GEN_CHILDREN function (Algorithms 5 and 6).
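The solver loop described above (query the planner for a list of actions, execute them one by one, repeat) can be sketched as follows. This is a minimal illustration, not the paper's Algorithm 4: `LineEnv`, `make_planner`, and `solve` are hypothetical names, the environment is a toy integer line, and the planner is a hand-written stand-in for MCTS that simply returns up to k moves toward the goal.

```python
from typing import Callable, List, Tuple


class LineEnv:
    """Hypothetical toy environment: walk from 0 to `goal` on the integer line."""

    def __init__(self, goal: int = 7) -> None:
        self.goal = goal
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        self.state += action
        done = self.state == self.goal
        return self.state, (1.0 if done else 0.0), done


def make_planner(goal: int, k: int) -> Callable[[int], List[int]]:
    """Stand-in planner (assumption): returns up to k unit moves toward the goal.

    With k = 1 it behaves like a single-action baseline planner.
    """

    def planner(state: int) -> List[int]:
        remaining = goal - state
        direction = 1 if remaining > 0 else -1
        return [direction] * min(k, abs(remaining))

    return planner


def solve(env: LineEnv, planner: Callable[[int], List[int]],
          max_queries: int = 100) -> Tuple[List[int], int]:
    """Repeatedly query the planner and execute the returned actions one by one."""
    state = env.reset()
    solution: List[int] = []  # list of executed actions
    queries = 0
    done = False
    while not done and queries < max_queries:
        actions = planner(state)
        queries += 1
        if not actions:  # planner has nothing to propose
            break
        for action in actions:
            state, _reward, done = env.step(action)
            solution.append(action)
            if done:
                break
    return solution, queries
```

In this toy setting, a planner with a larger k needs fewer queries to produce the same action sequence, which is the practical difference between the subgoal planner and the k = 1 baseline that the passage describes.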

artificial intelligence, machine learning, subgoal, (17 more...)

Neural Information Processing Systems

Genre: Workflow (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.30)

05d8cccb5f47e5072f0a05b5f514941a-Paper.pdf

Neural Information Processing Systems, Feb-7-2026, 07:56:41 GMT

sokoban, subgoal, subgoal generator, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Poland > Masovia Province > Warsaw (0.05)
(13 more...)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment > Games (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(3 more...)

Subgoal-Guided Policy Heuristic Search with Learned Subgoals

Tuero, Jake, Buro, Michael, Lelis, Levi H. S.

arXiv.org Artificial Intelligence, Dec-3-2025

Policy tree search is a family of tree search algorithms that use a policy to guide the search. These algorithms provide guarantees on the number of expansions required to solve a given problem that are based on the quality of the policy. While these algorithms have shown promising results, the process in which they are trained requires complete solution trajectories to train the policy. Search trajectories are obtained during a trial-and-error search process. When the training problem instances are hard, learning can be prohibitively costly, especially when starting from a randomly initialized policy. As a result, search samples are wasted in failed attempts to solve these hard instances. This paper introduces a novel method for learning subgoal-based policies for policy tree search algorithms. The subgoals and policies conditioned on subgoals are learned from the trees that the search expands while attempting to solve problems, including the search trees of failed attempts. We empirically show that our policy formulation and training method improve the sample efficiency of learning a policy and heuristic function in this online setting.

algorithm, artificial intelligence, subgoal-guided policy heuristic search, (14 more...)

arXiv.org Artificial Intelligence

2506.07255

Country: North America > Canada > Alberta (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Subgoal Search For Complex Reasoning Tasks

Neural Information Processing Systems, Oct-1-2025, 22:03:43 GMT

In this paper, we implement kSubS using a transformer-based subgoal module coupled with the classical best-first search framework.

subgoal, subgoal generator, subgoal search, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Poland > Masovia Province > Warsaw (0.05)
(13 more...)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment > Games (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(3 more...)

VSC-RL: Advancing Autonomous Vision-Language Agents with Variational Subgoal-Conditioned Reinforcement Learning

Wu, Qingyuan, Liu, Jianheng, Hao, Jianye, Wang, Jun, Shao, Kun

arXiv.org Artificial Intelligence, Feb-11-2025

State-of-the-art (SOTA) reinforcement learning (RL) methods enable the vision-language agents to learn from interactions with the environment without human supervision. However, they struggle with learning inefficiencies in tackling real-world complex sequential decision-making tasks, especially with sparse reward signals and long-horizon dependencies. To effectively address the issue, we introduce Variational Subgoal-Conditioned RL (VSC-RL), which reformulates the vision-language sequential decision-making task as a variational goal-conditioned RL problem, allowing us to leverage advanced optimization methods to enhance learning efficiency. Specifically, VSC-RL optimizes the SubGoal Evidence Lower BOund (SGC-ELBO), which consists of (a) maximizing the subgoal-conditioned return via RL and (b) minimizing the subgoal-conditioned difference with the reference policy. We theoretically demonstrate that SGC-ELBO is equivalent to the original optimization objective, ensuring improved learning efficiency without sacrificing performance guarantees. Additionally, for real-world complex decision-making tasks, VSC-RL leverages the vision-language model to autonomously decompose the goal into feasible subgoals, enabling efficient learning. Across various benchmarks, including challenging real-world mobile device control tasks, VSC-RL significantly outperforms the SOTA vision-language agents, achieving superior performance and remarkable improvement in learning efficiency.

machine learning, reinforcement learning, vsc-rl, (15 more...)

arXiv.org Artificial Intelligence

2502.07949

Country:

Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

SubgoalXL: Subgoal-based Expert Learning for Theorem Proving

Zhao, Xueliang, Zheng, Lin, Bo, Haige, Hu, Changran, Thakker, Urmish, Kong, Lingpeng

arXiv.org Artificial Intelligence, Aug-20-2024

Formal theorem proving, a field at the intersection of mathematics and computer science, has seen renewed interest with advancements in large language models (LLMs). This paper introduces SubgoalXL, a novel approach that synergizes subgoal-based proofs with expert learning to enhance LLMs' capabilities in formal theorem proving within the Isabelle environment. SubgoalXL addresses two critical challenges: the scarcity of specialized mathematics and theorem-proving data, and the need for improved multi-step reasoning abilities in LLMs. Leveraging the Isabelle environment's advantages in subgoal-based proofs, SubgoalXL achieves a new state-of-the-art performance of 56.1% in Isabelle on the standard miniF2F dataset, marking an absolute improvement of 4.9%. Notably, SubgoalXL successfully solves 41 AMC12, 9 AIME, and 3 IMO problems from miniF2F. These results underscore the effectiveness of maximizing limited data utility and employing targeted guidance for complex reasoning in formal theorem proving, contributing to the ongoing advancement of AI reasoning capabilities. Formal theorem proving, a field at the intersection of mathematics and computer science, has flourished alongside the development of languages like Lean (de Moura et al., 2015) and Isabelle (Paulson, 1994). These two prominent communities have been instrumental in advancing the field's core challenge: mechanizing mathematical reasoning and proof verification (Li et al., 2020).

dataset, generator, informal proof, (13 more...)

arXiv.org Artificial Intelligence

2408.11172

Country:

Europe > Germany > Berlin (0.04)
Asia > Indonesia > Java > Yogyakarta > Yogyakarta (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Implicit Subgoal Planning with Variational Autoencoders for Long-Horizon Sparse Reward Robotic Tasks

Wang, Fangyuan, Duan, Anqing, Zhou, Peng, Huo, Shengzeng, Guo, Guodong, Yang, Chenguang, Navarro-Alarcon, David

arXiv.org Artificial Intelligence, Dec-24-2023

The challenges inherent to long-horizon tasks in robotics persist because traditional reinforcement learning approaches typically suffer from inefficient exploration and sparse rewards. To alleviate these challenges, we introduce a novel algorithm, Variational Autoencoder-based Subgoal Inference (VAESI), to accomplish long-horizon tasks in a divide-and-conquer manner. VAESI consists of three components: a Variational Autoencoder (VAE)-based Subgoal Generator, a Hindsight Sampler, and a Value Selector. The VAE-based Subgoal Generator draws inspiration from the human capacity to infer subgoals and reason about the final goal in the context of these subgoals. It is composed of an explicit encoder model, engineered to generate subgoals, and an implicit decoder model, designed to enhance the quality of the generated subgoals by predicting the final goal. Additionally, the Hindsight Sampler selects valid subgoals from an offline dataset to enhance the feasibility of the generated subgoals. The Value Selector uses the value function from reinforcement learning to select the optimal subgoals from the subgoal candidates. To validate our method, we conduct several long-horizon tasks in both simulation and the real world, including one locomotion task and three manipulation tasks. The quantitative and qualitative data obtained indicate that our approach achieves promising performance compared to other baseline methods. The experimental results can be seen on the website https://sites.google.com/view/vaesi/home.

algorithm, long-horizon task, subgoal, (16 more...)

arXiv.org Artificial Intelligence

2312.15578

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
Asia > China > Zhejiang Province > Ningbo (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)
(15 more...)

Genre: Research Report (0.82)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search

Zawalski, Michał, Tyrolski, Michał, Czechowski, Konrad, Odrzygóźdź, Tomasz, Stachura, Damian, Piękos, Piotr, Wu, Yuhuai, Kuciński, Łukasz, Miłoś, Piotr

arXiv.org Artificial Intelligence, Apr-5-2023

Complex reasoning problems contain states that vary in the computational cost required to determine a good action plan. Taking advantage of this property, we propose Adaptive Subgoal Search (AdaSubS), a search method that adaptively adjusts the planning horizon. To this end, AdaSubS generates diverse sets of subgoals at different distances. A verification mechanism is employed to swiftly filter out unreachable subgoals, allowing the search to focus on feasible, more distant subgoals. In this way, AdaSubS benefits from the efficiency of planning with longer subgoals and the fine control of shorter ones, and thus scales well to difficult planning problems. We show that AdaSubS significantly surpasses hierarchical planning algorithms on three complex reasoning tasks: Sokoban, the Rubik's Cube, and the inequality-proving benchmark INT.
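The idea of preferring distant subgoals and falling back to nearer ones when verification fails can be sketched as below. This is a greedy simplification under stated assumptions, not the AdaSubS algorithm itself: the state is an integer, a subgoal at distance k is simply `state + k`, and `verify` is a hypothetical caller-supplied reachability check standing in for the learned verifier. The function name `adaptive_subgoal_walk` is likewise made up for illustration.

```python
from typing import Callable, List, Optional, Sequence


def adaptive_subgoal_walk(
    start: int,
    target: int,
    distances: Sequence[int],
    verify: Callable[[int, int], bool],
    max_steps: int = 100,
) -> Optional[List[int]]:
    """Greedy walk that always takes the longest verified subgoal.

    Returns the list of visited states, or None if no subgoal at any
    distance passes verification (or the step budget runs out).
    """
    path = [start]
    state = start
    for _ in range(max_steps):
        if state == target:
            return path
        # Try the longest horizon first, then progressively shorter ones.
        for k in sorted(distances, reverse=True):
            candidate = state + k
            if verify(state, candidate):
                state = candidate
                path.append(state)
                break
        else:
            # Every candidate was rejected by the verifier.
            return None
    return path if state == target else None
```

For example, with `distances=[1, 2, 4]`, a target of 7, and a verifier that rejects any subgoal overshooting the target, the walk takes a long step while it can and shortens the horizon only near the target. With `distances=[4]` alone, the same problem fails, which illustrates why mixing horizons helps in this toy setting.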

machine learning, natural language, reinforcement learning, (21 more...)

arXiv.org Artificial Intelligence

2206.00702

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Subgoal Search For Complex Reasoning Tasks

Czechowski, Konrad, Odrzygóźdź, Tomasz, Zbysiński, Marek, Zawalski, Michał, Olejnik, Krzysztof, Wu, Yuhuai, Kuciński, Łukasz, Miłoś, Piotr

arXiv.org Artificial Intelligence, Aug-25-2021

Humans excel in solving complex reasoning tasks through a mental process of moving from one idea to a related one. Inspired by this, we propose the Subgoal Search (kSubS) method. Its key component is a learned subgoal generator that produces a diversity of subgoals that are both achievable and closer to the solution. Using subgoals reduces the search space and induces a high-level search graph suitable for efficient planning. In this paper, we implement kSubS using a transformer-based subgoal module coupled with the classical best-first search framework. We show that a simple approach of generating subgoals k steps ahead is surprisingly efficient on three challenging domains: two popular puzzle games, Sokoban and the Rubik's Cube, and an inequality-proving benchmark, INT. kSubS achieves strong results, including state-of-the-art on INT, within a modest computational budget.
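The combination of a subgoal generator with best-first search can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `generate_subgoals` and `value` are hypothetical callables standing in for the learned generator and value function, states must be hashable, and the low-level policy that connects consecutive subgoals with primitive actions is omitted.

```python
import heapq
from typing import Callable, Hashable, Iterable, List, Optional


def subgoal_best_first_search(
    start: Hashable,
    is_goal: Callable[[Hashable], bool],
    generate_subgoals: Callable[[Hashable], Iterable[Hashable]],
    value: Callable[[Hashable], float],
    max_expansions: int = 1000,
) -> Optional[List[Hashable]]:
    """Best-first search over the high-level graph induced by subgoals.

    Returns the sequence of subgoals from start to a goal state, or None.
    """
    counter = 0  # tie-breaker so the heap never compares states directly
    # heapq is a min-heap, so store the negated value to pop the best first.
    frontier = [(-value(start), counter, start, [start])]
    seen = {start}
    expansions = 0
    while frontier and expansions < max_expansions:
        _, _, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path
        expansions += 1
        for subgoal in generate_subgoals(state):
            if subgoal in seen:
                continue
            seen.add(subgoal)
            counter += 1
            heapq.heappush(
                frontier, (-value(subgoal), counter, subgoal, path + [subgoal])
            )
    return None


# Toy usage (all values are illustrative assumptions): integer states,
# subgoals k = 3 steps away, plus the target itself once it is within reach.
TARGET, K = 10, 3


def toy_generator(state: int) -> List[int]:
    candidates = [state + K, state - K]
    if abs(TARGET - state) <= K:
        candidates.append(TARGET)
    return candidates


toy_path = subgoal_best_first_search(
    start=0,
    is_goal=lambda s: s == TARGET,
    generate_subgoals=toy_generator,
    value=lambda s: -abs(TARGET - s),
)
```

Because each expansion jumps several steps at once, the search visits far fewer nodes than a search over single actions would in the same toy problem, which mirrors the reduction in search space that the abstract attributes to subgoals.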

sokoban, subgoal, subgoal generator, (15 more...)

arXiv.org Artificial Intelligence

2108.11204

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Poland > Masovia Province > Warsaw (0.04)
(11 more...)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
