pickaxe
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Asia > China > Beijing > Beijing (0.04)
- Leisure & Entertainment > Games > Computer Games (0.73)
- Materials > Metals & Mining > Diamonds (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Asia > Vietnam > Hanoi > Hanoi (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Leisure & Entertainment > Games (0.72)
- Materials > Metals & Mining > Iron (0.31)
- North America > United States > Michigan (0.05)
- North America > United States > California > San Mateo County > Menlo Park (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Sweden > Skåne County > Malmö (0.04)
- Asia > Middle East > Jordan (0.04)
Subgoal Graph-Augmented Planning for LLM-Guided Open-World Reinforcement Learning
Fan, Shanwei, Zhang, Bin, Xu, Zhiwei, Teng, Yingxuan, Dai, Siqi, Cheng, Lin, Fan, Guoliang
Large language models (LLMs) offer strong high-level planning capabilities for reinforcement learning (RL) by decomposing tasks into subgoals. However, their practical utility is limited by poor planning-execution alignment, which reflects a critical gap between abstract plans and actionable, environment-compatible behaviors. This misalignment arises from two interrelated limitations: (1) LLMs often produce subgoals that are semantically plausible but infeasible or irrelevant in the target environment due to insufficient grounding in environment-specific knowledge, and (2) single-LLM planning conflates generation with self-verification, resulting in overconfident yet unreliable subgoals that frequently fail during execution. To address these challenges, we propose Subgoal Graph-Augmented Actor-Critic-Refiner (SGA-ACR), a framework that integrates an environment-specific subgoal graph and structured entity knowledge with a multi-LLM planning pipeline that explicitly separates generation, critique, and refinement to produce executable and verifiable subgoals. A subgoal tracker further monitors execution progress, provides auxiliary rewards, and adaptively updates the subgoal graph to maintain alignment between plans and actions. Experimental results on 22 diverse tasks in the open-world game "Crafter" demonstrate the effectiveness of our proposed method.
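The subgoal tracker described in the abstract can be illustrated with a minimal sketch. This is not the authors' SGA-ACR implementation: the class name `SubgoalTracker`, the fixed `bonus` value, and the three-node prerequisite graph are all assumptions made for illustration, and the sketch omits the LLM actor, critic, and refiner as well as the adaptive graph updates. It only shows how a prerequisite graph can gate which subgoals are feasible and when an auxiliary reward is paid.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class SubgoalTracker:
    # Hypothetical prerequisite graph: subgoal -> subgoals that must be completed first.
    prerequisites: Dict[str, Set[str]]
    bonus: float = 1.0  # assumed auxiliary reward per newly completed subgoal
    completed: Set[str] = field(default_factory=set)

    def feasible(self) -> List[str]:
        """Subgoals not yet completed whose prerequisites are all completed."""
        return sorted(
            goal
            for goal, pre in self.prerequisites.items()
            if goal not in self.completed and pre <= self.completed
        )

    def update(self, achieved: str) -> float:
        """Pay the bonus once, and only if the achieved subgoal was feasible."""
        if achieved in self.feasible():
            self.completed.add(achieved)
            return self.bonus
        return 0.0


# Illustrative graph using Crafter-style achievement names.
graph = {
    "collect_wood": set(),
    "place_table": {"collect_wood"},
    "make_wood_pickaxe": {"place_table"},
}
tracker = SubgoalTracker(graph)
```

In this sketch a planner would be restricted to `tracker.feasible()` when proposing the next subgoal, which is one simple way to keep proposed subgoals executable in the environment.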
- Asia > China > Beijing > Beijing (0.05)
- North America > United States > District of Columbia > Washington (0.04)
- Asia > China > Shandong Province (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
One Life to Learn: Inferring Symbolic World Models for Stochastic Environments from Unguided Exploration
Khan, Zaid, Prasad, Archiki, Stengel-Eskin, Elias, Cho, Jaemin, Bansal, Mohit
Symbolic world modeling requires inferring and representing an environment's transitional dynamics as an executable program. Prior work has focused on largely deterministic environments with abundant interaction data, simple mechanics, and human guidance. We address a more realistic and challenging setting, learning in a complex, stochastic environment where the agent has only "one life" to explore a hostile environment without human guidance. We introduce OneLife, a framework that models world dynamics through conditionally-activated programmatic laws within a probabilistic programming framework. Each law operates through a precondition-effect structure, activating in relevant world states. This creates a dynamic computation graph that routes inference and optimization only through relevant laws, avoiding scaling challenges when all laws contribute to predictions about a complex, hierarchical state, and enabling the learning of stochastic dynamics even with sparse rule activation. To evaluate our approach under these demanding constraints, we introduce a new evaluation protocol that measures (a) state ranking, the ability to distinguish plausible future states from implausible ones, and (b) state fidelity, the ability to generate future states that closely resemble reality. We develop and evaluate our framework on Crafter-OO, our reimplementation of the Crafter environment that exposes a structured, object-oriented symbolic state and a pure transition function that operates on that state alone. OneLife can successfully learn key environment dynamics from minimal, unguided interaction, outperforming a strong baseline on 16 out of 23 scenarios tested. We also test OneLife's planning ability, with simulated rollouts successfully identifying superior strategies. Our work establishes a foundation for autonomously constructing programmatic world models of unknown, complex environments.
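The precondition-effect structure and the state-ranking metric can be sketched as follows. This is an illustrative toy, not OneLife itself: the `Law` class, the two hand-written laws, and their probabilities are assumptions, and no law synthesis or parameter learning is shown. The point is only that laws whose precondition is false in the current state contribute nothing to the score of a candidate next state.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]


@dataclass
class Law:
    name: str
    variable: str                                # state variable this law predicts
    precondition: Callable[[State], bool]        # when the law is active
    effect: Callable[[State], Dict[int, float]]  # next value -> probability


def log_likelihood(laws: List[Law], state: State, next_state: State) -> float:
    """Sum log-probabilities over the active laws only."""
    total = 0.0
    for law in laws:
        if not law.precondition(state):
            continue  # inactive laws do not contribute
        p = law.effect(state).get(next_state[law.variable], 0.0)
        total += math.log(p) if p > 0 else float("-inf")
    return total


# Two hypothetical laws with made-up probabilities.
laws = [
    Law("hunger", "food", lambda s: s["food"] > 0,
        lambda s: {s["food"]: 0.9, s["food"] - 1: 0.1}),
    Law("zombie_damage", "health", lambda s: s["zombie_near"] == 1,
        lambda s: {s["health"]: 0.5, s["health"] - 2: 0.5}),
]

state = {"health": 5, "food": 3, "zombie_near": 1}
plausible = {"health": 3, "food": 3, "zombie_near": 1}
implausible = {"health": 9, "food": 3, "zombie_near": 1}

# State ranking: order candidate next states by their score under the laws.
ranked = sorted([implausible, plausible],
                key=lambda s: log_likelihood(laws, state, s), reverse=True)
```

Here `ranked[0]` is the plausible candidate, because the active damage law assigns zero probability to health jumping from 5 to 9.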
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report (0.64)
- Workflow (0.46)
- Leisure & Entertainment > Games > Computer Games (0.93)
- Law (0.68)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.34)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Scripts & Frames (0.34)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > Mexico > Gulf of Mexico (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Leisure & Entertainment > Games > Computer Games (0.73)
- Materials > Metals & Mining > Diamonds (0.46)
Mars: Situated Inductive Reasoning in an Open-World Environment
Xiaojuan Tang
Large Language Models (LLMs) trained on massive corpora have shown remarkable success in knowledge-intensive tasks. Yet most of them rely on pre-stored knowledge. Inducing new general knowledge from a specific environment and reasoning with the acquired knowledge (situated inductive reasoning) is crucial and challenging for machine intelligence. In this paper, we design Mars, an interactive environment devised for situated inductive reasoning. It introduces counter-commonsense game mechanisms by modifying terrain, survival settings, and task dependencies while adhering to certain principles.
- Asia > China (0.04)
- North America > United States > Massachusetts (0.04)
- North America > Montserrat (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Workflow (0.93)
- Research Report > New Finding (0.92)
- Materials > Metals & Mining > Diamonds (0.46)
- Materials > Metals & Mining > Coal (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.83)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Asia > Vietnam > Hanoi > Hanoi (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Materials > Metals & Mining (0.96)
- Leisure & Entertainment > Games (0.72)