AITopics | unstack

Previous work has claimed that this can be mitigated with chain of thought prompting-a method of demonstrating solution procedures-with the intuition that it is possible to in-context teach an LLM an algorithm for solving the problem.

arxiv preprint arxiv, current state, expression, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Arizona (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.93)

Add feedback

efb2072a358cefb75886a315a6fcf880-Supplemental-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 11:20:40 GMT

artificial intelligence, large language model, natural language, (20 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Industry: Transportation (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.77)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.49)

Add feedback

A Appendix Contents

Neural Information Processing SystemsOct-8-2025, 23:01:05 GMT

PlanBench seeks to cover a broad range of tasks related to planning and reasoning.

artificial intelligence, large language model, natural language, (20 more...)

Neural Information Processing Systems

Industry: Transportation (0.31)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.46)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.30)

Add feedback

30699996ff411d48903c9752b782a5c1-Paper-Conference.pdf

Neural Information Processing SystemsOct-8-2025, 09:36:27 GMT

cylinder, obj, robot, (17 more...)

Neural Information Processing Systems

Country:

South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > Canada (0.04)

Genre: Workflow (0.67)

Industry: Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Deliberate Planning in Language Models with Symbolic Representation

Xiong, Siheng, Liu, Zhangding, Zhou, Jieyu, Su, Yusen

arXiv.org Artificial IntelligenceOct-7-2025

Planning remains a core challenge for large language models (LLMs), particularly in domains that require coherent multi-step action sequences grounded in external constraints. We introduce SymPlanner, a novel framework that equips LLMs with structured planning capabilities by interfacing them with a symbolic environment that serves as an explicit world model. Rather than relying purely on natural language reasoning, SymPlanner grounds the planning process in a symbolic state space, where a policy model proposes actions and a symbolic environment deterministically executes and verifies their effects. To enhance exploration and improve robustness, we introduce Iterative Correction (IC), which refines previously proposed actions by leveraging feedback from the symbolic environment to eliminate invalid decisions and guide the model toward valid alternatives. Additionally, Contrastive Ranking (CR) enables fine-grained comparison of candidate plans by evaluating them jointly. Conceptually, SymPlanner operationalizes two cognitive faculties: (i) error monitoring and repair via externalized feedback (IC) and (ii) preference formation among alternatives via pairwise comparison (CR), advancing cognitively plausible, symbol-grounded planning aligned with the rich structure in intelligent systems. We evaluate SymPlanner on PlanBench, demonstrating that it produces more coherent, diverse, and verifiable plans than pure natural language baselines.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2505.01479

Country: North America > United States (0.28)

Genre: Workflow (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Large Reasoning Models in Agent Scenarios: Exploring the Necessity of Reasoning Capabilities

Zhou, Xueyang, Tie, Guiyao, Zhang, Guowen, Wang, Weidong, Zuo, Zhigang, Wu, Di, Chu, Duanfeng, Zhou, Pan, Sun, Lichao, Gong, Neil Zhenqiang

arXiv.org Artificial IntelligenceMar-14-2025

The rise of Large Reasoning Models (LRMs) signifies a paradigm shift toward advanced computational reasoning. Yet, this progress disrupts traditional agent frameworks, traditionally anchored by execution-oriented Large Language Models (LLMs). To explore this transformation, we propose the LaRMA framework, encompassing nine tasks across Tool Usage, Plan Design, and Problem Solving, assessed with three top LLMs (e.g., Claude3.5-sonnet) and five leading LRMs (e.g., DeepSeek-R1). Our findings address four research questions: LRMs surpass LLMs in reasoning-intensive tasks like Plan Design, leveraging iterative reflection for superior outcomes; LLMs excel in execution-driven tasks such as Tool Usage, prioritizing efficiency; hybrid LLM-LRM configurations, pairing LLMs as actors with LRMs as reflectors, optimize agent performance by blending execution speed with reasoning depth; and LRMs' enhanced reasoning incurs higher computational costs, prolonged processing, and behavioral challenges, including overthinking and fact-ignoring tendencies. This study fosters deeper inquiry into LRMs' balance of deep thinking and overthinking, laying a critical foundation for future agent design advancements.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2503.11074

Country: