On the Planning Abilities of Large Language Models: A Critical Investigation

May-25-2025, 16:07:19 GMT–Neural Information Processing Systems

Intrigued by claims of emergent reasoning capabilities in LLMs trained on general web corpora, we set out in this paper to investigate their planning capabilities. We aim to evaluate (1) the effectiveness of LLMs in generating plans autonomously in commonsense planning tasks and (2) the potential of LLMs as a source of heuristic guidance for other agents (AI planners) in their planning tasks. We conduct a systematic study by generating a suite of instances on domains similar to those employed in the International Planning Competition, and we evaluate LLMs in two distinct modes: autonomous and heuristic. Our findings reveal that LLMs' ability to generate executable plans autonomously is rather limited, with the best model (GPT-4) achieving an average success rate of 12% across the domains.
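To make the notion of an "executable plan" and a "success rate" concrete, here is a minimal sketch of how a generated plan could be checked in the autonomous mode. It assumes a STRIPS-style encoding in which each action has preconditions, add effects, and delete effects. The names `Action`, `plan_succeeds`, and `success_rate`, and the toy two-block instance, are hypothetical illustrations and are not taken from the paper or its evaluation code.

```python
from typing import FrozenSet, NamedTuple, Sequence


class Action(NamedTuple):
    """A ground STRIPS-style action (hypothetical encoding for illustration)."""
    name: str
    preconditions: FrozenSet[str]
    add_effects: FrozenSet[str]
    delete_effects: FrozenSet[str]


def plan_succeeds(initial_state: FrozenSet[str],
                  goal: FrozenSet[str],
                  plan: Sequence[Action]) -> bool:
    """Return True if every action is applicable in turn and the goal holds at the end."""
    state = set(initial_state)
    for action in plan:
        # A plan is not executable if any action's preconditions are unmet.
        if not action.preconditions <= state:
            return False
        state = (state - action.delete_effects) | action.add_effects
    return goal <= state


def success_rate(results: Sequence[bool]) -> float:
    """Fraction of instances whose plan was executable and reached the goal."""
    return sum(results) / len(results) if results else 0.0


# Toy two-block instance (assumed for illustration only).
initial = frozenset({"ontable-a", "ontable-b", "clear-a", "clear-b", "handempty"})
goal = frozenset({"on-a-b"})

pickup_a = Action(
    name="pickup a",
    preconditions=frozenset({"ontable-a", "clear-a", "handempty"}),
    add_effects=frozenset({"holding-a"}),
    delete_effects=frozenset({"ontable-a", "clear-a", "handempty"}),
)
stack_a_b = Action(
    name="stack a b",
    preconditions=frozenset({"holding-a", "clear-b"}),
    add_effects=frozenset({"on-a-b", "clear-a", "handempty"}),
    delete_effects=frozenset({"holding-a", "clear-b"}),
)

valid_plan = [pickup_a, stack_a_b]   # executable and reaches the goal
invalid_plan = [stack_a_b]           # fails: block a is not being held

results = [plan_succeeds(initial, goal, p) for p in (valid_plan, invalid_plan)]
rate = success_rate(results)
```

Under these assumptions, a plan counts as a success only if it is both executable step by step and ends in a state satisfying the goal; averaging that boolean over a suite of instances gives a per-domain success rate.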

large language model, machine learning, natural language

Country:
- North America > United States > Arizona (0.14)

Genre:
- Research Report > New Finding (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (0.73)
  - Natural Language > Large Language Model (1.00)