Policy Improvement using Language Feedback Models

Feb-12-2026, 18:56:11 GMT–Neural Information Processing Systems

First, by using Language Feedback Models (LFMs) to identify desirable behaviour to imitate, we improve task-completion rate over strong behavioural cloning baselines on three distinct language grounding environments (Touchdown, ScienceWorld, and ALFWorld). Second, imitation learning using LFMs outperforms using LLMs as experts to directly predict actions, when controlling for the number of LLM output tokens.
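The idea of using a feedback model to pick out behaviour worth imitating can be illustrated with a minimal sketch. It is not the paper's implementation: `select_desirable`, `toy_feedback`, and the sample rollout are hypothetical names and data introduced here, and the keyword rule stands in for a trained LFM. The sketch only shows the filtering step that would produce demonstrations for imitation learning.

```python
from typing import Callable, List, Tuple

# A step is an (observation, action) pair taken from a policy rollout.
Step = Tuple[str, str]


def select_desirable(
    instruction: str,
    rollout: List[Step],
    is_desirable: Callable[[str, str, str], bool],
) -> List[Step]:
    """Keep only the steps that the feedback function marks as desirable."""
    return [
        (obs, act) for obs, act in rollout if is_desirable(instruction, obs, act)
    ]


def toy_feedback(instruction: str, observation: str, action: str) -> bool:
    # Hypothetical stand-in for a trained LFM (assumption, not the paper's model):
    # approve an action if it mentions any word from the instruction.
    return any(word in action for word in instruction.lower().split())


# Hypothetical rollout in a text environment.
rollout = [
    ("you are in the kitchen", "open fridge"),
    ("fridge is open", "take apple"),
    ("holding apple", "go to bedroom"),
]

# The retained steps would serve as demonstrations for behavioural cloning.
demos = select_desirable("Take the apple", rollout, toy_feedback)
print(demos)
```

In practice the feedback function would be a learned model rather than a keyword rule, and the retained steps would be fed to whatever imitation-learning update the policy uses.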

large language model, machine learning, natural language

Country:
- South America > Colombia > Meta Department > Villavicencio (0.04)

Genre:
- Research Report > Experimental Study (0.93)

Industry:
- Education > Educational Setting (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Robots (1.00)
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.94)
