Reinforcing LLM Agents via Policy Optimization with Action Decomposition

Feb-17-2026, 19:41:22 GMT – Neural Information Processing Systems

Beginning with the simplification of flattening all actions, we theoretically explore the discrepancies between action-level optimization and this naive token-level optimization.

large language model, machine learning, reinforcement learning

Country:
- Asia > China > Shanghai > Shanghai (0.04)

Genre:
- Research Report > Experimental Study (0.93)

Industry:
- Information Technology (0.67)
- Education > Curriculum
  - Subject-Specific Education (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Natural Language > Large Language Model (1.00)
  - Machine Learning
    - Reinforcement Learning (0.95)
    - Neural Networks > Deep Learning (0.47)
