Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning

Oct-10-2025, 16:29:55 GMT–Neural Information Processing Systems

Large vision-language models (VLMs) fine-tuned on specialized visual instruction-following data have exhibited impressive language reasoning capabilities across various scenarios.

Country:
- North America > United States
  - Washington > King County
    - Seattle (0.04)
  - Pennsylvania > Allegheny County
    - Pittsburgh (0.04)
- Europe > Sweden
  - Stockholm > Stockholm (0.04)
- Asia
  - Middle East > Jordan (0.04)
  - China
    - Hong Kong (0.04)
    - Guangdong Province > Shenzhen (0.04)

Genre:
- Research Report
  - Experimental Study (1.00)
  - New Finding (0.67)

Industry:
- Leisure & Entertainment > Games (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Representation & Reasoning (1.00)
  - Natural Language > Large Language Model (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Neural Networks > Deep Learning (1.00)
