Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning
Neural Information Processing Systems
Existing offline RL methods tend to behave poorly during fine-tuning. In this paper, we study the fine-tuning problem in the context of conservative offline RL methods and devise an approach for learning an effective initialization from offline data that also enables fast online fine-tuning.
Oct-9-2025, 06:48:08 GMT
- Country:
  - Asia > China > Guangdong Province > Shenzhen (0.04)
  - Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
  - North America > United States
    - Montana (0.04)
    - Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Genre:
  - Research Report (0.93)
- Technology: