Q-Adapter: Training Your LLM Adapter as a Residual Q-Function
