SAD: State-Action Distillation for In-Context Reinforcement Learning under Random Policies
