Aligning AI Agents via Information-Directed Sampling