Approximated Variational Bayesian Inverse Reinforcement Learning for Large Language Model Alignment
