How to Provably Improve Return Conditioned Supervised Learning?