TGPO: Tree-Guided Preference Optimization for Robust Web Agent Reinforcement Learning
