Reinforcement Learning for Long-Horizon Interactive LLM Agents