EPO: Hierarchical LLM Agents with Environment Preference Optimization