Solving the Granularity Mismatch: Hierarchical Preference Learning for Long-Horizon LLM Agents