LIFT: Improving Long Context Understanding Through Long Input Fine-Tuning