SAGE-LD: Towards Scalable and Generalizable End-to-End Language Diarization via Simulated Data Augmentation