Distilling Efficient Language-Specific Models for Cross-Lingual Transfer