Enabling automatic transcription of child-centered audio recordings from real-world environments