Enhancing Multilingual LLM Pretraining with Model-Based Data Selection