Minder: Faulty Machine Detection for Large-scale Distributed Model Training