Weighted Mutual Learning with Diversity-Driven Model Compression