On Model Compression for Neural Networks: Framework, Algorithm, and Convergence Guarantee