Fast Convergence in Learning Two-Layer Neural Networks with Separable Data