Greedy Low-Rank Gradient Compression for Distributed Learning with Convergence Guarantees