On Biased Compression for Distributed Learning