Slim attention: cut your context memory in half without loss of accuracy -- K-cache is all you need for MHA