Attention and Compression Is All You Need for Controllably Efficient Language Models