Efficient Audio Captioning Transformer with Patchout and Text Guidance