Single layer tiny Co$^4$ outpaces GPT-2 and GPT-BERT