Distilling Large Language Models into Tiny and Effective Students using pQRNN
