KWT-Tiny: RISC-V Accelerated, Embedded Keyword Spotting Transformer

Al-Qawlaq, Aness; M, Ajay Kumar; John, Deepu

arXiv.org Artificial Intelligence 

University College Dublin, Ireland

Abstract -- This paper explores the adaptation of Transformer-based models for edge devices through the quantisation and hardware acceleration of the ARM Keyword Transformer (KWT) model on a RISC-V platform. The model was targeted to run within 64 kB of RAM in bare-metal C using a custom-developed edge AI library. KWT-1 was retrained to be 369 times smaller, with only a 10% loss in accuracy, by reducing the number of output classes from 35 to 2. Together, retraining and quantisation reduced the model size from 2.42 MB to 1.65 kB. Custom RISC-V instructions that accelerate the GELU and SoftMax operations enabled a 5x speedup, and thus a ~5x power reduction, in inference: the inference clock cycle count decreased from 26 million to 5.5 million, at the cost of a small area overhead of approximately 29%. The results demonstrate a viable method for porting and accelerating Transformer-based models in low-power IoT devices.