FlashRNN: Optimizing Traditional RNNs on Modern Hardware