Flash Inference: Near Linear Time Inference for Long Convolution Sequence Models and Beyond
