SequenceLayers: Sequence Processing and Streaming Neural Networks Made Easy

Skerry-Ryan, RJ, Salazar, Julian, Mariooryad, Soroosh, Kao, David, Stanton, Daisy, Battenberg, Eric, Shannon, Matt, Weiss, Ron J., Scheibler, Robin, Rothfuss, Jonas, Bagby, Tom

arXiv.org Artificial Intelligence 

We introduce a neural network layer API and library for sequence modeling, designed for easy creation of sequence models that can be executed both layer-by-layer (e.g., teacher-forced training) and step-by-step (e.g., autoregressive sampling). To achieve this, layers define an explicit representation of their state over time (e.g., a Transformer KV cache, a convolution buffer, an RNN hidden state) and a step method that evolves that state, tested to give identical results to a stateless layer-wise invocation. This and other aspects of the SequenceLayers contract enable complex models to be immediately streamable and mitigate a wide range of common bugs arising in both streaming and parallel sequence processing; the contract can be implemented in any deep learning library.
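
The layer-wise versus step-wise equivalence described above can be illustrated with a toy example. The following is a minimal sketch in plain Python, not the SequenceLayers API: the class name `CausalConv1D` and the methods `get_initial_state`, `layer`, and `step` are hypothetical names chosen for illustration. It assumes a causal convolution over a scalar sequence whose explicit state is a buffer of the most recent inputs, and checks that stepping through the sequence reproduces the stateless whole-sequence result.

```python
from typing import List, Tuple


class CausalConv1D:
    """Hypothetical toy layer: a causal FIR filter over a scalar sequence."""

    def __init__(self, kernel: List[float]):
        self.kernel = list(kernel)

    def get_initial_state(self) -> List[float]:
        # Explicit state: a buffer of the last (kernel_size - 1) inputs,
        # zero-filled to match the causal padding used in layer().
        return [0.0] * (len(self.kernel) - 1)

    def layer(self, xs: List[float]) -> List[float]:
        # Stateless, whole-sequence invocation with causal (left) zero padding.
        k = len(self.kernel)
        padded = [0.0] * (k - 1) + list(xs)
        return [
            sum(w * padded[t + i] for i, w in enumerate(self.kernel))
            for t in range(len(xs))
        ]

    def step(self, x: float, state: List[float]) -> Tuple[float, List[float]]:
        # Consume one input, emit one output, and return the evolved buffer.
        window = state + [x]
        y = sum(w * v for w, v in zip(self.kernel, window))
        return y, window[1:]


conv = CausalConv1D([0.5, 0.25, 0.25])
xs = [1.0, 2.0, 3.0, 4.0]

# Layer-by-layer: process the whole sequence at once.
layerwise = conv.layer(xs)

# Step-by-step: thread the state through one timestep at a time.
state = conv.get_initial_state()
stepwise = []
for x in xs:
    y, state = conv.step(x, state)
    stepwise.append(y)

assert layerwise == stepwise
```

In this sketch the equivalence check is a single assertion on one input; the abstract describes testing that the step method gives identical results to the layer-wise invocation, and a fuller test would presumably cover varied sequence lengths and kernel sizes.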
