Sample-Efficient Language Modeling with Linear Attention and Lightweight Enhancements
