Reviews: A Tensorized Transformer for Language Modeling

Neural Information Processing Systems

This code failed to compile and had numerous confusing aspects, and the authors did not link to the actual code used to train the model. April 2019), but I could find no comparison with that work. However, I would also like to see the total FLOPs compared to the baseline, as FLOPs are frequently the limiting factor for training and deploying models.