How to code The Transformer in Pytorch – Towards Data Science
Could The Transformer be another nail in the coffin for RNNs? Doing away with the clunky for loops, it finds a way to allow whole sentences to simultaneously enter the network in batches. The miracle is that NLP can then reclaim the advantage of python's highly efficient linear algebra libraries. The time-saving is then spent deploying more layers into the model. So far it seems the result is faster convergence and better results.
Oct-2-2018, 16:22:07 GMT
- Technology: