Google Brain's Universal Transformers: an extension to its standard translation system (Packt Hub)
In August last year, Google released the Transformer, a novel neural network architecture based on a self-attention mechanism that is particularly well suited for language understanding. Before the Transformer, most neural-network-based approaches to machine translation relied on recurrent neural networks (RNNs), which process a sequence one element at a time. In contrast, the Transformer used no recurrence. Instead, it processed all words or symbols in the sequence in parallel and let each word attend to every other word over multiple processing steps, using self-attention to incorporate context from words farther away. This approach allowed the Transformer to train much faster than recurrent models and to yield better translation results. "However, on smaller and more structured language understanding tasks, or even simple algorithmic tasks such as copying a string (e.g. to transform an input of 'abc' to 'abcabc'), the Transformer does not perform very well," say Stephan Gouws and Mostafa Dehghani from the Google Brain team. Hence, this year the team has come up with Universal Transformers, an extension of the standard Transformer that is computationally universal, using a novel and efficient flavor of parallel-in-time recurrence.
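To make the idea of self-attention more concrete, here is a minimal sketch in plain Python. It is a toy illustration, not Google's implementation: the function names (`softmax`, `self_attention_step`, `refine`) and the two-dimensional token vectors are hypothetical. It also leaves out the learned query/key/value projections, feed-forward layers, and position information that a real Transformer uses. Every position is updated from all positions at once, and the same step function is reused at each iteration, which loosely mirrors the recurrence over processing steps described for Universal Transformers.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_step(states):
    """One processing step: every position attends to every position.

    Assumption: queries, keys, and values are the raw vectors themselves;
    a real Transformer would apply learned projections first.
    """
    d = len(states[0])
    new_states = []
    for query in states:
        # Scaled dot-product score between this position and each position.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in states
        ]
        weights = softmax(scores)
        # New vector: attention-weighted average of all position vectors.
        mixed = [
            sum(w * value[i] for w, value in zip(weights, states))
            for i in range(d)
        ]
        new_states.append(mixed)
    return new_states


def refine(states, steps):
    """Apply the same attention step repeatedly (shared across steps)."""
    for _ in range(steps):
        states = self_attention_step(states)
    return states


# Hypothetical toy vectors standing in for the symbols "a", "b", "c".
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
refined = refine(tokens, steps=3)
```

Each new vector is computed from the previous step's vectors only, so within a step all positions could be updated simultaneously. That is what distinguishes this from an RNN, which must walk through the sequence in order.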
Sep-27-2018, 12:02:08 GMT