Hash Embeddings for Efficient Word Representations
Svenstrup, Dan Tito; Hansen, Jonas; Winther, Ole
Neural Information Processing Systems
A hash embedding may be seen as an interpolation between a standard word embedding and a word embedding created using a random hash function (the hashing trick). In hash embeddings, each token is represented by $k$ $d$-dimensional embedding vectors and one $k$-dimensional weight vector. The final $d$-dimensional representation of the token is the product of the two, i.e. the weighted sum of the $k$ embedding vectors. Rather than fitting the embedding vectors for each token, these are selected by the hashing trick from a shared pool of $B$ embedding vectors. Our experiments show that hash embeddings can easily deal with huge vocabularies consisting of millions of tokens.
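The lookup described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `HashEmbedding`, the helper `bucket`, the use of seeded BLAKE2b digests as the $k$ hash functions, and the random initialization are all hypothetical choices made for the example. Following the abstract, each token keeps its own $k$-dimensional weight vector, while the embedding vectors come from a shared pool of $B$ rows. Training is omitted.

```python
import hashlib

import numpy as np


def bucket(token: str, seed: int, num_buckets: int) -> int:
    """Hypothetical hash function: map a token to one of B pool rows."""
    digest = hashlib.blake2b(f"{seed}:{token}".encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "little") % num_buckets


class HashEmbedding:
    """Sketch of a hash embedding lookup (forward pass only)."""

    def __init__(self, vocab, num_buckets, dim, k, seed=0):
        rng = np.random.default_rng(seed)
        self.k = k
        self.num_buckets = num_buckets
        # Shared pool of B embedding vectors, each d-dimensional.
        self.pool = rng.normal(scale=0.1, size=(num_buckets, dim))
        # One k-dimensional weight vector per token (assumed layout).
        self.index = {tok: i for i, tok in enumerate(vocab)}
        self.weights = rng.normal(scale=0.1, size=(len(vocab), k))

    def component_ids(self, token):
        # k different hash functions select k rows from the shared pool.
        return [bucket(token, j, self.num_buckets) for j in range(self.k)]

    def embed(self, token):
        components = self.pool[self.component_ids(token)]  # shape (k, d)
        p = self.weights[self.index[token]]                # shape (k,)
        return p @ components                              # shape (d,)


emb = HashEmbedding(["cat", "dog"], num_buckets=16, dim=4, k=2)
vector = emb.embed("cat")  # d-dimensional representation of "cat"
```

With this layout the pool costs $B \cdot d$ parameters regardless of vocabulary size, and each token adds only $k$ weights, which is where the savings over a standard embedding table would come from.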
Feb-14-2020, 16:27:30 GMT