Pointer Networks

Oriol Vinyals, Meire Fortunato, Navdeep Jaitly

Feb-6-2025, 15:31:32 GMT–Neural Information Processing Systems

We introduce a new neural architecture to learn the conditional probability of an output sequence whose elements are discrete tokens corresponding to positions in an input sequence. Such problems cannot be trivially addressed by existing approaches such as sequence-to-sequence [1] and Neural Turing Machines [2], because the number of target classes at each output step depends on the length of the input, which is variable. Problems such as sorting variable-sized sequences and various combinatorial optimization problems belong to this class. Our model solves the problem of variable-size output dictionaries using a recently proposed mechanism of neural attention. It differs from previous uses of attention in that, instead of using attention to blend the hidden units of an encoder into a context vector at each decoder step, it uses attention as a pointer to select a member of the input sequence as the output.
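The pointing step described above can be illustrated with a minimal sketch. It assumes an additive scoring function, u_j = v · tanh(W1 e_j + W2 d), followed by a softmax over input positions. The abstract does not spell out this form, so the parameter names (`W1`, `W2`, `v`), the helper `make_params`, and the random untrained weights are all illustrative assumptions, not the authors' implementation. The point is only that the output distribution has exactly one entry per input element, so its size follows the input length.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_distribution(encoder_states, decoder_state, W1, W2, v):
    """Return a probability distribution over input positions.

    Assumed scoring: u_j = v . tanh(W1 e_j + W2 d). The softmax of the
    scores is used directly as the output distribution, rather than as
    weights for blending encoder states into a context vector.
    """
    proj_d = matvec(W2, decoder_state)
    scores = []
    for e in encoder_states:
        proj_e = matvec(W1, e)
        hidden = [math.tanh(a + b) for a, b in zip(proj_e, proj_d)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    return softmax(scores)


def make_params(hidden_size, seed=0):
    """Hypothetical helper: random, untrained parameters for the sketch."""
    rng = random.Random(seed)

    def rand_vec():
        return [rng.uniform(-1.0, 1.0) for _ in range(hidden_size)]

    W1 = [rand_vec() for _ in range(hidden_size)]
    W2 = [rand_vec() for _ in range(hidden_size)]
    v = rand_vec()
    return W1, W2, v


if __name__ == "__main__":
    hidden_size = 4
    W1, W2, v = make_params(hidden_size)
    rng = random.Random(1)
    decoder_state = [rng.uniform(-1.0, 1.0) for _ in range(hidden_size)]
    # The same parameters handle inputs of different lengths.
    for n in (3, 7):
        encoder_states = [
            [rng.uniform(-1.0, 1.0) for _ in range(hidden_size)]
            for _ in range(n)
        ]
        probs = pointer_distribution(encoder_states, decoder_state, W1, W2, v)
        pointed = max(range(n), key=lambda j: probs[j])
        print(n, len(probs), pointed)
```

Because the parameters depend only on the hidden size and not on the number of inputs, the same weights produce a 3-way distribution for a 3-element input and a 7-way distribution for a 7-element input, which is the variable-size output dictionary the abstract refers to.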

artificial intelligence, machine learning, natural language

Genre:
- Research Report > New Finding (0.47)