Pointer Networks with Transformers
The original Pointer Networks paper[1] was accepted to NeurIPS 2015, making it quite old in deep learning years. Nonetheless, it has amassed over 1700 citations to date. Pointer networks continue to be integrated into modern solutions[2, 3], have received many improvements[4, 5], and have inspired alternative architectures[6]. They even play a small but important role in a state-of-the-art StarCraft II model created by Tencent AI Lab[7]. What is it about pointer networks that makes them so applicable even today? This simple and elegant architecture addresses a subtle complication in sequence prediction problems.
Jun-19-2021, 00:05:11 GMT