Autoregressive Modeling with Lookahead Attention
Du, Li; Mei, Hongyuan; Eisner, Jason
arXiv.org Artificial Intelligence
To predict the next token, autoregressive models ordinarily examine the past. Could they also benefit from also examining hypothetical futures? We consider a novel Transformer-based autoregressive architecture that estimates the next-token distribution by extrapolating multiple continuations of the past, according to some proposal distribution, and attending to these extended strings. This architecture draws insights from classical AI systems such as board game players: when making a local decision, a policy may benefit from exploring possible future trajectories and analyzing them. On multiple tasks including morphological ...

However, those NP-hard distributions are artificial. For naturally occurring sequences, why might one expect lookahead to help autoregressive modeling? We argue that when the sequences represent an agent's behavior, an autoregressive parameterization is not always the simplest description. If the behavior is goal-directed (for example, an agent trying to achieve high reward in a Markov Decision Process), then the simplest description may include a characterization of the agent's environment and goals. Even if the agent explicitly consults an autoregressive policy p(action | state) at each step, that policy is not arbitrary: while it may appear complex, it was shaped by reinforcement learning or by natural selection so as to achieve high-reward trajectories.
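The lookahead idea described in the abstract (sample several continuations of the prefix from a proposal distribution, then let the next-token estimate depend on them) can be illustrated with a toy sketch. This is not the paper's architecture: the learned Transformer attention is replaced here by a plain softmax over a hand-written `score` function, and the names `sample_continuation`, `lookahead_next_token`, `proposal`, and `score` are hypothetical, chosen only for illustration.

```python
import math
import random
from collections import defaultdict


def sample_continuation(prefix, proposal, horizon, rng):
    """Draw `horizon` further tokens from a bigram proposal (assumed form)."""
    seq = list(prefix)
    for _ in range(horizon):
        tokens, probs = zip(*proposal[seq[-1]].items())
        seq.append(rng.choices(tokens, weights=probs, k=1)[0])
    return seq[len(prefix):]


def lookahead_next_token(prefix, proposal, score, k=200, horizon=3, seed=0):
    """Estimate a next-token distribution from k sampled continuations.

    Each continuation is weighted by softmax(score(prefix + continuation)),
    a crude stand-in for attending to the extended strings. The weight of
    each continuation is credited to its first token. Requires horizon >= 1.
    """
    rng = random.Random(seed)
    rollouts = [sample_continuation(prefix, proposal, horizon, rng)
                for _ in range(k)]
    scores = [score(list(prefix) + r) for r in rollouts]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dist = defaultdict(float)
    for rollout, w in zip(rollouts, weights):
        dist[rollout[0]] += w / total
    return dict(dist)


# Toy setup: a uniform bigram proposal over {"a", "b"} and a score that
# rewards sequences containing many "b" tokens.
proposal = {
    "a": {"a": 0.5, "b": 0.5},
    "b": {"a": 0.5, "b": 0.5},
}
dist = lookahead_next_token(["a"], proposal,
                            score=lambda seq: 2 * seq.count("b"))
print(dist)
```

Although the proposal itself is uniform, the weighted estimate shifts probability toward "b", because continuations that begin with "b" tend to score higher. In this toy, that is the sense in which examining hypothetical futures can inform a local decision.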
May-20-2023
- Country:
  - North America
    - United States
      - New York (0.04)
      - Washington > King County > Seattle (0.04)
      - Illinois > Cook County > Chicago (0.04)
      - California > San Francisco County > San Francisco (0.14)
    - Canada > British Columbia
  - Europe > Finland
- Genre:
  - Research Report (1.00)
- Industry:
  - Leisure & Entertainment > Games (0.68)
- Technology: