Mind the Gap: Assessing Temporal Generalization in Neural Language Models
–Neural Information Processing Systems
In the case of GPT-3 (Brown et al., 2020), such tasks include LAMBADA (Paperno et al., 2016), TriviaQA [...] First, they do not assess a language model's ability to generalize well to future data from beyond their training period, an [...] Augenstein et al., 2019), forecasting stock prices from the latest news articles (Ding et al., 2015), and answering knowledge-intensive questions like "How many people have been infected by COVID-19?" [...] Second, the temporal overlap between the training and evaluation data increases the risk of "test data [...] Nevertheless, language modelling data are not i.i.d. [...] Brown et al. (2020) used [...] This can potentially induce a correlation between the training and evaluation sets that LMs can exploit.
Nov-16-2025, 03:17:33 GMT
- Country:
  - Asia > Middle East > Jordan (0.04)
  - Europe > United Kingdom > England > Greater London > London (0.04)
  - North America > Canada > Ontario > Toronto (0.04)
  - North America > United States > California > Monterey County > Pacific Grove (0.04)
  - Oceania > New Zealand (0.04)
- Genre:
  - Research Report > New Finding (0.93)
- Industry:
  - Education > Educational Setting (0.46)
  - Government > Regional Government (0.67)
  - Health & Medicine > Therapeutic Area (0.54)
- Technology: