Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective Huayang Li Tian Lan Zihao Fu Deng Cai Lemao Liu Nigel Collier
–Neural Information Processing Systems
In this work, we aim to advance our understanding by presenting a straightforward and fundamental explanation from the data perspective. Our preliminary investigation reveals a strong correlation between the degeneration issue and the presence of repetitions in training data. Subsequent experiments also demonstrate that by selectively dropping out the attention to repetitive words in training data, degeneration can be significantly minimized.
Neural Information Processing Systems
Feb-17-2026, 16:54:14 GMT
- Country:
- Oceania > Australia
- North America
- Dominican Republic (0.04)
- United States
- Washington > King County
- Seattle (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- California
- Santa Clara County > Palo Alto (0.04)
- San Diego County > San Diego (0.04)
- Los Angeles County > Long Beach (0.04)
- Washington > King County
- Europe
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Italy > Tuscany
- Florence (0.04)
- Denmark > Capital Region
- Copenhagen (0.04)
- United Kingdom > England
- Asia > Middle East
- UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- Africa > Ethiopia
- Addis Ababa > Addis Ababa (0.04)
- Genre:
- Research Report > New Finding (0.67)
- Technology: