Large Language Model
Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective Huayang Li Tian Lan Zihao Fu Deng Cai Lemao Liu Nigel Collier
In this work, we aim to advance our understanding by presenting a straightforward and fundamental explanation from the data perspective. Our preliminary investigation reveals a strong correlation between the degeneration issue and the presence of repetitions in training data. Subsequent experiments also demonstrate that by selectively dropping out the attention to repetitive words in training data, degeneration can be significantly minimized.