self-reinforcement effect



Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation

Neural Information Processing Systems

While large-scale neural language models, such as GPT2 and BART, have achieved impressive results on various text generation tasks, they tend to get stuck in undesirable sentence-level loops with maximization-based decoding algorithms (e.g., greedy search). This phenomenon is counter-intuitive since there are few consecutive sentence-level repetitions in human corpora (e.g., 0.02% in Wikitext-103). To investigate the underlying reasons for generating consecutive sentence-level repetitions, we study the relationship between the probability of repetitive tokens and their previous repetitions in context. Through our quantitative experiments, we find that 1) models have a preference to repeat the previous sentence; 2) sentence-level repetitions have a self-reinforcement effect: the more times a sentence is repeated in the context, the higher the probability of continuing to generate that sentence; and 3) sentences with higher initial probabilities usually have a stronger self-reinforcement effect. Motivated by our findings, we propose a simple and effective training method, DITTO (PseuDo-RepetITion PenalizaTiOn), where the model learns to penalize probabilities of sentence-level repetitions from synthetic repetitive data. Although our method is motivated by mitigating repetitions, our experiments show that DITTO not only mitigates the repetition issue without sacrificing perplexity, but also achieves better generation quality. Extensive experiments on open-ended text generation (Wikitext-103) and text summarization (CNN/DailyMail) demonstrate the generality and effectiveness of our method.
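As a rough illustration of the DITTO objective described above (a minimal sketch, not the authors' implementation): the training method penalizes the token probabilities of a sentence's n-th repetition so that they decay toward a fraction λ of the probabilities under the (n-1)-th repetition, via a loss of the form -log(1 - |p_n - λ·p_{n-1}|). The function name and the toy probability values below are illustrative assumptions.

```python
import numpy as np

def ditto_loss(p_n, p_prev, lam=0.5, eps=1e-9):
    """Pseudo-repetition penalization loss (sketch).

    p_n:    token probabilities of a sentence on its n-th repetition
    p_prev: probabilities of the same tokens on the (n-1)-th repetition
    lam:    decay factor lambda; the target is p_n ~ lam * p_prev, so the
            probability of a sentence shrinks each time it is repeated
    """
    p_n = np.asarray(p_n, dtype=float)
    p_prev = np.asarray(p_prev, dtype=float)
    # -log(1 - |p_n - lam * p_prev|), averaged over the sentence's tokens
    return float(-np.log(1.0 - np.abs(p_n - lam * p_prev) + eps).mean())

# Self-reinforced repetition (probability grew across repeats) incurs a
# larger loss than behavior where the probability decayed toward lam*p_prev.
reinforced = ditto_loss(p_n=[0.9, 0.8], p_prev=[0.6, 0.5], lam=0.5)
decayed = ditto_loss(p_n=[0.3, 0.25], p_prev=[0.6, 0.5], lam=0.5)
assert reinforced > decayed
```

Under this objective, a model that keeps raising the probability of a repeated sentence is pushed back toward the decayed target, which is the self-reinforcement effect the paper aims to break.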


Appendix of 'Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation'

Neural Information Processing Systems

We calculate it for each sequence x and average over the whole corpus. When decoding auto-regressively, the probabilities of the repetitive sentence loops also have a self-reinforcement effect: as shown in Figure 2, the probability of the token 'located' (y-axis) increases as the number of historical repetitions grows. Here we use the end token to split sentences for ease of experiments. (This work was conducted at Apple. Figures are best viewed in color, zoomed in on a desktop monitor.)




Penalty Decoding: Well Suppress the Self-Reinforcement Effect in Open-Ended Text Generation

Zhu, Wenhong, Hao, Hongkun, Wang, Rui

arXiv.org Artificial Intelligence

The decoding algorithm is critical for open-ended text generation, transforming latent representations into coherent and meaningful outputs. This paper investigates the self-reinforcement effect in text generation and the effectiveness of a repetition penalty in mitigating it. However, determining the optimal repetition penalty value is challenging. To tackle this, we propose a forgetting mechanism that disregards distant tokens, reducing the burden of penalty selection. In addition, we introduce a length penalty to address overly short sentences caused by excessive penalties. Our penalty decoding approach, which incorporates these three strategies, helps resolve issues with sampling methods that deviate from factual information. Experimental results demonstrate the efficacy of our approach in generating high-quality sentences resembling human output.
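A minimal sketch of how the three strategies might compose at one decoding step (the window size, penalty value, minimum length, and function name are illustrative assumptions, not the paper's exact formulation): a CTRL-style repetition penalty is applied only to tokens inside a recent window (the forgetting mechanism), and the end-of-sequence logit is suppressed until a minimum length is reached (the length penalty).

```python
import numpy as np

def penalty_decode_step(logits, generated, eos_id,
                        penalty=1.2, window=64, min_len=10):
    """One greedy decoding step combining three strategies (sketch).

    - repetition penalty: down-weight previously generated tokens
      (divide positive logits by `penalty`, multiply negative ones)
    - forgetting mechanism: only the last `window` tokens are penalized,
      so distant tokens no longer contribute to the penalty
    - length penalty: suppress EOS until `min_len` tokens are generated
    """
    logits = np.array(logits, dtype=float)
    recent = set(generated[-window:])     # forgetting: distant tokens ignored
    for tok in recent:
        if logits[tok] > 0:
            logits[tok] /= penalty
        else:
            logits[tok] *= penalty
    if len(generated) < min_len:          # length penalty: block early EOS
        logits[eos_id] = -np.inf
    return int(np.argmax(logits))         # greedy pick over penalized logits

# Token 0 was just generated, so its logit is penalized and token 1 wins.
assert penalty_decode_step([2.0, 1.9, 0.5, 0.0], [0] * 10, eos_id=2) == 1
```

The forgetting window bounds how long a token stays penalized, which is what relieves the pressure to hand-tune the penalty value for long generations.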


Understanding In-Context Learning from Repetitions

Yan, Jianhao, Xu, Jin, Song, Chiyu, Wu, Chenming, Li, Yafu, Zhang, Yue

arXiv.org Artificial Intelligence

This paper explores the elusive mechanism underpinning in-context learning in Large Language Models (LLMs). Our work provides a novel perspective by examining in-context learning via the lens of surface repetitions. We quantitatively investigate the role of surface features in text generation, and empirically establish the existence of token co-occurrence reinforcement, a principle that strengthens the relationship between two tokens based on their contextual co-occurrences. By investigating the dual impacts of these features, our research illuminates the internal workings of in-context learning and expounds on the reasons for its failures. This paper provides an essential contribution to the understanding of in-context learning and its potential limitations, offering a fresh perspective on this exciting capability.

The impressive ability of Large Language Models (LLMs; Touvron et al. (2023); Chowdhery et al. (2022); OpenAI (2023)) to execute in-context learning (ICL) is a standout characteristic. This behavior mirrors human learning and reasoning from analogy (Winston, 1980), enabling LLMs to rapidly adapt to a range of downstream tasks. Without being explicitly pretrained to learn from demonstrations, LLMs can predict responses to unseen test queries from a few demonstrations and without any instruction given (Brown et al., 2020; Zhang et al., 2022; Chowdhery et al., 2022). An example of in-context learning can be found in Figure 1(a), where a pre-trained LLaMA model is given demonstrations for a binary classification task and learns to make predictions correctly. Despite the success in applications, the working mechanism of in-context learning is still an open question. We take a feature-centric view to understand ICL, analyzing the key patterns in the input context that correlate with ICL behavior. In particular, as Figure 1(b) shows, in-context demonstrations can result not only in desired effects but can also cause errors. In this example, the same LLaMA model makes the incorrect prediction 'True' given the input "Circulation revenue has decreased by 5% in Finland.", which is likely because of the repeated pattern "Answer:" -> "True" from the demonstrations. From the same perspective, the success case in Figure 1(a) can be attributed to learning desired patterns such as "Answer:" -> "True|False" in the demonstrations.
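One way to see a co-occurrence reinforcement effect in a toy setting (purely illustrative; not the paper's LLM experiments): a smoothed bigram model fit on the context alone assigns a probability to "True" after "Answer:" that grows with the number of ("Answer:", "True") co-occurrences in the prompt. The function name, smoothing scheme, and vocabulary size are assumptions made for the sketch.

```python
from collections import Counter

def bigram_prob(context_tokens, prev_tok, next_tok, alpha=1.0, vocab_size=10):
    """Add-alpha-smoothed bigram probability estimated from the context.

    A model driven by surface co-occurrence statistics of its context
    assigns P(next_tok | prev_tok) that grows with the number of
    (prev_tok, next_tok) co-occurrences -- a toy analogue of the
    token co-occurrence reinforcement described above.
    """
    pairs = Counter(zip(context_tokens, context_tokens[1:]))
    prev_count = sum(c for (a, _), c in pairs.items() if a == prev_tok)
    pair_count = pairs[(prev_tok, next_tok)]
    return (pair_count + alpha) / (prev_count + alpha * vocab_size)

# The more often "Answer: True" appears in the demonstrations, the higher
# the estimated probability of "True" after "Answer:".
demo = ["Answer:", "True"]
probs = [bigram_prob(demo * k, "Answer:", "True") for k in (1, 2, 4, 8)]
assert probs == sorted(probs)  # monotonically increasing with repetitions
```

This mirrors both faces of the effect in the abstract: the same mechanism that lets demonstrations teach a desired pattern can also lock the model onto a spurious one.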


Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation

Xu, Jin, Liu, Xiaojiang, Yan, Jianhao, Cai, Deng, Li, Huayang, Li, Jian

arXiv.org Artificial Intelligence

While large-scale neural language models, such as GPT2 and BART, have achieved impressive results on various text generation tasks, they tend to get stuck in undesirable sentence-level loops with maximization-based decoding algorithms (e.g., greedy search). This phenomenon is counter-intuitive since there are few consecutive sentence-level repetitions in human corpora (e.g., 0.02% in Wikitext-103). To investigate the underlying reasons for generating consecutive sentence-level repetitions, we study the relationship between the probabilities of the repetitive tokens and their previous repetitions in the context. Through our quantitative experiments, we find that 1) language models have a preference to repeat the previous sentence; 2) the sentence-level repetitions have a self-reinforcement effect: the more times a sentence is repeated in the context, the higher the probability of continuing to generate that sentence; 3) the sentences with higher initial probabilities usually have a stronger self-reinforcement effect. Motivated by our findings, we propose a simple and effective training method, DITTO (PseuDo-RepetITion PenalizaTiOn), where the model learns to penalize probabilities of sentence-level repetitions from pseudo repetitive data. Although our method is motivated by mitigating repetitions, experiments show that DITTO not only mitigates the repetition issue without sacrificing perplexity, but also achieves better generation quality. Extensive experiments on open-ended text generation (Wikitext-103) and text summarization (CNN/DailyMail) demonstrate the generality and effectiveness of our method.
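The pseudo repetitive data mentioned above can be constructed straightforwardly; a minimal sketch follows (the sampling details, token limit, and function name are assumptions for illustration): pick one sentence from the training corpus and tile it until a maximum sample length is reached, yielding examples on which repetition probabilities can then be penalized.

```python
import random

def build_pseudo_repetition(sentences, max_tokens=64, seed=0):
    """Construct one pseudo repetitive training sample (sketch).

    Randomly picks one sentence (a list of tokens) and repeats it until
    the sample reaches `max_tokens`, truncating the final repeat.
    """
    rng = random.Random(seed)
    sent = rng.choice(sentences)
    sample = []
    while len(sample) < max_tokens:
        sample.extend(sent)  # tile the same sentence back to back
    return sample[:max_tokens]

sample = build_pseudo_repetition([["the", "cat", "sat"], ["a", "dog", "ran"]],
                                 max_tokens=8)
assert len(sample) == 8
assert sample[:3] == sample[3:6]  # the same sentence repeats
```

Because such loops are vanishingly rare in natural corpora (the 0.02% figure above), synthesizing them is the practical way to give the model repetition examples to penalize.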