AITopics | self-feedback

self-feedback

Self-Refine: Iterative Refinement with Self-Feedback

Neural Information Processing Systems | Dec-26-2025, 08:30:08 GMT

Like humans, large language models (LLMs) do not always generate the best output on their first try. Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement. The main idea is to generate an initial output using an LLM; then, the same LLM provides *feedback* on its output and uses it to *refine* that output, iteratively. Self-Refine does not require any supervised training data, additional training, or reinforcement learning, and instead uses a single LLM as the generator, refiner, and feedback provider. We evaluate Self-Refine across 7 diverse tasks, ranging from dialog response generation to mathematical reasoning, using state-of-the-art (GPT-3.5,

iterative refinement, name change, self-refine, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Self-Refine: Iterative Refinement with Self-Feedback

Neural Information Processing Systems | Jan-19-2025, 15:30:01 GMT

iterative refinement, self-feedback, self-refine, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Internal Consistency and Self-Feedback in Large Language Models: A Survey

Liang, Xun, Song, Shichao, Zheng, Zifan, Wang, Hanyu, Yu, Qingchen, Li, Xunkai, Li, Rong-Hua, Wang, Yi, Wang, Zhonghao, Xiong, Feiyu, Li, Zhiyu

arXiv.org Artificial Intelligence | Sep-18-2024

Large language models (LLMs) often exhibit deficient reasoning or generate hallucinations. To address these problems, a line of studies prefixed with "Self-", such as Self-Consistency, Self-Improve, and Self-Refine, has emerged. They share a common trait: LLMs evaluate and update themselves. Nonetheless, these efforts lack a unified perspective for summarizing them, as existing surveys predominantly focus on categorization. In this paper, we use a unified perspective of internal consistency, offering explanations for reasoning deficiencies and hallucinations. Internal consistency refers to the consistency in expressions among LLMs' latent, decoding, or response layers based on sampling methodologies. We then introduce an effective theoretical framework capable of mining internal consistency, named Self-Feedback. This framework consists of two modules: Self-Evaluation and Self-Update. The former captures internal consistency signals, while the latter leverages the signals to enhance either the model's response or the model itself. This framework has been employed in numerous studies. We systematically classify these studies by tasks and lines of work; summarize relevant evaluation methods and benchmarks; and delve into the concern, "Does Self-Feedback Really Work?" We also propose several critical viewpoints, including the "Hourglass Evolution of Internal Consistency", "Consistency Is (Almost) Correctness" hypothesis, and "The Paradox of Latent and Explicit Reasoning". The relevant resources are open-sourced at https://github.com/IAAR-Shanghai/ICSFSurvey.
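
The two modules described above can be illustrated with a small sketch. It is only one possible instance, not the survey's own implementation: it assumes a Self-Consistency-style setup in which several responses have already been sampled, uses the share of agreeing samples as the internal consistency signal, and updates the response by majority vote. The function names `self_evaluate` and `self_update` and the `threshold` parameter are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Tuple


def self_evaluate(responses: List[str]) -> Tuple[str, float]:
    """Self-Evaluation (assumed form): measure agreement among sampled responses.

    Returns the most frequent normalized response and the fraction of
    samples that agree with it, used here as the consistency signal.
    """
    if not responses:
        raise ValueError("need at least one sampled response")
    counts = Counter(r.strip().lower() for r in responses)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(responses)


def self_update(responses: List[str], threshold: float = 0.5) -> Optional[str]:
    """Self-Update (assumed form): keep the majority answer if it is consistent enough.

    Returns None (abstain) when the consistency signal is below the threshold.
    """
    answer, consistency = self_evaluate(responses)
    return answer if consistency >= threshold else None


# Hypothetical samples drawn from the same model for one question.
samples = ["12", "12", "13", " 12 ", "11"]
print(self_evaluate(samples))  # majority answer and its agreement share
print(self_update(samples))
```

This sketch only updates the response. Updating the model itself, which the abstract also mentions, would require training and is outside its scope.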

arxiv preprint arxiv, consistency, language model, (12 more...)

2407.14507

Country:

Asia > China > Shanghai > Shanghai (0.24)
Asia > China > Beijing > Beijing (0.05)
Asia > China > Hong Kong (0.04)
(4 more...)

Genre:

Research Report (0.81)
Overview (0.67)

Industry: Education > Educational Setting > Higher Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Crystal: Introspective Reasoners Reinforced with Self-Feedback

Liu, Jiacheng, Pasunuru, Ramakanth, Hajishirzi, Hannaneh, Choi, Yejin, Celikyilmaz, Asli

arXiv.org Artificial Intelligence | Oct-18-2023

Extensive work has shown that the performance and interpretability of commonsense reasoning can be improved via knowledge-augmented reasoning methods, where the knowledge that underpins the reasoning process is explicitly verbalized and utilized. However, existing implementations, including "chain-of-thought" and its variants, fall short in capturing the introspective nature of knowledge required in commonsense reasoning, and in accounting for the mutual adaptation between the generation and utilization of knowledge. We propose a novel method to develop an introspective commonsense reasoner, Crystal. To tackle commonsense problems, it first introspects for knowledge statements related to the given question, and subsequently makes an informed prediction that is grounded in the previously introspected knowledge. The knowledge introspection and knowledge-grounded reasoning modes of the model are tuned via reinforcement learning to mutually adapt, where the reward derives from the feedback given by the model itself. Experiments show that Crystal significantly outperforms both the standard supervised finetuning and chain-of-thought distilled methods, and enhances the transparency of the commonsense reasoning process. Our work ultimately validates the feasibility and potential of reinforcing a neural model with self-feedback.
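
The two-stage inference described above (introspect for knowledge, then predict grounded in it) can be sketched as follows. This is a toy illustration under stated assumptions, not the Crystal implementation: the `introspect` and `predict` callables stand in for the two modes of one model, and the `toy_reward` function is a hypothetical placeholder for a reward derived from the model's own knowledge-grounded prediction; the abstract does not specify the actual reward.

```python
from typing import Callable, Dict, List, Tuple

Introspect = Callable[[str], str]
Predict = Callable[[str, str, List[str]], Dict[str, float]]


def introspect_then_predict(
    question: str, choices: List[str], introspect: Introspect, predict: Predict
) -> Tuple[str, str]:
    """Stage 1: verbalize knowledge. Stage 2: predict grounded in that knowledge."""
    knowledge = introspect(question)
    scores = predict(question, knowledge, choices)
    answer = max(scores, key=scores.get)
    return knowledge, answer


def toy_reward(answer: str, gold: str) -> float:
    """Hypothetical reward: +1 if the knowledge-grounded prediction is correct, else -1."""
    return 1.0 if answer == gold else -1.0


# Toy stand-ins for the two modes of a single model (assumptions, not real model calls).
def toy_introspect(question: str) -> str:
    return "Ice is frozen water and melts when heated."


def toy_predict(question: str, knowledge: str, choices: List[str]) -> Dict[str, float]:
    # Score each choice by word overlap with the introspected knowledge.
    known = set(knowledge.lower().replace(".", "").split())
    return {c: float(len(known & set(c.lower().split()))) for c in choices}


knowledge, answer = introspect_then_predict(
    "What happens to ice in a hot pan?", ["it melts", "it grows"], toy_introspect, toy_predict
)
reward = toy_reward(answer, gold="it melts")
print(knowledge, answer, reward)
```

In a reinforcement learning setup, a reward of this kind would be used to tune the introspection mode so that the knowledge it produces helps the prediction mode; that training loop is omitted here.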

introspective reasoner reinforced, self-feedback

2310.04921

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

[2303.17651] Self-Refine: Iterative Refinement with Self-Feedback

#artificialintelligence | Apr-3-2023, 00:01:19 GMT

Like people, LLMs do not always generate the best text for a given generation problem on their first try (e.g., summaries, answers, explanations). Just as people then refine their text, we introduce SELF-REFINE, a framework for similarly improving initial outputs from LLMs through iterative feedback and refinement. The main idea is to generate an output using an LLM, then allow the same model to provide multi-aspect feedback for its own output; finally, the same model refines its previously generated output given its own feedback. Unlike earlier work, our iterative refinement framework does not require supervised training data or reinforcement learning, and works with a single LLM. We experiment with 7 diverse tasks, ranging from review rewriting to math reasoning, demonstrating that our approach outperforms direct generation. In all tasks, outputs generated with SELF-REFINE are preferred by humans and by automated metrics over those generated directly with GPT-3.5 and GPT-4, improving by an absolute 20% on average across tasks.
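
The generate, feedback, refine loop described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the three callables stand in for prompts sent to the same LLM, and the stopping check `is_done`, the `max_iters` cap, and the toy functions are all assumptions made so the example runs without a model.

```python
from typing import Callable, List, Tuple


def self_refine(
    task: str,
    generate: Callable[[str], str],
    feedback: Callable[[str, str], str],
    refine: Callable[[str, str, str], str],
    is_done: Callable[[str], bool],
    max_iters: int = 4,
) -> Tuple[str, List[str]]:
    """Generate once, then alternate feedback and refinement until done or capped."""
    output = generate(task)
    history = [output]
    for _ in range(max_iters):
        critique = feedback(task, output)
        if is_done(critique):
            break
        output = refine(task, output, critique)
        history.append(output)
    return output, history


# Toy stand-ins for three prompts to one model (hypothetical, for illustration only).
def toy_generate(task: str) -> str:
    return "the answer is 42"


def toy_feedback(task: str, output: str) -> str:
    problems = []
    if not output[0].isupper():
        problems.append("capitalize the first letter")
    if not output.endswith("."):
        problems.append("end with a period")
    return "; ".join(problems) if problems else "LOOKS GOOD"


def toy_refine(task: str, output: str, critique: str) -> str:
    # Apply one fix per round so that several iterations are visible.
    if "capitalize" in critique:
        return output[0].upper() + output[1:]
    if "period" in critique:
        return output + "."
    return output


final, history = self_refine(
    "State the answer.", toy_generate, toy_feedback, toy_refine,
    is_done=lambda critique: critique == "LOOKS GOOD",
)
print(final)
print(history)
```

With a real model, each callable would wrap a prompt to the same LLM, and the feedback prompt would ask for comments on several aspects of the output.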

iterative refinement, self-feedback, self-refine, (2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.44)

Towards Human-like AI. An attempt to make AI more general with…

#artificialintelligence | Dec-8-2022, 19:35:29 GMT

After trying many variations of the thought stream lookback range, model temperature, and few-shot examples, the messages produced seem to be qualitatively worse with a long thought stream than with an arbitrarily short one, though confirming this requires more experimentation and a good benchmark. Intuitively this makes sense, because a GPT trained on the internet wouldn't have many training examples of what a human was thinking (at least in a direct-access format like this) before they said or wrote something. I'll need to rethink the way thoughts are incorporated, or whether they can be removed entirely. Perhaps thinking is an emergent property of intelligence and does not need to be explicitly included.

agent, fine-tuning, general intelligence, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.49)
