Towards Interpretable Soft Prompts

Patel, Oam, Wang, Jason, Nayak, Nikhil Shivakumar, Srinivas, Suraj, Lakkaraju, Himabindu

arXiv.org Machine Learning

Soft prompts have been popularized as a cheap and easy way to improve task-specific LLM performance beyond few-shot prompts. Despite their origin as an automated prompting method, however, soft prompts and other trainable prompts remain a black-box method with no immediately interpretable connections to prompting. We create a novel theoretical framework for evaluating the interpretability of trainable prompts based on two desiderata: faithfulness and scrutability. We find that existing methods do not naturally satisfy our proposed interpretability criterion. Instead, our framework inspires a new direction of trainable prompting methods that explicitly optimizes for interpretability. To this end, we formulate and test new interpretability-oriented objective functions for two state-of-the-art prompt tuners: Hard Prompts Made Easy (PEZ) and RLPrompt. Our experiments with GPT-2 demonstrate a fundamental trade-off between interpretability and the task-performance of the trainable prompt, explicating the hardness of the soft prompt interpretability problem and revealing odd behavior that arises when one optimizes for an interpretability proxy.
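One of the tuners named above, PEZ, keeps its prompt interpretable by optimizing a continuous prompt and repeatedly projecting it onto the nearest token embeddings, so the result remains a sequence of real vocabulary tokens. A minimal sketch of that projection step, with toy embeddings and all names assumed (this is an illustration, not the authors' implementation):

```python
import numpy as np

def project_to_vocab(soft_prompt, embedding_table):
    """Map each soft-prompt vector to its nearest token embedding (PEZ-style projection).

    soft_prompt: (L, d) trainable continuous prompt vectors.
    embedding_table: (V, d) frozen vocabulary embeddings.
    Returns the discrete token ids and the projected (hard) embeddings.
    """
    # Pairwise Euclidean distances between prompt vectors and vocab embeddings.
    dists = np.linalg.norm(soft_prompt[:, None, :] - embedding_table[None, :, :], axis=-1)
    ids = dists.argmin(axis=1)
    return ids, embedding_table[ids]

# Toy vocabulary of 3 token embeddings in 2-d, and a 2-token soft prompt.
vocab = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
soft = np.array([[0.1, -0.1], [1.9, 2.1]])
ids, hard = project_to_vocab(soft, vocab)
```

Because the projected prompt consists of real tokens, it can be read and scrutinized directly, which is exactly the property the interpretability-oriented objectives above try to trade off against task performance.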


Hard Prompts Made Interpretable: Sparse Entropy Regularization for Prompt Tuning with RL

Choi, Yunseon, Bae, Sangmin, Ban, Seonghyun, Jeong, Minchan, Zhang, Chuheng, Song, Lei, Zhao, Li, Bian, Jiang, Kim, Kee-Eung

arXiv.org Artificial Intelligence

With the advent of foundation models, prompt tuning has positioned itself as an important technique for directing model behaviors and eliciting desired responses. Prompt tuning amounts to selecting appropriate keywords to include in the input, thereby adapting to the downstream task without adjusting or fine-tuning the model parameters. There is a wide range of work in prompt tuning, from approaches that directly harness the backpropagated gradient signals from the model, to those employing black-box optimization such as reinforcement learning (RL) methods. Our primary focus is on RLPrompt, which aims to find optimal prompt tokens leveraging soft Q-learning. While the results show promise, we have observed that the prompts frequently appear unnatural, which impedes their interpretability. We address this limitation by using sparse Tsallis entropy regularization, a principled approach to filtering out unlikely tokens from consideration. We extensively evaluate our approach across various tasks, including few-shot text classification, unsupervised text style transfer, and textual inversion from images. The results indicate a notable improvement over baselines, highlighting the efficacy of our approach in addressing the challenges of prompt tuning. Moreover, we show that the prompts discovered using our method are more natural and interpretable compared to those from other baselines.


Flatness-Aware Prompt Selection Improves Accuracy and Sample Efficiency

Shen, Lingfeng, Tan, Weiting, Zheng, Boyuan, Khashabi, Daniel

arXiv.org Artificial Intelligence

With growing capabilities of large language models, prompting them has become the dominant way to access them. This has motivated the development of strategies for automatically selecting effective language prompts. In this paper, we introduce prompt flatness, a new metric to quantify the expected utility of a language prompt. This metric is inspired by flatness regularization in statistical learning that quantifies the robustness of the model towards its parameter perturbations. We provide theoretical foundations for this metric and its relationship with other prompt selection metrics, providing a comprehensive understanding of existing methods. Empirically, we show that combining prompt flatness with existing metrics improves both performance and sample efficiency. Our metric outperforms the previous prompt selection metrics with an average increase of 5% in accuracy and 10% in Pearson correlation across 6 classification benchmarks.
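The flatness intuition can be made concrete as the average increase in loss under small random perturbations: a flat point barely moves, a sharp one degrades quickly. The sketch below is a generic perturbation-based estimator under that assumption, not the paper's exact metric:

```python
import numpy as np

def flatness(loss_fn, params, sigma=0.05, n_samples=16, seed=0):
    """Average loss increase under Gaussian perturbations of the parameters.

    Smaller values indicate a flatter (more perturbation-robust) point; the
    paper's thesis is that such robustness predicts a prompt's utility.
    """
    rng = np.random.default_rng(seed)
    base = loss_fn(params)
    bumps = [loss_fn(params + rng.normal(0.0, sigma, params.shape)) - base
             for _ in range(n_samples)]
    return float(np.mean(bumps))

# Two toy losses with the same minimum but different curvature (sharpness).
flat_loss = lambda p: float(np.sum(p ** 2))
sharp_loss = lambda p: 10.0 * float(np.sum(p ** 2))
f_flat = flatness(flat_loss, np.zeros(4))
f_sharp = flatness(sharp_loss, np.zeros(4))
```

With a shared random seed the same perturbations are applied to both losses, so the ten-times-sharper quadratic reports a ten-times-larger flatness value, illustrating how the metric ranks candidates.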


RLPrompt: Optimizing discrete text prompts with reinforcement learning

AIHub

Figure 1: Overview of RLPrompt for discrete prompt optimization. All language models (LMs) are frozen. We build our policy network by training a task-specific multi-layer perceptron (MLP) inserted into a frozen pre-trained LM. The figure illustrates 1) generation of a prompt (left), 2) example uses in a masked LM for classification (top right) and a left-to-right LM for generation (bottom right), and 3) updates to the MLP using RL reward signals (red arrows). TL;DR: Prompting enables large language models (LLMs) to perform various NLP tasks without changing the model.
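The red-arrow update in the figure is a policy-gradient step on the small trainable head while the LM itself stays frozen. A toy REINFORCE sketch of that loop, with a fixed vector standing in for the frozen LM's hidden state and a single linear layer standing in for the MLP (all names, sizes, and the reward are assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

hidden = np.array([1.0, -0.5, 0.25, 2.0])  # stand-in for a frozen LM representation
W = np.zeros((5, 4))                       # trainable head over a 5-token prompt vocabulary

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def reinforce_step(W, hidden, reward_fn, lr=0.1):
    """Sample a prompt token from the policy head and take one REINFORCE step."""
    probs = softmax(W @ hidden)
    tok = int(rng.choice(len(probs), p=probs))
    r = reward_fn(tok)
    # d log pi(tok) / dW = (one_hot(tok) - probs) outer hidden
    grad = -np.outer(probs, hidden)
    grad[tok] += hidden
    return W + lr * r * grad, tok, r

# Toy reward: pretend the downstream task works best when the prompt uses token 2.
reward = lambda tok: 1.0 if tok == 2 else 0.0
for _ in range(200):
    W, _, _ = reinforce_step(W, hidden, reward)
final_probs = softmax(W @ hidden)
```

After training, the policy concentrates its probability on the rewarded token; in RLPrompt the same reward-weighted update shapes which discrete prompt tokens the head emits.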


Meet RLPrompt: A New Prompt Optimization Approach with Reinforcement Learning (RL) - MarkTechPost

Prompting is a promising approach to solving NLP problems with pre-trained language models (LMs) such as GPTs and BERT. Unlike conventional fine-tuning, which updates the massive LM parameters for each downstream task, prompting concatenates inputs with additional text to steer the LM towards producing the desired outputs. A key question is how to find optimal prompts that improve the LM's performance on various tasks with few training examples. Using reinforcement learning (RL) for prompt optimization raises learning-efficiency challenges: the large black-box language model defines a complex environment in which many transitions occur before a reward is computed, making it hard to learn from unstable reward signals.
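A common way to damp such reward instability is to normalize raw rewards before using them as learning signals, for example with a z-score over a batch so that updates respond to relative rather than absolute reward scale. The sketch below is a generic stabilization technique of that kind; the specifics of RLPrompt's own reward shaping are in the paper:

```python
import numpy as np

def zscore_rewards(rewards, eps=1e-8):
    """Center and scale a batch of raw rewards into a stable learning signal."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)  # eps guards against a zero-variance batch

norm = zscore_rewards([1.0, 2.0, 3.0])
```

After normalization the batch has zero mean and unit scale, so a policy-gradient update pushes probability toward above-average prompts and away from below-average ones regardless of the raw reward magnitudes.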