AITopics | reformulation model

Collaborating Authors

reformulation model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Improving Image Captioning by Mimicking Human Reformulation Feedback at Inference-time

Berger, Uri, Abend, Omri, Frermann, Lea, Stanovsky, Gabriel

arXiv.org Artificial IntelligenceJan-8-2025

Incorporating automatically predicted human feedback into the process of training generative models has attracted substantial recent interest, while feedback at inference time has received less attention. The typical feedback at training time, i.e., preferences of choice given two samples, does not naturally transfer to the inference phase. We introduce a novel type of feedback -- caption reformulations -- and train models to mimic reformulation feedback based on human annotations. Our method does not require training the image captioning model itself, thereby demanding substantially less computational effort. We experiment with two types of reformulation feedback: first, we collect a dataset of human reformulations that correct errors in the generated captions. We find that incorporating reformulation models trained on this data into the inference phase of existing image captioning models results in improved captions, especially when the original captions are of low quality. We apply our method to non-English image captioning, a domain where robust models are less prevalent, and gain substantial improvement. Second, we apply reformulations to style transfer. Quantitative evaluations reveal state-of-the-art performance on German image captioning and English style transfer, while human validation with a detailed comparative framework exposes the specific axes of improvement.

caption, reformulation, reformulation model, (16 more...)

arXiv.org Artificial Intelligence

2501.04513

Country:

Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Pennsylvania (0.04)
(14 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

Add feedback

Capability-aware Prompt Reformulation Learning for Text-to-Image Generation

Zhan, Jingtao, Ai, Qingyao, Liu, Yiqun, Chen, Jia, Ma, Shaoping

arXiv.org Artificial IntelligenceMar-27-2024

Text-to-image generation systems have emerged as revolutionary tools in the realm of artistic creation, offering unprecedented ease in transforming textual prompts into visual art. However, the efficacy of these systems is intricately linked to the quality of user-provided prompts, which often poses a challenge to users unfamiliar with prompt crafting. This paper addresses this challenge by leveraging user reformulation data from interaction logs to develop an automatic prompt reformulation model. Our in-depth analysis of these logs reveals that user prompt reformulation is heavily dependent on the individual user's capability, resulting in significant variance in the quality of reformulation pairs. To effectively use this data for training, we introduce the Capability-aware Prompt Reformulation (CAPR) framework. CAPR innovatively integrates user capability into the reformulation process through two key components: the Conditional Reformulation Model (CRM) and Configurable Capability Features (CCF). CRM reformulates prompts according to a specified user capability, as represented by CCF. The CCF, in turn, offers the flexibility to tune and guide the CRM's behavior. This enables CAPR to effectively learn diverse reformulation strategies across various user capacities and to simulate high-capability user reformulation during inference. Extensive experiments on standard text-to-image generation benchmarks showcase CAPR's superior performance over existing baselines and its remarkable robustness on unseen systems. Furthermore, comprehensive analyses validate the effectiveness of different components. CAPR can facilitate user-friendly interaction with text-to-image systems and make advanced artistic creation more achievable for a broader range of users.

interaction log, prompt reformulation, reformulation, (14 more...)

arXiv.org Artificial Intelligence

2403.19716

Country:

North America > United States > District of Columbia > Washington (0.05)
Asia > China > Beijing > Beijing (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback