Visual Instruction Tuning with Polite Flamingo
Chen, Delong, Liu, Jianfeng, Dai, Wenliang, Wang, Baoyuan
arXiv.org Artificial Intelligence
Recent research has demonstrated that multi-task fine-tuning of multi-modal Large Language Models (LLMs) on an assortment of annotated downstream vision-language datasets significantly enhances their performance. Yet this process has a side effect, which we term the "multi-modal alignment tax". It degrades the model's ability to format responses appropriately (for instance, its "politeness") because raw annotations are overly succinct and unformatted, and this in turn reduces human preference. In this paper, we introduce Polite Flamingo, a multi-modal response rewriter that transforms raw annotations into a more appealing, "polite" format. Polite Flamingo is trained to reconstruct high-quality responses from their automatically distorted counterparts and is subsequently applied to a vast array of vision-language datasets for response rewriting. After rigorous filtering, we generate the PF-1M dataset and further validate its value by fine-tuning a multi-modal LLM with it. Combined with novel methodologies including U-shaped multi-stage tuning and multi-turn augmentation, the resulting model, Clever Flamingo, demonstrates advantages in both multi-modal understanding and response politeness according to automated and human evaluations.
Dec-15-2023