rationalization
See, Think, Learn: A Self-Taught Multimodal Reasoner
Sharma, Sourabh, Gupta, Sonam, Sadbhawna
Vision-Language Models (VLMs) have achieved remarkable progress in integrating visual perception with language understanding. However, effective multimodal reasoning requires both accurate perception and robust reasoning, and weakness in either limits the performance of VLMs. Prior efforts to enhance reasoning often depend on high-quality chain-of-thought (CoT) data, obtained via labor-intensive human annotation, costly proprietary models, or self-training methods that overlook perception. To address these limitations, we propose a simple yet effective self-training framework called See-Think-Learn (STL). At its core, STL introduces a structured reasoning template that encourages the model to see before thinking: it first extracts visual attributes in textual form, then uses them to guide reasoning. The framework jointly improves perception and reasoning by having the model generate and learn from its own structured rationales in a self-training loop. Furthermore, we augment the training data with negative rationales, i.e., explanations that justify why certain answer choices are incorrect, to enhance the model's ability to distinguish between correct and misleading responses. This fosters more discriminative and robust learning. Experiments across diverse domains show that STL consistently outperforms baselines trained directly on answers only or on self-generated reasoning, while qualitative analysis confirms the high quality of its rationales. STL thus provides a cost-effective way to enhance the multimodal reasoning ability of VLMs.
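The "see before thinking" template and the self-training filter described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the `<see>`/`<think>`/`<answer>` tags, the `TEMPLATE` wording, and the rule of keeping only rationales whose final answer matches the ground truth are hypothetical and are not taken from the paper.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical template: the abstract does not give the actual wording or tags.
TEMPLATE = (
    "Question: {question}\n"
    "Options: {options}\n"
    "First list the relevant visual attributes, then reason, then answer.\n"
    "<see>...</see>\n<think>...</think>\n<answer>...</answer>"
)


@dataclass
class Rationale:
    see: str     # visual attributes extracted as text
    think: str   # reasoning that uses those attributes
    answer: str  # final answer choice


_PATTERN = re.compile(
    r"<see>(.*?)</see>\s*<think>(.*?)</think>\s*<answer>(.*?)</answer>",
    re.DOTALL,
)


def parse_rationale(text: str) -> Optional[Rationale]:
    """Split a model output into its see/think/answer parts, or None if malformed."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    see, think, answer = (part.strip() for part in match.groups())
    return Rationale(see, think, answer)


def keep_for_training(text: str, gold: str) -> Optional[Rationale]:
    """Assumed filter: keep a self-generated rationale only if it is
    well-formed and its answer matches the ground-truth label."""
    rationale = parse_rationale(text)
    if rationale is None or rationale.answer.lower() != gold.lower():
        return None
    return rationale
```

In a self-training loop of this kind, outputs that pass `keep_for_training` would be added to the next round's fine-tuning data. Negative rationales could be collected the same way by prompting for why a given wrong option is incorrect, though the abstract does not say how they are produced.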
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > China > Hubei Province (0.04)
- Research Report (0.46)
- Overview (0.34)
Learnable Game-theoretic Policy Optimization for Data-centric Self-explanation Rationalization
Zhao, Yunxiao, Wang, Zhiqiang, Yu, Xingtong, Li, Xiaoli, Liang, Jiye, Li, Ru
Rationalization, a data-centric framework, aims to build self-explanatory models that explain the prediction outcome by generating a subset of human-intelligible pieces of the input data. It involves a cooperative game in which a generator produces the most human-intelligible parts of the input (i.e., rationales), followed by a predictor that makes predictions based on these generated rationales. Conventional rationalization methods typically impose constraints via regularization terms to calibrate or penalize undesired generation. However, these methods suffer from a problem called mode collapse, in which the predictor produces correct predictions yet the generator consistently outputs rationales with collapsed patterns. Moreover, existing studies are typically designed separately for specific collapsed patterns and lack a unified treatment. In this paper, we systematically revisit cooperative rationalization from a novel game-theoretic perspective and identify the fundamental cause of this problem: the generator no longer tends to explore new strategies to uncover informative rationales, ultimately leading the system to converge to a suboptimal game equilibrium (correct predictions vs. collapsed rationales). To solve this problem, we propose a novel approach, Game-theoretic Policy Optimization oriented RATionalization (PORAT), which progressively introduces policy interventions to address the game equilibrium in the cooperative game process, thereby guiding the model toward a more optimal solution state. We theoretically analyse the cause of such a suboptimal equilibrium and prove the feasibility of the proposed method. Furthermore, we validate our method on nine widely used real-world datasets and two synthetic settings, where PORAT achieves up to 8.1% performance improvements over existing state-of-the-art methods.
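The generator-predictor pipeline and the regularization terms mentioned above can be made concrete with a toy sketch. It shows only the generic select-then-predict setup with sparsity and continuity penalties of the kind commonly used in earlier rationalization work; it is not PORAT's policy intervention. All function names, the top-k selection rule, and the keyword-based predictor are hypothetical stand-ins for learned models.

```python
from typing import List, Sequence, Set


def select_rationale(scores: Sequence[float], k: int) -> List[int]:
    """Toy generator step: binary mask keeping the k highest-scoring tokens."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    keep = set(top)
    return [1 if i in keep else 0 for i in range(len(scores))]


def apply_mask(tokens: Sequence[str], mask: Sequence[int]) -> List[str]:
    """The predictor only ever sees the tokens the generator selected."""
    return [tok for tok, m in zip(tokens, mask) if m]


def sparsity_penalty(mask: Sequence[int], target: float) -> float:
    """Penalize selecting a fraction of tokens far from the target rate."""
    return abs(sum(mask) / len(mask) - target)


def continuity_penalty(mask: Sequence[int]) -> int:
    """Penalize fragmented selections by counting 0/1 transitions."""
    return sum(abs(a - b) for a, b in zip(mask[1:], mask[:-1]))


def predict(rationale: Sequence[str], positive_words: Set[str]) -> int:
    """Toy predictor: label 1 if the rationale contains a positive word."""
    return int(any(tok in positive_words for tok in rationale))


tokens = ["the", "beer", "tastes", "great", "."]
scores = [0.1, 0.3, 0.6, 0.9, 0.05]  # assumed generator scores
mask = select_rationale(scores, k=2)
rationale = apply_mask(tokens, mask)
label = predict(rationale, {"great"})
```

In this toy setting, a generator that always selected, say, the final punctuation token could still yield correct labels if the predictor adapted to that pattern, which is the kind of collapsed equilibrium the abstract describes.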
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Singapore (0.04)
- Asia > Indonesia > Bali (0.04)
- Research Report > Promising Solution (0.68)
- Research Report > New Finding (0.67)