Goto

Collaborating Authors

 contextual example


Language Models Can See Better: Visual Contrastive Decoding For LLM Multimodal Reasoning

arXiv.org Artificial Intelligence

Although Large Language Models (LLMs) excel in reasoning and generation for language tasks, they are not specifically designed for multimodal challenges. Training Multimodal Large Language Models (MLLMs), however, is resource-intensive and constrained by various training limitations. In this paper, we propose the Modular-based Visual Contrastive Decoding (MVCD) framework to move this obstacle. Our framework leverages LLMs' In-Context Learning (ICL) capability and the proposed visual contrastive-example decoding (CED), specifically tailored for this framework, without requiring any additional training. By converting visual signals into text and focusing on contrastive output distributions during decoding, we can highlight the new information introduced by contextual examples, explore their connections, and avoid over-reliance on prior encoded knowledge. MVCD enhances LLMs' visual perception to make it see and reason over the input visuals. To demonstrate MVCD's effectiveness, we conduct experiments with four LLMs across five question answering datasets. Our results not only show consistent improvement in model accuracy but well explain the effective components inside our decoding strategy. Our code will be available at https://github.com/Pbhgit/MVCD.


Visual Prompt Selection for In-Context Learning Segmentation

arXiv.org Artificial Intelligence

As a fundamental and extensively studied task in computer vision, image segmentation aims to locate and identify different semantic concepts at the pixel level. Recently, inspired by In-Context Learning (ICL), several generalist segmentation frameworks have been proposed, providing a promising paradigm for segmenting specific objects. However, existing works mostly ignore the value of visual prompts or simply apply similarity sorting to select contextual examples. In this paper, we focus on rethinking and improving the example selection strategy. By comprehensive comparisons, we first demonstrate that ICL-based segmentation models are sensitive to different contexts. Furthermore, empirical evidence indicates that the diversity of contextual prompts plays a crucial role in guiding segmentation. Based on the above insights, we propose a new stepwise context search method. Different from previous works, we construct a small yet rich candidate pool and adaptively search the well-matched contexts. More importantly, this method effectively reduces the annotation cost by compacting the search space. Extensive experiments show that our method is an effective strategy for selecting examples and enhancing segmentation performance.


Learning MAX-SAT from Contextual Examples for Combinatorial Optimisation

arXiv.org Artificial Intelligence

Combinatorial optimisation problems are ubiquitous in artificial intelligence. Designing the underlying models, however, requires substantial expertise, which is a limiting factor in practice. The models typically consist of hard and soft constraints, or combine hard constraints with an objective function. We introduce a novel setting for learning combinatorial optimisation problems from contextual examples. These positive and negative examples show - in a particular context - whether the solutions are good enough or not. We develop our framework using the MAX-SAT formalism as it is simple yet powerful setting having these features. We study the learnability of MAX-SAT models. Our theoretical results show that high-quality MAX-SAT models can be learned from contextual examples in the realisable and agnostic settings, as long as the data satisfies an intuitive "representativeness" condition. We also contribute two implementations based on our theoretical results: one leverages ideas from syntax-guided synthesis while the other makes use of stochastic local search techniques. The two implementations are evaluated by recovering synthetic and benchmark models from contextual examples. The experimental results support our theoretical analysis, showing that MAX-SAT models can be learned from contextual examples. Among the two implementations, the stochastic local search learner scales much better than the syntax-guided implementation while providing comparable or better models.


Learning Mixed-Integer Linear Programs from Contextual Examples

arXiv.org Artificial Intelligence

Mixed-integer linear programs (MILPs) are widely used in artificial intelligence and operations research to model complex decision problems like scheduling and routing. Designing such programs however requires both domain and modelling expertise. In this paper, we study the problem of acquiring MILPs from contextual examples, a novel and realistic setting in which examples capture solutions and non-solutions within a specific context. The resulting learning problem involves acquiring continuous parameters -- namely, a cost vector and a feasibility polytope -- but has a distinctly combinatorial flavor. To solve this complex problem, we also contribute MISSLE, an algorithm for learning MILPs from contextual examples. MISSLE uses a variant of stochastic local search that is guided by the gradient of a continuous surrogate loss function. Our empirical evaluation on synthetic data shows that MISSLE acquires better MILPs faster than alternatives based on stochastic local search and gradient descent.