AITopics | progressively

EvolvedGRPO: Unlocking Reasoning in LVLMs via Progressive Instruction Evolution

Neural Information Processing Systems | Jun-13-2026, 07:21:14 GMT

Recent advances in reinforcement learning (RL) methods such as Group Relative Policy Optimization (GRPO) have strengthened the reasoning capabilities of Large Vision-Language Models (LVLMs). However, due to the inherent entanglement between visual and textual modalities, applying GRPO to LVLMs often leads to reward convergence across different responses to the same sample as training progresses, hindering effective gradient updates and causing the enhancement of chain-of-thought reasoning to stagnate or even collapse. To address this issue, we propose a progressive instruction evolution framework, EvolvedGRPO, to gradually generate more complex questions via editing instructions in an adversarial way, progressively aligned with the model's evolving capabilities. Specifically, we design two instruction editing strategies across modalities, incorporating incrementally increasing editing instructions and RL-based adversarial data augmentation to improve the effectiveness of model training. To address GRPO's limitations on overly difficult problems, we first train on basic subproblem versions of complex multi-modal questions in both the visual and textual modalities, progressively increasing difficulty to enable prefix-style process rewards, effectively combining the strengths of both process rewards and group-wise relative rewards. Finally, EvolvedGRPO achieves state-of-the-art performance among open-source RL models on multi-modal reasoning tasks, even approaching the closed-source GPT-4o in reasoning capabilities, and demonstrates better performance on unseen LVLM general benchmarks.
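
The reward-convergence problem described in this abstract can be illustrated with a minimal sketch of group-relative advantages. It assumes the commonly used GRPO normalization (each reward minus the group mean, divided by the group standard deviation). The function name, the `eps` stabilizer, and the toy reward values are illustrative assumptions, not taken from the paper.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds the scalar rewards of several responses sampled for
    the same prompt. `eps` is an assumed stabilizer that avoids division
    by zero when every reward in the group is identical.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# A group with mixed outcomes yields non-zero advantages (useful signal).
mixed = group_relative_advantages([1.0, 0.0, 1.0, 0.0])

# A group whose rewards have converged yields all-zero advantages,
# so these responses contribute nothing to the policy gradient.
converged = group_relative_advantages([1.0, 1.0, 1.0, 1.0])
```

Under these assumptions, once every response to a sample earns the same reward, the advantages vanish, which is one way to read the abstract's motivation for generating progressively harder questions.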

artificial intelligence, machine learning, natural language, (10 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.82)


Boosting Weakly Supervised Referring Image Segmentation via Progressive Comprehension

Neural Information Processing Systems | Mar-21-2026, 23:56:18 GMT

This paper explores the weakly-supervised referring image segmentation (WRIS) problem, and focuses on a challenging setup where target localization is learned directly from image-text pairs. We note that the input text description typically already contains detailed information on how to localize the target object, and we also observe that humans often follow a step-by-step comprehension process (i.e., progressively utilizing target-related attributes and relations as cues) to identify the target object. Hence, we propose a novel Progressive Comprehension Network (PCNet) to leverage target-related textual cues from the input description for progressively localizing the target object. Specifically, we first use a Large Language Model (LLM) to decompose the input text description into short phrases. These short phrases are taken as target-related cues and fed into a Conditional Referring Module (CRM) in multiple stages, to allow updating the referring text embedding and enhance the response map for target localization in a multi-stage manner. Based on the CRM, we then propose a Region-aware Shrinking (RaS) loss to constrain the visual localization to be conducted progressively in a coarse-to-fine manner across different stages. Finally, we introduce an Instance-aware Disambiguation (IaD) loss to suppress instance localization ambiguity by differentiating overlapping response maps generated by different referring texts on the same image. Extensive experiments show that our method outperforms SOTA methods on three common benchmarks.

artificial intelligence, large language model, natural language, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.59)


50ca96a1a9ebe0b5e5688a504feb6107-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-19-2026, 03:15:51 GMT

epoch, joindet, supervision, (14 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.05)
North America > United States > Colorado (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Industry: Energy (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.66)


a97f0218b49bc17ea3f121a0e724f028-Paper-Conference.pdf

Neural Information Processing Systems | Feb-17-2026, 07:37:47 GMT

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)


Appendix

Neural Information Processing Systems | Feb-9-2026, 14:44:34 GMT

Details regarding the datasets used in the experiments are included in Table 2. For Yang et al. [2020], we progressively doubled the number of regions searched, which is the only adjustable hyperparameter. To make this figure, we run all the experiments (all attacks, datasets, and choices of hyperparameters) on a server with 40 cores of Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz. This outcome is seemingly more perplexing than the previous one. We explain it for different values of m, namely the small-m and the large-m regions.

artificial intelligence, hyperparameter, wangetal, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.69)


Boosting Weakly-Supervised Referring Image Segmentation via Progressive Comprehension

Neural Information Processing Systems | Oct-10-2025, 12:43:19 GMT

In this work, we focus on obtaining supervision from text descriptions only.

localization, response map, segmentation, (17 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)


Progressive Weight Loading: Accelerating Initial Inference and Gradually Boosting Performance on Resource-Constrained Environments

Kim, Hyunwoo, Lee, Junha, Choi, Mincheol, Lee, Jeonghwan, Cho, Jaeshin

arXiv.org Artificial Intelligence | Oct-2-2025

Deep learning models have become increasingly large and complex, resulting in higher memory consumption and computational demands. Consequently, model loading times and initial inference latency have increased, posing significant challenges in mobile and latency-sensitive environments where frequent model loading and unloading are required, which directly impacts user experience. While Knowledge Distillation (KD) offers a solution by compressing large teacher models into smaller student ones, it often comes at the cost of reduced performance. To address this trade-off, we propose Progressive Weight Loading (PWL), a novel technique that enables fast initial inference by first deploying a lightweight student model, then incrementally replacing its layers with those of a pre-trained teacher model. To support seamless layer substitution, we introduce a training method that not only aligns intermediate feature representations between student and teacher layers, but also improves the overall output performance of the student model. Our experiments on VGG, ResNet, and ViT architectures demonstrate that models trained with PWL maintain competitive distillation performance and gradually improve accuracy as teacher layers are loaded, matching the final accuracy of the full teacher model without compromising initial inference speed. This makes PWL particularly suited for dynamic, resource-constrained deployments where both responsiveness and performance are critical.
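
The layer-substitution idea in this abstract can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: layers are plain Python callables, student and teacher layers are assumed to correspond one-to-one, and layers are swapped front to back (the abstract does not specify the order). The class and method names are hypothetical.

```python
from typing import Callable, List

Layer = Callable[[float], float]


class ProgressiveModel:
    """Starts as the student and swaps in teacher layers one at a time."""

    def __init__(self, student_layers: List[Layer], teacher_layers: List[Layer]):
        # Assumption: one teacher layer per student layer.
        if len(student_layers) != len(teacher_layers):
            raise ValueError("student and teacher must have the same depth")
        self.layers = list(student_layers)  # inference can start immediately
        self._teacher = list(teacher_layers)
        self._next = 0  # index of the next layer to replace

    def load_next_teacher_layer(self) -> bool:
        """Replace one more student layer; return False when none remain."""
        if self._next >= len(self._teacher):
            return False
        self.layers[self._next] = self._teacher[self._next]
        self._next += 1
        return True

    def forward(self, x: float) -> float:
        for layer in self.layers:
            x = layer(x)
        return x


# Toy layers standing in for real network blocks.
student = [lambda x: x + 1.0, lambda x: x + 1.0]
teacher = [lambda x: x * 2.0, lambda x: x * 3.0]

model = ProgressiveModel(student, teacher)
first_output = model.forward(2.0)      # pure student
model.load_next_teacher_layer()
mixed_output = model.forward(2.0)      # teacher layer 0, student layer 1
model.load_next_teacher_layer()
final_output = model.forward(2.0)      # pure teacher
```

In a real deployment the mixed configurations only work if intermediate features are compatible, which is the role the abstract assigns to its feature-alignment training method.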

machine learning, natural language, student model, (18 more...)

arXiv.org Artificial Intelligence

2509.22319

Genre: Research Report > Promising Solution (0.34)

Industry: Education > Educational Technology > Educational Software (0.60)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)


ee1abc6b5f7c6acb34ad076b05d40815-Paper.pdf

Neural Information Processing Systems | Aug-18-2025, 16:01:43 GMT

artificial intelligence, dataset, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
North America > United States > Colorado > Pueblo County > Pueblo (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)


50ca96a1a9ebe0b5e5688a504feb6107-Supplemental-Conference.pdf

Neural Information Processing Systems | Aug-14-2025, 19:43:35 GMT

epoch, joindet, supervision, (14 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.05)
North America > United States > Colorado (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Industry: Energy (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Improving Constrained Generation in Language Models via Self-Distilled Twisted Sequential Monte Carlo

Kim, Sooyeon, Nam, Giung, Lee, Juho

arXiv.org Machine Learning | Jul-4-2025

Recent work has framed constrained text generation with autoregressive language models as a probabilistic inference problem. Among these, Zhao et al. (2024) introduced a promising approach based on twisted Sequential Monte Carlo, which incorporates learned twist functions and twist-induced proposals to guide the generation process. However, in constrained generation settings where the target distribution concentrates on outputs that are unlikely under the base model, learning becomes challenging due to sparse and uninformative reward signals. We show that iteratively refining the base model through self-distillation alleviates this issue by making the model progressively more aligned with the target, leading to substantial gains in generation quality.

artificial intelligence, language model, natural language, (16 more...)

arXiv.org Machine Learning

2507.02315

Country: Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
