AITopics

2510.02423

Genre: Research Report (1.00)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.47)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.30)

Neural Information Processing Systems, Oct-3-2025, 08:38:36 GMT

A Appendix: An Example of Single-Agent Preference Models. Example 2. In a single-agent Mallows' model M

Let q ∈ ℕ and Π be a closed set of strictly positive distributions over [q]. Let CH(Π) denote the convex hull of Π.
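For readers unfamiliar with the notation, the convex hull mentioned in the excerpt can be written out using the standard textbook definition. This is a sketch of the general definition only; the symbols k, λ_i, and π_i are introduced here for illustration and do not come from the paper.

```latex
\[
  \mathrm{CH}(\Pi) \;=\; \Bigl\{\, \sum_{i=1}^{k} \lambda_i \pi_i \;:\;
  k \in \mathbb{N},\ \pi_i \in \Pi,\ \lambda_i \ge 0,\ \sum_{i=1}^{k} \lambda_i = 1 \,\Bigr\}
\]
```

Each element of CH(Π) is a mixture of distributions in Π, so it is itself a strictly positive distribution over [q].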

artificial intelligence, denote, hist, (15 more...)

Country: Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.41)

Neural Information Processing Systems, Oct-3-2025, 08:03:24 GMT

existence of multiple representations of the same environment for a few sample neurons, we performed hypothesis tests for multiple

We thank all reviewers for their careful reviews and many positive comments. We feel that the typos and minor issues are easily addressable and will be corrected. We will incorporate this analysis into a revision of the paper. We thank R1 for bringing this highly related work to our attention. That work focuses on environments for which mice have previously developed spatial maps.

multiple representation, representation, spatial map, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.60)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.47)

Neural Information Processing Systems, Oct-3-2025, 07:33:51 GMT

Bridging Machine Learning and Logical Reasoning by Abductive Learning

Wang-Zhou Dai, Qiuling Xu, Yang Yu, Zhi-Hua Zhou

Perception and reasoning are two representative abilities of intelligence that are integrated seamlessly during human problem-solving processes.

knowledge, logical abduction, reasoning, (12 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
(11 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Abductive Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.71)

Tuan A. Nguyen, Subbarao Kambhampati, Minh Do

Synthesizing Robust Plans under Incomplete Domain Models

Neural Information Processing Systems, Oct-3-2025, 07:23:10 GMT

Neural Information Processing Systems http://nips.cc/

incomplete domain model, synthesizing robust plan

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.40)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.40)

Tianyi Zhou, Jeff A. Bilmes, Carlos Guestrin

Divide-and-Conquer Learning by Anchoring a Conical Hull

Neural Information Processing Systems, Oct-3-2025, 05:12:17 GMT

Neural Information Processing Systems http://nips.cc/

anchoring, conical hull, divide-and-conquer learning

Technology: Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.40)

Diversity-Enhanced Reasoning for Subjective Questions

Wang, Yumeng, Fan, Zhiyuan, Liu, Jiayu, Huang, Jen-tse, Fung, Yi R.

Large Reasoning Models (LRMs) with long chain-of-thought capabilities, optimized via reinforcement learning with verifiable rewards (RLVR), excel at objective reasoning tasks like mathematical problem solving and code generation. However, RLVR is known to degrade generation diversity, which causes LRMs to fall short on subjective reasoning, where multiple answers are valid depending on the role perspective. While recent studies recognize the importance of diversity-enhanced training in objective reasoning, limited attention has been given to subjective tasks. In this paper, we find that subjective reasoning can be improved by introducing perspective diversity and token-level diversity, with the former providing coherent scaffolding anchored to a real-world stakeholder group and the latter broadening the answer search space. We propose MultiRole-R1, a diversity-enhanced training framework featuring an unsupervised data construction pipeline that synthesizes reasoning chains incorporating various role perspectives. It also employs reinforcement learning via Group Relative Policy Optimization with reward shaping, taking diversity as a reward signal in addition to the verifiable reward. Trained solely on subjective tasks, MultiRole-R1 increases in-domain and out-of-domain accuracy by 14.1% and 7.64%, respectively, and even enhances performance on advanced math reasoning such as AIME 2024. We further show that diversity is a more consistent indicator of accuracy than reasoning length.
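To make the reward-shaping idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the diversity signal is simply added to the verifiable reward with a fixed weight, and it uses the commonly described group-relative normalization (reward minus group mean, divided by group standard deviation). The function names, the `weight` parameter, and the distinct-token ratio used as a diversity proxy are all hypothetical.

```python
import statistics


def distinct_ratio(tokens):
    """Hypothetical diversity proxy: fraction of unique tokens in a response."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def shaped_rewards(verifiable, diversity, weight=0.1):
    """Assumed shaping rule: verifiable reward plus a weighted diversity bonus."""
    return [v + weight * d for v, d in zip(verifiable, diversity)]


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Four sampled responses to one prompt: two correct, two incorrect,
# with made-up diversity scores for illustration only.
rewards = shaped_rewards([1.0, 0.0, 1.0, 0.0], [0.5, 0.5, 0.0, 0.0], weight=0.2)
advantages = group_relative_advantages(rewards)
```

Under this assumed rule, a correct response with higher diversity receives a larger advantage than an equally correct but less diverse one, while correctness still dominates as long as `weight` is small.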

large language model, machine learning, natural language, (20 more...)

2507.20187

Country:

Europe (1.00)
Asia (0.92)
North America > United States > California (0.45)
North America > United States > Minnesota (0.28)

Genre: Research Report (1.00)

Industry:

Education > Educational Setting (0.47)
Health & Medicine > Health Care Providers & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)
(2 more...)

RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning

Feng, Sicheng, Tuo, Kaiwen, Wang, Song, Kong, Lingdong, Zhu, Jianke, Wang, Huan

Fine-grained visual reasoning remains a core challenge for multimodal large language models (MLLMs). The recently introduced ReasonMap highlights this gap by showing that even advanced MLLMs struggle with spatial reasoning in structured and information-rich settings such as transit maps, a task of clear practical and scientific importance. However, standard reinforcement learning (RL) on such tasks is impeded by sparse rewards and unstable optimization. To address this, we first construct ReasonMap-Plus, an extended dataset that introduces dense reward signals through Visual Question Answering (VQA) tasks, enabling effective cold-start training of fine-grained visual understanding skills. Next, we propose RewardMap, a multi-stage RL framework designed to improve both visual understanding and reasoning capabilities of MLLMs. RewardMap incorporates two key designs. First, we introduce a difficulty-aware reward design that incorporates detail rewards, directly tackling the sparse rewards while providing richer supervision. Second, we propose a multi-stage RL scheme that bootstraps training from simple perception to complex reasoning tasks, offering a more effective cold-start strategy than conventional Supervised Fine-Tuning (SFT). Experiments on ReasonMap and ReasonMap-Plus demonstrate that each component of RewardMap contributes to consistent performance gains, while their combination yields the best results. Moreover, models trained with RewardMap achieve an average improvement of 3.47% across 6 benchmarks spanning spatial reasoning, fine-grained visual reasoning, and general tasks beyond transit maps, underscoring enhanced visual understanding and reasoning capabilities.

large language model, machine learning, reinforcement learning, (19 more...)

2510.0224

Genre: Research Report (1.00)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
(2 more...)

Look Less, Reason More: Rollout-Guided Adaptive Pixel-Space Reasoning

Li, Xuchen, Li, Xuzhao, Gao, Jiahui, Pi, Renjie, Hu, Shiyu, Zhang, Wentao

Vision-Language Models (VLMs) excel at many multimodal tasks, yet they frequently struggle with tasks requiring precise understanding and handling of fine-grained visual elements. This is mainly due to information loss during image encoding or insufficient attention to critical regions. Recent work has shown promise by incorporating pixel-level visual information into the reasoning process, enabling VLMs to access high-resolution visual details during their thought process. However, this pixel-level information is often overused, leading to inefficiency and distraction from irrelevant visual details. To address these challenges, we propose the first framework for adaptive pixel reasoning that dynamically determines necessary pixel-level operations based on the input query. Specifically, we first apply operation-aware supervised fine-tuning to establish baseline competence in textual reasoning and visual operations, then design a novel rollout-guided reinforcement learning framework relying on feedback of the model's own responses, which enables the VLM to determine when pixel operations should be invoked based on query difficulty. Experiments on extensive multimodal reasoning benchmarks show that our model achieves superior performance while significantly reducing unnecessary visual operations. Impressively, our model achieves 73.4% accuracy on HR-Bench 4K while maintaining a tool usage ratio of only 20.1%, improving accuracy and simultaneously reducing tool usage by 66.5% compared to the previous methods.

Vision-Language Models (VLMs) have achieved remarkable progress, leveraging large language models and powerful vision encoders. However, VLMs frequently encounter difficulties in capturing fine-grained visual elements, largely because of information loss in the image encoding process or the limited allocation of attention to critical regions (Ge et al., 2024; He et al., 2024). Recently, advanced models (Su et al., 2025a; Wang et al., 2025c; Zhang et al., 2025b; Zheng et al., 2025; Zhou et al., 2025) have been proposed, which are capable of executing pixel-level operations, an ability we refer to as pixel-space reasoning.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

2510.01681

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
(2 more...)

Chitty-Venkata, Krishna Teja, Emani, Murali

ImageNet-Think-250K: A Large-Scale Synthetic Dataset for Multimodal Reasoning for Vision Language Models

We develop ImageNet-Think, a multimodal reasoning dataset designed to aid the development of Vision Language Models (VLMs) with explicit reasoning capabilities. Our dataset is built on 250,000 images from the ImageNet-21k dataset, providing structured thinking tokens and corresponding answers. Our synthetic dataset is generated by two state-of-the-art VLMs: GLM-4.1V-9B-Thinking and Kimi-VL-A3B-Thinking-2506. Each image is accompanied by two pairs of thinking-answer sequences, creating a resource for training and evaluating multimodal reasoning models. We capture the step-by-step reasoning process of VLMs and the final descriptive answers. Our goal with this dataset is to enable the development of more robust VLMs while contributing to the broader understanding of multimodal reasoning mechanisms. The dataset and evaluation benchmarks will be publicly available to aid research in reasoning/thinking multimodal VLMs. The dataset is available on HuggingFace.

large language model, machine learning, natural language, (20 more...)

2510.01582

Genre: Research Report (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)