Multi-agent active perception with prediction rewards
Multi-agent active perception is a task where a team of agents cooperatively gathers observations to compute a joint estimate of a hidden variable. The task is decentralized and the joint estimate can only be computed after the task ends by fusing observations of all agents. The objective is to maximize the accuracy of the estimate. The accuracy is quantified by a centralized prediction reward determined by a centralized decision-maker who perceives the observations gathered by all agents after the task ends. In this paper, we model multi-agent active perception as a decentralized partially observable Markov decision process (Dec-POMDP) with a convex centralized prediction reward. We prove that by introducing individual prediction actions for each agent, the problem is converted into a standard Dec-POMDP with a decentralized prediction reward. The loss due to decentralization is bounded, and we give a sufficient condition for when it is zero. Our results allow application of any Dec-POMDP solution algorithm to multi-agent active perception problems, and enable planning to reduce uncertainty without explicit computation of joint estimates. We demonstrate the empirical usefulness of our results by applying a standard Dec-POMDP algorithm to multi-agent active perception problems, showing increased scalability in the planning horizon.
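The conversion sketched in this abstract can be illustrated numerically. The reward matrix `R`, the two-state belief, the mixture-based belief fusion, and all names below are hypothetical toy choices, not the paper's actual domains: a convex centralized prediction reward is a maximum of linear functions of the final fused belief, and decentralization replaces the single maximizing prediction with one prediction action per agent, each chosen from that agent's local information only.

```python
import numpy as np

# Toy illustration (hypothetical numbers): hidden state s in {0, 1}.
# R[s, a] is the reward for prediction action a when the true state is s,
# so rho(b) = max_a sum_s b[s] * R[s, a] is convex (piecewise-linear) in b.
R = np.array([[1.0, 0.0],
              [0.0, 1.0]])

def centralized_reward(b):
    """Centralized decision-maker sees the fused belief b and picks
    the single best prediction action."""
    return max(float(b @ R[:, a]) for a in range(R.shape[1]))

def decentralized_reward(local_beliefs, weights):
    """Each agent picks its own prediction action from its local belief;
    the picked actions are then scored under the fused belief.
    Hypothetical fusion rule: a weighted mixture of local beliefs."""
    fused = sum(w * b for w, b in zip(weights, local_beliefs))
    picks = [int(np.argmax(b @ R)) for b in local_beliefs]
    return float(np.mean([fused @ R[:, a] for a in picks]))

b1, b2 = np.array([0.9, 0.1]), np.array([0.6, 0.4])
fused = 0.5 * b1 + 0.5 * b2                     # [0.75, 0.25]
centralized_reward(fused)                       # 0.75
decentralized_reward([b1, b2], [0.5, 0.5])      # 0.75: both agents agree here
```

When the local beliefs disagree on the best prediction, the decentralized value drops below the centralized one; the loss bound in the abstract concerns exactly this gap, and it vanishes when all agents' local predictions coincide with the centralized one.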
- Europe > Netherlands > South Holland > Delft (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > Hamburg (0.04)
PreferThinker: Reasoning-based Personalized Image Preference Assessment
Xu, Shengqi, Zhou, Xinpeng, Zhang, Yabo, Liu, Ming, Liang, Tao, Zhang, Tianyu, Bai, Yalong, Wu, Zuxuan, Zuo, Wangmeng
Personalized image preference assessment aims to evaluate an individual user's image preferences by relying only on a small set of reference images as prior information. Existing methods mainly focus on general preference assessment, training models with large-scale data to tackle well-defined tasks such as text-image alignment. However, these approaches struggle to handle personalized preference because user-specific data are scarce and not easily scalable, and individual tastes are often diverse and complex. To overcome these challenges, we introduce a common preference profile that serves as a bridge across users, allowing large-scale user data to be leveraged for training profile prediction and capturing complex personalized preferences. Building on this idea, we propose a reasoning-based personalized image preference assessment framework that follows a predict-then-assess paradigm: it first predicts a user's preference profile from reference images, and then provides interpretable, multi-dimensional scores and assessments of candidate images based on the predicted profile. To support this, we first construct a large-scale Chain-of-Thought (CoT)-style personalized assessment dataset annotated with diverse user preference profiles and high-quality CoT-style reasoning, enabling explicit supervision of structured reasoning. Next, we adopt a two-stage training strategy: a cold-start supervised fine-tuning phase to empower the model with structured reasoning capabilities, followed by reinforcement learning to incentivize the model to explore more reasonable assessment paths and enhance generalization. Furthermore, we propose a similarity-aware prediction reward to encourage better prediction of the user's preference profile, which facilitates more reasonable assessment exploration. Extensive experiments demonstrate the superiority of the proposed method.
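The abstract does not spell out the form of the similarity-aware prediction reward, so the following is only a plausible minimal sketch under the assumption that the predicted and annotated preference profiles are embedded as vectors and compared by cosine similarity; the function name and the rescaling to [0, 1] are my own illustrative choices, not the authors' definition.

```python
import numpy as np

def similarity_aware_reward(pred_profile, true_profile):
    """Hypothetical RL reward: cosine similarity between the predicted
    and annotated user-preference profile embeddings, rescaled from
    [-1, 1] to [0, 1] so it can be combined with other reward terms."""
    p = np.asarray(pred_profile, dtype=float)
    t = np.asarray(true_profile, dtype=float)
    cos = p @ t / (np.linalg.norm(p) * np.linalg.norm(t))
    return 0.5 * (cos + 1.0)

similarity_aware_reward([1.0, 0.0], [1.0, 0.0])  # 1.0 (perfect match)
similarity_aware_reward([1.0, 0.0], [0.0, 1.0])  # 0.5 (orthogonal profiles)
```

A denser reward of this kind gives the policy gradient signal even when the final assessment score is wrong, which is one common motivation for auxiliary prediction rewards in RL fine-tuning.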
- Europe > Netherlands > South Holland > Delft (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > Hamburg (0.04)
Review for NeurIPS paper: Multi-agent active perception with prediction rewards
Weaknesses: The paper is well written and easy to follow. The problem of active perception is also interesting. There are a few areas where more clarification is needed, as pointed out below:
- The authors have highlighted a number of previous models for the problem of active perception, such as Dec-ρPOMDP and POMDP-IR. Given the focus on converting this problem to a decentralized framework, it is not clearly conveyed why decentralizing the problem is significant. There are hints available in the paper, such as reduced communication overhead, but no empirical evidence is presented to justify decentralized approaches over these previous approaches (e.g., how much communication overhead is reduced).
- The technical approach presented by the authors is elegant and simple, but it is essentially a heuristic approach. The bound provided in Theorem 1 would seem to be loose in the worst case (and its value in the experiments is not shown).
Review for NeurIPS paper: Multi-agent active perception with prediction rewards
This paper addresses the problem of multi-agent active perception, a somewhat nascent area, and proposes a new reformulation of Dec-ρPOMDPs into a Dec-POMDP through the addition of a final-stage "prediction action." The reviewers appreciated the novelty of this contribution as well as the theoretical analysis and loss bounds. The original reviews raised a number of questions, however, and the author response addressed many of these. Nevertheless, there remain some issues that undercut the significance of the contribution, including the somewhat incremental combination/adaptation of existing techniques and the fact that the claimed scalability is not demonstrated very convincingly in the experiments, among others. On my reading of the paper, I largely concur; I do not reiterate the positive contributions noted in the other reviews, but point out some concerns about importance/impact: 1.
Dynamic feature selection in medical predictive monitoring by reinforcement learning
Chen, Yutong, Gao, Jiandong, Wu, Ji
In this paper, we investigate dynamic feature selection within a multivariate time-series scenario, a common occurrence in clinical predictive monitoring where each feature corresponds to a bio-test result. Many existing feature selection methods fall short in effectively leveraging time-series information, primarily because they are designed for static data. Our approach addresses this limitation by enabling the selection of time-varying feature subsets for each patient. Specifically, we employ reinforcement learning to optimize a policy under maximum cost restrictions. The prediction model is subsequently updated using synthetic data generated by the trained policy. Our method can seamlessly integrate with non-differentiable prediction models. We conducted experiments on a sizable clinical dataset encompassing regression and classification tasks. The results demonstrate that our approach outperforms strong feature selection baselines, particularly when subjected to stringent cost limitations. Code will be released once the paper is accepted.
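The cost-constrained selection step described in this abstract can be sketched as follows. This is not the authors' code (none has been released); the per-feature costs, the budget, and the greedy score-per-cost rule are illustrative assumptions standing in for the trained RL policy, which would produce the selection directly.

```python
import numpy as np

# Hypothetical per-feature acquisition costs (e.g., prices of bio-tests)
# and a per-step budget; a trained policy would supply the scores.
costs = np.array([1.0, 2.0, 0.5, 3.0])
budget = 3.0

def select_features(scores, costs, budget):
    """Greedy knapsack-style stand-in for the learned policy: order
    features by score-per-cost and take them until the budget is spent."""
    order = np.argsort(-scores / costs)
    chosen, spent = [], 0.0
    for i in order:
        if spent + costs[i] <= budget:
            chosen.append(int(i))
            spent += costs[i]
    return sorted(chosen)

select_features(np.array([0.9, 0.1, 0.8, 0.2]), costs, budget)  # [0, 2]
```

In the paper's setting this selection would be repeated at every time step, so the chosen subset can vary over a patient's stay; the greedy rule above is only a stand-in for that learned time-varying behavior.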
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
- Health & Medicine > Therapeutic Area (0.93)
- Health & Medicine > Diagnostic Medicine (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)