AITopics | utility score

VIBE: Annotation-Free Video-to-Text Information Bottleneck Evaluation for TL;DR

Neural Information Processing Systems, Jun-20-2026, 15:22:03 GMT

Many decision-making tasks, where both accuracy and efficiency matter, still require human supervision. For example, tasks like traffic officers reviewing hour-long dashcam footage or researchers screening conference videos can benefit from concise summaries that reduce cognitive load and save time. Yet current vision-language models (VLMs) often produce verbose, redundant outputs that hinder task performance. Existing video caption evaluation depends on costly human annotations and overlooks the summaries' utility in downstream tasks. We address these gaps with Video-to-text Information Bottleneck Evaluation (VIBE), an annotation-free method that scores VLM outputs using two metrics: grounding (how well the summary aligns with visual content) and utility (how informative it is for the task). VIBE selects from randomly sampled VLM outputs by ranking them according to the two scores to support effective human decision-making. Human studies on LearningPaper24, SUTD-TrafficQA, and LongVideoBench show that summaries selected by VIBE consistently improve performance, boosting task accuracy by up to 61.23% and reducing response time by 75.77% compared to naive VLM summaries or raw video.
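The selection step described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract does not say how the grounding and utility scores are computed or combined, so the `Candidate` type, the equal-weight sum, the `weight` parameter, and the example scores are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A sampled VLM summary with two precomputed scores (hypothetical values)."""
    text: str
    grounding: float  # how well the summary aligns with the visual content
    utility: float    # how informative the summary is for the task


def combined_score(candidate, weight=0.5):
    """Assumed combination rule: a weighted sum of the two scores."""
    return weight * candidate.grounding + (1.0 - weight) * candidate.utility


def select_summary(candidates, weight=0.5):
    """Return the candidate with the highest combined score."""
    return max(candidates, key=lambda c: combined_score(c, weight))


candidates = [
    Candidate("long, rambling description", grounding=0.9, utility=0.2),
    Candidate("short, task-focused summary", grounding=0.8, utility=0.7),
    Candidate("fluent but hallucinated summary", grounding=0.1, utility=0.9),
]
chosen = select_summary(candidates)
print(chosen.text)
```

With equal weights the task-focused summary wins because it scores reasonably on both axes. Setting `weight` to 1.0 or 0.0 ranks by grounding or utility alone, which shows why a single score can pick a verbose or an ungrounded summary.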

information, machine learning, natural language, (20 more...)

Country: North America > United States > Texas (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.68)

Industry:

Education (0.93)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Model Editing for Vision Transformers

Neural Information Processing Systems, Jun-17-2026, 18:22:08 GMT

Model editing offers a promising paradigm for efficiently and precisely updating knowledge in pre-trained transformers without costly retraining. While extensively studied in language models (LMs), model editing for vision transformers (ViTs) remains underexplored. Existing methods typically adapt LM-based techniques by modifying the multi-layer perceptron (MLP) modules, overlooking the unique characteristics of ViTs. In this work, we show that ViT predictions are more strongly influenced by the multi-head self-attention (MSA) modules than by the MLPs. Building on this observation, we propose a two-stage framework for editing ViTs. First, we identify which attention heads are most responsible for incorrect predictions. Next, we selectively remove the corresponding features to correct the model's prediction. To further balance error correction with predictive stability on unrelated data, we learn a projection matrix that refines the image representations. Extensive experiments across multiple real-world datasets and model editing benchmarks demonstrate that our method consistently outperforms existing model editing methods for ViTs, achieving superior generalization and locality.
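The two stages named in this abstract (find the responsible attention heads, then remove their features) can be illustrated with a toy example. This is only a sketch under strong assumptions, not the paper's method: it assumes the image representation is a plain sum of per-head contributions fed to a linear classifier, it blames each head by how much it pushes the wrong class above the correct one, and it omits the learned projection matrix. All names and numbers are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def predict(head_outputs, class_weights):
    """Sum the per-head contributions and return the highest-scoring class."""
    dim = len(head_outputs[0])
    representation = [sum(h[k] for h in head_outputs) for k in range(dim)]
    scores = [dot(w, representation) for w in class_weights]
    return max(range(len(scores)), key=scores.__getitem__)


def head_blame(head_outputs, class_weights, wrong, correct):
    """Stage 1: how much each head raises the wrong logit over the correct one."""
    direction = [a - b for a, b in zip(class_weights[wrong], class_weights[correct])]
    return [dot(direction, h) for h in head_outputs]


def remove_head(head_outputs, index):
    """Stage 2: drop the contribution of one head."""
    return [h for i, h in enumerate(head_outputs) if i != index]


# Hypothetical toy data: three heads, two classes, class 0 is the correct label.
class_weights = [[1.0, 0.0], [0.0, 1.0]]
heads = [[2.0, 0.0], [0.0, 3.0], [1.0, 0.5]]

before = predict(heads, class_weights)
blame = head_blame(heads, class_weights, wrong=1, correct=0)
worst = max(range(len(blame)), key=blame.__getitem__)
after = predict(remove_head(heads, worst), class_weights)
print(before, blame, worst, after)
```

In this toy setting the second head carries almost all of the evidence for the wrong class, so removing it alone flips the prediction back to the correct label while leaving the other heads untouched.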

artificial intelligence, machine learning, natural language, (18 more...)

Country:

North America (0.28)
Asia (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Conditional Forecasts and Proper Scoring Rules for Reliable and Accurate Performative Predictions

Neural Information Processing Systems, Jun-15-2026, 04:22:14 GMT

Performative predictions are forecasts which influence the outcomes they aim to predict, undermining the existence of correct forecasts and standard methods of elicitation and estimation. We show that conditioning forecasts on covariates that separate them from the outcome renders the target distribution forecast-invariant, guaranteeing well-posedness of the forecasting problem. However, even under this condition, classical proper scoring rules fail to elicit correct forecasts. We prove a general impossibility result and identify two solutions: (i) in decision-theoretic settings, elicitation of correct and incentive-compatible forecasts is possible if forecasts are separating; (ii) scoring with unbiased estimates of the divergence between the forecast and the induced distribution of the target variable yields correct forecasts. Applying these insights to parameter estimation, conditional forecasts and proper scoring rules enable performatively stable estimation of performatively correct parameters, resolving the issues raised by Perdomo et al. (2020). Our results expose fundamental limits of classical forecast evaluation and offer new tools for reliable and accurate forecasting in performative settings.
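For readers unfamiliar with the term, the sketch below shows what "proper" means for a scoring rule in the classical, non-performative setting, where the outcome probability does not depend on the forecast. It does not model the performative case studied in the paper. The probability 0.7 and the function names are illustrative assumptions; the Brier score is a standard proper rule and absolute error is a standard improper one.

```python
def brier(q, outcome):
    """Squared error between a forecast probability q and a 0/1 outcome."""
    return (q - outcome) ** 2


def absolute_error(q, outcome):
    """Absolute error between a forecast probability q and a 0/1 outcome."""
    return abs(q - outcome)


def expected_loss(loss, q, p):
    """Expected loss of reporting q when the event occurs with probability p."""
    return p * loss(q, 1) + (1 - p) * loss(q, 0)


def best_report(loss, p, steps=100):
    """Grid-search the report q in [0, 1] that minimizes the expected loss."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda q: expected_loss(loss, q, p))


true_probability = 0.7  # hypothetical, fixed regardless of the forecast
print(best_report(brier, true_probability))           # 0.7
print(best_report(absolute_error, true_probability))  # 1.0
```

Under the Brier score the forecaster does best by reporting the true probability, whereas absolute error rewards rounding the forecast to the more likely outcome. The abstract's point is that once the forecast itself shifts the outcome distribution, even proper rules like the first one can fail to elicit correct forecasts.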

artificial intelligence, bayesian inference, machine learning, (20 more...)

Country:

Europe (0.93)
North America > United States > New York (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry:

Leisure & Entertainment > Games (0.96)
Health & Medicine (0.68)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

5e388103a391daabe3de1d76a6739ccd-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-12-2026, 08:04:23 GMT

application, interpretation, subset, (10 more...)

Technology: Information Technology > Artificial Intelligence (1.00)

Modeling Contextual Passage Utility for Multihop Question Answering

Jain, Akriti, Garimella, Aparna

arXiv.org Artificial Intelligence, Dec-9-2025

Multihop Question Answering (QA) requires systems to identify and synthesize information from multiple text passages. While most prior retrieval methods assist in identifying relevant passages for QA, further assessing the utility of the passages can help in removing redundant ones, which may otherwise add to noise and inaccuracies in the generated answers. Existing utility prediction approaches model passage utility independently, overlooking a critical aspect of multihop reasoning: the utility of a passage can be context-dependent, influenced by its relation to other passages - whether it provides complementary information or forms a crucial link in conjunction with others. In this paper, we propose a lightweight approach to model contextual passage utility, accounting for inter-passage dependencies. We fine-tune a small transformer-based model to predict passage utility scores for multihop QA. We leverage the reasoning traces from an advanced reasoning model to capture the order in which passages are used to answer a question and obtain synthetic training data. Through comprehensive experiments, we demonstrate that our utility-based scoring of retrieved passages leads to improved reranking and downstream QA performance compared to relevance-based reranking methods.

large language model, machine learning, utility score, (17 more...)

2512.06464

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

CoS: Towards Optimal Event Scheduling via Chain-of-Scheduling

Zhao, Yiming, Tang, Jiwei, Di, Shimin, Zheng, Libin, Yu, Jianxing, Yin, Jian

arXiv.org Artificial Intelligence, Nov-18-2025

Recommending event schedules is a key issue in Event-based Social Networks (EBSNs) in order to maintain user activity. An effective recommendation must maximize the user's preference, subject to both time and geographical constraints. Existing methods face an inherent trade-off among efficiency, effectiveness, and generalization, due to the NP-hard nature of the problem. This paper proposes the Chain-of-Scheduling (CoS) framework, which activates the event scheduling capability of Large Language Models (LLMs) through a guided, efficient scheduling process. CoS enhances LLMs by formulating the scheduling task into three atomic stages, i.e., exploration, verification and integration. Then we enable the LLMs to generate CoS autonomously via Knowledge Distillation (KD). Experimental results show that CoS achieves near-theoretical optimal effectiveness with high efficiency on three real-world datasets in an interpretable manner. Moreover, it demonstrates strong zero-shot learning ability on out-of-domain data.

large language model, machine learning, utility score, (20 more...)

2511.12913

Country: Asia > China > Guangdong Province (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Conditional Forecasts and Proper Scoring Rules for Reliable and Accurate Performative Predictions

Boeken, Philip, Zoeter, Onno, Mooij, Joris M.

arXiv.org Machine Learning, Oct-27-2025

Performative predictions are forecasts which influence the outcomes they aim to predict, undermining the existence of correct forecasts and standard methods of elicitation and estimation. We show that conditioning forecasts on covariates that separate them from the outcome renders the target distribution forecast-invariant, guaranteeing well-posedness of the forecasting problem. However, even under this condition, classical proper scoring rules fail to elicit correct forecasts. We prove a general impossibility result and identify two solutions: (i) in decision-theoretic settings, elicitation of correct and incentive-compatible forecasts is possible if forecasts are separating; (ii) scoring with unbiased estimates of the divergence between the forecast and the induced distribution of the target variable yields correct forecasts. Applying these insights to parameter estimation, conditional forecasts and proper scoring rules enable performatively stable estimation of performatively correct parameters, resolving the issues raised by Perdomo et al. (2020). Our results expose fundamental limits of classical forecast evaluation and offer new tools for reliable and accurate forecasting in performative settings.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2510.21335

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(2 more...)

Genre: Research Report (0.83)

Industry: Leisure & Entertainment > Games (0.97)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

5e388103a391daabe3de1d76a6739ccd-AuthorFeedback.pdf

Neural Information Processing Systems, Oct-2-2025, 20:04:15 GMT

application, interpretation, subset, (10 more...)

Technology: Information Technology > Artificial Intelligence (1.00)

We thank reviewers for their constructive comments; please see below for our response

Neural Information Processing Systems, Oct-2-2025, 17:22:25 GMT

We thank reviewers for their constructive comments; please see below for our response. We will make this clear in the revised version. We will include the new results in the revision. Reviewer #2-1: Why SVT suffers from low accuracy. PC's original privacy guarantee might not hold because the sensitivity of the utility score calculated with greedy search [...] We will make the statement clearer in the revision.

artificial intelligence, machine learning, reviewer, (16 more...)

Genre: Research Report > New Finding (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)
