AITopics | Overview

Collaborating Authors

Overview

Active Surrogate Estimators: An Active Learning Approach to Label-Efficient Model Evaluation Sebastian Farquhar 1,3 Yarin Gal 1 Tom Rainforth

Neural Information Processing SystemsMar-27-2025, 08:42:27 GMT

We propose Active Surrogate Estimators (ASEs), a new method for label-efficient model evaluation. Evaluating model performance is a challenging and important problem when labels are expensive. ASEs address this active testing problem using a surrogate-based estimation approach that interpolates the errors of points with unknown labels, rather than forming a Monte Carlo estimator. ASEs actively learn the underlying surrogate, and we propose a novel acquisition strategy, XWED, that tailors this learning to the final estimation task. We find that ASEs offer greater label-efficiency than the current state-of-the-art when applied to challenging model evaluation problems for deep neural networks.

acquisition strategy, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Genre:

Research Report (1.00)
Overview (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

When LLMs Meet Cunning Texts: A Fallacy Understanding Benchmark for Large Language Models Yinghui Li

Neural Information Processing SystemsMar-27-2025, 08:08:56 GMT

Recently, Large Language Models (LLMs) make remarkable evolutions in language understanding and generation. Following this, various benchmarks for measuring all kinds of capabilities of LLMs have sprung up. In this paper, we challenge the reasoning and understanding abilities of LLMs by proposing a FaLlacy Understanding Benchmark (FLUB) containing cunning texts that are easy for humans to understand but difficult for models to grasp. Specifically, the cunning texts that FLUB focuses on mainly consist of the tricky, humorous, and misleading texts collected from the real internet environment. And we design three tasks with increasing difficulty in the FLUB benchmark to evaluate the fallacy understanding ability of LLMs. Based on FLUB, we investigate the performance of multiple representative and advanced LLMs, reflecting our FLUB is challenging and worthy of more future study. Interesting discoveries and valuable insights are achieved in our extensive experiments and detailed analyses. We hope that our benchmark can encourage the community to improve LLMs' ability to understand fallacies. Our data and codes are available at https://github.com/THUKElab/FLUB.

large language model, machine learning, natural language, (22 more...)

Neural Information Processing Systems

Country:

Europe (0.67)
Asia > China > Guangdong Province (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Law (1.00)
Information Technology (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

7f7fa581cc8a1970a4332920cdf87395-Paper-Conference.pdf

Neural Information Processing SystemsMar-27-2025, 07:58:40 GMT

data assimilation, survey article, trajectory, (15 more...)

Neural Information Processing Systems

Country: Europe (0.14)

Genre:

Overview (0.68)
Research Report (0.48)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)

Add feedback

N-Agent Ad Hoc Teamwork

Neural Information Processing SystemsMar-27-2025, 07:58:00 GMT

Current approaches to learning cooperative multi-agent behaviors assume relatively restrictive settings. In standard fully cooperative multi-agent reinforcement learning, the learning algorithm controls all agents in the scenario, while in ad hoc teamwork, the learning algorithm usually assumes control over only a single agent in the scenario. However, many cooperative settings in the real world are much less restrictive. For example, in an autonomous driving scenario, a company might train its cars with the same learning algorithm, yet once on the road, these cars must cooperate with cars from another company. Towards expanding the class of scenarios that cooperative learning methods may optimally address, we introduce N-agent ad hoc teamwork (NAHT), where a set of autonomous agents must interact and cooperate with dynamically varying numbers and types of teammates. This paper formalizes the problem, and proposes the Policy Optimization with Agent Modelling (POAM) algorithm. POAM is a policy gradient, multi-agent reinforcement learning approach to the NAHT problem that enables adaptation to diverse teammate behaviors by learning representations of teammate behaviors. Empirical evaluation on tasks from the multi-agent particle environment and Star-Craft II shows that POAM improves cooperative task returns compared to baseline approaches, and enables out-of-distribution generalization to unseen teammates.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Texas (0.14)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.67)

Industry:

Leisure & Entertainment > Games (0.93)
Information Technology (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Motion Graph Unleashed: A Novel Approach to Video Prediction Bohan Tang

Neural Information Processing SystemsMar-27-2025, 07:46:36 GMT

We introduce motion graph, a novel approach to the video prediction problem, which predicts future video frames from limited past data. The motion graph transforms patches of video frames into interconnected graph nodes, to comprehensively describe the spatial-temporal relationships among them. This representation overcomes the limitations of existing motion representations such as image differences, optical flow, and motion matrix that either fall short in capturing complex motion patterns or suffer from excessive memory consumption. We further present a video prediction pipeline empowered by motion graph, exhibiting substantial performance improvements and cost reductions. Experiments on various datasets, including UCF Sports, KITTI and Cityscapes, highlight the strong representative ability of motion graph. Especially on UCF Sports, our method matches and outperforms the SOTA methods with a significant reduction in model size by 78% and a substantial decrease in GPU memory utilization by 47%. Please refer to this link for the official code.

machine learning, natural language, prediction, (15 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Genre:

Research Report > Experimental Study (0.93)
Research Report > Promising Solution (0.60)
Overview > Innovation (0.60)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning Jun-Yan He4 Jingdong Sun 3

Neural Information Processing SystemsMar-27-2025, 07:43:20 GMT

Accurate emotion perception is crucial for various applications, including humancomputer interaction, education, and counseling. However, traditional singlemodality approaches often fail to capture the complexity of real-world emotional expressions, which are inherently multimodal.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Asia (0.67)
North America > United States (0.45)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.67)

Industry:

Government (1.00)
Education (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.92)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(5 more...)

Add feedback

Knowledge Graph Completion by Intermediate Variables Regularization

Neural Information Processing SystemsMar-27-2025, 07:31:07 GMT

Knowledge graph completion (KGC) can be framed as a 3-order binary tensor completion task. Tensor decomposition-based (TDB) models have demonstrated strong performance in KGC. In this paper, we provide a summary of existing TDB models and derive a general form for them, serving as a foundation for further exploration of TDB models. Despite the expressiveness of TDB models, they are prone to overfitting. Existing regularization methods merely minimize the norms of embeddings to regularize the model, leading to suboptimal performance.

artificial intelligence, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)
Overview (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.61)

Add feedback

SlimGPT: Layer-wise Structured Pruning for Large Language Models

Neural Information Processing SystemsMar-27-2025, 06:43:05 GMT

Large language models (LLMs) have garnered significant attention for their remarkable capabilities across various domains, whose vast parameter scales present challenges for practical deployment. Structured pruning is an effective method to balance model performance with efficiency, but performance restoration under computational resource constraints is a principal challenge in pruning LLMs. Therefore, we present a low-cost and fast structured pruning method for LLMs named SlimGPT based on the Optimal Brain Surgeon framework. We propose Batched Greedy Pruning for rapid and near-optimal pruning, which enhances the accuracy of head-wise pruning error estimation through grouped Cholesky decomposition and improves the pruning efficiency of FFN via Dynamic Group Size, thereby achieving approximate local optimal pruning results within one hour. Besides, we explore the limitations of layer-wise pruning from the perspective of error accumulation and propose Incremental Pruning Ratio, a non-uniform pruning strategy to reduce performance degradation. Experimental results on the LLaMA benchmark show that SlimGPT outperforms other methods and achieves state-of-the-art results.

large language model, machine learning, pruning, (18 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (0.93)
Overview (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models Hannah Rose Kirk 1

Neural Information Processing SystemsMar-27-2025, 06:13:12 GMT

Human feedback is central to the alignment of Large Language Models (LLMs). However, open questions remain about methods (how), domains (where), people (who) and objectives (to what end) of feedback processes.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

South America (1.00)
Oceania (1.00)
Europe > United Kingdom (1.00)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
(2 more...)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Information Technology > Security & Privacy (1.00)
(9 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Supplementary Material Infer Induced Sentiment of Comment Response to Video: A New Task, Dataset and Baseline 1 Lu Liu

Neural Information Processing SystemsMar-27-2025, 05:03:44 GMT

This section provides a comprehensive overview of the CSMV dataset. The CSMV dataset comprises micro videos and their corresponding comments, which have been updated from February 2020 to October 2022. This extensive time range allows for the inclusion of a diverse set of content, capturing the evolution of sentiments over the course of more than two years. In total, the CSMV dataset comprises 8,210 micro videos, totaling approximately 68.83 hours of video duration, along with 107,267 related comments. The CSMV dataset defines two distinct types of labels, opinion and emotion, for analyzing the sentiment expressed in the comments towards the micro videos. By leveraging the combination of video and textual content in this dataset, researchers can examine the interaction between language expressions and visual cues in sentiment analysis. To deepen our understanding of the CSMV dataset, we performed an analysis of the distribution of videos and related comments using specific hashtags. As depicted in Figure 1, this distribution exhibits a rich diversity of topics in video content. This diversity has brought rich expression of sentiment in user comments, giving the CSMV dataset an advantage in comprehending the complexity of induced sentiment. Moreover, this diversity expands the application of the dataset for multimodal sentiment analysis tasks.

artificial intelligence, large language model, natural language, (19 more...)

Neural Information Processing Systems

Genre: Overview (0.88)

Industry: