AITopics

Unsupervised Image Denoising with Score Function

Neural Information Processing Systems, May-25-2025, 13:56:21 GMT

Although current unsupervised learning methods for single-image denoising perform well in some cases, they are usually constrained in where they can be applied. In this paper, we propose a new approach that is more general and applicable to complicated noise models. Utilizing a property of the score function, the gradient of the log-probability, we define a solving system for denoising. Once the score function of noisy images has been estimated, the denoised result can be obtained through the solving system. Our approach can be applied to multiple noise models, such as a mixture of multiplicative and additive noise combined with structured correlation. Experimental results show that our method is comparable to existing methods when the noise model is simple, and performs well in complicated cases where other methods are not applicable or perform poorly.
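To make the role of the score function concrete, here is a minimal sketch of the best-known special case, additive Gaussian noise, where Tweedie's formula gives the posterior mean as `y + sigma**2 * score(y)`. This is not the paper's general solving system, which the abstract says also covers multiplicative and correlated noise. The toy Gaussian prior, the analytic `gaussian_score`, and all parameter values are assumptions chosen so the result can be checked in closed form; in practice the score would come from a trained estimator.

```python
import numpy as np

def tweedie_denoise(y, score_fn, sigma):
    """Posterior-mean estimate E[x | y] for y = x + n, n ~ N(0, sigma^2)."""
    return y + sigma**2 * score_fn(y)

# Toy setting (assumed): clean signal x ~ N(mu, s^2), so y ~ N(mu, s^2 + sigma^2).
mu, s, sigma = 2.0, 1.0, 0.5

def gaussian_score(y):
    # Analytic score (gradient of log-density) of the noisy marginal.
    return -(y - mu) / (s**2 + sigma**2)

rng = np.random.default_rng(0)
x = rng.normal(mu, s, size=10_000)             # clean samples
y = x + rng.normal(0.0, sigma, size=x.shape)   # noisy observations
x_hat = tweedie_denoise(y, gaussian_score, sigma)

mse_noisy = np.mean((y - x) ** 2)
mse_denoised = np.mean((x_hat - x) ** 2)
print(f"MSE noisy: {mse_noisy:.3f}, MSE denoised: {mse_denoised:.3f}")
```

In this toy case the estimate coincides with the classical shrinkage `mu + s**2 / (s**2 + sigma**2) * (y - mu)`, which is why the denoised error is lower than the noisy one.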

artificial intelligence, machine learning, noise model, (15 more...)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

LucidAction: A Hierarchical and Multi-model Dataset for Comprehensive Action Quality Assessment

Linfeng Dong, Wei Wang, and Xiao Sun

Neural Information Processing Systems, May-25-2025, 13:54:27 GMT

AQA dataset structured on curriculum learning principles.

action quality assessment, artificial intelligence, machine learning, (13 more...)

Country:

Europe > Netherlands (0.14)
Asia > Middle East > Israel (0.14)
Asia > Japan (0.14)
Asia > China (0.14)

Genre: Research Report (0.46)

Industry:

Education (0.68)
Leisure & Entertainment > Sports > Olympic Games (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Evaluating Cognitive Maps and Planning in Large Language Models with CogEval (Supplementary Materials)

Neural Information Processing Systems, May-25-2025, 13:54:19 GMT

We chose a logistic regression to model the number of items the LLM answers correctly in a given dialog out of a total number of possible correct answers. We aggregated the results into an analysis of deviance table (the generalized linear model equivalent of Analysis of Variance or ANOVA), which highlights the contributions of each factor and their interactions to performance, along with significance statistics. In the presented study, the "score" of a dialog is the number of correct answers provided by the LLM out of a total number of correct answers possible for that dialog. We modeled the score using a logistic regression approach; the score follows a binomial distribution with a probability parameter determined by the three categorical variables (graph structure, condition, and domain) as well as model and temperature. Our regression model included second- and third-order interaction terms between levels of these three terms.
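The modeling step described above can be illustrated with a small NumPy sketch of a binomial logistic regression and the deviance comparison that feeds an analysis of deviance table. Everything here is an assumption for illustration: the data are synthetic, a single three-level factor `cond` stands in for the study's factors, no interaction terms are included, and `fit_binomial_glm` and `deviance` are hypothetical helpers rather than the authors' code.

```python
import numpy as np

def fit_binomial_glm(X, k, n, n_iter=50):
    """Fit a logit-link binomial GLM by iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        p = 1.0 / (1.0 + np.exp(-eta))
        w = n * p * (1.0 - p)            # IRLS weights
        z = eta + (k - n * p) / w        # working response
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
        converged = np.max(np.abs(beta_new - beta)) < 1e-10
        beta = beta_new
        if converged:
            break
    return beta

def deviance(X, k, n, beta):
    """Binomial deviance, using the convention 0 * log(0) = 0."""
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    with np.errstate(divide="ignore", invalid="ignore"):
        t1 = np.where(k > 0, k * np.log(k / (n * p)), 0.0)
        t2 = np.where(k < n, (n - k) * np.log((n - k) / (n * (1.0 - p))), 0.0)
    return 2.0 * np.sum(t1 + t2)

# Synthetic dialogs (assumed): 300 dialogs, 10 possible correct answers each.
rng = np.random.default_rng(0)
cond = rng.integers(0, 3, size=300)                 # hypothetical 3-level factor
X_full = np.column_stack([np.ones(300), cond == 1, cond == 2]).astype(float)
true_beta = np.array([0.0, 1.0, -1.0])
n = np.full(300, 10)
k = rng.binomial(n, 1.0 / (1.0 + np.exp(-(X_full @ true_beta))))  # dialog scores

X_null = X_full[:, :1]                              # intercept-only model
beta_null = fit_binomial_glm(X_null, k, n)
beta_full = fit_binomial_glm(X_full, k, n)
dev_null = deviance(X_null, k, n, beta_null)
dev_full = deviance(X_full, k, n, beta_full)
dev_drop = dev_null - dev_full   # compare to a chi-square with 2 degrees of freedom
print("coefficients:", beta_full, "deviance drop:", dev_drop)
```

Each row of an analysis of deviance table corresponds to a drop like `dev_drop`, obtained by adding one factor or interaction to a nested model and comparing the reduction against a chi-square distribution with the matching degrees of freedom.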

large language model, machine learning, natural language, (18 more...)

Country:

Europe (0.29)
North America > United States (0.15)

Genre:

Research Report > New Finding (0.87)
Research Report > Experimental Study (0.55)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Evaluating Cognitive Maps and Planning in Large Language Models with CogEval

Neural Information Processing Systems, May-25-2025, 13:54:16 GMT

Yet, most rely on anecdotes, overlook contamination of training sets, or lack systematic evaluation involving multiple tasks, control conditions, multiple iterations, and statistical robustness tests. Here we make two major contributions. First, we propose CogEval, a cognitive science-inspired protocol for the systematic evaluation of cognitive capacities in LLMs. The CogEval protocol can be followed for the evaluation of various abilities. Second, here we follow CogEval to systematically evaluate cognitive maps and planning ability across eight LLMs (OpenAI GPT-4, GPT-3.5-turbo-175B,

large language model, machine learning, natural language, (19 more...)

Country: North America > United States (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.95)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Module-wise Adaptive Distillation for Multimodality Foundation Models

Neural Information Processing Systems, May-25-2025, 13:53:54 GMT

Pre-trained multimodal foundation models have demonstrated remarkable generalizability but pose challenges for deployment due to their large sizes. One effective approach to reducing their sizes is layerwise distillation, wherein small student models are trained to match the hidden representations of large teacher models at each layer. Motivated by our observation that certain architecture components, referred to as modules, contribute more significantly to the student's performance than others, we propose to track the contributions of individual modules by recording the loss decrement after distilling each module and to distill modules with greater contributions more frequently. Such an approach can be naturally formulated as a multi-armed bandit (MAB) problem, where modules and loss decrements are considered as arms and rewards, respectively. We then develop a modified Thompson sampling algorithm named OPTIMA to address the nonstationarity of module contributions resulting from model updating. Specifically, we leverage the observed contributions in recent history to estimate the changing contribution of each module and select modules based on these estimations to maximize the cumulative contribution. We evaluate the effectiveness of OPTIMA through distillation experiments on various multimodal understanding and image captioning tasks, using the CoCa-Large model [48] as the teacher model.
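The bandit formulation can be sketched as follows. This is not OPTIMA itself, whose details are not given in the abstract; it is a generic discounted Gaussian Thompson sampler, one common way to favor recent observations under nonstationary rewards. The class name `DiscountedThompsonSampler`, the `discount` and `prior_var` parameters, and the simulated loss decrements are all hypothetical.

```python
import numpy as np

class DiscountedThompsonSampler:
    """Gaussian Thompson sampling with exponentially discounted statistics."""

    def __init__(self, n_arms, discount=0.9, prior_var=0.25, seed=0):
        self.discount = discount
        self.prior_var = prior_var
        self.weight = np.zeros(n_arms)       # discounted pull counts
        self.reward_sum = np.zeros(n_arms)   # discounted reward sums
        self.rng = np.random.default_rng(seed)

    def select(self):
        # Posterior mean shrinks toward a zero prior; variance falls with weight.
        mean = self.reward_sum / (self.weight + 1.0)
        std = np.sqrt(self.prior_var / (self.weight + 1.0))
        return int(np.argmax(self.rng.normal(mean, std)))

    def update(self, arm, reward):
        # Old evidence decays for all arms, so stale estimates get re-explored.
        self.weight *= self.discount
        self.reward_sum *= self.discount
        self.weight[arm] += 1.0
        self.reward_sum[arm] += reward

sampler = DiscountedThompsonSampler(n_arms=3)
env_rng = np.random.default_rng(1)
choices = []
for step in range(1000):
    # Hypothetical drifting contributions: module 0 matters early, module 2 later.
    true_mean = np.array([1.0, 0.2, 0.1]) if step < 500 else np.array([0.1, 0.2, 1.0])
    arm = sampler.select()                               # module to distill next
    reward = true_mean[arm] + env_rng.normal(0.0, 0.1)   # simulated loss decrement
    sampler.update(arm, reward)
    choices.append(arm)

choices = np.array(choices)
early_counts = np.bincount(choices[300:500], minlength=3)
late_counts = np.bincount(choices[800:], minlength=3)
print("early picks per module:", early_counts, "late picks per module:", late_counts)
```

Discounting makes the sampler forget old rewards, so after the simulated shift at step 500 it moves its selections from module 0 to module 2 instead of staying locked on the formerly best arm.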

data mining, machine learning, natural language, (19 more...)

Country: North America > United States > Maryland (0.14)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.67)

dc81297c791bb989deade65c6bd8c1d8-Paper-Conference.pdf

Neural Information Processing Systems, May-25-2025, 13:53:37 GMT

classification, large language model, machine learning, (18 more...)

Country: North America > United States > Michigan > Ingham County (0.14)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

4D Panoptic Scene Graph Generation

Neural Information Processing Systems, May-25-2025, 13:53:20 GMT

We are living in a three-dimensional space while moving forward through a fourth dimension: time. To allow artificial intelligence to develop a comprehensive understanding of such a 4D environment, we introduce 4D Panoptic Scene Graph (PSG-4D), a new representation that bridges the raw visual data perceived in a dynamic 4D world and high-level visual understanding. Specifically, PSG-4D abstracts rich 4D sensory data into nodes, which represent entities with precise location and status information, and edges, which capture the temporal relations. To facilitate research in this new area, we build a richly annotated PSG-4D dataset consisting of 3K RGB-D videos with a total of 1M frames, each of which is labeled with 4D panoptic segmentation masks as well as fine-grained, dynamic scene graphs. To solve PSG-4D, we propose PSG4DFormer, a Transformer-based model that can predict panoptic segmentation masks, track masks along the time axis, and generate the corresponding scene graphs via a relation component. Extensive experiments on the new dataset show that our method can serve as a strong baseline for future research on PSG-4D. In the end, we provide a real-world application example to demonstrate how we can achieve dynamic scene understanding by integrating a large language model into our PSG-4D system.
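As a rough illustration of the representation described above, a dynamic scene graph can be held as entity nodes plus relation edges that are valid over a frame interval. The class and field names below (`EntityNode`, `RelationEdge`, `SceneGraph4D`, `mask_ids`, and so on) are hypothetical and do not reflect the PSG-4D dataset schema or the PSG4DFormer outputs.

```python
from dataclasses import dataclass, field

@dataclass
class EntityNode:
    node_id: int
    category: str
    # Frame index -> mask id; a hypothetical link to per-frame panoptic masks.
    mask_ids: dict[int, int] = field(default_factory=dict)

@dataclass
class RelationEdge:
    subject_id: int
    object_id: int
    predicate: str
    start_frame: int
    end_frame: int  # inclusive

@dataclass
class SceneGraph4D:
    nodes: dict[int, EntityNode] = field(default_factory=dict)
    edges: list[RelationEdge] = field(default_factory=list)

    def add_node(self, node: EntityNode) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, edge: RelationEdge) -> None:
        # Both endpoints must already exist as nodes.
        if edge.subject_id not in self.nodes or edge.object_id not in self.nodes:
            raise KeyError("edge refers to an unknown node")
        self.edges.append(edge)

    def relations_at(self, frame: int) -> list[tuple[str, str, str]]:
        """Return (subject, predicate, object) triples active at a frame."""
        return [
            (self.nodes[e.subject_id].category, e.predicate,
             self.nodes[e.object_id].category)
            for e in self.edges
            if e.start_frame <= frame <= e.end_frame
        ]

graph = SceneGraph4D()
graph.add_node(EntityNode(0, "person", {0: 11, 1: 11}))
graph.add_node(EntityNode(1, "cup", {0: 12, 1: 12}))
graph.add_edge(RelationEdge(0, 1, "holding", start_frame=0, end_frame=10))
print(graph.relations_at(5))
```

Storing a frame interval on each edge is one simple way to express that a relation holds only for part of a video; querying a frame outside the interval returns no triples.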

large language model, machine learning, natural language, (18 more...)

Country: