

On the Expressivity and Sample Complexity of Node-Individualized Graph Neural Networks

Neural Information Processing Systems

Graph neural networks (GNNs) employing message passing for graph classification are inherently limited by the expressive power of the Weisfeiler-Leman (WL) test for graph isomorphism. Node individualization schemes, which assign unique identifiers to nodes (e.g., by adding random noise to features), are a common approach for achieving universal expressiveness. However, the ability of GNNs endowed with individualization schemes to generalize beyond the training data is still an open question. To address this question, this paper presents a theoretical analysis of the sample complexity of such GNNs from a statistical learning perspective, employing Vapnik-Chervonenkis (VC) dimension and covering number bounds. We demonstrate that node individualization schemes that are permutation-equivariant result in lower sample complexity, and design novel individualization schemes that exploit these results. As an application of this analysis, we also develop a novel architecture that can perform substructure identification (i.e., subgraph isomorphism) while having a lower VC dimension compared to competing methods. Finally, our theoretical findings are validated experimentally on both synthetic and real-world datasets.
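For concreteness, below is a minimal sketch of the simplest node-individualization scheme alluded to above: appending random identifiers to node features before message passing. It is written in plain PyTorch; the function name `individualize` and the identifier dimension are choices of this sketch, not the paper's own schemes.

```python
# Minimal sketch: random-feature node individualization for a message-passing GNN.
# This illustrates the generic idea from the abstract, not the paper's novel schemes.
import torch

def individualize(x: torch.Tensor, id_dim: int = 8) -> torch.Tensor:
    """Append i.i.d. Gaussian identifiers to the node feature matrix x of shape
    [n_nodes, d]. Unique identifiers make nodes distinguishable, which lifts
    message-passing GNNs past the 1-WL expressiveness barrier, at the cost of
    breaking exact permutation equivariance of the resulting features."""
    ids = torch.randn(x.size(0), id_dim, device=x.device)
    return torch.cat([x, ids], dim=-1)

# Usage with a PyTorch Geometric-style graph:
# out = gnn(individualize(data.x), data.edge_index)
```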


Supplementary Materials for "Evaluation beyond Task Performance: Analyzing Concepts in AlphaZero playing Hex"

Neural Information Processing Systems

Appendix A reports implementation details, hyperparameters and compute requirements. Appendix B gives more details on each concept introduced in the main body of the paper. Appendix C demonstrates how AlphaZero often wastes moves. Appendix D has additional results across the different architectures. We use agents trained by Jones [5]. See Table 1 for hyperparameters and relative agent strengths.


Test-Time Adaptation Induces Stronger Accuracy and Agreement-on-the-Line

Neural Information Processing Systems

Recently, Miller et al. [32] and Baek et al. [3] empirically demonstrated strong linear correlations between in-distribution (ID) and out-of-distribution (OOD) accuracy, as well as between ID and OOD agreement. These trends, coined accuracy-on-the-line (ACL) and agreement-on-the-line (AGL), enable OOD model selection and performance estimation without labeled data. However, these phenomena also break down for certain shifts, such as CIFAR10-C Gaussian Noise, posing a critical bottleneck. In this paper, we make a key finding that recent test-time adaptation (TTA) methods not only improve OOD performance but also drastically strengthen the ACL and AGL trends in models, even on shifts where models showed very weak correlations before. To analyze this, we revisit the theoretical conditions from Miller et al. [32] that outline the types of distribution shifts needed for perfect ACL in linear models. Surprisingly, these conditions are satisfied after applying TTA to deep models in the penultimate feature embedding space. In particular, TTA collapses complex distribution shifts into ones that can be expressed by a single "scaling" variable in the feature space. Our results show that by combining TTA with AGL-based estimation methods, we can estimate the OOD performance of models with high precision for a broader set of distribution shifts. This yields a simple recipe for selecting the best hyperparameters and adaptation strategy without any OOD labeled data.
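As an illustration of how AGL can be used for label-free OOD estimation, here is a rough sketch in NumPy/SciPy. It follows the general probit-scaled linear-fit recipe from the ACL/AGL literature rather than the exact estimator of Baek et al. [3]; the function names and the choice of a least-squares fit are assumptions of this sketch.

```python
# Rough sketch of AGL-based OOD accuracy estimation: fit the agreement line
# (which needs no OOD labels) in probit space and, assuming agreement-on-the-line,
# reuse it to map ID accuracy to predicted OOD accuracy.
import numpy as np
from scipy.stats import norm

def probit(p):
    # probit scaling commonly used in the ACL/AGL literature
    return norm.ppf(np.clip(p, 1e-6, 1 - 1e-6))

def estimate_ood_accuracy(id_acc, id_agree, ood_agree):
    """id_acc: per-model ID accuracy [n_models];
    id_agree, ood_agree: pairwise model agreement rates [n_pairs].
    Returns predicted per-model OOD accuracy under the AGL assumption that the
    accuracy line and the agreement line share slope and intercept."""
    slope, intercept = np.polyfit(probit(np.asarray(id_agree)),
                                  probit(np.asarray(ood_agree)), deg=1)
    return norm.cdf(slope * probit(np.asarray(id_acc)) + intercept)
```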



NoiseGPT: Label Noise Detection and Rectification through Probability Curvature

Neural Information Processing Systems

Machine learning craves high-quality data, which is a major bottleneck in realistic deployment, since collecting and labeling data takes abundant resources and massive human labor. Unfortunately, label noise, where images are paired with incorrect labels, exists ubiquitously in all kinds of datasets, significantly degrading the learning performance of deep networks. Learning with Label Noise (LNL) has been a common strategy for mitigating the influence of noisy labels. However, existing LNL methods either require pre-training that exploits the memorization effect to separate clean data from noisy data, or rely on dataset assumptions that do not extend to diverse scenarios. Thanks to the development of Multimodal Large Language Models (MLLMs), which possess massive knowledge and hold In-Context Learning (ICL) ability, this paper proposes NoiseGPT to effectively leverage MLLMs as a knowledge expert for label noise detection and rectification. Specifically, we observe a probability curvature effect of MLLMs, where clean and noisy examples reside on curvatures with different smoothness, which in turn enables the detection of label noise.
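The abstract does not spell out how the curvature effect is turned into a detector, so the following is only an illustrative, generic sketch of a curvature-style noise score: compare the MLLM's probability of the given label on the original image with its average probability over perturbed copies. The callables `mllm_label_prob` and `perturb`, and the interpretation of the score, are placeholders, not NoiseGPT's actual procedure.

```python
# Illustrative sketch of a curvature-style label-noise score (not NoiseGPT itself).
import numpy as np

def curvature_score(image, label, mllm_label_prob, perturb, k: int = 8) -> float:
    """mllm_label_prob(image, label) -> probability the MLLM assigns to `label`;
    perturb(image) -> a slightly perturbed copy of the image.
    The gap between the original and perturbed probabilities probes the local
    curvature of the probability surface around the example; examples can then
    be flagged as noisy by thresholding this score."""
    p_orig = mllm_label_prob(image, label)
    p_pert = float(np.mean([mllm_label_prob(perturb(image), label) for _ in range(k)]))
    return p_orig - p_pert
```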


Learning from Offline Foundation Features with Tensor Augmentations

Neural Information Processing Systems

We introduce Learning from Offline Foundation Features with Tensor Augmentations (LOFF-TA), an efficient training scheme designed to harness the capabilities of foundation models in limited-resource settings where their direct development is not feasible. LOFF-TA involves training a compact classifier on cached feature embeddings from a frozen foundation model, resulting in up to 37× faster training and up to 26× lower GPU memory usage. Because the embeddings of augmented images would be too numerous to store, yet the augmentation process is essential for training, we propose to apply tensor augmentations to the cached embeddings of the original non-augmented images. LOFF-TA makes it possible to leverage the power of foundation models, regardless of their size, in settings with limited computational capacity. Moreover, LOFF-TA can be used to apply foundation models to high-resolution images without increasing compute. In certain scenarios, we find that training with LOFF-TA yields better results than directly fine-tuning the foundation model.
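A minimal sketch of the caching-plus-tensor-augmentation recipe described above is given below, in PyTorch. The specific augmentations (additive Gaussian noise and a mild random rescaling) and the two-layer classifier head are placeholders of this sketch, not necessarily the choices made in the paper.

```python
# Minimal LOFF-TA-style sketch: cache embeddings from a frozen foundation model once,
# augment the cached tensors directly, and train a compact classifier on them.
import torch
import torch.nn as nn

@torch.no_grad()
def cache_embeddings(frozen_model: nn.Module, loader, device: str = "cpu"):
    """Run the frozen foundation model a single time and store its embeddings offline."""
    frozen_model.eval().to(device)
    feats, labels = [], []
    for x, y in loader:
        feats.append(frozen_model(x.to(device)).cpu())
        labels.append(y)
    return torch.cat(feats), torch.cat(labels)

def tensor_augment(z: torch.Tensor, noise_std: float = 0.1, scale_range: float = 0.1):
    """Augment cached embeddings in place of image augmentations: additive Gaussian
    noise plus a mild random per-sample rescaling (placeholder augmentations)."""
    scale = 1.0 + scale_range * (2 * torch.rand(z.size(0), 1) - 1)
    return scale * z + noise_std * torch.randn_like(z)

def make_classifier(emb_dim: int, n_classes: int) -> nn.Module:
    """Compact head trained on the cached, augmented embeddings."""
    return nn.Sequential(nn.Linear(emb_dim, 512), nn.ReLU(), nn.Linear(512, n_classes))
```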


Can Large Language Models Explore In-Context?

Neural Information Processing Systems

We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making. We focus on the native performance of existing LLMs, without training interventions. We deploy LLMs as agents in simple multi-armed bandit environments, specifying the environment description and interaction history entirely in-context, i.e., within the LLM prompt.
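The setup described above can be made concrete with a short sketch: a Bernoulli multi-armed bandit in which the environment description and the full interaction history are rendered into the prompt at every round. Here `query_llm` is a hypothetical stand-in for an LLM call that returns an arm index; it is not an interface from the paper.

```python
# Sketch of an in-context bandit episode: the LLM sees the task description and
# the entire interaction history in its prompt and must pick the next arm.
import random

def run_bandit_episode(query_llm, arm_means, horizon: int = 100, seed: int = 0):
    rng = random.Random(seed)
    history = []  # (arm, reward) pairs, shown verbatim in the prompt
    for t in range(horizon):
        prompt = (
            f"You are choosing among {len(arm_means)} slot machines to maximize "
            f"total reward over {horizon} rounds.\n"
            + "\n".join(f"Round {i}: arm {a} -> reward {r}"
                        for i, (a, r) in enumerate(history))
            + f"\nRound {t}: which arm do you pull? Answer with a single number."
        )
        arm = query_llm(prompt) % len(arm_means)  # hypothetical LLM call returning an int
        reward = 1 if rng.random() < arm_means[arm] else 0
        history.append((arm, reward))
    return history
```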


Supplementary Material and Datasheet for the WorldStrat Dataset (J. Cornebise, I. Oršolić, F. Kalaitzis, 2022-06-16)

Neural Information Processing Systems

Does this timeframe match the creation timeframe of the data associated with the instances (e.g., recent crawl of old news articles)?

LCCS comprises 23 classes and 14 sub-classes.

The dataset, along with its machine-readable metadata, is hosted on the CERN-backed Zenodo data repository: https://zenodo.org/record/6810792. Its long-term maintenance is discussed in the Datasheet. This includes reproducible code for the benchmarks of Section 4 of [Cornebise et al., 2022a], following the ML Reproducibility Checklist [Pineau et al., 2021a,b]. The project also has its own website, available at https://worldstrat.github.io/. The authors hereby state that they bear all responsibility in case of violation of rights, etc., and confirm that the data license is as follows: the low-resolution imagery, labels, metadata, and pretrained models are released under Creative Commons with Attribution 4.0 International (CC BY 4.0).

The mean of the cloud coverage over the Sentinel-2 product areas is 7.98%, with a standard deviation of 14.22. The quantiles are: 0.025: 0.00%; 0.25: 0.00%; 0.5: 0.66%; 0.75: 10.05%; 0.975: 49.95%. It is important to note that this cloud cover percentage, as mentioned in the article and datasheet, is calculated over the entire product area of the provider, which varies in size but is much larger than the 2.5 km area we target. This means that even an image with a large cloud cover percentage can be cloud-free and, in extreme (though unlikely) cases, vice versa. There are also considerable differences across sampled regions and land cover types; simple examples are rainforests and non-desert equatorial regions. Using a strict no-cloud policy would either make sampling enough low-resolution images impossible or make the temporal difference extremely large (up to 7 years for some AOIs). With that in mind, we strove to keep the cloud coverage as low as possible, ideally under 5%, while keeping the temporal difference as small as possible.
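For reference, the cloud-coverage statistics quoted above could be recomputed from the dataset metadata along the lines of the sketch below; the metadata file name and the `cloud_cover` column name are assumptions of this sketch, not taken from the release.

```python
# Sketch: recompute the per-product cloud-coverage mean, standard deviation and
# quantiles from the dataset metadata (hypothetical file and column names).
import pandas as pd

meta = pd.read_csv("metadata.csv")   # hypothetical path to the released metadata
cc = meta["cloud_cover"]             # per-product cloud cover, in percent
print(f"mean = {cc.mean():.2f}%, std = {cc.std():.2f}")
print(cc.quantile([0.025, 0.25, 0.5, 0.75, 0.975]))
```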