AITopics | Machine Learning

Collaborating Authors

Machine Learning

"The field of Machine Learning seeks to answer these questions: How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?"
– from The Discipline of Machine Learning by Tom Mitchell. CMU-ML-06-108, 2006.

News Overviews Instructional Materials AI-Alerts Classics

A Foundation Model for Zero-shot Logical Query Reasoning

Neural Information Processing SystemsMay-25-2025, 02:23:25 GMT

Complex logical query answering (CLQA) in knowledge graphs (KGs) goes beyond simple KG completion and aims at answering compositional queries comprised of multiple projections and logical operations. Existing CLQA methods that learn parameters bound to certain entity or relation vocabularies can only be applied to the graph they are trained on which requires substantial training time before being deployed on a new graph.

large language model, machine learning, question answering, (19 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec (0.14)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.52)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.50)

Add feedback

Supplementary Material for " AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery "

Neural Information Processing SystemsMay-25-2025, 02:13:27 GMT

In Sec. 2 we include a The data is publicly available at https://allclear.cs.cornell.edu. We include a datasheet for our dataset following the methodology from "Datasheets for Datasets" Gebru In this section, we include the prompts from Gebru et al. [2021] in blue, and in For what purpose was the dataset created? Was there a specific task in mind? The dataset was created to facilitate research development on cloud removal in satellite imagery. Specifically, our task is more temporally aligned than previous benchmarks.

artificial intelligence, dataset, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Industry:

Law (1.00)
Government (0.68)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

G3: An Effective and Adaptive Framework for Worldwide Geolocalization Using Large Multi-Modality Models

Neural Information Processing SystemsMay-25-2025, 02:07:47 GMT

Worldwide geolocalization aims to locate the precise location at the coordinate level of photos taken anywhere on the Earth. It is very challenging due to 1) the difficulty of capturing subtle location-aware visual semantics, and 2) the heterogeneous geographical distribution of image data. As a result, existing studies have clear limitations when scaled to a worldwide context. They may easily confuse distant images with similar visual contents, or cannot adapt to various locations worldwide with different amounts of relevant data. To resolve these limitations, we propose G3, a novel framework based on Retrieval-Augmented Generation (RAG).

large language model, machine learning, natural language, (22 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States > Pennsylvania > Philadelphia County (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Graph Convolutions Enrich the Self-Attention in Transformers!

Neural Information Processing SystemsMay-25-2025, 01:59:21 GMT

Transformers, renowned for their self-attention mechanism, have achieved state-ofthe-art performance across various tasks in natural language processing, computer vision, time-series modeling, etc. However, one of the challenges with deep Transformer models is the oversmoothing problem, where representations across layers converge to indistinguishable values, leading to significant performance degradation. We interpret the original self-attention as a simple graph filter and redesign it from a graph signal processing (GSP) perspective.

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry:

Information Technology (0.67)
Government (0.46)
Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

Add feedback

LF-Net: Learning Local Features from Images

Yuki Ono, Eduard Trulls, Pascal Fua, Kwang Moo Yi

Neural Information Processing SystemsMay-25-2025, 01:24:06 GMT

We present a novel deep architecture and a training strategy to learn a local feature pipeline from scratch, using collections of images without the need for human supervision. To do so we exploit depth and relative camera pose cues to create a virtual target that the network should achieve on one image, provided the outputs of the network for the other image. While this process is inherently non-differentiable, we show that we can optimize the network in a two-branch setup by confining it to one branch, while preserving differentiability in the other. We train our method on both indoor and outdoor datasets, with depth data from 3D sensors for the former, and depth estimates from an off-the-shelf Structure-from-Motion solution for the latter. Our models outperform the state of the art on sparse feature matching on both datasets, while running at 60+ fps for QVGA images.

artificial intelligence, keypoint, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Improving Adaptivity via Over-Parameterization in Sequence Models

Neural Information Processing SystemsMay-25-2025, 01:18:22 GMT

It is well known that eigenfunctions of a kernel play a crucial role in kernel regression. Through several examples, we demonstrate that even with the same set of eigenfunctions, the order of these functions significantly impacts regression outcomes. Simplifying the model by diagonalizing the kernel, we introduce an over-parameterized gradient descent in the realm of sequence model to capture the effects of various orders of a fixed set of eigen-functions. This method is designed to explore the impact of varying eigenfunction orders. Our theoretical results show that the over-parameterization gradient flow can adapt to the underlying structure of the signal and significantly outperform the vanilla gradient flow method. Moreover, we also demonstrate that deeper over-parameterization can further enhance the generalization capability of the model. These results not only provide a new perspective on the benefits of over-parameterization and but also offer insights into the adaptivity and generalization potential of neural networks beyond the kernel regime.

artificial intelligence, gradient descent, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > New York > New York County > New York City (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.50)

Add feedback

AdvAD: Exploring Non-Parametric Diffusion for Imperceptible Adversarial Attacks

Neural Information Processing SystemsMay-25-2025, 01:07:45 GMT

Imperceptible adversarial attacks aim to fool DNNs by adding imperceptible perturbation to the input data. Previous methods typically improve the imperceptibility of attacks by integrating common attack paradigms with specifically designed perception-based losses or the capabilities of generative models. In this paper, we propose Adversarial Attacks in Diffusion (AdvAD), a novel modeling framework distinct from existing attack paradigms.

advad, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
(3 more...)

Add feedback

Optimization over Continuous and Multi-dimensional Decisions with Observational Data

Dimitris Bertsimas, Christopher McCord

Neural Information Processing SystemsMay-25-2025, 01:03:35 GMT

We consider the optimization of an uncertain objective over continuous and multidimensional decision spaces in problems in which we are only provided with observational data. We propose a novel algorithmic framework that is tractable, asymptotically consistent, and superior to comparable methods on example problems. Our approach leverages predictive machine learning methods and incorporates information on the uncertainty of the predicted outcomes for the purpose of prescribing decisions. We demonstrate the efficacy of our method on examples involving both synthetic and real data sets.

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.69)

Industry:

Health & Medicine > Therapeutic Area (0.30)
Health & Medicine > Pharmaceuticals & Biotechnology (0.30)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.48)

Add feedback

Spike-Train Level Backpropagation for Training Deep Recurrent Spiking Neural Networks

Wenrui Zhang, Peng Li

Neural Information Processing SystemsMay-24-2025, 23:47:18 GMT

As an important class of SNNs, recurrent spiking neural networks (RSNNs) possess great computational power. However, the practical application of RSNNs is severely limited by challenges in training. Biologically-inspired unsupervised learning has limited capability in boosting the performance of RSNNs. On the other hand, existing backpropagation (BP) methods suffer from high complexity of unfolding in time, vanishing and exploding gradients, and approximate differentiation of discontinuous spiking activities when applied to RSNNs. To enable supervised training of RSNNs under a well-defined loss function, we present a novel Spike-Train level RSNNs Backpropagation (ST-RSBP) algorithm for training deep RSNNs. The proposed ST-RSBP directly computes the gradient of a rate-coded loss function defined at the output layer of the network w.r.t tunable parameters. The scalability of ST-RSBP is achieved by the proposed spike-train level computation during which temporal effects of the SNN is captured in both the forward and backward pass of BP. Our ST-RSBP algorithm can be broadly applied to RSNNs with a single recurrent layer or deep RSNNs with multiple feedforward and recurrent layers. Based upon challenging speech and image datasets including TI46 [25], N-TIDIGITS [3], Fashion-MNIST [40] and MNIST, ST-RSBP is able to train SNNs with an accuracy surpassing that of the current state-of-the-art SNN BP algorithms and conventional non-spiking deep learning models.

artificial intelligence, machine learning, st-rsbp, (16 more...)

Neural Information Processing Systems

Country: North America > United States > California > Santa Barbara County > Santa Barbara (0.14)

Genre: Research Report (0.46)

Industry: Energy > Oil & Gas (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition Zi-Hao Zhou 1,2 Siyuan Fang 1,2 Tong Wei 1,2

Neural Information Processing SystemsMay-24-2025, 23:24:00 GMT

Long-tailed semi-supervised learning poses a significant challenge in training models with limited labeled data exhibiting a long-tailed label distribution. Current state-of-the-art LTSSL approaches heavily rely on high-quality pseudo-labels for large-scale unlabeled data. However, these methods often neglect the impact of representations learned by the neural network and struggle with real-world unlabeled data, which typically follows a different distribution than labeled data. This paper introduces a novel probabilistic framework that unifies various recent proposals in long-tail learning. Our framework derives the class-balanced contrastive loss through Gaussian kernel density estimation. We introduce a continuous contrastive learning method, CCL, extending our framework to unlabeled data using reliable and smoothed pseudo-labels. By progressively estimating the underlying label distribution and optimizing its alignment with model predictions, we tackle the diverse distribution of unlabeled data in real-world scenarios. Extensive experiments across multiple datasets with varying unlabeled data distributions demonstrate that CCL consistently outperforms prior state-of-the-art methods, achieving over 4% improvement on the ImageNet-127 dataset.

artificial intelligence, deep learning, machine learning, (14 more...)

Neural Information Processing Systems

Country: