AITopics | data structure

Real-world datasets are inherently heterogeneous, yet how per-class structural differences and sampling imbalance shape the training dynamics of diffusion models, and potentially exacerbate disparities, remains poorly understood. While models typically transition from an initial phase of generalization to memorizing the training set, existing theory assumes homogeneous data, leaving open how class imbalance and heterogeneity reshape these dynamics. In this work, we develop a high-dimensional analytical framework to study class-dependent learning in score-based diffusion models. Analyzing a random-features model trained on Gaussian mixtures, we derive the feature-covariance spectrum to characterize per-class generalization and memorization times. We reveal the explicit hierarchy governing these dynamics: class variance is the primary determinant of learning order, consistently favoring higher-variance classes, while centroid geometry plays a secondary role. Sampling imbalance acts as a modulator that can reverse this ordering and, under strong imbalance, forces minority classes to acquire distinct, delayed speciation times during backward diffusion. Together, these results suggest that diffusion models can memorize some classes while others remain insufficiently learned. We validate our theoretical predictions empirically using U-Net models trained on Fashion MNIST.

artificial intelligence, data structure, machine learning, (14 more...)

arXiv.org Machine Learning

2605.06367

Country: Europe > Italy (0.28)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Worst-case Performance of Popular Approximate Nearest Neighbor Search Implementations: Guarantees and Limitations

Neural Information Processing Systems, Apr-29-2026, 20:34:58 GMT

Graph-based approaches to nearest neighbor search are popular and powerful tools for handling large datasets in practice, but they have limited theoretical guarantees. We study the worst-case performance of recent graph-based approximate nearest neighbor search algorithms, such as HNSW, NSG and DiskANN. For DiskANN, we show that its "slow preprocessing" version provably supports approximate nearest neighbor search query with constant approximation ratio and poly-logarithmic query time, on data sets with bounded "intrinsic" dimension. For the other data structure variants studied, including DiskANN with "fast preprocessing", HNSW and NSG, we present a family of instances on which the empirical query time required to achieve a "reasonable" accuracy is linear in instance size. For example, for DiskANN, we show that the query procedure can take at least 0.1n steps on instances of size n before it encounters any of the 5 nearest neighbors of the query.
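To make the notion of query "steps" concrete, here is a minimal sketch of greedy routing on a proximity graph. It is an illustration only: the function name `greedy_search` and the toy path graph are assumptions, and the routine is a plain greedy walk, not the search procedure of HNSW, NSG or DiskANN, nor the hard instances constructed in the paper.

```python
import math

def greedy_search(points, neighbors, start, query):
    """Walk the graph from `start`, always moving to the neighbor closest
    to `query`; stop when no neighbor improves on the current node.
    Returns (final node index, number of hops taken)."""
    current, hops = start, 0
    while True:
        best = min(neighbors[current],
                   key=lambda v: math.dist(points[v], query),
                   default=current)
        if math.dist(points[best], query) >= math.dist(points[current], query):
            return current, hops
        current, hops = best, hops + 1

# Toy instance (hypothetical): 10 points on a line, each linked only to
# its immediate neighbors, so the walk has to visit every node in between.
points = [(float(i),) for i in range(10)]
neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
print(greedy_search(points, neighbors, start=0, query=(9.2,)))  # (9, 9)
```

On this path graph the hop count grows linearly with the number of points, which is the kind of behavior a worst-case analysis has to rule out or exhibit.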

artificial intelligence, information retrieval, natural language, (18 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

47a5feca4ce02883a5643e295c7ce6cd-Paper.pdf

Neural Information Processing Systems, Apr-25-2026, 17:12:27 GMT

artificial intelligence, arxiv preprint arxiv, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.97)

Faster Query Times for Fully Dynamic k-Center Clustering with Outliers

Neural Information Processing Systems, Apr-25-2026, 15:32:12 GMT

Given a point set $P \subseteq M$ from a metric space $(M, d)$ and numbers $k, z \in \mathbb{N}$, the metric $k$-center problem with $z$ outliers is to find a set $C \subseteq P$ of $k$ points such that the maximum distance of all but at most $z$ outlier points of $P$ to their nearest center in $C$ is minimized. We consider this problem in the fully dynamic model, i.e., under insertions and deletions of points, for the case that the metric space has a bounded doubling dimension $\dim$. We utilize a hierarchical data structure to maintain the points and their neighborhoods, which enables us to efficiently find the clusters. In particular, our data structure can be queried at any time to generate a $(3+\varepsilon)$-approximate solution for input values of $k$ and $z$ in worst-case query time $\varepsilon^{-O(\dim)} k \log n \log\log \Delta$, where $\Delta$ is the ratio between the maximum and minimum distance between two points in $P$. Moreover, it allows insertion/deletion of a point in worst-case update time $\varepsilon^{-O(\dim)} \log n \log \Delta$. Our result achieves a significantly faster query time with respect to $k$ and $z$ than the current state-of-the-art by Pellizzoni, Pietracaprina, and Pucci [18], which uses $\varepsilon^{-O(\dim)} (k+z)^2 \log \Delta$ query time to obtain a $(3+\varepsilon)$-approximate solution.
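The objective defined at the start of this abstract can be evaluated directly for a given set of centers. The sketch below only scores a candidate solution in the Euclidean case; the function name `kcenter_outlier_cost` and the sample points are assumptions for illustration, and it is not the dynamic data structure described in the paper.

```python
import math

def kcenter_outlier_cost(points, centers, z, dist=math.dist):
    """Cost of `centers` for k-center with z outliers: the largest distance
    from a point to its nearest center after discarding the z points that
    are farthest from every center."""
    if not centers:
        raise ValueError("at least one center is required")
    nearest = sorted(min(dist(p, c) for c in centers) for p in points)
    kept = nearest[:max(len(nearest) - z, 0)]  # drop the z largest distances
    return kept[-1] if kept else 0.0

# Hypothetical example: two tight groups plus one far-away point.
points = [(0, 0), (1, 0), (10, 0), (11, 0), (100, 0)]
centers = [(0, 0), (10, 0)]
print(kcenter_outlier_cost(points, centers, z=0))  # 90.0
print(kcenter_outlier_cost(points, centers, z=1))  # 1.0
```

Allowing a single outlier drops the cost from 90 to 1 here, which shows why the outlier budget changes the problem so much.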

artificial intelligence, dim, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Europe (0.46)
Asia (0.28)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

2fc6b8a3fc23108f184daa4759024c25-Paper-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 08:29:20 GMT

artificial intelligence, data structure, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

2c27a260f16ad3098393cc529f391f4a-Supplemental.pdf

Neural Information Processing Systems, Apr-25-2026, 06:53:29 GMT

artificial intelligence, machine learning, step follow, (16 more...)

Neural Information Processing Systems

Genre: Workflow (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)

1ed4723f12853cbd02aecb8160f5e0c9-Paper-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 00:29:53 GMT

artificial intelligence, machine learning, matrix, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.93)

Community Detection on Evolving Graphs

Stefano Leonardi, Aris Anagnostopoulos, Jakub Łącki, Silvio Lattanzi, Mohammad Mahdian

Neural Information Processing Systems, Mar-23-2026, 15:22:20 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, data mining, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe (0.68)
North America > United States (0.47)
Asia (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.94)

An algorithm for L1 nearest neighbor search via monotonic embedding

Xinan Wang, Sanjoy Dasgupta

Neural Information Processing Systems, Mar-23-2026, 08:16:12 GMT

Fast algorithms for nearest neighbor (NN) search have in large part focused on $\ell_2$ distance. Here we develop an approach for $\ell_1$ distance that begins with an explicit and exactly distance-preserving embedding of the points into $\ell_2^2$. We show how this can efficiently be combined with random-projection based methods for $\ell_2$ NN search, such as locality-sensitive hashing (LSH) or random projection trees. We rigorously establish the correctness of the methodology and show by experimentation using LSH that it is competitive in practice with available alternatives.
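As a rough illustration of what an exactly distance-preserving embedding from L1 into squared L2 can look like, the sketch below uses the classical unary encoding, which works for vectors of bounded non-negative integers. This is a swapped-in textbook construction chosen for simplicity; it is an assumption that it resembles the paper's embedding in spirit, and the names `unary_embed`, `l1` and `squared_l2` are hypothetical.

```python
def unary_embed(point, max_value):
    """Encode each integer coordinate v in [0, max_value] as v ones
    followed by (max_value - v) zeros, and concatenate the codes."""
    bits = []
    for v in point:
        if not 0 <= v <= max_value:
            raise ValueError("coordinate out of range")
        bits.extend([1] * v + [0] * (max_value - v))
    return bits

def l1(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def squared_l2(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

x, y = (3, 0, 5), (1, 4, 5)
ex, ey = unary_embed(x, 5), unary_embed(y, 5)
print(l1(x, y), squared_l2(ex, ey))  # 6 6
```

Two unary codes differ in exactly |a - b| positions per coordinate, so the squared Euclidean distance between the embedded vectors equals the L1 distance between the originals.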

information retrieval, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Europe (0.28)
North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.42)

GL-NeRF: Gauss-Laguerre Quadrature Enables Training-Free NeRF Acceleration

Neural Information Processing Systems, Mar-22-2026, 16:36:47 GMT

Volume rendering in neural radiance fields is inherently time-consuming due to the large number of MLP calls on the points sampled per ray. Previous works would address this issue by introducing new neural networks or data structures. In this work, we propose GL-NeRF, a new perspective of computing volume rendering with the Gauss-Laguerre quadrature. GL-NeRF significantly reduces the number of MLP calls needed for volume rendering, introducing no additional data structures or neural networks. The simple formulation makes adopting GL-NeRF in any NeRF model possible. In the paper, we first justify the use of the Gauss-Laguerre quadrature and then demonstrate this plug-and-play attribute by implementing it in two different NeRF models. We show that with a minimal drop in performance, GL-NeRF can significantly reduce the number of MLP calls, showing the potential to speed up any NeRF model.
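For readers unfamiliar with the quadrature named here, the sketch below applies the standard two-point Gauss-Laguerre rule, which approximates integrals of the form ∫_0^∞ e^{-x} f(x) dx from a handful of evaluations of f. It shows only the generic rule; the function name `gauss_laguerre_2pt` is an assumption, and nothing here reproduces how GL-NeRF maps volume rendering onto this form.

```python
import math

# Nodes and weights of the 2-point Gauss-Laguerre rule:
# nodes are the roots of the Laguerre polynomial L_2(x) = (x^2 - 4x + 2) / 2.
SQRT2 = math.sqrt(2.0)
NODES = (2.0 - SQRT2, 2.0 + SQRT2)
WEIGHTS = ((2.0 + SQRT2) / 4.0, (2.0 - SQRT2) / 4.0)

def gauss_laguerre_2pt(f):
    """Approximate the integral of e^{-x} f(x) over [0, inf)
    using only two evaluations of f."""
    return sum(w * f(x) for x, w in zip(NODES, WEIGHTS))

# The rule is exact for polynomials up to degree 3;
# the true value of the integral of e^{-x} x^3 is 3! = 6.
print(round(gauss_laguerre_2pt(lambda x: x ** 3), 9))  # 6.0
```

The appeal for an expensive integrand is that accuracy comes from choosing the evaluation points well, not from evaluating the function many times.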

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.51)

The Interplay of Data Structure and Imbalance in the Learning Dynamics of Diffusion Models
