AITopics

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Coupled Training with Privileged Information and Unlabeled Data

Shi, Jiahao, Hagrass, Omar, Klusowski, Jason M.

arXiv.org Machine Learning, May-25-2026

In many prediction problems, we have extra information during training (for example, measurements that are expensive or slow to collect) that will not be available when the model is deployed. A common strategy is to first train a model that uses all training information, then use its predictions on unlabeled examples to train a second model that only uses the inputs available at test time. However, when the extra training-only information is weak or noisy, this Two-Stage approach can mislead the deployment model and even hurt accuracy. We propose a joint training method that learns the two models together, so the deployment model can benefit from the extra information only when it actually helps, instead of inheriting its mistakes. We provide guarantees that describe when joint training improves prediction accuracy and analyze a simple alternating training algorithm for large, high-dimensional models. Experiments on synthetic data and real-world prediction tasks show that our approach avoids these failures and robustly outperforms standard Two-Stage baselines.
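For orientation, the Two-Stage baseline described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' method or code: it assumes linear least-squares models, assumes the privileged features `z` are also available for the unlabeled examples, and trains the student on pseudo-labels only. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_linear(features, targets):
    """Least-squares fit; the intercept is the last coefficient."""
    design = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coef

def predict_linear(coef, features):
    """Apply coefficients returned by fit_linear to new rows."""
    design = np.column_stack([features, np.ones(len(features))])
    return design @ coef

def two_stage(x_lab, z_lab, y_lab, x_unl, z_unl):
    """Two-Stage baseline: the teacher sees (x, z), the student sees x only."""
    # Stage 1: teacher trained on labeled data with privileged features z.
    teacher = fit_linear(np.column_stack([x_lab, z_lab]), y_lab)
    # The teacher pseudo-labels the unlabeled pool.
    pseudo = predict_linear(teacher, np.column_stack([x_unl, z_unl]))
    # Stage 2: student fitted to the pseudo-labels using deployment inputs only.
    return fit_linear(x_unl, pseudo)

# Synthetic demo data, made up purely for illustration.
rng = np.random.default_rng(0)
x_lab, z_lab = rng.normal(size=(50, 3)), rng.normal(size=(50, 2))
x_unl, z_unl = rng.normal(size=(200, 3)), rng.normal(size=(200, 2))
y_lab = x_lab @ np.array([1.0, -2.0, 0.5]) + z_lab @ np.array([0.3, 0.0])
student = two_stage(x_lab, z_lab, y_lab, x_unl, z_unl)
```

Because the student is fitted only to the teacher's outputs, any error the teacher makes is passed on unchanged, which is the failure mode the abstract attributes to weak or noisy privileged information.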

artificial intelligence, machine learning, privileged information, (13 more...)

arXiv.org Machine Learning

2605.23268

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Axiomatizing Neural Networks via Pursuit of Subspaces

Yamac, Mehmet, Duman, Mert, Akpinar, Ugur, Casadiego, Felix Rojas, Kiranyaz, Serkan, van Gerven, Marcel, Gabbouj, Moncef

arXiv.org Machine Learning, May-21-2026

While deep neural networks have achieved remarkable success across a wide range of domains, their underlying mechanisms remain poorly understood, and they are often regarded as black boxes. This gap between empirical performance and theoretical understanding poses a challenge analogous to the pre-axiomatic stage of classical geometry. In this work, we introduce the Pursuit of Subspaces (PoS) hypothesis, an axiomatic framework that formulates neural network behavior through a set of geometric postulates. These axioms, together with their derived consequences, provide a unified perspective on representation, computation, and generalization in both shallow and deep architectures. We show that this framework yields geometric explanations for fundamental questions in deep learning, including representation structure, architectural mechanisms, and generalization behavior, offering a principled step toward a coherent theoretical foundation.

artificial intelligence, machine learning, representation, (19 more...)

arXiv.org Machine Learning

2605.20534

Country:

Europe (0.45)
North America > United States (0.28)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Information Technology (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

On the Limits of Latent Reuse in Diffusion Models

Yu, Yifeng, Yu, Lu

arXiv.org Machine Learning, May-14-2026

Diffusion models are often trained in low-dimensional latent spaces, which are then reused for related but shifted datasets. In this work, we study when such latent reuse remains reliable under distribution shift. We consider a source-target setting in which both datasets are approximately low-dimensional but may lie near different subspaces. We show that freezing and reusing a source latent space induces a target-domain score error governed by two quantities: the principal-angle misalignment between the source and target subspaces, and the target ambient noise amplified by the diffusion time scale. Motivated by these limits, we further study mixed source-target training and characterize how the required shared latent dimension depends on the relative geometry of the two distributions. Our results provide theoretical guidance on when latent reuse is reliable and when learning a shared representation may be necessary.
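The principal-angle misalignment mentioned in this abstract is a standard linear-algebra quantity. A minimal sketch of how it can be computed with NumPy is given below; it is not taken from the paper, and the function name and the example subspaces are illustrative assumptions.

```python
import numpy as np

def principal_angles(basis_a, basis_b):
    """Principal angles (radians, ascending) between two column spans.

    Both inputs are assumed to have full column rank.
    """
    qa, _ = np.linalg.qr(basis_a)  # orthonormal basis of the first subspace
    qb, _ = np.linalg.qr(basis_b)  # orthonormal basis of the second subspace
    # Singular values of qa^T qb are the cosines of the principal angles.
    cosines = np.linalg.svd(qa.T @ qb, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

# Hypothetical example in R^3: the "target" plane is the "source" plane
# tilted by 0.3 radians about the first coordinate axis.
tilt = 0.3
source = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
target = np.array([[1.0, 0.0], [0.0, np.cos(tilt)], [0.0, np.sin(tilt)]])
angles = principal_angles(source, target)
```

In this example the two planes share one direction, so one angle is zero and the other equals the tilt.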

artificial intelligence, fcomp, machine learning, (15 more...)

arXiv.org Machine Learning

2605.13448

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Locally Near Optimal Piecewise Linear Regression in High Dimensions via Difference of Max-Affine Functions

Kanj, Haitham, Lee, Kiryung

arXiv.org Machine Learning, May-11-2026

This paper presents a parametric solution to piecewise linear regression through the Adaptive Block Gradient Descent (ABGD) algorithm. The heart of the method is the parametrization of piecewise linear functions as the difference of max-affine (DoMA) functions. A non-asymptotic local convergence analysis for ABGD is provided under sub-Gaussian covariate and noise distributions. To initialize ABGD, we adapt a prior algorithm originally developed for the simpler setting of max-affine functions. When suitably initialized, ABGD converges linearly to an $ε$-accurate estimate given $\tilde{\mathcal{O}}(d\max(σ_z/ε,1)^2)$ observations where $σ_z^2$ denotes the noise variance. This implies exact recovery given $\tilde{\mathcal{O}}(d)$ samples in the noiseless case. Also, such a rate is shown to be minimax optimal up to logarithmic factors. Synthetic numerical results corroborate the theoretical guarantees for ABGD. We also observe competitive performance compared to the state-of-the-art methods on real-world datasets.
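To make the difference of max-affine (DoMA) parametrization concrete, the sketch below evaluates a function of the form $\max_j(a_j^\top x + b_j) - \max_k(c_k^\top x + d_k)$. It only illustrates the function class named in the abstract; it is not the ABGD algorithm, and the function names and the example pieces are assumptions.

```python
def max_affine(x, pieces):
    """Return max_j (a_j . x + b_j) for pieces given as (a_j, b_j) pairs."""
    return max(sum(ai * xi for ai, xi in zip(a, x)) + b for a, b in pieces)

def doma(x, first_pieces, second_pieces):
    """Difference of two max-affine functions evaluated at the point x."""
    return max_affine(x, first_pieces) - max_affine(x, second_pieces)

# Hypothetical 1-D example: min(x, 1) = x - max(x - 1, 0).
first = [([1.0], 0.0)]                   # the single affine piece x
second = [([1.0], -1.0), ([0.0], 0.0)]   # max(x - 1, 0)

below_kink = doma([0.5], first, second)  # 0.5, on the slope-1 piece
above_kink = doma([3.0], first, second)  # 1.0, on the flat piece
```

The example shows that a concave piecewise linear function, which no single max-affine function can represent, is obtained by subtracting a second max-affine term.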

artificial intelligence, machine learning, regression, (18 more...)

arXiv.org Machine Learning

2605.06959

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Fair Graph Distillation

Neural Information Processing Systems, Apr-30-2026, 10:51:57 GMT

As graph neural networks (GNNs) struggle with large-scale graphs due to high computational demands, graph data distillation promises to alleviate this issue by distilling a large real graph into a smaller distilled graph while maintaining comparable prediction performance for GNNs trained on both graphs. However, we observe that GNNs trained on distilled graphs may exhibit more severe group fairness issues than GNNs trained on real graphs, under both vanilla and fair GNN training. Motivated by these observations, we propose fair graph distillation (FGD), an advanced graph distillation approach to generate fair distilled graphs. The challenge lies in the deficiency of sensitive attributes for nodes in the distilled graph, making most debiasing methods (e.g., regularization and adversarial debiasing) intractable for distilled graphs. We develop a simple yet effective bias metric, named coherence, for distilled graphs. Based on the proposed coherence metric, we introduce a framework for fair graph distillation using a bi-level optimization algorithm. Extensive experiments demonstrate that the proposed algorithm can achieve better prediction performance-fairness trade-offs across various datasets and GNN architectures.

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

One for All: Simultaneous Metric and Preference Learning over Multiple Users

This paper investigates simultaneous preference and metric learning from a crowd of respondents. A set of items represented by d-dimensional feature vectors and paired comparisons of the form "item i is preferable to item j" made by each user is given. Our model jointly learns a distance metric that characterizes the crowd's general measure of item similarities along with a latent ideal point for each user reflecting their individual preferences. This model has the flexibility to capture individual preferences, while enjoying a metric learning sample cost that is amortized over the crowd. We first study this problem in a noiseless, continuous response setting (i.e., responses equal to differences of item distances) to understand the fundamental limits of learning. Next, we establish prediction error guarantees for noisy, binary measurements such as may be collected from human respondents, and show how the sample complexity improves when the underlying metric is low-rank. Finally, we establish recovery guarantees under assumptions on the response distribution. We demonstrate the performance of our model on both simulated data and on a dataset of color preference judgments across a large number of users.
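The noiseless response described in this abstract (a difference of item distances to a user's ideal point under a shared metric) can be illustrated with a short sketch. The use of a squared Mahalanobis distance, the sign convention, and all names and numbers below are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def squared_distance(metric, item, ideal_point):
    """Squared distance (item - u)^T M (item - u) under a PSD metric M."""
    diff = item - ideal_point
    return float(diff @ metric @ diff)

def response(metric, ideal_point, item_i, item_j):
    """Noiseless continuous response; positive means item i is preferred.

    Assumed convention: the user prefers whichever item lies closer to
    their ideal point under the shared metric.
    """
    return (squared_distance(metric, item_j, ideal_point)
            - squared_distance(metric, item_i, ideal_point))

# Hypothetical 2-D example: the crowd metric weights the second feature
# four times as heavily as the first.
metric = np.diag([1.0, 4.0])
ideal_point = np.array([0.0, 0.0])
item_i = np.array([1.0, 0.0])
item_j = np.array([0.0, 1.0])

r_metric = response(metric, ideal_point, item_i, item_j)        # 4 - 1 = 3
r_euclid = response(np.eye(2), ideal_point, item_i, item_j)     # 1 - 1 = 0
```

The two items are equally far from the ideal point in Euclidean terms, so the preference for item i arises entirely from the metric, which is shared across users while the ideal point is individual.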

artificial intelligence, machine learning, survey article, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin (0.28)

Genre:

Overview (0.85)
Research Report > New Finding (0.45)

Industry: Education (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Invariance . the Initialized

Neural Information Processing Systems, Apr-24-2026, 22:46:25 GMT

In this paper, we analyze neural networks trained on high-dimensional data that lies on a low-dimensional linear subspace denoted by $P$. We assume that the dimension of $P$ is $d - \ell$. Throughout the paper it will be more convenient to analyze data which lies on the subspace $M = \mathrm{span}(\{e_1, \dots, e_{d-\ell}\})$, because then the "off manifold" directions correspond exactly to certain coordinates of the input. In this section we show that we can essentially analyze the data as if it is rotated to lie on $M$, and it would imply the same consequences as the original data from $P$. Theorem A.1. Let $P \subseteq \mathbb{R}^d$ be a subspace of dimension $d - \ell$, and let $M = \mathrm{span}\{e_1, \dots, e_{d-\ell}\}$. Let $R$ be an orthogonal matrix such that $R \cdot P = M$, let $X \subseteq P$ be a training dataset and let $X_R = \{R \cdot x : x \in X\}$.
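The orthogonal matrix R in Theorem A.1, which maps the data subspace P onto a span of leading coordinate vectors, can be constructed numerically. The sketch below is one possible construction via a complete QR decomposition, not the paper's; it writes k for the dimension of P, and the function name and random basis are illustrative assumptions.

```python
import numpy as np

def rotation_onto_coordinates(basis):
    """Orthogonal R with R @ span(basis) = span(e_1, ..., e_k).

    `basis` is a d x k matrix with full column rank whose columns span P.
    """
    # Complete QR gives a d x d orthogonal Q whose first k columns span P.
    q, _ = np.linalg.qr(basis, mode="complete")
    # Q^T sends those k columns to e_1..e_k, hence P onto the first k coordinates.
    return q.T

# Hypothetical example: a random 3-dimensional subspace of R^5.
rng = np.random.default_rng(1)
basis = rng.normal(size=(5, 3))
R = rotation_onto_coordinates(basis)
rotated = R @ basis  # rows 3 and 4 (the off-subspace coordinates) vanish
```

After the rotation, every point of P has zeros in the trailing coordinates, which is the sense in which the off-manifold directions correspond to specific input coordinates. The matrix returned here is orthogonal but may be a reflection rather than a proper rotation.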

artificial intelligence, experiment, machine learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
