DFNN: A Deep Fréchet Neural Network Framework for Learning Metric-Space-Valued Responses

Kim, Kyum, Chen, Yaqing, Dubey, Paromita

arXiv.org Machine Learning

Regression with non-Euclidean responses -- e.g., probability distributions, networks, symmetric positive-definite matrices, and compositions -- has become increasingly important in modern applications. In this paper, we propose deep Fréchet neural networks (DFNNs), an end-to-end deep learning framework for predicting non-Euclidean responses -- treated as random objects in a metric space -- from Euclidean predictors. Our method harnesses the representation-learning power of deep neural networks (DNNs) for the task of approximating conditional Fréchet means of the response given the predictors, the metric-space analogue of conditional expectations, by minimizing a Fréchet risk. The framework is highly flexible, accommodating diverse metrics and high-dimensional predictors. We establish a universal approximation theorem for DFNNs, advancing the state of the art of neural network approximation theory to general metric-space-valued responses without making model assumptions or relying on local smoothing. Empirical studies on synthetic distributional and network-valued responses, as well as a real-world application to predicting employment occupational compositions, demonstrate that DFNNs consistently outperform existing methods.
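
As a concrete illustration of the Fréchet-risk idea, the sketch below trains a network on distributional responses under the 2-Wasserstein metric, where distributions are represented by quantile functions on a fixed grid and the squared metric reduces to a squared L2 distance between quantile functions. The architecture, grid size, and synthetic data are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: Fréchet-risk training for distributional responses
# under the 2-Wasserstein metric. Distributions are represented by
# quantile functions on a fixed grid, where squared W2 reduces to a
# squared L2 distance. All architecture and hyperparameter choices
# here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

GRID = 50   # number of quantile levels (assumed discretization)
P = 10      # Euclidean predictor dimension (assumed)

class DFNNSketch(nn.Module):
    def __init__(self, p=P, grid=GRID, hidden=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(p, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, grid),
        )

    def forward(self, x):
        # Softplus increments plus a cumulative sum enforce monotonicity,
        # so the output is a valid quantile function on the grid.
        raw = self.body(x)
        q0, increments = raw[:, :1], F.softplus(raw[:, 1:])
        return torch.cat([q0, q0 + increments.cumsum(dim=1)], dim=1)

def frechet_risk(q_pred, q_obs):
    # Empirical Fréchet risk: mean squared W2 distance on the grid.
    return ((q_pred - q_obs) ** 2).mean()

# toy data: predictors x and observed quantile functions q
x = torch.randn(128, P)
q = torch.sort(torch.randn(128, GRID), dim=1).values  # monotone targets

model = DFNNSketch()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = frechet_risk(model(x), q)
    loss.backward()
    opt.step()
```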


Local regression on path spaces with signature metrics

Bayer, Christian, Gogolashvili, Davit, Pelizzari, Luca

arXiv.org Machine Learning

We study nonparametric regression and classification for path-valued data. We introduce a functional Nadaraya-Watson estimator that combines the signature transform from rough path theory with local kernel regression. The signature transform provides a principled way to encode sequential data through iterated integrals, enabling direct comparison of paths in a natural metric space. Our approach leverages signature-induced distances within the classical kernel regression framework, achieving computational efficiency while avoiding the scalability bottlenecks of large-scale kernel matrix operations. We establish finite-sample convergence bounds demonstrating favorable statistical properties of signature-based distances compared to traditional metrics in infinite-dimensional settings. We propose robust signature variants that provide stability against outliers, enhancing practical performance. Applications to both synthetic and real-world data -- including stochastic differential equation learning and time series classification -- demonstrate competitive accuracy while offering significant computational advantages over existing methods.
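
A minimal sketch of the estimator's two ingredients, assuming a Gaussian kernel and depth-2 signature truncation (the paper's exact kernel choice, truncation level, and robust variants are not reproduced here): truncated signatures are computed exactly for piecewise-linear paths, and Nadaraya-Watson weights are formed from distances in signature space.

```python
# Minimal sketch of a signature-based Nadaraya-Watson estimator.
# Truncated (depth-2) signatures are computed exactly for piecewise-
# linear paths; the Gaussian kernel and bandwidth are illustrative
# choices, not the paper's exact specification.
import numpy as np

def signature_depth2(path):
    """path: (T, d) array of samples of a piecewise-linear path.
    Returns the depth-2 signature (level-1 and level-2 terms)."""
    inc = np.diff(path, axis=0)              # stepwise increments (T-1, d)
    level1 = inc.sum(axis=0)                 # total increment X_T - X_0
    run = np.cumsum(inc, axis=0) - inc       # increment accumulated before each step
    # Exact level-2 iterated integrals for a piecewise-linear path:
    # S^{ij} = sum_k R^i_k * dX^j_k + 0.5 * dX^i_k * dX^j_k
    level2 = run.T @ inc + 0.5 * inc.T @ inc
    return np.concatenate([level1, level2.ravel()])

def nw_predict(train_paths, train_y, query_path, bandwidth=1.0):
    """Nadaraya-Watson regression using distances in signature space."""
    s_query = signature_depth2(query_path)
    feats = np.stack([signature_depth2(p) for p in train_paths])
    d2 = ((feats - s_query) ** 2).sum(axis=1)  # squared signature distance
    w = np.exp(-d2 / (2 * bandwidth ** 2))     # Gaussian kernel weights
    return w @ np.asarray(train_y) / w.sum()

# toy usage: regress a scalar on 2-d random-walk paths
rng = np.random.default_rng(0)
paths = [np.cumsum(rng.normal(size=(100, 2)), axis=0) for _ in range(50)]
y = [p[-1, 0] ** 2 for p in paths]             # synthetic target
print(nw_predict(paths, y, paths[0]))
```

Because the signature features are computed once per path, prediction needs only distances in the feature space rather than repeated kernel evaluations on raw paths, which is where the claimed computational advantage comes from.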


Data-intrinsic approximation in metric spaces

Dölz, Jürgen, Multerer, Michael

arXiv.org Machine Learning

The analysis and processing of data are a vital part of modern society and require vast amounts of computational resources. To reduce the computational burden, compressing and approximating data has become a central topic. We consider the approximation of labeled data samples, mathematically described as site-to-value maps between finite metric spaces. Within this setting, we identify the discrete modulus of continuity as an effective data-intrinsic quantity for measuring the regularity of site-to-value maps without imposing further structural assumptions. We investigate the consistency of the discrete modulus of continuity in the infinite data limit and propose an algorithm for its efficient computation. Building on these results, we present a sample-based approximation theory for labeled data. For data subject to statistical uncertainty, we consider multilevel approximation spaces and a variant of the multilevel Monte Carlo method to compute statistical quantities of interest. Our considerations connect approximation theory for labeled data in metric spaces to the covering problem for (random) balls on the one hand, and the efficient evaluation of the discrete modulus of continuity to combinatorial optimization on the other. We provide extensive numerical studies to illustrate the feasibility of the approach and to validate our theoretical results.
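
For concreteness, the brute-force O(n²) form of the discrete modulus of continuity looks as follows; the sample data and thresholds are made up, and the paper's efficient algorithm is not reproduced here.

```python
# Minimal sketch: the discrete modulus of continuity of a labeled
# sample (a site-to-value map between finite metric spaces). This is
# the brute-force O(n^2) definition; the paper develops an efficient
# algorithm, which is not reproduced here.
import numpy as np

def discrete_modulus(dist_sites, dist_values, deltas):
    """dist_sites, dist_values: (n, n) pairwise distance matrices for
    the sites x_i and their labels f(x_i); deltas: thresholds to probe.
    Returns omega(delta) = max d(f(x_i), f(x_j)) over all pairs with
    d(x_i, x_j) <= delta."""
    omegas = []
    for delta in deltas:
        mask = dist_sites <= delta
        omegas.append(dist_values[mask].max() if mask.any() else 0.0)
    return np.array(omegas)

# toy usage: sites in R^2 with real-valued labels f(x) = sin(x_0) + x_1
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
f = np.sin(X[:, 0]) + X[:, 1]
DX = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
DF = np.abs(f[:, None] - f[None, :])
print(discrete_modulus(DX, DF, deltas=[0.05, 0.1, 0.2, 0.5]))
```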


Mean-square and linear convergence of a stochastic proximal point algorithm in metric spaces of nonpositive curvature

Pischke, Nicholas

arXiv.org Artificial Intelligence

We define a stochastic variant of the proximal point algorithm in the general setting of nonlinear (separable) Hadamard spaces for approximating zeros of the mean of a stochastically perturbed monotone vector field, and we prove its convergence under a suitable strong monotonicity assumption together with a probabilistic independence assumption and a separability assumption on the tangent spaces. As a particular case, our results transfer, for the first time, previous work by P. Bianchi on this method in Hilbert spaces to Hadamard manifolds. Moreover, our convergence proof is fully effective and allows for the construction of explicit rates of convergence of the iteration towards the (unique) solution, both in mean and almost surely. These rates are highly uniform, being independent of most of the data surrounding the iteration, the space, and the distribution; in that generality, they are novel already in the context of Hilbert spaces. Linear nonasymptotic guarantees under additional second-moment conditions on the Yosida approximates, as well as special cases of stochastic convex minimization, are discussed.
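
The Hilbert-space special case mentioned above admits a compact sketch: for a stochastically perturbed affine, strongly monotone vector field on R^d, each resolvent (proximal) step has a closed form. The step-size schedule and noise model below are illustrative assumptions, not the paper's general Hadamard-space construction.

```python
# Minimal sketch: the stochastic proximal point algorithm in the
# Euclidean/Hilbert special case. The mean vector field
# V(x) = E[A x - b + xi] is strongly monotone (A positive definite),
# and each resolvent step has a closed form. The step sizes
# lambda_k = 1/k and the affine noise model are illustrative
# assumptions.
import numpy as np

rng = np.random.default_rng(0)
d = 5
M = rng.normal(size=(d, d))
A = M @ M.T + np.eye(d)          # positive definite => strongly monotone
b = rng.normal(size=d)
x_star = np.linalg.solve(A, b)   # unique zero of the mean vector field

x = np.zeros(d)
for k in range(1, 5001):
    lam = 1.0 / k                          # assumed step-size schedule
    xi = rng.normal(scale=0.5, size=d)     # stochastic perturbation
    # Resolvent step: solve x_next + lam * (A x_next - b + xi) = x
    x = np.linalg.solve(np.eye(d) + lam * A, x + lam * (b - xi))

print(np.linalg.norm(x - x_star))  # distance to the unique solution
```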


Linear Regression in p-adic metric spaces

Baker, Gregory D., McCallum, Scott, Pattinson, Dirk

arXiv.org Artificial Intelligence

Many real-world machine learning problems involve inherently hierarchical data, yet traditional approaches rely on Euclidean metrics that fail to capture the discrete, branching nature of hierarchical relationships. We present a theoretical foundation for machine learning in p-adic metric spaces, which naturally respect hierarchical structure. Our main result proves that an n-dimensional plane minimizing the p-adic sum of distances to points in a dataset must pass through at least n + 1 of those points -- a striking contrast to Euclidean regression that highlights how p-adic metrics better align with the discrete nature of hierarchical data. As a corollary, a polynomial of degree n constructed to minimize the p-adic sum of residuals will pass through at least n + 1 points. As a further corollary, a polynomial of degree n approximating a higher-degree polynomial at a finite number of points will yield a difference polynomial with distinct rational roots. We demonstrate the practical significance of this result through two applications in natural language processing: analyzing hierarchical taxonomies and modeling grammatical morphology. These results suggest that p-adic metrics may be fundamental to properly handling hierarchical data structures in machine learning; in hierarchical data, interpolation between points often makes less sense than selecting actual observed points as representatives.
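
The n = 1 case of the main result suggests a simple brute-force illustration: since some line minimizing the p-adic sum of residuals passes through at least two data points, searching over lines through point pairs finds a global minimizer. The sketch below uses exact rational arithmetic so that p-adic valuations are well defined; the data are made up, and this is an illustration rather than the authors' algorithm.

```python
# Illustration of the n = 1 case of the paper's main result: a line
# minimizing the p-adic sum of residuals passes through at least two
# data points, so a brute-force search over lines through point pairs
# finds a global minimizer. Exact rational arithmetic keeps p-adic
# valuations well defined; the data below are made up for the demo.
from fractions import Fraction
from itertools import combinations

def vp(n, p):
    """p-adic valuation of a nonzero integer."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def p_abs(q, p):
    """p-adic absolute value |q|_p of a rational q."""
    if q == 0:
        return Fraction(0)
    v = vp(q.numerator, p) - vp(q.denominator, p)
    return Fraction(1, p ** v) if v >= 0 else Fraction(p ** (-v))

def best_line(points, p):
    """Search lines through pairs of points for the minimal
    p-adic sum of residuals."""
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue
        slope = Fraction(y2 - y1, x2 - x1)
        intercept = y1 - slope * x1
        cost = sum(p_abs(y - (slope * x + intercept), p) for x, y in points)
        if best is None or cost < best[0]:
            best = (cost, slope, intercept)
    return best

points = [(0, 1), (1, 3), (2, 9), (3, 11), (4, 17)]
cost, slope, intercept = best_line(points, p=2)
print(f"y = {slope} x + {intercept}, 2-adic residual sum = {cost}")
```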


End-to-End Deep Learning for Predicting Metric Space-Valued Outputs

Zhou, Yidong, Iao, Su I, Müller, Hans-Georg

arXiv.org Machine Learning

Many modern applications involve predicting structured, non-Euclidean outputs such as probability distributions, networks, and symmetric positive-definite matrices. These outputs are naturally modeled as elements of general metric spaces, where classical regression techniques that rely on vector space structure no longer apply. We introduce E2M (End-to-End Metric regression), a deep learning framework for predicting metric space-valued outputs. E2M performs prediction via a weighted Fréchet mean over the training outputs, where the weights are learned by a neural network conditioned on the input. This construction provides a principled mechanism for geometry-aware prediction that avoids surrogate embeddings and restrictive parametric assumptions, while fully preserving the intrinsic geometry of the output space. We establish theoretical guarantees, including a universal approximation theorem that characterizes the expressive capacity of the model and a convergence analysis of the entropy-regularized training objective. Through extensive simulations involving probability distributions, networks, and symmetric positive-definite matrices, we show that E2M consistently achieves state-of-the-art performance, with its advantages becoming more pronounced at larger sample sizes. Applications to human mortality distributions and New York City taxi networks further demonstrate the flexibility and practical utility of the framework.
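
In contrast to the DFNN sketch above, which outputs the response object directly, the following sketch implements the weighted-Fréchet-mean construction for one tractable case: one-dimensional distributions under the 2-Wasserstein metric, where the weighted Fréchet mean is the weighted average of quantile functions. The architecture, entropy weight, and data are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the weighted-Fréchet-mean idea: a network maps the
# input to softmax weights over the training responses, and the
# prediction is the corresponding weighted Fréchet mean. For 1-d
# distributions under the 2-Wasserstein metric this mean is the
# weighted average of quantile functions. Architecture, entropy
# weight, and data are illustrative assumptions.
import torch
import torch.nn as nn

N, P, GRID = 64, 10, 50   # training size, predictor dim, quantile grid

x_train = torch.randn(N, P)
q_train = torch.sort(torch.randn(N, GRID), dim=1).values  # quantile functions

net = nn.Sequential(nn.Linear(P, 64), nn.ReLU(), nn.Linear(64, N))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for _ in range(200):
    opt.zero_grad()
    w = torch.softmax(net(x_train), dim=1)      # (N, N) weights over outputs
    q_pred = w @ q_train                        # weighted Fréchet mean (W2, 1-d)
    w2_risk = ((q_pred - q_train) ** 2).mean()  # squared W2 on the grid
    entropy = -(w * torch.log(w + 1e-12)).sum(dim=1).mean()
    loss = w2_risk - 1e-3 * entropy             # assumed entropy regularization
    loss.backward()
    opt.step()
```

Because each weight vector is a convex combination, the predicted quantile function inherits monotonicity from the training outputs, so predictions automatically remain valid elements of the output space without any surrogate embedding.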