AITopics | data geometry

Collaborating Authors

data geometry

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Distributionally Robust Optimization with Data Geometry

Neural Information Processing SystemsDec-25-2025, 11:01:04 GMT

Distributionally Robust Optimization (DRO) serves as a robust alternative to empirical risk minimization (ERM), which optimizes the worst-case distribution in an uncertainty set typically specified by distance metrics including $f$-divergence and the Wasserstein distance. The metrics defined in the ostensible high dimensional space lead to exceedingly large uncertainty sets, resulting in the underperformance of most existing DRO methods. It has been well documented that high dimensional data approximately resides on low dimensional manifolds. In this work, to further constrain the uncertainty set, we incorporate data geometric properties into the design of distance metrics, obtaining our novel Geometric Wasserstein DRO (GDRO). Empowered by Gradient Flow, we derive a generically applicable approximate algorithm for the optimization of GDRO, and provide the bounded error rate of the approximation as well as the convergence rate of our algorithm. We also theoretically characterize the edge cases where certain existing DRO methods are the degeneracy of GDRO. Extensive experiments justify the superiority of our GDRO to existing DRO methods in multiple settings with strong distributional shifts, and confirm that the uncertainty set of GDRO adapts to data geometry.

data geometry, distributionally robust optimization, name change, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Add feedback

Effects of Data Geometry in Early Deep Learning

Neural Information Processing SystemsDec-25-2025, 04:43:00 GMT

Deep neural networks can approximate functions on different types of data, from images to graphs, with varied underlying structure. This underlying structure can be viewed as the geometry of the data manifold. By extending recent advances in the theoretical understanding of neural networks, we study how a randomly initialized neural network with piecewise linear activation splits the data manifold into regions where the neural network behaves as a linear function. We derive bounds on the density of boundary of linear regions and the distance to these boundaries on the data manifold. This leads to insights into the expressivity of randomly initialized deep neural networks on non-Euclidean data sets. We empirically corroborate our theoretical results using a toy supervised learning problem. Our experiments demonstrate that number of linear regions varies across manifolds and the results hold with changing neural network architectures. We further demonstrate how the complexity of linear regions is different on the low dimensional manifold of images as compared to the Euclidean space, using the MetFaces dataset.

data geometry, name change, neural network, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.64)

Add feedback

Distributionally Robust Optimization with Data Geometry

Neural Information Processing SystemsJan-19-2025, 01:39:35 GMT

Distributionally Robust Optimization (DRO) serves as a robust alternative to empirical risk minimization (ERM), which optimizes the worst-case distribution in an uncertainty set typically specified by distance metrics including f -divergence and the Wasserstein distance. The metrics defined in the ostensible high dimensional space lead to exceedingly large uncertainty sets, resulting in the underperformance of most existing DRO methods. It has been well documented that high dimensional data approximately resides on low dimensional manifolds. In this work, to further constrain the uncertainty set, we incorporate data geometric properties into the design of distance metrics, obtaining our novel Geometric Wasserstein DRO (GDRO). Empowered by Gradient Flow, we derive a generically applicable approximate algorithm for the optimization of GDRO, and provide the bounded error rate of the approximation as well as the convergence rate of our algorithm. We also theoretically characterize the edge cases where certain existing DRO methods are the degeneracy of GDRO.

data geometry, distributionally robust optimization, gdro, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.43)

Add feedback

Effects of Data Geometry in Early Deep Learning

Neural Information Processing SystemsJan-18-2025, 19:03:55 GMT

data manifold, early deep learning, neural network, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Effects of Data Geometry in Early Deep Learning

Tiwari, Saket, Konidaris, George

arXiv.org Artificial IntelligenceDec-29-2022

Deep neural networks can approximate functions on different types of data, from images to graphs, with varied underlying structure. This underlying structure can be viewed as the geometry of the data manifold. By extending recent advances in the theoretical understanding of neural networks, we study how a randomly initialized neural network with piece-wise linear activation splits the data manifold into regions where the neural network behaves as a linear function. We derive bounds on the density of boundary of linear regions and the distance to these boundaries on the data manifold. This leads to insights into the expressivity of randomly initialized deep neural networks on non-Euclidean data sets. We empirically corroborate our theoretical results using a toy supervised learning problem. Our experiments demonstrate that number of linear regions varies across manifolds and the results hold with changing neural network architectures. We further demonstrate how the complexity of linear regions is different on the low dimensional manifold of images as compared to the Euclidean space, using the MetFaces dataset.

artificial intelligence, early deep learning, machine learning, (1 more...)

arXiv.org Artificial Intelligence

2301.00008

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Geometry- and Accuracy-Preserving Random Forest Proximities

Rhodes, Jake S., Cutler, Adele, Moon, Kevin R.

arXiv.org Machine LearningJan-29-2022

Abstract--Random forests are considered one of the best out-of-the-box classification and regression algorithms due to their high level of predictive performance with relatively little tuning. Pairwise proximities can be computed from a trained random forest which measure the similarity between data points relative to the supervised task. Random forest proximities have been used in many applications including the identification of variable importance, data imputation, outlier detection, and data visualization. However, existing definitions of random forest proximities do not accurately reflect the data geometry learned by the random forest. In this paper, we introduce a novel definition of random forest proximities called Random Forest-Geometry-and Accuracy-Preserving proximities (RF-GAP). We prove that the proximity-weighted sum (regression) or majority vote (classification) using RF-GAP exactly match the out-of-bag random forest prediction, thus capturing the data geometry learned by the random forest. We empirically show that this improved geometric representation outperforms traditional random forest proximities in tasks such as data imputation and provides outlier detection and visualization results consistent with the learned data geometry. ANDOM forests [1] are well-known, powerful predictors comprised of an ensemble of binary recursive was first defined by Leo Breiman as the proportion of decision trees. Random forests are easily adapted for both trees in which the observations reside in the same terminal classification and regression, are trivially parallelizable, can node [16].

proximity, random forest, random forest proximity, (15 more...)

arXiv.org Machine Learning

2201.12682

Country:

Europe > Austria > Vienna (0.14)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > Utah (0.04)
(6 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Add feedback

Generating and Aligning from Data Geometries with Generative Adversarial Networks

Amodio, Matthew, Krishnaswamy, Smita

arXiv.org Machine LearningJan-23-2019

Unsupervised domain mapping has attracted substantial attention in recent years due to the success of models based on the cycle-consistency assumption. These models map between two domains by fooling a probabilistic discriminator, thereby matching the probability distributions of the real and generated data. Instead of this probabilistic approach, we cast the problem in terms of aligning the geometry of the manifolds of the two domains. We introduce the Manifold Geometry Matching Generative Adversarial Network (MGM GAN), which adds two novel mechanisms to facilitate GANs sampling from the geometry of the manifold rather than the density and then aligning two manifold geometries: (1) an importance sampling technique that reweights points based on their density on the manifold, making the discriminator only able to discern geometry and (2) a penalty adapted from traditional manifold alignment literature that explicitly enforces the geometry to be preserved. The MGM GAN leverages the manifolds arising from a pre-trained autoencoder to bridge the gap between formal manifold alignment literature and existing GAN work, and demonstrate the advantages of modeling the manifold geometry over its density.

generating and aligning, geometry, manifold, (15 more...)

arXiv.org Machine Learning

1901.08177

Country:

North America > United States > New York > New York County > New York City (0.14)
Asia (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback