
Collaborating Authors: Trillos, Nicolás García


Defending Against Diverse Attacks in Federated Learning Through Consensus-Based Bi-Level Optimization

arXiv.org Artificial Intelligence

Adversarial attacks pose significant challenges in many machine learning applications, particularly in the setting of distributed training and federated learning, where malicious agents seek to corrupt the training process with the goal of compromising the performance and reliability of the final models. In this paper, we address the problem of robust federated learning in the presence of such attacks by formulating the training task as a bi-level optimization problem. We conduct a theoretical analysis of the resilience of consensus-based bi-level optimization (CB$^2$O), an interacting multi-particle metaheuristic optimization method, in adversarial settings. Specifically, we provide a global convergence analysis of CB$^2$O in mean-field law in the presence of malicious agents, demonstrating the robustness of CB$^2$O against a diverse range of attacks. In doing so, we offer insights into how specific hyperparameter choices help mitigate adversarial effects. On the practical side, we extend CB$^2$O to the clustered federated learning setting by proposing FedCB$^2$O, a novel interacting multi-particle system, and design a practical algorithm that addresses the demands of real-world applications. Extensive experiments demonstrate the robustness of the FedCB$^2$O algorithm against label-flipping attacks in decentralized clustered federated learning scenarios, showcasing its effectiveness in practical contexts.
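The abstract describes CB$^2$O only at a high level. As a rough illustration, the sketch below implements the standard single-level consensus-based optimization (CBO) update on which such methods build: a Gibbs-weighted consensus point followed by drift toward it plus exploration noise. The step sizes, the parameter alpha, and the toy objective are illustrative assumptions; this is not the paper's bi-level FedCB$^2$O algorithm.

```python
import numpy as np

def cbo_step(particles, objective, alpha=30.0, lam=1.0, sigma=0.7, dt=0.01):
    """One consensus-based optimization (CBO) update (illustrative sketch).

    particles : (N, d) array of candidate solutions
    objective : callable mapping an (N, d) array to N objective values
    """
    values = objective(particles)
    # Gibbs weights concentrate on particles with low objective values.
    weights = np.exp(-alpha * (values - values.min()))
    weights /= weights.sum()
    consensus = weights @ particles                      # weighted consensus point
    diff = particles - consensus
    noise = np.random.randn(*particles.shape)
    # Drift toward the consensus point plus componentwise exploration noise.
    return particles - lam * dt * diff + sigma * np.sqrt(dt) * diff * noise

# Toy usage: minimize a shifted quadratic in 2D.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) * 3.0
f = lambda P: np.sum((P - np.array([1.0, -2.0])) ** 2, axis=1)
for _ in range(500):
    X = cbo_step(X, f)
print(X.mean(axis=0))  # should approach [1, -2]
```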


Fermat Distances: Metric Approximation, Spectral Convergence, and Clustering Algorithms

arXiv.org Artificial Intelligence

We analyze the convergence properties of Fermat distances, a family of density-driven metrics defined on Riemannian manifolds with an associated probability measure. Fermat distances may be defined either on discrete samples from the underlying measure, in which case they are random, or in the continuum setting, in which case they are induced by geodesics under a density-distorted Riemannian metric. We prove that discrete, sample-based Fermat distances converge to their continuum analogues in small neighborhoods with a precise rate that depends on the intrinsic dimensionality of the data and the parameter governing the extent of density weighting in Fermat distances. This is done by leveraging novel geometric and statistical arguments in percolation theory that allow for non-uniform densities and curved domains. Our results are then used to prove that discrete graph Laplacians based on discrete, sample-driven Fermat distances converge to corresponding continuum operators. In particular, we show the discrete eigenvalues and eigenvectors converge to their continuum analogues at a dimension-dependent rate, which allows us to interpret the efficacy of discrete spectral clustering using Fermat distances in terms of the resulting continuum limit. The perspective afforded by our discrete-to-continuum Fermat distance analysis leads to new clustering algorithms for data and related insights into efficient computations associated with density-driven spectral clustering. Our theoretical analysis is supported with numerical simulations and experiments on synthetic and real image data.
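For a concrete picture, the snippet below computes a sample-based Fermat-type distance by raising Euclidean edge lengths of a k-nearest-neighbor graph to a power p and taking shortest paths. The choice of k and p and the use of a kNN graph (rather than a complete graph) are simplifying assumptions for illustration, not the paper's exact setup.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def sample_fermat_distances(X, p=2.0, k=10):
    """Shortest-path distances with edge weights |x_i - x_j|^p (illustrative sketch).

    Larger p penalizes long hops more strongly, so optimal paths tend to
    pass through high-density regions of the sample.
    """
    G = kneighbors_graph(X, n_neighbors=k, mode="distance")  # sparse Euclidean distances
    G.data = G.data ** p                                      # density-adaptive edge weights
    # Treat the graph as undirected so the resulting distance is symmetric.
    return shortest_path(G, method="D", directed=False)

# Toy usage on a random planar sample.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
D = sample_fermat_distances(X, p=2.0, k=8)
print(D.shape)  # (300, 300)
```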


It begins with a boundary: A geometric view on probabilistically robust learning

arXiv.org Artificial Intelligence

Although deep neural networks have achieved super-human performance on many classification tasks, they often exhibit a worrying lack of robustness to adversarially generated examples. Thus, considerable effort has been invested into reformulating Empirical Risk Minimization (ERM) into an adversarially robust framework. Recently, attention has shifted towards approaches which interpolate between the robustness offered by adversarial training and the higher clean accuracy and faster training times of ERM. In this paper, we take a fresh and geometric view on one such method -- Probabilistically Robust Learning (PRL) (Robey et al., ICML, 2022). We propose a geometric framework for understanding PRL, which allows us to identify a subtle flaw in its original formulation and to introduce a family of probabilistic nonlocal perimeter functionals to address this. We prove the existence of solutions using novel relaxation methods and study the properties and local limits of the introduced perimeters.
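As a reminder of the kind of condition being interpolated here, the following schematic display (our notation, not the paper's) states probabilistic robustness: random perturbations within the attack budget should rarely change the prediction.

```latex
% Schematic probabilistic robustness condition (notation ours):
% a classifier f is probabilistically robust at (x, y) with tolerance rho if
\[
  \mathbb{P}_{x' \sim \mathrm{Unif}(B_\varepsilon(x))}\!\left[ f(x') \neq y \right] \;\le\; \rho ,
\]
% recovering worst-case adversarial robustness as rho -> 0 and relaxing toward
% standard (clean) ERM as rho grows.
```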


Wasserstein Barycenter-based Model Fusion and Linear Mode Connectivity of Neural Networks

arXiv.org Artificial Intelligence

Based on the concepts of Wasserstein barycenter (WB) and Gromov-Wasserstein barycenter (GWB), we propose a unified mathematical framework for neural network (NN) model fusion and utilize it to reveal new insights about the linear mode connectivity of SGD solutions. In our framework, the fusion occurs in a layer-wise manner and builds on an interpretation of a node in a network as a function of the layer preceding it. The versatility of our mathematical framework allows us to talk about model fusion and linear mode connectivity for a broad class of NNs, including fully connected NN, CNN, ResNet, RNN, and LSTM architectures, in each case exploiting the specific structure of the network architecture. We present extensive numerical experiments to: 1) illustrate the strengths of our approach in relation to other model fusion methodologies and 2) from a certain perspective, provide new empirical evidence for recent conjectures which say that two local minima found by gradient-based methods end up lying in the same basin of the loss landscape after a proper permutation of weights is applied to one of the models.

In practice, however, data privacy concerns, communication costs, bandwidth limits, and time sensitivity prevent the gathering of local data from edge devices into one single centralized location. These obstacles have motivated the design and development of federated learning strategies aimed at pooling information from locally trained neural networks (NNs) with the objective of building strong centralized models without relying on the collection of local data McMahan et al. (2017); Kairouz et al. (2019). A standard fusion method for aggregating models with the same architecture is FedAvg McMahan et al. (2017), which involves element-wise averaging of the parameters of local models; this is also known as vanilla averaging Singh & Jaggi (2019). Although easily implementable, vanilla averaging performs poorly when fusing models whose weights do not have a one-to-one correspondence. This happens because, even when models are trained on the same dataset, it is possible to obtain models that differ only by a permutation of weights Wang et al. (2020); Yurochkin et al. (2019). Moreover, vanilla averaging is not naturally designed to work when using local models with different architectures (e.g., different widths).
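Since the passage above describes FedAvg as element-wise (vanilla) averaging of parameters, a minimal sketch of that baseline may help fix ideas. The dictionary-of-arrays model representation and uniform client weights below are illustrative assumptions; this is the baseline being critiqued, not the paper's barycenter-based fusion.

```python
import numpy as np

def fedavg(local_models, weights=None):
    """Vanilla (element-wise) averaging of models with identical architectures.

    local_models : list of dicts mapping parameter names to numpy arrays
    weights      : optional per-client weights (e.g., proportional to local data size)
    """
    n = len(local_models)
    weights = np.full(n, 1.0 / n) if weights is None else np.asarray(weights) / np.sum(weights)
    fused = {}
    for name in local_models[0]:
        # Element-wise weighted average of the same parameter across clients.
        fused[name] = sum(w * m[name] for w, m in zip(weights, local_models))
    return fused

# Toy usage: three "clients" sharing one small architecture.
rng = np.random.default_rng(0)
clients = [{"W1": rng.normal(size=(4, 3)), "b1": rng.normal(size=4)} for _ in range(3)]
print(fedavg(clients)["W1"].shape)  # (4, 3)
```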


The Geometry of Adversarial Training in Binary Classification

arXiv.org Machine Learning

We establish an equivalence between a family of adversarial training problems for non-parametric binary classification and a family of regularized risk minimization problems where the regularizer is a nonlocal perimeter functional. The resulting regularized risk minimization problems admit exact convex relaxations of the type $L^1+$ (nonlocal) $\operatorname{TV}$, a form frequently studied in image analysis and graph-based learning. This reformulation reveals a rich geometric structure, which in turn allows us to establish a series of properties of optimal solutions of the original problem, including the existence of minimal and maximal solutions (interpreted in a suitable sense) and the existence of regular solutions (also interpreted in a suitable sense). In addition, we highlight how the connection between adversarial training and perimeter minimization problems provides a novel, directly interpretable, statistical motivation for a family of regularized risk minimization problems involving perimeter/total variation. The majority of our theoretical results are independent of the distance used to define adversarial attacks.
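To make the correspondence concrete, the display below gives one schematic way to write the equivalence (our notation; the paper works with general distances and a precisely defined nonlocal perimeter that is only indicated here).

```latex
% Adversarial training over decision sets A for binary classification (schematic):
\[
  \min_{A}\;
  \mathbb{E}_{(x,y)\sim\mu}\Big[
      \mathbf{1}_{\{y=1\}}\,\sup_{\tilde{x}\in B_\varepsilon(x)} \mathbf{1}_{A^{c}}(\tilde{x})
    \;+\; \mathbf{1}_{\{y=0\}}\,\sup_{\tilde{x}\in B_\varepsilon(x)} \mathbf{1}_{A}(\tilde{x})
  \Big]
  \;=\;
  \min_{A}\; R(A) \;+\; \varepsilon\,\mathrm{Per}_\varepsilon(A),
\]
% where R(A) is the clean (unperturbed) 0-1 risk of the decision set A and
% Per_eps is a nonlocal perimeter functional depending on the attack budget eps.
```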


Clustering dynamics on graphs: from spectral clustering to mean shift through Fokker-Planck interpolation

arXiv.org Machine Learning

In this work we build a unifying framework to interpolate between density-driven and geometry-based algorithms for data clustering, and specifically, to connect the mean shift algorithm with spectral clustering at discrete and continuum levels. We seek this connection through the introduction of Fokker-Planck equations on data graphs. Besides introducing new forms of mean shift algorithms on graphs, we provide new theoretical insights into the behavior of the family of diffusion maps in the large-sample limit, as well as new connections between diffusion maps and mean shift dynamics on a fixed graph. Several numerical examples illustrate our theoretical findings and highlight the benefits of interpolating density-driven and geometry-based clustering algorithms.
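As a rough picture of the kind of interpolation meant here (our schematic notation, not the paper's exact equations): a Fokker-Planck evolution driven by the data density $\rho$ combines Laplacian diffusion, the regime behind spectral and diffusion-map methods, with a drift up the density gradient, the regime behind mean shift.

```latex
% Schematic continuum Fokker-Planck interpolation (notation ours):
\[
  \partial_t u \;=\; \Delta u \;-\; \beta\, \nabla \!\cdot\! \big( u\, \nabla \log \rho \big),
\]
% beta = 0: pure diffusion, the regime underlying spectral clustering / diffusion maps;
% large beta: transport of mass toward the modes of rho, a mean-shift-like regime.
% The paper studies graph analogues of such dynamics on data graphs.
```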


A variational approach to the consistency of spectral clustering

arXiv.org Machine Learning

This paper establishes the consistency of spectral approaches to data clustering. We consider clustering of point clouds obtained as samples of a ground-truth measure. A graph representing the point cloud is obtained by assigning weights to edges based on the distance between the points they connect. We investigate the spectral convergence of both unnormalized and normalized graph Laplacians towards the appropriate operators in the continuum domain. We obtain sharp conditions on how the connectivity radius can be scaled with respect to the number of sample points for the spectral convergence to hold. We also show that the discrete clusters obtained via spectral clustering converge towards a continuum partition of the ground truth measure. Such a continuum partition minimizes a functional describing the continuum analogue of graph-based spectral partitioning. Our approach, based on variational convergence, is general and flexible.
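A bare-bones version of the discrete pipeline analyzed here (connectivity radius, graph Laplacian, low-lying eigenvectors, then k-means) looks roughly as follows. The Gaussian edge weights, the unnormalized Laplacian, and the parameter values are illustrative assumptions rather than the paper's exact choices.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_clustering(X, eps, n_clusters=2):
    """Spectral clustering on a proximity graph built from a point cloud (sketch)."""
    D = squareform(pdist(X))                      # pairwise Euclidean distances
    W = np.exp(-(D / eps) ** 2) * (D < eps)       # weights only within the connectivity radius
    np.fill_diagonal(W, 0.0)
    deg = W.sum(axis=1)
    L = np.diag(deg) - W                          # unnormalized graph Laplacian
    # The eigenvectors of the smallest eigenvalues encode the cluster structure.
    _, vecs = eigh(L, subset_by_index=[0, n_clusters - 1])
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(vecs)

# Toy usage: two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, size=(100, 2)), rng.normal(3, 0.3, size=(100, 2))])
labels = spectral_clustering(X, eps=1.0, n_clusters=2)
print(np.bincount(labels))  # roughly 100 points per cluster
```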


Continuum limit of total variation on point clouds

arXiv.org Machine Learning

We consider point clouds obtained as random samples of a measure on a Euclidean domain. A graph representing the point cloud is obtained by assigning weights to edges based on the distance between the points they connect. Our goal is to develop mathematical tools needed to study the consistency, as the number of available data points increases, of graph-based machine learning algorithms for tasks such as clustering. In particular, we study when the cut capacity, and more generally the total variation, on these graphs provides a good approximation of the perimeter (total variation) in the continuum setting. We address this question in the setting of $\Gamma$-convergence. We obtain almost optimal conditions on the scaling, as the number of points increases, of the size of the neighborhood over which the points are connected by an edge for the $\Gamma$-convergence to hold. Taking the limit is enabled by a transportation-based metric which allows one to suitably compare functionals defined on different point clouds.
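For concreteness, the discrete functional in question can be written schematically as follows; the normalization and kernel shown are indicative, with exact constants following the paper's conventions.

```latex
% Graph total variation of a function u on sample points x_1, ..., x_n (schematic):
\[
  \mathrm{GTV}_{n,\varepsilon}(u)
  \;=\;
  \frac{1}{\varepsilon\, n^{2}} \sum_{i,j=1}^{n}
      \eta\!\left(\frac{|x_i - x_j|}{\varepsilon}\right)
      \big| u(x_i) - u(x_j) \big| ,
\]
% where eta is a radially decreasing kernel and eps is the connectivity length scale;
% for indicator functions u = 1_A this is proportional to the graph cut of the set A.
```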