AITopics

2411.04265

Country:

North America > United States > Maryland (0.14)
North America > United States > New York (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine LearningJun-3-2024

Learning equivariant tensor functions with applications to sparse vector recovery

Gregory, Wilson G., Tonelli-Cueto, Josué, Marshall, Nicholas F., Lee, Andrew S., Villar, Soledad

This work characterizes equivariant polynomial functions from tuples of tensor inputs to tensor outputs. Loosely motivated by physics, we focus on equivariant functions with respect to the diagonal action of the orthogonal group on tensors. We show how to extend this characterization to other linear algebraic groups, including the Lorentz and symplectic groups. Our goal behind these characterizations is to define equivariant machine learning models. In particular, we focus on the sparse vector estimation problem. This problem has been broadly studied in the theoretical computer science literature, and explicit spectral methods, derived by techniques from sum-of-squares, can be shown to recover sparse vectors under certain assumptions. Our numerical results show that the proposed equivariant machine learning models can learn spectral methods that outperform the best theoretically known spectral methods in some regimes. The experiments also suggest that learned spectral methods can solve the problem in settings that have not yet been theoretically analyzed. This is an example of a promising direction in which theory can inform machine learning models and machine learning models could inform theory.

artificial intelligence, machine learning, tensor, (16 more...)

2406.01552

Country:

Europe (0.92)
North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

arXiv.org Machine LearningMay-31-2024

Is machine learning good or bad for the natural sciences?

Hogg, David W., Villar, Soledad

Machine learning (ML) methods are having a huge impact across all of the sciences. However, ML has a strong ontology - in which only the data exist - and a strong epistemology - in which a model is considered good if it performs well on held-out training data. These philosophies are in strong conflict with both standard practices and key philosophies in the natural sciences. Here we identify some locations for ML in the natural sciences at which the ontology and epistemology are valuable. For example, when an expressive machine learning model is used in a causal inference to represent the effects of confounders, such as foregrounds, backgrounds, or instrument calibration parameters, the model capacity and loose philosophy of ML can make the results more trustworthy. We also show that there are contexts in which the introduction of ML introduces strong, unwanted statistical biases. For one, when ML models are used to emulate physical (or first-principles) simulations, they amplify confirmation biases. For another, when expressive regressions are used to label datasets, those labels cannot be used in downstream joint or ensemble analyses without taking on uncontrolled biases. The question in the title is being asked of all of the natural sciences; that is, we are calling on the scientific communities to take a step back and consider the role and value of ML in their fields; the (partial) answers we give here come from the particular perspective of physics.

artificial intelligence, machine learning, natural science, (15 more...)

2405.18095

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > United Kingdom > England (0.14)
Europe > Austria > Vienna (0.14)

Genre:

Instructional Material (0.67)
Research Report (0.64)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceMay-15-2024

Learning functions on symmetric matrices and point clouds via lightweight invariant features

Blum-Smith, Ben, Huang, Ningyuan, Cuturi, Marco, Villar, Soledad

In this work, we present a mathematical formulation for machine learning of (1) functions on symmetric matrices that are invariant with respect to the action of permutations by conjugation, and (2) functions on point clouds that are invariant with respect to rotations, reflections, and permutations of the points. To achieve this, we construct $O(n^2)$ invariant features derived from generators for the field of rational functions on $n\times n$ symmetric matrices that are invariant under joint permutations of rows and columns. We show that these invariant features can separate all distinct orbits of symmetric matrices except for a measure zero set; such features can be used to universally approximate invariant functions on almost all weighted graphs. For point clouds in a fixed dimension, we prove that the number of invariant features can be reduced, generically without losing expressivity, to $O(n)$, where $n$ is the number of points. We combine these invariant features with DeepSets to learn functions on symmetric matrices and point clouds with varying sizes. We empirically demonstrate the feasibility of our approach on molecule property regression and point cloud distance prediction.

artificial intelligence, machine learning, point cloud, (14 more...)

2405.08097

Country: North America > United States > Hawaii (0.14)

Genre: Research Report (0.63)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

arXiv.org Machine LearningNov-17-2023

Approximately Equivariant Graph Networks

Huang, Ningyuan, Levie, Ron, Villar, Soledad

Graph neural networks (GNNs) are commonly described as being permutation equivariant with respect to node relabeling in the graph. This symmetry of GNNs is often compared to the translation equivariance of Euclidean convolution neural networks (CNNs). However, these two symmetries are fundamentally different: The translation equivariance of CNNs corresponds to symmetries of the fixed domain acting on the image signals (sometimes known as active symmetries), whereas in GNNs any permutation acts on both the graph signals and the graph domain (sometimes described as passive symmetries). In this work, we focus on the active symmetries of GNNs, by considering a learning setting where signals are supported on a fixed graph. In this case, the natural symmetries of GNNs are the automorphisms of the graph. Since real-world graphs tend to be asymmetric, we relax the notion of symmetries by formalizing approximate symmetries via graph coarsening. We present a bias-variance formula that quantifies the tradeoff between the loss in expressivity and the gain in the regularity of the learned estimator, depending on the chosen symmetry group. To illustrate our approach, we conduct extensive experiments on image inpainting, traffic flow prediction, and human pose estimation with different choices of symmetries. We show theoretically and empirically that the best generalization performance can be achieved by choosing a suitably larger group than the graph automorphism, but smaller than the permutation group.

artificial intelligence, machine learning, symmetry, (16 more...)

2308.10436

Country: North America > United States (0.28)

Genre: Research Report (0.63)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

arXiv.org Artificial IntelligenceNov-2-2023

Fine-grained Expressivity of Graph Neural Networks

Böker, Jan, Levie, Ron, Huang, Ningyuan, Villar, Soledad, Morris, Christopher

Numerous recent works have analyzed the expressive power of message-passing graph neural networks (MPNNs), primarily utilizing combinatorial techniques such as the $1$-dimensional Weisfeiler-Leman test ($1$-WL) for the graph isomorphism problem. However, the graph isomorphism objective is inherently binary, not giving insights into the degree of similarity between two given graphs. This work resolves this issue by considering continuous extensions of both $1$-WL and MPNNs to graphons. Concretely, we show that the continuous variant of $1$-WL delivers an accurate topological characterization of the expressive power of MPNNs on graphons, revealing which graphs these networks can distinguish and the level of difficulty in separating them. We identify the finest topology where MPNNs separate points and prove a universal approximation theorem. Consequently, we provide a theoretical framework for graph and graphon similarity combining various topological variants of classical characterizations of the $1$-WL. In particular, we characterize the expressive power of MPNNs in terms of the tree distance, which is a graph distance based on the concept of fractional isomorphisms, and substructure counts via tree homomorphisms, showing that these concepts have the same expressive power as the $1$-WL and MPNNs on graphons. Empirically, we validate our theoretical findings by showing that randomly initialized MPNNs, without training, exhibit competitive performance compared to their trained counterparts. Moreover, we evaluate different MPNN architectures based on their ability to preserve graph distances, highlighting the significance of our continuous $1$-WL test in understanding MPNNs' expressivity.

artificial intelligence, graph, machine learning, (17 more...)

2306.03698

Country:

Europe (0.28)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceSep-13-2023

A Spectral Analysis of Graph Neural Networks on Dense and Sparse Graphs

Ruiz, Luana, Huang, Ningyuan, Villar, Soledad

In this work we propose a random graph model that can produce graphs at different levels of sparsity. We analyze how sparsity affects the graph spectra, and thus the performance of graph neural networks (GNNs) in node classification on dense and sparse graphs. We compare GNNs with spectral methods known to provide consistent estimators for community detection on dense graphs, a closely related task. We show that GNNs can outperform spectral methods on sparse graphs, and illustrate these results with numerical examples on both synthetic and real graphs.

artificial intelligence, dense and sparse graph, machine learning, (2 more...)

2211.03231

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)

arXiv.org Artificial IntelligenceJun-28-2023

Towards fully covariant machine learning

Villar, Soledad, Hogg, David W., Yao, Weichi, Kevrekidis, George A., Schölkopf, Bernhard

Any representation of data involves arbitrary investigator choices. Because those choices are external to the data-generating process, each choice leads to an exact symmetry, corresponding to the group of transformations that takes one possible representation to another. These are the passive symmetries; they include coordinate freedom, gauge symmetry, and units covariance, all of which have led to important results in physics. In machine learning, the most visible passive symmetry is the relabeling or permutation symmetry of graphs. Our goal is to understand the implications for machine learning of the many passive symmetries in play. We discuss dos and don'ts for machine learning practice if passive symmetries are to be respected. We discuss links to causal modeling, and argue that the implementation of passive symmetries is particularly valuable when the goal of the learning problem is to generalize out of sample. This paper is conceptual: It translates among the languages of physics, mathematics, and machine-learning. We believe that consideration and implementation of passive symmetries might help machine learning in the same ways that it transformed physics in the twentieth century.

artificial intelligence, machine learning, symmetry, (18 more...)

2301.13724

Country:

North America > United States (0.94)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report (0.64)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

arXiv.org Artificial IntelligenceJun-24-2023

Structuring Representation Geometry with Rotationally Equivariant Contrastive Learning

Gupta, Sharut, Robinson, Joshua, Lim, Derek, Villar, Soledad, Jegelka, Stefanie

Self-supervised learning converts raw perceptual data such as images to a compact space where simple Euclidean distances measure meaningful variations in data. In this paper, we extend this formulation by adding additional geometric structure to the embedding space by enforcing transformations of input space to correspond to simple (i.e., linear) transformations of embedding space. Specifically, in the contrastive learning setting, we introduce an equivariance objesctive and theoretically prove that its minima forces augmentations on input space to correspond to rotations on the spherical embedding space. We show that merely combining our equivariant loss with a non-collapse term results in non-trivial representations, without requiring invariance to data augmentations. Optimal performance is achieved by also encouraging approximate invariance, where input augmentations correspond to small rotations. Our method, Care: Contrastive Augmentation-induced Rotational Equivariance, leads to improved performance on downstream tasks, and ensures sensitivity in embedding space to important variations in data (e.g., color) that standard contrastive methods do not achieve. Code is available at https://github.com/Sharut/CARE.

artificial intelligence, augmentation, machine learning, (14 more...)

2306.13924

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

arXiv.org Artificial IntelligenceMay-21-2023

GeometricImageNet: Extending convolutional neural networks to vector and tensor images

Gregory, Wilson, Hogg, David W., Blum-Smith, Ben, Arias, Maria Teresa, Wong, Kaze W. K., Villar, Soledad

Convolutional neural networks and their ilk have been very successful for many learning tasks involving images. These methods assume that the input is a scalar image representing the intensity in each pixel, possibly in multiple channels for color images. In natural-science domains however, image-like data sets might have vectors (velocity, say), tensors (polarization, say), pseudovectors (magnetic field, say), or other geometric objects in each pixel. Treating the components of these objects as independent channels in a CNN neglects their structure entirely. Our formulation -- the GeometricImageNet -- combines a geometric generalization of convolution with outer products, tensor index contractions, and tensor index permutations to construct geometric-image functions of geometric images that use and benefit from the tensor structure. The framework permits, with a very simple adjustment, restriction to function spaces that are exactly equivariant to translations, discrete rotations, and reflections. We use representation theory to quantify the dimension of the space of equivariant polynomial functions on 2-dimensional vector images. We give partial results on the expressivity of GeometricImageNet on small images. In numerical experiments, we find that GeometricImageNet has good generalization for a small simulated physics system, even when trained with a small training set. We expect this tool will be valuable for scientific and engineering machine learning, for example in cosmology or ocean dynamics.

artificial intelligence, convolution, machine learning, (19 more...)

2305.12585

Country:

Europe (0.67)
North America > United States > New York (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)