Dehmamy, Nima
Symmetry-Informed Governing Equation Discovery
Yang, Jianke, Rao, Wang, Dehmamy, Nima, Walters, Robin, Yu, Rose
Despite the advancements in learning governing differential equations from observations of dynamical systems, data-driven methods are often unaware of fundamental physical laws, such as frame invariance. As a result, these algorithms may search an unnecessarily large space and discover equations that are less accurate or overly complex. In this paper, we propose to leverage symmetry in automated equation discovery to compress the equation search space and improve the accuracy and simplicity of the learned equations. Specifically, we derive equivariance constraints from the time-independent symmetries of ODEs. Depending on the types of symmetries, we develop a pipeline for incorporating symmetry constraints into various equation discovery algorithms, including sparse regression and genetic programming. In experiments across a diverse range of dynamical systems, our approach demonstrates better robustness against noise and recovers governing equations with significantly higher probability than baselines without symmetry.
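To illustrate the kind of constraint involved, here is a minimal numerical check (not the paper's pipeline; the rotation group and the candidate right-hand sides are chosen purely for illustration) of the equivariance condition f(Rx) = R f(x) for a candidate ODE right-hand side f. Constraints of this form are what allow symmetry-violating candidates to be pruned from the equation search space.

```python
# Minimal sketch: numerically test whether a candidate ODE right-hand side f
# is equivariant under 2D rotations, i.e. f(R x) = R f(x).
# Candidates failing the test can be pruned from the equation search space.
import numpy as np

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def is_rotation_equivariant(f, n_trials=100, tol=1e-8):
    rng = np.random.default_rng(0)
    for _ in range(n_trials):
        x = rng.normal(size=2)
        R = rotation(rng.uniform(0, 2 * np.pi))
        if np.linalg.norm(f(R @ x) - R @ f(x)) > tol:
            return False
    return True

# Example: a linear rotation field is equivariant, a coordinate-wise cubic is not.
omega = np.array([[0.0, -1.0], [1.0, 0.0]])
print(is_rotation_equivariant(lambda x: omega @ x))   # True
print(is_rotation_equivariant(lambda x: x ** 3))      # False
```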
Latent Space Symmetry Discovery
Yang, Jianke, Dehmamy, Nima, Walters, Robin, Yu, Rose
Equivariant neural networks require explicit knowledge of the symmetry group. Automatic symmetry discovery methods aim to relax this constraint and learn invariance and equivariance from data. However, existing symmetry discovery methods are limited to linear symmetries in their search space and cannot handle the complexity of symmetries in real-world, often high-dimensional data. We propose a novel generative model, Latent LieGAN (LaLiGAN), which can discover nonlinear symmetries from data. It learns a mapping from data to a latent space where the symmetries become linear and simultaneously discovers symmetries in the latent space. Theoretically, we show that our method can express any nonlinear symmetry under certain conditions. Experimentally, our method can capture the intrinsic symmetry in high-dimensional observations, which results in a well-structured latent space that is useful for other downstream tasks. We demonstrate the use cases for LaLiGAN in improving equation discovery and long-term forecasting for various dynamical systems. However, for complex real-world data, the underlying symmetries may be unknown or difficult to specify programmatically. LieGAN (Yang et al., 2023) can discover various types of symmetries, but its search space is still constrained to general linear groups. Successful discovery is therefore only possible when observations are measured in an ideal coordinate system where linear symmetry is present. Unfortunately, real-world data often contain nonlinear symmetries, such as high-dimensional dynamics that evolve on a low-dimensional manifold (Champion et al., 2019) or 2D images of 3D objects (Garrido et al., 2023). Another line of study focuses on learning equivariant representations (Park et al., 2022; Yu et al., 2022; Dangovski et al., 2021; Quessard et al., 2020). These approaches learn a latent embedding space with particular symmetries.
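A minimal sketch of the idea follows, with illustrative module and parameter names rather than the released LaLiGAN code: an autoencoder maps observations to a latent space, a learned Lie algebra generator acts linearly there, and the induced nonlinear symmetry in data space is decoder ∘ exp(tL) ∘ encoder.

```python
# Minimal sketch of the LaLiGAN idea (simplified; names are illustrative, not the
# paper's API): an autoencoder maps data to a latent space, and a learned Lie
# algebra generator L acts *linearly* there. The induced nonlinear symmetry in
# data space is x -> decoder(exp(t L) @ encoder(x)).
import torch
import torch.nn as nn

class LatentSymmetryModel(nn.Module):
    def __init__(self, data_dim=10, latent_dim=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(data_dim, 32), nn.Tanh(),
                                     nn.Linear(32, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 32), nn.Tanh(),
                                     nn.Linear(32, data_dim))
        # One learnable Lie algebra generator acting on the latent space.
        self.L = nn.Parameter(torch.randn(latent_dim, latent_dim) * 0.1)

    def transform(self, x, t):
        """Apply the discovered (nonlinear) symmetry with group parameter t."""
        z = self.encoder(x)
        g = torch.matrix_exp(t * self.L)   # linear group element in latent space
        return self.decoder(z @ g.T)

model = LatentSymmetryModel()
x = torch.randn(4, 10)
x_transformed = model.transform(x, t=0.5)
# In training, a GAN-style objective would push x_transformed to be distributed
# like x, while a reconstruction loss ties the encoder and decoder together.
print(x_transformed.shape)   # torch.Size([4, 10])
```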
Generative Adversarial Symmetry Discovery
Yang, Jianke, Walters, Robin, Dehmamy, Nima, Yu, Rose
Despite the success of equivariant neural networks in scientific applications, they require knowing the symmetry group a priori. However, it may be difficult to know which symmetry to use as an inductive bias in practice. Enforcing the wrong symmetry could even hurt the performance. In this paper, we propose a framework, LieGAN, to automatically discover equivariances from a dataset using a paradigm akin to generative adversarial training. Specifically, a generator learns a group of transformations applied to the data, which preserve the original distribution and fool the discriminator. LieGAN represents symmetry as an interpretable Lie algebra basis and can discover various symmetries, such as the rotation group $\mathrm{SO}(n)$ and the restricted Lorentz group $\mathrm{SO}(1,3)^+$, in trajectory prediction and top-quark tagging tasks. The learned symmetry can also be readily used in several existing equivariant neural networks to improve accuracy and generalization in prediction.
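A minimal sketch of the generator side of this setup, assuming a vector-valued dataset and omitting the discriminator and regularization terms; the class and argument names are illustrative, not LieGAN's released API.

```python
# Minimal sketch of a LieGAN-style generator: learnable Lie algebra basis L_i;
# a "group element" is sampled by drawing random coefficients w and
# exponentiating, and is applied to a data batch. A standard discriminator
# (omitted) is trained to tell transformed from original data, and the
# generator is trained to fool it while keeping the L_i non-trivial.
import torch
import torch.nn as nn

class LieGenerator(nn.Module):
    def __init__(self, dim=2, n_channels=1):
        super().__init__()
        # n_channels learnable generators of a candidate symmetry group.
        self.L = nn.Parameter(torch.randn(n_channels, dim, dim) * 0.1)

    def forward(self, x, sigma=1.0):
        # x: (batch, dim). Sample one group element per sample.
        w = sigma * torch.randn(x.shape[0], self.L.shape[0], device=x.device)
        A = torch.einsum('bc,cij->bij', w, self.L)   # Lie algebra element
        g = torch.matrix_exp(A)                      # group element
        return torch.einsum('bij,bj->bi', g, x)

gen = LieGenerator(dim=2)
x = torch.randn(8, 2)
x_g = gen(x)       # transformed batch, same shape as x
print(x_g.shape)   # torch.Size([8, 2])
```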
Symmetries, flat minima, and the conserved quantities of gradient flow
Zhao, Bo, Ganev, Iordan, Walters, Robin, Yu, Rose, Dehmamy, Nima
Empirical studies of the loss landscape of deep networks have revealed that many local minima are connected through low-loss valleys. Yet, little is known about the theoretical origin of such valleys. We present a general framework for finding continuous symmetries in the parameter space, which carve out low-loss valleys. Our framework uses equivariances of the activation functions and can be applied to different layer architectures. To generalize this framework to nonlinear neural networks, we introduce a novel set of nonlinear, data-dependent symmetries. These symmetries can transform a trained model such that it performs similarly on new samples, which allows ensemble building that improves robustness under certain adversarial attacks. We then show that conserved quantities associated with linear symmetries can be used to define coordinates along low-loss valleys. The conserved quantities help reveal that, under common initialization methods, gradient flow explores only a small part of the global minimum. By relating conserved quantities to convergence rate and sharpness of the minimum, we provide insights on how initialization impacts convergence and generalizability.
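A small numerical sketch of one such conserved quantity, assuming the simplest case of a two-layer linear network with the rescaling symmetry W1 -> g W1, W2 -> W2 g^{-1}: the quantity Q = W1 W1^T - W2^T W2 stays approximately constant along (discretized) gradient flow.

```python
# Minimal numerical sketch (assumes a two-layer *linear* network): the loss
# depends only on the product W2 @ W1, so W1 -> g W1, W2 -> W2 g^{-1} is a
# symmetry, and Q = W1 W1^T - W2^T W2 is conserved under gradient flow
# (up to a discretization error that vanishes with the step size).
import numpy as np

rng = np.random.default_rng(0)
X, Y = rng.normal(size=(5, 50)), rng.normal(size=(3, 50))
W1, W2 = rng.normal(size=(4, 5)), rng.normal(size=(3, 4))

def loss_grads(W1, W2):
    E = W2 @ W1 @ X - Y
    return W2.T @ E @ X.T, E @ X.T @ W1.T   # dL/dW1, dL/dW2 for L = 0.5*||E||^2

Q0 = W1 @ W1.T - W2.T @ W2
for _ in range(10000):                       # small steps approximate gradient flow
    g1, g2 = loss_grads(W1, W2)
    W1, W2 = W1 - 1e-4 * g1, W2 - 1e-4 * g2

Q = W1 @ W1.T - W2.T @ W2
print(np.abs(Q - Q0).max())                  # small drift: Q is (nearly) conserved
```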
Symmetry Teleportation for Accelerated Optimization
Zhao, Bo, Dehmamy, Nima, Walters, Robin, Yu, Rose
Existing gradient-based optimization methods update parameters locally, in a direction that minimizes the loss function. We study a different approach, symmetry teleportation, that allows parameters to travel a large distance on the loss level set in order to improve the convergence speed in subsequent steps. Teleportation exploits symmetries in the loss landscape of optimization problems. We derive loss-invariant group actions for test functions in optimization and multi-layer neural networks, and prove a necessary condition for teleportation to improve convergence rate. We also show that our algorithm is closely related to second-order methods. Experimentally, we show that teleportation improves the convergence speed of gradient descent and AdaGrad for several optimization problems including test functions, multi-layer regressions, and MNIST classification.
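A minimal sketch of a teleportation step on a two-layer linear model, assuming the group action W1 -> g W1, W2 -> W2 g^{-1}; choosing the group element by gradient ascent on the squared gradient norm is one simple realization of the idea, not necessarily the paper's exact procedure.

```python
# Minimal sketch of symmetry teleportation on a two-layer linear model:
# the action W1 -> g W1, W2 -> W2 g^{-1} leaves the loss unchanged but changes
# the gradient, so g can be chosen to enlarge the gradient norm and speed up
# the subsequent descent steps.
import torch

torch.manual_seed(0)
X, Y = torch.randn(5, 50), torch.randn(3, 50)
W1, W2 = torch.randn(4, 5), torch.randn(3, 4)

def loss(W1, W2):
    return 0.5 * ((W2 @ W1 @ X - Y) ** 2).sum()

def grad_norm_sq(W1, W2):
    g1, g2 = torch.autograd.grad(loss(W1, W2), (W1, W2), create_graph=True)
    return (g1 ** 2).sum() + (g2 ** 2).sum()

# Teleportation step: optimize the group element g = exp(T) to maximize |grad|^2.
T = torch.zeros(4, 4, requires_grad=True)
opt = torch.optim.Adam([T], lr=0.05)
for _ in range(50):
    g = torch.matrix_exp(T)
    W1_new, W2_new = g @ W1, W2 @ torch.linalg.inv(g)
    objective = -grad_norm_sq(W1_new, W2_new)   # ascend the gradient norm
    opt.zero_grad(); objective.backward(); opt.step()

with torch.no_grad():
    g = torch.matrix_exp(T)
    # The loss is unchanged by the teleportation, only the gradient is.
    print(loss(W1, W2), loss(g @ W1, W2 @ torch.linalg.inv(g)))
```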
Automatic Symmetry Discovery with Lie Algebra Convolutional Network
Dehmamy, Nima, Walters, Robin, Liu, Yanchen, Wang, Dashun, Yu, Rose
Existing equivariant neural networks for continuous groups require discretization or group representations. All these approaches require detailed knowledge of the group parametrization and cannot learn entirely new symmetries. We propose to work with the Lie algebra (infinitesimal generators) instead of the Lie group. Our model, the Lie algebra convolutional network (L-conv), can learn potential symmetries and does not require discretization of the group. We show that L-conv can serve as a building block to construct any group equivariant architecture. We discuss how CNNs and Graph Convolutional Networks are related to and can be expressed as L-conv with appropriate groups. We also derive the MSE loss for a single L-conv layer and find a deep relation with Lagrangians used in physics, with some of the physics aiding in defining generalization and symmetries in the loss landscape. Conversely, L-conv could be used to propose more general equivariant ansätze for scientific machine learning.
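A simplified sketch of an L-conv-style layer, assuming the input is given as features on a set of flattened spatial points; the parameterization below (dense learnable generators on the spatial axis, per-generator channel weights) is illustrative rather than the paper's exact construction.

```python
# Simplified sketch of an L-conv-style layer: learnable infinitesimal
# generators L_i act on the flattened spatial axis of the input, and
# per-generator weights W_i act on the channels, so the layer approximates
# a group convolution near the identity.
import torch
import torch.nn as nn

class LConvLayer(nn.Module):
    def __init__(self, n_points, in_ch, out_ch, n_generators=2, eps=0.1):
        super().__init__()
        self.eps = eps
        # Infinitesimal generators on the spatial (flattened) axis.
        self.L = nn.Parameter(torch.randn(n_generators, n_points, n_points) * 0.01)
        self.W0 = nn.Linear(in_ch, out_ch, bias=False)
        self.W = nn.Parameter(torch.randn(n_generators, in_ch, out_ch) * 0.01)

    def forward(self, f):
        # f: (batch, n_points, in_ch)
        out = self.W0(f)
        Lf = torch.einsum('knm,bmc->bknc', self.L, f)          # generators acting on f
        out = out + self.eps * torch.einsum('bknc,kcd->bnd', Lf, self.W)
        return out

layer = LConvLayer(n_points=64, in_ch=3, out_ch=8)
x = torch.randn(5, 64, 3)
print(layer(x).shape)   # torch.Size([5, 64, 8])
```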
Understanding the Representation Power of Graph Neural Networks in Learning Graph Topology
Dehmamy, Nima, Barabási, Albert-László, Yu, Rose
To deepen our understanding of graph neural networks, we investigate the representation power of Graph Convolutional Networks (GCN) through the looking glass of graph moments, a key property of graph topology encoding paths of various lengths. We find that GCNs are rather restrictive in learning graph moments. Without careful design, GCNs can fail miserably even with multiple layers and nonlinear activation functions. We theoretically analyze the expressiveness of GCNs, arriving at a modular GCN design that uses different propagation rules. Our modular design is capable of distinguishing graphs from different graph generation models for surprisingly small graphs, a notoriously difficult problem in network science. Our investigation suggests that depth is much more influential than width, with deeper GCNs being more capable of learning higher-order graph moments. Additionally, combining GCN modules with different propagation rules is critical to the representation power of GCNs.
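For concreteness, the sketch below computes the graph moments Tr(A^p) (closed-path counts) and several alternative propagation rules of the kind referred to above; the function names and the Erdős–Rényi test graph are illustrative.

```python
# Minimal sketch: graph moments Tr(A^p) count closed paths of length p, and a
# "modular" GCN layer can use different propagation rules
# (A, D^-1 A, or D^-1/2 A D^-1/2), whose combination matters for learning them.
import numpy as np

def graph_moments(A, p_max=3):
    """Return [Tr(A^1), ..., Tr(A^p_max)] for adjacency matrix A."""
    moments, Ap = [], np.eye(len(A))
    for _ in range(p_max):
        Ap = Ap @ A
        moments.append(np.trace(Ap))
    return moments

def propagation(A, rule="sym"):
    d = A.sum(axis=1)
    D_inv = np.diag(1.0 / np.maximum(d, 1e-12))
    if rule == "adj":
        return A                                  # plain adjacency
    if rule == "rw":
        return D_inv @ A                          # random-walk normalization
    return np.sqrt(D_inv) @ A @ np.sqrt(D_inv)    # symmetric normalization

# Erdős–Rényi example graph.
rng = np.random.default_rng(0)
A = (rng.random((20, 20)) < 0.2).astype(float)
A = np.triu(A, 1); A = A + A.T
print(graph_moments(A))                      # [Tr(A), Tr(A^2), Tr(A^3)] = [0, 2*edges, 6*triangles]
print(propagation(A, "rw").sum(axis=1)[:3])  # rows of D^-1 A sum to 1 (for non-isolated nodes)
```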
Separation of time scales and direct computation of weights in deep neural networks
Dehmamy, Nima, Rohani, Neda, Katsaggelos, Aggelos
Artificial intelligence is revolutionizing our lives at an ever-increasing pace. At the heart of this revolution are recent advancements in deep neural networks (DNNs), which learn to perform sophisticated, high-level tasks. However, training DNNs requires massive amounts of data and is very computationally intensive. Gaining analytical understanding of the solutions found by DNNs can help us devise more efficient training algorithms, replacing the commonly used method of stochastic gradient descent (SGD). We analyze the dynamics of SGD and show that, indeed, direct computation of the solutions is possible in many cases. We show that a high-performing setup used in DNNs introduces a separation of time scales in the training dynamics, allowing SGD to train layers from the lowest (closest to the input) to the highest. We then show that for each layer, the distribution of solutions found by SGD can be estimated using a class-based principal component analysis (PCA) of the layer's input. This finding allows us to forgo SGD entirely and directly derive the DNN parameters using this class-based PCA, which can be well estimated using significantly less data than SGD requires. We implement these results on the image datasets MNIST, CIFAR10, and CIFAR100 and find that, in fact, layers derived using our class-based PCA perform comparably to, or better than, neural networks of the same size and architecture trained using SGD. We also confirm that the class-based PCA often converges using a fraction of the data required for SGD. Thus, our method reduces training time both by requiring less training data than SGD and by removing layers from the costly backpropagation step of training.
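A rough sketch of the direct-computation idea, simplified and not the paper's exact procedure: a layer's weight matrix is assembled from the leading principal components computed within each class of the layer's inputs, with no gradient-based training.

```python
# Rough sketch (simplified): instead of training a layer with SGD, set its
# weights from a class-based PCA of the layer's inputs, i.e. from the leading
# principal directions computed within each class.
import numpy as np

def class_based_pca_weights(X, y, components_per_class=2):
    """X: (n_samples, n_features), y: integer class labels.
    Returns a weight matrix built from per-class principal components."""
    weights = []
    for c in np.unique(y):
        Xc = X[y == c]
        Xc = Xc - Xc.mean(axis=0)
        # Leading right singular vectors = principal directions of this class.
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        weights.append(Vt[:components_per_class])
    return np.vstack(weights)              # shape: (n_classes * k, n_features)

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
y = rng.integers(0, 3, size=300)
W = class_based_pca_weights(X, y)
hidden = np.maximum(X @ W.T, 0.0)          # use W as a (ReLU) layer, no SGD involved
print(W.shape, hidden.shape)               # (6, 20) (300, 6)
```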