Collaborating Authors

 anandkumar


Robustness, Novelty

Neural Information Processing Systems

We would like to thank all the reviewers for their comprehensive reviews. We clarify the major comments below. As noted in Sec. 6 (and suggested by [...] As discussed in Sec. 1, 1.1, 2-4, and Figure 1, TensorNOODL accomplishes [...] Therefore, it seems that leveraging tensor structure may increase the computational complexity. Thank you for this insight. Further, TensorNOODL requires the initial dictionary estimate to follow A.2 for exact recovery at a linear [...] Initializations which do not meet these conditions may still converge, albeit not at a linear rate.




Anima Anandkumar Highlights AI's Potential to Solve 'Hard Scientific Challenges'

TIME - Tech

Anima Anandkumar is using AI to help solve the world's challenges faster. She has used the technology to speed up prediction models in an effort to get ahead of extreme weather, and to work on sustainable nuclear fusion simulations so as to one day safely harness the energy source. Accepting a TIME100 AI Impact Award in Dubai on Monday, Anandkumar--a professor at the California Institute of Technology who was previously the senior director of AI research at Nvidia--credited her engineer parents with setting an example for her. "Having a mom who is an engineer was just such a great role model right at home." Her parents, who brought computerized manufacturing to her hometown in India, opened up her world, she said.

Sequential Transfer in Multi-armed Bandit with Finite Set of Models

Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill

Neural Information Processing Systems

Learning from prior tasks and transferring that experience to improve future performance is critical for building lifelong learning agents. Although results in supervised and reinforcement learning show that transfer may significantly improve the learning performance, most of the literature on transfer is focused on batch learning tasks. In this paper we study the problem of sequential transfer in online learning, notably in the multi-armed bandit framework, where the objective is to minimize the total regret over a sequence of tasks by transferring knowledge from prior tasks. We introduce a novel bandit algorithm based on a method-of-moments approach for estimating the possible tasks and derive regret bounds for it.


Fourier Neural Operators for Learning Dynamics in Quantum Spin Systems

Shah, Freya, Patti, Taylor L., Berner, Julius, Tolooshams, Bahareh, Kossaifi, Jean, Anandkumar, Anima

arXiv.org Artificial Intelligence

Fourier Neural Operators (FNOs) excel on tasks using functional data, such as those originating from partial differential equations. Such characteristics render them an effective approach for simulating the time evolution of quantum wavefunctions, which is a computationally challenging, yet coveted task for understanding quantum systems. In this manuscript, we use FNOs to model the evolution of random quantum spin systems, so chosen due to their representative quantum dynamics and minimal symmetry. We explore two distinct FNO architectures and examine their performance for learning and predicting time evolution using both random and low-energy input states. Additionally, we apply FNOs to a compact set of Hamiltonian observables ($\sim\text{poly}(n)$) instead of the entire $2^n$ quantum wavefunction, which greatly reduces the size of our inputs and outputs and, consequently, the requisite dimensions of the resulting FNOs. Moreover, this Hamiltonian observable-based method demonstrates that FNOs can effectively distill information from high-dimensional spaces into lower-dimensional spaces. The extrapolation of Hamiltonian observables to times later than those used in training is of particular interest, as this stands to fundamentally increase the simulatability of quantum systems past both the coherence times of contemporary quantum architectures and the circuit-depths of tractable tensor networks.


Reduced-Order Neural Operators: Learning Lagrangian Dynamics on Highly Sparse Graphs

Viswanath, Hrishikesh, Chang, Yue, Berner, Julius, Chen, Peter Yichen, Bera, Aniket

arXiv.org Artificial Intelligence

We present a neural operator architecture to simulate Lagrangian dynamics, such as fluid flow, granular flows, and elastoplasticity. Traditional numerical methods, such as the finite element method (FEM), suffer from long run times and large memory consumption. On the other hand, approaches based on graph neural networks are faster but still suffer from long computation times on dense graphs, which are often required for high-fidelity simulations. Our model, GIOROM (Graph Interaction Operator for Reduced-Order Modeling), learns temporal dynamics within a reduced-order setting, capturing spatial features from a highly sparse graph representation of the input and generalizing to arbitrary spatial locations during inference. The model is geometry-aware and discretization-agnostic and can generalize to different initial conditions, velocities, and geometries after training. We show that point clouds on the order of 100,000 points can be inferred from sparse graphs with $\sim$1000 points, with negligible change in computation time. We empirically evaluate our model on elastic solids, Newtonian fluids, non-Newtonian fluids, Drucker-Prager granular flows, and von Mises elastoplasticity. On these benchmarks, our approach results in a 25$\times$ speedup compared to other neural network-based physics simulators while delivering high-fidelity predictions of complex physical systems and showing better performance on most benchmarks. The code and the demos are provided at https://github.com/HrishikeshVish/GIOROM.


Estimating Mixture Models via Mixtures of Polynomials

Neural Information Processing Systems

Mixture modeling is a general technique for making any simple model more expressive through weighted combination. This generality and simplicity in part explain the success of the Expectation Maximization (EM) algorithm, in which updates are easy to derive for a wide class of mixture models. However, the likelihood of a mixture model is non-convex, so EM has no known global convergence guarantees. Recently, method of moments approaches have offered global guarantees for some mixture models, but they do not extend easily to the range of mixture models that exist. In this work, we present Polymom, a unifying framework based on method of moments in which estimation procedures are easily derivable, just as in EM. Polymom is applicable when the moments of a single mixture component are polynomials of the parameters. Our key observation is that the moments of the mixture model are a mixture of these polynomials, which allows us to cast estimation as a Generalized Moment Problem. We solve its relaxations using semidefinite optimization, and then extract parameters using ideas from computer algebra. This framework allows us to draw insights and apply tools from convex optimization, computer algebra and the theory of moments to study problems in statistical estimation.
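
As an illustration of the key observation (a worked example under assumed notation, not taken from the abstract), consider a mixture of $K$ one-dimensional Gaussians with mixing weights $\pi_k$ and per-component parameters $\theta_k = (\xi_k, \sigma_k^2)$; the symbols $\pi_k$, $\xi_k$, $\sigma_k^2$, and $f_n$ are chosen here for illustration. The first three moments of a single component are polynomials in its parameters:

$$
f_1(\theta) = \xi, \qquad
f_2(\theta) = \xi^2 + \sigma^2, \qquad
f_3(\theta) = \xi^3 + 3\,\xi\,\sigma^2 .
$$

The corresponding moments of the mixture are then weighted combinations of those same polynomials, evaluated at each component's parameters:

$$
\mathbb{E}[x^n] \;=\; \sum_{k=1}^{K} \pi_k \, f_n(\theta_k), \qquad n = 1, 2, 3 .
$$

Matching the right-hand side to empirical moments gives polynomial constraints on an unknown measure over $\theta$, which is the kind of problem the abstract describes as a Generalized Moment Problem.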


Online and Differentially-Private Tensor Decomposition

Neural Information Processing Systems

Tensor decomposition is an important tool for big data analysis. In this paper, we resolve many of the key algorithmic questions regarding robustness, memory efficiency, and differential privacy of tensor decomposition. We propose simple variants of the tensor power method which enjoy these strong properties. We present the first guarantees for an online tensor power method which has a linear memory requirement. Moreover, we present a noise-calibrated tensor power method with efficient privacy guarantees. At the heart of all these guarantees lies a careful perturbation analysis derived in this paper which improves upon the existing results significantly.
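
For readers unfamiliar with the tensor power method that these variants build on, the following is a minimal sketch of the plain (batch, noiseless) iteration for a symmetric third-order tensor. It is not the paper's online or differentially private variant; the function names `tensor_apply` and `tensor_power_method`, the toy tensor, and the iteration count are all assumptions made for illustration.

```python
import numpy as np

def tensor_apply(T, u):
    """Contract a symmetric 3-way tensor with u along two modes: T(I, u, u)."""
    return np.einsum("ijk,j,k->i", T, u, u)

def tensor_power_method(T, n_iter=50, seed=0):
    """Plain tensor power iteration from a random unit-norm start.

    Returns an (eigenvalue, eigenvector) estimate for one component.
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(n_iter):
        v = tensor_apply(T, u)
        u = v / np.linalg.norm(v)
    # Eigenvalue estimate: the full contraction T(u, u, u).
    lam = np.einsum("ijk,i,j,k->", T, u, u, u)
    return lam, u

# Toy orthogonally decomposable tensor: sum_i w_i * e_i (x) e_i (x) e_i,
# with hypothetical weights chosen only for this example.
d = 4
basis = np.eye(d)
weights = np.array([3.0, 2.0, 1.0, 0.5])
T = sum(w * np.einsum("i,j,k->ijk", v, v, v) for w, v in zip(weights, basis))

lam, u = tensor_power_method(T)
```

On an orthogonally decomposable tensor like this one, the iteration converges to one of the components; which one depends on the random start, not only on the size of the weights. Recovering all components usually adds deflation and random restarts, which are omitted here.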


Neural Operators for Accelerating Scientific Simulations and Design

Azizzadenesheli, Kamyar, Kovachki, Nikola, Li, Zongyi, Liu-Schiaffini, Miguel, Kossaifi, Jean, Anandkumar, Anima

arXiv.org Artificial Intelligence

Scientific discovery and engineering design are currently limited by the time and cost of physical experiments, selected mostly through trial-and-error and intuition that require deep domain expertise. Numerical simulations present an alternative to physical experiments but are usually infeasible for complex real-world domains due to the computational requirements of existing numerical methods. Artificial intelligence (AI) presents a potential paradigm shift by developing fast data-driven surrogate models. In particular, Neural Operators provide a principled AI framework for learning mappings between functions defined on continuous domains, e.g., spatiotemporal processes and partial differential equations (PDEs). They can extrapolate and predict solutions at new locations unseen during training, i.e., perform zero-shot super-resolution. Neural Operators can augment or even replace existing simulators in many applications, such as computational fluid dynamics, weather forecasting, and material modeling, while being 4-5 orders of magnitude faster. Further, Neural Operators can be integrated with physics and other domain constraints enforced at finer resolutions to obtain high-fidelity solutions and good generalization. Since Neural Operators are differentiable, they can directly optimize parameters for inverse design and other inverse problems. We believe that Neural Operators present a transformative approach to simulation and design, enabling rapid research and development.
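
To make the idea of resolution-independent operators concrete, here is a minimal NumPy sketch of a one-dimensional spectral convolution, the kind of building block used in Fourier Neural Operators. It is an illustration under stated assumptions rather than the authors' implementation: the function name `spectral_conv_1d`, the random stand-in weights, and the test signal are hypothetical, and a real layer would add learned weights, channels, a pointwise linear path, and a nonlinearity.

```python
import numpy as np

def spectral_conv_1d(u, weights):
    """Multiply the lowest Fourier modes of a periodic signal by complex weights.

    u       : real array of samples on a uniform periodic grid
    weights : complex array, one weight per retained mode (fewer than len(u) // 2)
    """
    n = u.shape[0]
    modes = weights.shape[0]
    # norm="forward" scales by 1/n, so coefficients do not depend on grid size.
    u_hat = np.fft.rfft(u, norm="forward")
    out_hat = np.zeros_like(u_hat)
    out_hat[:modes] = weights * u_hat[:modes]
    return np.fft.irfft(out_hat, n=n, norm="forward")

def signal(x):
    # Band-limited test input, chosen only for this example.
    return np.sin(2 * np.pi * x) + 0.5 * np.cos(2 * np.pi * 3 * x)

rng = np.random.default_rng(0)
modes = 8
# Random stand-ins for learned weights.
weights = rng.standard_normal(modes) + 1j * rng.standard_normal(modes)

x_coarse = np.arange(64) / 64
x_fine = np.arange(256) / 256

out_coarse = spectral_conv_1d(signal(x_coarse), weights)
out_fine = spectral_conv_1d(signal(x_fine), weights)

# The same weights applied on a 4x finer grid agree on the shared grid points.
same_on_shared_points = np.allclose(out_fine[::4], out_coarse)
```

Because the weights act on Fourier coefficients rather than on grid values, the same parameters define the same operator at both resolutions, which is the property behind evaluating a trained model on a finer grid than it was trained on.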