Pitt, David
Tensor-GaLore: Memory-Efficient Training via Gradient Tensor Decomposition
George, Robert Joseph, Pitt, David, Zhao, Jiawei, Kossaifi, Jean, Luo, Cheng, Tian, Yuandong, Anandkumar, Anima
We present Tensor-GaLore, a novel method for efficient training of neural networks with higher-order tensor weights. Many models, particularly those used in scientific computing, employ tensor-parameterized layers to capture complex, multidimensional relationships. When these methods are scaled to high-resolution problems, memory usage grows intractably, and matrix-based optimization methods lead to suboptimal performance and compression. We propose to work directly in the high-order, complex tensor parameter space by applying a tensor factorization to the gradients during optimization. We showcase its effectiveness on Fourier Neural Operators (FNOs), a class of models crucial for solving partial differential equations (PDEs), and provide theoretical analysis of the method. Across various PDE tasks, such as the Navier-Stokes and Darcy flow equations, Tensor-GaLore reduces optimizer memory usage by up to 75%. These savings demonstrate Tensor-GaLore's potential for AI-for-science applications.
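The idea of factorizing a gradient tensor, rather than flattening it into a matrix, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a truncated higher-order SVD (a Tucker-style factorization) as the decomposition, uses a real-valued random tensor in place of an actual FNO gradient, and every function name, shape, and rank below is hypothetical and chosen only for illustration.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize `tensor` so that axis `mode` indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Contract axis `mode` of `tensor` with `matrix` (shape: new_dim x old_dim)."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def hosvd_factors(grad, ranks):
    """Leading left singular vectors of each mode unfolding (truncated HOSVD)."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(grad, mode), full_matrices=False)
        factors.append(u[:, :rank])
    return factors

def project(grad, factors):
    """Project the full gradient onto the low-rank core."""
    core = grad
    for mode, u in enumerate(factors):
        core = mode_product(core, u.conj().T, mode)
    return core

def project_back(core, factors):
    """Map a core-sized tensor (e.g. an optimizer update) back to full shape."""
    full = core
    for mode, u in enumerate(factors):
        full = mode_product(full, u, mode)
    return full

# Hypothetical 4th-order gradient and per-mode ranks (assumptions, not from the paper).
rng = np.random.default_rng(0)
grad = rng.standard_normal((16, 16, 8, 8))
ranks = (4, 4, 4, 4)

factors = hosvd_factors(grad, ranks)
core = project(grad, factors)          # optimizer state could be kept at this size
approx = project_back(core, factors)   # same shape as the original gradient
```

In a scheme of this kind, optimizer statistics such as Adam's moment estimates would be stored at the shape of `core` rather than `grad`, which is where the memory reduction comes from. The actual savings depend on the chosen ranks and on the model.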
A Library for Learning Neural Operators
Kossaifi, Jean, Kovachki, Nikola, Li, Zongyi, Pitt, David, Liu-Schiaffini, Miguel, George, Robert Joseph, Bonev, Boris, Azizzadenesheli, Kamyar, Berner, Julius, Anandkumar, Anima
We present NeuralOperator, an open-source Python library for operator learning. Neural operators generalize neural networks to maps between function spaces instead of finite-dimensional Euclidean spaces. They can be trained and run for inference on input and output functions given at various discretizations, satisfying a discretization convergence property. Built on top of PyTorch, NeuralOperator provides all the tools for training and deploying neural operator models, as well as developing new ones, in a high-quality, tested, open-source package. It combines cutting-edge models and customizability with a gentle learning curve and a simple user interface for newcomers.
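To make the notion of working across discretizations concrete, here is a small NumPy sketch of a one-dimensional spectral (Fourier) layer applied to the same function sampled on two grids. It does not use the NeuralOperator API. The function `spectral_conv_1d`, the random weights, and the test function are all hypothetical, and the input is assumed to be periodic and band-limited so that both grids resolve it exactly.

```python
import numpy as np

def spectral_conv_1d(u, weights):
    """Scale the lowest len(weights) Fourier modes of `u` and drop the rest."""
    n = u.shape[-1]
    modes = len(weights)
    # norm="forward" divides by n, so coefficients do not depend on grid size.
    u_hat = np.fft.rfft(u, norm="forward")
    out_hat = np.zeros_like(u_hat)
    out_hat[:modes] = u_hat[:modes] * weights
    return np.fft.irfft(out_hat, n=n, norm="forward")

def sample(n):
    """Sample a band-limited periodic function on n equispaced points in [0, 1)."""
    x = np.arange(n) / n
    return np.sin(2 * np.pi * x) + 0.5 * np.cos(2 * np.pi * 3 * x)

# One set of "learned" weights (random here), shared across resolutions.
rng = np.random.default_rng(0)
weights = rng.standard_normal(4) + 1j * rng.standard_normal(4)

coarse_out = spectral_conv_1d(sample(64), weights)
fine_out = spectral_conv_1d(sample(256), weights)

# The fine-grid output, read off at the coarse grid points, matches the coarse output.
same_function = np.allclose(fine_out[::4], coarse_out)
```

Because the weights act on Fourier coefficients rather than on grid values, the same parameters define the same output function at either resolution in this idealized setting.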