
Block Broyden's Methods for Solving Nonlinear Equations

Neural Information Processing Systems

This paper studies quasi-Newton methods for solving nonlinear equations. We propose block variants of both the good and the bad Broyden's method, which enjoy explicit local superlinear convergence rates. Our block good Broyden's method has a faster condition-number-free convergence rate than existing Broyden's methods because it takes advantage of multiple-rank modifications of the Jacobian estimator. On the other hand, our block bad Broyden's method directly and provably estimates the inverse of the Jacobian, which reduces the computational cost of each iteration. Our theoretical results provide new insights into why the good Broyden's method outperforms the bad Broyden's method in most cases. The empirical results also demonstrate the superiority of our methods and validate our theoretical analysis.
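The classical rank-one updates that these block variants generalize can be sketched in NumPy. This is a minimal illustration of the good update (on the Jacobian estimate) versus the bad update (directly on the inverse-Jacobian estimate), not the paper's block methods; the test problem and starting values are illustrative.

```python
import numpy as np

def broyden_good(F, x0, J0, tol=1e-10, max_iter=50):
    """'Good' Broyden: rank-1 update of the Jacobian estimate B,
    so each step requires a linear solve B s = -F(x)."""
    x, B = x0.astype(float), J0.astype(float)
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        s = np.linalg.solve(B, -Fx)              # quasi-Newton step
        x_new = x + s
        y = F(x_new) - Fx                        # change in residual
        B += np.outer(y - B @ s, s) / (s @ s)    # good Broyden update
        x = x_new
    return x

def broyden_bad(F, x0, H0, tol=1e-10, max_iter=50):
    """'Bad' Broyden: rank-1 update of the inverse-Jacobian estimate H,
    so each step is only a matrix-vector product."""
    x, H = x0.astype(float), H0.astype(float)
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        s = -H @ Fx
        x_new = x + s
        y = F(x_new) - Fx
        H += np.outer(s - H @ y, y) / (y @ y)    # bad Broyden update
        x = x_new
    return x

# Illustrative test problem: solve x_i**2 = 2 componentwise
F = lambda x: x**2 - 2.0
x0 = np.array([1.5, 1.2])
root = broyden_good(F, x0, np.diag(2 * x0))      # exact Jacobian at x0
```

The trade-off the abstract discusses is visible here: the good variant pays a linear solve per step to update a Jacobian estimate, while the bad variant avoids the solve by maintaining the inverse directly.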



Task Descriptions and Training Settings

Neural Information Processing Systems

We provide a detailed description of all tasks and some additional details on the training of MDEQ. The entire dataset is divided into training (50K images) and testing (10K images) sets. We use two different training settings for evaluating the MDEQ model on CIFAR-10; in the second setting, we apply data augmentation to the input images. The dataset we use contains 1.2 million labeled training images from ImageNet. Each pixel is classified in a 19-way fashion for evaluation. CIFAR-10 classification models were trained on 1 GPU (including the baselines).



Open Source Differentiable ODE Solving Infrastructure

Singh, Rakshit Kr., Menezes, Aaron Rock, Irfan, Rida, Ramsundar, Bharath

arXiv.org Artificial Intelligence

Ordinary Differential Equations (ODEs) are widely used in physics, chemistry, and biology to model dynamic systems, including reaction kinetics, population dynamics, and biological processes. In this work, we integrate GPU-accelerated ODE solvers into the open-source DeepChem framework, making these tools easily accessible. These solvers support multiple numerical methods and are fully differentiable, enabling easy integration into more complex differentiable programs. We demonstrate the capabilities of our implementation through experiments on Lotka-Volterra predator-prey dynamics, pharmacokinetic compartment models, neural ODEs, and solving PDEs using reaction-diffusion equations. Our solvers achieved high accuracy with mean squared errors ranging from $10^{-4}$ to $10^{-6}$ and showed scalability in solving large systems with up to 100 compartments.
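The Lotka-Volterra system mentioned in the abstract can be integrated with a few lines of NumPy. The fixed-step RK4 integrator and parameter values below are a plain illustrative sketch, not the DeepChem API or the paper's GPU-accelerated solvers.

```python
import numpy as np

def rk4_solve(f, y0, t0, t1, n_steps):
    """Fixed-step classical RK4 integrator (a minimal stand-in for the
    differentiable solvers described in the abstract)."""
    h = (t1 - t0) / n_steps
    t, y = t0, np.asarray(y0, dtype=float)
    traj = [y.copy()]
    for _ in range(n_steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
        traj.append(y.copy())
    return np.array(traj)

# Lotka-Volterra predator-prey dynamics (illustrative parameters,
# not taken from the paper): prey x, predator y
def lotka_volterra(t, z, a=1.5, b=1.0, c=3.0, d=1.0):
    x, y = z
    return np.array([a * x - b * x * y, -c * y + d * x * y])

traj = rk4_solve(lotka_volterra, y0=[2.0, 1.0], t0=0.0, t1=5.0, n_steps=2000)
```

Because every operation here is a differentiable arithmetic step, the same structure ports directly to an autodiff framework, which is what makes such solvers composable with neural networks.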


Efficient Training of Deep Equilibrium Models

Nguyen, Bac, Mauch, Lukas

arXiv.org Artificial Intelligence

Deep equilibrium models (DEQs) have proven to be very powerful for learning data representations. The idea is to replace traditional (explicit) feedforward neural networks with an implicit fixed-point equation, which decouples the forward and backward passes. In particular, training DEQ layers becomes very memory-efficient via the implicit function theorem. However, backpropagation through DEQ layers still requires solving an expensive Jacobian-based equation. In this paper, we introduce a simple but effective strategy to avoid this computational burden. Our method reuses the Jacobian approximation built by Broyden's method during the forward pass to compute the gradients in the backward pass. Experiments show that simply re-using this approximation can significantly speed up training without causing any performance degradation.
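The reuse idea can be sketched in NumPy: run Broyden's method on g(z) = f(z) - z to find the fixed point, then keep the final inverse-Jacobian estimate H for the backward pass instead of solving the Jacobian-based equation from scratch. The toy layer, seed, and dimensions below are hypothetical, purely for illustration.

```python
import numpy as np

def deq_forward(f, z0, max_iter=100, tol=1e-10):
    """Find z* with f(z*) = z* via ('bad') Broyden on g(z) = f(z) - z.
    Returns the fixed point and H, the final estimate of
    (J_g)^{-1} = (df/dz - I)^{-1}, for reuse in the backward pass."""
    z = z0.copy()
    H = -np.eye(z.size)          # g'(z) ~ -I when f is a mild contraction
    g = f(z) - z
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        s = -H @ g
        z_new = z + s
        g_new = f(z_new) - z_new
        y = g_new - g
        H += np.outer(s - H @ y, y) / (y @ y)    # Broyden rank-1 update
        z, g = z_new, g_new
    return z, H

def deq_backward(H, grad_z):
    """Approximate backward pass: instead of solving (I - df/dz)^T v = grad_z
    exactly, reuse the forward-pass estimate H ~ (df/dz - I)^{-1}."""
    return -H.T @ grad_z

# Hypothetical toy layer: f(z) = tanh(W z + x) with a small random W
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((4, 4))
x = rng.standard_normal(4)
f = lambda z: np.tanh(W @ z + x)
z_star, H = deq_forward(f, np.zeros(4))
```

The point of the paper's strategy is that H is a byproduct of the forward solve, so the backward pass costs only a matrix-vector product rather than another iterative Jacobian solve.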