Symplectic Adjoint Method for Exact Gradient of Neural ODE with Minimal Memory
Neural Information Processing Systems
A neural network model of a differential equation, known as a neural ODE, has enabled the learning of continuous-time dynamical systems and probability distributions with high accuracy. A neural ODE evaluates the same network repeatedly during numerical integration, so the memory consumed by the backpropagation algorithm is proportional to the number of network uses times the network size. This holds even when a checkpointing scheme divides the computation graph into sub-graphs.
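The memory scaling described above can be illustrated with a small counting sketch. It is not the method of the paper: it assumes a forward Euler integrator, a toy tanh MLP, and a hand-rolled `tape` list standing in for the activations that naive backpropagation would retain. All names (`mlp_forward`, `euler_integrate`, `tape`) are hypothetical.

```python
import numpy as np

def mlp_forward(x, weights, tape):
    """Apply a small tanh MLP and record every layer input on `tape`."""
    h = x
    for W in weights:
        tape.append(h)        # naive backprop would need this value later
        h = np.tanh(W @ h)
    return h

def euler_integrate(x0, weights, n_steps, dt):
    """Forward Euler: x <- x + dt * f(x), reusing the same network each step."""
    tape = []
    x = x0
    for _ in range(n_steps):
        x = x + dt * mlp_forward(x, weights, tape)
    return x, tape

rng = np.random.default_rng(0)
dim, n_layers = 4, 3          # toy sizes, chosen only for illustration
weights = [rng.standard_normal((dim, dim)) * 0.1 for _ in range(n_layers)]
x0 = rng.standard_normal(dim)

for n_steps in (10, 20, 40):
    _, tape = euler_integrate(x0, weights, n_steps, dt=0.01)
    stored = sum(a.size for a in tape)   # floats retained for the backward pass
    print(f"steps={n_steps:3d}  stored floats={stored}")
```

In this toy setting the number of retained values is the number of network uses multiplied by the per-use activation count, so doubling the step count doubles the stored floats.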
Dec-24-2025, 17:50:40 GMT