exact gradient
Symplectic Adjoint Method for Exact Gradient of Neural ODE with Minimal Memory
A neural network model of a differential equation, the neural ODE, enables learning continuous-time dynamical systems and probability distributions with high accuracy. The neural ODE evaluates the same network repeatedly during numerical integration, so the memory consumption of the backpropagation algorithm is proportional to the number of evaluations times the network size. This holds even when a checkpointing scheme divides the computation graph into sub-graphs.
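To make the memory argument concrete, here is a minimal, hypothetical sketch (not the paper's symplectic adjoint method): a fixed-step Euler integration of a neural ODE in PyTorch, where torch.utils.checkpoint splits the computation graph into sub-graphs so that only the checkpointed states plus one sub-graph of activations are held at a time. The toy network, step counts, and segment sizes are illustrative assumptions.

```python
# Minimal sketch: gradient checkpointing over sub-graphs of a neural-ODE
# integration (illustrative only; not the symplectic adjoint method).
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

odefunc = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 2))  # assumed toy network

def euler_segment(y, dt, n_inner):
    # One sub-graph: n_inner Euler steps whose activations are recomputed
    # during the backward pass instead of being stored.
    for _ in range(n_inner):
        y = y + dt * odefunc(y)
    return y

y = torch.randn(16, 2, requires_grad=True)
dt, n_segments, n_inner = 0.01, 10, 10  # 100 network evaluations, 10 checkpoints
for _ in range(n_segments):
    y = checkpoint(euler_segment, y, dt, n_inner, use_reentrant=False)
loss = y.pow(2).sum()
loss.backward()  # peak memory ~ checkpointed states + one sub-graph of activations,
                 # rather than activations for all 100 network evaluations
```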
Backward-Friendly Optimization: Training Large Language Models with Approximate Gradients under Memory Constraints
Yang, Jing, Cai, Kaitong, Fan, Yijia, Yang, Yufeng, Wang, Keze
Full fine-tuning of Large Language Models (LLMs) is notoriously memory-intensive, primarily because conventional optimizers such as SGD or Adam assume access to exact gradients derived from cached activations. Existing solutions either alter the model architecture (e.g., reversible networks) or trade memory for computation (e.g., activation checkpointing), but the optimizer itself remains untouched. In this work, we introduce GradLite, a backward-friendly optimizer that relaxes the requirement of exact gradients, enabling efficient training even when intermediate activations are aggressively discarded or approximated. GradLite leverages two key techniques: (i) low-rank Jacobian approximation, which reduces the dimensionality of backpropagated error signals, and (ii) error-feedback correction, which accumulates and compensates approximation errors across iterations to preserve convergence guarantees. We provide a theoretical analysis showing that GradLite maintains unbiased gradient estimates with bounded variance, ensuring convergence rates comparable to Adam. Empirically, GradLite reduces optimizer-state and activation memory consumption by up to 50\% without architectural changes, and achieves on-par or superior downstream performance on reasoning (MMLU, GSM8K), multilingual, and dialogue benchmarks compared to checkpointing and optimizer-centric baselines (LoMo, GaLore).
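As a rough illustration of the two ingredients named in the abstract, the following is a hedged sketch (not the authors' GradLite implementation): a rank-r SVD projection of a gradient matrix stands in for the low-rank Jacobian approximation, and an error-feedback buffer re-injects the discarded residual on the next step. The rank, the SVD-based projector, and the plain SGD update are illustrative assumptions.

```python
# Hedged sketch: low-rank gradient approximation with error feedback
# (illustrative only; not the GradLite optimizer itself).
import torch

def low_rank_step(param, grad, err_buf, lr=1e-3, r=8):
    g = grad + err_buf                     # add back previously discarded error
    U, S, Vh = torch.linalg.svd(g, full_matrices=False)
    g_lr = (U[:, :r] * S[:r]) @ Vh[:r, :]  # rank-r approximation of the gradient
    err_buf = g - g_lr                     # remember what the approximation dropped
    param = param - lr * g_lr              # simple SGD update with the approximate gradient
    return param, err_buf

W = torch.randn(256, 128)
err = torch.zeros_like(W)
for _ in range(3):
    grad = torch.randn_like(W)             # stand-in for a real backpropagated gradient
    W, err = low_rank_step(W, grad, err)
```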
A Complete Pipeline for deploying SNNs with Synaptic Delays on Loihi 2
Mészáros, Balázs, Knight, James C., Timcheck, Jonathan, Nowotny, Thomas
Spiking Neural Networks (SNNs) are attracting increased attention as a more energy-efficient alternative to traditional Artificial Neural Networks for edge computing, and neuromorphic hardware can significantly reduce their energy requirements. Here, we present a complete pipeline: efficient event-based training of SNNs with synaptic delays on GPUs and deployment on Intel's Loihi 2 neuromorphic chip. We evaluate our approach on keyword recognition tasks using the Spiking Heidelberg Digits and Spiking Speech Commands datasets, demonstrating that our algorithm can enhance classification accuracy compared to architectures without delays. Our benchmarking indicates almost no accuracy loss between the GPU and Loihi 2 implementations, while classification on Loihi 2 is up to 18× faster and uses 250× less energy than on an NVIDIA Jetson Orin Nano.
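For readers unfamiliar with per-synapse delays, the sketch below is a hypothetical, NumPy-only illustration (not the paper's GPU training code or its Loihi 2 deployment): a leaky integrate-and-fire layer in which each synapse carries an integer delay, realized as a circular buffer of future input currents. All sizes, the input spike rate, and the delay range are assumptions.

```python
# Illustrative sketch: leaky integrate-and-fire neurons with per-synapse
# integer delays, implemented via a circular buffer of scheduled currents.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, max_delay, T = 8, 4, 5, 100
w = rng.standard_normal((n_in, n_out)) * 0.5
delay = rng.integers(0, max_delay, size=(n_in, n_out))  # per-synapse delay in timesteps
buf = np.zeros((max_delay, n_out))                      # currents scheduled for future steps
v, tau, v_th = np.zeros(n_out), 20.0, 1.0

for t in range(T):
    spikes_in = (rng.random(n_in) < 0.05).astype(float)  # stand-in random input spikes
    # route each input spike to the buffer slot its synaptic delay points at
    for i in np.nonzero(spikes_in)[0]:
        for j in range(n_out):
            buf[(t + delay[i, j]) % max_delay, j] += w[i, j]
    v = v * np.exp(-1.0 / tau) + buf[t % max_delay]      # leak plus delayed input current
    buf[t % max_delay] = 0.0                             # clear the consumed slot
    spikes_out = v >= v_th
    v[spikes_out] = 0.0                                  # reset neurons that fired
```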
Exact Gradients for Stochastic Spiking Neural Networks Driven by Rough Signals
We introduce a mathematically rigorous framework based on rough path theory to model stochastic spiking neural networks (SSNNs) as stochastic differential equations with event discontinuities (Event SDEs) and driven by càdlàg rough paths. Our formalism is general enough to allow for potential jumps to be present both in the solution trajectories as well as in the driving noise. We then identify a set of sufficient conditions ensuring the existence of pathwise gradients of solution trajectories and event times with respect to the network's parameters and show how these gradients satisfy a recursive relation. Furthermore, we introduce a general-purpose loss function defined by means of a new class of signature kernels indexed on càdlàg rough paths and use it to train SSNNs as generative models. We provide an end-to-end autodifferentiable solver for Event SDEs and make its implementation available as part of the \texttt{diffrax} library.
Accelerated Mini-batch Randomized Block Coordinate Descent Method
Tuo Zhao, Mo Yu, Yiming Wang, Raman Arora, Han Liu
We consider regularized empirical risk minimization problems. In particular, we minimize the sum of a smooth empirical risk function and a nonsmooth regularization function. When the regularization function is block separable, we can solve the minimization problem in a randomized block coordinate descent (RBCD) manner. Existing RBCD methods usually decrease the objective value by exploiting the partial gradient of a randomly selected block of coordinates in each iteration. They therefore require all data to be accessible so that the partial gradient of the selected block can be computed exactly.
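To make the setting concrete, the following is a minimal sketch of plain randomized block coordinate descent with a proximal (soft-thresholding) step on an l1-regularized least-squares problem; it uses all data for the partial gradient and is not the paper's accelerated mini-batch variant. The block count, step size, and regularization weight are illustrative choices.

```python
# Minimal sketch: randomized block coordinate proximal gradient descent
# for min_x (1/2n)||Ax - y||^2 + lam * ||x||_1  (block-separable l1 term).
import numpy as np

rng = np.random.default_rng(0)
n, d, n_blocks = 200, 50, 10
A, y = rng.standard_normal((n, d)), rng.standard_normal(n)
lam, x = 0.1, np.zeros(d)
blocks = np.array_split(np.arange(d), n_blocks)
L = np.linalg.norm(A, 2) ** 2 / n                 # crude Lipschitz bound shared by all blocks

for _ in range(2000):
    b = blocks[rng.integers(n_blocks)]            # pick one block uniformly at random
    grad_b = A[:, b].T @ (A @ x - y) / n          # partial gradient w.r.t. that block only
    z = x[b] - grad_b / L                         # gradient step on the selected block
    x[b] = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # prox of l1 (soft-threshold)
```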