AITopics

Genre: Research Report (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

We note that these results are about two of the most commonly used architecture modifications for RNNs. First, the gating mechanism is ubiquitous in RNNs, and usually thought of as a heuristic for smoothing optimization [28]. Second, many of the effective large-scale RNNs use linear (gated) recurrences and deeper models, which is usually thought of as a heuristic for computational efficiency [5]. Our results suggest that neither of these are heuristics after all, and arise from standard ways to approximate ODEs. To be more specific, we show that: 19 Table 6: A summary of the characteristics of popular RNN methods and their approximation mechanisms for capturing the dynamics x(t) = x(t) + f(t,x(t)) (equation (14)). The LSSL entries are for the very specific case with order N = 1 and A= 1,B = 1,C = 1,D= 0; LSSLs are more general.

artificial intelligence, machine learning, matrix, (19 more...)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsFeb-18-2026, 15:57:32 GMT

f162fa05675e3db4a733aafc081653cf-Paper-Conference.pdf

arxiv preprint arxiv, machine learning, natural language, (16 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Japan (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

arXiv.org Machine LearningOct-22-2025

The Picard-Lagrange Framework for Higher-Order Langevin Monte Carlo

Mahajan, Jaideep, Zhang, Kaihong, Liang, Feng, Liu, Jingbo

Sampling from log-concave distributions is a central problem in statistics and machine learning. Prior work establishes theoretical guarantees for Langevin Monte Carlo algorithm based on overdamped and underdamped Langevin dynamics and, more recently, some third-order variants. In this paper, we introduce a new sampling algorithm built on a general $K$th-order Langevin dynamics, extending beyond second- and third-order methods. To discretize the $K$th-order dynamics, we approximate the drift induced by the potential via Lagrange interpolation and refine the node values at the interpolation points using Picard-iteration corrections, yielding a flexible scheme that fully utilizes the acceleration of higher-order Langevin dynamics. For targets with smooth, strongly log-concave densities, we prove dimension-dependent convergence in Wasserstein distance: the sampler achieves $\varepsilon$-accuracy within $\widetilde O(d^{\frac{K-1}{2K-3}}\varepsilon^{-\frac{2}{2K-3}})$ gradient evaluations for $K \ge 3$. To our best knowledge, this is the first sampling algorithm achieving such query complexity. The rate improves with the order $K$ increases, yielding better rates than existing first to third-order approaches.

artificial intelligence, derivative, machine learning, (16 more...)

arXiv.org Machine Learning

2510.18242

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing SystemsOct-10-2025, 21:09:54 GMT

f162fa05675e3db4a733aafc081653cf-Paper-Conference.pdf

arxiv preprint arxiv, diffusion model, implementation, (13 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Japan (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Gonzalez, Xavier, Buchanan, E. Kelly, Lee, Hyun Dong, Liu, Jerry Weihong, Wang, Ke Alexander, Zoltowski, David M., Ré, Christopher, Linderman, Scott W.

A Unifying Framework for Parallelizing Sequential Models with Linear Dynamical Systems

arXiv.org Artificial IntelligenceSep-29-2025

Harnessing parallelism in seemingly sequential models is a central challenge for modern machine learning. Several approaches have been proposed for evaluating sequential processes in parallel using fixed-point methods, like Newton, Picard, and Jacobi iterations. In this work, we show that these methods can be understood within a common framework based on linear dynamical systems (LDSs), where different iteration schemes arise naturally as approximate linearizations of a nonlinear recursion. This unifying view highlights shared principles behind these techniques and clarifies when particular fixed-point methods are most likely to be effective. By bridging diverse algorithms through the language of LDSs, our framework provides a clearer theoretical foundation for parallelizing sequential models and points toward new opportunities for efficient and scalable computation.

artificial intelligence, machine learning, natural language, (19 more...)

2509.21716

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Chen, Yan-Ting, Chen, Hao-Wei, Hsiao, Tsu-Ching, Lee, Chun-Yi

Parallel Sampling of Diffusion Models on $SO(3)$

arXiv.org Artificial IntelligenceJul-15-2025

In this paper, we design an algorithm to accelerate the diffusion process on the $SO(3)$ manifold. The inherently sequential nature of diffusion models necessitates substantial time for denoising perturbed data. To overcome this limitation, we proposed to adapt the numerical Picard iteration for the $SO(3)$ space. We demonstrate our algorithm on an existing method that employs diffusion models to address the pose ambiguity problem. Moreover, we show that this acceleration advantage occurs without any measurable degradation in task reward. The experiments reveal that our algorithm achieves a speed-up of up to 4.9$\times$, significantly reducing the latency for generating a single sample.

artificial intelligence, iteration, machine learning, (13 more...)

2507.10347

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

arXiv.org Artificial IntelligenceMay-13-2025

You Only Look One Step: Accelerating Backpropagation in Diffusion Sampling with Gradient Shortcuts

Dou, Hongkun, Li, Zeyu, Jiang, Xingyu, Li, Hongjue, Yang, Lijun, Yao, Wen, Deng, Yue

Diffusion models (DMs) have recently demonstrated remarkable success in modeling large-scale data distributions. However, many downstream tasks require guiding the generated content based on specific differentiable metrics, typically necessitating backpropagation during the generation process. This approach is computationally expensive, as generating with DMs often demands tens to hundreds of recursive network calls, resulting in high memory usage and significant time consumption. In this paper, we propose a more efficient alternative that approaches the problem from the perspective of parallel denoising. We show that full backpropagation throughout the entire generation process is unnecessary. The downstream metrics can be optimized by retaining the computational graph of only one step during generation, thus providing a shortcut for gradient propagation. The resulting method, which we call Shortcut Diffusion Optimization (SDO), is generic, high-performance, and computationally lightweight, capable of optimizing all parameter types in diffusion sampling. We demonstrate the effectiveness of SDO on several real-world tasks, including controlling generation by optimizing latent and aligning the DMs by fine-tuning network parameters. Compared to full backpropagation, our approach reduces computational costs by $\sim 90\%$ while maintaining superior performance. Code is available at https://github.com/deng-ai-lab/SDO.

artificial intelligence, diffusion model, machine learning, (15 more...)

2505.07477

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Zhou, Huanjian, Sugiyama, Masashi

Parallel simulation for sampling under isoperimetry and score-based diffusion models

arXiv.org Artificial IntelligenceDec-12-2024

In recent years, there has been a surge of interest in proving discretization bounds for sampling under isoperimetry and for diffusion models. As data size grows, reducing the iteration cost becomes an important goal. Inspired by the great success of the parallel simulation of the initial value problem in scientific computation, we propose parallel Picard methods for sampling tasks. Rigorous theoretical analysis reveals that our algorithm achieves better dependence on dimension $d$ than prior works in iteration complexity (i.e., reduced from $\widetilde{O}(\log^2 d)$ to $\widetilde{O}(\log d)$), which is even optimal for sampling under isoperimetry with specific iteration complexity. Our work highlights the potential advantages of simulation methods in scientific computation for dynamics-based sampling and diffusion models.

diffusion model, iteration, time slice, (15 more...)