AITopics | jacobi

Collaborating Authors

jacobi

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Rethinking the Variational Interpretation of Accelerated Optimization Methods

Neural Information Processing SystemsAug-15-2025, 07:22:02 GMT

The continuous-time model of Nesterov's momentum provides a thought-provoking perspective for understanding the nature of the acceleration phenomenon in convex

lagrangian, nesterov, variation, (15 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.06)
Europe > Switzerland > Zürich > Zürich (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.40)

Add feedback

Extracting Complex Topology from Multivariate Functional Approximation: Contours, Jacobi Sets, and Ridge-Valley Graphs

Ma, Guanqun, Lenz, David, Guo, Hanqi, Peterka, Tom, Wang, Bei

arXiv.org Artificial IntelligenceAug-12-2025

Implicit continuous models, such as functional models and implicit neural networks, are an increasingly popular method for replacing discrete data representations with continuous, high-order, and differentiable surrogates. These models offer new perspectives on the storage, transfer, and analysis of scientific data. In this paper, we introduce the first framework to directly extract complex topological features -- contours, Jacobi sets, and ridge-valley graphs -- from a type of continuous implicit model known as multivariate functional approximation (MFA). MFA replaces discrete data with continuous piecewise smooth functions. Given an MFA model as the input, our approach enables direct extraction of complex topological features from the model, without reverting to a discrete representation of the model. Our work is easily generalizable to any continuous implicit model that supports the queries of function values and high-order derivatives. Our work establishes the building blocks for performing topological data analysis and visualization on implicit continuous models.

artificial intelligence, machine learning, mfa model, (19 more...)

arXiv.org Artificial Intelligence

2508.07637

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.46)

Industry:

Energy (0.46)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.46)

Add feedback

Faster Diffusion Models via Higher-Order Approximation

Li, Gen, Zhou, Yuchen, Wei, Yuting, Chen, Yuxin

arXiv.org Machine LearningJul-1-2025

In this paper, we explore provable acceleration of diffusion models without any additional retraining. Focusing on the task of approximating a target data distribution in $\mathbb{R}^d$ to within $\varepsilon$ total-variation distance, we propose a principled, training-free sampling algorithm that requires only the order of $$ d^{1+2/K} \varepsilon^{-1/K} $$ score function evaluations (up to log factor) in the presence of accurate scores, where $K$ is an arbitrarily large fixed integer. This result applies to a broad class of target data distributions, without the need for assumptions such as smoothness or log-concavity. Our theory is robust vis-a-vis inexact score estimation, degrading gracefully as the score estimation error increases -- without demanding higher-order smoothness on the score estimates as assumed in previous work. The proposed algorithm draws insight from high-order ODE solvers, leveraging high-order Lagrange interpolation and successive refinement to approximate the integral derived from the probability flow ODE.

artificial intelligence, log 2, machine learning, (15 more...)

arXiv.org Machine Learning

2506.24042

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding

Song, Wenxuan, Chen, Jiayi, Ding, Pengxiang, Huang, Yuxin, Zhao, Han, Wang, Donglin, Li, Haoang

arXiv.org Artificial IntelligenceJun-17-2025

In recent years, Vision-Language-Action (VLA) models have become a vital research direction in robotics due to their impressive multimodal understanding and generalization capabilities. Despite the progress, their practical deployment is severely constrained by inference speed bottlenecks, particularly in high-frequency and dexterous manipulation tasks. While recent studies have explored Jacobi decoding as a more efficient alternative to traditional autoregressive decoding, its practical benefits are marginal due to the lengthy iterations. To address it, we introduce consistency distillation training to predict multiple correct action tokens in each iteration, thereby achieving acceleration. Besides, we design mixed-label supervision to mitigate the error accumulation during distillation. Although distillation brings acceptable speedup, we identify that certain inefficient iterations remain a critical bottleneck. To tackle this, we propose an early-exit decoding strategy that moderately relaxes convergence conditions, which further improves average inference efficiency. Experimental results show that the proposed method achieves more than 4 times inference acceleration across different baselines while maintaining high task success rates in both simulated and real-world robot tasks. These experiments validate that our approach provides an efficient and general paradigm for accelerating multimodal decision-making in robotics. Our project page is available at https://irpn-eai.github.io/CEED-VLA/.

artificial intelligence, ceed-vla, iteration, (16 more...)

arXiv.org Artificial Intelligence

2506.13725

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Robots > Manipulation (0.34)

Add feedback

Inference Acceleration of Autoregressive Normalizing Flows by Selective Jacobi Decoding

Zhang, Jiaru, Lu, Juanwu, Wang, Ziran, Zhang, Ruqi

arXiv.org Artificial IntelligenceJun-2-2025

Normalizing flows are promising generative models with advantages such as theoretical rigor, analytical log-likelihood computation, and end-to-end training. However, the architectural constraints to ensure invertibility and tractable Jacobian computation limit their expressive power and practical usability. Recent advancements utilize autoregressive modeling, significantly enhancing expressive power and generation quality. However, such sequential modeling inherently restricts parallel computation during inference, leading to slow generation that impedes practical deployment. In this paper, we first identify that strict sequential dependency in inference is unnecessary to generate high-quality samples. We observe that patches in sequential modeling can also be approximated without strictly conditioning on all preceding patches. Moreover, the models tend to exhibit low dependency redundancy in the initial layer and higher redundancy in subsequent layers. Leveraging these observations, we propose a selective Jacobi decoding (SeJD) strategy that accelerates autoregressive inference through parallel iterative optimization. Theoretical analyses demonstrate the method's superlinear convergence rate and guarantee that the number of iterations required is no greater than the original sequential approach. Empirical evaluations across multiple datasets validate the generality and effectiveness of our acceleration technique. Experiments demonstrate substantial speed improvements up to 4.7 times faster inference while keeping the generation quality and fidelity.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.24791

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback

Adaptivity and Convergence of Probability Flow ODEs in Diffusion Generative Models

Tang, Jiaqi, Yan, Yuling

arXiv.org Machine LearningJan-30-2025

Score-based generative models, which transform noise into data by learning to reverse a diffusion process, have become a cornerstone of modern generative AI. This paper contributes to establishing theoretical guarantees for the probability flow ODE, a widely used diffusion-based sampler known for its practical efficiency. While a number of prior works address its general convergence theory, it remains unclear whether the probability flow ODE sampler can adapt to the low-dimensional structures commonly present in natural image data. We demonstrate that, with accurate score function estimation, the probability flow ODE sampler achieves a convergence rate of $O(k/T)$ in total variation distance (ignoring logarithmic factors), where $k$ is the intrinsic dimension of the target distribution and $T$ is the number of iterations. This dimension-free convergence rate improves upon existing results that scale with the typically much larger ambient dimension, highlighting the ability of the probability flow ODE sampler to exploit intrinsic low-dimensional structures in the target distribution for faster sampling.

artificial intelligence, machine learning, natural language, (13 more...)

arXiv.org Machine Learning

2501.18863

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > Austria (0.04)

Genre:

Overview (0.67)
Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Add feedback

Low-dimensional adaptation of diffusion models: Convergence in total variation

Liang, Jiadong, Huang, Zhihan, Chen, Yuxin

arXiv.org Machine LearningJan-22-2025

This paper investigates how diffusion generative models leverage (unknown) low-dimensional structure to accelerate sampling. Focusing on two mainstream samplers -- the denoising diffusion implicit model (DDIM) and the denoising diffusion probabilistic model (DDPM) -- and assuming accurate score estimates, we prove that their iteration complexities are no greater than the order of $k/\varepsilon$ (up to some log factor), where $\varepsilon$ is the precision in total variation distance and $k$ is some intrinsic dimension of the target distribution. Our results are applicable to a broad family of target distributions without requiring smoothness or log-concavity assumptions. Further, we develop a lower bound that suggests the (near) necessity of the coefficients introduced by Ho et al.(2020) and Song et al.(2020) in facilitating low-dimensional adaptation. Our findings provide the first rigorous evidence for the adaptivity of the DDIM-type samplers to unknown low-dimensional structure, and improve over the state-of-the-art DDPM theory regarding total variation convergence.

artificial intelligence, arxiv preprint arxiv, machine learning, (16 more...)

arXiv.org Machine Learning

2501.12982

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

A Sharp Convergence Theory for The Probability Flow ODEs of Diffusion Models

Li, Gen, Wei, Yuting, Chi, Yuejie, Chen, Yuxin

arXiv.org Machine LearningAug-5-2024

Diffusion models, which convert noise into new data instances by learning to reverse a diffusion process, have become a cornerstone in contemporary generative modeling. In this work, we develop non-asymptotic convergence theory for a popular diffusion-based sampler (i.e., the probability flow ODE sampler) in discrete time, assuming access to $\ell_2$-accurate estimates of the (Stein) score functions. For distributions in $\mathbb{R}^d$, we prove that $d/\varepsilon$ iterations -- modulo some logarithmic and lower-order terms -- are sufficient to approximate the target distribution to within $\varepsilon$ total-variation distance. This is the first result establishing nearly linear dimension-dependency (in $d$) for the probability flow ODE sampler. Imposing only minimal assumptions on the target data distribution (e.g., no smoothness assumption is imposed), our results also characterize how $\ell_2$ score estimation errors affect the quality of the data generation processes. In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach without the need of resorting to SDE and ODE toolboxes.

arxiv preprint arxiv, jacobi, log 3, (15 more...)

arXiv.org Machine Learning

2408.0232

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

AI is poised to automate today's most mundane manual warehouse task

MIT Technology ReviewJul-11-2024, 13:00:00 GMT

After much trial and error, Jacobi's founders, including roboticist Ken Goldberg, say they've cracked it. Their software, built upon research from a paper they published in Science Robotics in 2020, is designed to work with the four leading makers of robotic palletizing arms. It uses deep learning to generate a "first draft" of how an arm might move an item onto the pallet. Then it uses more traditional robotics methods, like optimization, to check whether the movement can be done safely and without glitches. Jacobi aims to replace the legacy methods customers are currently using to train their bots.

artificial intelligence, jacobi, machine learning, (8 more...)

MIT Technology Review

Country: North America > United States > Michigan (0.06)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

Add feedback

Make Some Noise: Unlocking Language Model Parallel Inference Capability through Noisy Training

Wang, Yixuan, Luo, Xianzhen, Wei, Fuxuan, Liu, Yijun, Zhu, Qingfu, Zhang, Xuanyu, Yang, Qing, Xu, Dongliang, Che, Wanxiang

arXiv.org Artificial IntelligenceJun-25-2024

Existing speculative decoding methods typically require additional model structure and training processes to assist the model for draft token generation. This makes the migration of acceleration methods to the new model more costly and more demanding on device memory. To address this problem, we propose the Make Some Noise (MSN) training framework as a replacement for the supervised fine-tuning stage of the large language model. The training method simply introduces some noise at the input for the model to learn the denoising task. It significantly enhances the parallel decoding capability of the model without affecting the original task capability. In addition, we propose a tree-based retrieval-augmented Jacobi (TR-Jacobi) decoding strategy to further improve the inference speed of MSN models. Experiments in both the general and code domains have shown that MSN can improve inference speed by 2.3-2.7x times without compromising model performance. The MSN model also achieves comparable acceleration ratios to the SOTA model with additional model structure on Spec-Bench.

arxiv preprint arxiv, jacobi, noise segment, (14 more...)

arXiv.org Artificial Intelligence

2406.17404

Country:

North America > United States (0.05)
Asia > China > Heilongjiang Province > Harbin (0.04)
Europe > Monaco (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback