AITopics | gradient estimation

gradient estimation

Fast Zeroth-Order Convex Optimization with Quantum Gradient Methods

Neural Information Processing Systems, Jun-18-2026, 18:57:15 GMT

We study quantum algorithms based on quantum (sub)gradient estimation using noisy function evaluation oracles, and demonstrate the first dimension-independent query complexities (up to poly-logarithmic factors) for zeroth-order convex optimization in both smooth and nonsmooth settings. Interestingly, only using noisy function evaluation oracles, we match the first-order query complexities of classical gradient descent, thereby exhibiting exponential separation between quantum and classical zeroth-order optimization. We then generalize these algorithms to work in non-Euclidean settings by using quantum (sub)gradient estimation to instantiate mirror descent and its variants, including dual averaging and mirror prox. By leveraging a connection between semidefinite programming and eigenvalue optimization, we use our quantum mirror descent method to give a new quantum algorithm for solving semidefinite programs, linear programs, and zero-sum games. We identify a parameter regime in which our zero-sum games algorithm is faster than any existing classical or quantum approach.
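As a classical point of reference for the mirror descent and zero-sum games mentioned above, the following is a minimal sketch of entropic mirror descent (multiplicative weights) on a small matrix game. It is not the quantum algorithm from the paper: it uses exact gradients rather than quantum (sub)gradient estimation, and the function names, step size, and iteration count are illustrative assumptions.

```python
import math


def entropic_step(p, grad, eta):
    """One mirror descent step on the simplex with the entropy mirror map."""
    w = [pi * math.exp(-eta * g) for pi, g in zip(p, grad)]
    total = sum(w)
    return [wi / total for wi in w]


def solve_zero_sum(A, steps=2000, eta=0.05):
    """Approximate min_x max_y x^T A y by averaging the iterates of both players.

    `steps` and `eta` are hypothetical defaults chosen for small toy games.
    """
    m, n = len(A), len(A[0])
    x = [1.0 / m] * m
    y = [1.0 / n] * n
    x_avg = [0.0] * m
    y_avg = [0.0] * n
    for _ in range(steps):
        # Accumulate the running average of the current iterates.
        x_avg = [a + xi / steps for a, xi in zip(x_avg, x)]
        y_avg = [a + yi / steps for a, yi in zip(y_avg, y)]
        # Exact gradients of x^T A y with respect to x and y.
        gx = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]
        gy = [sum(A[i][j] * x[i] for i in range(m)) for j in range(n)]
        x = entropic_step(x, gx, eta)                # row player minimizes
        y = entropic_step(y, [-g for g in gy], eta)  # column player maximizes
    return x_avg, y_avg
```

Calling `solve_zero_sum([[2.0, -1.0], [-1.0, 1.0]])` returns a pair of averaged mixed strategies; the duality gap of the averages shrinks as `steps` grows and `eta` is reduced accordingly.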

artificial intelligence, machine learning, natural language, (18 more...)

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Banking & Finance (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)

Fast Zeroth-Order Convex Optimization with Quantum Gradient Methods

Neural Information Processing Systems, Jun-12-2026, 22:00:56 GMT

artificial intelligence, machine learning, proceedings, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

Computation and Memory-Efficient Model Compression with Gradient Reweighting

Neural Information Processing Systems, Jun-11-2026, 23:15:44 GMT

Pruning is a commonly employed technique for deep neural networks (DNNs) that compresses the model to reduce computational and memory costs during inference. In contrast to conventional neural networks, large language models (LLMs) pose a unique challenge regarding pruning efficiency due to their substantial computational and memory demands. Existing methods, particularly optimization-based ones, often require considerable computational resources in gradient estimation because they cannot effectively leverage the weight sparsity of the intermediate pruned network to lower computation and memory costs in each iteration. The fundamental challenge lies in the need to frequently instantiate intermediate pruned sub-models to achieve these savings, a task that becomes infeasible even for moderately sized neural networks. To this end, this paper proposes a novel pruning method for DNNs that is both computationally and memory-efficient.

large language model, machine learning, natural language, (10 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.38)

Random Noise Defense Against Query-Based Black-Box Attacks

Neural Information Processing Systems, Apr-25-2026, 14:05:44 GMT

Query-based black-box attacks pose serious threats to machine learning models in many real applications. In this work, we study a lightweight defense method, dubbed Random Noise Defense (RND), which adds proper Gaussian noise to each query. We conduct a theoretical analysis of the effectiveness of RND against query-based black-box attacks and the corresponding adaptive attacks. Our theoretical results reveal that the defense performance of RND is determined by the magnitude ratio between the noise induced by RND and the noise added by the attackers for gradient estimation or local search. A large magnitude ratio leads to stronger defense performance of RND, and it is also critical for mitigating adaptive attacks. Based on our analysis, we further propose to combine RND with a plausible Gaussian augmentation Fine-tuning (RND-GF). This enables RND to add larger noise to each query while maintaining clean accuracy, yielding a better trade-off between clean accuracy and defense performance. Additionally, RND can be flexibly combined with existing defense methods, such as adversarial training (AT), to further boost adversarial robustness. Extensive experiments on CIFAR-10 and ImageNet verify our theoretical findings and the effectiveness of RND and RND-GF.
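The core mechanism described above, adding Gaussian noise to each query, can be sketched as a thin wrapper around a model. This is a toy illustration under stated assumptions, not the authors' implementation: `toy_model`, `rnd_defended`, `finite_difference`, and the parameter names `nu` (defender noise scale) and `mu` (attacker step size) are all hypothetical.

```python
import random


def toy_model(x):
    # Hypothetical stand-in for a classifier score; not from the paper.
    return sum(xi * xi for xi in x)


def rnd_defended(model, nu, rng=None):
    """Wrap `model` so each query input gets independent N(0, nu^2) noise per coordinate."""
    if rng is None:
        rng = random.Random()

    def query(x):
        noisy = [xi + nu * rng.gauss(0.0, 1.0) for xi in x]
        return model(noisy)

    return query


def finite_difference(query, x, mu, i):
    """Attacker-side estimate of the i-th partial derivative using step size `mu`."""
    shifted = list(x)
    shifted[i] += mu
    return (query(shifted) - query(x)) / mu
```

With `nu = 0` the attacker's finite-difference estimate is accurate; as `nu / mu` grows, the two queries receive independent perturbations and the estimate is dominated by noise, which is the ratio the abstract identifies as governing defense performance.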

artificial intelligence, machine learning, query, (17 more...)

Country: Asia > China (0.28)

Genre: Research Report (0.46)

Industry:

Transportation > Air (0.85)
Information Technology > Security & Privacy (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.89)

ReLIZO: Sample Reusable Linear Interpolation-based Zeroth-order Optimization

Neural Information Processing Systems, Mar-18-2026, 16:31:34 GMT

Gradient estimation is critical in zeroth-order optimization methods, which aim to obtain the descent direction by sampling update directions and querying function evaluations. Extensive research has been conducted on approaches including smoothing and linear interpolation. The former smooths the objective function, causing biased gradient estimation, while the latter often enjoys more accurate estimates at the cost of a large number of samples and queries at each iteration to update variables. This paper adopts the linear interpolation strategy and proposes to reduce the complexity of gradient estimation by reusing queries from prior iterations while keeping the sample size unchanged. Specifically, we model gradient estimation as a quadratically constrained linear program and derive its analytical solution. This decouples the required sample size from the variable dimension without requiring extra conditions, making it possible to leverage queries from prior iterations. Moreover, part of the intermediate variables that contribute to the gradient estimation can be directly indexed, significantly reducing the computational complexity.
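The two baseline families contrasted above can be sketched as follows. This shows only the generic smoothing and linear interpolation estimators, not ReLIZO's query-reuse scheme or its quadratically constrained formulation; all function names and the step size `mu` are illustrative assumptions.

```python
import random


def smoothing_estimate(f, x, mu, n_samples, rng):
    """Gaussian-smoothing estimator: average of ((f(x + mu*u) - f(x)) / mu) * u."""
    d = len(x)
    fx = f(x)
    g = [0.0] * d
    for _ in range(n_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        scale = (f([xi + mu * ui for xi, ui in zip(x, u)]) - fx) / mu
        g = [gi + scale * ui / n_samples for gi, ui in zip(g, u)]
    return g


def solve_linear(M, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = s / aug[r][r]
    return sol


def interpolation_estimate(f, x, mu, rng):
    """Linear interpolation: solve (mu * U) g = [f(x + mu*u_i) - f(x)] using d directions."""
    d = len(x)
    fx = f(x)
    U = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(d)]
    diffs = [f([xi + mu * ui for xi, ui in zip(x, u)]) - fx for u in U]
    return solve_linear([[mu * uij for uij in u] for u in U], diffs)
```

Note that this plain interpolation estimator needs as many fresh queries as there are dimensions at every call, which is the per-iteration cost the abstract says the paper aims to reduce.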

artificial intelligence, gradient estimation, proceedings, (7 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.60)

Efficient Combinatorial Optimization via Heat Diffusion

Neural Information Processing Systems, Feb-18-2026, 04:21:32 GMT

To overcome this challenge, diverging from conventional efforts of expanding the solver's search scope, we focus on enabling information to actively propagate to the solver through heat diffusion.

artificial intelligence, machine learning, optimization, (18 more...)

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Greece (0.04)
(2 more...)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

OptEx: Expediting First-Order Optimization with Approximately Parallelized Iterations

Neural Information Processing Systems, Feb-16-2026, 01:34:19 GMT

First-order optimization (FOO) algorithms are pivotal in numerous computational domains, such as reinforcement learning and deep learning. However, their application to complex tasks often entails significant optimization inefficiency because they need many sequential iterations to converge. In response, we introduce first-order optimization expedited with approximately parallelized iterations (OptEx), the first general framework that enhances the optimization efficiency of FOO by leveraging parallel computing to directly mitigate its requirement of many sequential iterations for convergence. To achieve this, OptEx utilizes a kernelized gradient estimation that is based on the history of evaluated gradients to predict the gradients required by the next few sequential iterations in FOO, which helps to break the inherent iterative dependency and hence enables the approximate parallelization of iterations in FOO. We further establish theoretical guarantees for the estimation error of our kernelized gradient estimation and the iteration complexity of SGD-based OptEx, confirming that the estimation error diminishes to zero as the history of gradients accumulates and that our SGD-based OptEx enjoys an effective acceleration rate of Θ(√N) over standard SGD given parallelism of N, in terms of the sequential iterations required for convergence. Finally, we provide extensive empirical studies, including synthetic functions, reinforcement learning tasks, and neural network training on various datasets, to underscore the substantial efficiency improvements achieved by OptEx in practice. Our implementation is available at https://github.com/youyve/OptEx .
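The idea of predicting upcoming gradients from a history of evaluated gradients can be illustrated with a simple kernel-weighted average. This sketch uses a Nadaraya-Watson estimator with an RBF kernel as a stand-in; the paper's actual kernelized estimator may differ, and the names `rbf`, `predict_gradient`, `proxy_steps`, and `lengthscale` are hypothetical.

```python
import math


def rbf(a, b, lengthscale):
    """Radial basis function kernel between two points."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * lengthscale ** 2))


def predict_gradient(history, x, lengthscale=1.0):
    """Kernel-weighted average of past gradients.

    `history` is a non-empty list of (point, gradient) pairs.
    """
    weights = [rbf(p, x, lengthscale) for p, _ in history]
    total = sum(weights)
    d = len(x)
    if total == 0.0:
        # All history points are too far away to carry any weight.
        return [0.0] * d
    return [
        sum(w * g[k] for w, (_, g) in zip(weights, history)) / total
        for k in range(d)
    ]


def proxy_steps(history, x, lr, n_steps, lengthscale=1.0):
    """Roll out cheap gradient-descent steps using predicted gradients only."""
    points = [list(x)]
    for _ in range(n_steps):
        g = predict_gradient(history, points[-1], lengthscale)
        points.append([xi - lr * gi for xi, gi in zip(points[-1], g)])
    return points
```

On one reading of the abstract, the points produced by such a rollout are candidates at which true gradients could then be evaluated in parallel, replacing several sequential iterations with one parallel round.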

artificial intelligence, deep learning, machine learning, (18 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Ontario > National Capital Region > Ottawa (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Thinning for Accelerating the Learning of Point Processes

Tianbo Li, Yiping Ke

Neural Information Processing Systems, Feb-14-2026, 10:30:23 GMT

This paper discusses one of the most fundamental issues about point processes: what is the best sampling method for point processes? We propose thinning as a downsampling method for accelerating the learning of point processes.
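Thinning as a downsampling step can be sketched on the simplest case, a homogeneous Poisson process, where independently keeping each event with probability p yields a Poisson process with rate p times the original. The choice of a homogeneous process and the function names are illustrative assumptions; the paper's setting and estimators are not reproduced here.

```python
import random


def simulate_poisson(rate, horizon, rng):
    """Event times of a homogeneous Poisson process on [0, horizon]."""
    t, events = 0.0, []
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return events
        events.append(t)


def thin(events, p, rng):
    """p-thinning: keep each event independently with probability p."""
    return [t for t in events if rng.random() < p]


def estimate_rate(thinned, p, horizon):
    """Rate estimate from the thinned sample, rescaled by 1/p."""
    return len(thinned) / (p * horizon)
```

The estimate is computed from roughly a fraction p of the events, which is where the computational saving comes from; the price is a higher variance than an estimate based on the full sample.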

artificial intelligence, machine learning, point process, (19 more...)

Country:

North America > United States > New York (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)

d6288499d0083cc34e60a077b7c4b3e1-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-14-2026, 10:30:09 GMT

estimation, hawke, point process, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.32)
Information Technology > Artificial Intelligence > Machine Learning (0.31)

a1e865a9b1065392ed6035d8ccd072d9-Paper.pdf

Neural Information Processing Systems, Feb-13-2026, 07:59:23 GMT

Unfortunately, the per-iteration cost of maintaining this adaptive distribution for gradient estimation is more than calculating the full gradient itself, which we call the chicken-and-the-egg loop. As a result, the false impression of faster convergence in iterations, in reality, leads to slower convergence in time.

artificial intelligence, estimation, machine learning, (18 more...)

Country:

North America > United States > Texas > Harris County > Houston (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
