AITopics | escaping

Escaping from saddle points on Riemannian manifolds

Neural Information Processing Systems, Dec-25-2025, 03:32:25 GMT

We consider minimizing a nonconvex, smooth function $f$ on a Riemannian manifold $\mathcal{M}$. We show that a perturbed version of the gradient descent algorithm converges to a second-order stationary point for this problem (and hence is able to escape saddle points on the manifold). While the unconstrained problem is well-studied, our result is the first to prove such a rate for nonconvex, manifold-constrained problems. The rate of convergence depends as $1/\epsilon^2$ on the accuracy $\epsilon$, which matches a rate known only for unconstrained smooth minimization. The convergence rate also has a polynomial dependence on the parameters denoting the curvature of the manifold and the smoothness of the function.
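
The idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm or its parameter schedule: the manifold (the unit sphere), the objective (a quadratic form), the normalization retraction, and the values of `step`, `radius`, and `grad_tol` are all assumptions chosen for illustration. The loop takes Riemannian gradient steps and injects a small random tangent perturbation whenever the gradient is nearly zero, which lets it leave a saddle point where plain gradient descent would stall.

```python
import numpy as np

def riemannian_grad(A, x):
    """Project the Euclidean gradient of f(x) = x^T A x onto the tangent space at x."""
    g = 2.0 * A @ x
    return g - (g @ x) * x

def retract(x, v):
    """Retraction on the unit sphere: move along v, then renormalize."""
    y = x + v
    return y / np.linalg.norm(y)

def perturbed_rgd(A, x0, step=0.05, radius=1e-3, grad_tol=1e-6, iters=2000, seed=0):
    """Illustrative perturbed Riemannian gradient descent (all defaults are assumptions)."""
    rng = np.random.default_rng(seed)
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        g = riemannian_grad(A, x)
        if np.linalg.norm(g) < grad_tol:
            # Nearly stationary: add a small random perturbation in the tangent space.
            noise = rng.normal(size=x.shape)
            noise -= (noise @ x) * x
            x = retract(x, radius * noise / np.linalg.norm(noise))
        else:
            # Ordinary Riemannian gradient step.
            x = retract(x, -step * g)
    return x

A = np.diag([1.0, 2.0, 3.0])
# Eigenvector of the middle eigenvalue: a saddle point of f on the sphere,
# where the Riemannian gradient is exactly zero.
x_saddle = np.array([0.0, 1.0, 0.0])
x_final = perturbed_rgd(A, x_saddle)
f_final = float(x_final @ A @ x_final)
```

Started exactly at the saddle, the iterate ends close to the eigenvector of the smallest eigenvalue, so `f_final` is close to 1. For simplicity the sketch keeps perturbing even near a minimizer; a full method would add a termination test.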

name change, riemannian manifold, saddle point, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.42)

Escaping the Gravitational Pull of Softmax

Neural Information Processing Systems, Dec-24-2025, 21:13:10 GMT

The softmax is the standard transformation used in machine learning to map real-valued vectors to categorical distributions. Unfortunately, this transform poses serious drawbacks for gradient descent (ascent) optimization. We reveal this difficulty by establishing two negative results: (1) optimizing any expectation with respect to the softmax must exhibit sensitivity to parameter initialization ("softmax gravity well"), and (2) optimizing log-probabilities under the softmax must exhibit slow convergence ("softmax damping"). Both findings are based on an analysis of convergence rates using the Non-uniform Łojasiewicz (NŁ) inequalities. To circumvent these shortcomings we investigate an alternative transformation, the escort mapping, that demonstrates better optimization properties. The disadvantages of the softmax and the effectiveness of the escort transformation are further explained using the concept of NŁ coefficient. In addition to proving bounds on convergence rates to firmly establish these results, we also provide experimental evidence for the superiority of the escort transformation.

gravitational pull, name change, transformation, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Escaping from the Barren Plateau via Gaussian Initializations in Deep Variational Quantum Circuits

Neural Information Processing Systems, Dec-24-2025, 11:48:03 GMT

Variational quantum circuits have been widely employed in quantum simulation and quantum machine learning in recent years. However, quantum circuits with random structures have poor trainability due to the exponentially vanishing gradient with respect to the circuit depth and the qubit number. This result leads to a general standpoint that deep quantum circuits would not be feasible for practical tasks. In this work, we propose an initialization strategy with theoretical guarantees for the vanishing gradient problem in general deep quantum circuits. Specifically, we prove that under proper Gaussian initialized parameters, the norm of the gradient decays at most polynomially when the qubit number and the circuit depth increase. Our theoretical results hold for both the local and the global observable cases, where the latter was believed to have vanishing gradients even for very shallow circuits. Experimental results verify our theoretical findings in quantum simulation and quantum chemistry.

barren plateau, gaussian initialization, quantum circuit, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.78)

DreamSparse: Escaping from Plato's Cave with 2D Diffusion Model Given Sparse Views

Neural Information Processing Systems, Dec-23-2025, 20:21:44 GMT

dreamsparse, novel view image, pre-trained diffusion model, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.53)

Review for NeurIPS paper: Escaping the Gravitational Pull of Softmax

Neural Information Processing Systems, Feb-8-2025, 02:57:07 GMT

Summary and Contributions: Update: The rebuttal adequately addressed my main concerns, and I am consequently increasing my score to a 7. In particular, I was pleased that the authors investigated the issues with the learning rate, and I would be happy if they mentioned this potential limitation in their revisions and included the experimental results showing that the naive adaptive learning rate proposals I made would not be effective. It was also pleasing that they will discuss and compare with Neural Replicator Dynamics, and the additional experiment with sampled actions also looks promising. The reason I didn't increase my score further was that the current set of experiments is still rather simple, and it is difficult for me to assess whether the new method is likely to be widely used. That said, I feel that the contribution may well turn out to be much more influential.

escape time, policy gradient, softmax policy gradient, (13 more...)

Neural Information Processing Systems

Genre: Research Report (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.52)

Review for NeurIPS paper: Escaping the Gravitational Pull of Softmax

Neural Information Processing Systems, Feb-8-2025, 02:57:00 GMT

This paper proposes an alternative to two common practices in machine learning: softmax policy gradient for RL, and the softmax parameterization in classification when minimizing cross-entropy loss. The limitations of softmax in these two cases are well explained, and the paper will be interesting to a wide range of the NeurIPS community.

artificial intelligence, gravitational pull, machine learning, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Escaping the Gravitational Pull of Softmax

Neural Information Processing Systems, Jan-15-2025, 05:30:52 GMT

The softmax is the standard transformation used in machine learning to map real-valued vectors to categorical distributions. Unfortunately, this transform poses serious drawbacks for gradient descent (ascent) optimization. We reveal this difficulty by establishing two negative results: (1) optimizing any expectation with respect to the softmax must exhibit sensitivity to parameter initialization ("softmax gravity well"), and (2) optimizing log-probabilities under the softmax must exhibit slow convergence ("softmax damping"). Both findings are based on an analysis of convergence rates using the Non-uniform Łojasiewicz (NŁ) inequalities. To circumvent these shortcomings we investigate an alternative transformation, the escort mapping, that demonstrates better optimization properties.
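
To make the two transformations in this abstract concrete, here is a small sketch that maps the same parameter vector to a categorical distribution in both ways. The abstract does not spell out the escort mapping; the form used here, with probabilities proportional to `|theta_i| ** p`, is an assumption about its definition, and the function names and the choice `p=2.0` are illustrative only.

```python
import numpy as np

def softmax(theta):
    """Standard softmax, shifted by the maximum for numerical stability."""
    z = theta - np.max(theta)
    e = np.exp(z)
    return e / e.sum()

def escort(theta, p=2.0):
    """Assumed escort mapping: probabilities proportional to |theta_i| ** p.

    Undefined when every entry of theta is zero.
    """
    w = np.abs(theta) ** p
    return w / w.sum()

theta = np.array([1.0, -2.0, 0.5])
pi_softmax = softmax(theta)
pi_escort = escort(theta, p=2.0)
```

Both outputs are nonnegative and sum to one. Under this assumed form the escort probabilities depend only on the magnitudes of the parameters, so the entry `-2.0` receives the largest escort probability but the smallest softmax probability.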

gravitational pull, softmax, transformation, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Escaping from the Barren Plateau via Gaussian Initializations in Deep Variational Quantum Circuits

Neural Information Processing Systems, Oct-11-2024, 16:13:06 GMT

deep variational quantum circuit, gaussian initialization, quantum circuit, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.85)

Escaping from saddle points on Riemannian manifolds

Neural Information Processing Systems, Oct-9-2024, 16:55:51 GMT

We consider minimizing a nonconvex, smooth function $f$ on a Riemannian manifold $\mathcal{M}$. We show that a perturbed version of the gradient descent algorithm converges to a second-order stationary point for this problem (and hence is able to escape saddle points on the manifold). While the unconstrained problem is well-studied, our result is the first to prove such a rate for nonconvex, manifold-constrained problems. The rate of convergence depends as $1/\epsilon^2$ on the accuracy $\epsilon$, which matches a rate known only for unconstrained smooth minimization. The convergence rate also has a polynomial dependence on the parameters denoting the curvature of the manifold and the smoothness of the function.

escaping, riemannian manifold, saddle point, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.49)

DreamSparse: Escaping from Plato's Cave with 2D Diffusion Model Given Sparse Views

Neural Information Processing Systems, Oct-9-2024, 14:00:24 GMT

Synthesizing novel view images from a few views is a challenging but practical problem. Existing methods often struggle to produce high-quality results or require per-object optimization in such few-view settings because of the insufficient information provided. In this work, we explore leveraging the strong 2D priors in pre-trained diffusion models for synthesizing novel view images. To address these problems, we propose DreamSparse, a framework that enables the frozen pre-trained diffusion model to generate geometry- and identity-consistent novel view images. Specifically, DreamSparse incorporates a geometry module designed to capture spatial features from sparse views as a 3D prior. Subsequently, a spatial guidance model is introduced to convert rendered feature maps into spatial information for the generative process.

dreamsparse, novel view image, pre-trained diffusion model, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
