AITopics | Optimization

Collaborating Authors

Optimization

News Overviews Instructional Materials AI-Alerts Classics

Escaping Saddle Points in Constrained Optimization

Neural Information Processing SystemsMar-16-2026, 17:57:22 GMT

In this paper, we study the problem of escaping from saddle points in smooth nonconvex optimization problems subject to a convex set $\mathcal{C}$. We propose a generic framework that yields convergence to a second-order stationary point of the problem, if the convex set $\mathcal{C}$ is simple for a quadratic objective function.

artificial intelligence, optimization problem, proceedings, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.44)

Add feedback

Smoothed analysis of the low-rank approach for smooth semidefinite programs

Thomas Pumir, Samy Jelassi, Nicolas Boumal

Neural Information Processing SystemsMar-16-2026, 01:08:30 GMT

We consider semidefinite programs (SDPs) of size nwith equality constraints. In order to overcome scalability issues, Burer and Monteiro proposed a factorized approach based on optimizing over a matrix Y of size n ksuch that X = YY is the SDP variable. The advantages of such formulation are twofold: the dimension of the optimization variable is reduced, and positive semidefiniteness is naturally enforced. However, optimization in Y is non-convex. In prior work, it has been shown that, when the constraints on the factorized variable regularly define a smooth manifold, provided k is large enough, for almost all cost matrices, all second-order stationary points (SOSPs) are optimal. Importantly, in practice, one can only compute points which approximately satisfy necessary optimality conditions, leading to the question: are such points also approximately optimal? To answer it, under similar assumptions, we use smoothed analysis to show that approximate SOSPs for a randomly perturbed objective function are approximate global optima, with k scaling like the square root of the number of constraints (up to log factors). Moreover, we bound the optimality gap at the approximate solution of the perturbed problem with respect to the original problem.

artificial intelligence, machine learning, optimization problem, (18 more...)

Neural Information Processing Systems

Country:

North America (0.28)
Europe (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

A New Kernel Regularity Condition for Distributed Mirror Descent: Broader Coverage and Simpler Analysis

Qiu, Junwen, Zeng, Ziyang, Mei, Leilei, Zhang, Junyu

arXiv.org Machine LearningMar-16-2026

Existing convergence of distributed optimization methods in non-Euclidean geometries typically rely on kernel assumptions: (i) global Lipschitz smoothness and (ii) bi-convexity of the associated Bregman divergence function. Unfortunately, these conditions are violated by nearly all kernels used in practice, leaving a huge theory-practice gap. This work closes this gap by developing a unified analytical tool that guarantees convergence under mild conditions. Specifically, we introduce Hessian relative uniform continuity (HRUC), a regularity satisfied by nearly all standard kernels. Importantly, HRUC is closed under concatenation, positive scaling, composition, and various kernel combinations. Leveraging the geometric structure induced by HRUC, we derive convergence guarantees for mirror descent-based gradient tracking without imposing any restrictive assumptions. More broadly, our analysis techniques extend seamlessly to other decentralized optimization methods in genuinely non-Euclidean and non-Lipschitz settings.

artificial intelligence, kernel, optimization problem, (16 more...)

arXiv.org Machine Learning

2603.12838

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback

Generalizing Graph Matching beyond Quadratic Assignment Model

Tianshu Yu, Junchi Yan, Yilin Wang, Wei Liu, baoxin Li

Neural Information Processing SystemsMar-15-2026, 08:23:31 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, optimization problem, (14 more...)

Neural Information Processing Systems

Country: North America (0.28)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Maximum Causal Tsallis Entropy Imitation Learning

Kyungjae Lee, Sungjoon Choi, Songhwai Oh

Neural Information Processing SystemsMar-14-2026, 19:24:45 GMT

In this paper, we propose a novel maximum causal Tsallis entropy (MCTE) framework for imitation learning which can efficiently learn a sparse multi-modal policy distribution from demonstrations. We provide the full mathematical analysis of the proposed framework. First, the optimal solution of an MCTE problem is shown to be a sparsemax distribution, whose supporting set can be adjusted. The proposed method has advantages over a softmax distribution in that it can exclude unnecessary actions by assigning zero probability. Second, we prove that an MCTE problem is equivalent to robust Bayes estimation in the sense of the Brier score. Third, we propose a maximum causal Tsallis entropy imitation learning (MCTEIL) algorithm with a sparse mixture density network (sparse MDN) by modeling mixture weights using a sparsemax distribution. In particular, we show that the causal Tsallis entropy of an MDN encourages exploration and efficient mixture utilization while Shannon entropy is less effective.

artificial intelligence, demonstration, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America (0.46)

Technology: