AITopics | composite optimization problem

composite optimization problem

A Linearly Convergent Proximal Gradient Algorithm for Decentralized Optimization

Sulaiman Alghunaim, Kun Yuan, Ali H. Sayed

Neural Information Processing Systems | Feb-14-2026, 22:18:04 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, convergence rate, optimization, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
(10 more...)

Industry: Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)

Zeroth-Order Methods for Stochastic Nonconvex Nonsmooth Composite Optimization

Chen, Ziyi, Yu, Peiran, Huang, Heng

arXiv.org Artificial Intelligence | Oct-7-2025

This work aims to solve a stochastic nonconvex nonsmooth composite optimization problem. Previous works on composite optimization problems require the major part to satisfy Lipschitz smoothness or some relaxed smoothness conditions, which excludes some machine learning examples such as regularized ReLU networks and sparse support matrix machines. In this work, we focus on stochastic nonconvex composite optimization problems without any smoothness assumptions. In particular, we propose two new notions of approximate stationary points for such optimization problems and obtain finite-time convergence results of two zeroth-order algorithms to these two approximate stationary points, respectively. Finally, we demonstrate that these algorithms are effective using numerical experiments.

artificial intelligence, machine learning, optimization, (16 more...)

arXiv.org Artificial Intelligence

2510.04446

Country: North America > United States (0.46)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Contributions to Robust and Efficient Methods for Analysis of High Dimensional Data

Yang, Kai

arXiv.org Artificial Intelligence | Sep-11-2025

A ubiquitous feature of data of our era is their extra-large sizes and dimensions. Analyzing such high-dimensional data poses significant challenges, since the feature dimension is often much larger than the sample size. This thesis introduces robust and computationally efficient methods to address several common challenges associated with high-dimensional data. In my first manuscript, I propose a coherent approach to variable screening that accommodates nonlinear associations. I develop a novel variable screening method that transcends traditional linear assumptions by leveraging mutual information, with an intended application in neuroimaging data. This approach allows for accurate identification of important variables by capturing nonlinear as well as linear relationships between the outcome and covariates. Building on this foundation, I develop new optimization methods for sparse estimation using nonconvex penalties in my second manuscript. These methods address notable challenges in current statistical computing practices, facilitating computationally efficient and robust analyses of complex datasets. The proposed method can be applied to a general class of optimization problems. In my third manuscript, I contribute to robust modeling of high-dimensional correlated observations by developing a mixed-effects model based on Tsallis power-law entropy maximization and discussing the theoretical properties of such distributions. This model surpasses the constraints of conventional Gaussian models by accommodating a broader class of distributions with enhanced robustness to outliers. Additionally, I develop a proximal nonlinear conjugate gradient algorithm that accelerates convergence while maintaining numerical stability, along with rigorous statistical properties for the proposed framework.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2509.08155

Country:

Europe (0.92)
North America > United States > New York (0.28)
North America > Canada > Quebec (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.45)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Mathematics of Computing (1.00)
Information Technology > Data Science > Data Mining (1.00)
(4 more...)

Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient

Di, Hao, Ye, Haishan, Zhang, Yueling, Chang, Xiangyu, Dai, Guang, Tsang, Ivor W.

arXiv.org Artificial Intelligence | May-27-2024

Variance reduction techniques are designed to decrease the sampling variance, thereby accelerating convergence rates of first-order (FO) and zeroth-order (ZO) optimization methods. However, in composite optimization problems, ZO methods encounter an additional variance called the coordinate-wise variance, which stems from the random gradient estimation. To reduce this variance, prior works require estimating all partial derivatives, essentially approximating FO information. This approach demands $\mathcal{O}(d)$ function evaluations ($d$ is the dimension), which incurs substantial computational costs and is prohibitive in high-dimensional scenarios. This paper proposes the Zeroth-order Proximal Double Variance Reduction (ZPDVR) method, which utilizes the averaging trick to reduce both sampling and coordinate-wise variances. Compared to prior methods, ZPDVR relies solely on random gradient estimates, calls the stochastic zeroth-order oracle (SZO) in expectation $\mathcal{O}(1)$ times per iteration, and achieves the optimal $\mathcal{O}(d(n + \kappa)\log (\frac{1}{\epsilon}))$ SZO query complexity in the strongly convex and smooth setting, where $\kappa$ represents the condition number and $\epsilon$ is the desired accuracy. Empirical results validate ZPDVR's linear convergence and demonstrate its superior performance over other related methods.
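
To make the cost contrast in this abstract concrete, the following minimal sketch compares a coordinate-wise finite-difference gradient estimate, which needs d + 1 function evaluations, with a two-point random-direction estimate, which needs only 2. It is not the ZPDVR method. The function names, the Gaussian direction, the smoothing radius `mu`, and the linear test function are all illustrative assumptions.

```python
import random


def coordinate_estimate(f, x, mu=1e-4):
    """Forward-difference estimate of every partial derivative: d + 1 calls to f."""
    fx = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += mu
        grad.append((f(shifted) - fx) / mu)
    return grad


def random_estimate(f, x, rng, mu=1e-4):
    """Two-point estimate along one Gaussian direction u: 2 calls to f.

    For smooth f its expectation over u approximates the gradient,
    but a single draw is noisy (this is the extra variance ZO methods face).
    """
    u = [rng.gauss(0.0, 1.0) for _ in x]
    shifted = [xi + mu * ui for xi, ui in zip(x, u)]
    slope = (f(shifted) - f(x)) / mu
    return [slope * ui for ui in u]


# Hypothetical test function: f(x) = a . x, whose gradient is a everywhere.
a = [1.0, -2.0, 0.5]


def f(x):
    return sum(ai * xi for ai, xi in zip(a, x))


x0 = [0.3, 0.1, -0.4]
coord_grad = coordinate_estimate(f, x0)

# Average many random estimates to show they are unbiased but individually noisy.
rng = random.Random(0)
n_samples = 20000
avg_grad = [0.0] * len(x0)
for _ in range(n_samples):
    g = random_estimate(f, x0, rng)
    avg_grad = [s + gi / n_samples for s, gi in zip(avg_grad, g)]
```

In this toy setting `coord_grad` matches `a` almost exactly, while `avg_grad` only approaches `a` after averaging thousands of draws, which illustrates why reducing the variance of random estimates matters.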

composite optimization problem, double variance reduction, variance, (9 more...)

arXiv.org Artificial Intelligence

2405.17761

Country:

Europe > Austria > Vienna (0.14)
Asia > Singapore (0.04)
Asia > China > Beijing > Beijing (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Non-Convex Stochastic Composite Optimization with Polyak Momentum

Gao, Yuan, Rodomanov, Anton, Stich, Sebastian U.

arXiv.org Artificial Intelligence | Mar-18-2024

The stochastic proximal gradient method is a powerful generalization of the widely used stochastic gradient descent (SGD) method and has found numerous applications in Machine Learning. However, it is well known that this method fails to converge in non-convex settings where the stochastic noise is significant (i.e., when only small or bounded batch sizes are used). In this paper, we focus on the stochastic proximal gradient method with Polyak momentum. We prove this method attains an optimal convergence rate for non-convex composite optimization problems, regardless of batch size. Additionally, we rigorously analyze the variance reduction effect of the Polyak momentum in the composite optimization setting and we show the method also converges when the proximal step can only be solved inexactly. Finally, we provide numerical experiments to validate our theoretical results.
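
As a rough illustration of the kind of update this abstract describes, the sketch below runs a stochastic proximal gradient loop in which the noisy gradient is first averaged into a momentum buffer. Writing the momentum as an exponential average of gradients is an assumption about the form of the update, not a transcription of the paper's algorithm. The toy objective (a convex quadratic plus an l1 penalty), the noise level, and all names are hypothetical.

```python
import random


def soft_threshold(v, t):
    """Proximal operator of t * |v| (scalar soft-thresholding)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def prox_sgd_momentum(stoch_grad, x0, step, gamma, lam, iters, rng):
    """Stochastic proximal gradient with an averaged-gradient momentum buffer.

    m <- (1 - gamma) * m + gamma * g          (momentum on noisy gradients)
    x <- prox_{step * lam * ||.||_1}(x - step * m)
    """
    x = list(x0)
    m = [0.0] * len(x0)
    for _ in range(iters):
        g = stoch_grad(x, rng)
        m = [(1.0 - gamma) * mi + gamma * gi for mi, gi in zip(m, g)]
        x = [soft_threshold(xi - step * mi, step * lam) for xi, mi in zip(x, m)]
    return x


# Toy problem (hypothetical, convex for simplicity): f(x) = 0.5 * ||x - b||^2
# seen only through noisy gradients, plus h(x) = lam * ||x||_1.
b = [3.0, -0.2, 0.5]


def noisy_grad(x, rng, sigma=0.2):
    """Exact gradient x - b corrupted by Gaussian noise (batch size one)."""
    return [xi - bi + rng.gauss(0.0, sigma) for xi, bi in zip(x, b)]


x_hat = prox_sgd_momentum(noisy_grad, [0.0, 0.0, 0.0],
                          step=0.05, gamma=0.1, lam=1.0,
                          iters=2000, rng=random.Random(0))
```

For this toy objective the exact minimizer is the soft-thresholded vector [2, 0, 0], so `x_hat` should land close to it despite the gradient noise.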

algorithm 1, assumption 3, gradient method, (13 more...)

arXiv.org Artificial Intelligence

2403.02967

Country:

Europe > Russia (0.04)
Europe > Germany (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)

A simple uniformly optimal method without line search for convex optimization

Li, Tianjiao, Lan, Guanghui

arXiv.org Artificial Intelligence | Oct-26-2023

Line search (or backtracking) procedures have been widely employed in first-order methods for solving convex optimization problems, especially those with unknown problem parameters (e.g., Lipschitz constant). In this paper, we show that line search is superfluous in attaining the optimal rate of convergence for solving a convex optimization problem whose parameters are not given a priori. In particular, we present a novel accelerated gradient descent type algorithm called auto-conditioned fast gradient method (AC-FGM) that can achieve an optimal $\mathcal{O}(1/k^2)$ rate of convergence for smooth convex optimization without requiring an estimate of a global Lipschitz constant or the employment of line search procedures. We then extend AC-FGM to solve convex optimization problems with Hölder continuous gradients and show that it automatically achieves the optimal rates of convergence uniformly for all problem classes with the desired accuracy of the solution as the only input. Finally, we report some encouraging numerical results that demonstrate the advantages of AC-FGM over the previously developed parameter-free methods for convex optimization.

ac-fgm, convex optimization, optimization, (14 more...)

arXiv.org Artificial Intelligence

2310.10082

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)
North America > United States > Massachusetts (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

How Distributed Optimization operates part1

#artificialintelligence | Dec-19-2022, 13:15:06 GMT

Abstract: Privacy protection has become an increasingly pressing requirement in distributed optimization. However, equipping distributed optimization with differential privacy, the state-of-the-art privacy protection mechanism, will unavoidably compromise optimization accuracy. In this paper, we propose an algorithm to achieve rigorous ε-differential privacy in gradient-tracking based distributed optimization with enhanced optimization accuracy. More specifically, to suppress the influence of differential-privacy noise, we propose a new robust gradient-tracking based distributed optimization algorithm that allows both stepsize and the variance of injected noise to vary with time. Then, we establish a new analysis approach that can characterize the convergence of the gradient-tracking based algorithm under both constant and time-varying stepsizes.
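
For readers unfamiliar with gradient tracking, the following minimal sketch shows the plain, noise-free baseline that such privacy-preserving variants build on. It does not include differential-privacy noise or the robust modifications proposed in the paper. The three-agent problem, the mixing matrix `W`, and the constant stepsize are illustrative assumptions.

```python
def gradient_tracking(grads, W, x0, step, iters):
    """Plain gradient tracking for scalar local variables.

    x_i <- sum_j W[i][j] * x_j - step * y_i
    y_i <- sum_j W[i][j] * y_j + grad_i(x_i_new) - grad_i(x_i_old)
    """
    n = len(x0)
    x = list(x0)
    y = [grads[i](x[i]) for i in range(n)]  # trackers start at local gradients
    for _ in range(iters):
        x_new = [sum(W[i][j] * x[j] for j in range(n)) - step * y[i]
                 for i in range(n)]
        y = [sum(W[i][j] * y[j] for j in range(n))
             + grads[i](x_new[i]) - grads[i](x[i])
             for i in range(n)]
        x = x_new
    return x


# Hypothetical 3-agent problem: f_i(x) = 0.5 * (x - b_i)^2, so the minimizer
# of the sum of the f_i is the mean of b.
b = [1.0, 2.0, 6.0]
grads = [lambda x, bi=bi: x - bi for bi in b]

# Doubly stochastic mixing matrix for a fully connected 3-agent network.
W = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]

x_final = gradient_tracking(grads, W, [0.0, 0.0, 0.0], step=0.1, iters=300)
```

Each agent only ever evaluates its own gradient and mixes values with its neighbours, yet every entry of `x_final` approaches the global minimizer, here the mean of `b`.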

algorithm, optimization, optimization operate part1, (11 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.40)

BALPA: A Balanced Primal-Dual Algorithm for Nonsmooth Optimization with Application to Distributed Optimization

Guo, Luyao, Cao, Jinde, Shi, Xinli, Yang, Shaofu

arXiv.org Artificial Intelligence | Dec-6-2022

In this paper, we propose a novel primal-dual proximal splitting algorithm (PD-PSA), named BALPA, for the composite optimization problem with equality constraints, where the loss function consists of a smooth term and a nonsmooth term composed with a linear mapping. In BALPA, the dual update is designed as a proximal point for a time-varying quadratic function, which balances the implementation of the primal and dual updates and retains the proximity-induced feature of classic PD-PSAs. In addition, by this balance, BALPA eliminates the inefficiency of classic PD-PSAs for composite optimization problems in which the Euclidean norm of the linear mapping or the equality constraint mapping is large. Therefore, BALPA not only inherits the advantages of simple structure and easy implementation of classic PD-PSAs but also ensures fast convergence when these norms are large. Moreover, we propose a stochastic version of BALPA (S-BALPA) and apply the developed BALPA to distributed optimization to devise a new distributed optimization algorithm. Furthermore, a comprehensive convergence analysis is conducted for both BALPA and S-BALPA. Finally, numerical experiments demonstrate the efficiency of the proposed algorithms.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2212.02835

Country: Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Constrained and Composite Optimization via Adaptive Sampling Methods

Xie, Yuchen, Bollapragada, Raghu, Byrd, Richard, Nocedal, Jorge

arXiv.org Machine Learning | Dec-30-2020

The motivation for this paper stems from the desire to develop an adaptive sampling method for solving constrained optimization problems in which the objective function is stochastic and the constraints are deterministic. The method proposed in this paper is a proximal gradient method that can also be applied to the composite optimization problem min f(x) + h(x), where f is stochastic and h is convex (but not necessarily differentiable). Adaptive sampling methods employ a mechanism for gradually improving the quality of the gradient approximation so as to keep computational cost to a minimum. The mechanism commonly employed in unconstrained optimization is no longer reliable in the constrained or composite optimization settings because it is based on pointwise decisions that cannot correctly predict the quality of the proximal gradient step. The method proposed in this paper measures the result of a complete step to determine if the gradient approximation is accurate enough; otherwise a more accurate gradient is generated and a new step is computed. Convergence results are established both for strongly convex and general convex f. Numerical experiments are presented to illustrate the practical behavior of the method.
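
The composite form min f(x) + h(x) covers the constrained case when h is the indicator function of the feasible set, because the proximal operator of an indicator is the projection onto that set. The sketch below shows this with a deterministic proximal gradient loop and a box constraint. It uses exact gradients and a fixed stepsize, so it does not implement the adaptive sampling mechanism of the paper. The function names and the toy objective are hypothetical.

```python
def project_box(v, lo, hi):
    """Proximal operator of the indicator of [lo, hi]: clip v into the interval."""
    return min(max(v, lo), hi)


def proximal_gradient_box(grad_f, x0, step, lo, hi, iters=200):
    """Iterate x <- proj_box(x - step * grad_f(x)), a projected gradient method."""
    x = list(x0)
    for _ in range(iters):
        g = grad_f(x)
        x = [project_box(xi - step * gi, lo, hi) for xi, gi in zip(x, g)]
    return x


# Toy problem (hypothetical): minimize f(x) = 0.5 * ||x - b||^2 over the box
# [0, 1]^3. The solution is b clipped coordinate-wise into the box.
b = [3.0, -0.2, 0.5]


def grad_f(x):
    return [xi - bi for xi, bi in zip(x, b)]


x_star = proximal_gradient_box(grad_f, [0.0, 0.0, 0.0],
                               step=0.5, lo=0.0, hi=1.0)
```

Swapping `project_box` for another proximal operator, such as soft-thresholding for an l1 penalty, leaves the loop unchanged, which is the practical appeal of the composite formulation.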

inner-product test, iteration, norm test, (14 more...)

arXiv.org Machine Learning

2012.15411

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
North America > United States > Illinois > Cook County > Evanston (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
