SLM: A Smoothed First-Order Lagrangian Method for Structured Constrained Nonconvex Optimization
Functional constrained optimization (FCO) has emerged as a powerful tool for solving various machine learning problems. However, with the rapid increase in applications of neural networks in recent years, it has become apparent that both the objective and the constraints often involve nonconvex functions, which poses significant challenges in obtaining high-quality solutions. In this work, we focus on a class of nonconvex FCO problems with nonconvex constraints, where the two optimization variables are nonlinearly coupled in the inequality constraint. Leveraging the primal-dual optimization framework, we propose a smoothed first-order Lagrangian method (SLM) for solving this class of problems. We establish theoretical convergence guarantees of SLM to Karush-Kuhn-Tucker (KKT) solutions by quantifying dual error bounds. By establishing connections between this structured FCO and equilibrium-constrained nonconvex problems (also known as bilevel optimization), we apply the proposed SLM to bilevel-optimization-oriented problems in which the lower-level problem is nonconvex. Numerical results on both toy examples and hyper-data cleaning problems show that SLM outperforms the benchmark methods.
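To make the primal-dual idea concrete, here is a minimal sketch of a generic smoothed primal-dual gradient iteration on a toy problem with a nonlinearly coupled inequality constraint. It is not the SLM algorithm itself, whose update rules are not given in this abstract. The toy problem (minimize (x - 1.5)^2 + (y - 1.5)^2 subject to x*y - 1 <= 0), the function name `smoothed_primal_dual`, and all step sizes are assumptions chosen for illustration.

```python
def smoothed_primal_dual(steps=5000, alpha=0.05, beta=0.05, p=1.0, theta=0.1):
    """Generic smoothed primal-dual iteration on a toy problem.

    Illustrative only; NOT the paper's SLM. Assumed toy problem:
        minimize   (x - 1.5)^2 + (y - 1.5)^2
        subject to x * y - 1 <= 0
    """
    x, y = 0.0, 0.0      # primal variables
    zx, zy = 0.0, 0.0    # auxiliary copies used by the proximal (smoothing) term
    lam = 0.0            # multiplier for the inequality constraint

    for _ in range(steps):
        # Gradient in (x, y) of f + lam * g + (p / 2) * ||(x, y) - (zx, zy)||^2
        gx = 2.0 * (x - 1.5) + lam * y + p * (x - zx)
        gy = 2.0 * (y - 1.5) + lam * x + p * (y - zy)
        x, y = x - alpha * gx, y - alpha * gy

        # Projected gradient ascent on the multiplier (lam must stay >= 0)
        lam = max(0.0, lam + beta * (x * y - 1.0))

        # Smoothing step: move the auxiliary point toward the new primal iterate
        zx += theta * (x - zx)
        zy += theta * (y - zy)

    return x, y, lam


if __name__ == "__main__":
    x, y, lam = smoothed_primal_dual()
    print(f"x={x:.4f}, y={y:.4f}, lambda={lam:.4f}, g={x * y - 1.0:.2e}")
```

For this toy problem the point (1, 1) with multiplier 1 satisfies the KKT conditions, so the iterates are expected to settle near it; the proximal term vanishes at any fixed point because the auxiliary point then coincides with the primal iterate.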
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
Supplementary Materials for: Max-Sliced Mutual Information
A Proofs
A.1 Proof of Proposition 1
We note that item 1 is restated and was proved in [25, Appendix A.1].
Proof of 2: Non-negativity directly follows by non-negativity of mutual information.
Proof of 5: The proof relies on the independence of functions of independent random variables. This concludes the proof.
A.2 Proof of Proposition 2
By translation invariance of mutual information, we may assume w.l.o.g. that the means are [...] Next, we show that we may equivalently optimize with the added unit variance constraint. [...] Example 3.4]), we have I(A; B) [...], where the last equality uses the unit variance property and Schur's determinant formula. Armed with Lemma 1, we are in a position to prove Proposition 2. Since the CCA solutions [...] Theorem 2.2], which is restated next for completeness.
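For orientation, the following worked identity shows how a unit variance assumption and Schur's determinant formula typically combine in a mutual information computation. It is a sketch under an assumption not stated in the surviving text: that A and B are jointly Gaussian scalars with unit variances and correlation \rho (the symbols \Sigma and \rho are introduced here for illustration). It is not a reconstruction of the missing equation.

```latex
% Assumption: (A, B) jointly Gaussian, Var(A) = Var(B) = 1, Corr(A, B) = \rho.
\[
  \Sigma =
  \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix},
  \qquad
  I(A;B) = \frac{1}{2}\log\frac{\operatorname{Var}(A)\,\operatorname{Var}(B)}{\det\Sigma}.
\]
% Schur's determinant formula: \det\Sigma = \Sigma_{11}\,(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}).
\[
  \det\Sigma = 1\cdot\bigl(1 - \rho\cdot 1^{-1}\cdot\rho\bigr) = 1 - \rho^{2}
  \quad\Longrightarrow\quad
  I(A;B) = -\frac{1}{2}\log\bigl(1 - \rho^{2}\bigr).
\]
```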
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
- North America > United States > Arizona > Maricopa County > Phoenix (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- (13 more...)
- Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California (0.04)
- (2 more...)