Optimization
Export Reviews, Discussions, Author Feedback and Meta-Reviews
They motivate their approach by first showing that under some assumptions, the discriminant function over a fully connected graph on the labels can be expressed as the uniform expectation of the discriminant functions over random spanning trees. Through a sampling result, they then show that with high probability over samples of random spanning trees, there is a conical combination of these trees which achieve a substantial fraction of the margin of a predictor which uses the complete graph, and then prove a related risk bound for conical combination over random trees. This motivates to optimize the margin for conical combination of trees as predictors, and the author proposes a primal (and dual) formulation for this optimization problem (somewhat analog to the structured SVM), for which a standard dual subgradient method is proposed as in previous work. They then show that the maximizing joint label for the combination of trees (inference) can be done exactly (under an assumption that be checked at run-time) by looking through the K-best list for each spanning tree (the latter can be obtained by dynamic programming, as was already mentioned in Tsochantaridis et al. [JMLR 2005]). Experiments on standard multilabel datasets show a small improvement over alternative methods.
SLM: A Smoothed First-Order Lagrangian Method for Structured Constrained Nonconvex Optimization
Functional constrained optimization (FCO) has emerged as a powerful tool for solving various machine learning problems. However, with the rapid increase in applications of neural networks in recent years, it has become apparent that both the objective and constraints often involve nonconvex functions, which poses significant challenges in obtaining high-quality solutions. In this work, we focus on a class of nonconvex FCO problems with nonconvex constraints, where the two optimization variables are nonlinearly coupled in the inequality constraint. Leveraging the primal-dual optimization framework, we propose a smoothed first-order Lagrangian method (SLM) for solving this class of problems. We establish the theoretical convergence guarantees of SLM to the Karush-Kuhn-Tucker (KKT) solutions through quantifying dual error bounds. By establishing connections between this structured FCO and equilibrium-constrained nonconvex problems (also known as bilevel optimization), we apply the proposed SLM to tackle bilevel optimization oriented problems where the lower-level problem is nonconvex. Numerical results obtained from both toy examples and hyper-data cleaning problems demonstrate the superiority of SLM compared to benchmark methods.