Deng, Zengde
Boosting Gradient Ascent for Continuous DR-submodular Maximization
Zhang, Qixin, Wan, Zongqi, Deng, Zengde, Chen, Zaiyi, Sun, Xiaoming, Zhang, Jialin, Yang, Yu
Projected Gradient Ascent (PGA) is one of the most commonly used optimization schemes in machine learning and operations research. Nevertheless, numerous studies and examples have shown that PGA methods may fail to achieve the tight approximation ratio for continuous DR-submodular maximization problems. To address this challenge, we present a boosting technique that efficiently improves the approximation guarantee of standard PGA to \emph{optimal} with only small modifications to the objective function. The fundamental idea of our boosting technique is to exploit non-oblivious search to derive a novel auxiliary function $F$ whose stationary points are excellent approximations to the global maximum of the original DR-submodular objective $f$. Specifically, when $f$ is monotone and $\gamma$-weakly DR-submodular, we propose an auxiliary function $F$ whose stationary points provide a better $(1-e^{-\gamma})$-approximation than the $(\gamma^2/(1+\gamma^2))$-approximation guaranteed by the stationary points of $f$ itself. Similarly, for the non-monotone case, we devise another auxiliary function $F$ whose stationary points achieve an optimal $\frac{1-\min_{\boldsymbol{x}\in\mathcal{C}}\|\boldsymbol{x}\|_{\infty}}{4}$-approximation guarantee, where $\mathcal{C}$ is a convex constraint set; in contrast, the stationary points of the original non-monotone DR-submodular function can be arbitrarily bad~\citep{chen2023continuous}. Furthermore, we demonstrate the scalability of our boosting technique on four problems; in all four, the resulting boosted PGA variants beat the standard PGA in aspects such as approximation ratio and efficiency. Finally, we corroborate our theoretical findings with numerical experiments, which demonstrate the effectiveness of our boosted PGA methods.
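To make the non-oblivious search concrete, below is a minimal Python sketch of the boosted PGA idea for the monotone case with $\gamma=1$, assuming the weight $e^{z-1}$ for the boosted gradient $\nabla F(\boldsymbol{x})=\int_0^1 e^{z-1}\nabla f(z\boldsymbol{x})\,dz$, estimated here by Monte Carlo; the names f_grad and project_onto_C are hypothetical placeholders for the problem's gradient oracle and projection, not interfaces from the paper.

import numpy as np

def boosted_gradient(f_grad, x, num_samples=20, rng=None):
    # Monte Carlo estimate of the boosted gradient
    #   grad F(x) = \int_0^1 e^{z-1} grad f(z x) dz
    # via uniform samples z ~ U(0, 1) weighted by e^{z-1}.
    rng = rng or np.random.default_rng(0)
    z = rng.uniform(0.0, 1.0, size=num_samples)
    return np.mean([np.exp(zi - 1.0) * f_grad(zi * x) for zi in z], axis=0)

def boosted_pga(f_grad, project_onto_C, x0, step=0.1, iters=100):
    # Ordinary projected gradient ascent, but driven by the
    # auxiliary function F rather than by f itself.
    x = x0.copy()
    for _ in range(iters):
        x = project_onto_C(x + step * boosted_gradient(f_grad, x))
    return x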
Decentralized Weakly Convex Optimization Over the Stiefel Manifold
Wang, Jinxin, Hu, Jiang, Chen, Shixiang, Deng, Zengde, So, Anthony Man-Cho
We focus on a class of non-smooth optimization problems over the Stiefel manifold in the decentralized setting, where a connected network of $n$ agents cooperatively minimizes a finite-sum objective function whose components are weakly convex in the ambient Euclidean space. Such optimization problems, albeit frequently encountered in applications, are quite challenging due to their non-smoothness and non-convexity. To tackle them, we propose an iterative method called the decentralized Riemannian subgradient method (DRSM). Global convergence and an iteration complexity of $\mathcal{O}(\varepsilon^{-2} \log^2(\varepsilon^{-1}))$ for forcing a natural stationarity measure below $\varepsilon$ are established via the powerful tool of proximal smoothness from variational analysis, which could be of independent interest. In addition, we show local linear convergence of the DRSM with geometrically diminishing stepsizes when the problem at hand further possesses a sharpness property. Numerical experiments are conducted to corroborate our theoretical findings.
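As a rough illustration of one DRSM-style round (not necessarily the paper's exact update rule), the following Python sketch combines gossip averaging over a doubly stochastic mixing matrix W with a Riemannian subgradient step and a polar retraction back onto the Stiefel manifold; W, the local subgradient oracles, and the stepsize are all assumed inputs.

import numpy as np

def polar_retract(Y):
    # Map an ambient matrix to the Stiefel manifold St(d, r)
    # via the polar decomposition (nearest orthonormal frame).
    U, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ Vt

def riem_subgrad(X, G):
    # Project a Euclidean subgradient G onto the tangent space
    # of the Stiefel manifold at X (Euclidean metric).
    sym = 0.5 * (X.T @ G + G.T @ X)
    return G - X @ sym

def drsm_step(X_list, subgrads, W, step):
    # One round: each agent mixes neighbors' iterates through the
    # doubly stochastic matrix W, takes a Riemannian subgradient
    # step on its local component, and retracts onto the manifold.
    n = len(X_list)
    new = []
    for i in range(n):
        mix = sum(W[i, j] * X_list[j] for j in range(n))
        G = riem_subgrad(X_list[i], subgrads[i](X_list[i]))
        new.append(polar_retract(mix - step * G))
    return new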
An Online Algorithm for Chance Constrained Resource Allocation
Chen, Yuwei, Deng, Zengde, Zhou, Yinzhi, Chen, Zaiyi, Chen, Yujie, Hu, Haoyuan
This paper studies the online stochastic resource allocation problem (RAP) with chance constraints. The online RAP is a 0-1 integer linear programming problem in which the resource consumption coefficients are revealed column by column along with the corresponding revenue coefficients. When a column is revealed, the corresponding decision variables must be determined instantaneously, without future information. Moreover, in online applications, the resource consumption coefficients are often obtained by prediction; to model their uncertainty, we take chance constraints into consideration. To the best of our knowledge, this is the first time chance constraints have been introduced into the online RAP. Assuming that the uncertain variables have known Gaussian distributions, the stochastic RAP can be transformed into a deterministic but nonlinear problem with integer second-order cone constraints. Next, we linearize this nonlinear problem and analyze the performance of the vanilla online primal-dual algorithm for solving the linearized stochastic RAP. Under mild technical assumptions, the optimality gap and the constraint violation are both of order $\sqrt{n}$. Then, to further improve the performance of the algorithm, several modified online primal-dual algorithms with heuristic corrections are proposed. Finally, extensive numerical experiments on both synthetic and real data demonstrate the applicability and effectiveness of our methods.
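For the Gaussian case the abstract refers to, the standard reformulation turns each chance constraint into a second-order cone constraint: with $\boldsymbol{a}\sim\mathcal{N}(\boldsymbol{\mu},\Sigma)$, requiring $\mathbb{P}(\boldsymbol{a}^\top\boldsymbol{x}\le b)\ge 1-\epsilon$ is equivalent (for $\epsilon<0.5$) to $\boldsymbol{\mu}^\top\boldsymbol{x}+\Phi^{-1}(1-\epsilon)\|\Sigma^{1/2}\boldsymbol{x}\|_2\le b$. A minimal Python check of this deterministic form (the function name and interface are illustrative, not from the paper):

import numpy as np
from scipy.stats import norm

def soc_constraint_satisfied(x, mu, Sigma, b, eps):
    # With a ~ N(mu, Sigma), the chance constraint
    #   P(a @ x <= b) >= 1 - eps
    # is equivalent (for eps < 0.5) to the second-order cone
    # constraint  mu @ x + z_eps * ||Sigma^{1/2} x|| <= b.
    z_eps = norm.ppf(1.0 - eps)
    L = np.linalg.cholesky(Sigma)      # Sigma = L @ L.T
    return mu @ x + z_eps * np.linalg.norm(L.T @ x) <= b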
Single-Layer Graph Convolutional Networks For Recommendation
Xu, Yue, Chen, Hao, Deng, Zengde, Zhu, Junxiong, Li, Yanghua, He, Peng, Gao, Wenyao, Xu, Wenjun
Graph Convolutional Networks (GCNs) and their variants have received significant attention and achieved state-of-the-art performance on various recommendation tasks. However, many existing GCN models tend to perform recursive aggregation over all related nodes, which incurs a severe computational burden. Moreover, they favor multi-layer architectures in conjunction with complicated modeling techniques. Though effective, the excessive number of model parameters largely hinders their application in real-world recommender systems. To this end, in this paper we propose a single-layer GCN model that achieves superior performance with remarkably lower complexity than existing models. Our main contribution is threefold. First, we propose a principled similarity metric named distribution-aware similarity (DA similarity), which can guide the neighbor sampling process and explicitly evaluate the quality of the input graph. We also show, through both theoretical analysis and empirical simulations, that DA similarity is positively correlated with the final performance. Second, we propose a simplified GCN architecture that employs a single GCN layer to aggregate information from the neighbors filtered by DA similarity and then generates the node representations. Moreover, the aggregation step is a parameter-free operation, so it can be done in a pre-processing stage to further reduce the training and inference costs. Third, we conduct extensive experiments on four datasets. The results verify that the proposed model considerably outperforms existing GCN models in terms of recommendation performance, while yielding up to a few orders of magnitude speedup in training.
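A minimal sketch of the kind of parameter-free, pre-processed aggregation a single-layer design permits (assuming the DA-similarity-filtered neighbor lists are already given; all names are illustrative, not the paper's API):

import numpy as np

def precompute_node_embeddings(E, filtered_neighbors):
    # One-shot, parameter-free aggregation: average each node's
    # initial embedding with those of its DA-similarity-filtered
    # neighbors. Because no learnable weights are involved, this
    # runs once offline and the graph is never touched again
    # during training or inference.
    out = np.empty_like(E)
    for v, nbrs in filtered_neighbors.items():
        idx = [v] + list(nbrs)
        out[v] = E[idx].mean(axis=0)
    return out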
Manifold Proximal Point Algorithms for Dual Principal Component Pursuit and Orthogonal Dictionary Learning
Chen, Shixiang, Deng, Zengde, Ma, Shiqian, So, Anthony Man-Cho
We consider the problem of minimizing the $\ell_1$ norm of a linear map over the sphere, which arises in various machine learning applications such as orthogonal dictionary learning (ODL) and robust subspace recovery (RSR). The problem is numerically challenging due to its nonsmooth objective and nonconvex constraint, and its algorithmic aspects have not been well explored. In this paper, we show how the manifold structure of the sphere can be exploited to design fast algorithms for tackling it. Specifically, our contribution is threefold. First, we present a manifold proximal point algorithm (ManPPA) for the problem and show that it converges at a sublinear rate. Furthermore, we show that ManPPA can achieve a quadratic convergence rate when applied to the ODL and RSR problems. Second, we propose a stochastic variant of ManPPA called StManPPA, which is well suited for large-scale computation, and establish its sublinear convergence rate. Both ManPPA and StManPPA have provably faster convergence rates than existing subgradient-type methods. Third, using ManPPA as a building block, we propose a new approach to solving a matrix analog of the problem, in which the sphere is replaced by the Stiefel manifold. The results of our extensive numerical experiments on the ODL and RSR problems demonstrate the efficiency and efficacy of the proposed methods.
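To fix ideas, here is a heavily simplified Python sketch of one ManPPA-style iteration for $\min_{\|\boldsymbol{x}\|_2=1}\|\boldsymbol{A}^\top\boldsymbol{x}\|_1$: the convex proximal subproblem over the tangent space is solved approximately by a plain projected subgradient inner loop standing in for the paper's more efficient subproblem solver, and all parameters are illustrative.

import numpy as np

def manppa_step(A, x, t=0.1, inner_iters=50, inner_step=0.01):
    # One (approximate) ManPPA iteration: minimize the proximal
    # subproblem  ||A.T @ (x + d)||_1 + ||d||^2 / (2 t)  over the
    # tangent space d ⟂ x, then retract to the unit sphere.
    d = np.zeros_like(x)
    P = np.eye(len(x)) - np.outer(x, x)          # tangent projector at x
    for _ in range(inner_iters):
        g = A @ np.sign(A.T @ (x + d)) + d / t   # subgradient of subproblem
        d = P @ (d - inner_step * g)             # projected subgradient step
    y = x + d
    return y / np.linalg.norm(y)                 # retraction onto the sphere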
Voting-Based Multi-Agent Reinforcement Learning
Xu, Yue, Deng, Zengde, Wang, Mengdi, Xu, Wenjun, So, Anthony Man-Cho, Cui, Shuguang
The recent success of single-agent reinforcement learning (RL) encourages the exploration of multi-agent reinforcement learning (MARL), which is more challenging due to the interactions among different agents. In this paper, we consider a voting-based MARL problem in which the agents vote to make group decisions, and the goal is to maximize the globally averaged return. To this end, we formulate the MARL problem based on the linear programming form of the policy optimization problem and propose a distributed primal-dual algorithm to obtain the optimal solution. We also propose a voting mechanism through which the distributed learning achieves the same sublinear convergence rate as centralized learning; in other words, the distributed decision making does not slow down convergence to the global optimum. We verify the convergence of the proposed algorithm with numerical simulations and conduct case studies in practical multi-agent systems.
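The voting idea can be pictured with a toy gossip-based sketch (purely illustrative, not the paper's exact protocol): agents hold local score vectors over candidate group decisions and average them over the communication graph until every agent can read off the same winner.

import numpy as np

def consensus_vote(scores, W, rounds=10):
    # Hypothetical voting step: `scores` has one row per agent,
    # one column per candidate group action. A few rounds of
    # gossip averaging with a doubly stochastic mixing matrix W
    # drive every row toward the global average, so each agent
    # can locally identify the consensus winner.
    x = np.array(scores, dtype=float)
    for _ in range(rounds):
        x = W @ x
    return int(np.argmax(x[0]))   # any agent's row approximates the average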
An Efficient Augmented Lagrangian Based Method for Constrained Lasso
Deng, Zengde, So, Anthony Man-Cho
Variable selection is one of the most important tasks in statistics and machine learning. To incorporate more prior information about the regression coefficients, the constrained Lasso model has been proposed in the literature. In this paper, we present an inexact augmented Lagrangian method to solve the Lasso problem with linear equality constraints. By fully exploiting the second-order sparsity of the problem, we are able to greatly reduce the computational cost and obtain highly efficient implementations. Furthermore, numerical results on both synthetic and real data show that our algorithm is superior to existing first-order methods in terms of both running time and solution accuracy.
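A first-order stand-in for the inexact augmented Lagrangian loop on $\min_{\boldsymbol{x}} \frac{1}{2}\|A\boldsymbol{x}-b\|^2+\lambda\|\boldsymbol{x}\|_1$ subject to $C\boldsymbol{x}=d$ (the paper instead exploits second-order sparsity to solve each subproblem far more efficiently; this sketch only conveys the outer structure):

import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def constrained_lasso_alm(A, b, C, d, lam, rho=1.0, outer=50, inner=100):
    # Inexact augmented Lagrangian method: each subproblem is
    # solved approximately by ISTA; the multiplier y is then
    # updated with the constraint residual C x - d.
    x = np.zeros(A.shape[1])
    y = np.zeros(C.shape[0])
    # Conservative stepsize from the smooth part's Lipschitz bound.
    L = np.linalg.norm(A, 2) ** 2 + rho * np.linalg.norm(C, 2) ** 2
    for _ in range(outer):
        for _ in range(inner):
            grad = A.T @ (A @ x - b) + C.T @ (y + rho * (C @ x - d))
            x = soft_threshold(x - grad / L, lam / L)
        y = y + rho * (C @ x - d)
    return x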