inertial bregman proximal gradient algorithm
Beyond Alternating Updates for Matrix Factorization with Inertial Bregman Proximal Gradient Algorithms
Matrix Factorization is a popular non-convex optimization problem, for which alternating minimization schemes are mostly used. They usually suffer from the major drawback that the solution is biased towards one of the optimization variables. A remedy is non-alternating schemes. However, due to a lack of Lipschitz continuity of the gradient in matrix factorization problems, convergence cannot be guaranteed. A recently developed approach relies on the concept of Bregman distances, which generalizes the standard Euclidean distance. We exploit this theory by proposing a novel Bregman distance for matrix factorization problems, which, at the same time, allows for simple/closed form update steps. Therefore, for non-alternating schemes, such as the recently introduced Bregman Proximal Gradient (BPG) method and an inertial variant Convex--Concave Inertial BPG (CoCaIn BPG), convergence of the whole sequence to a stationary point is proved for Matrix Factorization. In several experiments, we observe a superior performance of our non-alternating schemes in terms of speed and objective value at the limit point.
Reviews: Beyond Alternating Updates for Matrix Factorization with Inertial Bregman Proximal Gradient Algorithms
The Bregman Proximal Gradient (BPG) method was introduced in general form with theory by Bolte et al. in [8] for nonconvex optimization, and an inertial version with convex-concave backtracking (CoCaInBPG) was introduced by Mukkamala et al. in [45]. In section 6.6 of the same paper, Mukkamala et al. apply CoCaInBPG to the problem of structured matrix factorization and show good performance, but leave the theory and computational efficiency open. Additionally, BPG without inertia was applied sucessfully to the problem of symmetric non-negative matrix factorization (SNMF) by Dragomir et al. in [20], which appears in its most recent version to also include more general (symmetric) matrix completion. The submitted paper here is concerned with applying BPG and CoCaInBPG to the non-symmetric matrix factorization problem, essentially picking up where [45] left off and providing work complementary to [20]. This paper first restates the results of [8,45] for the specific objective (1.1) of matrix factorization, then makes its two primary contributions. First, the paper introduces a kernel generating distance function h that is appropriate for matrix factorization, which is related to the universal function of [20] but extended non-trivially to the non-symmetric case.
Beyond Alternating Updates for Matrix Factorization with Inertial Bregman Proximal Gradient Algorithms
Matrix Factorization is a popular non-convex optimization problem, for which alternating minimization schemes are mostly used. They usually suffer from the major drawback that the solution is biased towards one of the optimization variables. A remedy is non-alternating schemes. However, due to a lack of Lipschitz continuity of the gradient in matrix factorization problems, convergence cannot be guaranteed. A recently developed approach relies on the concept of Bregman distances, which generalizes the standard Euclidean distance. We exploit this theory by proposing a novel Bregman distance for matrix factorization problems, which, at the same time, allows for simple/closed form update steps.
Beyond Alternating Updates for Matrix Factorization with Inertial Bregman Proximal Gradient Algorithms
Mukkamala, Mahesh Chandra, Ochs, Peter
Matrix Factorization is a popular non-convex optimization problem, for which alternating minimization schemes are mostly used. They usually suffer from the major drawback that the solution is biased towards one of the optimization variables. A remedy is non-alternating schemes. However, due to a lack of Lipschitz continuity of the gradient in matrix factorization problems, convergence cannot be guaranteed. A recently developed approach relies on the concept of Bregman distances, which generalizes the standard Euclidean distance.