Goto

Collaborating Authors

 Optimization


A primal-dual method for conic constrained distributed optimization problems

Neural Information Processing Systems

We consider cooperative multi-agent consensus optimization problems over anundirected network of agents, where only those agents connected by an edgecan directly communicate. The objective is to minimize the sum of agent-specific composite convex functions over agent-specific private conic constraintsets; hence, the optimal consensus decision should lie in the intersection of theseprivate sets. We provide convergence rates in sub-optimality, infeasibility andconsensus violation; examine the effect of underlying network topology on theconvergence rates of the proposed decentralized algorithms; and show how to ex-tend these methods to handle time-varying communication networks.




Tight Complexity Bounds for Optimizing Composite Objectives

Neural Information Processing Systems

We provide tight upper and lower bounds on the complexity of minimizing the average of m convex functions using gradient and prox oracles of the component functions. We show a significant gap between the complexity of deterministic vs randomized optimization. For smooth functions, we show that accelerated gradient descent (AGD) and an accelerated variant of SVRG are optimal in the deterministic and randomized settings respectively, and that a gradient oracle is sufficient for the optimal rate. For non-smooth functions, having access to prox oracles reduces the complexity and we present optimal methods based on smoothing that improve over methods using just gradient accesses.




Sub-sampled Newton Methods with Non-uniform Sampling

Neural Information Processing Systems

We consider the regime where nd. We propose randomized Newton-type algorithms that exploit non-uniform sub-sampling of { 2fi(w)}ni=1, as well as inexact updates, as means to reduce the computational complexity, and are applicable to a wide range of problems in machine learning. Two non-uniform sampling distributions based on block norm squares and block partial leverage scores are considered. Under certain assumptions, we show that our algorithms inherit a linear-quadratic convergence rate in w and achieve a lower computational complexity compared to similar existing methods. In addition, we show that our algorithms exhibit more robustness and better dependence on problem specific quantities, such as the condition number. We empirically demonstrate that our methods are at least twice as fast as Newton's methods on several real datasets.


Convex Two-Layer Modeling with Latent Structure

Neural Information Processing Systems

Unsupervised learning of structured predictors has been a long standing pursuit in machine learning. Recently a conditional random field auto-encoder has been proposed in a two-layer setting, allowing latent structured representation to be automatically inferred. Aside from being nonconvex, it also requires the demanding inference of normalization. In this paper, we develop a convex relaxation of two-layer conditional model which captures latent structure and estimates model parameters, jointly and optimally. We further expand its applicability by resorting to a weaker form of inference--maximum a-posteriori. The flexibility of the model is demonstrated on two structures based on total unimodularity--graph matching and linear chain. Experimental results confirm the promise of the method.


Coevolutionary Latent Feature Processes for Continuous-Time User-Item Interactions

Neural Information Processing Systems

Matching users to the right items at the right time is a fundamental task in recommendation systems. As users interact with different items over time, users' and items' feature may evolve and co-evolve over time. Traditional models based on static latent features or discretizing time into epochs can become ineffective for capturing the fine-grained temporal dynamics in the user-item interactions. We propose a coevolutionary latent feature process model that accurately captures the coevolving nature of users' and items' feature. To learn parameters, we design an efficient convex optimization algorithm with a novel low rank space sharing constraints. Extensive experiments on diverse real-world datasets demonstrate significant improvements in user behavior prediction compared to state-of-the-arts.


SDP Relaxation with Randomized Rounding for Energy Disaggregation

Neural Information Processing Systems

We develop a scalable, computationally efficient method for the task of energy disaggregation for home appliance monitoring. In this problem the goal is to estimate the energy consumption of each appliance over time based on the total energy-consumption signal of a household. The current state of the art is to model the problem as inference in factorial HMMs, and use quadratic programming to find an approximate solution to the resulting quadratic integer program. Here we take a more principled approach, better suited to integer programming problems, and find an approximate optimum by combining convex semidefinite relaxations randomized rounding, as well as a scalable ADMM method that exploits the special structure of the resulting semidefinite program. Simulation results both in synthetic and real-world datasets demonstrate the superiority of our method.