 Optimization


(Probably) Concave Graph Matching

Neural Information Processing Systems

In this paper we address the graph matching problem. Following the recent works of \cite{zaslavskiy2009path,Vestner2017}, we analyze and generalize the idea of concave relaxations. We introduce the concepts of conditionally concave and probably conditionally concave energies on polytopes and show that they encapsulate many instances of the graph matching problem, including matching Euclidean graphs and graphs on surfaces. We further prove that local minima of probably conditionally concave energies on general matching polytopes (e.g., doubly stochastic) are with high probability extreme points of the matching polytope (e.g., permutations).


An Efficient Pruning Algorithm for Robust Isotonic Regression

Neural Information Processing Systems

We study a generalization of the classic isotonic regression problem where we allow separable nonconvex objective functions, focusing on the case of estimators used in robust regression. A simple dynamic programming approach allows us to solve this problem to within ε-accuracy (of the global minimum) in time linear in 1/ε and the dimension. We can combine techniques from the convex case with branch-and-bound ideas to form a new algorithm for this problem that naturally exploits the shape of the objective function. Our algorithm achieves the best bounds for both the general nonconvex and convex case (linear in log(1/ε)), while performing much faster in practice than a straightforward dynamic programming approach, especially as the desired accuracy increases.
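The straightforward dynamic programming baseline mentioned above can be sketched as follows. This is a minimal illustration, not the paper's pruning algorithm: it assumes the fitted values are restricted to a uniform grid of `k` points on `[lo, hi]`, and the names `isotonic_dp`, `loss`, `lo`, `hi`, and `k` are hypothetical. A running prefix minimum keeps the cost at O(nk) for n data points.

```python
def isotonic_dp(y, loss, lo, hi, k):
    """Minimise sum(loss(x_i - y_i)) over nondecreasing x with values on a k-point grid."""
    grid = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    # cost[j]: best objective for the points seen so far, with the last fitted value grid[j].
    cost = [loss(g - y[0]) for g in grid]
    back = []  # back[i-1][j]: grid index of point i-1 when point i uses index j
    for i in range(1, len(y)):
        best_idx = []
        best = 0
        for j in range(k):
            # Running argmin over indices <= j enforces the monotonicity constraint.
            if cost[j] < cost[best]:
                best = j
            best_idx.append(best)
        back.append(best_idx)
        cost = [cost[best_idx[j]] + loss(grid[j] - y[i]) for j in range(k)]
    # Recover the fitted sequence by walking the back-pointers from the last point.
    j = min(range(k), key=cost.__getitem__)
    total = cost[j]
    fit = [grid[j]]
    for best_idx in reversed(back):
        j = best_idx[j]
        fit.append(grid[j])
    fit.reverse()
    return fit, total


# A truncated quadratic is one example of a nonconvex robust loss (an assumption for this demo).
fit, total = isotonic_dp([0.0, 0.0, 10.0, 0.0, 0.0], lambda r: min(r * r, 1.0), 0.0, 10.0, 11)
```

With the truncated loss, the single outlier at 10 contributes a bounded penalty, so the fit stays flat instead of being pulled upward. Refining the grid improves accuracy at a cost that grows linearly in the number of grid points.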


Fast Similarity Search via Optimal Sparse Lifting

Neural Information Processing Systems

Similarity search is a fundamental problem in computing science with various applications and has attracted significant research attention, especially in large-scale search with high dimensions. Motivated by evidence from biological science, our work develops a novel approach for similarity search. Fundamentally different from existing methods that typically reduce the dimension of the data to lessen the computational complexity and speed up the search, our approach projects the data into an even higher-dimensional space while ensuring the sparsity of the data in the output space, with the objective of further improving precision and speed. Specifically, our approach has two key steps. Firstly, it computes the optimal sparse lifting for given input samples and increases the dimension of the data while approximately preserving their pairwise similarity. Secondly, it seeks the optimal lifting operator that best maps input samples to the optimal sparse lifting. Computationally, both steps are modeled as optimization problems that can be efficiently and effectively solved by the Frank-Wolfe algorithm. Simple as it is, our approach has reported significantly improved results in empirical evaluations, and exhibited high potential for solving practical problems.
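For readers unfamiliar with the Frank-Wolfe algorithm named above, here is a generic sketch on the probability simplex. It is not the paper's sparse-lifting formulation: the objective (squared distance to a point `b`), the function name `frank_wolfe_simplex`, and the step count are assumptions chosen only to show the linear minimisation step and the convex-combination update.

```python
def frank_wolfe_simplex(grad, x0, steps=2000):
    """Minimise a smooth convex function over the probability simplex."""
    x = list(x0)
    for t in range(steps):
        g = grad(x)
        # Linear minimisation oracle: the simplex vertex with the smallest gradient entry.
        i = min(range(len(x)), key=g.__getitem__)
        gamma = 2.0 / (t + 2.0)  # standard diminishing step size
        # Move toward the chosen vertex; the iterate stays inside the simplex.
        x = [(1.0 - gamma) * xi for xi in x]
        x[i] += gamma
    return x


# Hypothetical objective: squared distance to a target point b inside the simplex.
b = [0.2, 0.5, 0.3]
x = frank_wolfe_simplex(
    lambda v: [2.0 * (vi - bi) for vi, bi in zip(v, b)],
    [1.0, 0.0, 0.0],
)
```

Each iteration needs only a gradient and a linear minimisation over the feasible set, with no projection step, which is why the method suits constraint sets whose vertices are sparse.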


Generalized Inverse Optimization through Online Learning

Neural Information Processing Systems

Inverse optimization is a powerful paradigm for learning preferences and restrictions that explain the behavior of a decision maker, based on a set of external signals and the corresponding decisions. However, most inverse optimization algorithms are designed specifically for the batch setting, where all the data is available in advance. As a consequence, these methods have rarely been used in an online setting suitable for real-time applications. In this paper, we propose a general framework for inverse optimization through online learning. Specifically, we develop an online learning algorithm that uses an implicit update rule which can handle noisy data. Moreover, under additional regularity assumptions in terms of the data and the model, we prove that our algorithm converges at a rate of $\mathcal{O}(1/\sqrt{T})$ and is statistically consistent. In our experiments, we show the online learning approach can learn the parameters with great accuracy, is very robust to noise, and achieves a dramatic improvement in computational efficacy over the batch learning approach.


Exact Guarantees on the Absence of Spurious Local Minima for Non-negative Robust Principal Component Analysis

arXiv.org Machine Learning

This work is concerned with the non-negative robust principal component analysis (PCA), where the goal is to recover the dominant non-negative principal component of a data matrix precisely, even when a number of measurements could be grossly corrupted with sparse and arbitrarily large noise. Most of the known techniques for solving the robust PCA rely on convex relaxation methods by lifting the problem to a higher dimension, which significantly increases the number of variables. As an alternative, the well-known Burer-Monteiro approach can be used to cast the robust PCA as a non-convex and non-smooth $\ell_1$ optimization problem with a significantly smaller number of variables. In this work, we show that the low-dimensional formulation of the symmetric and asymmetric positive robust PCA based on the Burer-Monteiro approach has a benign landscape, i.e., 1) it does not have any spurious local solution, 2) it has a unique global solution, and 3) its unique global solution coincides with the true components. An implication of this result is that simple local search algorithms are guaranteed to achieve a zero global optimality gap when directly applied to the low-dimensional formulation. Furthermore, we provide strong deterministic and statistical guarantees for the exact recovery of the true principal component. In particular, it is shown that a constant fraction of the measurements could be grossly corrupted and yet they would not create any spurious local solution.


Hessian-Aware Zeroth-Order Optimization for Black-Box Adversarial Attack

arXiv.org Machine Learning

Zeroth-order optimization, or derivative-free optimization, is an important research topic in machine learning. Recently, it has become a key tool in black-box adversarial attacks on neural-network-based image classifiers. However, existing zeroth-order optimization algorithms rarely extract Hessian information of the model function. In this paper, we utilize the second-order information of the objective function and propose a novel Hessian-aware zeroth-order algorithm called ZO-HessAware. Our theoretical result shows that ZO-HessAware has an improved zeroth-order convergence rate and query complexity under structured Hessian approximation, for which we propose several approximation methods. Our empirical studies on the black-box adversarial attack problem validate that our algorithm can achieve improved success rates with a lower query complexity.
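As background, the basic idea of zeroth-order optimization is to estimate a gradient from function values alone. The sketch below shows a plain two-point Gaussian-direction estimator; it does not use Hessian information and is not ZO-HessAware. The function name `zo_gradient` and the parameters `mu`, `samples`, and `seed` are assumptions made for illustration.

```python
import random


def zo_gradient(f, x, mu=1e-3, samples=20000, seed=0):
    """Estimate the gradient of f at x using only function evaluations."""
    rng = random.Random(seed)
    d = len(x)
    g = [0.0] * d
    for _ in range(samples):
        # Draw a random Gaussian search direction.
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        xp = [xi + mu * ui for xi, ui in zip(x, u)]
        xm = [xi - mu * ui for xi, ui in zip(x, u)]
        # Two function queries give a directional-derivative estimate along u.
        slope = (f(xp) - f(xm)) / (2.0 * mu)
        for i in range(d):
            g[i] += slope * u[i] / samples
    return g


# Hypothetical black-box objective: only its values are queried, never its derivatives.
g_hat = zo_gradient(lambda v: sum(vi * vi for vi in v), [1.0, -2.0, 0.5])
```

Every direction costs two queries, so the sample count is the query complexity of one gradient estimate. In a black-box attack each query is a call to the target classifier, which is why reducing the number of queries matters.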


A new Shamoon 3 sample uploaded to VirusTotal from France

#artificialintelligence

A new sample of the dreaded Shamoon wiper was uploaded on December 23 to the VirusTotal platform from France. This sample attempts to disguise itself as a system optimization tool developed by Chinese technology company Baidu. The new variant is signed with a digital certificate from Baidu that was issued on March 25, 2015 and that expired on March 26, 2016. This sample was packed using the commercial packing tool Enigma version 4. Researchers from Anomali Labs have analyzed the latest variant of the wiper and discovered that it uses an image of a burning US Dollar as part of its destructive attack and includes the text "WE WILL TAKE REVENGE ON THE BLOOD AND TEARS OF OUR CHILDREN." In the attempt to deceive the victims, attackers used the internal file name "Baidu PC Faster" and the "Baidu WiFi Hotspot Setup" in the description of the file.


Sparse Nonnegative CANDECOMP/PARAFAC Decomposition in Block Coordinate Descent Framework: A Comparison Study

arXiv.org Machine Learning

Nonnegative CANDECOMP/PARAFAC (NCP) decomposition is an important tool for processing nonnegative tensors. Sometimes, additional sparse regularization is needed to extract meaningful nonnegative and sparse components. Thus, an optimization method for NCP that can impose sparsity efficiently is required. In this paper, we construct NCP with sparse regularization (sparse NCP) using the $\ell_1$-norm. Several popular optimization methods in the block coordinate descent framework are employed to solve the sparse NCP, all of which are analyzed in depth with mathematical solutions. We compare these methods by experiments on synthetic and real tensor data, both of which contain third-order and fourth-order cases. From this comparison, we identify the methods that compute quickly and impose sparsity effectively. In addition, we propose an accelerated method to compute the objective function and relative error of sparse NCP, which significantly improves the computation of tensor decomposition, especially for higher-order tensors.


A Greedy Approach to $\ell_{0,\infty}$ Based Convolutional Sparse Coding

arXiv.org Machine Learning

Sparse coding techniques for image processing traditionally rely on processing small overlapping patches separately, followed by averaging. This has the disadvantage that the reconstructed image no longer obeys the sparsity prior used in the processing. To address this, convolutional sparse coding has been introduced, where a shift-invariant dictionary is used and the sparsity of the recovered image is maintained. Most such strategies target the $\ell_0$ "norm" or the $\ell_1$ norm of the whole image, which may create an imbalanced sparsity across various regions in the image. In order to face this challenge, the $\ell_{0,\infty}$ "norm" has been proposed as an alternative that "operates locally while thinking globally". The approaches taken for tackling the non-convexity of these optimization problems have been either using a convex relaxation or local pursuit algorithms. In this paper, we present an efficient greedy method for sparse coding and dictionary learning, which is specifically tailored to $\ell_{0,\infty}$, and is based on matching pursuit. We demonstrate the usage of our approach in salt-and-pepper noise removal and image inpainting. A code package which reproduces the experiments presented in this work is available at https://web.eng.tau.ac.il/~raja
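The greedy method above builds on matching pursuit. For orientation, here is a sketch of the classic, non-convolutional matching pursuit loop; it is not the $\ell_{0,\infty}$-tailored variant from the paper. It assumes every dictionary atom has unit Euclidean norm, and the name `matching_pursuit` and the toy dictionary are hypothetical.

```python
def matching_pursuit(signal, atoms, n_iter):
    """Greedy sparse coding; atoms are assumed to have unit Euclidean norm."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Correlate every atom with the current residual.
        corr = [sum(a * r for a, r in zip(atom, residual)) for atom in atoms]
        # Pick the atom that explains the most remaining energy.
        k = max(range(len(atoms)), key=lambda j: abs(corr[j]))
        coeffs[k] += corr[k]
        # Subtract that atom's contribution from the residual.
        residual = [r - corr[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual


# Toy orthonormal dictionary (an assumption for this demo); the signal uses two atoms.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coeffs, residual = matching_pursuit([2.0, 0.0, -1.0], atoms, n_iter=2)
```

With an orthonormal dictionary the loop recovers the exact coefficients in as many iterations as there are active atoms. With overcomplete or shift-invariant dictionaries, atoms overlap and the greedy choice is only approximate.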


Portfolio Optimization for Cointelated Pairs: SDEs vs. Machine Learning

arXiv.org Machine Learning

We investigate the problem of dynamic portfolio optimization in a continuous-time, finite-horizon setting for a portfolio of two stocks and one risk-free asset. The stocks follow the recently introduced Cointelation model [7]. The proposed optimization methods are twofold. In what we call a Stochastic Differential Equation approach, we compute the optimal weights using the mean-variance criterion and power utility maximization. We show that dynamically switching between these two optimal strategies by introducing a triggering function can further improve the portfolio returns. We contrast this with the machine learning clustering methodology inspired by the band-wise Gaussian mixture model [9]. The first benefit of machine learning over the Stochastic Differential Equation approach is that we were able to achieve the same results through a simpler channel. The second advantage is its flexibility to regime change.