 Optimization


Co-Separable Nonnegative Matrix Factorization

arXiv.org Machine Learning

Nonnegative matrix factorization (NMF) is a popular model in the field of pattern recognition. It aims to find a low-rank approximation of nonnegative data M as a product of two nonnegative matrices W and H. In general, NMF is NP-hard to solve, but it can be solved efficiently under the separability assumption, which requires that the columns of the factor matrix W are equal to columns of the input matrix. In this paper, we generalize the separability assumption based on the 3-factor NMF M=P_1SP_2, and require that S is a sub-matrix of the input matrix. We refer to this NMF as Co-Separable NMF (CoS-NMF). We discuss some mathematical properties of CoS-NMF, and present its relationships with other related matrix factorizations such as CUR decomposition, generalized separable NMF (GS-NMF), and bi-orthogonal tri-factorization (BiOR-NM3F). An optimization model for CoS-NMF is proposed and an alternating fast gradient method is employed to solve the model. Numerical experiments on synthetic datasets, document datasets and facial databases are conducted to verify the effectiveness of our CoS-NMF model. Compared to state-of-the-art methods, the CoS-NMF model performs very well in the co-clustering task, and preserves a good approximation to the input data matrix as well.
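To make the co-separability condition concrete, the following is a minimal sketch (not the authors' algorithm) that builds a small nonnegative matrix M = P_1 S P_2 in which S appears as a sub-matrix of M. It assumes, purely for illustration, that P_1 = [I; A] and P_2 = [I, B] with no row or column permutation, so that S sits in the top-left corner; the function name `make_coseparable` and the block names A and B are hypothetical.

```python
import numpy as np

def make_coseparable(S, A, B):
    """Build M = P1 @ S @ P2 with P1 = [I; A] and P2 = [I, B].

    The identity blocks are an illustrative assumption: they force the
    core S to appear unchanged as the top-left sub-matrix of M.
    """
    r1, r2 = S.shape
    P1 = np.vstack([np.eye(r1), A])   # shape (r1 + A.shape[0], r1)
    P2 = np.hstack([np.eye(r2), B])   # shape (r2, r2 + B.shape[1])
    return P1 @ S @ P2, P1, P2

rng = np.random.default_rng(0)
S = rng.random((2, 3))   # nonnegative core
A = rng.random((4, 2))   # nonnegative row weights
B = rng.random((3, 5))   # nonnegative column weights
M, P1, P2 = make_coseparable(S, A, B)

# The core is recovered by selecting the first 2 rows and 3 columns of M.
print(np.allclose(M[:2, :3], S))
```

In this toy setting the selected rows and columns are known in advance; the difficulty addressed by a CoS-NMF model is finding them from M alone.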


E-Commerce Promotions Personalization via Online Multiple-Choice Knapsack with Uplift Modeling

arXiv.org Artificial Intelligence

Promotions and discounts are essential components of modern e-commerce platforms, where they are often used to incentivize customers towards purchase completion. Promotions also affect revenue and may incur a monetary loss that is often limited by a dedicated promotional budget. We study the Online Constrained Multiple-Choice Promotions Personalization Problem, where the optimization goal is to select for each customer which promotion to present in order to maximize purchase completions, while also complying with global budget limitations. Our work formalizes the problem as an Online Multiple-Choice Knapsack Problem and extends the existing literature by addressing cases with negative weights and values. We provide a real-time adaptive method that guarantees compliance with budget constraints and achieves above 99.7% of the optimal promotional impact on various datasets. Our method is evaluated in a large-scale experimental study at one of the leading online travel platforms in the world.
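The multiple-choice knapsack structure can be illustrated with a small offline example. The sketch below is not the paper's online method: it enumerates every assignment for a toy instance, which is only feasible for a handful of customers. The function name `solve_mckp_bruteforce`, the integer values and weights, and the budget are all made-up assumptions; one option carries a negative weight to show that such cases fit the same formulation.

```python
from itertools import product

def solve_mckp_bruteforce(options, budget):
    """Pick exactly one (value, weight) option per customer.

    options[i] lists the promotions available to customer i.
    Returns the best total value and the chosen option index per customer,
    subject to the total weight staying within the budget.
    """
    best_value, best_choice = float("-inf"), None
    for choice in product(*(range(len(opts)) for opts in options)):
        value = sum(options[i][j][0] for i, j in enumerate(choice))
        weight = sum(options[i][j][1] for i, j in enumerate(choice))
        if weight <= budget and value > best_value:
            best_value, best_choice = value, choice
    return best_value, best_choice

# Hypothetical instance: (value, weight) per promotion, two customers.
options = [
    [(10, 0), (30, 5), (35, 8)],
    [(20, 0), (25, 5), (12, -2)],   # last option has a negative weight
]
best_value, best_choice = solve_mckp_bruteforce(options, budget=6)
print(best_value, best_choice)  # 50 (1, 0)
```

An online variant has to commit to a choice for each customer as they arrive, without seeing future customers, which is what makes the budget guarantee non-trivial.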


Lagrange Multiplier Approach with Inequality Constraints

#artificialintelligence

In a previous post, we introduced the method of Lagrange multipliers to find local minima or local maxima of a function with equality constraints. The same method can be applied to those with inequality constraints as well. In this tutorial, you will discover the method of Lagrange multipliers applied to find the local minimum or maximum of a function when inequality constraints are present, optionally together with equality constraints.
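As a small worked example of the idea (chosen here for illustration, not taken from the tutorial), consider minimizing $x^2 + y^2$ subject to $x + y \ge 1$. Writing the constraint as $g(x, y) = 1 - x - y \le 0$ and attaching a multiplier $\mu \ge 0$ gives:

```latex
\begin{aligned}
L(x, y, \mu) &= x^2 + y^2 + \mu\,(1 - x - y), \qquad \mu \ge 0, \\
\frac{\partial L}{\partial x} &= 2x - \mu = 0, \qquad
\frac{\partial L}{\partial y} = 2y - \mu = 0, \\
\mu\,(1 - x - y) &= 0 \quad \text{(complementary slackness)}.
\end{aligned}
% Case 1: \mu = 0 gives x = y = 0, which violates x + y \ge 1, so it is rejected.
% Case 2: the constraint is active, x + y = 1. With x = y = \mu/2 this yields
%         \mu = 1, \; x = y = \tfrac{1}{2}, \; f(x, y) = \tfrac{1}{2}.
```

The multiplier is positive exactly because the constraint is active at the solution; had the unconstrained minimum been feasible, the first case would have applied with $\mu = 0$.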


Election Manipulation on Social Networks: Seeding, Edge Removal, Edge Addition

Journal of Artificial Intelligence Research

We focus on the election manipulation problem through social influence, where a manipulator exploits a social network to make her most preferred candidate win an election. Influence is due to information in favor of and/or against one or multiple candidates, sent by seeds and spreading through the network according to the independent cascade model. We provide a comprehensive theoretical study of the election control problem, investigating two forms of manipulation: seeding to buy influencers given a social network, and removing or adding edges in the social network given the set of seeds and the information sent. In particular, we study a wide range of cases differing in the number of candidates or the kind of information spread over the network. Our main result shows that the election manipulation problem is not affordable in the worst case, even when one accepts an approximation of the optimal margin of victory, except for the case of seeding when the number of hard-to-manipulate voters is not too large and the number of uncertain voters is not too small. Here we say that a voter who does not vote for the manipulator's candidate is hard-to-manipulate if there is no way to make her vote for this candidate, and uncertain otherwise. We also provide some results showing the hardness of the problems in special cases. More precisely, in the case of seeding, we show that the manipulation is hard even if the graph is a line and that a large class of algorithms, including most of the approaches recently adopted for social-influence problems (e.g., greedy, degree centrality, PageRank, VoteRank), fails to compute a bounded approximation even on elementary networks, such as undirected graphs with every node having a degree at most two or directed trees. In the case of edge removal or addition, our hardness results also apply to election manipulation when the manipulator has an unlimited budget, being allowed to remove or add an arbitrary number of edges, and to the basic case of social influence maximization/minimization in the restricted case of finite budget. Interestingly, our hardness results for seeding and edge removal/addition still hold in a re-optimization variant, where the manipulator already knows an optimal solution to the problem and computes a new solution once a local modification occurs, e.g., the removal/addition of a single edge.
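For readers unfamiliar with the independent cascade model mentioned above, here is a minimal simulation sketch. It assumes a single uniform activation probability on every edge and an adjacency-list graph; the function name `independent_cascade` and the toy line graph are illustrative assumptions and do not reproduce the paper's election setting (there are no candidates or votes here, only the spread of activation from seeds).

```python
import random

def independent_cascade(graph, seeds, prob, rng):
    """Simulate one run of the independent cascade model.

    graph: dict mapping each node to a list of out-neighbors.
    Each newly activated node gets exactly one chance to activate each
    still-inactive out-neighbor, succeeding with probability `prob`.
    Returns the set of nodes active when the cascade stops.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < prob:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

# A directed line 0 -> 1 -> 2 -> 3, seeded at node 0.
line_graph = {0: [1], 1: [2], 2: [3], 3: []}
reached_all = independent_cascade(line_graph, seeds=[0], prob=1.0, rng=random.Random(0))
reached_none = independent_cascade(line_graph, seeds=[0], prob=0.0, rng=random.Random(0))
print(sorted(reached_all), sorted(reached_none))  # [0, 1, 2, 3] [0]
```

With intermediate probabilities the outcome is random, so expected spread is usually estimated by averaging many runs.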


Fast AutoML with FLAML + Ray Tune

#artificialintelligence

FLAML is a lightweight Python library from Microsoft Research that finds accurate machine learning models in an efficient and economical way, using cutting-edge algorithms designed to be resource-efficient and easily parallelizable. FLAML can also utilize Ray Tune for distributed hyperparameter tuning to scale up these AutoML methods across a cluster. AutoML is known to be a resource- and time-consuming operation, as it involves trial and error to find a hyperparameter configuration with good performance. Since the space of possible configuration values is often very large, there is a need for an economical AutoML method that can search it more effectively. To address both of these factors, researchers at Microsoft have developed FLAML (Fast Lightweight AutoML).


Few-shot Visual Relationship Co-localization

arXiv.org Artificial Intelligence

In this paper, given a small bag of images, each containing a common but latent predicate, we are interested in localizing visual subject-object pairs connected via the common predicate in each of the images. We refer to this novel problem as visual relationship co-localization, or VRC for short. VRC is a challenging task, even more so than the well-studied object co-localization task. It becomes even more challenging when only a few images are available and the model has to learn to co-localize visual subject-object pairs connected via unseen predicates. To solve VRC, we propose an optimization framework to select a common visual relationship in each image of the bag. The goal of the optimization framework is to find the optimal solution by learning visual relationship similarity across images in a few-shot setting. To obtain a robust visual relationship representation, we utilize a simple yet effective technique that learns the relationship embedding as a translation vector from visual subject to visual object in a shared space. Further, to learn visual relationship similarity, we utilize a proven meta-learning technique commonly used for few-shot classification tasks. Finally, to tackle the combinatorial complexity arising from an exponential number of feasible solutions, we use a greedy approximation inference algorithm that selects approximately the best solution. We extensively evaluate our proposed framework on variations of bag sizes obtained from two challenging public datasets, namely VrR-VG and VG-150, and achieve impressive visual co-localization performance.


Optimization Case Study: Defining the problem -- Part 1

#artificialintelligence

This is a two-part case study: in part one we define the optimization problem, and in part two we use the PuLP Python library to solve the business problem. Optimization problems are a dilemma for any business. They come in two forms, maximization or minimization, of a quantity such as profit or cost. A typical optimization problem can be solved with an optimization method, which is itself mathematical. Therefore we need to represent the components of the problem mathematically.


Predicting Census Survey Response Rates via Interpretable Nonparametric Additive Models with Structured Interactions

arXiv.org Machine Learning

Accurate and interpretable prediction of survey response rates is important from an operational standpoint. The US Census Bureau's well-known ROAM application uses principled statistical models trained on the US Census Planning Database data to identify hard-to-survey areas. An earlier crowdsourcing competition revealed that an ensemble of regression trees led to the best performance in predicting survey response rates; however, the corresponding models could not be adopted for the intended application due to limited interpretability. In this paper, we present new interpretable statistical methods to predict, with high accuracy, response rates in surveys. We study sparse nonparametric additive models with pairwise interactions via $\ell_0$-regularization, as well as hierarchically structured variants that provide enhanced interpretability. Despite strong methodological underpinnings, such models can be computationally challenging -- we present new scalable algorithms for learning these models. We also establish novel non-asymptotic error bounds for the proposed estimators. Experiments based on the US Census Planning Database demonstrate that our methods lead to high-quality predictive models that permit actionable interpretability for different segments of the population. Interestingly, our methods provide significant gains in interpretability without losing predictive performance relative to state-of-the-art black-box machine learning methods based on gradient boosting and feedforward neural networks. Our code implementation in Python is available at https://github.com/ShibalIbrahim/Additive-Models-with-Structured-Interactions.


Vector Transport Free Riemannian LBFGS for Optimization on Symmetric Positive Definite Matrix Manifolds

arXiv.org Machine Learning

This work concentrates on optimization on Riemannian manifolds. The Limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) algorithm is a commonly used quasi-Newton method for numerical optimization in Euclidean spaces. Riemannian LBFGS (RLBFGS) is an extension of this method to Riemannian manifolds. RLBFGS involves computationally expensive vector transports as well as unfolding recursions using adjoint vector transports. In this article, we propose two mappings in the tangent space using the inverse second root and Cholesky decomposition. These mappings make both vector transport and adjoint vector transport identity and therefore isometric. Identity vector transport makes RLBFGS less computationally expensive and its isometry is also very useful in convergence analysis of RLBFGS. Moreover, under the proposed mappings, the Riemannian metric reduces to Euclidean inner product, which is much less computationally expensive. We focus on the Symmetric Positive Definite (SPD) manifolds which are beneficial in various fields such as data science and statistics. This work opens a research opportunity for extension of the proposed mappings to other well-known manifolds.


New Q-Newton's method meets Backtracking line search: good convergence guarantee, saddle points avoidance, quadratic rate of convergence, and easy implementation

arXiv.org Machine Learning

In a recent joint work, the author has developed a modification of Newton's method, named New Q-Newton's method, which can avoid saddle points and has a quadratic rate of convergence. While a good theoretical convergence guarantee has not been established for this method, experiments on small-scale problems show that the method works very competitively against other well-known modifications of Newton's method such as Adaptive Cubic Regularization and BFGS, as well as first-order methods such as Unbounded Two-way Backtracking Gradient Descent. In this paper, we resolve the convergence guarantee issue by proposing a modification of New Q-Newton's method, named New Q-Newton's method Backtracking, which incorporates a more sophisticated use of hyperparameters and a Backtracking line search. This new method has very good theoretical guarantees, which for a Morse function yield the following (which is unknown for New Q-Newton's method): Theorem. Let $f:\mathbb{R}^m\rightarrow \mathbb{R}$ be a Morse function, that is, all its critical points have invertible Hessian. Then for a sequence $\{x_n\}$ constructed by New Q-Newton's method Backtracking from a random initial point $x_0$, we have the following two alternatives: i) $\lim_{n\rightarrow\infty}\|x_n\|=\infty$, or ii) $\{x_n\}$ converges to a point $x_{\infty}$ which is a local minimum of $f$, and the rate of convergence is quadratic. Moreover, if $f$ has compact sublevels, then only case ii) happens. As far as we know, for Morse functions, this is the best theoretical guarantee for iterative optimization algorithms so far in the literature. We have tested the method in small-scale experiments, with some further simplified versions of New Q-Newton's method Backtracking, and found that the new method significantly improves on New Q-Newton's method.
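To illustrate what a backtracking line search adds to a Newton-type iteration, the sketch below implements a textbook damped Newton method with an Armijo condition. It is explicitly not New Q-Newton's method Backtracking: the paper's Hessian modification and hyperparameter scheme are not reproduced, and the fallback to the negative gradient when the Newton direction is not a descent direction is a simplifying assumption. The function name `newton_backtracking`, the parameters `alpha` and `beta`, and the test function are all illustrative choices.

```python
import numpy as np

def newton_backtracking(f, grad, hess, x0, alpha=1e-4, beta=0.5,
                        tol=1e-10, max_iter=100):
    """Damped Newton iteration with Armijo backtracking (generic sketch)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        try:
            d = -np.linalg.solve(hess(x), g)   # Newton direction
        except np.linalg.LinAlgError:
            d = -g                             # singular Hessian: use gradient
        if g @ d >= 0:
            d = -g                             # not a descent direction
        t = 1.0
        # Shrink the step until the Armijo sufficient-decrease test holds.
        while t > 1e-12 and f(x + t * d) > f(x) + alpha * t * (g @ d):
            t *= beta
        x = x + t * d
    return x

# A Morse function with minima at (+1, 0) and (-1, 0) and a saddle at (0, 0).
f = lambda x: (x[0] ** 2 - 1.0) ** 2 + x[1] ** 2
grad = lambda x: np.array([4.0 * x[0] * (x[0] ** 2 - 1.0), 2.0 * x[1]])
hess = lambda x: np.array([[12.0 * x[0] ** 2 - 4.0, 0.0], [0.0, 2.0]])

x_star = newton_backtracking(f, grad, hess, x0=[2.0, 1.0])
print(np.round(x_star, 6))
```

Started from (2, 1), the iterates stay in the region where the Hessian is positive definite, so full Newton steps are accepted and the sequence approaches the local minimum at (1, 0).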