 Optimization


On the Algorithmics and Applications of a Mixed-norm based Kernel Learning Formulation

Neural Information Processing Systems

Motivated by real-world problems such as object categorization, we study a particular mixed-norm regularization for Multiple Kernel Learning (MKL). It is assumed that the given set of kernels is grouped into distinct components, where each component is crucial for the learning task at hand. The formulation hence employs $l_\infty$ regularization for promoting combinations at the component level and $l_1$ regularization for promoting sparsity among kernels in each component. While previous attempts have formulated this as a non-convex problem, the formulation given here is an instance of a non-smooth convex optimization problem which admits an efficient Mirror-Descent (MD) based procedure. The MD procedure optimizes over a product of simplexes, which is not a well-studied case in the literature. Results on real-world datasets show that the new MKL formulation is well-suited for object categorization tasks and that the MD-based algorithm outperforms state-of-the-art MKL solvers like \texttt{simpleMKL} in terms of computational effort.
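As a rough illustration of what mirror descent over a product of simplexes looks like, the sketch below applies the standard entropic (multiplicative) update independently to each simplex block. It is not the paper's MKL solver: the toy linear objective, the function names `md_step` and `minimize_linear`, and the step size are all assumptions made for this example.

```python
import math


def md_step(blocks, grads, step):
    """One entropic mirror-descent step on a product of simplexes.

    blocks: list of lists, each inner list a point on its own simplex.
    grads:  (sub)gradient of the objective, split into matching blocks.
    """
    new_blocks = []
    for x, g in zip(blocks, grads):
        # Shifting by min(g) leaves the normalized result unchanged
        # and keeps every exponent non-positive.
        shift = min(g)
        w = [xi * math.exp(-step * (gi - shift)) for xi, gi in zip(x, g)]
        total = sum(w)
        # Normalizing each block separately keeps it on its simplex.
        new_blocks.append([wi / total for wi in w])
    return new_blocks


def minimize_linear(costs, steps=200, step=0.5):
    """Toy objective (assumed): f(x) = sum_j <costs[j], x_j>."""
    # Start every block at the uniform distribution.
    blocks = [[1.0 / len(c)] * len(c) for c in costs]
    for _ in range(steps):
        # For a linear objective the gradient is the cost vector itself.
        blocks = md_step(blocks, costs, step)
    return blocks


if __name__ == "__main__":
    result = minimize_linear([[1.0, 0.0, 2.0], [0.5, 3.0]])
    for block in result:
        print([round(v, 4) for v in block])
```

On this toy problem each block concentrates its mass on the coordinate with the lowest cost, which is the minimizer of a linear function over a simplex.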


Graph-based Consensus Maximization among Multiple Supervised and Unsupervised Models

Neural Information Processing Systems

Little work has been done to directly combine the outputs of multiple supervised and unsupervised models. However, doing so can increase the accuracy and applicability of ensemble methods. First, we can boost the diversity of a classification ensemble by incorporating multiple clustering outputs, each of which provides grouping constraints for the joint label predictions of a set of related objects. Second, ensembles of supervised models are limited in applications that have access only to meta-level model outputs and not to the raw data. In this paper, we aim at calculating a consolidated classification solution for a set of objects by maximizing the consensus among both supervised predictions and unsupervised grouping constraints. We seek a globally optimal label assignment for the target objects, which is different from the result of traditional majority voting and model combination approaches. We cast the problem as an optimization problem on a bipartite graph, where the objective function favors smoothness in the conditional probability estimates over the graph and penalizes deviation from the initial labeling of the supervised models. We solve the problem through iterative propagation of conditional probability estimates among neighboring nodes, and interpret the method as conducting a constrained embedding in a transformed space, as well as a ranking on the graph. Experimental results on three real applications demonstrate the benefits of the proposed method over existing alternatives.


An LP View of the M-best MAP problem

Neural Information Processing Systems

This is often referred to as the MAP (maximum a posteriori) problem. Of particular interest is the case of MAP in graphical models, i.e., models where the prob--


A Data-Driven Approach to Modeling Choice

Neural Information Processing Systems

We visit the following fundamental problem: for a "generic" model of consumer choice (namely, distributions over preference lists) and a limited amount of data on how consumers actually make decisions (such as marginal preference information), how may one predict revenues from offering a particular assortment of choices? This problem is central to areas within operations research, marketing, and econometrics. We present a framework to answer such questions and design a number of algorithms for this purpose that are tractable from both a data and a computational standpoint.


Constructing Topological Maps using Markov Random Fields and Loop-Closure Detection

Neural Information Processing Systems

We present a system which constructs a topological map of an environment given a sequence of images. This system includes a novel image similarity score which uses dynamic programming to match images using both the appearance and relative positions of local features simultaneously. Additionally, an MRF is constructed to model the probability of loop-closures. A locally optimal labeling is found using Loopy-BP. Finally, we outline a method to generate a topological map from loop closure data. Results, presented on four urban sequences and one indoor sequence, outperform the state of the art.


RoxyBot-06: Stochastic Prediction and Optimization in TAC Travel

Journal of Artificial Intelligence Research

In this paper, we describe our autonomous bidding agent, RoxyBot, who emerged victorious in the travel division of the 2006 Trading Agent Competition in a photo finish. At a high level, the design of many successful trading agents can be summarized as follows: (i) price prediction: build a model of market prices; and (ii) optimization: solve for an approximately optimal set of bids, given this model. To predict, RoxyBot builds a stochastic model of market prices by simulating simultaneous ascending auctions. To optimize, RoxyBot relies on the sample average approximation method, a stochastic optimization technique.
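To make the sample average approximation idea concrete, here is a minimal sketch on a toy single-item problem: the expected profit of each candidate bid is replaced by its average over sampled price scenarios, and the bid with the best average is chosen. The first-price payoff rule, the function name `saa_best_bid`, and the numbers are assumptions for illustration only and do not reproduce RoxyBot's market model or bidding problem.

```python
def saa_best_bid(value, price_samples, candidate_bids):
    """Pick the bid that maximizes average profit over sampled prices.

    Assumed toy payoff: the bidder wins when bid >= sampled price and
    then pays its own bid, earning (value - bid); otherwise it earns 0.
    """
    def average_profit(bid):
        wins = sum(1 for price in price_samples if bid >= price)
        return (value - bid) * wins / len(price_samples)

    # The sample average stands in for the true expected profit.
    return max(candidate_bids, key=average_profit)


if __name__ == "__main__":
    scenarios = [2, 4, 6, 8]  # hypothetical sampled market prices
    print(saa_best_bid(10, scenarios, [1, 3, 5, 7, 9]))
```

In practice the scenarios would be drawn from a stochastic price model rather than listed by hand, and more scenarios give a closer approximation of the expectation.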


Condition Number Analysis of Kernel-based Density Ratio Estimation

arXiv.org Machine Learning

The ratio of two probability densities can be used for solving various machine learning tasks such as covariate shift adaptation (importance sampling), outlier detection (likelihood-ratio test), and feature selection (mutual information). Recently, several methods of directly estimating the density ratio have been developed, e.g., kernel mean matching, maximum likelihood density ratio estimation, and least-squares density ratio fitting. In this paper, we consider a kernelized variant of the least-squares method and investigate its theoretical properties from the viewpoint of the condition number using smoothed analysis techniques. The condition number of the Hessian matrix determines the convergence rate of optimization and the numerical stability. We show that the kernel least-squares method has a smaller condition number than a version of kernel mean matching and other M-estimators, implying that the kernel least-squares method has preferable numerical properties. We further give an alternative formulation of the kernel least-squares estimator which is shown to possess an even smaller condition number. Numerical studies agree with our theoretical analysis.
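The link between the Hessian's condition number and convergence speed can be seen on a toy quadratic. The sketch below runs plain gradient descent on $f(x) = \frac{1}{2}\sum_i \lambda_i x_i^2$, whose Hessian is diagonal with eigenvalues $\lambda_i$, and counts iterations until convergence. The function name `gd_iterations`, the tolerance, and the step size $1/\lambda_{\max}$ are assumptions for this example; it does not involve any density-ratio estimator.

```python
def gd_iterations(eigs, tol=1e-6, max_iter=100000):
    """Count gradient-descent steps on f(x) = 0.5 * sum(e_i * x_i**2).

    eigs: positive eigenvalues of the (diagonal) Hessian.
    The condition number is max(eigs) / min(eigs).
    """
    step = 1.0 / max(eigs)  # standard step size 1/L
    x = [1.0] * len(eigs)   # fixed starting point
    for k in range(max_iter):
        if max(abs(v) for v in x) < tol:
            return k
        # The gradient of f is e_i * x_i in each coordinate.
        x = [v - step * e * v for v, e in zip(x, eigs)]
    return max_iter


if __name__ == "__main__":
    print("condition number 2:  ", gd_iterations([1.0, 2.0]))
    print("condition number 100:", gd_iterations([1.0, 100.0]))
```

The well-conditioned problem converges in a few dozen steps, while the ill-conditioned one needs far more, because the slowest coordinate shrinks by a factor of $1 - 1/\kappa$ per step, where $\kappa$ is the condition number.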


A Decision-Optimization Approach to Quantum Mechanics and Game Theory

arXiv.org Artificial Intelligence

The fundamental laws of the quantum world upset the logical foundation of classical physics. They are completely counter-intuitive and exhibit many bizarre behaviors. However, this paper shows that they may make sense from the perspective of a general decision-optimization principle for cooperation. This principle also offers a generalization of Nash equilibrium, a key concept in game theory, for better payoffs and stability of game playing.


Overcoming Hierarchical Difficulty by Hill-Climbing the Building Block Structure

arXiv.org Artificial Intelligence

The Building Block Hypothesis suggests that Genetic Algorithms (GAs) are well-suited for hierarchical problems, where efficient solving requires proper problem decomposition and assembly of the solution from sub-solutions with strong non-linear interdependencies. The paper proposes a hill-climber operating over the building block (BB) space that can efficiently address hierarchical problems. The new Building Block Hill-Climber (BBHC) uses past hill-climb experience to extract BB information and adapts its neighborhood structure accordingly. The perpetual adaptation of the neighborhood structure allows the method to climb the hierarchical structure, solving the hierarchical levels successively. It is expected that for fully non-deceptive hierarchical BB structures the BBHC can solve hierarchical problems in linearithmic time. Empirical results confirm that the proposed method scales almost linearly with the problem size and thus clearly outperforms population-based recombinative methods.


Minimum Cost Homomorphisms to Proper Interval Graphs and Bigraphs

arXiv.org Artificial Intelligence

For graphs $G$ and $H$, a mapping $f: V(G)\to V(H)$ is a homomorphism of $G$ to $H$ if $uv\in E(G)$ implies $f(u)f(v)\in E(H)$. If, moreover, each vertex $u \in V(G)$ is associated with costs $c_i(u)$, $i \in V(H)$, then the cost of the homomorphism $f$ is $\sum_{u\in V(G)}c_{f(u)}(u)$. For each fixed graph $H$, we have the minimum cost homomorphism problem, written as MinHOM($H$). The problem is to decide, for an input graph $G$ with costs $c_i(u)$, $u \in V(G)$, $i\in V(H)$, whether there exists a homomorphism of $G$ to $H$ and, if one exists, to find one of minimum cost. Minimum cost homomorphism problems encompass (or are related to) many well-studied optimization problems. We describe a dichotomy of the minimum cost homomorphism problems for graphs $H$, with loops allowed. When each connected component of $H$ is either a reflexive proper interval graph or an irreflexive proper interval bigraph, the problem MinHOM($H$) is polynomial-time solvable. In all other cases the problem MinHOM($H$) is NP-hard. This solves an open problem from an earlier paper. Along the way, we prove a new characterization of the class of proper interval bigraphs.
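The definitions above can be checked on small instances with an exhaustive search. The sketch below enumerates every mapping $f: V(G)\to V(H)$, keeps those that send each edge of $G$ to an edge of $H$, and returns one of minimum cost. It is a brute-force illustration of the problem statement only, exponential in $|V(G)|$, and is unrelated to the polynomial-time algorithms of the paper; the function name and data layout are assumptions for this example.

```python
from itertools import product


def min_cost_homomorphism(g_vertices, g_edges, h_vertices, h_edges, cost):
    """Brute-force MinHOM: return (total_cost, mapping) or None.

    cost[u][i] is the cost c_i(u) of mapping vertex u of G to vertex i of H.
    Edges are undirected; a loop in H is given as (i, i).
    """
    # Store both orientations so edge lookups are symmetric.
    h_adj = set(h_edges) | {(b, a) for a, b in h_edges}
    best = None
    for images in product(h_vertices, repeat=len(g_vertices)):
        f = dict(zip(g_vertices, images))
        # f is a homomorphism if every edge of G lands on an edge of H.
        if all((f[u], f[v]) in h_adj for u, v in g_edges):
            total = sum(cost[u][f[u]] for u in g_vertices)
            if best is None or total < best[0]:
                best = (total, f)
    return best


if __name__ == "__main__":
    # G is the path a-b-c; H is a single edge 0-1 without loops.
    costs = {"a": {0: 1, 1: 5}, "b": {0: 2, 1: 1}, "c": {0: 4, 1: 1}}
    print(min_cost_homomorphism(
        ["a", "b", "c"], [("a", "b"), ("b", "c")], [0, 1], [(0, 1)], costs))
```

With $H$ a single irreflexive edge, a homomorphism exists exactly when $G$ is bipartite, so a triangle as input yields `None`.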