Structured Prediction Cascades

arXiv.org Machine Learning

Structured prediction tasks pose a fundamental trade-off between the need for model complexity to increase predictive power and the limited computational resources for inference in the exponentially-sized output spaces such models require. We formulate and develop the Structured Prediction Cascade architecture: a sequence of increasingly complex models that progressively filter the space of possible outputs. The key principle of our approach is that each model in the cascade is optimized to accurately filter and refine the structured output state space of the next model, speeding up both learning and inference in the next layer of the cascade. We learn cascades by optimizing a novel convex loss function that controls the trade-off between the filtering efficiency and the accuracy of the cascade, and provide generalization bounds for both accuracy and efficiency. We also extend our approach to intractable models using tree-decomposition ensembles, and provide algorithms and theory for this setting. We evaluate our approach on several large-scale problems, achieving state-of-the-art performance in handwriting recognition and human pose recognition. We find that structured prediction cascades allow tremendous speedups and the use of previously intractable features and models in both settings.


Payment Rules through Discriminant-Based Classifiers

arXiv.org Artificial Intelligence

In mechanism design it is typical to impose incentive compatibility and then derive an optimal mechanism subject to this constraint. By replacing the incentive compatibility requirement with the goal of minimizing expected ex post regret, we are able to adapt statistical machine learning techniques to the design of payment rules. This computational approach to mechanism design is applicable to domains with multi-dimensional types and situations where computational efficiency is a concern. Specifically, given an outcome rule and access to a type distribution, we train a support vector machine with a special discriminant function structure such that it implicitly establishes a payment rule with desirable incentive properties. We discuss applications to a multi-minded combinatorial auction with a greedy winner-determination algorithm and to an assignment problem with an egalitarian outcome rule. Experimental results demonstrate both that the construction produces payment rules with low ex post regret, and that penalizing classification errors is effective in preventing failures of ex post individual rationality.


Credal nets under epistemic irrelevance

arXiv.org Artificial Intelligence

We present a new approach to credal nets, which are graphical models that generalise Bayesian nets to imprecise probability. Instead of applying the commonly used notion of strong independence, we replace it by the weaker notion of epistemic irrelevance. We show how assessments of epistemic irrelevance allow us to construct a global model out of given local uncertainty models and mention some useful properties. The main results and proofs are presented using the language of sets of desirable gambles, which provides a very general and expressive way of representing imprecise probability models.


System identification and modeling for interacting and non-interacting tank systems using intelligent techniques

arXiv.org Artificial Intelligence

System identification from experimental data plays a vital role in model-based controller design. Deriving a process model from first principles is often difficult because of the complexity of the process. The first stage in the development of any control and monitoring system is the identification and modeling of the system. Each model is developed within the context of a specific control problem, so a general system identification framework is warranted. Such a framework should be able to adapt and to emphasize different properties depending on the control objective and the nature of the system's behavior. System identification has therefore been a valuable tool for obtaining a model of the system from input and output data for controller design. The present work is concerned with the identification of transfer function models using statistical model identification, the process reaction curve method, the ARX model, and a genetic algorithm, and with modeling using neural networks and fuzzy logic, for interacting and non-interacting tank processes. The identification and modeling techniques used are prone to parameter changes and disturbances. The proposed methods are used to identify the mathematical model and the intelligent model of the interacting and non-interacting processes from real-time experimental data.


Toward an Integrated Framework for Automated Development and Optimization of Online Advertising Campaigns

arXiv.org Artificial Intelligence

Creating and monitoring competitive and cost-effective pay-per-click advertisement campaigns through the web-search channel is a resource-demanding task in terms of expertise and effort. Assisting or even automating the work of an advertising specialist will have unrivaled commercial value. In this paper we propose a methodology, an architecture, and a fully functional framework for semi- and fully automated creation, monitoring, and optimization of cost-efficient pay-per-click campaigns with budget constraints. The campaign creation module automatically generates keywords based on the content of the web page to be advertised, extended with corresponding ad texts. These keywords are used to automatically create campaigns fully equipped with the appropriate values. The campaigns are uploaded to the auctioneer platform and start running. The optimization module learns from existing campaign statistics and from the strategies applied in previous periods in order to invest optimally in the next period. The objective is to maximize performance (i.e. clicks, actions) under the current budget constraint. The fully functional prototype is experimentally evaluated on real-world Google AdWords campaigns and shows promising behavior with regard to campaign performance statistics, as it systematically outperforms the competing manually maintained campaigns.


Cross-conformal predictors

arXiv.org Machine Learning

The method of conformal prediction produces set predictions that are automatically valid in the sense that their unconditional coverage probability is equal to or exceeds a preset confidence level ([14], Chapter 2). A more computationally efficient method of this kind is that of inductive conformal prediction ([12], [14], Section 4.1, [1]). However, inductive conformal predictors are typically less predictively efficient, in the sense of producing larger prediction sets as compared with conformal predictors. Motivated by the method of cross-validation [11, 13], this note explores a hybrid method, which we call cross-conformal prediction. We are mainly interested in the problems of classification and regression, in which we are given a training set consisting of examples, each example consisting of an object and a label, and asked to predict the label of a new test object; in the problem of classification labels are elements of a given finite set, and in the problem of regression labels are real numbers. If we are asked to predict labels for more than one test object, the same prediction procedure can be applied to each test object separately. In this introductory section and in our empirical studies we consider the problem of binary classification, in which labels can take only two values, which we will encode as 0 and 1. We always assume that the examples (both the training examples and the test examples, consisting of given objects and unknown labels) are generated independently from the same probability measure; this assumption will be called the assumption of randomness.
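The inductive conformal predictor mentioned above can be illustrated with a minimal sketch for binary classification. Everything specific here is an assumption made for illustration and is not taken from the note: the objects are single real numbers, the underlying algorithm is a toy nearest-centroid rule, the nonconformity score is the distance to the centroid of the postulated label, and the names `fit_centroids`, `nonconformity`, and `icp_predict` are hypothetical.

```python
def fit_centroids(proper_train):
    """Toy underlying algorithm (an assumption): mean object value per label."""
    sums, counts = {}, {}
    for x, y in proper_train:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def nonconformity(centroids, x, y):
    """Assumed nonconformity score: distance to the centroid of label y."""
    return abs(x - centroids[y])


def icp_predict(proper_train, calibration, x_new, epsilon, labels=(0, 1)):
    """Return the set of labels whose p-value exceeds the significance level."""
    # Fit once on the proper training set, then score the calibration set.
    centroids = fit_centroids(proper_train)
    cal_scores = [nonconformity(centroids, x, y) for x, y in calibration]

    prediction_set = []
    for y in labels:
        # Score the test object under the postulated label y.
        a_new = nonconformity(centroids, x_new, y)
        # p-value: share of scores at least as large, counting the test one.
        at_least = sum(1 for a in cal_scores if a >= a_new)
        p_value = (at_least + 1) / (len(cal_scores) + 1)
        if p_value > epsilon:
            prediction_set.append(y)
    return prediction_set


proper_train = [(0.0, 0), (1.0, 0), (9.0, 1), (10.0, 1)]
calibration = [(0.2, 0), (0.8, 0), (1.5, 0), (8.5, 1), (9.2, 1),
               (9.9, 1), (0.4, 0), (10.3, 1), (9.6, 1)]

near_zero = icp_predict(proper_train, calibration, x_new=1.0, epsilon=0.2)
ambiguous_strict = icp_predict(proper_train, calibration, x_new=5.0, epsilon=0.2)
ambiguous_loose = icp_predict(proper_train, calibration, x_new=5.0, epsilon=0.05)
print(near_zero, ambiguous_strict, ambiguous_loose)
```

The training data is split once into a proper training set and a calibration set, which is what makes the inductive variant cheap: the underlying algorithm is fitted a single time. A smaller `epsilon` (higher confidence level) can only enlarge the prediction set, as the last two calls show for an object lying between the two classes.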


Fast and Accurate Algorithms for Re-Weighted L1-Norm Minimization

arXiv.org Machine Learning

To recover a sparse signal from an underdetermined system, we often solve a constrained L1-norm minimization problem. In many cases, the signal sparsity and the recovery performance can be further improved by replacing the L1 norm with a "weighted" L1 norm. Without any prior information about nonzero elements of the signal, the procedure for selecting weights is iterative in nature. Common approaches update the weights at every iteration using the solution of a weighted L1 problem from the previous iteration. In this paper, we present two homotopy-based algorithms that efficiently solve reweighted L1 problems. First, we present an algorithm that quickly updates the solution of a weighted L1 problem as the weights change. Since the solution changes only slightly with small changes in the weights, we develop a homotopy algorithm that replaces the old weights with the new ones in a small number of computationally inexpensive steps. Second, we propose an algorithm that solves a weighted L1 problem by adaptively selecting the weights while estimating the signal. This algorithm integrates the reweighting into every step along the homotopy path by changing the weights according to the changes in the solution and its support, allowing us to achieve a high quality signal reconstruction by solving a single homotopy problem. We compare the performance of both algorithms, in terms of reconstruction accuracy and computational complexity, against state-of-the-art solvers and show that our methods have smaller computational cost. In addition, we will show that the adaptive selection of the weights inside the homotopy often yields reconstructions of higher quality.
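The iterative reweighting loop described above (solve a weighted L1 problem, then update the weights from that solution) can be sketched as follows. This is not the homotopy algorithm of the paper. It is a deliberately simplified setting, assumed here for illustration, in which the measurement matrix is the identity, so each weighted L1 subproblem has a closed-form soft-thresholding solution. The weight update 1 / (|x_i| + eps), the parameter values, and the function names are likewise assumptions.

```python
def soft_threshold(value, threshold):
    """Closed-form minimizer of 0.5 * (x - value)**2 + threshold * abs(x)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def reweighted_l1_denoise(y, lam=0.5, eps=0.1, iterations=5):
    """Reweighted L1 with an identity measurement matrix (an assumption)."""
    weights = [1.0] * len(y)  # the first pass is a plain L1 problem
    x = list(y)
    for _ in range(iterations):
        # Solve the weighted L1 subproblem coordinate by coordinate.
        x = [soft_threshold(yi, lam * wi) for yi, wi in zip(y, weights)]
        # Large entries get small weights, small entries get large weights.
        weights = [1.0 / (abs(xi) + eps) for xi in x]
    return x


observed = [3.0, 0.2, -2.5, 0.1, 0.0]
estimate = reweighted_l1_denoise(observed)
print(estimate)
```

In this toy case the small entries are set to zero in the first pass and stay there, while the large entries are shrunk less on later passes than under unweighted L1. Each pass here recomputes the subproblem from scratch; the abstract's first algorithm instead updates the previous solution as the weights change.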


Evolutionary Inference for Function-valued Traits: Gaussian Process Regression on Phylogenies

arXiv.org Machine Learning

In this paper we consider statistical inference for function-valued data which are correlated due to phylogenetic relationships. A schematic example is given in Figure 1A: in this case, given functional data observed at the tips of a phylogeny, the task is to perform inference on the (unobserved) functional data at the root of the phylogeny. Alternatively, if the phylogeny is uncertain we may wish to perform phylogenetic inference, or our interest may be inferring the dynamics of the evolutionary process which produced the data. The term 'function-valued' is meant in the sense of [1], where a datum is a continuous function f(x) of a variable x, such as time or temperature: examples are therefore curves for ambient temperature versus growth rate for caterpillars, a heart rhythm time series [2], or a spectrogram of audio data. Our approach is to combine the theory of Gaussian processes with assumptions from phylogenetics, to obtain a flexible nonparametric model for such data.


Ancestral Inference from Functional Data: Statistical Methods and Numerical Examples

arXiv.org Machine Learning

Many biological characteristics of evolutionary interest are not scalar variables but continuous functions. Here we use phylogenetic Gaussian process regression to model the evolution of simulated function-valued traits. Given function-valued data only from the tips of an evolutionary tree and utilising independent principal component analysis (IPCA) as a method for dimension reduction, we construct distributional estimates of ancestral function-valued traits, and estimate parameters describing their evolutionary dynamics.


Multidimensional Membership Mixture Models

arXiv.org Machine Learning

We present the multidimensional membership mixture (M3) models where every dimension of the membership represents an independent mixture model and each data point is generated from the selected mixture components jointly. This is helpful when the data has a certain shared structure. For example, three unique means and three unique variances can effectively form a Gaussian mixture model with nine components, while requiring only six parameters to fully describe it. In this paper, we present three instantiations of M3 models (together with the learning and inference algorithms): infinite, finite, and hybrid, depending on whether the number of mixtures is fixed or not. They are built upon Dirichlet process mixture models, latent Dirichlet allocation, and a combination respectively. We then consider two applications: topic modeling and learning 3D object arrangements. Our experiments show that our M3 models achieve better performance using fewer topics than many classic topic models. We also observe that topics from the different dimensions of M3 models are meaningful and orthogonal to each other.
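The parameter-sharing arithmetic in the abstract (three means and three variances giving nine Gaussian components from six parameters) can be checked with a short sketch. The numeric values are made up for illustration, and mixing weights are ignored in the count, as in the abstract's example.

```python
from itertools import product

# Hypothetical shared parameters: one membership dimension selects a mean,
# the other selects a variance.
means = [-5.0, 0.0, 5.0]
variances = [0.5, 1.0, 2.0]

# Each (mean, variance) pair defines one Gaussian component.
components = list(product(means, variances))

num_components = len(components)
num_parameters = len(means) + len(variances)
print(num_components, num_parameters)  # 9 6
```

A conventional Gaussian mixture with nine unrelated components would need nine means and nine variances, so the shared structure cuts the count from eighteen to six in this example.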