Online Structured Prediction via Coactive Learning

arXiv.org Artificial Intelligence

We propose Coactive Learning as a model of interaction between a learning system and a human user, where both have the common goal of providing results of maximum utility to the user. At each step, the system (e.g. search engine) receives a context (e.g. query) and predicts an object (e.g. ranking). The user responds by correcting the system if necessary, providing a slightly improved -- but not necessarily optimal -- object as feedback. We argue that such feedback can often be inferred from observable user behavior, for example, from clicks in web-search. Evaluating predictions by their cardinal utility to the user, we propose efficient learning algorithms that have ${\cal O}(\frac{1}{\sqrt{T}})$ average regret, even though the learning algorithm never observes cardinal utility values as in conventional online learning. We demonstrate the applicability of our model and learning algorithms on a movie recommendation task, as well as ranking for web-search.
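
The interaction loop described above can be illustrated with a perceptron-style update. This is a minimal sketch, not necessarily the authors' algorithm: the abstract does not spell out the update rule, so the linear utility model, the feature map `phi`, the one-hot toy items, and the simulated user in `run` are all assumptions made for illustration. The learner only ever sees an improved object, never a utility value.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def predict(w, candidates, phi, context):
    # Present the candidate with the highest estimated utility w . phi(x, y).
    return max(candidates, key=lambda y: dot(w, phi(context, y)))


def coactive_update(w, phi, context, predicted, feedback):
    # Move the weights toward the user's improved object and away from
    # the object the system presented (assumed perceptron-style rule).
    f_fb = phi(context, feedback)
    f_pr = phi(context, predicted)
    return [wi + a - b for wi, a, b in zip(w, f_fb, f_pr)]


def run(rounds=5):
    # Hypothetical toy setup: three items with hidden true utilities.
    true_w = [0.1, 0.9, 0.5]
    items = [0, 1, 2]
    phi = lambda context, y: [1.0 if i == y else 0.0 for i in items]
    w = [0.0, 0.0, 0.0]
    for _ in range(rounds):
        y = predict(w, items, phi, None)
        # Simulated user: returns a slightly better item, not the optimum.
        better = [c for c in items if true_w[c] > true_w[y]]
        fb = min(better, key=lambda c: true_w[c]) if better else y
        w = coactive_update(w, phi, None, y, fb)
    return w, predict(w, items, phi, None)
```

In this toy run the system presents item 0, then item 2, and settles on item 1, the item with the highest hidden utility, even though the feedback was only ever "slightly improved".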


Directed Time Series Regression for Control

arXiv.org Machine Learning

We propose directed time series regression, a new approach to estimating parameters of time-series models for use in certainty equivalent model predictive control. The approach combines the merits of least squares regression and empirical optimization. Through a computational study involving a stochastic version of the well-known inverted pendulum balancing problem, we demonstrate that directed time series regression can generate significant improvements in controller performance over either of the aforementioned alternatives.


Reading Dependencies from Covariance Graphs

arXiv.org Artificial Intelligence

The covariance graph (aka bi-directed graph) of a probability distribution $p$ is the undirected graph $G$ where two nodes are adjacent iff their corresponding random variables are marginally dependent in $p$. In this paper, we present a graphical criterion for reading dependencies from $G$, under the assumption that $p$ satisfies the graphoid properties as well as weak transitivity and composition. We prove that the graphical criterion is sound and complete in a certain sense. We argue that our assumptions are not too restrictive. For instance, all the regular Gaussian probability distributions satisfy them.
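
For the Gaussian case mentioned above, marginal dependence between two variables is equivalent to a nonzero covariance, so the covariance graph can be read directly off the covariance matrix. The sketch below shows only this construction of $G$, not the paper's graphical criterion; the function name and the tolerance `tol` are assumptions for illustration.

```python
def covariance_graph(cov, tol=1e-12):
    """Return the edge set of the covariance graph of a Gaussian.

    cov is a symmetric covariance matrix given as a list of lists.
    Nodes i and j are adjacent iff their covariance is nonzero, which
    for Gaussians means the two variables are marginally dependent.
    """
    n = len(cov)
    return {(i, j)
            for i in range(n)
            for j in range(i + 1, n)
            if abs(cov[i][j]) > tol}


# Variables 0 and 2 are marginally independent, so they are not adjacent.
example_cov = [[1.0, 0.5, 0.0],
               [0.5, 1.0, 0.3],
               [0.0, 0.3, 1.0]]
example_edges = covariance_graph(example_cov)
```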


Predictive Approaches For Gaussian Process Classifier Model Selection

arXiv.org Machine Learning

In this paper we consider the problem of Gaussian process classifier (GPC) model selection with different Leave-One-Out (LOO) Cross Validation (CV) based optimization criteria and provide a practical algorithm using LOO predictive distributions with such criteria to select hyperparameters. Apart from the standard average negative logarithm of predictive probability (NLP), we also consider smoothed versions of criteria such as F-measure and Weighted Error Rate (WER), which are useful for handling imbalanced data. Unlike the regression case, LOO predictive distributions for the classifier case are intractable. We use approximate LOO predictive distributions derived from the Expectation Propagation (EP) approximation. We conduct experiments on several real-world benchmark datasets. When the NLP criterion is used for optimizing the hyperparameters, the predictive approaches show better or comparable NLP generalization performance compared with existing GPC approaches. On the other hand, when the F-measure criterion is used, the F-measure generalization performance improves significantly on several datasets. Overall, the EP-based predictive algorithm comes out as an excellent choice for GP classifier model selection with different optimization criteria.
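
For context on the contrast with regression, the standard closed-form LOO predictive distribution for GP regression and the NLP criterion can be written as below. The notation is an assumption, since the abstract defines no symbols: $K_y$ is the training covariance matrix including the noise term, $y$ the target vector, $\mathcal{D}_{-i}$ the training set without example $i$, and $\theta$ the hyperparameters. In the classifier case no such closed form exists, which is why the abstract turns to EP-based approximations.

```latex
% LOO predictive mean and variance for GP regression (closed form)
\mu_i = y_i - \frac{[K_y^{-1} y]_i}{[K_y^{-1}]_{ii}}, \qquad
\sigma_i^2 = \frac{1}{[K_y^{-1}]_{ii}}

% Average negative logarithm of predictive probability (NLP) over n examples
\mathrm{NLP}(\theta) = -\frac{1}{n} \sum_{i=1}^{n}
    \log p\left(y_i \mid x_i, \mathcal{D}_{-i}, \theta\right)
```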


An Additive Model View to Sparse Gaussian Process Classifier Design

arXiv.org Machine Learning

We consider the problem of designing a sparse Gaussian process classifier (SGPC) that generalizes well. Viewing SGPC design as constructing an additive model, as in boosting, we present an efficient and effective SGPC design method that performs a stage-wise optimization of a predictive loss function. We introduce new methods for two key components of any SGPC design, viz., site parameter estimation and basis vector selection. The proposed adaptive sampling based basis vector selection method aids in achieving improved generalization performance at a reduced computational cost. This method can also be used in conjunction with any other site parameter estimation method. It has computational and storage complexities similar to those of the well-known information vector machine and is suitable for large datasets. The hyperparameters can be determined by optimizing a predictive loss function. The experimental results show better generalization performance of the proposed basis vector selection method on several benchmark datasets, particularly for relatively smaller basis vector set sizes or on difficult datasets.


Transductive Classification Methods for Mixed Graphs

arXiv.org Machine Learning

In this paper we provide a principled approach to solve a transductive classification problem involving a similar graph (edges tend to connect nodes with the same labels) and a dissimilar graph (edges tend to connect nodes with opposing labels). Most of the existing methods, e.g., Information Regularization (IR), Weighted vote Relational Neighbor classifier (WvRN), etc., assume that the given graph is only a similar graph. We extend the IR and WvRN methods to deal with mixed graphs. We evaluate the proposed extensions on several benchmark datasets as well as two real-world datasets and demonstrate the usefulness of our ideas. Categories and Subject Descriptors: I.5 [Pattern Recognition]: Design Methodology - Classifier design and evaluation. General Terms: Algorithms, Experimentation. Keywords: Classification, Graph based semi-supervised learning, Transductive learning, Mixed graphs.
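
To make the mixed-graph setting concrete, here is a rough sketch of a relational-neighbor vote in which similar edges pull a node toward its neighbor's label and dissimilar edges push it toward the opposite label. The abstract does not give the authors' extension of WvRN, so the signed-weight rule, the binary +1/-1 labels, and the function name are all assumptions made for illustration.

```python
def signed_relational_neighbor(labels, similar, dissimilar, n_iters=50):
    """Propagate +1/-1 labels over a mixed graph (assumed signed-vote rule).

    labels:     dict node -> +1 or -1 for the labeled nodes
    similar:    dict (u, v) -> weight, edges linking same-label nodes
    dissimilar: dict (u, v) -> weight, edges linking opposite-label nodes
    Returns a dict node -> score in [-1, 1]; the sign is the predicted label.
    """
    neighbors = {}
    for edges, sign in ((similar, 1.0), (dissimilar, -1.0)):
        for (u, v), w in edges.items():
            neighbors.setdefault(u, []).append((v, sign * w))
            neighbors.setdefault(v, []).append((u, sign * w))

    scores = {node: 0.0 for node in neighbors}
    scores.update({node: float(y) for node, y in labels.items()})

    for _ in range(n_iters):
        for node, nbrs in neighbors.items():
            if node in labels:
                continue  # labeled nodes stay clamped
            total = sum(abs(sw) for _, sw in nbrs)
            if total > 0:
                # Weighted vote; dissimilar neighbors vote with flipped sign.
                scores[node] = sum(sw * scores[v] for v, sw in nbrs) / total
    return scores


example_scores = signed_relational_neighbor(
    labels={"a": 1},
    similar={("a", "b"): 1.0},
    dissimilar={("b", "c"): 1.0},
)
```

In the example, node `b` inherits the positive label of `a` through a similar edge, and node `c` receives the opposite label through its dissimilar edge to `b`.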


Towards a Mathematical Foundation of Immunology and Amino Acid Chains

arXiv.org Machine Learning

We attempt to set a mathematical foundation of immunology and amino acid chains. To measure the similarities of these chains, a kernel on strings is defined using only the sequence of the chains and a good amino acid substitution matrix (e.g. BLOSUM62). The kernel is used in learning machines to predict binding affinities of peptides to human leukocyte antigens DR (HLA-DR) molecules. On both fixed allele (Nielsen and Lund 2009) and pan-allele (Nielsen et al. 2010) benchmark databases, our algorithm achieves state-of-the-art performance. The kernel is also used to define a distance on an HLA-DR allele set, based on which a clustering analysis precisely recovers the serotype classifications assigned by WHO (Nielsen and Lund 2009, and Marsh et al. 2010). These results suggest that our kernel relates the chain structure of both peptides and HLA-DR molecules well to their biological functions, and that it offers a simple, powerful and promising methodology for immunology and amino acid chain studies.


Analysis of a Nature Inspired Firefly Algorithm based Back-propagation Neural Network Training

arXiv.org Artificial Intelligence

Optimization algorithms are normally influenced by meta-heuristic approaches. In recent years, several hybrid optimization methods have been developed to find better solutions. In the proposed work, a nature-inspired meta-heuristic algorithm is combined with the back-propagation method to train a feed-forward neural network. The firefly algorithm is a nature-inspired meta-heuristic algorithm, and it is incorporated into the back-propagation algorithm to achieve a fast and improved convergence rate in training feed-forward neural networks. The proposed technique is tested on some standard data sets. It is found that the proposed method converges within very few iterations. This performance is also analyzed and compared with genetic algorithm based back-propagation. It is observed that the proposed method takes less time to converge and provides an improved convergence rate with a minimal feed-forward neural network design.
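
For readers unfamiliar with the firefly algorithm, the sketch below shows the commonly used firefly move: each firefly drifts toward every brighter (lower-loss) firefly with an attractiveness that decays with distance, plus a small random step. It is not the paper's hybrid with back-propagation, which the abstract does not specify. The toy quadratic loss stands in for a network's training error, and all parameter names and default values are assumptions.

```python
import math
import random


def firefly_minimize(loss, dim, n_fireflies=15, n_iters=50,
                     beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Minimize loss over R^dim with a basic firefly search.

    Returns (best_x, best_cost, initial_best_cost).
    """
    rng = random.Random(seed)
    swarm = [[rng.uniform(-2.0, 2.0) for _ in range(dim)]
             for _ in range(n_fireflies)]
    costs = [loss(x) for x in swarm]
    best_cost = min(costs)
    best_x = list(swarm[costs.index(best_cost)])
    initial_best = best_cost

    for _ in range(n_iters):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if costs[j] < costs[i]:  # firefly j is brighter than i
                    r2 = sum((a - b) ** 2
                             for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # attractiveness
                    swarm[i] = [a + beta * (b - a)
                                + alpha * (rng.random() - 0.5)
                                for a, b in zip(swarm[i], swarm[j])]
                    costs[i] = loss(swarm[i])
                    if costs[i] < best_cost:
                        best_cost, best_x = costs[i], list(swarm[i])
        alpha *= 0.97  # gradually shrink the random step
    return best_x, best_cost, initial_best


def sphere(v):
    # Toy stand-in for a training loss, minimized at the origin.
    return sum(t * t for t in v)
```

In a hybrid scheme, the vector being optimized would hold the network weights and `loss` would be the training error; how the paper interleaves this search with gradient steps is not stated in the abstract.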


Keyphrase Based Arabic Summarizer (KPAS)

arXiv.org Artificial Intelligence

This paper describes a computationally inexpensive and efficient generic summarization algorithm for Arabic texts. The algorithm belongs to the extractive summarization family, which reduces the problem to the sub-problems of identifying and extracting representative sentences. Important keyphrases of the document to be summarized are identified using combinations of statistical and linguistic features. The sentence extraction algorithm exploits keyphrases as the primary attributes to rank a sentence. The present experimental work demonstrates different techniques for achieving various summarization goals, including informative richness, coverage of both main and auxiliary topics, and keeping redundancy to a minimum. A scoring scheme is then adopted that balances these summarization goals. To evaluate the resulting Arabic summaries against well-established systems, aligned English/Arabic texts are used throughout the experiments.


Hidden Markov Models with mixtures as emission distributions

arXiv.org Machine Learning

In unsupervised classification, Hidden Markov Models (HMM) are used to account for a neighborhood structure between observations. The emission distributions are often assumed to belong to some parametric family. In this paper, a semiparametric model in which the emission distributions are mixtures of parametric distributions is proposed to gain flexibility. We show that the classical EM algorithm can be adapted to infer the model parameters. For the initialisation step, starting from a large number of components, a hierarchical method to combine them into the hidden states is proposed. Three likelihood-based criteria to select the components to be combined are discussed. To estimate the number of hidden states, BIC-like criteria are derived. A simulation study is carried out both to determine the best combination of merging criteria and model selection criteria and to evaluate the accuracy of classification. The proposed method is also illustrated using a biological dataset from the model plant Arabidopsis thaliana. An R package, HMMmix, is freely available on CRAN.
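
The modeling idea can be sketched as an HMM whose per-state emission density is itself a mixture. The code below evaluates the log-likelihood of a sequence with the scaled forward recursion, using Gaussian mixture components as an assumed parametric family. It is written in Python for illustration and does not reproduce the HMMmix package or the paper's EM and merging procedures; all function and parameter names are hypothetical.

```python
import math


def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, components):
    # components: list of (weight, mean, std); weights sum to 1.
    return sum(w * normal_pdf(x, m, s) for w, m, s in components)


def forward_log_likelihood(obs, init, trans, emissions):
    """Log-likelihood of obs under an HMM with mixture emissions.

    init[k]:      initial probability of hidden state k
    trans[j][k]:  transition probability from state j to state k
    emissions[k]: mixture components of the emission density of state k
    """
    n_states = len(init)
    alpha = [init[k] * mixture_pdf(obs[0], emissions[k])
             for k in range(n_states)]
    scale = sum(alpha)
    log_lik = math.log(scale)
    alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    for x in obs[1:]:
        alpha = [mixture_pdf(x, emissions[k])
                 * sum(alpha[j] * trans[j][k] for j in range(n_states))
                 for k in range(n_states)]
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik
```

With a single state and a single component the recursion reduces to a sum of Gaussian log-densities, which gives a simple sanity check for the implementation.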