Goto

Collaborating Authors

 Genre


Learning Multi-modal Similarity

arXiv.org Artificial Intelligence

In many applications involving multi-media data, the definition of similarity between items is integral to several key tasks, e.g., nearest-neighbor retrieval, classification, and recommendation. Data in such regimes typically exhibits multiple modalities, such as acoustic and visual content of video. Integrating such heterogeneous data to form a holistic similarity space is therefore a key challenge to be overcome in many real-world applications. We present a novel multiple kernel learning technique for integrating heterogeneous data into a single, unified similarity space. Our algorithm learns an optimal ensemble of kernel transfor- mations which conform to measurements of human perceptual similarity, as expressed by relative comparisons. To cope with the ubiquitous problems of subjectivity and inconsistency in multi- media similarity, we develop graph-based techniques to filter similarity measurements, resulting in a simplified and robust training procedure.


Prediction by Compression

arXiv.org Artificial Intelligence

It is well known that text compression can be achieved by predicting the next symbol in the stream of text data based on the history seen up to the current symbol. The better the prediction the more skewed the conditional probability distribution of the next symbol and the shorter the codeword that needs to be assigned to represent this next symbol. What about the opposite direction ? suppose we have a black box that can compress text stream. Can it be used to predict the next symbol in the stream ? We introduce a criterion based on the length of the compressed data and use it to predict the next symbol. We examine empirically the prediction error rate and its dependency on some compression parameters.


Sparse Group Restricted Boltzmann Machines

arXiv.org Machine Learning

Since learning is typically very slow in Boltzmann machines, there is a need to restrict connections within hidden layers. However, the resulting states of hidden units exhibit statistical dependencies. Based on this observation, we propose using $l_1/l_2$ regularization upon the activation possibilities of hidden units in restricted Boltzmann machines to capture the loacal dependencies among hidden units. This regularization not only encourages hidden units of many groups to be inactive given observed data but also makes hidden units within a group compete with each other for modeling observed data. Thus, the $l_1/l_2$ regularization on RBMs yields sparsity at both the group and the hidden unit levels. We call RBMs trained with the regularizer \emph{sparse group} RBMs. The proposed sparse group RBMs are applied to three tasks: modeling patches of natural images, modeling handwritten digits and pretaining a deep networks for a classification task. Furthermore, we illustrate the regularizer can also be applied to deep Boltzmann machines, which lead to sparse group deep Boltzmann machines. When adapted to the MNIST data set, a two-layer sparse group Boltzmann machine achieves an error rate of $0.84\%$, which is, to our knowledge, the best published result on the permutation-invariant version of the MNIST task.


Entropy-Based Search Algorithm for Experimental Design

arXiv.org Machine Learning

The scientific method relies on the iterated processes of inference and inquiry. The inference phase consists of selecting the most probable models based on the available data; whereas the inquiry phase consists of using what is known about the models to select the most relevant experiment. Optimizing inquiry involves searching the parameterized space of experiments to select the experiment that promises, on average, to be maximally informative. In the case where it is important to learn about each of the model parameters, the relevance of an experiment is quantified by Shannon entropy of the distribution of experimental outcomes predicted by a probable set of models. If the set of potential experiments is described by many parameters, we must search this high-dimensional entropy space. Brute force search methods will be slow and computationally expensive. We present an entropy-based search algorithm, called nested entropy sampling, to select the most informative experiment for efficient experimental design. This algorithm is inspired by Skilling's nested sampling algorithm used in inference and borrows the concept of a rising threshold while a set of experiment samples are maintained. We demonstrate that this algorithm not only selects highly relevant experiments, but also is more efficient than brute force search. Such entropic search techniques promise to greatly benefit autonomous experimental design.


Non-Transferable Utility Coalitional Games via Mixed-Integer Linear Constraints

Journal of Artificial Intelligence Research

Coalitional games serve the purpose of modeling payoff distribution problems in scenarios where agents can collaborate by forming coalitions in order to obtain higher worths than by acting in isolation. In the classical Transferable Utility (TU) setting, coalition worths can be freely distributed amongst agents. However, in several application scenarios, this is not the case and the Non-Transferable Utility setting (NTU) must be considered, where additional application-oriented constraints are imposed on the possible worth distributions. In this paper, an approach to define NTU games is proposed which is based on describing allowed distributions via a set of mixed-integer linear constraints applied to an underlying TU game. It is shown that such games allow non-transferable conditions on worth distributions to be specified in a natural and succinct way. The properties and the relationships among the most prominent solution concepts for NTU games that hold when they are applied on (mixed-integer) constrained games are investigated. Finally, a thorough analysis is carried out to assess the impact of issuing constraints on the computational complexity of some of these solution concepts.


Cause Identification from Aviation Safety Incident Reports via Weakly Supervised Semantic Lexicon Construction

Journal of Artificial Intelligence Research

The Aviation Safety Reporting System collects voluntarily submitted reports on aviation safety incidents to facilitate research work aiming to reduce such incidents. To effectively reduce these incidents, it is vital to accurately identify why these incidents occurred. More precisely, given a set of possible causes, or shaping factors, this task of cause identification involves identifying all and only those shaping factors that are responsible for the incidents described in a report. We investigate two approaches to cause identification. Both approaches exploit information provided by a semantic lexicon, which is automatically constructed via Thelen and Riloff's Basilisk framework augmented with our linguistic and algorithmic modifications. The first approach labels a report using a simple heuristic, which looks for the words and phrases acquired during the semantic lexicon learning process in the report. The second approach recasts cause identification as a text classification problem, employing supervised and transductive text classification algorithms to learn models from incident reports labeled with shaping factors and using the models to label unseen reports. Our experiments show that both the heuristic-based approach and the learning-based approach (when given sufficient training data) outperform the baseline system significantly.


Directed Plateau Search for MAX-k-SAT

AAAI Conferences

Local search algorithms for MAX-k-SAT must often explore large regions of mutually connected equal moves, or plateaus, typically by taking random walks through the region. In this paper, we develop a surrogate plateau "gradient" function using a Walsh transform of the objective function. This function gives the mean value of the objective function over localized volumes of the search space.  This information can be used to direct search through plateaus more quickly. The focus of this paper is on demonstrating that formal analysis of search space structure can direct existing algorithms in a more principled manner than random walks.  We show that embedding the gradient computation into a hill-climbing local search for MAX-k-SAT improves its convergence profile.


Anytime Heuristic Search: Frameworks and Algorithms

AAAI Conferences

Anytime search is a pragmatic approach for trading solution cost and solving time. It can also be used for solving problems within a time bound. Three frameworks for constructing anytime algorithms from bounded suboptimal search have been proposed: continuing search, repairing search, and restarting search, but what combination of suboptimal search and anytime framework performs best? An extensive empirical evaluation results in several novel algorithms and reveals that the relative performance of frameworks is essentially fixed, with the repairing framework having the strongest overall performance. As part of our study, we present two enhancements to Anytime Window A* that allow it to solve a wider range of problems and hastens its convergance on optimal solutions.


Potential Search: A New Greedy Anytime Heuristic Search

AAAI Conferences

In this paper we explore a novel approach for anytime heuristic search, in which the node that is most probable to improve the incumbent solution is expanded first. This is especially suited for the "anytime aspect" of anytime algorithms - the possibility that the algorithm will be be halted anytime throughout the search. The potential of a node to improve the incumbent solution is estimated by a custom cost function, resulting in Potential Search, an anytime best-first search. Experimental results on the 15-puzzle and on the key player problem in communication networks (KPP-COM) show that this approach is competitive with state-of-the-art anytime heuristic search algorithms, and is more robust.


Adaptive K-Parallel Best-First Search: A Simple but Efficient Algorithm for Multi-Core Domain-Independent Planning

AAAI Conferences

Motivated by the recent hardware evolution towards multi-core machines, we investigate parallel planning techniques in a shared-memory environment. We consider, more specifically, parallel versions of a best-first search algorithm that run K threads, each expanding the next best node from the open list. We show that the proposed technique has a number of advantages. First, it is (reasonably) simple: we show how the algorithm can be obtained from a sequential version mostly by adding parallel annotations. Second, we conduct an extensive empirical study that shows that this approach is quite effective.  It is also dynamic in the sense that the number of nodes expanded in parallel is adapted during the search. Overall we show that the approach is promising for parallel domain-independent, suboptimal planning.