Goto

Collaborating Authors

 Country


Information theoretic model validation for clustering

arXiv.org Machine Learning

Model selection in clustering requires (i) to specify a suitable clustering principle and (ii) to control the model order complexity by choosing an appropriate number of clusters depending on the noise level in the data. We advocate an information theoretic perspective where the uncertainty in the measurements quantizes the set of data partitionings and, thereby, induces uncertainty in the solution space of clusterings. A clustering model, which can tolerate a higher level of fluctuations in the measurements than alternative models, is considered to be superior provided that the clustering solution is equally informative. This tradeoff between \emph{informativeness} and \emph{robustness} is used as a model selection criterion. The requirement that data partitionings should generalize from one data set to an equally probable second data set gives rise to a new notion of structure induced information.


Brain-Like Stochastic Search: A Research Challenge and Funding Opportunity

arXiv.org Artificial Intelligence

- Brain-Like Stochastic Search (BLiSS) refers to this task: given a family of utility functions U(u,), where is a vector of parameters or task descriptors, maximize or minimize U with respect to u, using networks (Option Nets) which input and learn to generate good options u stochastically. This paper discusses why this is crucial to brain-like intelligence (an area funded by NSF) and to many applications, and discusses various possibilities for network design and training. Of course, there are many forms of evolutionary computing. There are also classical methods, like Gibbs search and the sophisticated trust region approaches recently developed by Barhen et al and used on the Desert Storm tank routing problem. There are a few neural net designs (like Kohonen nets, but not Hopfield nets) which have had competitive performance on some specialized large-scale optimization problems.


Learning Probabilistic Hierarchical Task Networks to Capture User Preferences

arXiv.org Artificial Intelligence

We propose automatically learning probabilistic Hierarchical Task Networks (pH-TNs) in order to capture a user's preferences on plans, by observing only the user's behavior. HTNs are a common choice of representation for a variety of purposes in planning, including work on learning in planning. Our contributions are (a) learning structure and (b) representing preferences. In contrast, prior work employing HTNs considers learning method preconditions (instead of structure) and representing domain physics or search control knowledge (rather than preferences). Initially we will assume that the observed distribution of plans is an accurate representation of user preference, and then generalize to the situation where feasibility constraints frequently prevent the execution of preferred plans. In order to learn a distribution on plans we adapt an Expectation-Maximization (EM) technique from the discipline of (probabilistic) grammar induction, taking the perspective of task reductions as productions in a context-free grammar over primitive actions. To account for the difference between the distributions of possible and preferred plans we subsequently modify this core EM technique, in short, by rescaling its input.


Structured Variable Selection with Sparsity-Inducing Norms

arXiv.org Machine Learning

We consider the empirical risk minimization problem for linear supervised learning, with regularization by structured sparsity-inducing norms. These are defined as sums of Euclidean norms on certain subsets of variables, extending the usual $\ell_1$-norm and the group $\ell_1$-norm by allowing the subsets to overlap. This leads to a specific set of allowed nonzero patterns for the solutions of such problems. We first explore the relationship between the groups defining the norm and the resulting nonzero patterns, providing both forward and backward algorithms to go back and forth from groups to patterns. This allows the design of norms adapted to specific prior knowledge expressed in terms of nonzero patterns. We also present an efficient active set algorithm, and analyze the consistency of variable selection for least-squares linear regression in low and high-dimensional settings.


A Survey of Paraphrasing and Textual Entailment Methods

arXiv.org Artificial Intelligence

Paraphrasing methods recognize, generate, or extract phrases, sentences, or longer natural language expressions that convey almost the same information. Textual entailment methods, on the other hand, recognize, generate, or extract pairs of natural language expressions, such that a human who reads (and trusts) the first element of a pair would most likely infer that the other element is also true. Paraphrasing can be seen as bidirectional textual entailment and methods from the two areas are often similar. Both kinds of methods are useful, at least in principle, in a wide range of natural language processing applications, including question answering, summarization, text generation, and machine translation. We summarize key ideas from the two areas by considering in turn recognition, generation, and extraction methods, also pointing to prominent articles and resources.


Grounding FO and FO(ID) with Bounds

Journal of Artificial Intelligence Research

Grounding is the task of reducing a first-order theory and finite domain to an equivalent propositional theory. It is used as preprocessing phase in many logic-based reasoning systems. Such systems provide a rich first-order input language to a user and can rely on efficient propositional solvers to perform the actual reasoning. Besides a first-order theory and finite domain, the input for grounders contains in many applications also additional data. By exploiting this data, the size of the grounder's output can often be reduced significantly. A common practice to improve the efficiency of a grounder in this context is by manually adding semantically redundant information to the input theory, indicating where and when the grounder should exploit the data. In this paper we present a method to compute and add such redundant information automatically. Our method therefore simplifies the task of writing input theories that can be grounded efficiently by current systems. We first present our method for classical first-order logic (FO) theories. Then we extend it to FO(ID), the extension of FO with inductive definitions, which allows for more concise and comprehensive input theories. We discuss implementation issues and experimentally validate the practical applicability of our method.


Constructing Reference Sets from Unstructured, Ungrammatical Text

Journal of Artificial Intelligence Research

Vast amounts of text on the Web are unstructured and ungrammatical, such as classified ads, auction listings, forum postings, etc. We call such text posts. Despite their inconsistent structure and lack of grammar, posts are full of useful information. This paper presents work on semi-automatically building tables of relational information, called reference sets, by analyzing such posts directly. Reference sets can be applied to a number of tasks such as ontology maintenance and information extraction. Our reference-set construction method starts with just a small amount of background knowledge, and constructs tuples representing the entities in the posts to form a reference set. We also describe an extension to this approach for the special case where even this small amount of background knowledge is impossible to discover and use. To evaluate the utility of the machine-constructed reference sets, we compare them to manually constructed reference sets in the context of reference-set-based information extraction. Our results show the reference sets constructed by our method outperform manually constructed reference sets. We also compare the reference-set-based extraction approach using the machine-constructed reference set to supervised extraction approaches using generic features. These results demonstrate that using machine-constructed reference sets outperforms the supervised methods, even though the supervised methods require training data.


A Survey of Paraphrasing and Textual Entailment Methods

Journal of Artificial Intelligence Research

Paraphrasing methods recognize, generate, or extract phrases, sentences, or longer natural language expressions that convey almost the same information. Textual entailment methods, on the other hand, recognize, generate, or extract pairs of natural language expressions, such that a human who reads (and trusts) the first element of a pair would most likely infer that the other element is also true. Paraphrasing can be seen as bidirectional textual entailment and methods from the two areas are often similar. Both kinds of methods are useful, at least in principle, in a wide range of natural language processing applications, including question answering, summarization, text generation, and machine translation. We summarize key ideas from the two areas by considering in turn recognition, generation, and extraction methods, also pointing to prominent articles and resources.


An Empirical Study of the Manipulability of Single Transferable Voting

arXiv.org Artificial Intelligence

Voting is a simple mechanism to combine together the preferences of multiple agents. Agents may try to manipulate the result of voting by mis-reporting their preferences. One barrier that might exist to such manipulation is computational complexity. In particular, it has been shown that it is NP-hard to compute how to manipulate a number of different voting rules. However, NP-hardness only bounds the worst-case complexity. Recent theoretical results suggest that manipulation may often be easy in practice. In this paper, we study empirically the manipulability of single transferable voting (STV) to determine if computational complexity is really a barrier to manipulation. STV was one of the first voting rules shown to be NP-hard. It also appears one of the harder voting rules to manipulate. We sample a number of distributions of votes including uniform and real world elections. In almost every election in our experiments, it was easy to compute how a single agent could manipulate the election or to prove that manipulation by a single agent was impossible.


Distilled Sensing: Adaptive Sampling for Sparse Detection and Estimation

arXiv.org Machine Learning

Adaptive sampling results in dramatic improvements in the recovery of sparse signals in white Gaussian noise. A sequential adaptive sampling-and-refinement procedure called Distilled Sensing (DS) is proposed and analyzed. DS is a form of multi-stage experimental design and testing. Because of the adaptive nature of the data collection, DS can detect and localize far weaker signals than possible from non-adaptive measurements. In particular, reliable detection and localization (support estimation) using non-adaptive samples is possible only if the signal amplitudes grow logarithmically with the problem dimension. Here it is shown that using adaptive sampling, reliable detection is possible provided the amplitude exceeds a constant, and localization is possible when the amplitude exceeds any arbitrarily slowly growing function of the dimension.