Asia
Controlling Complexity in Part-of-Speech Induction
Graca, J. V., Ganchev, K., Coheur, L., Pereira, F., Taskar, B.
We consider the problem of fully unsupervised learning of grammatical (part-of-speech) categories from unlabeled text. The standard maximum-likelihood hidden Markov model for this task performs poorly, because of its weak inductive bias and large model capacity. We address this problem by refining the model and modifying the learning objective to control its capacity via para- metric and non-parametric constraints. Our approach enforces word-category association sparsity, adds morphological and orthographic features, and eliminates hard-to-estimate parameters for rare words. We develop an efficient learning algorithm that is not much more computationally intensive than standard training. We also provide an open-source implementation of the algorithm. Our experiments on five diverse languages (Bulgarian, Danish, English, Portuguese, Spanish) achieve significant improvements compared with previous methods for the same task.
A Probabilistic Framework for Learning Kinematic Models of Articulated Objects
Sturm, J., Stachniss, C., Burgard, W.
Robots operating in domestic environments generally need to interact with articulated objects, such as doors, cabinets, dishwashers or fridges. In this work, we present a novel, probabilistic framework for modeling articulated objects as kinematic graphs. Vertices in this graph correspond to object parts, while edges between them model their kinematic relationship. In particular, we present a set of parametric and non-parametric edge models and how they can robustly be estimated from noisy pose observations. We furthermore describe how to estimate the kinematic structure and how to use the learned kinematic models for pose prediction and for robotic manipulation tasks. We finally present how the learned models can be generalized to new and previously unseen objects. In various experiments using real robots with different camera systems as well as in simulation, we show that our approach is valid, accurate and efficient. Further, we demonstrate that our approach has a broad set of applications, in particular for the emerging fields of mobile manipulation and service robotics.
Confidentiality-Preserving Data Publishing for Credulous Users by Extended Abduction
Inoue, Katsumi, Sakama, Chiaki, Wiese, Lena
Publishing private data on external servers incurs the problem of how to avoid unwanted disclosure of confidential data. We study a problem of confidentiality in extended disjunctive logic programs and show how it can be solved by extended abduction. In particular, we analyze how credulous non-monotonic reasoning affects confidentiality.
Event in Compositional Dynamic Semantics
We present a framework which constructs an event-style dis- course semantics. The discourse dynamics are encoded in continuation semantics and various rhetorical relations are embedded in the resulting interpretation of the framework. We assume discourse and sentence are distinct semantic objects, that play different roles in meaning evalua- tion. Moreover, two sets of composition functions, for handling different discourse relations, are introduced. The paper first gives the necessary background and motivation for event and dynamic semantics, then the framework with detailed examples will be introduced.
Toward Parts-Based Scene Understanding with Pixel-Support Parts-Sparse Pictorial Structures
Scene understanding remains a significant challenge in the computer vision community. The visual psychophysics literature has demonstrated the importance of interdependence among parts of the scene. Yet, the majority of methods in computer vision remain local. Pictorial structures have arisen as a fundamental parts-based model for some vision problems, such as articulated object detection. However, the form of classical pictorial structures limits their applicability for global problems, such as semantic pixel labeling. In this paper, we propose an extension of the pictorial structures approach, called pixel-support parts-sparse pictorial structures, or PS3, to overcome this limitation. Our model extends the classical form in two ways: first, it defines parts directly based on pixel-support rather than in a parametric form, and second, it specifies a space of plausible parts-based scene models and permits one to be used for inference on any given image. PS3 makes strides toward unifying object-level and pixel-level modeling of scene elements. In this report, we implement the first half of our model and rely upon external knowledge to provide an initial graph structure for a given image. Our experimental results on benchmark datasets demonstrate the capability of this new parts-based view of scene modeling.
A Machine Learning Perspective on Predictive Coding with PAQ
Knoll, Byron, de Freitas, Nando
PAQ8 is an open source lossless data compression algorithm that currently achieves the best compression rates on many benchmarks. This report presents a detailed description of PAQ8 from a statistical machine learning perspective. It shows that it is possible to understand some of the modules of PAQ8 and use this understanding to improve the method. However, intuitive statistical explanations of the behavior of other modules remain elusive. We hope the description in this report will be a starting point for discussions that will increase our understanding, lead to improvements to PAQ8, and facilitate a transfer of knowledge from PAQ8 to other machine learning methods, such a recurrent neural networks and stochastic memoizers. Finally, the report presents a broad range of new applications of PAQ to machine learning tasks including language modeling and adaptive text prediction, adaptive game playing, classification, and compression using features from the field of deep learning.
A sticky HDP-HMM with application to speaker diarization
Fox, Emily B., Sudderth, Erik B., Jordan, Michael I., Willsky, Alan S.
We consider the problem of speaker diarization, the problem of segmenting an audio recording of a meeting into temporal segments corresponding to individual speakers. The problem is rendered particularly difficult by the fact that we are not allowed to assume knowledge of the number of people participating in the meeting. To address this problem, we take a Bayesian nonparametric approach to speaker diarization that builds on the hierarchical Dirichlet process hidden Markov model (HDP-HMM) of Teh et al. [J. Amer. Statist. Assoc. 101 (2006) 1566--1581]. Although the basic HDP-HMM tends to over-segment the audio data---creating redundant states and rapidly switching among them---we describe an augmented HDP-HMM that provides effective control over the switching rate. We also show that this augmentation makes it possible to treat emission distributions nonparametrically. To scale the resulting architecture to realistic diarization problems, we develop a sampling algorithm that employs a truncated approximation of the Dirichlet process to jointly resample the full state sequence, greatly improving mixing rates. Working with a benchmark NIST data set, we show that our Bayesian nonparametric architecture yields state-of-the-art speaker diarization results.
Finding Similar/Diverse Solutions in Answer Set Programming
Eiter, Thomas, Erdem, Esra, Erdogan, Halit, Fink, Michael
For some computational problems (e.g., product configuration, planning, diagnosis, query answering, phylogeny reconstruction) computing a set of similar/diverse solutions may be desirable for better decision-making. With this motivation, we studied several decision/optimization versions of this problem in the context of Answer Set Programming (ASP), analyzed their computational complexity, and introduced offline/online methods to compute similar/diverse solutions of such computational problems with respect to a given distance function. All these methods rely on the idea of computing solutions to a problem by means of finding the answer sets for an ASP program that describes the problem. The offline methods compute all solutions in advance using the ASP formulation of the problem with an ASP solver, like Clasp, and then identify similar/diverse solutions using clustering methods. The online methods compute similar/diverse solutions following one of the three approaches: by reformulating the ASP representation of the problem to compute similar/diverse solutions at once using an ASP solver; by computing similar/diverse solutions iteratively (one after other) using an ASP solver; by modifying the search algorithm of an ASP solver to compute similar/diverse solutions incrementally. We modified Clasp to implement the last online method and called it Clasp-NK. In the first two online methods, the given distance function is represented in ASP; in the last one it is implemented in C++. We showed the applicability and the effectiveness of these methods on reconstruction of similar/diverse phylogenies for Indo-European languages, and on several planning problems in Blocks World. We observed that in terms of computational efficiency the last online method outperforms the others; also it allows us to compute similar/diverse solutions when the distance function cannot be represented in ASP.
A Knowledge Mining Model for Ranking Institutions using Rough Computing with Ordering Rules and Formal Concept analysis
Acharjya, D. P., Ezhilarasi, L.
Emergences of computers and information technological revolution made tremendous changes in the real world and provides a different dimension for the intelligent data analysis. Well formed fact, the information at right time and at right place deploy a better knowledge. However, the challenge arises when larger volume of inconsistent data is given for decision making and knowledge extraction. To handle such imprecise data certain mathematical tools of greater importance has developed by researches in recent past namely fuzzy set, intuitionistic fuzzy set, rough Set, formal concept analysis and ordering rules. It is also observed that many information system contains numerical attribute values and therefore they are almost similar instead of exact similar. To handle such type of information system, in this paper we use two processes such as pre process and post process. In pre process we use rough set on intuitionistic fuzzy approximation space with ordering rules for finding the knowledge whereas in post process we use formal concept analysis to explore better knowledge and vital factors affecting decisions.
Between Frustration and Elation: Sense of Control Regulates the lntrinsic Motivation for Motor Learning
Grzyb, Beata J. (Jaume I University and Osaka University) | Boedecker, Joschka (Osaka University) | Asada, Minoru (Osaka University) | Pobil, Angel P. del (Jaume I University) | Smith, Linda B. (Indiana University)
Frustration has been generally viewed in a negative light and its potential role in learning neglected. We propose a new approach to intrinsically motivated learning where frustration is a key factor that allows to dynamically balance exploration and exploitation. Moreover, based on the result obtained from our experiment with older infants, we propose that a temporary decrease in learning from negative feedback can also be beneficial in fine-tuning a newly learned behavior. We suggest that this temporal indifference to the outcome of an action may be related to the sense of control, and results from the state of elation, that is the experience of overcoming a very difficult task after prolonged frustration. Our preliminary simulation results serve as a proof-of-concept for our approach.