Learning Graphical Models
Online Max-Margin Weight Learning with Markov Logic Networks
Huynh, Tuyen N. (The University of Texas at Austin) | Mooney, Raymond J. (The University of Texas at Austin)
Most of the existing weight-learning algorithms for Markov Logic Networks (MLNs) use batch training which becomes computationally expensive and even infeasible for very large datasets since the training examples may not fit in main memory. To overcome this problem, previous work has used online learning algorithms to learn weights for MLNs. However, this prior work has only applied existing online algorithms, and there is no comprehensive study of online weight learning for MLNs. In this paper, we derive new online algorithms for structured prediction using the primal-dual framework, apply them to learn weights for MLNs, and compare against existing online algorithms on two large, real-world datasets. The experimental results show that the new algorithms achieve better accuracy than existing methods.
Relational Learning for Collective Classification of Entities in Images
Chechetka, Anton (Carnegie Mellon University) | Dash, Denver (Intel Labs Pittsburgh) | Philipose, Matthai (Intel Labs Seattle)
We consider the problem of discrete multi-label entity classification in images. We argue that the framework of Markov Logic can provide a unified, well-grounded mechanism to incorporate arbitrary logical relationships between entities to improve classification in images, and thus generalizes much of the recent work on exploiting local and global context in object recognition and scene understanding. Furthermore, we show that Markov Logic can provide a powerful new set of contexts that can relate entities across images in a database for joint classification of all entities in a test set simultaneously. We relate this collective classification of images to graph-based semi-supervised learning approaches, and show that Markov Logic can effectively provide a method to unify context-related work with semi-supervised approaches in a way that neither techniques could easily do on their own. Finally, we show the efficacy of these techniques on a face recognition task on three datasets showing that adding contextual relations dramatically improves accuracy over semi-supervised learning approaches alone.
Activity Recognition Based on Home to Home Transfer Learning
Rashidi, Parisa (Washington State University) | Cook, Diane J. (Washington State University)
Activity recognition plays an important role in many areas such as smart environments by offering unprecedented opportunities for assisted living, automation, security and energy efficiency. It’s also an essential component for planning and plan recognition in smart environments. One challenge of activity recognition is the need for collecting and annotating huge amounts of data for each new physical setting in order to be able to carry out the conventional activity discovery and recognition algorithms. This extensive initial phase of data collection and annotation results in a prolonged installation process and excessive time investment for each new space. In this paper we propose a new method of transferring learned knowledge of activities to a new physical space in order to leverage the learning process in the new environment. Our method called ”Home to Home Transfer Learning” (HHTL) is based on using a semi EM framework and modeling activities using structural, temporal and spatial features. This method allows us to avoid the tedious task of collecting and labeling huge amounts of data in the target space, and allows for a more accelerated and more scalable deployment cycle in the real world. It also allows us to exploit the insights learned in previous spaces. To validate our algorithms, we use the data collected in several smart apartments with different physical layouts.
Handling Looping and Optional Actions in YAPPR
Geib, Christopher (University of Edinburgh) | Goldman, Robert (SIFT LLC)
Previous work on the YAPPR plan recognition system provided algorithms for translating conventional HTN plan libraries into lexicalized grammars and treated the problem of plan recognition as one of parsing. To produce these grammars required a fixed bound for any loops within the grammar and a presented a problem for optional actions within HTN plans. In this work we show that well known transformations from formal language theory can be used to rewrite the plan grammars to remove these limitations on the plan libraries.
Teamwork and Coordination under Model Uncertainty in DEC-POMDPs
Kwak, Jun-young (University of Southern California) | Yang, Rong (University of Southern California) | Yin, Zhengyu (University of Southern California) | Taylor, Matthew E. (University of Southern California) | Tambe, Milind (University of Southern California)
Distributed Partially Observable Markov Decision Processes (DEC-POMDPs) are a popular planning framework for multiagent teamwork to compute (near-)optimal plans. However, these methods assume a complete and correct world model, which is often violated in real-world domains. We provide a new algorithm for DEC-POMDPs that is more robust to model uncertainty, with a focus on domains with sparse agent interactions. Our STC algorithm relies on the following key ideas: (1) reduce planning-time computation by shifting some of the burden to execution-time reasoning, (2) exploit sparse interactions between agents, and (3) maintain an approximate model of agents’ beliefs. We empirically show that STC is often substantially faster to existing DEC-POMDP methods without sacrificing reward performance.
A Computational Decision Theory for Interactive Assistants
Fern, Alan (Oregon State University) | Tadepalli, Prasad (Oregon State University)
We study several classes of interactive assistants from the points of view of decision theory and computational complexity. We first introduce a special class of POMDPs called hidden-goal MDPs (HGMDPs), which formalize the problem of interactively assisting an agent whose goal is hidden and whose actions are observable. In spite of its restricted nature, we show that optimal action selection in finite horizon HGMDPs is PSPACE-complete even in domains with deterministic dynamics. We then introduce a more restricted model called helper action MDPs (HAMDPs), where the assistant’s action is accepted by the agent when it is helpful, and can be easily ignored by the agent otherwise. We show classes of HAMDPs that are complete for PSPACE and NP along with a polynomial time class. Furthermore, we show that for general HAMDPs a simple myopic policy achieves a regret, compared to an omniscient assistant, that is bounded by the entropy of the initial goal distribution. A variation of this policy is also shown to achieve worst-case regret that is logarithmic in the number of goals for any goal distribution.
Application of Data Mining to Network Intrusion Detection: Classifier Selection Model
As network attacks have increased in number and severity over the past few years, intrusion detection system (IDS) is increasingly becoming a critical component to secure the network. Due to large volumes of security audit data as well as complex and dynamic properties of intrusion behaviors, optimizing performance of IDS becomes an important open problem that is receiving more and more attention from the research community. The uncertainty to explore if certain algorithms perform better for certain attack classes constitutes the motivation for the reported herein. In this paper, we evaluate performance of a comprehensive set of classifier algorithms using KDD99 dataset. Based on evaluation results, best algorithms for each attack category is chosen and two classifier algorithm selection models are proposed. The simulation result comparison indicates that noticeable performance improvement and real-time intrusion detection can be achieved as we apply the proposed models to detect different kinds of network attacks.
An axiomatic formalization of bounded rationality based on a utility-information equivalence
Ortega, Pedro A., Braun, Daniel A.
Classic decision-theory is based on the maximum expected utility (MEU) principle, but crucially ignores the resource costs incurred when determining optimal decisions. Here we propose an axiomatic framework for bounded decision-making that considers resource costs. Agents are formalized as probability measures over input-output streams. We postulate that any such probability measure can be assigned a corresponding conjugate utility function based on three axioms: utilities should be real-valued, additive and monotonic mappings of probabilities. We show that these axioms enforce a unique conversion law between utility and probability (and thereby, information). Moreover, we show that this relation can be characterized as a variational principle: given a utility function, its conjugate probability measure maximizes a free utility functional. Transformations of probability measures can then be formalized as a change in free utility due to the addition of new constraints expressed by a target utility function. Accordingly, one obtains a criterion to choose a probability measure that trades off the maximization of a target utility function and the cost of the deviation from a reference distribution. We show that optimal control, adaptive estimation and adaptive control problems can be solved this way in a resource-efficient way. When resource costs are ignored, the MEU principle is recovered. Our formalization might thus provide a principled approach to bounded rationality that establishes a close link to information theory.
Feature Construction for Relational Sequence Learning
Di Mauro, Nicola, Basile, Teresa M. A., Ferilli, Stefano, Esposito, Floriana
We tackle the problem of multi-class relational sequence learning using relevant patterns discovered from a set of labelled sequences. To deal with this problem, firstly each relational sequence is mapped into a feature vector using the result of a feature construction method. Since, the efficacy of sequence learning algorithms strongly depends on the features used to represent the sequences, the second step is to find an optimal subset of the constructed features leading to high classification accuracy. This feature selection task has been solved adopting a wrapper approach that uses a stochastic local search algorithm embedding a naive Bayes classifier. The performance of the proposed method applied to a real-world dataset shows an improvement when compared to other established methods, such as hidden Markov models, Fisher kernels and conditional random fields for relational sequences.
Learning to Predict Combinatorial Structures
The major challenge in designing a discriminative learning algorithm for predicting structured data is to address the computational issues arising from the exponential size of the output space. Existing algorithms make different assumptions to ensure efficient, polynomial time estimation of model parameters. For several combinatorial structures, including cycles, partially ordered sets, permutations and other graph classes, these assumptions do not hold. In this thesis, we address the problem of designing learning algorithms for predicting combinatorial structures by introducing two new assumptions: (i) The first assumption is that a particular counting problem can be solved efficiently. The consequence is a generalisation of the classical ridge regression for structured prediction. (ii) The second assumption is that a particular sampling problem can be solved efficiently. The consequence is a new technique for designing and analysing probabilistic structured prediction models. These results can be applied to solve several complex learning problems including but not limited to multi-label classification, multi-category hierarchical classification, and label ranking.