Overview
Experiments with Massively Parallel Constraint Solving
Bordeaux, Lucas (Microsoft Research) | Hamadi, Youssef (Microsoft Research) | Samulowitz, Horst (Microsoft Research)
The computing industry is currently facing a major architectural shift. Extra computing power no longer comes from higher processor frequencies, but from a growing number of computing cores and processors. For AI, and constraint solving in particular, this raises the question of how to scale current solving techniques to massively parallel architectures. While prior work focuses mostly on small-scale parallel constraint solving, in this paper we conduct the first study of the scalability of constraint solving on 100 processors and beyond. We propose techniques that are simple to apply and show empirically that they scale surprisingly well. These techniques establish a performance baseline for parallel constraint solving technologies against which more sophisticated parallel algorithms will need to compete in the future.
Predicting Learnt Clauses Quality in Modern SAT Solvers
Audemard, Gilles (University of Artois) | Simon, Laurent (University Paris-Sud)
Despite the impressive progress made by SAT solvers over the last ten years, few works have tried to understand why Conflict Directed Clause Learning (CDCL) algorithms are so strong and efficient on most industrial applications. We report in this work a key observation about the behavior of CDCL solvers on this family of benchmarks and explain it by an unsuspected side effect of their particular clause learning scheme. This new paradigm allows us to answer an important, still open question: how to design a fast, static, accurate, and predictive measure of the pertinence of newly learnt clauses. The paper provides empirical evidence that our new learning scheme improves state-of-the-art results by an order of magnitude on both SAT and UNSAT industrial problems.
Investigations of Continual Computation
Shahaf, Dafna (Carnegie Mellon) | Horvitz, Eric (Microsoft Research)
Autonomous agents that sense, reason, and act in real-world environments for extended periods often need to solve streams of incoming problems. Traditionally, effort is applied only to problems that have already arrived and have been noted. We examine continual computation methods that allow agents to ideally allocate time to solving current as well as potential future problems under uncertainty. We first review prior work on continual computation. Then, we present new directions and results, including the consideration of shared subtasks and multiple tasks. We present results on the computational complexity of the continual-computation problem and provide approximations for arbitrary models of computational performance. Finally, we review special formulations for addressing uncertainty about the best algorithm to apply, learning about performance, and considering costs associated with delayed use of results.
Activity Recognition: Linking Low-Level Sensors to High-Level Intelligence
Yang, Qiang (Hong Kong University of Science and Technology)
Sensors provide computer systems with a window to the outside world. Activity recognition "sees" what is in the window to predict the locations, trajectories, actions, goals and plans of humans and objects. Building an activity recognition system requires a full range of interaction from statistical inference on lower level sensor data to symbolic AI at higher levels, where prediction results and acquired knowledge are passed up each level to form a knowledge food chain. In this article, I will give an overview of some of the current activity recognition research works and explore a life-cycle of learning and inference that allows the lowest-level radio-frequency signals to be transformed into symbolic logical representations for AI planning, which in turn controls the robots or guides human users through a sensor network, thus completing a full life-cycle of knowledge.
Towards Improving Validation, Verification, Crash Investigations, and Event Reconstruction of Flight-Critical Systems with Self-Forensics
In this paper we introduce a new concept for flight-critical integrated software and hardware systems: analyzing themselves forensically as needed and keeping forensics data for further automated analysis when anomalies, failures, or crashes are reported. We insist this should be part of the protocol not only for flight systems, but for any large and/or critical self-managed system. This proposition is a rehash of the author's related work during his PhD studies [1, 2] on the NASA spacecraft self-forensics concept, as well as a step towards improving the safety and crash investigation of road vehicles by similar means. We review some of the related work that these ideas are built upon prior to describing the requirements for self-forensics components. We describe the general requirements as well as limitations and advantages. This is a draft sketch.
Symmetry in Data Mining and Analysis: A Unifying View based on Hierarchy
Data analysis and data mining are concerned with unsupervised pattern finding and structure determination in data sets. The data sets themselves are explicitly linked as a form of representation to an observational or otherwise empirical domain of interest. "Structure" has long been understood as symmetry which can take many forms with respect to any transformation, including point, translational, rotational, and many others. Beginning with the role of number theory in expressing data, we show how we can naturally proceed to hierarchical structures. We show how this both encapsulates traditional paradigms in data analysis, and also opens up new perspectives towards issues that are on the order of the day, including data mining of massive, high dimensional, heterogeneous data sets. Linkages with other fields are also discussed including computational logic and symbolic dynamics. The structures in data surveyed here are based on hierarchy, represented as p-adic numbers or an ultrametric topology.
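To make the closing claim concrete, here is a minimal Python sketch of what "ultrametric" means for a p-adic distance on integers. It is an illustration only, not code from the paper: the function names (`p_adic_valuation`, `p_adic_distance`, `is_ultrametric`) are hypothetical, and the example is restricted to integers for simplicity. The defining property checked is the strong triangle inequality d(a, c) <= max(d(a, b), d(b, c)).

```python
from fractions import Fraction
from itertools import permutations


def p_adic_valuation(n: int, p: int) -> int:
    """Largest k such that p**k divides the nonzero integer n."""
    if n == 0:
        raise ValueError("the valuation of 0 is infinite")
    n = abs(n)
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def p_adic_distance(x: int, y: int, p: int) -> Fraction:
    """p-adic distance |x - y|_p = p**(-v), where v is the valuation of x - y."""
    if x == y:
        return Fraction(0)
    return Fraction(1, p ** p_adic_valuation(x - y, p))


def is_ultrametric(points, dist) -> bool:
    """Check the strong triangle inequality on every ordered triple of points."""
    return all(
        dist(a, c) <= max(dist(a, b), dist(b, c))
        for a, b, c in permutations(points, 3)
    )


if __name__ == "__main__":
    two_adic = lambda a, b: p_adic_distance(a, b, 2)
    euclidean = lambda a, b: abs(a - b)
    print(is_ultrametric(range(20), two_adic))   # 2-adic distance on 0..19
    print(is_ultrametric([0, 1, 2], euclidean))  # ordinary distance on the line
```

Under the 2-adic distance, two integers are close when their difference is divisible by a high power of 2, which groups them into nested classes; that nesting is the hierarchy the abstract refers to. The ordinary absolute-value distance fails the same check.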
SlidesGen: Automatic Generation of Presentation Slides for a Technical Paper Using Summarization
Sravanthi, M. (Indian Institute of Technology Madras) | Chowdary, C. Ravindranath (Indian Institute of Technology) | Kumar, P. Sreenivasa
Presentations are one of the most common and effective ways of communicating the overview of a work to the audience. Given a technical paper, automatic generation of presentation slides reduces the effort of the presenter and helps in creating a structured summary of the paper. In this paper, we propose the framework of a novel system that does this task. Any paper that has an abstract and whose sections can be categorized under introduction, related work, model, experiments and conclusions can be given as input. As documents in LaTeX are rich in structural and semantic information, we use them as input to our system. These documents are initially converted to XML format. This XML file is parsed and the information in it is extracted. A query-specific extractive summarizer is used to generate slides. All graphical elements from the paper are put to good use by placing them at appropriate locations in the slides. The slides are presented in document order.
Mining Meaning from Wikipedia
Medelyan, Olena | Milne, David | Legg, Catherine | Witten, Ian H.
Wikipedia is a goldmine of information, not just for its many readers, but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This article provides a comprehensive description of this work. It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and organizes the work into four broad categories: applying Wikipedia to natural language processing; using it to facilitate information retrieval; using it for information extraction; and using it as a resource for ontology building. The article addresses how Wikipedia is being used as is, how it is being improved and adapted, and how it is being combined with other structures to create entirely new resources. We identify the research groups and individuals involved, and how their work has developed in the last few years. We provide a comprehensive list of the open-source software they have produced.
Unsupervised Methods for Determining Object and Relation Synonyms on the Web
The task of identifying synonymous relations and objects, or synonym resolution, is critical for high-quality information extraction. This paper investigates synonym resolution in the context of unsupervised information extraction, where neither hand-tagged training examples nor domain knowledge is available. The paper presents a scalable, fully implemented system that runs in O(KN log N) time in the number of extractions, N, and the maximum number of synonyms per word, K. The system, called Resolver, introduces a probabilistic relational model for predicting whether two strings are co-referential based on the similarity of the assertions containing them. On a set of two million assertions extracted from the Web, Resolver resolves objects with 78% precision and 68% recall, and resolves relations with 90% precision and 35% recall. Several variations of Resolver's probabilistic model are explored, and experiments demonstrate that under appropriate conditions these variations can improve F1 by 5%. An extension to the basic Resolver system allows it to handle polysemous names with 97% precision and 95% recall on a data set from the TREC corpus.
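The abstract reports precision and recall for each task and refers to F1 without stating the F1 values themselves. As a rough aid, the following Python sketch computes F1 as the standard harmonic mean of precision and recall for the three reported pairs. The function name `f1_score` and the dictionary labels are illustrative assumptions, and the resulting numbers are derived here from the abstract's figures rather than quoted from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs as reported in the abstract; labels are illustrative.
reported = {
    "objects": (0.78, 0.68),
    "relations": (0.90, 0.35),
    "polysemous names (TREC)": (0.97, 0.95),
}

if __name__ == "__main__":
    for task, (p, r) in reported.items():
        print(f"{task}: precision={p:.2f} recall={r:.2f} F1={f1_score(p, r):.3f}")
```

Because F1 is a harmonic mean, it is pulled towards the lower of the two values, which is why the relation results score far below their 90% precision.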
An introduction to DSmT
Dezert, Jean | Smarandache, Florentin
The management and combination of uncertain, imprecise, fuzzy and even paradoxical or highly conflicting sources of information has always been, and still remains today, of prime importance for the development of reliable modern information systems involving artificial reasoning. The combination (fusion) of information arises in many fields of application nowadays (especially in defense, medicine, finance, geo-science, economy, etc.). When the outputs of several sensors, observers or experts have to be combined to solve a problem, or when one wants to update the current estimation of solutions for a given problem with some newly available information, powerful and solid mathematical tools are needed for the fusion, especially when the information one has to deal with is imprecise and uncertain. In this paper, we present a survey of our recent theory of plausible and paradoxical reasoning, known as Dezert-Smarandache Theory (DSmT) in the literature, developed for dealing with imprecise, uncertain and conflicting sources of information. Recent publications have shown the interest and the ability of DSmT to solve problems where other approaches fail, especially when conflict between sources becomes high. This presentation focuses on the foundations of DSmT and on its main rules of combination, rather than on surveying the specific applications of DSmT available in the literature. Several simple examples are given throughout the presentation to show the efficiency and the generality of DSmT.