Europe
Mining Meaning from Wikipedia
Medelyan, Olena, Milne, David, Legg, Catherine, Witten, Ian H.
Wikipedia is a goldmine of information; not just for its many readers, but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This article provides a comprehensive description of this work. It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and organizes the work into four broad categories: applying Wikipedia to natural language processing; using it to facilitate information retrieval and information extraction; and as a resource for ontology building. The article addresses how Wikipedia is being used as is, how it is being improved and adapted, and how it is being combined with other structures to create entirely new resources. We identify the research groups and individuals involved, and how their work has developed in the last few years. We provide a comprehensive list of the open-source software they have produced.
Feasibility of random basis function approximators for modeling and control
Tyukin, Ivan, Prokhorov, Danil
Abstract-- We discuss the role of random basis function approximators in modeling and control. We analyze the published work on random basis function approximators and demonstrate that their favorable error rate of convergence O(1/n) is guaranteed only with very substantial computational resources. We also discuss implications of our analysis for applications of neural networks in modeling and control. I. INTRODUCTION Efficient modeling and control of complex systems in the presence of uncertainties is important for modern engineering. This is especially true in the domain of intelligent systems that are designed to operate in uncertain environments. Physical models of such relations f(·) are not always available, and it is quite common to use mathematical substitutes such as, e.g., superpositions of (basis) functions that are capable of approximating a-priori unknown f(·) with the required precision.
Gaussian Belief with dynamic data and in dynamic network
In this paper we analyse Belief Propagation over a Gaussian model in a dynamic environment. Recently, this has been proposed as a method to average local measurement values by a distributed protocol ("Consensus Propagation", Moallemi & Van Roy, 2006), where the average is available for read-out at every single node. In the case that the underlying network is constant but the values to be averaged fluctuate ("dynamic data"), convergence and accuracy are determined by the spectral properties of an associated Ruelle-Perron-Frobenius operator. For Gaussian models on Erdos-Renyi graphs, numerical computation points to a spectral gap remaining in the large-size limit, implying exceptionally good scalability. In a model where the underlying network also fluctuates ("dynamic network"), averaging is more effective than in the dynamic data case. Altogether, this implies very good performance of these methods in very large systems, and opens a new field of statistical physics of large (and dynamic) information systems.
FaceBots: Steps Towards Enhanced Long-Term Human-Robot Interaction by Utilizing and Publishing Online Social Information
Mavridis, Nikolaos, Emami, Shervin, Datta, Chandan, Kamzi, Wajahat, BenAbdelkader, Chiraz, Toulis, Panos, Tanoto, Andry, Rabie, Tamer
The main problem addressed by this project is that of the creation of sustainable and meaningful long-term human robot relationships. This is a most important problem towards our ultimate goal of human-robot symbiosis, i.e. harmonious and mutually beneficial living together of the two species. In the shorter term, this is an important problem towards the successful application of robots to numerous areas: disabled and elderly assistance / companionship, supporting education, and more. So far, empirical investigations have shown that we have not advanced significantly yet towards its solution: Although existing robotic systems are interesting to interact with in the short term, it has been shown that after some weeks of quasi-regular encounters, humans gradually lose their interest, and meaningful longer-term human-robot relationships are not established. For example, in the case of Robovie [1], there was a steady and significant decrease in the total time of interaction of the robot with humans over six months - interest had worn off.
Conservative Inference Rule for Uncertain Reasoning under Incompleteness
In this paper we formulate the problem of inference under incomplete information in very general terms. This includes modelling the process responsible for the incompleteness, which we call the incompleteness process. We allow the process' behaviour to be partly unknown. Then we use Walley's theory of coherent lower previsions, a generalisation of the Bayesian theory to imprecision, to derive the rule to update beliefs under incompleteness that logically follows from our assumptions, and that we call conservative inference rule. This rule has some remarkable properties: it is an abstract rule to update beliefs that can be applied in any situation or domain; it gives us the opportunity to be neither too optimistic nor too pessimistic about the incompleteness process, which is a necessary condition to draw reliable while strong enough conclusions; and it is a coherent rule, in the sense that it cannot lead to inconsistencies. We give examples to show how the new rule can be applied in expert systems, in parametric statistical inference, and in pattern classification, and discuss more generally the view of incompleteness processes defended here as well as some of its consequences.
Efficient Informative Sensing using Multiple Robots
Singh, A., Krause, A., Guestrin, C., Kaiser, W. J.
The need for efficient monitoring of spatio-temporal dynamics in large environmental applications, such as the water quality monitoring in rivers and lakes, motivates the use of robotic sensors in order to achieve sufficient spatial coverage. Typically, these robots have bounded resources, such as limited battery or limited amounts of time to obtain measurements. Thus, careful coordination of their paths is required in order to maximize the amount of information collected, while respecting the resource constraints. In this paper, we present an efficient approach for near-optimally solving the NP-hard optimization problem of planning such informative paths. In particular, we first develop eSIP (efficient Single-robot Informative Path planning), an approximation algorithm for optimizing the path of a single robot. Hereby, we use a Gaussian Process to model the underlying phenomenon, and use the mutual information between the visited locations and remainder of the space to quantify the amount of information collected. We prove that the mutual information collected using paths obtained by using eSIP is close to the information obtained by an optimal solution. We then provide a general technique, sequential allocation, which can be used to extend any single robot planning algorithm, such as eSIP, for the multi-robot problem. This procedure approximately generalizes any guarantees for the single-robot problem to the multi-robot case. We extensively evaluate the effectiveness of our approach on several experiments performed in-field for two important environmental sensing applications, lake and river monitoring, and simulation experiments performed using several real world sensor network data sets.
Planning over Chain Causal Graphs for Variables with Domains of Size 5 Is NP-Hard
Recently, considerable focus has been given to the problem of determining the boundary between tractable and intractable planning problems. In this paper, we study the complexity of planning in the class C_n of planning problems, characterized by unary operators and directed path causal graphs. Although this is one of the simplest forms of causal graphs a planning problem can have, we show that planning is intractable for C_n (unless P = NP), even if the domains of state variables have bounded size. In particular, we show that plan existence for C_n^k is NP-hard for k>=5 by reduction from CNFSAT. Here, k denotes the upper bound on the size of the state variable domains. Our result reduces the complexity gap for the class C_n^k to cases k=3 and k=4 only, since C_n^2 is known to be tractable.
Sentence Compression as Tree Transduction
This paper presents a tree-to-tree transduction method for sentence compression. Our model is based on synchronous tree substitution grammar, a formalism that allows local distortion of the tree topology and can thus naturally capture structural mismatches. We describe an algorithm for decoding in this framework and show how the model can be trained discriminatively within a large margin framework. Experimental results on sentence compression bring significant improvements over a state-of-the-art model.
Inferring Shallow-Transfer Machine Translation Rules from Small Parallel Corpora
Sánchez-Martínez, F., Forcada, M. L.
This paper describes a method for the automatic inference of structural transfer rules to be used in a shallow-transfer machine translation (MT) system from small parallel corpora. The structural transfer rules are based on alignment templates, like those used in statistical MT. Alignment templates are extracted from sentence-aligned parallel corpora and extended with a set of restrictions which are derived from the bilingual dictionary of the MT system and control their application as transfer rules. The experiments conducted using three different language pairs in the free/open-source MT platform Apertium show that translation quality is improved as compared to word-for-word translation (when no transfer rules are used), and that the resulting translation quality is close to that obtained using hand-coded transfer rules. The method we present is entirely unsupervised and benefits from information in the rest of modules of the MT system in which the inferred rules are applied.
Semantic Social Network Analysis
Erétéo, Guillaume, Gandon, Fabien, Corby, Olivier, Buffa, Michel
Since its birth, the web provided many ways of interacting between us [6], revealing huge social network structures [17], a phenomenon amplified by web 2.0 applications [11]. Researchers extracted social networks from emails, mailinglist archives, hyperlink structure of homepages, cooccurrence of names in documents and from the digital traces created by web 2.0 application usages [9]. Facebook, LinkedIn or Myspace provide huge amounts of structured network data. The emergence of the semantic web approaches led researchers to build models of such online interactions using ontologies like FOAF, SIOC or SCOT. This paper starts with a brief state of the art on these enhanced RDF-based representations. We will see that the graphs built using these ontologies have a great potential that is not fully exploited so far. Then, we present a new framework for applying SNA to RDF representations of social data. In particular, the use of graph models underlying RDF and SPARQL extensions enables us to extract efficiently and to parameterize the classic SNA features directly from these representations.