Expert Systems
Finding New Information Via Robust Entity Detection
Iacobelli, Francisco (Northwestern University) | Nichols, Nathan (Northwestern University) | Birnbaum, Larry (Northwestern University) | Hammond, Kristian (Northwestern University)
Journalists and editors work under pressure to collect relevant details and background information about specific events. They spend a significant amount of time sifting through documents to find new information such as facts, opinions, or stakeholders (i.e., people, places, and organizations that have a stake in the news). Spotting them is a tedious and cognitively intense process. One task essential to this process is finding and keeping track of stakeholders, which is taxing both cognitively and in terms of memory. Tell Me More offers an automatic aid for this task. Tell Me More is a system that, given a seed story, mines the web for similar stories reported by different sources and selects only those stories that offer new information with respect to the original seed story. As for a journalist, detecting named entities is central to the system's success. In this paper we briefly describe Tell Me More and focus in particular on its entity detection component. We describe an approach that combines off-the-shelf named entity recognizers (NERs) with WPED, an in-house, publicly available NER that uses Wikipedia as its knowledge base. We show a significant increase in precision scores over traditional NERs. Lastly, we present an overall evaluation of Tell Me More using this approach.
Graph-Based Reasoning and Reinforcement Learning for Improving Q/A Performance in Large Knowledge-Based Systems
Sharma, Abhishek (Northwestern University) | Forbus, Kenneth D. (Northwestern University)
Learning to plausibly reason with minimal user intervention could significantly improve knowledge acquisition. We describe how to integrate graph-based heuristic generalization, higher-order knowledge, and reinforcement learning to learn to produce plausible inferences with only small amounts of user training. Experiments on ResearchCyc KB contents show significant improvement in Q/A performance with high accuracy.
A Japanese Natural Language Toolset Implementation for ConceptNet
Roberts, Tyson Michael (Hokkaido University) | Rzepka, Rafal (Hokkaido University) | Araki, Kenji (Hokkaido University)
In recent years, ConceptNet has gained prominence in the Natural Language Processing (NLP) community as a textual commonsense knowledge base (CSKB) for its use of k-lines (Liu and Singh, 2004a), which make it suitable for making practical inferences on corpora (Liu and Singh, 2004b). However, until now, ConceptNet has lacked support for many non-English languages. To alleviate this problem, we have implemented a software toolset for the Japanese language that allows Japanese to be used with ConceptNet's concept inference system. This paper discusses the implementation of this toolset and a possible path for the development of toolsets in other languages with similar features.
A Commonsense Knowledge Base for Generating Children’s Stories
Ong, Ethel ChuaJoy (De La Salle University - Manila)
This paper presents our work in developing a commonsense knowledge source based on semantic concepts about objects, activities and their relationships in a child’s daily life. This commonsense ontology is then used by our automatic story generator to output children's stories of the fable form from a given input picture. The generated story is a narration of the events of a basic plot that flows from negative to positive (rule violation to value acquisition), using themes that are familiar to children. The paper ends with descriptions of further investigations that are underway to extend the system, including using a formal upper ontology to represent storytelling knowledge, and the generation of stories from a given set of sequential scenes.
Goal-Oriented Knowledge Collection
Kuo, Yen-Ling (National Taiwan University) | Hsu, Jane Yung-jen (National Taiwan University)
Games With A Purpose (GWAP), e.g. Verbosity and the Virtual Pet game, have been shown to be efficient in collecting large amounts of knowledge from online users. However, their effectiveness in knowledge base (KB) construction has not been explored in previous research. This paper examines the knowledge collected in the Virtual Pet game and presents an approach to collecting more knowledge, driven by the existing relations in the KB. In this paper, goal-oriented knowledge collection successfully draws 10,572 answers for the "food" domain. The answers are verified by online voting, which shows that 92.07% of them are good sentences and 95.89% of them are new sentences. This result is a significant improvement over the original Virtual Pet game, with 80.58% good sentences and 67.56% weekly new information.
CrossBridge: Finding Analogies Using Dimensionality Reduction
Krishnamurthy, Jayant (Carnegie Mellon University) | Lieberman, Henry (MIT Media Laboratory)
We present CrossBridge, a practical algorithm for retrieving analogies in large, sparse semantic networks. Other algorithms adopt a generate-and-test approach, retrieving candidate analogies by superficial similarity of concepts, then testing them for the particular relations involved in the analogy. CrossBridge adopts a global approach. It organizes the entire knowledge space at once, as a matrix of small concept-and-relation subgraph patterns versus actual occurrences of subgraphs from the knowledge base. It uses the familiar mathematics of dimensionality reduction to reorganize this space along dimensions representing approximate semantic similarity of these subgraphs. Analogies can then be retrieved by simple nearest-neighbor comparison. CrossBridge also takes into account not only knowledge directly related to the source and target domains, but also a large background Commonsense knowledge base. Commonsense influences the mapping between domains, preserving important relations while ignoring others. This property allows CrossBridge to find more intuitive and extensible analogies. We compare our approach with an implementation of structure mapping and show that our algorithm consistently finds analogies in cases where structure mapping fails. We also present some discovered analogies.
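The final retrieval step described above, nearest-neighbor comparison in the reduced space, can be sketched as follows. This is a minimal illustration and not the CrossBridge implementation: the two-dimensional coordinates, the concept names, and the helper functions `cosine` and `nearest_neighbors` are all hypothetical, and cosine similarity is assumed as the similarity measure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def nearest_neighbors(query, vectors, k=1):
    """Return the k names whose vectors are most similar to the query vector."""
    ranked = sorted(vectors, key=lambda name: cosine(query, vectors[name]), reverse=True)
    return ranked[:k]

# Hypothetical coordinates after dimensionality reduction (made up for illustration).
reduced = {
    "solar_system": [0.9, 0.1],
    "atom": [0.8, 0.2],
    "recipe": [0.1, 0.9],
}

query = reduced["solar_system"]
candidates = {name: vec for name, vec in reduced.items() if name != "solar_system"}
best = nearest_neighbors(query, candidates, k=1)
print(best)
```

Because the comparison is a plain similarity ranking, the cost of retrieval is separate from the cost of building the reduced space, which is done once for the whole knowledge base.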
Acquiring Common Sense Knowledge from Smart Environments
Barraquand, Rémi (INRIA Grenoble Rhône-Alpes Research Center) | Crowley, James (INRIA Grenoble Rhône-Alpes Research Center)
We present an approach for acquiring common sense knowledge from social interaction. We argue that social common sense should be learned from daily interactions using implicit user feedback, and that it requires a shared understanding of social situations. We present a service-oriented architecture, inspired by cognitive science, that fosters mutual understanding between a smart environment and its inhabitants. The method makes use of ConceptNet to work with common sense knowledge. We are able to successfully use and learn common sense knowledge.
Cross-Domain Scruffy Inference
Arnold, Kenneth Charles (Massachusetts Institute of Technology) | Lieberman, Henry (Massachusetts Institute of Technology)
Reasoning about Commonsense knowledge poses many problems that traditional logical inference doesn't handle well. Among these is cross-domain inference: how to draw on multiple independently produced knowledge bases. Since knowledge bases may not have the same vocabulary, level of detail, or accuracy, that inference should be "scruffy." The AnalogySpace technique showed that a factored inference approach is useful for approximate reasoning over noisy knowledge bases like ConceptNet. A straightforward extension of factored inference to multiple datasets, called Blending, has seen productive use for commonsense reasoning. We show that Blending is a kind of Collective Matrix Factorization (CMF): the factorization spreads out the prediction loss across the datasets. We then show that blending additional data causes the singular vectors to rotate between the two domains, which enables cross-domain inference. We show, in a simplified example, that the maximum interaction occurs when the magnitudes (as defined by the largest singular values) of the two matrices are equal, confirming previous empirical conclusions. Finally, we describe and mathematically justify, in terms of CMF, Bridge Blending, which facilitates inference between datasets by specifically adding knowledge that "bridges" between the two.
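The observation about equal largest singular values suggests rescaling one dataset before combining it with another. The sketch below illustrates only that rescaling step, under stated assumptions: it is not the authors' Blending code, the matrices are made-up toy values, and `largest_singular_value` and `scale_to_match` are hypothetical helpers. The largest singular value is estimated by power iteration on M^T M, which fails in the edge case where the starting vector is orthogonal to the top singular vector.

```python
import math

def largest_singular_value(matrix, iterations=200):
    """Estimate the largest singular value by power iteration on M^T M.

    Limitation: returns 0.0 if the all-ones start vector is orthogonal
    to the top right singular vector.
    """
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols  # unit-length start vector
    sigma = 0.0
    for _ in range(iterations):
        # u = M v
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        # w = M^T u = (M^T M) v
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
        # For unit v, |M^T M v| approaches sigma^2 as v converges.
        sigma = math.sqrt(norm)
    return sigma

def scale_to_match(a, b):
    """Rescale b so that its largest singular value equals that of a."""
    sigma_a = largest_singular_value(a)
    sigma_b = largest_singular_value(b)
    if sigma_b == 0.0:
        raise ValueError("b has no nonzero singular value to rescale")
    factor = sigma_a / sigma_b
    return [[factor * x for x in row] for row in b]

# Toy matrices (made up for illustration).
a = [[3.0, 0.0], [0.0, 1.0]]
b = [[0.5, 0.0], [0.0, 0.25]]
b_scaled = scale_to_match(a, b)
print(largest_singular_value(a), largest_singular_value(b_scaled))
```

After rescaling, neither matrix dominates the leading singular vectors of the combined data purely because of its magnitude.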
Significance of Classification Techniques in Prediction of Learning Disabilities
David, Julie M. | Balakrishnan, Kannan
The aim of this study is to show the importance of two classification techniques, viz. decision trees and clustering, in predicting learning disabilities (LD) in school-age children. LDs affect about 10 percent of all children enrolled in schools. The problems of children with specific learning disabilities have been a cause of concern to parents and teachers for some time. Decision trees and clustering are powerful and popular tools used for classification and prediction in data mining. Different rules extracted from the decision tree are used for prediction of learning disabilities. Clustering is the assignment of a set of observations into subsets, called clusters, which are useful in finding the different signs and symptoms (attributes) present in the LD-affected child. In this paper, the J48 algorithm is used for constructing the decision tree and the K-means algorithm is used for creating the clusters. By applying these classification techniques, LD in any child can be identified.
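The clustering step can be illustrated with a minimal K-means sketch. It is a generic textbook version and not the study's setup: the two-dimensional points are made-up values rather than LD data, the first k points are assumed as initial centroids to keep the run deterministic, and `kmeans` and `sq_dist` are hypothetical helper names.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=100):
    """Basic K-means: returns (centroids, assignments).

    Assumption: the first k points serve as the initial centroids.
    """
    centroids = [list(p) for p in points[:k]]
    assignments = None
    for _ in range(iterations):
        # Assignment step: attach each point to its closest centroid.
        new_assignments = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the centroids are stable
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignments

# Made-up 2-D observations forming two visible groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.0, 2.0), (9.0, 8.0), (2.0, 1.0), (8.0, 9.0)]
centroids, assignments = kmeans(points, k=2)
print(assignments)
print(centroids)
```

Each resulting cluster groups observations with similar attribute values, which is the sense in which clusters help surface co-occurring attributes.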
Learning under Concept Drift: an Overview
Concept drift refers to a learning problem that is non-stationary over time. In real-life problems, the training data and the application data often mismatch. In this report we present the context of the concept drift problem. We focus on the issues relevant to adaptive training set formation. We present the framework and terminology, and formulate a global picture of the design of concept drift learners. We start with formalizing the framework for the concept drifting data in Section 1. In Section 2 we discuss the adaptivity mechanisms of the concept drift learners. In Section 3 we overview the principal mechanisms of concept drift learners. In this chapter we give a general picture of the available algorithms and categorize them based on their properties. Section 5 discusses the related research fields and Section 5 groups and presents major concept drift applications. This report is intended to give a bird's-eye view of the concept drift research field, provide a context for the research, and position it within the broad spectrum of research fields and applications.
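One simple form of adaptive training set formation is a sliding window that keeps only the most recent labeled examples. The sketch below is an illustration under stated assumptions and is not taken from the report: the learner is a trivial majority-class predictor, the label stream is synthetic with one abrupt drift, and the names `SlidingWindowMajority` and `count_errors` are hypothetical.

```python
from collections import Counter, deque

class SlidingWindowMajority:
    """Predicts the majority label among the most recent examples."""

    def __init__(self, window_size=None):
        # window_size=None keeps the full history (no forgetting).
        self.window = deque(maxlen=window_size)

    def predict(self, default=0):
        if not self.window:
            return default
        return Counter(self.window).most_common(1)[0][0]

    def update(self, label):
        self.window.append(label)

def count_errors(learner, stream):
    """Test-then-train evaluation: predict each label, then learn from it."""
    errors = 0
    for label in stream:
        if learner.predict() != label:
            errors += 1
        learner.update(label)
    return errors

# Synthetic stream with an abrupt drift: the label flips from 0 to 1 halfway.
stream = [0] * 20 + [1] * 20

windowed_errors = count_errors(SlidingWindowMajority(window_size=5), stream)
full_history_errors = count_errors(SlidingWindowMajority(window_size=None), stream)
print(windowed_errors, full_history_errors)
```

On this stream the windowed learner recovers after a few examples, while the full-history learner keeps predicting the old concept. The window size trades adaptation speed against stability on stationary data.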