Co-Occurrence-Based Error Correction Approach to Word Segmentation
Chaowicharat, Ekawat (Mahidol University) | Naruedomkul, Kanlaya (Mahidol University)
To overcome the problems in Thai word segmentation, a number of word segmentation approaches have been proposed over the years. We propose a novel Thai word segmentation approach called Co-occurrence-Based Error Correction (CBEC). CBEC generates all possible segmentation candidates using the classical maximal matching algorithm and then selects the most accurate segmentation based on co-occurrence and an error correction algorithm. CBEC was trained and evaluated on the BEST 2009 corpus.
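The candidate-generation and selection steps described in this abstract can be illustrated with a small sketch. It is not the authors' CBEC implementation: the abstract does not give the scoring details, so the function names, the toy English lexicon, the pair counts, and the ranking rule (fewest words first, then highest adjacent-pair co-occurrence) are all assumptions made for illustration, and the error correction step is omitted.

```python
from functools import lru_cache

def segmentations(text, lexicon):
    """Return every way to split `text` into words found in `lexicon`."""
    @lru_cache(maxsize=None)
    def split_from(start):
        if start == len(text):
            return [[]]  # one valid way to segment the empty remainder
        results = []
        for end in range(start + 1, len(text) + 1):
            word = text[start:end]
            if word in lexicon:
                for rest in split_from(end):
                    results.append([word] + rest)
        return results
    return split_from(0)

def cooccurrence_score(words, pair_counts):
    """Sum the (hypothetical) corpus counts of each adjacent word pair."""
    return sum(pair_counts.get(pair, 0) for pair in zip(words, words[1:]))

def best_segmentation(text, lexicon, pair_counts):
    """Prefer fewer words; break ties by higher co-occurrence (assumed rule)."""
    candidates = segmentations(text, lexicon)
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda ws: (len(ws), -cooccurrence_score(ws, pair_counts)),
    )

# Toy data, invented for this example only.
LEXICON = {"no", "now", "here", "where"}
PAIR_COUNTS = {("now", "here"): 5, ("no", "where"): 1}

print(best_segmentation("nowhere", LEXICON, PAIR_COUNTS))  # ['now', 'here']
```

Both candidates here contain two words, so the word-count criterion alone cannot decide between them; the co-occurrence counts break the tie.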
Commonsense Knowledge Extraction Using Concepts Properties
Blanco, Eduardo (The University of Texas at Dallas) | Cankaya, Hakki (Izmir University of Economics) | Moldovan, Dan (The University of Texas at Dallas)
This paper presents a semantically grounded method for extracting commonsense knowledge. First, commonsense rules are identified, e.g., one cannot see imaginary objects. Second, those rules are combined with a basic semantic representation in order to infer commonsense knowledge facts, e.g. one cannot see a flying carpet. Further combinations of semantic relations with inferred commonsense facts are proposed and analyzed. Results show that this novel method is able to extract thousands of commonsense facts with little human interaction and high accuracy.
Shared Experiences, Shared Representations, and the Implications for Applied Natural Language Processing
Stent, Amanda J. (AT&T Labs)
When people interact with language-producing agents (other people or computers), they assume that the shared experience leads to shared representations — of the world, the interaction, and the language used in the interaction. This phenomenon occurs even during interaction with systems that give no evidence of building shared representations. The absence of shared representations leads to errors and delays; alternatively, even simple shared representations can lead to reduced error rates and more efficient interaction. In this talk, we present three case studies: a mobile local business search application that builds no interaction representations; a telephone-based recommendation and review system that builds limited representations of the shared language in the interaction; and computer models of coreference that use shared representations to permit both coreference resolution and referring expression generation. We lay out a range of possibilities for shared representations, show that they can be built incrementally as an interaction progresses, and point to possibilities for future work in probabilistic shared representations for interactive systems.
Reasoning with Annotations of Texts
Ma, Yue (Université) | Lévy, François (Paris13-CNRS) | Ghimire, Sudeep (Université)
Linguistic and semantic annotations are important features for text-based applications. However, achieving and maintaining a good quality of a set of annotations is known to be a complex task. Many ad hoc approaches have been developed to produce various types of annotations, while comparing those annotations to improve their quality is still rare. In this paper, we propose a framework in which both linguistic and domain information can cooperate to reason with annotations. The underlying knowledge representation issues are carefully analyzed and solved by studying a higher order logic, which accounts for the cooperation of different sorts of knowledge. Our prototype implements this logic based on a reduction to classical description logics by preserving the semantics, allowing us to benefit from cutting-edge Semantic Web reasoners. An application scenario shows interesting merits of this framework on reasoning with annotations of texts.
Mapping Syntactic to Semantic Generalizations of Linguistic Parse Trees
Galitsky, Boris (University of Girona) | Rose, Josep Lluis de la (University of Girona) | Dobrocsi, Gabor (University of Miskolc)
We define sentence generalization and generalization diagrams as a special case of least general generalization (LGG) as applied to linguistic parse trees. A similarity measure between linguistic parse trees is developed as an LGG operation on the lists of sub-trees of these trees. The diagrams introduced are a representation of the mapping between the syntactic generalization level and the semantic generalization level. Generalization diagrams are intended as a framework to compute semantic similarity between texts relying on linguistic parse tree data. Such a structured approach significantly improves text relevance assessment in a horizontal domain, where ontologies are not available.
EcoLexicon and FunGramKB: Applying COREL to Domain-Specific Knowledge
Araúz, Pilar León (University of Granada) | Reimerink, Arianne (University of Granada)
EcoLexicon is a multilingual terminological knowledge base (TKB) on the environment. It is currently being converted into a domain-specific ontology; however, ontological properties are modelled according to surface semantics. For this reason, we are integrating our TKB in the form of a “satellite ontology” into FunGramKB, a multipurpose knowledge base specifically designed for natural language understanding. We explain how the dynamism of environmental concepts can benefit from a formal description in meaning postulates and their inclusion in FunGramKB Cognicon scripts. This would lead to the automatic generation of flexible conceptual networks and definitional templates across different contexts.
Cognitive Load Theory: Implications for Affective Computing
Kalyuga, Slava (University of New South Wales)
It has been also demonstrated that emotional states (e.g., negative mood or anxiety) directly influence cognitive task performance and the operation of working memory, while less evidence exists about the effect of the emotional content of the processed information (e.g., Kensinger & Corkin, 2003). In its basic underpinning assumptions, cognitive load theory relies on the analogy between the information processing aspects of evolution by natural selection and human cognition (Sweller & Sweller, 2006). It considers both biological evolution and human cognition as
Consensus Clustering + Meta Clustering = Multiple Consensus Clustering
Zhang, Yi (Florida International University) | Li, Tao (Florida International University)
Consensus clustering and meta clustering are two important extensions of the classical clustering problem. Given a set of input clusterings of a given dataset, consensus clustering aims to find a single final clustering which is a better fit in some sense than the existing clusterings, and meta clustering aims to group similar input clusterings together so that users only need to examine a small number of different clusterings. In this paper, we present a new approach, MCC (multiple consensus clustering), to explore multiple clustering views of a given dataset from the input clusterings by combining consensus clustering and meta clustering. In particular, given a set of input clusterings of a particular dataset, MCC employs meta clustering to cluster the input clusterings and then uses consensus clustering to generate a consensus for each cluster of the input clusterings. Extensive experimental results on 11 real-world datasets demonstrate the effectiveness of our proposed method.
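The two-stage pipeline in this abstract (group the input clusterings, then build one consensus per group) can be sketched as follows. This is an illustration under stated assumptions, not the authors' MCC algorithm: the abstract names neither a similarity measure nor a consensus function, so the Rand index, the 0.8 threshold, the majority-vote consensus, and every function name below are hypothetical choices.

```python
from itertools import combinations

def rand_index(a, b):
    """Fraction of item pairs on which two clusterings agree."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def connected_components(n, linked):
    """Label 0..n-1 by component, where linked(i, j) says i and j are joined."""
    labels = [-1] * n
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        labels[seed] = current
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and linked(i, j):
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

def meta_cluster(clusterings, min_similarity=0.8):
    """Meta clustering (assumed rule): link clusterings with a high Rand index."""
    return connected_components(
        len(clusterings),
        lambda i, j: rand_index(clusterings[i], clusterings[j]) >= min_similarity,
    )

def consensus(clusterings):
    """Consensus (assumed rule): join items co-clustered in a strict majority."""
    def together(i, j):
        votes = sum(c[i] == c[j] for c in clusterings)
        return votes * 2 > len(clusterings)
    return connected_components(len(clusterings[0]), together)

def multiple_consensus(clusterings, min_similarity=0.8):
    """Return one consensus clustering per group of similar input clusterings."""
    groups = meta_cluster(clusterings, min_similarity)
    return [
        consensus([c for c, g in zip(clusterings, groups) if g == group])
        for group in sorted(set(groups))
    ]

# Four toy clusterings of four items: two distinct views, each given twice.
views = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1], [0, 1, 0, 1]]
print(multiple_consensus(views))  # [[0, 0, 1, 1], [0, 1, 0, 1]]
```

The first two inputs describe the same partition under different label names, so they fall into one group; the result keeps both views of the data instead of forcing them into a single consensus.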
Prime Normal Forms in Belief Merging
Marchi, Jerusa (Universidade Federal de Santa Catarina) | Perrussel, Laurent (Institut de Recherche en Informatique de Toulouse)
The aim of Belief Merging is to aggregate possibly conflicting pieces of information issued from different sources. The quality of the resulting set is usually considered in terms of a closeness criterion between the resulting belief set and the initial belief sets. The notion of distance between belief sets is thus a crucial issue when we face the merging problem. The aim of this paper is twofold: introducing a syntactical way to calculate distances and proposing the use of a distance based on prime implicants and prime implicates that considers the importance of each propositional symbol in the belief set.