Natural Language
Obtaining Hidden Relations from a Syntactically Annotated Corpus - From Word Relationships to Clause Relationships
Kruza, Oldrich (Charles University in Prague) | Kubon, Vladislav (Charles University in Prague)
The paper concentrates on obtaining hidden relationships among individual clauses of complex sentences from the Prague Dependency Treebank. The treebank contains information only about mutual relationships among individual tokens (words, punctuation marks), not about more complex units (clauses). For experiments with clauses and their parts (segments), it was therefore necessary to develop an automatic method that transforms the original annotation into a scheme describing the syntactic relationships between clauses. The task was complicated by a certain degree of inconsistency in the original annotation with regard to clauses and their structure. The paper describes the algorithm for deriving clause-related information from the existing annotation, together with its evaluation.
The Implementation of Arabic Subject Markers in the LKB System
Jebali, Adel (Université du Québec à Montréal)
Arabic Subject Markers are interface phenomena (specifically between morphology and syntax). In this paper, I describe them briefly, give my linguistic analysis within the framework of Head-Driven Phrase Structure Grammar, and show how I implement them in the LKB system. I show that this system, despite its strength, does not allow for a proper implementation of these units.
Effect of Tuned Parameters on a LSA MCQ Answering Model
Lifchitz, Alain, Jhean-Larose, Sandra, Denhière, Guy
This paper presents the current state of a work in progress whose objective is to better understand the effects of the factors that significantly influence the performance of Latent Semantic Analysis (LSA). A difficult task, answering (French) biology Multiple Choice Questions, is used to test the semantic properties of the truncated singular space and to study the relative influence of the main parameters. Dedicated software has been designed to fine-tune the LSA semantic space for the Multiple Choice Questions task. With optimal parameters, the performance of our simple model is, quite surprisingly, equal or superior to that of 7th and 8th grade students. This indicates that the semantic spaces were quite good despite their low dimensions and the small sizes of the training data sets. In addition, we present an original global entropy weighting of the answers' terms for each question of the Multiple Choice Questions task, which was necessary to achieve the model's success.
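As a rough illustration of what a global entropy weighting does before the LSA truncation step, the sketch below applies the widely used log-entropy scheme to a small term-by-document count matrix. This is the textbook scheme, not the authors' original weighting of answer terms, which the abstract does not specify; the function name `log_entropy_weights` and the toy counts are assumptions made for the example.

```python
import math


def log_entropy_weights(counts):
    """Apply standard log-entropy weighting to a term-by-document count matrix.

    counts: list of rows, one row per term, one column per document.
    Assumption: this is the common textbook scheme, not the paper's own variant.
    """
    n_docs = len(counts[0])
    weighted = []
    for row in counts:
        total = sum(row)
        # Sum of p * log(p) over documents where the term occurs (always <= 0).
        plogp = 0.0
        for c in row:
            if c > 0:
                p = c / total
                plogp += p * math.log(p)
        # Global weight: 1 for a term concentrated in one document,
        # 0 for a term spread evenly over all documents.
        if n_docs > 1:
            global_weight = 1.0 + plogp / math.log(n_docs)
        else:
            global_weight = 1.0
        # Local weight: log(1 + raw count).
        weighted.append([global_weight * math.log(1 + c) for c in row])
    return weighted


# Toy matrix: the first term appears evenly everywhere, the second in one document only.
toy_counts = [
    [1, 1, 1, 1],
    [3, 0, 0, 0],
]
toy_weighted = log_entropy_weights(toy_counts)
```

In this scheme a term that is spread evenly over all documents carries no discriminating information and is weighted down to zero, while a term concentrated in one document keeps its full local weight. The weighted matrix would then be passed to a truncated singular value decomposition.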
Mining Meaning from Wikipedia
Medelyan, Olena, Milne, David, Legg, Catherine, Witten, Ian H.
Wikipedia is a goldmine of information; not just for its many readers, but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This article provides a comprehensive description of this work. It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and organizes the work into four broad categories: applying Wikipedia to natural language processing; using it to facilitate information retrieval and information extraction; and as a resource for ontology building. The article addresses how Wikipedia is being used as is, how it is being improved and adapted, and how it is being combined with other structures to create entirely new resources. We identify the research groups and individuals involved, and how their work has developed in the last few years. We provide a comprehensive list of the open-source software they have produced.
Switcher-random-walks: a cognitive-inspired mechanism for network exploration
Goñi, Joaquín, Martincorena, Iñigo, Corominas-Murtra, Bernat, Arrondo, Gonzalo, Ardanza-Trevijano, Sergio, Villoslada, Pablo
Semantic memory is the subsystem of human memory that stores knowledge of concepts or meanings, as opposed to specific life experiences. The organization of concepts within semantic memory can be understood as a semantic network, where the concepts (nodes) are associated (linked) to others depending on perceptions, similarities, etc. Lexical access is the complementary part of this system and allows the retrieval of such organized knowledge. While conceptual information is stored under a certain underlying organization (which gives rise to a specific topology), accurate access to any of the information units, e.g. the concepts, is crucial for efficiently retrieving semantic information in real time. An example of an information retrieval process occurs in verbal fluency tasks, and it is known to involve two different mechanisms: "clustering", or generating words within a subcategory, and, when a subcategory is exhausted, "switching" to a new subcategory. We extended this approach to random walking on a network (clustering) in combination with jumping (switching) to any node with a certain probability, and derived its analytical expression based on Markov chains. Results show that this dual mechanism helps optimize the exploration of different network models in terms of the mean first passage time. Additionally, this cognitively inspired dual mechanism opens a new framework to better understand and evaluate exploration, propagation and transport phenomena in other complex systems where switching-like phenomena are feasible.
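The dual mechanism described in this abstract can be sketched as a Markov chain whose mean first passage times are obtained by solving a small linear system. The sketch below is an illustration under stated assumptions, not the authors' derivation: with probability `q` the walker jumps to a node chosen uniformly from all nodes (including the current one), otherwise it moves to a uniformly chosen neighbour. The function names, the jump rule and the example graph are hypothetical.

```python
from fractions import Fraction


def transition_matrix(adj, q):
    """Row-stochastic matrix of a random walk with switching probability q.

    adj: adjacency lists of an undirected graph, nodes numbered from 0.
    Assumption: a switch lands on any node with equal probability 1/n.
    """
    n = len(adj)
    q = Fraction(q)
    matrix = [[q / n for _ in range(n)] for _ in range(n)]
    for i, neighbours in enumerate(adj):
        for j in neighbours:
            matrix[i][j] += (1 - q) / len(neighbours)
    return matrix


def solve(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination over Fractions."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def mean_first_passage_times(adj, q, target):
    """Expected number of steps to first reach target from every start node.

    Solves m_i = 1 + sum over j != target of P_ij * m_j, with m_target = 0.
    """
    p = transition_matrix(adj, q)
    n = len(adj)
    others = [i for i in range(n) if i != target]
    a = [[(1 if i == j else 0) - p[i][j] for j in others] for i in others]
    b = [Fraction(1)] * len(others)
    solution = solve(a, b)
    times = [Fraction(0)] * n
    for i, value in zip(others, solution):
        times[i] = value
    return times


# Hypothetical example: a path graph 0 - 1 - 2 - 3 - 4 - 5 with target node 5.
path = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
pure_walk = mean_first_passage_times(path, 0, 5)
with_switching = mean_first_passage_times(path, Fraction(1, 10), 5)
```

Setting `q` to 0 recovers the plain random walk, and setting it to 1 gives pure uniform jumping, so comparing the returned times across values of `q` shows how switching changes exploration on a given graph. Exact fractions are used only to keep the small example free of rounding error; a numerical linear solver would be the usual choice for larger networks.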
AAAI-08 and IAAI-08 Conferences Provide Focal Point for AI
Hedberg, Sara Reese (Emergent, Inc.)
This summer's AAAI Conference on Artificial Intelligence (AAAI-08) and its sister Conference on Innovative Applications of AI (IAAI-08) continued their long tradition of being a focal point of AI. This year's conferences were held in Chicago at the Hyatt Regency McCormick Place, July 13-17, 2008. The multidimensional conference offerings included nine invited talks, 251 technical papers, 22 innovative applications of AI papers, three competitions (poker, AI video, and general game playing), three special tracks (AI and the web, integrated intelligence, and physically grounded AI), 15 tutorials, 15 workshops, and 11 intelligent system demonstrations, as well as a number of awards, a doctoral consortium, student poster session and programs, and a vendor exhibit. This translated into a plethora of choices for the 921 conference attendees. An additional 175 people exclusively attended the tutorials, workshops, or exhibit.
Preference Handling - An Introductory Tutorial
Brafman, Ronen (Ben-Gurion University) | Domshlak, Carmel
Early work in AI focused on the notion of a goal--an explicit target that must be achieved--and this paradigm is still dominant in AI problem solving. But as application domains become more complex and realistic, it is apparent that the dichotomic notion of a goal, while adequate for certain puzzles, is too crude in general. The problem is that in many contemporary application domains, for example, information retrieval from large databases or the web, or planning in complex domains, the user has little knowledge about the set of possible solutions or feasible items, and what she or he typically seeks is the best that's out there. But since the user does not know what is the best achievable plan or the best available document or product, he or she typically cannot characterize it or its properties specifically. As a result, the user will end up either asking for an unachievable goal, getting no solution in response, or asking for too little, obtaining a solution that can be substantially improved. Of course, the user can gradually adjust the stated goals. This, however, is not a very appealing mode of interaction because the space of alternative solutions in such applications can be combinatorially huge, or even infinite. Moreover, such incremental goal refinement is simply infeasible when the goal must be supplied offline, as in the case of autonomous agents (whether on the web or on Mars).
Report on the First Conference on Artificial General Intelligence (AGI-08)
Garis, Hugo Roland de (Xiamen University) | Goertzel, Ben (Novamente LLC)
On a technical level, the work involved using a logic-based AI system to control a humanoid virtual agent in the Second Life virtual world, which interacted [...] chaired by Sibley Verbeck (CEO of Electric Sheep Company); and the session on neural nets was chaired by Randal Koene (a neuroscientist from Boston University). [...] algorithmics hugely, for instance we can now solve Boolean satisfaction problems with hundreds of thousands of variables. We can use automated [...]
Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems
With the development of Natural Language Processing (NLP), more and more systems aim to adopt NLP in their user interface module to process user input and communicate with users in a natural way. However, this raises a speed problem: if the NLP module cannot process sentences within a tolerable delay, users will not use the system. As a result, systems with strict processing-time requirements, such as dialogue systems, web search systems, automatic customer service systems, and especially real-time systems, have to abandon the NLP module in order to get a faster system response. This paper aims to solve the speed problem. First, the construction of a syntactic parser based on corpus-based machine learning and a statistical model is introduced, and then a speed analysis is performed on the parser and its algorithms. Based on the analysis, two acceleration methods, Compressed POS Set and Syntactic Patterns Pruning, are proposed, which can effectively improve the time efficiency of parsing in the NLP module. To evaluate different parameters in the acceleration algorithms, two new factors, PT and RT, are introduced and explained in detail. Experiments are also carried out to test and validate these methods, which will surely contribute to the application of NLP.