 Education


Effect of Tuned Parameters on a LSA MCQ Answering Model

arXiv.org Artificial Intelligence

This paper presents the current state of a work in progress whose objective is to better understand the effects of the factors that significantly influence the performance of Latent Semantic Analysis (LSA). A difficult task, answering (French) biology multiple-choice questions (MCQs), is used to test the semantic properties of the truncated singular space and to study the relative influence of the main parameters. Dedicated software has been designed to fine-tune the LSA semantic space for the MCQ task. With optimal parameters, the performance of our simple model is, quite surprisingly, equal or superior to that of 7th- and 8th-grade students. This indicates that the semantic spaces were quite good despite their low dimensionality and the small size of the training data sets. In addition, we present an original entropy-based global weighting of the answer terms of each question, which was necessary for the model's success.
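As a rough illustration of the pipeline the abstract describes, the sketch below builds a small LSA space and picks the answer whose vector is closest to the question. Everything in it is an assumption made for illustration: the toy English corpus, the function names, the standard log-entropy weighting, and the choice to represent a text as the sum of its term vectors. In particular, it does not reproduce the paper's own per-question entropy weighting of answer terms, whose details the abstract does not give.

```python
import numpy as np

def build_space(docs, k):
    """Build a k-dimensional LSA term space from whitespace-tokenized docs."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            counts[index[w], j] += 1.0
    # Standard log-entropy weighting (an assumption, not the paper's scheme):
    # local weight log(1 + tf), global weight 1 + sum(p log p) / log(n_docs).
    p = counts / counts.sum(axis=1, keepdims=True)
    safe_p = np.where(p > 0, p, 1.0)  # log(1) = 0, so zero cells contribute 0
    global_w = 1.0 + (p * np.log(safe_p)).sum(axis=1) / np.log(len(docs))
    weighted = np.log1p(counts) * global_w[:, None]
    # Truncated SVD: keep the k largest singular directions.
    u, s, _ = np.linalg.svd(weighted, full_matrices=False)
    return index, u[:, :k] * s[:k]

def embed(text, index, term_vectors):
    """Represent a text as the sum of the vectors of its known terms."""
    rows = [index[w] for w in text.split() if w in index]
    if not rows:
        return np.zeros(term_vectors.shape[1])
    return term_vectors[rows].sum(axis=0)

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def answer_mcq(question, answers, index, term_vectors):
    """Return the index of the answer closest to the question, plus all scores."""
    q = embed(question, index, term_vectors)
    scores = [cosine(q, embed(a, index, term_vectors)) for a in answers]
    return int(np.argmax(scores)), scores

# Hypothetical toy corpus standing in for the biology training texts.
TOY_DOCS = [
    "heart pumps blood through arteries",
    "blood carries oxygen heart",
    "leaves capture light photosynthesis",
    "photosynthesis light plants leaves",
]

index, term_vectors = build_space(TOY_DOCS, k=2)
best, scores = answer_mcq("heart blood", ["arteries oxygen", "plants light"],
                          index, term_vectors)
print(best, scores)
```

The number of retained dimensions `k` and the weighting scheme are the kind of parameters the paper tunes; here they are fixed by hand only to keep the example small.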


Mining Meaning from Wikipedia

arXiv.org Artificial Intelligence

Wikipedia is a goldmine of information, not just for its many readers, but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This article provides a comprehensive description of this work. It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and organizes the work into four broad categories: applying Wikipedia to natural language processing, using it to facilitate information retrieval, using it to facilitate information extraction, and using it as a resource for ontology building. The article addresses how Wikipedia is being used as is, how it is being improved and adapted, and how it is being combined with other structures to create entirely new resources. We identify the research groups and individuals involved, and how their work has developed in the last few years. We provide a comprehensive list of the open-source software they have produced.


Sentence Compression as Tree Transduction

Journal of Artificial Intelligence Research

This paper presents a tree-to-tree transduction method for sentence compression. Our model is based on synchronous tree substitution grammar, a formalism that allows local distortion of the tree topology and can thus naturally capture structural mismatches. We describe an algorithm for decoding in this framework and show how the model can be trained discriminatively within a large margin framework. Experimental results on sentence compression bring significant improvements over a state-of-the-art model.


Induction of High-level Behaviors from Problem-solving Traces using Machine Learning Tools

arXiv.org Machine Learning

Many learning environments are able to store very detailed traces of students' activities, thus producing huge sets of low-level data. However, identifying high-level behaviors from these data is not straightforward, especially if the concepts of the domain knowledge are not explicitly encoded together with the corresponding traces. In this paper we present a general approach that aims at discovering patterns of student behaviors. Its principles are applicable whenever the information carried by the traces can be split into finite sequences of {initial state, final state} pairs, where the final states are the result of basic student transformations performed on the corresponding initial states. Within this context, each final state is the initial state of the subsequent {initial state, final state} pair (unless it is at the end of the sequence).
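The chaining of {initial state, final state} pairs can be made concrete with a short sketch. The function name and the algebra-style states below are hypothetical; the abstract does not say how states are encoded.

```python
def trace_to_pairs(states):
    """Split a trace of successive states into (initial, final) pairs.

    Each final state is reused as the initial state of the next pair,
    so a trace of n states yields n - 1 pairs.
    """
    return list(zip(states, states[1:]))

# Hypothetical trace: the successive states of a student solving an equation.
trace = ["2x + 3 = 7", "2x = 4", "x = 2"]
pairs = trace_to_pairs(trace)
print(pairs)
```

Each pair isolates one basic student transformation, which is the unit a pattern-discovery method would then work on.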


Wikipedia-based Semantic Interpretation for Natural Language Processing

Journal of Artificial Intelligence Research

Adequate representation of natural language semantics requires access to vast amounts of common sense and domain-specific world knowledge. Prior work in the field was based on purely statistical techniques that did not make use of background knowledge, on limited lexicographic knowledge bases such as WordNet, or on huge manual efforts such as the CYC project. Here we propose a novel method, called Explicit Semantic Analysis (ESA), for fine-grained semantic interpretation of unrestricted natural language texts. Our method represents meaning in a high-dimensional space of concepts derived from Wikipedia, the largest encyclopedia in existence. We explicitly represent the meaning of any text in terms of Wikipedia-based concepts. We evaluate the effectiveness of our method on text categorization and on computing the degree of semantic relatedness between fragments of natural language text. Using ESA results in significant improvements over the previous state of the art in both tasks. Importantly, due to the use of natural concepts, the ESA model is easy to explain to human users.
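The idea of representing a text as a weighted vector of concepts can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three tiny "articles" stand in for Wikipedia concepts, the TF-IDF formula is one common variant, and all function names are hypothetical.

```python
import math
from collections import Counter

def build_inverted_index(concepts):
    """Map each word to {concept name: TF-IDF weight of the word in that concept}."""
    n = len(concepts)
    tf = {name: Counter(text.lower().split()) for name, text in concepts.items()}
    df = Counter(w for counts in tf.values() for w in counts)
    index = {}
    for name, counts in tf.items():
        for w, c in counts.items():
            index.setdefault(w, {})[name] = (1 + math.log(c)) * math.log(n / df[w])
    return index

def interpret(text, index):
    """Represent a text as a weighted vector over concepts."""
    vec = Counter()
    for w in text.lower().split():
        for concept, weight in index.get(w, {}).items():
            vec[concept] += weight
    return vec

def relatedness(a, b, index):
    """Cosine similarity between the concept vectors of two texts."""
    va, vb = interpret(a, index), interpret(b, index)
    dot = sum(va[c] * vb[c] for c in va if c in vb)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical stand-ins for Wikipedia articles (one article per concept).
CONCEPTS = {
    "Heart": "the heart pumps blood through arteries and veins",
    "Photosynthesis": "plants use light and leaves for photosynthesis",
    "Bank": "a bank lends money and holds deposits",
}

index = build_inverted_index(CONCEPTS)
same_topic = relatedness("blood arteries", "heart veins", index)
different_topic = relatedness("blood arteries", "money deposits", index)
print(same_topic, different_topic)
```

Because every dimension of the vector is a named concept, the representation can be read directly, which is the explainability property the abstract highlights.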


A Stochastic View of Optimal Regret through Minimax Duality

arXiv.org Machine Learning

We study the regret of optimal strategies for online convex optimization games. Using von Neumann's minimax theorem, we show that the optimal regret in this adversarial setting is closely related to the behavior of the empirical minimization algorithm in a stochastic process setting: it is equal to the maximum, over joint distributions of the adversary's action sequence, of the difference between a sum of minimal expected losses and the minimal empirical loss. We show that the optimal regret has a natural geometric interpretation, since it can be viewed as the gap in Jensen's inequality for a concave functional--the minimizer over the player's actions of expected loss--defined on a set of probability distributions. We use this expression to obtain upper and lower bounds on the regret of an optimal strategy for a variety of online learning problems. Our method provides upper bounds without the need to construct a learning algorithm; the lower bounds provide explicit optimal strategies for the adversary.
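The characterization stated in words above can be written out as a formula. The notation is an assumption introduced here for illustration: $T$ rounds, player actions $a_t \in \mathcal{A}$, adversary actions $x_t$, loss $\ell$, and $p$ ranging over joint distributions of $(x_1, \dots, x_T)$; the supremum plays the role of the "maximum" in the abstract.

```latex
% Regret of a play sequence against the best fixed action in hindsight
\mathrm{Regret}_T \;=\; \sum_{t=1}^{T} \ell(a_t, x_t) \;-\; \inf_{a \in \mathcal{A}} \sum_{t=1}^{T} \ell(a, x_t)

% Optimal (minimax) regret: sum of minimal conditional expected losses
% minus the minimal empirical loss, maximized over joint distributions p
R_T \;=\; \sup_{p}\; \mathbb{E}_{(x_1,\dots,x_T) \sim p}
\left[\, \sum_{t=1}^{T} \inf_{a_t \in \mathcal{A}}
\mathbb{E}\big[\ell(a_t, x_t) \,\big|\, x_1, \dots, x_{t-1}\big]
\;-\; \inf_{a \in \mathcal{A}} \sum_{t=1}^{T} \ell(a, x_t) \right]
```

The first term inside the brackets is the "sum of minimal expected losses" and the second is the "minimal empirical loss" named in the abstract.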


Online Multi-task Learning with Hard Constraints

arXiv.org Machine Learning

We discuss multi-task online learning when a decision maker has to deal simultaneously with M tasks. The tasks are related, which is modeled by requiring that the M-tuple of actions taken by the decision maker satisfy certain constraints. We give natural examples of such restrictions and then discuss a general class of tractable constraints, for which we introduce computationally efficient ways of selecting actions, essentially by reducing to an online shortest path problem. We briefly discuss "tracking" and "bandit" versions of the problem and extend the model in various ways, including non-additive global losses and uncountably infinite sets of tasks.


AAAI-08 and IAAI-08 Conferences Provide Focal Point for AI

AI Magazine

This year's conferences were held in [...]. Perhaps one of the true litmus tests of any conference is the caliber of the invited speakers. [...] research manager at Microsoft Research), who gave his AAAI presidential address, "Artificial Intelligence in the Open World." The distinguished Robert S. Engelmore Memorial Award Lecture was delivered by Kenneth Ford (Florida Institute [...]). In his lecture, "Toward Cognitive Prostheses," Ford discussed human-centered computing to amplify human cognition and perception. Chris Urmson (Carnegie Mellon University), a leading member of the DARPA Urban Grand Challenge winning team, described the race and winning [...], while Seth C. Goldstein (Carnegie Mellon University) discussed revolutionary work in self-reconfiguring programmable matter composed of ensembles of submillimeter robots in his talk, "Realizing Claytronics: A Challenge [...]." Other invited talks named in the report include "[...] Sensibility: Sentiment Analysis, Opinion Mining, and the Computational Treatment of Subjective Language," "Making Sense of Complex Networks" ([...] learning for network analysis), and "From Images to Scenes: Using Lots of Data to Infer Geometric, Photometric, and Semantic Scene Properties from a Single Image." David Haussler (University of California, Santa Cruz) traced the [...]. Instead of the popular competition, which has pushed the envelope of mobile robotics since its inception, this year was host to a Robot Workshop and Exhibition.


Report on the Fourth International Conference on Knowledge Capture (K-CAP 2007)

AI Magazine

The Fourth International Conference on Knowledge Capture was held October 28-31, 2007, in Whistler, British Columbia. The topics covered in the invited talks, technical papers, posters, and demonstrations included knowledge engineering and modeling methodologies, knowledge engineering and the semantic web, mixed-initiative planning and decision-support tools, acquisition of problem-solving knowledge, knowledge-based markup techniques, knowledge extraction systems, knowledge acquisition tools, and advice-taking systems. These events, which were [...]. Ken Barker and John Gennari had primary responsibilities for the conference and workshop programs. Derek Sleeman noted in his introductory comments, knowledge capture is [...]. In the last decade or so, knowledge capture has again expanded its horizons significantly to embrace information-extraction techniques, and more recently the web and enhanced connectivity [...]. [...] from web-based game-playing systems. The title of his talk was "Human [...]. [...] Etzioni's invited talk and gave some technical details of the systems [...]. The best technical paper award was presented to Kai Eckert, Heiner Stuckenschmidt, and Magnus Pfeffer for their paper "Interactive Thesaurus Assessment for Automatic Document Annotation." Since the K-CAP series was initiated, the K-CAP and European Knowledge Acquisition Workshop (EKAW) meetings have been held in alternate years, with the K-CAP meetings taking place [...].


The AAAI 2008 Robotics and Creativity Workshop

AI Magazine

Developments in mechanical control and complex motion planning have enabled robots to become almost commonplace in situations requiring precise but menial, tedious, and repetitive tasks. Recent robotics research has targeted the mechanical and computational challenges inherent in performing a much broader range of tasks autonomously. These problems are less well-defined, requiring greater intelligence, commonsense reasoning, and oftentimes novel solutions. By most definitions, creativity (the generation of novel and useful ideas) is necessary for intelligence; thus research efforts focusing on robotics and creativity are also efforts toward artificial intelligence. As robots and computer physical systems become more capable, they are increasingly useful in the study of creativity itself.