Towards Large-Scale Collaborative Planning: Answering High-Level Search Queries Using Human Computation

AAAI Conferences

Behind every search query is a high-level mission that the user wants to accomplish. While current search engines can often provide relevant information in response to well-specified queries, they place the heavy burden of making a plan for achieving a mission on the user. We take the alternative approach of tackling users' high-level missions directly by introducing a human computation system that generates simple plans, decomposing a mission into goals and retrieving search results tailored to each goal. Results show that our system is able to provide users with diverse, actionable search results and useful roadmaps for accomplishing their missions.


Creative Introspection and Knowledge Acquisition

AAAI Conferences

Introspection is a question-led process in which one builds on what one already knows to explore what is possible and plausible. In creative introspection, whether in art or in science, framing the right question is as important as finding the right answer. Presupposition-laden questions are themselves a source of knowledge, and in this paper we show how widely-held beliefs about the world can be dynamically acquired by harvesting such questions from the Web. We show how metaphorical reasoning can be modeled as an introspective process, one that builds on questions harvested from the Web to pose further speculative questions and queries. Metaphor is much more than a knowledge-hungry rhetorical device: it is a conceptual lever that allows a system to extend its model of the world.


Semantic Relatedness Using Salient Semantic Analysis

AAAI Conferences

Semantic relatedness is the task of finding and quantifying the strength of the semantic connections that exist between textual units, be they word pairs, sentence pairs, or document pairs. For instance, one may want to determine how semantically related are car and automobile, or noon and string. To make such a judgment, we rely on our accumulated knowledge and experiences, and utilize our ability of conceptual thinking, abstraction, and generalization. Knowledge-based measures such as L&C (Leacock and Chodorow 1998), Lesk (Lesk 1986), Wu&Palmer (Wu and Palmer 1994), Resnik (Resnik 1995), J&C (Jiang and Conrath 1997), H&S (Hirst and St Onge 1998), and many others employ information extracted from manually constructed lexical taxonomies like WordNet (Fellbaum 1998), Roget (Jarmasz 2003), and Wiktionary (Zesch, Muller, and …).
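As a concrete illustration of the knowledge-based family cited above, the Wu&Palmer (1994) measure scores two concepts by the depth of their lowest common subsumer in a taxonomy. The sketch below uses a tiny hand-made hypernym tree, not WordNet, and is only illustrative:

```python
# Toy hypernym taxonomy: child -> parent. Hand-made for illustration.
TAXONOMY = {
    "car": "motor_vehicle",
    "automobile": "motor_vehicle",
    "motor_vehicle": "vehicle",
    "bicycle": "vehicle",
    "vehicle": "artifact",
    "artifact": "entity",
}

def path_to_root(concept):
    """Return [concept, hypernym, ..., root]."""
    path = [concept]
    while path[-1] in TAXONOMY:
        path.append(TAXONOMY[path[-1]])
    return path

def wu_palmer(a, b):
    """Wu & Palmer: 2 * depth(LCS) / (depth(a) + depth(b)), root depth = 1."""
    pa, pb = path_to_root(a), path_to_root(b)
    lcs = next(c for c in pa if c in pb)    # lowest common subsumer
    depth = lambda c: len(path_to_root(c))
    return 2.0 * depth(lcs) / (depth(pa[0]) + depth(pb[0]))
```

Here `wu_palmer("car", "automobile")` yields a higher score than `wu_palmer("car", "bicycle")`, since the former pair shares a deeper subsumer.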


Finding Answers and Generating Explanations for Complex Biomedical Queries

AAAI Conferences

Recent advances in health and life sciences have led to generation of a large amount of biomedical data. To facilitate access to its desired parts, such a big mass of data has been represented in structured forms, like biomedical ontologies and databases. On the other hand, representing these biomedical ontologies and databases in different forms, constructing them independently from each other, and storing them at different locations have brought about many challenges for answering queries about the knowledge represented in these ontologies and databases. Some of these complex queries, such as Q1 or Q2, can be represented in a formal query language (e.g., SQL/SPARQL) and then answered using Semantic Web technologies. However, queries, like Q4, that require auxiliary recursive definitions (such as transitive closure) cannot be directly represented in these languages; and thus such queries cannot be answered directly using Semantic Web technologies. The experts usually compute auxiliary relations externally, for instance, by enumerating all drug-drug interaction chains or gene cliques, and then use these auxiliary relations.
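The kind of externally computed closure the abstract mentions can be made concrete with a short sketch. The drug names and interaction edges below are hypothetical; the point is that following interaction chains requires recursion that the abstract says cannot be expressed directly in the query languages it considers:

```python
# Transitive closure over a hypothetical drug-drug interaction graph.
from collections import deque

INTERACTS = {                     # direct interactions (undirected, toy data)
    "warfarin": {"aspirin"},
    "aspirin": {"warfarin", "ibuprofen"},
    "ibuprofen": {"aspirin"},
    "metformin": set(),
}

def interaction_chain_closure(drug):
    """All drugs reachable from `drug` through a chain of interactions."""
    seen, queue = set(), deque([drug])
    while queue:
        d = queue.popleft()
        for nxt in INTERACTS.get(d, ()):
            if nxt not in seen and nxt != drug:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

For example, `interaction_chain_closure("warfarin")` reaches ibuprofen through the intermediate interaction with aspirin, even though the two never interact directly.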


Identifying Evaluative Sentences in Online Discussions

AAAI Conferences

Much of opinion mining research focuses on product reviews because reviews are opinion-rich and contain little irrelevant information. However, this cannot be said about online discussions and comments. In such postings, the discussions can get highly emotional and heated with many emotional statements, and even personal attacks. As a result, many of the postings and sentences do not express positive or negative opinions about the topic being discussed. To find people's opinions on a topic and its different aspects, which we call evaluative opinions, those irrelevant sentences should be removed. The goal of this research is thus to identify evaluative opinion sentences. A novel unsupervised approach is proposed to solve the problem, and our experimental results show that it performs well.


SemRec: A Semantic Enhancement Framework for Tag Based Recommendation

AAAI Conferences

Collaborative tagging services provided by various social web sites have become a popular means to mark web resources for different purposes, such as categorization or the expression of a preference. However, the tags are syntactic in nature, written in a free style, and do not reflect semantics, resulting in problems of redundancy, ambiguity, and a lack of semantics. Current tag-based recommender systems mainly take the explicit structural information among users, resources, and tags into consideration, while neglecting the important implicit semantic relationships hidden in tagging data. In this study, we propose a Semantic Enhancement Recommendation strategy (SemRec), based on both structural information and semantic information, through a unified fusion model. Extensive experiments conducted on two real datasets demonstrate the effectiveness of our approach.
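One way to picture fusing structural and semantic evidence is a weighted combination of a raw-tag overlap score and a concept-level overlap score. This is a hypothetical sketch, not SemRec's actual model; the synonym table and the weight `alpha` are both assumptions:

```python
# Hypothetical tag-to-concept table; real systems would derive this
# from the tagging data rather than hard-code it.
SYNONYMS = {"auto": "car", "automobile": "car", "film": "movie"}

def concepts(tags):
    """Map raw tags to concepts via the (assumed) synonym table."""
    return {SYNONYMS.get(t, t) for t in tags}

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def fused_score(user_tags, item_tags, alpha=0.5):
    """Linear fusion; alpha weights structural vs. semantic evidence."""
    structural = jaccard(set(user_tags), set(item_tags))        # raw tags
    semantic = jaccard(concepts(user_tags), concepts(item_tags))  # concepts
    return alpha * structural + (1 - alpha) * semantic
```

Note how `["auto", "film"]` and `["car", "movie"]` share no raw tags (structural score 0) yet are identical at the concept level, so the fused score is nonzero.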


Cross-Language Latent Relational Search: Mapping Knowledge across Languages

AAAI Conferences

Latent relational search (LRS) is a novel approach for mapping knowledge across two domains. Given source domain knowledge concerning the Moon, "The Moon is a satellite of the Earth," one can form a question {(Moon, Earth), (Ganymede, ?)} to query an LRS engine for new knowledge in the target domain concerning Ganymede. An LRS engine relies on supporting sentences such as "Ganymede is a natural satellite of Jupiter" to retrieve and rank "Jupiter" as the first answer. This paper proposes cross-language latent relational search (CLRS) to extend the knowledge mapping capability of LRS from cross-domain knowledge mapping to cross-domain and cross-language knowledge mapping. In CLRS, the supporting sentences for the source pair might be in a different language from that of the target pair. We represent the relation between two entities in an entity pair by lexical patterns of the context surrounding the two entities. We then propose a novel hybrid lexical pattern clustering algorithm to capture the semantic similarity between paraphrased lexical patterns across languages. Experiments on Japanese-English datasets show that the proposed method achieves an MRR of 0.579 for the CLRS task, which is comparable to the MRR of an existing monolingual LRS engine.
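A minimal monolingual sketch of the underlying LRS idea (not the paper's hybrid cross-language pattern clustering) represents the relation of an entity pair by the lexical pattern between the two entities in supporting sentences, then ranks candidates by pattern overlap. The sentences and entities below are toy data:

```python
SENTENCES = [
    "the Moon is a satellite of Earth",
    "Ganymede is a satellite of Jupiter",
    "Ganymede is larger than Mercury",
]

def patterns(a, b):
    """Lexical patterns: the word sequence between a and b in any sentence."""
    found = set()
    for s in SENTENCES:
        words = s.split()
        if a in words and b in words and words.index(a) < words.index(b):
            found.add(" ".join(words[words.index(a) + 1 : words.index(b)]))
    return found

def answer(query_pair, known_entity, candidates):
    """For {(A, B), (C, ?)}: pick the candidate whose patterns with C
    best overlap the patterns of (A, B), by Jaccard similarity."""
    source = patterns(*query_pair)
    def score(cand):
        target = patterns(known_entity, cand)
        union = source | target
        return len(source & target) / len(union) if union else 0.0
    return max(candidates, key=score)
```

With the toy data, the query {(Moon, Earth), (Ganymede, ?)} prefers "Jupiter" over "Mercury" because the shared pattern "is a satellite of" links the two pairs.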


Leveraging Wikipedia Characteristics for Search and Candidate Generation in Question Answering

AAAI Conferences

Most existing Question Answering (QA) systems adopt a type-and-generate approach to candidate generation that relies on a pre-defined domain ontology. This paper describes a type independent search and candidate generation paradigm for QA that leverages Wikipedia characteristics. This approach is particularly useful for adapting QA systems to domains where reliable answer type identification and type-based answer extraction are not available. We present a three-pronged search approach motivated by relations an answer-justifying title-oriented document may have with the question/answer pair. We further show how Wikipedia metadata such as anchor texts and redirects can be utilized to effectively extract candidate answers from search results without a type ontology. Our experimental results show that our strategies obtained high binary recall in both search and candidate generation on TREC questions, a domain that has mature answer type extraction technology, as well as on Jeopardy! questions, a domain without such technology. Our high-recall search and candidate generation approach has also led to high overall QA performance in Watson, our end-to-end system.
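The redirect-based candidate normalization described above can be sketched as follows. The redirect table here is a tiny hand-made stand-in for real Wikipedia metadata, and the merging scheme is an illustrative assumption:

```python
# Hypothetical redirect metadata: redirect title -> canonical article title.
REDIRECTS = {
    "USA": "United States",
    "U.S.": "United States",
    "Honest Abe": "Abraham Lincoln",
}

def canonical(title):
    """Map a surface form to its canonical title, if a redirect exists."""
    return REDIRECTS.get(title, title)

def merge_candidates(raw_candidates):
    """Collapse redirect variants of the same answer, summing their scores."""
    merged = {}
    for cand, score in raw_candidates:
        key = canonical(cand)
        merged[key] = merged.get(key, 0.0) + score
    return merged
```

Merging variants this way keeps evidence for "USA" and "United States" from splitting one answer's score across several surface forms.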


Understanding Natural Language Commands for Robotic Navigation and Mobile Manipulation

AAAI Conferences

This paper describes a new model for understanding natural language commands given to autonomous systems that perform navigation and mobile manipulation in semi-structured environments. Previous approaches have used models with fixed structure to infer the likelihood of a sequence of actions given the environment and the command. In contrast, our framework, called Generalized Grounding Graphs, dynamically instantiates a probabilistic graphical model for a particular natural language command according to the command's hierarchical and compositional semantic structure. Our system performs inference in the model to successfully find and execute plans corresponding to natural language commands such as "Put the tire pallet on the truck." The model is trained using a corpus of commands collected using crowdsourcing. We pair each command with robot actions and use the corpus to learn the parameters of the model. We evaluate the robot's performance by inferring plans from natural language commands, executing each plan in a realistic robot simulator, and asking users to evaluate the system's performance. We demonstrate that our system can successfully follow many natural language commands from the corpus.


Transfer Latent Semantic Learning: Microblog Mining with Less Supervision

AAAI Conferences

The increasing volume of information generated on microblogging sites such as Twitter raises several challenges to traditional text mining techniques. First, most texts from those sites are abbreviated due to the limited number of characters allowed in one post; second, the input usually comes in large-volume streams. Therefore, it is of significant importance to develop effective and efficient representations of abbreviated texts for better filtering and mining. In this paper, we introduce a novel transfer learning approach, namely transfer latent semantic learning, that utilizes a large number of related tagged documents with rich information from other sources (the source domain) to help build a robust latent semantic space for the abbreviated texts (the target domain). This is achieved by simultaneously minimizing the document reconstruction error and the classification error of the labeled examples from the source domain, by building a classifier with hinge loss in the latent semantic space. We demonstrate the effectiveness of our method by applying it to the task of classifying and tagging abbreviated texts. Experimental results on both synthetic datasets and real application datasets, including Reuters-21578 and Twitter data, suggest substantial improvements using our approach over existing ones.
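The joint objective the abstract describes, reconstruction error plus hinge loss in a shared latent space, might take a form like the following. The exact formulation and all symbols are assumptions, not taken from the paper:

```latex
% Hedged reconstruction: X is a term-document matrix over source and target
% documents, U maps terms to the latent space, v_i is the latent
% representation of labeled source document i with label y_i, and
% \mathcal{L} indexes the labeled source examples.
\min_{U, V, w}\;
  \underbrace{\lVert X - U V \rVert_F^2}_{\text{reconstruction error}}
  \;+\; \lambda \sum_{i \in \mathcal{L}}
  \underbrace{\max\bigl(0,\; 1 - y_i\, w^{\top} v_i\bigr)}_{\text{hinge loss}}
```

Minimizing both terms at once forces the latent space to be simultaneously good at reconstructing documents and at separating the labeled source classes, which is how labels transfer to the abbreviated target texts.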