AITopics | Information Retrieval

Collaborating Authors

Information Retrieval

Our accustomed systems of retrieving particular bits of information no longer fill the needs of many people. Searching traditional indexes of print publications has been aided by computerized databases, but still usually requires time-consuming serial searching of one database after the other, and then moving on to other methods of searching for internet sources. And what if the information being sought is a sound byte? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where widely different formats contain the information being sought, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.

News Overviews Instructional Materials AI-Alerts Classics

Improved Asymmetric Locality Sensitive Hashing (ALSH) for Maximum Inner Product Search (MIPS)

Shrivastava, Anshumali, Li, Ping

arXiv.org Machine LearningNov-13-2014

Recently it was shown that the problem of Maximum Inner Product Search (MIPS) is efficient and it admits provably sub-linear hashing algorithms. Asymmetric transformations before hashing were the key in solving MIPS which was otherwise hard. In the prior work, the authors use asymmetric transformations which convert the problem of approximate MIPS into the problem of approximate near neighbor search which can be efficiently solved using hashing. In this work, we provide a different transformation which converts the problem of approximate MIPS into the problem of approximate cosine similarity search which can be efficiently solved using signed random projections. Theoretical analysis show that the new scheme is significantly better than the original scheme for MIPS. Experimental evaluations strongly support the theoretical findings.

information retrieval, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

1410.541

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Information Management > Search (0.70)
(2 more...)

Add feedback

Mining Large-Scale Knowledge Graphs to Discover Inference Paths for Query Expansion in NLIDB

Yeh, Peter Z. (Nuance Communications) | Ratnaparkhi, Adwait (Nuance Communications)

AAAI ConferencesNov-1-2014

In this paper, we present an approach to mine large-scale knowledge graphs to discover inference paths for query expansion in NLIDB (Natural Language Interface to Databases). Addressing this problem is important in order for NLIDB applications to effectively handle relevant concepts in the domain of interest that do not correspond to any structured fields in the target database. We also present preliminary observations on the performance of our approach applied to Freebase, and conclude with discussions on next steps to further evaluate and extend our approach.

artificial intelligence, information retrieval query processing, natural language, (18 more...)

AAAI Conferences

2014 AAAI Fall Symposium Series

Country: North America > United States > California > Santa Clara County > Sunnyvale (0.04)

Industry:

Leisure & Entertainment (0.69)
Media > Film (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.64)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.64)

Add feedback

Ontology-Based Translation of Natural Language Queries to SPARQL

Sander, Malte (Technische Universität (TU) München and Siemens AG) | Waltinger, Ulli (Siemens AG) | Roshchin, Mikhail (Siemens AG) | Runkler, Thomas (Technische Universität (TU) München and Siemens AG)

AAAI ConferencesNov-1-2014

We present an implemented approach to transform natural language sentences into SPARQL, using background knowledge from ontologies and lexicons. Therefore, eligible technologies and data storage possibilities are analyzed and evaluated. The contributions of this paper are twofold. Firstly, we describe the motivation and current needs for a natural language access to industry data. We describe several scenarios where the proposed solution is required. Resulting in an architectural approach based on automatic SPARQL query construction for effective natural language queries. Secondly, we analyze the performance of RDBMS, RDF and Triple Stores for the knowledge representation. The proposed approach will be evaluated on the basis of a query catalog by means of query efficiency, accuracy, and data storage performance. The results show, that natural language access to industry data using ontologies and lexicons, is a simple but effective approach to improve the diagnosis process and the data search for a broad range of users. Furthermore, virtual RDF graphs do support the DB-driven knowledge graph representation process, but do not perform efficient under industry conditions in terms of performance and scalability.

artificial intelligence, information retrieval query processing, natural language, (18 more...)

AAAI Conferences

2014 AAAI Fall Symposium Series

Country:

North America > United States (0.04)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.46)

Add feedback

Algorithm Selection for Combinatorial Search Problems: A Survey

Kotthoff, Lars (University College Cork)

AI MagazineSep-29-2014

The algorithm selection problem is concerned with selecting the best algorithm to solve a given problem instance on a case-by-case basis. It has become especially relevant in the last decade, with researchers increasingly investigating how to identify the most suitable existing algorithm for solving a problem instance instead of developing new algorithms. This survey presents an overview of this work focusing on the contributions made in the area of combinatorial search problems, where algorithm selection techniques have achieved significant performance improvements. We unify and organise the vast literature according to criteria that determine algorithm selection systems in practice.

algorithm selection, artificial intelligence, survey article, (6 more...)

AI Magazine

Genre: Overview (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.67)

Add feedback

Topic Similarity Networks: Visual Analytics for Large Document Sets

Maiya, Arun S., Rolfe, Robert M.

arXiv.org Machine LearningSep-26-2014

We investigate ways in which to improve the interpretability of LDA topic models by better analyzing and visualizing their outputs. We focus on examining what we refer to as topic similarity networks: graphs in which nodes represent latent topics in text collections and links represent similarity among topics. We describe efficient and effective approaches to both building and labeling such networks. Visualizations of topic models based on these networks are shown to be a powerful means of exploring, characterizing, and summarizing large collections of unstructured text documents. They help to "tease out" non-obvious connections among different sets of documents and provide insights into how topics form larger themes. We demonstrate the efficacy and practicality of these approaches through two case studies: 1) NSF grants for basic research spanning a 14 year period and 2) the entire English portion of Wikipedia.

information retrieval, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

1409.7591

Country:

North America > United States (1.00)
North America > Canada > Alberta (0.28)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Sports (1.00)
Health & Medicine (0.93)
Government > Regional Government > North America Government > United States Government (0.68)
Media (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(3 more...)

Add feedback

Solving the Target-Value Search Problem

López, Carlos Linares (Universidad Carlos III de Madrid) | Stern, Roni (Ben Gurion University) | Felner, Ariel (Ben Gurion University)

AAAI ConferencesJul-22-2014

This paper addresses the Target-Value Search (TVS) problem, which is the problem of finding a path between two nodes in a graph whose cost is as close as possible to a given target value, T. This problem has been previously addressed: first, for directed acyclic graphs; second, for general graphs under the assumption that nodes can be revisited given that the same edge can not be traversed twice. In this work we focus on a more restrictive variant of the same problem where nodes can not be revisited. We prove that this variant is NP-complete and discuss novel theoretical properties and provide empirical results to solve this problem optimally.

target-value search problem

AAAI Conferences

Seventh Annual Symposium on Combinatorial Search

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.40)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)

Add feedback

CiteSeerX: AI in a Digital Library Search Engine

Wu, Jian (The Pennsylvania State University) | Williams, Kyle (The Pennsylvania State University) | Chen, Hung-Hsuan (The Pennsylvania State University) | Khabsa, Madian (The Pennsylvania State University) | Caragea, Cornelia (University of North Texas) | Ororbia, Alexander (The Pennsylvania State University) | Jordan, Douglas (The Pennsylvania State University) | Giles, C. Lee (The Pennsylvania State University)

AAAI ConferencesJul-14-2014

CiteSeerX is a digital library search engine that provides access to more than 4 million academic documents with nearly a million users and millions of hits per day. Artificial intelligence (AI) technologies are used in many components of CiteSeerX, e.g. to accurately extract metadata, intelligently crawl the web, and ingest documents. We present key AI technologies used in the following components: document classification and deduplication, document and citation clustering, automatic metadata extraction and indexing, and author disambiguation. These AI technologies have been developed by CiteSeerX group members over the past 5–6 years. We also show the usage status, payoff, development challenges, main design concepts, and deployment and maintenance requirements. While it is challenging to rebuild a system like CiteSeerX from scratch, many of these AI technologies are transferable to other digital libraries and/or search engines.

digital library search engine, information retrieval, natural language, (3 more...)

AAAI Conferences

Twenty-Sixth IAAI Conference

Technology:

Information Technology > Information Management > Metadata Management (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.80)

Add feedback

Cost-Based Query Optimization via AI Planning

Robinson, Nathan (Australian National University) | McIlraith, Sheila (University of Toronto) | Toman, David (University of Waterloo)

AAAI ConferencesJul-14-2014

In this paper we revisit the problem of generating query plans using AI automated planning with a view to leveraging significant recent advances in state-of-the-art planning techniques. Our efforts focus on the specific problem of cost-based join-order optimization for conjunctive relational queries, a critical component of production-quality query optimizers. We characterize the general query-planning problem as a delete-free planning problem, and query plan optimization as a context-sensitive cost-optimal planning problem. We propose algorithms that generate high-quality query plans, guaranteeing optimality under certain conditions. Our approach is general, supporting the use of a broad suite of domain-independent and domain-specific optimization criteria. Experimental results demonstrate the effectiveness of AI planning techniques for query plan generation and optimization.

artificial intelligence, information retrieval query processing, natural language, (20 more...)

AAAI Conferences

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

North America > Canada > Ontario > Toronto (0.14)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)

Add feedback

Extracting Keyphrases from Research Papers Using Citation Networks

Gollapalli, Sujatha Das (University of North Texas) | Caragea, Cornelia (University of North Texas)

AAAI ConferencesJul-14-2014

Keyphrases for a document concisely describe the document using a small set of phrases. Keyphrases were previously shown to improve several document processing and retrieval tasks. In this work, we study keyphrase extraction from research papers by leveraging citation networks. We propose CiteTextRank for keyphrase extraction from research articles, a graph-based algorithm that incorporates evidence from both a document's content as well as the contexts in which the document is referenced within a citation network. Our model obtains significant improvements over the state-of-the-art models for this task. Specifically, on several datasets of research papers, CiteTextRank improves precision at rank 1 by as much as 9-20% over state-of-the-art baselines.

data mining, information retrieval, machine learning, (19 more...)

AAAI Conferences

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Texas (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Maryland (0.04)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Information Management > Search (0.95)
(2 more...)

Add feedback

Learning Concept Embeddings for Query Expansion by Quantum Entropy Minimization

Sordoni, Alessandro (Université de Montréal) | Bengio, Yoshua (Université de Montréal) | Nie, Jian-Yun (Université de Montréal)

AAAI ConferencesJul-14-2014

In web search, users queries are formulated using only few terms and term-matching retrieval functions could fail at retrieving relevant documents. Given a user query, the technique of query expansion (QE) consists in selecting related terms that could enhance the likelihood of retrieving relevant documents. Selecting such expansion terms is challenging and requires a computational framework capable of encoding complex semantic relationships. In this paper, we propose a novel method for learning, in a supervised way, semantic representations for words and phrases. By embedding queries and documents in special matrices, our model disposes of an increased representational power with respect to existing approaches adopting a vector representation. We show that our model produces high-quality query expansion terms. Our expansion increase IR measures beyond expansion from current word-embeddings models and well-established traditional QE methods.

artificial intelligence, natural language, text processing, (16 more...)

AAAI Conferences

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Automobiles & Trucks > Manufacturer (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.93)

Add feedback