AITopics | Information Retrieval

Information Retrieval

Our accustomed systems for retrieving particular bits of information no longer meet the needs of many people. Computerized databases have aided searches of traditional print indexes, but such searches still usually require time-consuming serial searching of one database after another, followed by other methods of searching for internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where widely different formats contain the information being sought, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at human speed.

Recommendation as Collaboration in Web Search

Smyth, Barry (CLARITY: Centre for Sensor Web Technologies) | Freyne, Jill (Tasmanian ICT Centre, CSIRO) | Coyle, Maurice (HeyStaks Technologies Limited) | Briggs, Peter (HeyStaks Technologies Limited)

AI Magazine, Oct-31-2011

Recommender systems now play an important role in online information discovery, complementing traditional approaches such as search and navigation with a more proactive approach to discovery that is informed by the user's interests and preferences. To date, recommender systems have been deployed within a variety of e-commerce domains, covering a range of products such as books, music, and movies, and have proven to be a successful way to convert browsers into buyers. Recommendation technologies have a potentially much greater role to play in information discovery, however. In this article we consider recent research that takes a fresh look at web search as a fertile platform for recommender systems research, as users demand a new generation of search engines that are less susceptible to manipulation and more responsive to searcher needs and preferences.

artificial intelligence, information management, recommendation, (8 more...)

AI Magazine

Industry: Information Technology > Services (0.68)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

Query-time Entity Resolution

Bhattacharya, I., Getoor, L.

arXiv.org Artificial Intelligence, Oct-31-2011

Entity resolution is the problem of reconciling database references corresponding to the same real-world entities. Given the abundance of publicly available databases that have unresolved entities, we motivate the problem of query-time entity resolution: quick and accurate resolution for answering queries over such unclean databases at query time. Since collective entity resolution approaches (where related references are resolved jointly) have been shown to be more accurate than independent attribute-based resolution for off-line entity resolution, we focus on developing new algorithms for collective resolution for answering entity resolution queries at query time. For this purpose, we first formally show that, for collective resolution, precision and recall for individual entities follow a geometric progression as neighbors at increasing distances are considered. Unfolding this progression leads naturally to a two-stage "expand and resolve" query processing strategy. In this strategy, we first extract the related records for a query using two novel expansion operators, and then resolve the extracted records collectively. We then show how the same strategy can be adapted for query-time entity resolution by identifying and resolving only those database references that are the most helpful for processing the query. We validate our approach on two large real-world publication databases, where we show the usefulness of collective resolution and at the same time demonstrate the need for adaptive strategies for query processing. We then show how the same queries can be answered in real time using our adaptive approach while preserving the gains of collective resolution. In addition to experiments on real datasets, we use synthetically generated data to empirically demonstrate the validity of the performance trends predicted by our analysis of collective entity resolution over a wide range of structural characteristics in the data.
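The two-stage "expand and resolve" idea from the abstract above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy author references in `REFS`, the helper names, the use of shared last names and shared papers as the two relations followed during expansion, and the one-pass merge rule. The abstract does not specify the paper's expansion operators or its collective resolution algorithm, so this is not a reproduction of either.

```python
from collections import defaultdict, deque

# Hypothetical author references: (reference id, name as written, paper id).
REFS = [
    (0, "J. Smith", "p1"), (1, "A. Ansari", "p1"),
    (2, "John Smith", "p2"), (3, "A. Ansari", "p2"),
    (4, "J. Smith", "p3"), (5, "B. Lee", "p3"),
    (6, "C. Wu", "p4"),
]

def last_name(name):
    return name.split()[-1].lower()

def expand(query_name, refs, depth):
    """Stage 1: collect references within `depth` hops of the query matches.

    A hop follows either co-occurrence (same paper) or name similarity
    (same last name); both relations are simplifications for illustration.
    """
    by_paper, by_last = defaultdict(set), defaultdict(set)
    for rid, name, paper in refs:
        by_paper[paper].add(rid)
        by_last[last_name(name)].add(rid)
    info = {rid: (name, paper) for rid, name, paper in refs}
    seen = set(by_last[last_name(query_name)])
    frontier = deque((rid, 0) for rid in seen)
    while frontier:
        rid, d = frontier.popleft()
        if d == depth:
            continue  # do not expand beyond the requested distance
        name, paper = info[rid]
        for nxt in by_paper[paper] | by_last[last_name(name)]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return seen

def resolve(ref_ids, refs):
    """Stage 2: merge references with the same last name and a shared co-author name."""
    info = {rid: (name, paper) for rid, name, paper in refs if rid in ref_ids}
    coauthors = {
        rid: {last_name(n) for r, (n, p) in info.items() if p == paper and r != rid}
        for rid, (name, paper) in info.items()
    }
    parent = {rid: rid for rid in info}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    ids = sorted(info)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            same_name = last_name(info[a][0]) == last_name(info[b][0])
            if same_name and coauthors[a] & coauthors[b]:
                parent[find(b)] = find(a)
    clusters = defaultdict(set)
    for rid in ids:
        clusters[find(rid)].add(rid)
    return sorted(sorted(c) for c in clusters.values())

extracted = expand("Smith", REFS, depth=1)
clusters = resolve(extracted, REFS)
```

In this toy data the two "Smith" references that share the co-author "Ansari" end up in one cluster, while the "Smith" who wrote with "Lee" stays separate. Limiting `depth` is what keeps the work proportional to the query's neighbourhood rather than to the whole database.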

entity resolution, query, resolution, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.2290

1111.0045

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > District of Columbia > Washington (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(15 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)

A Comparison of Different Machine Transliteration Models

Choi, K., Isahara, H., Oh, J.

arXiv.org Artificial Intelligence, Oct-6-2011

Machine transliteration is a method for automatically converting words in one language into phonetically equivalent ones in another language. Machine transliteration plays an important role in natural language applications such as information retrieval and machine translation, especially for handling proper nouns and technical terms. Four machine transliteration models -- grapheme-based transliteration model, phoneme-based transliteration model, hybrid transliteration model, and correspondence-based transliteration model -- have been proposed by several researchers. To date, however, there has been little research on a framework in which multiple transliteration models can operate simultaneously. Furthermore, there has been no comparison of the four models within the same framework and using the same data. We addressed these problems by 1) modeling the four models within the same framework, 2) comparing them under the same conditions, and 3) developing a way to improve machine transliteration through this comparison. Our comparison showed that the hybrid and correspondence-based models were the most effective and that the four models can be used in a complementary manner to improve machine transliteration performance.

information retrieval, machine learning, transliteration, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.1999

1110.1391

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.69)

An Expressive Language and Efficient Execution System for Software Agents

Barish, G., Knoblock, C. A.

arXiv.org Artificial Intelligence, Sep-9-2011

Software agents can be used to automate many of the tedious, time-consuming information processing tasks that humans currently have to complete manually. However, to do so, agent plans must be capable of representing the myriad of actions and control flows required to perform those tasks. In addition, since these tasks can require integrating multiple sources of remote information (typically a slow, I/O-bound process), it is desirable to make execution as efficient as possible. To address both of these needs, we present a flexible software agent plan language and a highly parallel execution system that enable the efficient execution of expressive agent plans. The plan language allows complex tasks to be more easily expressed by providing a variety of operators for flexibly processing the data as well as supporting subplans (for modularity) and recursion (for indeterminate looping). The executor is based on a streaming dataflow model of execution to maximize the amount of operator and data parallelism possible at runtime. We have implemented both the language and executor in a system called THESEUS. Our results from testing THESEUS show that streaming dataflow execution can yield significant speedups over both the traditional serial (von Neumann) execution and the non-streaming dataflow-style execution that existing software and robot agent execution systems currently support. In addition, we show how plans written in the language we present can represent certain types of subtasks that cannot be accomplished using the languages supported by network query engines. Finally, we demonstrate that the increased expressivity of our plan language does not hamper performance; specifically, we show how data can be integrated from multiple remote sources just as efficiently using our architecture as is possible with a state-of-the-art streaming-dataflow network query engine.

artificial intelligence, information retrieval query processing, natural language, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.1548

1109.2048

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(19 more...)

Genre: Research Report > New Finding (0.48)

Industry: Government > Regional Government > North America Government > United States Government (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)

Many Bills: Visualizing the Anatomy of Congressional Legislation

Aktolga, Elif (University of Massachusetts Amherst) | Ros, Irene (IBM Watson Research Center) | Assogba, Yannick (IBM Watson Research Center) | DiMicco, Joan (IBM Watson Research Center)

AAAI Conferences, Aug-8-2011

US Federal Legislation is a common subject of discussion and advocacy on the web. The contents of bills present a significant challenge to both experts and average citizens due to their length and complex legal language. To make bills more accessible to the general public, we present Many Bills: a web-based visualization prototype that reveals the underlying semantics of a bill. We classify the sections of a bill into topics and visualize them using different colors. Further, using information retrieval techniques, we locate sections that don't seem to fit with the overall topic of the bill. To highlight outliers in our 'misfit mode', we visualize them in red, which builds a contrast against the remaining gray sections. Both topic and misfit visualizations provide an overview and detail view of bills, enabling users to read individual sections of a bill and compare topic patterns across multiple bills. We obtained initial user feedback and continue collecting label corrections from users through the interface.
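The abstract above says only that "information retrieval techniques" are used to find sections that do not fit a bill's overall topic. One simple way such a check could work, offered purely as an assumption and not as the Many Bills method, is to compare each section's term counts against the combined term counts of all other sections using cosine similarity and flag the least similar section. The section texts and function names below are hypothetical.

```python
import math
from collections import Counter

def tf_vector(text):
    """Bag-of-words term counts (whitespace tokenization, lowercased)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def misfit_scores(sections):
    """Similarity of each section to the pooled term counts of all other sections."""
    vectors = [tf_vector(s) for s in sections]
    scores = []
    for i, vec in enumerate(vectors):
        rest = Counter()
        for j, other in enumerate(vectors):
            if j != i:
                rest.update(other)
        scores.append(cosine(vec, rest))
    return scores

# Hypothetical section texts, for illustration only.
sections = [
    "school funding for public school teachers",
    "grants for school teachers and public education",
    "public education funding and school grants",
    "highway bridge repair and road construction",
]
scores = misfit_scores(sections)
misfit = min(range(len(sections)), key=scores.__getitem__)
```

Here `misfit` is the index of the lowest-scoring section, the one a 'misfit mode' display would highlight. A real system would likely add stop-word removal and term weighting, which this sketch omits.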

congressional legislation, outlier, visualization, (13 more...)

AAAI Conferences

Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > United States > New York > New York County > New York City (0.06)
(2 more...)

Industry:

Law > Statutes (1.00)
Government (1.00)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.54)

Towards Large-Scale Collaborative Planning: Answering High-Level Search Queries Using Human Computation

Law, Edith (Carnegie Mellon University) | Zhang, Haoqi (Harvard University)

AAAI Conferences, Aug-4-2011

Behind every search query is a high-level mission that the user wants to accomplish. While current search engines can often provide relevant information in response to well-specified queries, they place the heavy burden of making a plan for achieving a mission on the user. We take the alternative approach of tackling users' high-level missions directly by introducing a human computation system that generates simple plans, by decomposing a mission into goals and retrieving search results tailored to each goal. Results show that our system is able to provide users with diverse, actionable search results and useful roadmaps for accomplishing their missions.

health & medicine, information management, search result, (19 more...)

AAAI Conferences

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country: North America > United States (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Consumer Health (0.68)
Energy > Oil & Gas (0.68)
Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.50)

A Whole Page Click Model to Better Interpret Search Engine Click Data

Chen, Weizhu (Microsoft Research Asia and Hong Kong University of Science and Technology) | Ji, Zhanglong (Microsoft Research Asia) | Shen, Si (Hong Kong University of Science and Technology) | Yang, Qiang (Hong Kong University of Science and Technology)

AAAI Conferences, Aug-4-2011

Recent advances in click modeling have established it as an attractive approach to interpreting search click data. These advances characterize users' search behavior either in advertisement blocks or within an organic search block through probabilistic models. Yet, when searching for information on a search result page, one is often interacting with the search engine via an entire page instead of a single block. Consequently, previous works that exclusively modeled user behavior in a single block may sacrifice much useful user behavior information embedded in other blocks. To solve this problem, we put forward a novel Whole Page Click (WPC) Model to characterize user behavior in multiple blocks. Specifically, WPC uses a Markov chain to learn the user transition probabilities among different blocks in the whole page. To compare our model with the best alternatives in the web search literature, we run a large-scale experiment on a real dataset and demonstrate the advantage of the WPC model in terms of both the whole page and each block in the page. In particular, we find that WPC can achieve significant gain in interpreting the advertisement data, despite the sparsity of the advertisement click data.
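The Markov-chain ingredient mentioned in the abstract above can be sketched in isolation. The example below estimates first-order transition probabilities between page blocks by counting, under the simplifying assumption that the order in which a user attends to blocks is directly observed. The block names and session data are hypothetical, and the abstract does not describe how the WPC model itself is parameterized or trained, so this is not the authors' model.

```python
from collections import Counter, defaultdict

def estimate_transitions(sessions):
    """Maximum-likelihood first-order transition probabilities between blocks.

    sessions: list of lists of block names, in the order a user attended to them.
    Returns {source_block: {destination_block: probability}}.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }

# Hypothetical block-visit sequences (block names are illustrative only).
sessions = [
    ["organic", "top_ads", "organic"],
    ["organic", "side_ads"],
    ["top_ads", "organic", "top_ads"],
]
probs = estimate_transitions(sessions)
```

With these three sessions, two of the three observed moves out of the organic block go to the top ad block, so `probs["organic"]["top_ads"]` is 2/3. Rarely visited blocks yield few counts, which is one way the sparsity of advertisement click data shows up in practice.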

click model, interpret search engine click data, user behavior, (2 more...)

AAAI Conferences

Twenty-Fifth AAAI Conference on Artificial Intelligence

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.65)

Monitoring Entities in an Uncertain World: Entity Resolution and Referential Integrity

AAAI Conferences, Aug-4-2011

This paper describes a system to help intelligence analysts track and analyze information being published in multiple sources, particularly open sources on the Web. The system integrates technology for Web harvesting, natural language extraction, and network analytics, and allows analysts to view and explore the results via a Web application. One of the difficult problems we address is the entity resolution problem, which occurs when there are multiple, differing ways to refer to the same entity. The problem is particularly complex when noisy data is being aggregated over time, there is no clean master list of entities, and the entities under investigation are intentionally being deceptive. Our system must not only perform entity resolution with noisy data, but must also gracefully recover when entity resolution mistakes are subsequently corrected. We present a case study in arms trafficking that illustrates the issues, and describe how they are addressed.

application, descriptor, entitybase, (15 more...)

AAAI Conferences

Twenty-Third IAAI Conference

Country:

Asia > Middle East > Iran (0.04)
North America > United States > California > Los Angeles County > El Segundo (0.04)
Asia > Middle East > Qatar (0.04)
(2 more...)

Industry:

Transportation > Air (1.00)
Government > Military (0.67)
Law Enforcement & Public Safety (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

The News that Matters to You: Design and Deployment of a Personalized News Service

Stefik, Mark Jeffrey (PARC) | Good, Lance (Google)

AAAI Conferences, Aug-4-2011

With the growth of online information, many people are challenged in finding and reading the information most important for their interests. From 2008 to 2010 we built an experimental personalized news system where readers can subscribe to organized channels of information that are curated by experts. AI technology was employed to radically reduce the workload of curators and to efficiently present information to readers. The system has gone through three implementation cycles and processed over 16 million news stories from about 12,000 RSS feeds on over 8,000 topics organized by 160 curators for over 600 registered readers. This paper describes the approach, engineering, and AI technology of the system.

curator, information, query, (14 more...)

AAAI Conferences

Twenty-Third IAAI Conference

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Industry: Media > News (1.00)

Technology:

Information Technology > Communications > Social Media (0.95)
Information Technology > Communications > Web (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.65)
(2 more...)

Web Personalization and Cohort Information Services for Natural Resource Managers

Redman, Crystal E. (Colorado State University)

AAAI Conferences, Aug-4-2011

Their information needs are long term and highly dynamic - nearly everything about this topic is in flux. For these users, information search can be made more effective with knowledge about the field and about the types of documents being retrieved. Because the resource management decisions require judgment about the materials collected, the users require confidentiality and must trust the sources. Matilda is designed to 1) tailor information collection for ...

... popular information needs of the masses. Topic specificity, customizability, and automatically pursuing the long term unique information needs of individual users are not among the strengths of current main stream search engines (Jansen, Spink, and Saracevic 2000) (Teevan, Dumais, and Horvitz 2005). This gap has inspired web personalization and collaborative information seeking tools such as Google Alerts and has encouraged topic-specific blogs and podcasts.

information retrieval, machine learning, natural language, (14 more...)

AAAI Conferences

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Colorado > Larimer County > Fort Collins (0.05)

Industry: Information Technology (0.49)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.71)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.36)
