AITopics | Information Retrieval

Information Retrieval

Our accustomed systems of retrieving particular bits of information no longer meet the needs of many people. Searching traditional indexes of print publications has been aided by computerized databases, but it still usually requires time-consuming serial searching of one database after another, followed by other methods of searching for internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where widely different formats contain the information being sought, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.

Determining the Unithood of Word Sequences using a Probabilistic Approach

Wong, Wilson, Liu, Wei, Bennamoun, Mohammed

arXiv.org Artificial Intelligence, Oct-1-2008

Most research related to unithood was conducted as part of a larger effort to determine termhood. Consequently, novelties are rare in this small sub-field of term extraction. In addition, existing work was mostly empirically motivated and derived. We propose a new probabilistically-derived measure, independent of any influences of termhood, that provides dedicated measures to gather linguistic evidence from parsed text and statistical evidence from the Google search engine for the measurement of unithood. Our comparative study using 1,825 test cases against an existing empirically-derived function revealed an improvement in terms of precision, recall and accuracy.

information retrieval, machine learning, natural language, (19 more...)

arXiv: 0810.0139

Country:

Oceania > Australia > Western Australia (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
Europe > Spain > Galicia > Madrid (0.04)
(5 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

On the Use of Automatically Acquired Examples for All-Nouns Word Sense Disambiguation

Martinez, D., Lopez de Lacalle, O., Agirre, E.

Journal of Artificial Intelligence Research, Sep-25-2008

This article focuses on Word Sense Disambiguation (WSD), which is a Natural Language Processing task that is thought to be important for many Language Technology applications, such as Information Retrieval, Information Extraction, or Machine Translation. One of the main issues preventing the deployment of WSD technology is the lack of training examples for Machine Learning systems, also known as the Knowledge Acquisition Bottleneck. A method which has been shown to work for small samples of words is the automatic acquisition of examples. We have previously shown that one of the most promising example acquisition methods scales up and produces a freely available database of 150 million examples from Web snippets for all polysemous nouns in WordNet. This paper focuses on the issues that arise when using those examples, all alone or in addition to manually tagged examples, to train a supervised WSD system for all nouns. The extensive evaluation on both lexical-sample and all-words Senseval benchmarks shows that we are able to improve over commonly used baselines and to achieve top-rank performance. The good use of the prior distributions from the senses proved to be a crucial factor.

proceedings, semcor, sensecorpus, (14 more...)

doi: 10.1613/jair.2395

AI Access Foundation

10569

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(24 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.90)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.66)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.66)

The Information Ecology of Social Media and Online Communities

Finin, Tim (University of Maryland, Baltimore County) | Joshi, Anupam (University of Maryland, Baltimore County) | Kolari, Pranam (Yahoo! Applied Research) | Java, Akshay (University of Maryland, Baltimore County) | Kale, Anubhav (Microsoft) | Karandikar, Amit (Microsoft)

AI Magazine, Sep-15-2008

Citizens, both young and old, are also discovering how social media technology can improve their lives and give them more voice in the world. [...] feeds, and semistructured metadata in the form of extensible markup language (XML) and resource description [...] they provide more useful, trustworthy, and reliable. We begin by describing an overarching task of [...]. Pursuing this task uncovers a number of problems that must be addressed, three of which we describe in [...]. The first is recognizing spam in the form of spam blogs (splogs) and [...]. The second is developing more effective techniques to recognize the social structure of blog communities. [...] the abstract model for the underlying blog network structure and how it evolves. [...] It differs, however, in ways that affect how it should be modeled, analyzed, and exploited. [...] model for the general web is as a directed graph of web pages with undifferentiated links between pages. [...] has a much richer network structure in that there are more types of nodes [...]. For example, the people who contribute to blogs and [...]. Figure 2 shows a hypothetical blog graph and its corresponding flow of information in the influence graph. Studies on influence in social networks and collaboration graphs have typically focused on the task of identifying key individuals who play an important role in propagating information. This is similar to finding authoritative pages on the web.

artificial intelligence, information retrieval, natural language, (18 more...)

Country:

North America > United States > Maryland > Baltimore County (0.15)
North America > United States > Maryland > Baltimore (0.15)

Genre: Research Report > New Finding (0.46)

Industry:

Media (0.99)
Information Technology > Services (0.67)

Technology:

Information Technology > Communications > Web (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.46)

Intelligent Peer Networks for Collaborative Web Search

Menczer, Filippo, Wu, Le-Shin, Akavipat, Ruj

AI Magazine, Sep-15-2008

Collaborative query routing is a new paradigm for Web search that treats both established search engines and other publicly available indices as intelligent peer agents in a search network. The approach makes it transparent for anyone to build their own (micro) search engine, by integrating established Web search services, desktop search, and topical crawling techniques. The challenge in this model is that each of these agents must learn about its environment (the existence, knowledge, diversity, reliability, and trustworthiness of other agents) by analyzing the queries received from and results exchanged with these other agents. We present the 6S peer network, which uses machine learning techniques to learn about the changing query environment. We show that simple reinforcement learning algorithms are sufficient to detect and exploit semantic locality in the network, resulting in efficient routing and high-quality search results. A prototype of 6S is available for public use and is intended to assist in the evaluation of different AI techniques employed by the networked agents.

information retrieval, machine learning, natural language, (19 more...)

Country: North America > United States > California (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Conditioning Probabilistic Databases

Koch, Christoph, Olteanu, Dan

arXiv.org Artificial Intelligence, Jun-16-2008

Past research on probabilistic databases has studied the problem of answering queries on a static database. Application scenarios of probabilistic databases, however, often involve the conditioning of a database using additional information in the form of new evidence. The conditioning problem is thus to transform a probabilistic database of priors into a posterior probabilistic database which is materialized for subsequent query processing or further refinement. It turns out that the conditioning problem is closely related to the problem of computing exact tuple confidence values. It is known that exact confidence computation is an NP-hard problem. This has led researchers to consider approximation techniques for confidence computation. However, neither conditioning nor exact confidence computation can be solved using such techniques. In this paper we present efficient techniques for both problems. We study several problem decomposition methods and heuristics that are based on the most successful search techniques from constraint satisfaction, such as the Davis-Putnam algorithm. We complement this with a thorough experimental evaluation of the algorithms proposed. Our experiments show that our exact algorithms scale well to realistic database sizes and can in some scenarios compete with the most efficient previous approximation algorithms.
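
The conditioning step described in this abstract can be illustrated with a minimal brute-force sketch. It is not the authors' method (the abstract describes decomposition methods and search heuristics); it simply enumerates possible worlds, which is exponential in the number of tuples. The table `priors`, the tuple names, the evidence predicate, and the helper names are all hypothetical and chosen for illustration only.

```python
from fractions import Fraction
from itertools import product

# Hypothetical tuple-independent table: each tuple is present with its own prior.
priors = {"t1": Fraction(1, 2), "t2": Fraction(1, 4)}


def possible_worlds(priors):
    """Yield (world, prior probability) for every subset of the tuples."""
    names = sorted(priors)
    for bits in product([True, False], repeat=len(names)):
        world = frozenset(n for n, present in zip(names, bits) if present)
        prob = Fraction(1)
        for n, present in zip(names, bits):
            prob *= priors[n] if present else 1 - priors[n]
        yield world, prob


def condition(worlds, evidence):
    """Drop worlds that violate the evidence and renormalise the rest."""
    kept = [(w, p) for w, p in worlds if evidence(w)]
    total = sum(p for _, p in kept)
    if total == 0:
        raise ValueError("evidence has prior probability zero")
    return {w: p / total for w, p in kept}


def confidence(posterior, name):
    """Exact confidence of a tuple: total probability of worlds containing it."""
    return sum(p for w, p in posterior.items() if name in w)


# Assumed evidence for illustration: at least one of the two tuples exists.
posterior = condition(possible_worlds(priors), lambda w: len(w) >= 1)
```

Under these assumed priors, the empty world (prior 3/8) is ruled out, so the remaining mass of 5/8 is rescaled to 1 and the posterior confidences become 4/5 for `t1` and 2/5 for `t2`. Exact fractions are used so the renormalisation can be checked without floating-point error.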

artificial intelligence, machine learning, natural language, (19 more...)

arXiv: 0803.2212

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York > Tompkins County > Ithaca (0.04)
North America > Canada (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Cooperative Search with Concurrent Interactions

Manisterski, E., Sarne, D., Kraus, S.

Journal of Artificial Intelligence Research, May-8-2008

In this paper we show how taking advantage of autonomous agents' capability to maintain parallel interactions with others, and incorporating it into the cooperative economic search model, results in a new search strategy which outperforms the strategies currently in use. As a framework for our analysis we use the electronic marketplace, where buyer agents have the incentive to search cooperatively. The new search technique is quite intuitive; however, its analysis and the process of extracting the optimal search strategy are associated with several significant complexities. These difficulties derive mainly from the unbounded search space and the simultaneous dual effects of decisions taken along the search. We provide a comprehensive analysis of the model, highlighting, demonstrating and proving important characteristics of the optimal search strategy. Consequently, we manage to come up with an efficient modular algorithm for extracting the optimal cooperative search strategy for any given environment. A computation-based comparative illustration of the system performance using the new search technique versus the traditional methods is given, emphasizing the main differences in the optimal strategy's structure and the advantage of using the proposed model.

agent, coalition, interaction, (16 more...)

doi: 10.1613/jair.2335

AI Access Foundation

10543

Country:

Asia > Middle East > Israel (0.04)
North America > United States > New York (0.04)

Industry:

Banking & Finance (0.67)
Information Technology > Services > e-Commerce Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

Query-time Entity Resolution

Bhattacharya, I., Getoor, L.

Journal of Artificial Intelligence Research, Dec-27-2007

Entity resolution is the problem of reconciling database references corresponding to the same real-world entities. Given the abundance of publicly available databases that have unresolved entities, we motivate the problem of query-time entity resolution: quick and accurate resolution for answering queries over such 'unclean' databases at query-time. Since collective entity resolution approaches (where related references are resolved jointly) have been shown to be more accurate than independent attribute-based resolution for off-line entity resolution, we focus on developing new algorithms for collective resolution for answering entity resolution queries at query-time. For this purpose, we first formally show that, for collective resolution, precision and recall for individual entities follow a geometric progression as neighbors at increasing distances are considered. Unfolding this progression leads naturally to a two-stage 'expand and resolve' query processing strategy. In this strategy, we first extract the related records for a query using two novel expansion operators, and then resolve the extracted records collectively. We then show how the same strategy can be adapted for query-time entity resolution by identifying and resolving only those database references that are the most helpful for processing the query. We validate our approach on two large real-world publication databases where we show the usefulness of collective resolution and at the same time demonstrate the need for adaptive strategies for query processing. We then show how the same queries can be answered in real-time using our adaptive approach while preserving the gains of collective resolution. In addition to experiments on real datasets, we use synthetically generated data to empirically demonstrate the validity of the performance trends predicted by our analysis of collective entity resolution over a wide range of structural characteristics in the data.

entity resolution, query, resolution, (16 more...)

doi: 10.1613/jair.2290

AI Access Foundation

10524

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > District of Columbia > Washington (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(15 more...)

Genre: Research Report (0.67)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)

Topic and Role Discovery in Social Networks with Experiments on Enron and Academic Email

McCallum, A., Wang, X., Corrada-Emmanuel, A.

Journal of Artificial Intelligence Research, Oct-13-2007

Previous work in social network analysis (SNA) has modeled the existence of links from one entity to another, but not the attributes, such as language content or topics, on those links. We present the Author-Recipient-Topic (ART) model for social network analysis, which learns topic distributions based on the direction-sensitive messages sent between entities. The model builds on Latent Dirichlet Allocation (LDA) and the Author-Topic (AT) model, adding the key attribute that the distribution over topics is conditioned distinctly on both the sender and recipient, steering the discovery of topics according to the relationships between people. We give results on both the Enron email corpus and a researcher's email archive, providing evidence not only that clearly relevant topics are discovered, but that the ART model better predicts people's roles and gives lower perplexity on previously unseen messages. We also present the Role-Author-Recipient-Topic (RART) model, an extension to ART that explicitly represents people's roles.

art model, mccallum, topic and role discovery, (12 more...)

doi: 10.1613/jair.2229

AI Access Foundation

10515

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > California (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Information Technology > Services (0.92)
Energy > Power Industry (0.72)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)
(3 more...)

Practical Approach to Knowledge-based Question Answering with Natural Language Understanding and Advanced Reasoning

Wong, Wilson

arXiv.org Artificial Intelligence, Jul-24-2007

This research hypothesized that a practical approach in the form of a solution framework known as Natural Language Understanding and Reasoning for Intelligence (NaLURI), which combines full-discourse natural language understanding, a powerful representation formalism capable of exploiting ontological information, and a reasoning approach with advanced features, will solve the following problems without compromising practicality factors: 1) restrictions on the nature of question and response, and 2) limitations in scaling across domains and to real-life natural language text.

information retrieval, natural language, question answering, (21 more...)

arXiv: 0707.3559

Country:

Asia > Thailand (0.13)
Asia > South Korea (0.13)
Asia > Malaysia > Kuala Lumpur > Kuala Lumpur (0.04)
(21 more...)

Genre: Research Report > Promising Solution (0.45)

Industry:

Leisure & Entertainment (1.00)
Law > Litigation (1.00)
Law > Intellectual Property & Technology Law (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(6 more...)
