AITopics | Information Retrieval


Information Retrieval

Our accustomed systems for retrieving particular pieces of information no longer meet the needs of many people. Computerized databases have aided searches of traditional indexes of print publications, but such searches still usually require time-consuming serial searching of one database after another, followed by other methods for finding internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age when the information being sought is held in widely different formats, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.


Building a Visual Search Engine - Part 2: The Search Engine - KDnuggets

#artificialintelligence | Feb-17-2022, 15:05:57 GMT

Editor's note: You can find part one of this article here. Task: Generate a ranked list of images that are semantically similar to the query image. We will split our dataset into two parts: training and evaluation. From each class, we will randomly sample 20 images to form the evaluation set. The remaining images will make up the training set.
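The split described above could be sketched as follows. This is a minimal illustration, not the article's code: the function name `split_per_class`, the `(path, label)` pair format, the fixed seed, and the toy dataset are all assumptions made for the example.

```python
import random
from collections import defaultdict


def split_per_class(labeled_images, eval_per_class=20, seed=0):
    """Hold out `eval_per_class` randomly chosen images from each class."""
    by_class = defaultdict(list)
    for path, label in labeled_images:
        by_class[label].append(path)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, evaluation = [], []
    for label, paths in sorted(by_class.items()):
        # Never ask for more samples than the class actually contains.
        held_out = set(rng.sample(paths, min(eval_per_class, len(paths))))
        for path in paths:
            target = evaluation if path in held_out else train
            target.append((path, label))
    return train, evaluation


# Hypothetical toy dataset: 3 classes with 50 images each.
dataset = [(f"{label}/{i:03d}.jpg", label)
           for label in ("bag", "shoe", "watch") for i in range(50)]
train_set, eval_set = split_per_class(dataset)
print(len(train_set), len(eval_set))
```

Sampling per class, rather than over the whole dataset, keeps every class equally represented in the evaluation set even when class sizes differ.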

evaluation, search engine, visual search engine, (10 more...)


Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.92)


Case law retrieval: problems, methods, challenges and evaluations in the last 20 years

Locke, Daniel, Zuccon, Guido

arXiv.org Artificial Intelligence | Feb-15-2022

Case law retrieval is the retrieval of judicial decisions relevant to a legal question. Case law retrieval takes up a significant amount of a lawyer's time, and is important to ensure accurate advice and reduce workload. We survey methods for case law retrieval from the past 20 years and outline the problems and challenges facing evaluation of case law retrieval systems going forward. Limited published work has focused on improving ranking in ad-hoc case law retrieval. But there has been significant work in other areas of case law retrieval, and legal information retrieval generally. This is likely due to legal search providers being unwilling to give up the secrets of their success to competitors. Most evaluations of case law retrieval have been undertaken on small collections and focus on related tasks such as question-answer systems or recommender systems. Work has not focused on Cranfield-style evaluations, and baselines of methods for case law retrieval on publicly available test collections do not exist. This presents a major challenge going forward. But there are reasons to question the extent of this problem, at least in a commercial setting. Without test collections against which to baseline approaches, it cannot be known whether methods are promising. Works by commercial legal search providers show the effectiveness of natural language systems as well as query expansion for case law retrieval. Machine learning is being applied to more and more legal search tasks, and undoubtedly this represents the future of case law retrieval.
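For readers unfamiliar with the term, a Cranfield-style evaluation scores a system's ranked results against a fixed set of queries and relevance judgments. The sketch below computes mean average precision, one commonly used measure in such evaluations. The case identifiers, queries, and judgments are invented for illustration and do not come from the survey.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list against a set of judged-relevant ids."""
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0


def mean_average_precision(run, judgments):
    """Mean of per-query average precision over all judged queries."""
    scores = [average_precision(run.get(query, []), relevant)
              for query, relevant in judgments.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical test collection: relevance judgments per query.
judgments = {"q1": {"caseA", "caseC"}, "q2": {"caseB"}}
# Hypothetical system output: a ranked list of case ids per query.
run = {"q1": ["caseA", "caseB", "caseC"], "q2": ["caseC", "caseB", "caseA"]}

map_score = mean_average_precision(run, judgments)
print(round(map_score, 4))
```

Because the queries and judgments are fixed, two retrieval methods can be compared by running both against the same collection and comparing their scores.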

case law retrieval, challenge and evaluation

arXiv:2202.07209

Genre: Research Report (0.69)

Industry: Law (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.53)


Clinical-Longformer and Clinical-BigBird: Transformers for long clinical sequences

Li, Yikuan, Wehbe, Ramsey M., Ahmad, Faraz S., Wang, Hanyin, Luo, Yuan

arXiv.org Artificial Intelligence | Feb-12-2022

Transformer-based models, such as BERT, have dramatically improved performance on various natural language processing tasks. The clinical-knowledge-enriched model ClinicalBERT also achieved state-of-the-art results when applied to clinical named entity recognition and natural language inference tasks. One of the core limitations of these transformers is the substantial memory consumption due to their full self-attention mechanism. To overcome this, long-sequence transformer models, e.g. Longformer and BigBird, were proposed with the idea of a sparse attention mechanism that reduces memory usage from quadratic in the sequence length to linear. These models extended the maximum input sequence length from 512 to 4096, which enhanced the ability to model long-term dependencies and consequently achieved optimal results in a variety of tasks. Inspired by the success of these long-sequence transformer models, we introduce two domain-enriched language models, namely Clinical-Longformer and Clinical-BigBird, which are pre-trained on large-scale clinical corpora. We evaluate both pre-trained models using 10 baseline tasks including named entity recognition, question answering, and document classification tasks. The results demonstrate that Clinical-Longformer and Clinical-BigBird consistently and significantly outperform ClinicalBERT as well as other short-sequence transformers in all downstream tasks. We have made our source code available at [https://github.com/luoyuanlab/Clinical-Longformer] and the pre-trained models available for public download at [https://huggingface.co/yikuan8/Clinical-Longformer].
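To make the quadratic-versus-linear contrast concrete, the sketch below counts how many query-key pairs are scored under full self-attention and under a simple sliding-window pattern. It is a simplification: Longformer and BigBird combine local windows with other patterns (such as global tokens), and the window size of 512 used here is an illustrative assumption rather than a setting taken from the paper.

```python
def full_attention_pairs(seq_len):
    """Every token attends to every token: seq_len ** 2 scored pairs."""
    return seq_len * seq_len


def sliding_window_pairs(seq_len, window):
    """Each token attends only to neighbours within window // 2 positions on each side."""
    half = window // 2
    total = 0
    for i in range(seq_len):
        lo = max(0, i - half)              # clip the window at the sequence start
        hi = min(seq_len - 1, i + half)    # and at the sequence end
        total += hi - lo + 1
    return total


# Compare growth as the sequence gets longer (window size is an assumption).
for n in (512, 1024, 2048, 4096):
    print(n, full_attention_pairs(n), sliding_window_pairs(n, window=512))
```

Doubling the sequence length quadruples the full-attention count, while the windowed count is bounded by `seq_len * (window + 1)` and so grows roughly in proportion to the length.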

arxiv preprint arxiv, clinical-longformer and clinical-bigbird, sequence, (10 more...)

arXiv:2201.11838

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Indiana (0.04)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.96)
Health & Medicine > Diagnostic Medicine (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Bienvenu

AAAI Conferences | Feb-8-2022, 12:57:59 GMT

While query answering in the presence of description logic (DL) ontologies is a well-studied problem, questions of static analysis such as query containment and query optimization have received less attention. In this paper, we study a rather general version of query containment that, unlike the classical version, cannot be reduced to query answering. First, we allow a restriction to be placed on the vocabulary used in the instance data, which can result in shorter equivalent queries; and second, we allow each query its own ontology rather than assuming a single ontology for both queries, which is crucial in applications to versioning and modularity. We also study global minimization of queries in the presence of DL ontologies, which is more subtle than for classical databases as minimal queries need not be isomorphic.

bienvenu, ontology, query, (1 more...)


Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Description Logic (0.68)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.68)


Libkin

AAAI Conferences | Feb-8-2022, 12:54:32 GMT

The standard way of answering queries over incomplete databases is to compute certain answers, defined as the intersection of query answers on all complete databases that the incomplete database represents. But is this universally accepted definition correct? We argue that this "one-size-fits-all" definition can often lead to counterintuitive or just plain wrong results, and propose an alternative framework for defining certain answers. The idea of the framework is to move away from the assumption, standard in the database literature, that query results be given in the form of a database object, and to allow instead two alternative representations of answers: as objects defining all other answers, or as knowledge we can deduce with certainty about all such answers. We show that the latter is often easier to achieve than the former, and that in general certain answers need not be defined as an intersection and may well contain missing information. We also show that with a proper choice of semantics, we can often reduce computing certain answers, as either objects or knowledge, to standard query evaluation. We describe the framework in the most general way, applicable to a variety of data models, and test it on three concrete relational semantics of incompleteness: open, closed, and weak closed world.
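The intersection-based definition that the abstract questions can be illustrated with a small sketch. It assumes a single relation, nulls that may each take any value from a small finite domain, and queries written as Python functions; the relation, domain, and function names are hypothetical and chosen only for the example.

```python
from itertools import product

NULL = None  # marker for a missing value


def completions(rows, domain):
    """Yield every complete relation obtained by replacing each NULL with a domain value."""
    null_positions = [(r, c) for r, row in enumerate(rows)
                      for c, value in enumerate(row) if value is NULL]
    for values in product(domain, repeat=len(null_positions)):
        filled = [list(row) for row in rows]
        for (r, c), value in zip(null_positions, values):
            filled[r][c] = value
        yield {tuple(row) for row in filled}


def certain_answers(rows, domain, query):
    """Intersect the query's answers over all completions of the incomplete relation."""
    answers = None
    for database in completions(rows, domain):
        result = query(database)
        answers = result if answers is None else answers & result
    return answers if answers is not None else set()


# Hypothetical incomplete relation employee(name, dept): bob's department is unknown.
employee = [("ann", "sales"), ("bob", NULL)]
departments = ["sales", "hr"]

in_sales = certain_answers(employee, departments,
                           lambda db: {(name,) for name, dept in db if dept == "sales"})
all_rows = certain_answers(employee, departments, lambda db: set(db))
print(in_sales, all_rows)
```

In this toy case the identity query's certain answer keeps only ann's row: bob disappears from the intersection even though every completion contains a row for him, because the intersection has no way to carry a row with a missing value.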

certain answer, intersection, libkin, (2 more...)


Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.63)


Alfeld

AAAI Conferences | Feb-8-2022, 12:48:56 GMT

Machine teaching (MT) studies the task of designing a training set. Specifically, given a learner (e.g., an artificial neural network or a human) and a target model, a teacher aims to create a training set which results in the target model being learned. MT applications include optimal education design for human learners and computer security where adversaries aim to attack learning-based systems. In this work, we formulate pool-based MT as a state space search problem. We discuss the properties and challenges of the resulting problem and highlight opportunities for novel search techniques. In our preliminary study we use a beam search approach, and find that training models and evaluating their empirical risk dominate the run time of the search. Toward the goal of better search techniques for future work, we develop optimizations ranging from implementation details for specific learners to algorithm changes applicable to general blackbox learners. We conclude with a discussion of open problems and research directions.
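A rough sketch of pool-based machine teaching as beam search over training sets might look like the following. It is not the paper's algorithm: the state representation (a sorted tuple of pool indices), the toy learner that outputs the mean of its training data, and the absolute-error loss against a target value are all assumptions chosen to keep the example small.

```python
def beam_search_teaching(pool, learn, loss, beam_width=3, max_size=3):
    """Search over subsets of `pool` for a training set whose learned model has low loss."""
    beam = [()]  # each state is a sorted tuple of pool indices
    best_state, best_loss = (), float("inf")
    for _ in range(max_size):
        scores = {}
        for state in beam:
            for idx in range(len(pool)):
                if idx in state:
                    continue
                child = tuple(sorted(state + (idx,)))
                if child not in scores:
                    # Training the learner and scoring it is the expensive step.
                    scores[child] = loss(learn([pool[i] for i in child]))
        if not scores:
            break
        # Keep only the `beam_width` most promising training sets.
        beam = sorted(scores, key=lambda s: (scores[s], s))[:beam_width]
        if scores[beam[0]] < best_loss:
            best_state, best_loss = beam[0], scores[beam[0]]
    return [pool[i] for i in best_state], best_loss


# Hypothetical setup: the learner estimates a mean, the target model is 5.0.
pool = [1.0, 2.0, 4.0, 7.0, 10.0]
target = 5.0
teaching_set, final_loss = beam_search_teaching(
    pool,
    learn=lambda data: sum(data) / len(data),
    loss=lambda model: abs(model - target),
)
print(teaching_set, final_loss)
```

Every candidate state triggers one call to the learner, which is consistent with the abstract's observation that training and risk evaluation dominate the run time. Because the beam prunes states, the search is heuristic: here the subset [1.0, 4.0, 10.0] would teach the target exactly, but it is not guaranteed to be found.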

alfeld, learner, search technique, (1 more...)


Industry: Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.92)


Camacho

AAAI Conferences | Feb-8-2022, 11:14:29 GMT

The evolution of electronic sources connected through wide area networks like the Internet has encouraged the development of new information gathering techniques that go beyond traditional information retrieval and WEB search methods. They use advanced techniques, like planning or constraint programming, to integrate and reason about heterogeneous information sources. In this paper we describe MAPWEB, a multiagent framework that integrates planning agents and WEB information retrieval agents. The goal of this framework is to deal with problems that require planning with information to be gathered from the WEB.

camacho, information gathering, mapweb, (1 more...)


Industry: Consumer Products & Services > Travel (0.41)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.62)


Nareyek

AAAI Conferences | Feb-8-2022, 09:54:59 GMT

In this paper, we consider advanced path-planning problems that involve finding paths for multiple units subject to rich path constraints. Examples of richer constraints include following other units or staying out of sight of a specific unit. Little attention has so far been given to richer path-planning problems where the objective is more than reaching a specific destination from a starting point such that the path length is minimized. Richer path-planning problems occur in many complex real-world scenarios, ranging from computer games to military movement planning. In this paper, a novel way to formally specify such problems and a new local-search strategy to solve them are proposed and demonstrated by a prototype implementation. Among the design goals are real-time computability as well as extensibility for new constraints and search heuristics.

constraint, nareyek


Industry: Leisure & Entertainment > Games > Computer Games (0.31)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Information Management > Search (0.68)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.68)


How To Improve SEO Results With AI-Based Search Engine Modeling

#artificialintelligence | Feb-8-2022, 07:01:29 GMT

Is your search engine marketing strategy based on industry-wide best practices? Confused because you're not getting the results you want? You may need personalized SEO recommendations that just aren't applicable to everyone. AI-based search engine modeling can improve your SEO results with personalized solutions. Search engines are constantly evolving.

ai-based search engine modeling, engine modeling, ranking factor, (6 more...)


Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)


Global Big Data Conference

#artificialintelligence | Feb-4-2022, 17:55:07 GMT

Many, if not most, search engines in use today are based on keywords, in which the search engine attempts to find the best match for a word or set of words used as input. It's a tried-and-true method that has been deployed millions of times over decades of use. But new search approaches based on deep learning, including vector search and neural search, have emerged recently, and early backers say they have the potential to shake up the search market. Vector search uses a fundamentally different approach to finding the best fit between a term provided as input to the engine and the result that is presented to the user. Instead of powering the search by doing a direct one-to-one matching of keywords, in vector search, the engine attempts to match the input term to a vector, which is an array of features generated from objects in the catalog.
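The contrast with keyword matching can be sketched in a few lines. The example below ranks catalog items by cosine similarity between a query vector and each item's feature vector. The three-dimensional vectors are made up for illustration; in practice the features would come from a trained model, and large catalogs would typically use an approximate nearest-neighbour index rather than the exhaustive scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_search(query_vector, catalog, top_k=2):
    """Return the `top_k` catalog items whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), item)
              for item, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:top_k]]


# Hypothetical feature vectors for three catalog objects.
catalog = {
    "running shoes": [0.9, 0.1, 0.0],
    "hiking boots": [0.7, 0.3, 0.1],
    "coffee maker": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.0]  # hypothetical embedding of the user's input
results = vector_search(query, catalog)
print(results)
```

No keyword from the query needs to appear in an item's name for it to rank highly; only the closeness of the vectors matters.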

global big data conference, vector, vector search, (9 more...)


Technology:

Information Technology > Information Management > Search (0.87)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.42)
