AITopics | Information Retrieval

Information Retrieval

Our accustomed systems of retrieving particular bits of information no longer fill the needs of many people. Searching traditional indexes of print publications has been aided by computerized databases, but still usually requires time-consuming serial searching of one database after the other, and then moving on to other methods of searching for internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where widely different formats contain the information being sought, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.

Local Explanations for Clinical Search Engine results

Contempré, Edeline, Szlávik, Zoltán, Mohammadi, Majid, Velazquez, Erick, Teije, Annette ten, Tiddi, Ilaria

arXiv.org Artificial Intelligence, Oct-19-2021

Health care professionals rely on treatment search engines to efficiently find adequate clinical trials and early access programs for their patients. However, doctors lose trust in the system if its underlying processes are unclear and unexplained. In this paper, a model-agnostic explainable method is developed to provide users with further information regarding the reasons why a clinical trial is retrieved in response to a query. To accomplish this, the engine generates features from clinical trials by using a knowledge graph, clinical trial data and additional medical resources, and a crowd-sourcing methodology is used to determine their importance. Grounded on the proposed methodology, the rationale behind retrieving the clinical trials is explained in layman's terms so that healthcare professionals can effortlessly understand it. In addition, we compute an explainability score for each of the retrieved items, according to which the items can be ranked. The experiments validated by medical professionals suggest that the proposed methodology induces trust in targeted as well as in non-targeted users, and provides them with reliable explanations and ranking of retrieved items.
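
The abstract above describes features weighted by crowd-sourced importance and a per-item explainability score used for ranking. A minimal sketch of that idea follows, assuming the score is the importance-weighted share of features that hold for a trial. The feature names, the weights, and the scoring formula itself are hypothetical illustrations, not the authors' method.

```python
def explainability_score(features, importance):
    """Importance-weighted fraction of features that are true for one item.

    Assumption: `importance` maps feature name -> crowd-sourced weight;
    the paper's actual formula is not given in the abstract.
    """
    total = sum(importance.values())
    if total == 0:
        return 0.0
    matched = sum(w for name, w in importance.items() if features.get(name))
    return matched / total


def rank_by_explainability(items, importance):
    """Sort items by descending score; ties are broken by item id."""
    return sorted(
        items,
        key=lambda item: (-explainability_score(item["features"], importance), item["id"]),
    )


# Hypothetical feature weights and retrieved trials.
importance = {"condition_match": 3.0, "age_eligible": 2.0, "location_nearby": 1.0}
trials = [
    {"id": "T1", "features": {"condition_match": True, "location_nearby": True}},
    {"id": "T2", "features": {"condition_match": True, "age_eligible": True}},
    {"id": "T3", "features": {"location_nearby": True}},
]
ranked_ids = [t["id"] for t in rank_by_explainability(trials, importance)]
```

Here T2 scores 5/6, T1 scores 4/6 and T3 scores 1/6, so the ranking is T2, T1, T3. The matched feature names could also be turned into the plain-language explanation shown to the user.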

clinical trial, explanation, search engine, (13 more...)

arXiv.org Artificial Intelligence

2110.12891

Country:

North America > United States (0.28)
Europe > Netherlands > North Holland > Amsterdam (0.06)
Europe > Denmark > Capital Region > Copenhagen (0.05)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.83)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.68)

Natural Language Processing for Smart Healthcare

Zhou, Binggui, Yang, Guanghua, Shi, Zheng, Ma, Shaodan

arXiv.org Artificial Intelligence, Oct-18-2021

Smart healthcare has achieved significant progress in recent years. Emerging artificial intelligence (AI) technologies enable various smart applications across various healthcare scenarios. As an essential technology powered by AI, natural language processing (NLP) plays a key role in smart healthcare due to its capability of analysing and understanding human language. In this work we review existing studies that concern NLP for smart healthcare from the perspectives of technique and application. We focus on feature extraction and modelling for various NLP tasks encountered in smart healthcare from a technical point of view. In the context of smart healthcare applications employing NLP techniques, the elaboration largely attends to representative smart healthcare scenarios, including clinical practice, hospital management, personal care, public health, and drug development. We further discuss the limitations of current works and identify the directions for future works.

application, healthcare, smart healthcare, (14 more...)

arXiv.org Artificial Intelligence

2110.15803

Country:

Asia > Macao (0.14)
Asia > China > Guangdong Province > Zhuhai (0.04)
North America > United States > New York > New York County > New York City (0.04)
(18 more...)

Genre:

Overview (0.87)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
(4 more...)

AI writes on 'Search Engine'

#artificialintelligence, Oct-17-2021, 15:15:21 GMT

A search engine is a software system that helps to carry out web searches. Search engines search the World Wide Web in a systematic way for particular information specified by users, such as a list of web sites, news stories, a map, a directory listing or a biography of a celebrity. Web search engines use a spider to systematically index the content of web sites. The term "search engine" can be used for the software system, the service that delivers web content, or both. In recent years, search engine optimization (SEO) has become a very popular way for web site owners to attract more traffic to their web sites.

engine, search engine, search result, (12 more...)

#artificialintelligence

Industry: Banking & Finance (0.31)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

Exposing Query Identification for Search Transparency

Li, Ruohan, Li, Jianxiang, Mitra, Bhaskar, Diaz, Fernando, Biega, Asia J.

arXiv.org Artificial Intelligence, Oct-14-2021

Search systems control the exposure of ranked content to searchers. In many cases, creators value not only the exposure of their content but, moreover, an understanding of the specific searches where the content is surfaced. The problem of identifying which queries expose a given piece of content in the ranking results is an important and relatively under-explored search transparency challenge. Exposing queries are useful for quantifying various issues of search bias, privacy, data protection, security, and search engine optimization. Exact identification of exposing queries in a given system is computationally expensive, especially in dynamic contexts such as web search. In quest of a more lightweight solution, we explore the feasibility of approximate exposing query identification (EQI) as a retrieval task by reversing the role of queries and documents in two classes of search systems: dense dual-encoder models and traditional BM25 models. We then propose how this approach can be improved through metric learning over the retrieval embedding space. We further derive an evaluation metric to measure the quality of a ranking of exposing queries, and conduct an empirical analysis focusing on various practical aspects of approximate EQI.
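
The role reversal described above can be illustrated with a small BM25 sketch: a pool of candidate queries is indexed as if the queries were documents, and the document text is used as the search request. The candidate pool, the whitespace tokenizer, and the parameters `k1` and `b` are assumptions made for illustration. The result only approximates the true exposing queries, which are those for which the forward system actually ranks the document highly.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for a real analyzer.
    return text.lower().split()


class BM25Index:
    """Minimal Okapi BM25 over a non-empty list of short texts."""

    def __init__(self, texts, k1=1.2, b=0.75):
        self.k1, self.b = k1, b
        self.bags = [Counter(tokenize(t)) for t in texts]
        self.lengths = [sum(bag.values()) for bag in self.bags]
        self.avg_len = sum(self.lengths) / len(self.lengths)
        n = len(self.bags)
        df = Counter(term for bag in self.bags for term in bag)
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, terms, i):
        """BM25 score of indexed text i for the given request terms."""
        bag, length = self.bags[i], self.lengths[i]
        total = 0.0
        for term in set(terms):
            tf = bag.get(term, 0)
            if tf == 0:
                continue
            norm = tf + self.k1 * (1 - self.b + self.b * length / self.avg_len)
            total += self.idf[term] * tf * (self.k1 + 1) / norm
        return total


def approximate_exposing_queries(document, candidate_queries, top_k=3):
    """Reversed retrieval: index the queries, search with the document."""
    index = BM25Index(candidate_queries)
    terms = tokenize(document)
    scored = [(index.score(terms, i), i) for i in range(len(candidate_queries))]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [candidate_queries[i] for s, i in scored[:top_k] if s > 0]


# Hypothetical query pool and document.
queries = [
    "bm25 ranking function",
    "cheap flights to paris",
    "dense dual encoder retrieval",
    "chocolate cake recipe",
]
document = "We compare dense dual encoder retrieval models against a BM25 ranking baseline"
exposing = approximate_exposing_queries(document, queries)
```

In this toy pool only the two retrieval-related queries share terms with the document, so they are the only ones returned, with the four-term match ranked first.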

information retrieval, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2110.07701

Country:

North America > United States > District of Columbia > Washington (0.05)
Asia (0.05)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Compressibility of Distributed Document Representations

Škrlj, Blaž, Petkovič, Matej

arXiv.org Artificial Intelligence, Oct-14-2021

Contemporary natural language processing (NLP) revolves around learning from latent document representations, generated either implicitly by neural language models or explicitly by methods such as doc2vec or similar. One of the key properties of the obtained representations is their dimension. Whilst the commonly adopted dimensions of 256 and 768 offer sufficient performance on many tasks, it is often unclear whether the default dimension is the most suitable choice for the subsequent downstream learning tasks. Furthermore, representation dimensions are seldom subject to hyperparameter tuning due to computational constraints. The purpose of this paper is to demonstrate that a surprisingly simple and efficient recursive compression procedure can be sufficient to both significantly compress the initial representation and potentially improve its performance when considering the task of text classification. Having smaller and less noisy representations is a desired property during deployment, as orders of magnitude smaller models can significantly reduce the computational overhead and with it the deployment costs. We propose CoRe, a straightforward, representation learner-agnostic framework suitable for representation compression. CoRe's performance is showcased and studied on a collection of 17 real-life corpora from biomedical, news, social media, and literary domains. We explored CoRe's behavior when considering contextual and non-contextual document representations, different compression levels, and 9 different compression algorithms. Current results based on more than 100,000 compression experiments indicate that recursive Singular Value Decomposition offers a very good trade-off between the compression efficiency and performance, making CoRe useful in many existing, representation-dependent NLP pipelines.
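
As a rough illustration of recursive compression with Singular Value Decomposition, the sketch below repeatedly halves the dimension of a document-embedding matrix with a truncated SVD and keeps every intermediate level, so each level could be evaluated on a downstream task. The halving schedule, the mean-centering, and the function names are assumptions; the abstract does not spell out CoRe's actual procedure.

```python
import numpy as np


def svd_project(X, k):
    """Project the rows of X onto its top-k right singular vectors."""
    Xc = X - X.mean(axis=0, keepdims=True)  # assumption: center before SVD
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T


def recursive_compress(X, target_dim, factor=2):
    """Return representations at every level, from X.shape[1] down to target_dim."""
    levels = [X]
    while levels[-1].shape[1] > target_dim:
        k = max(target_dim, levels[-1].shape[1] // factor)
        levels.append(svd_project(levels[-1], k))
    return levels


# Hypothetical input: 200 documents with 256-dimensional embeddings.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 256))
levels = recursive_compress(X, target_dim=32)
```

With these settings the levels have 256, 128, 64 and 32 dimensions. With purely linear projections as used here, the final level spans the same subspace as a single truncated SVD to 32 dimensions, so the sketch mainly shows the control flow and the per-level outputs.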

compression, dimension, representation, (14 more...)

arXiv.org Artificial Intelligence

2110.07595

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
(9 more...)

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Topic-time Heatmaps for Human-in-the-loop Topic Detection and Tracking

Beeferman, Doug, Jiang, Hang

arXiv.org Artificial Intelligence, Oct-12-2021

The essential task of Topic Detection and Tracking (TDT) is to organize a collection of news media into clusters of stories that pertain to the same real-world event. To apply TDT models to practical applications such as search engines and discovery tools, human guidance is needed to pin down the scope of an "event" for the corpus of interest. In this work in progress, we explore a human-in-the-loop method that helps users iteratively fine-tune TDT algorithms so that both the algorithms and the users themselves better understand the nature of the events. We generate a visual overview of the entire corpus, allowing the user to select regions of interest from the overview, and then ask a series of questions to affirm (or reject) that the selected documents belong to the same event. The answers to these questions supplement the training data for the event similarity model that underlies the system.

arxiv preprint arxiv, representation, topic detection, (10 more...)

arXiv.org Artificial Intelligence

2110.07337

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)
Asia > Middle East > Jordan (0.05)
North America > United States > New York > New York County > New York City (0.04)
Europe > Kosovo (0.04)

Genre: Research Report (0.40)

Industry: Media > News (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

ActiveEA: Active Learning for Neural Entity Alignment

Liu, Bing, Scells, Harrisen, Zuccon, Guido, Hua, Wen, Zhao, Genghong

arXiv.org Artificial Intelligence, Oct-12-2021

Entity Alignment (EA) aims to match equivalent entities across different Knowledge Graphs (KGs) and is an essential step of KG fusion. Current mainstream methods -- neural EA models -- rely on training with seed alignment, i.e., a set of pre-aligned entity pairs which are very costly to annotate. In this paper, we devise a novel Active Learning (AL) framework for neural EA, aiming to create highly informative seed alignment to obtain more effective EA models with less annotation cost. Our framework tackles two main challenges encountered when applying AL to EA: (1) How to exploit dependencies between entities within the AL strategy. Most AL strategies assume that the data instances to sample are independent and identically distributed. However, entities in KGs are related. To address this challenge, we propose a structure-aware uncertainty sampling strategy that can measure the uncertainty of each entity as well as its impact on its neighbour entities in the KG. (2) How to recognise entities that appear in one KG but not in the other KG (i.e., bachelors). Identifying bachelors would likely save annotation budget. To address this challenge, we devise a bachelor recognizer designed to alleviate the effect of sampling bias. Empirical results show that our proposed AL strategy can significantly improve sampling quality with good generality across different datasets, EA models and numbers of bachelors.
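
The structure-aware sampling idea in challenge (1) can be sketched as follows, assuming each entity's own uncertainty is a margin between its two best match probabilities and that its influence is modelled by adding a weighted sum of its neighbours' uncertainties. The margin measure, the weight `alpha`, and the one-hop aggregation are illustrative assumptions rather than the paper's formulation.

```python
def margin_uncertainty(probs):
    """1 minus the gap between the two highest match probabilities."""
    top = sorted(probs, reverse=True)
    second = top[1] if len(top) > 1 else 0.0
    return 1.0 - (top[0] - second)


def structure_aware_scores(match_probs, neighbours, alpha=0.5):
    """Own uncertainty plus alpha times the uncertainty of one-hop neighbours."""
    own = {e: margin_uncertainty(p) for e, p in match_probs.items()}
    return {
        e: own[e] + alpha * sum(own[n] for n in neighbours.get(e, ()))
        for e in own
    }


def select_batch(match_probs, neighbours, budget, alpha=0.5):
    """Pick the `budget` entities with the highest structure-aware score."""
    scores = structure_aware_scores(match_probs, neighbours, alpha)
    return sorted(scores, key=lambda e: (-scores[e], e))[:budget]


# Hypothetical model outputs: candidate-match probabilities per entity.
match_probs = {
    "a": [0.9, 0.1],    # confident on its own
    "b": [0.5, 0.5],
    "c": [0.6, 0.4],
    "d": [0.55, 0.45],
}
# Hypothetical KG structure: "a" is a hub connected to the uncertain entities.
neighbours = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"]}

batch = select_batch(match_probs, neighbours, budget=2)
plain = select_batch(match_probs, neighbours, budget=2, alpha=0.0)
```

Plain uncertainty sampling (`alpha=0.0`) picks `b` and `d`, while the structure-aware variant picks the hub `a` first, because annotating it is assumed to help its uncertain neighbours.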

alignment, bachelor, ea model, (13 more...)

arXiv.org Artificial Intelligence

2110.06474

Country:

North America > United States > Texas > Harris County > Houston (0.04)
North America > United States > New York (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
(16 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.46)

This new Google search engine feature will compete with Facebook, Twitter in curating news

USATODAY - Tech Top Stories, Oct-11-2021, 21:03:14 GMT

Google is developing a new feature called Big Moments, which will compete with rivals Facebook and Twitter in delivering the latest breaking news updates during major events. The COVID-19 pandemic forced the search engine to react quickly and constantly to its users' needs for the latest and most authoritative information, according to Google. A team at Google has been working on the project for over a year, after the company struggled to provide the latest updates on the U.S. Capitol attack in January and Black Lives Matter protests last summer, says The Information, a Silicon Valley-based technology news site. Big Moments hopes to build upon Google's Full Coverage feature, which it launched in Google News in 2018 and later integrated with its search engine in March of 2021. Full Coverage allows users to tap into a news headline and see how that story is reported from a variety of sources.

google, new google search engine feature, twitter, (8 more...)

USATODAY - Tech Top Stories

Country:

North America > United States > California (0.26)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.06)
Europe > France (0.06)

Industry:

Media > News (1.00)
Information Technology > Services (0.89)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.37)
Health & Medicine > Therapeutic Area > Immunology (0.37)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.85)

How Artificial Intelligence Is Powering Search Engines

#artificialintelligence, Oct-8-2021, 10:20:08 GMT

Whether you are a customer searching for your favorite products online, a writer looking for the latest statistics, or a business owner learning SEO skills, you are using a search engine to get answers. And search engines are pretty interesting! You open up your favorite one, add some related keywords and click to search. Within a fraction of a second, you get thousands of results for your entered keyword. Search engines can perform the way they do because of the algorithms they have and a lot of brilliant people powering them.

artificial intelligence, intelligence, search engine, (12 more...)

#artificialintelligence

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

Robustness Evaluation of Transformer-based Form Field Extractors via Form Attacks

Xue, Le, Gao, Mingfei, Chen, Zeyuan, Xiong, Caiming, Xu, Ran

arXiv.org Artificial Intelligence, Oct-8-2021

We propose a novel framework to evaluate the robustness of transformer-based form field extraction methods via form attacks. We introduce 14 novel form transformations to evaluate the vulnerability of the state-of-the-art field extractors against form attacks from both OCR level and form level, including OCR location/order rearrangement, form background manipulation and form field-value augmentation. We conduct robustness evaluation using real invoices and receipts, and perform comprehensive research analysis. Experimental results suggest that the evaluated models are very susceptible to form perturbations such as the variation of field-values (~15% drop in F1 score), the disarrangement of input text order (~15% drop in F1 score) and the disruption of the neighboring words of field-values (~10% drop in F1 score). Guided by the analysis, we make recommendations to improve the design of field extractors and the process of data collection.
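
One of the perturbations named above, the disarrangement of input text order, can be sketched as an evaluation loop that scores an extractor on clean forms and again on perturbed ones. The extractor below is a deliberately naive rule-based stand-in, not a transformer model. The reversal attack, the token format, and the exact-match accuracy metric (instead of F1) are all assumptions made to keep the example small.

```python
def extract_total(tokens):
    """Toy stand-in extractor: returns the word that follows 'Total:'."""
    words = [word for word, _box in tokens]
    for i, word in enumerate(words[:-1]):
        if word == "Total:":
            return words[i + 1]
    return None


def reverse_reading_order(tokens):
    """Order attack: same words and boxes, fed to the model in reverse order."""
    return list(reversed(tokens))


def accuracy(extractor, forms, attack=None):
    """Share of forms whose extracted value exactly matches the gold value."""
    correct = 0
    for tokens, gold in forms:
        perturbed = attack(tokens) if attack else tokens
        correct += extractor(perturbed) == gold
    return correct / len(forms)


# Hypothetical OCR output: (word, (x0, y0, x1, y1)) pairs plus the gold total.
forms = [
    ([("Invoice", (0, 0, 50, 10)), ("Total:", (0, 20, 40, 30)),
      ("42.00", (45, 20, 80, 30)), ("Date:", (0, 40, 35, 50)),
      ("2021-10-08", (40, 40, 110, 50))], "42.00"),
    ([("Total:", (0, 0, 40, 10)), ("17.50", (45, 0, 80, 10)),
      ("Thanks", (0, 20, 50, 30))], "17.50"),
]

clean = accuracy(extract_total, forms)
attacked = accuracy(extract_total, forms, attack=reverse_reading_order)
drop = clean - attacked
```

The bounding boxes are untouched, so a model that relied on layout rather than input order would be unaffected, while this order-dependent stand-in fails on every perturbed form.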

field extractor, robustness, transformation, (13 more...)

arXiv.org Artificial Intelligence

2110.04413

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.62)
