AITopics | Information Retrieval

Information Retrieval

Our accustomed systems for retrieving particular pieces of information no longer meet the needs of many people. Computerized databases have made it easier to search traditional indexes of print publications, but the process still usually requires time-consuming serial searching of one database after another, followed by other methods of searching for internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where widely different formats contain the information being sought, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.

How Search Engines Use Machine Learning: 9 Things We Know For Sure

#artificialintelligence Aug-20-2021, 08:35:59 GMT

Tech giants are investing heavily in machine learning. In 2019, Microsoft invested in 11 artificial intelligence (AI) startups, with $1 billion for OpenAI alone. In that same year, Intel Capital made 19 investments, and Google Ventures made 16 investments. That huge influx of capital means that AI computing power is making rapid advancements in a range of sectors from healthcare to construction to marketing and search engine optimization. However, before we get into the implications of machine learning for SEO professionals, let's define what we mean by AI.

google, search engine, search result, (12 more...)

#artificialintelligence

Country:

North America > United States > New York (0.05)
North America > United States > California > San Diego County > San Diego (0.05)

Industry:

Information Technology (0.49)
Health & Medicine (0.35)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.78)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Web image search engine based on LSH index and CNN Resnet50

Parola, Marco, Nannini, Alice, Poleggi, Stefano

arXiv.org Artificial Intelligence Aug-20-2021

To implement a good Content-Based Image Retrieval (CBIR) system, it is essential to adopt efficient search methods. One way to achieve this result is to exploit approximate search techniques. In fact, when we deal with very large collections of data, using an exact search method makes the system very slow. In this project, we adopt the Locality Sensitive Hashing (LSH) index to implement a CBIR system that allows us to perform fast similarity search on deep features. Specifically, we exploit transfer learning techniques to extract deep features from images; this phase is done using two well-known Convolutional Neural Networks (CNNs) as feature extractors: Resnet50 and Resnet50v2, both pre-trained on ImageNet. Then we try out several fully connected deep neural networks, built on top of both of the previously mentioned CNNs, in order to fine-tune them on our dataset. In both cases, we index the features within our LSH index implementation and within a sequential scan, to better understand how much the introduction of the index affects the results. Finally, we carry out a performance analysis: we evaluate the relevance of the result set by computing the mAP (mean Average Precision) value obtained during the different experiments with respect to the number of comparisons performed, while varying the hyper-parameter values of the LSH index.
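The idea of trading exactness for speed with an LSH index can be illustrated with a minimal sketch. It assumes the random-hyperplane (sign of random projection) family of hash functions and cosine similarity; the abstract does not say which LSH family or distance the authors use, and the names `LSHIndex`, `num_bits`, and `num_tables` are hypothetical. Feature vectors are plain Python lists standing in for CNN features.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed):
    """Draw num_bits random Gaussian hyperplanes in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_vector(vec, hyperplanes):
    """One bit per hyperplane: the side of the plane on which vec lies."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0


class LSHIndex:
    """Hypothetical multi-table LSH index (assumed design, not the authors')."""

    def __init__(self, dim, num_bits=8, num_tables=4, seed=0):
        # Each table has its own hyperplanes and its own bucket map.
        self.tables = [
            (make_hyperplanes(num_bits, dim, seed + t), defaultdict(list))
            for t in range(num_tables)
        ]
        self.vectors = {}

    def add(self, key, vec):
        self.vectors[key] = vec
        for planes, buckets in self.tables:
            buckets[hash_vector(vec, planes)].append(key)

    def query(self, vec, k=5):
        """Return (top-k keys, number of candidates actually compared)."""
        candidates = set()
        for planes, buckets in self.tables:
            candidates.update(buckets.get(hash_vector(vec, planes), []))
        ranked = sorted(
            candidates, key=lambda c: cosine(vec, self.vectors[c]), reverse=True
        )
        return ranked[:k], len(candidates)
```

The second value returned by `query` is the number of comparisons performed, which is the quantity one would plot against retrieval quality when varying `num_bits` and `num_tables`. A sequential scan corresponds to comparing the query against every stored vector.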

image search engine, resnet50, resnet50v2, (12 more...)

arXiv.org Artificial Intelligence

2108.13301

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Tuscany > Pisa Province > Pisa (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.53)

AI DBT Impact on Mammography Post-breast Therapy

#artificialintelligence Aug-12-2021, 12:50:23 GMT

AI-CAD marked axillary lymph node and region in right upper outer quadrant (arrows and thin line outlining both sites) and assigned an abnormality score of 28%.

August 11, 2021 -- According to an open-access Editor's Choice article in the American Journal of Roentgenology (AJR), artificial intelligence-based computer-aided detection (AI-CAD) can be a practical addition for lowering false-positive findings when performing post-breast conserving therapy (BCT) surveillance mammography. "After BCT, adjunct digital breast tomosynthesis (DBT) or AI-CAD reduced recall rates and improved accuracy in the ipsilateral and contralateral breasts compared with digital mammography (DM)," wrote lead investigator Jung Hyun Yoon. "In the ipsilateral breast, addition of AI-CAD resulted in lower recall rate and higher accuracy than addition of DBT." Yoon and colleagues' single-center retrospective study included 314 women (mean age, 53.2 years; 4 with bilateral breast cancer) who underwent BCT followed by DBT (mean interval from surgery to DBT, 15.2 months). Three breast radiologists independently reviewed images in three sessions: DM, DM with DBT, and DM with AI-CAD.

ai-cad, breast, mammography post-breast therapy, (9 more...)

#artificialintelligence

Industry:

Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.68)

TPRM: A Topic-based Personalized Ranking Model for Web Search

Huang, Minghui, Peng, Wei, Wang, Dong

arXiv.org Artificial Intelligence Aug-12-2021

Ranking models have achieved promising results, but it remains challenging to design personalized ranking systems that leverage user profiles and semantic representations between queries and documents. In this paper, we propose a topic-based personalized ranking model (TPRM) that integrates a user's topical profile with pretrained contextualized term representations to tailor the general document ranking list. Experiments on a real-world dataset demonstrate that TPRM significantly outperforms state-of-the-art ad-hoc ranking models and personalized ranking models. Keywords: personalized ranking model · personalized search · topic model · user profile.

topic-based personalized ranking model, tprm, user profile, (12 more...)

arXiv.org Artificial Intelligence

2108.06014

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

GQE-PRF: Generative Query Expansion with Pseudo-Relevance Feedback

Huang, Minghui, Wang, Dong, Liu, Shuang, Ding, Meizhen

arXiv.org Artificial Intelligence Aug-12-2021

Query expansion with pseudo-relevance feedback (PRF) is a powerful approach to enhancing the effectiveness of information retrieval. Recently, with the rapid advance of deep learning techniques, neural text generation has achieved promising success in many natural language tasks. To leverage the strength of text generation for information retrieval, in this article, we propose a novel approach that effectively integrates text generation models into PRF-based query expansion. In particular, our approach generates augmented query terms via neural text generation models conditioned on both the initial query and pseudo-relevance feedback. Moreover, in order to train the generative model, we adopt conditional generative adversarial nets (CGANs) and propose the PRF-CGAN method, in which both the generator and the discriminator are conditioned on the pseudo-relevance feedback. We evaluate the performance of our approach on information retrieval tasks using two benchmark datasets. The experimental results show that our approach matches or outperforms traditional query expansion methods on both the retrieval and reranking tasks.
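For readers unfamiliar with pseudo-relevance feedback, the following is a minimal sketch of the traditional, non-generative form of PRF query expansion that the abstract uses as its point of comparison. It is not the paper's PRF-CGAN method. The scoring function (raw count of query-term occurrences), the whitespace tokenizer, and the names `expand_query`, `top_docs`, and `new_terms` are all illustrative assumptions.

```python
from collections import Counter


def score(query_terms, doc_terms):
    """Assumed toy relevance score: occurrences of query terms in the document."""
    counts = Counter(doc_terms)
    return sum(counts[t] for t in query_terms)


def expand_query(query, docs, top_docs=2, new_terms=2):
    """Expand a query with frequent terms from the top-ranked documents.

    The top-ranked documents are treated as if they were relevant
    (hence "pseudo" relevance), without any user judgment.
    """
    q = query.lower().split()
    tokenized = [d.lower().split() for d in docs]

    # Initial retrieval: rank documents by the toy score (stable for ties).
    ranked = sorted(
        range(len(docs)), key=lambda i: score(q, tokenized[i]), reverse=True
    )

    # Collect terms from the feedback documents, skipping the query's own terms.
    feedback = Counter()
    for i in ranked[:top_docs]:
        feedback.update(t for t in tokenized[i] if t not in q)

    extra = [term for term, _ in feedback.most_common(new_terms)]
    return q + extra
```

In a real system the expanded query would be issued against the index for a second retrieval pass; a generative approach replaces the frequency-based term selection with a text generation model conditioned on the query and the feedback documents.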

expansion, generation model, query expansion, (14 more...)

arXiv.org Artificial Intelligence

2108.0601

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Promising Solution (0.54)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Survey on Deep Reinforcement Learning for Data Processing and Analytics

Cai, Qingpeng, Cui, Can, Xiong, Yiyuan, Wang, Wei, Xie, Zhongle, Zhang, Meihui

arXiv.org Artificial Intelligence Aug-11-2021

In the age of big data, data processing and analytics are fundamental, ubiquitous, and crucial to many organizations that undertake a digitalization journey to improve and transform their businesses and operations. Data analytics typically entails other key operations such as data acquisition, data cleansing, data integration, and modeling before insights can be extracted. Big data can unleash significant value creation across many sectors such as health care and retail [56]. However, the complexity of data (e.g., high volume, high velocity, and high variety) presents many challenges in data analytics and hence makes it difficult to draw meaningful insights. To tackle these challenges and make data processing and analytics efficient and effective, researchers and practitioners have designed many algorithms and techniques and developed numerous learning systems, such as Spark MLlib [63] and Rafiki [104]. To support fast data processing and accurate data analytics, a huge number of algorithms rely on rules that are developed based on human knowledge and experience. For example, shortest-job-first is a scheduling algorithm that chooses the job with the smallest execution time for the next execution. However, without fully exploiting characteristics of the workload, it can achieve inferior performance compared to a DRL-based scheduling algorithm [58].
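The shortest-job-first rule mentioned above can be sketched in a few lines. This is a minimal illustration assuming non-preemptive scheduling on a single worker, with all jobs available at time zero and execution times known in advance; the function name and the example job names are hypothetical.

```python
def shortest_job_first(jobs):
    """Schedule jobs in order of increasing execution time.

    jobs: dict mapping job name -> execution time.
    Returns (execution order, average waiting time).
    """
    if not jobs:
        return [], 0.0

    # The rule itself: always run the job with the smallest execution time next.
    order = sorted(jobs, key=jobs.get)

    waits = []
    clock = 0
    for name in order:
        waits.append(clock)   # time this job spent waiting before it started
        clock += jobs[name]
    return order, sum(waits) / len(waits)
```

For jobs with execution times 6, 2, and 4, the rule runs them in the order 2, 4, 6, so the waiting times are 0, 2, and 6. The rule is fixed in advance and ignores workload characteristics such as arrival patterns, which is the limitation the abstract contrasts with learned schedulers.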

international conference, it software, upstream oil & gas, (28 more...)

arXiv.org Artificial Intelligence

2108.04526

Country:

Asia (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:

Overview (1.00)
Workflow (0.67)
Research Report (0.64)

Industry:

Information Technology > Software (1.00)
Health & Medicine (1.00)
Banking & Finance > Trading (1.00)
Energy > Oil & Gas > Upstream (0.67)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Adaptive Multi-Resolution Attention with Linear Complexity

Zhang, Yao, Ma, Yunpu, Seidl, Thomas, Tresp, Volker

arXiv.org Artificial Intelligence Aug-10-2021

Transformers have improved the state of the art across numerous tasks in sequence modeling. Besides having quadratic computational and memory complexity w.r.t. the sequence length, the self-attention mechanism only processes information at the same scale, i.e., all attention heads are at the same resolution, which limits the power of the Transformer. To remedy this, we propose a novel and efficient structure named Adaptive Multi-Resolution Attention (AdaMRA for short), which scales linearly with sequence length in terms of time and space. Specifically, we leverage a multi-resolution multi-head attention mechanism, enabling attention heads to capture long-range contextual information in a coarse-to-fine fashion. Moreover, to capture the potential relations between the query representation and clues of different attention granularities, we leave the decision of which resolution of attention to use to the query, which further improves the model's capacity compared to the vanilla Transformer. In an effort to reduce complexity, we adopt kernel attention without degrading the performance. Extensive experiments on several benchmarks demonstrate the effectiveness and efficiency of our model by achieving a state-of-the-art performance-efficiency-memory trade-off. To facilitate AdaMRA utilization by the scientific community, the code implementation will be made publicly available.
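How kernel attention achieves linear scaling can be shown with a small sketch. It assumes the feature map φ(x) = elu(x) + 1, a common choice in the linear-attention literature; the abstract does not state which kernel AdaMRA uses, and the multi-resolution part of the model is not reproduced here. Matrices are lists of rows, and the function names are hypothetical.

```python
import math


def phi(x):
    """Assumed positive feature map: elu(v) + 1 applied element-wise."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def linear_attention(Q, K, V):
    """Kernel attention in time linear in the sequence length.

    Keys and values are summarized once into a (d_k x d_v) matrix and a
    d_k vector, which every query then reuses.
    """
    d_k, d_v = len(K[0]), len(V[0])
    kv = [[0.0] * d_v for _ in range(d_k)]   # sum_j phi(k_j) v_j^T
    k_sum = [0.0] * d_k                      # sum_j phi(k_j)
    for k, v in zip(K, V):
        fk = phi(k)
        for a in range(d_k):
            k_sum[a] += fk[a]
            for b in range(d_v):
                kv[a][b] += fk[a] * v[b]

    out = []
    for q in Q:
        fq = phi(q)
        norm = sum(fq[a] * k_sum[a] for a in range(d_k))
        out.append(
            [sum(fq[a] * kv[a][b] for a in range(d_k)) / norm for b in range(d_v)]
        )
    return out


def quadratic_attention(Q, K, V):
    """Same kernel, computed pairwise (quadratic in sequence length)."""
    out = []
    for q in Q:
        fq = phi(q)
        weights = [sum(a * b for a, b in zip(fq, phi(k))) for k in K]
        total = sum(weights)
        out.append(
            [sum(w * v[b] for w, v in zip(weights, V)) / total
             for b in range(len(V[0]))]
        )
    return out
```

Both functions compute the same output; the linear version simply reorders the matrix products so that no query-by-key weight matrix is ever materialized. Note that this kernel replaces the softmax rather than approximating it exactly.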

arxiv preprint arxiv, attention head, transformer, (12 more...)

arXiv.org Artificial Intelligence

2108.04962

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.34)

Scalable Reverse Image Search Engine for NASA Worldview

Sodani, Abhigya, Levy, Michael, Koul, Anirudh, Kasam, Meher Anand, Ganju, Siddha

arXiv.org Artificial Intelligence Aug-10-2021

Researchers often spend weeks sifting through decades of unlabeled satellite imagery (on NASA Worldview) in order to develop datasets on which they can start conducting research. We developed an interactive, scalable, and fast image similarity search engine (which can take one or more images as the query image) that automatically sifts through the unlabeled dataset, reducing dataset generation time from weeks to minutes. In this work, we describe key components of the end-to-end pipeline. Our similarity search system was created to identify images in a potentially petabyte-scale database that are similar to an input image. For this, we had to break down each query image into its features, which were generated by a CNN that was trained in a supervised manner and then stripped of its classification layer. To store and search these features efficiently, we had to make several scalability improvements. To improve the speed, reduce the storage, and shrink memory requirements for embedding search, we add a fully connected layer to our CNN that maps every image to a vector of length 128 before it enters the classification layers. This helped us compress the size of our image features from 2048 (for ResNet, which was initially tried as our featurizer) to 128 for our new custom model. Additionally, we utilize existing approximate nearest neighbor search libraries to significantly speed up embedding search. Our system currently searches over our entire database of images at 5 seconds per query on a single virtual machine in the cloud. In the future, we would like to incorporate a SimCLR-based featurizing model, which could be trained without any labelling by a human (since the classification aspect of the model is irrelevant to this use case).

database, query image, satellite imagery, (13 more...)

arXiv.org Artificial Intelligence

2108.04479

Country:

North America > United States (0.54)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report (0.40)

Industry:

Government > Space Agency (0.54)
Government > Regional Government > North America Government > United States Government (0.54)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.55)

Understanding Human Reading Comprehension with Brain Signals

Ye, Ziyi, Xie, Xiaohui, Liu, Yiqun, Wang, Zhihong, Chen, Xuesong, Zhang, Min, Ma, Shaoping

arXiv.org Artificial Intelligence Aug-3-2021

Reading comprehension is a complex cognitive process involving many human brain activities. Many works have studied reading patterns and attention allocation mechanisms in the reading process. However, little is known about what happens in the human brain during reading comprehension and how we can utilize this information as implicit feedback to improve information acquisition performance. With the advances in brain imaging techniques such as EEG, it is possible to collect high-precision brain signals in almost real time. With neuroimaging techniques, we carefully design a lab-based user study to investigate brain activities during reading comprehension. Our findings show that neural responses vary with different types of content, i.e., content that can satisfy users' information needs and content that cannot. We suggest that various cognitive activities, e.g., cognitive loading, semantic-thematic understanding, and inferential processing, at the micro-time scale during reading comprehension underpin these neural responses. Inspired by these detectable differences in cognitive activities, we construct supervised learning models based on EEG features for two reading comprehension tasks: answer sentence classification and answer extraction. Results show that it is feasible to improve their performance with brain signals. These findings imply that brain signals are valuable feedback for enhancing human-computer interactions during reading comprehension.

information, participant, reading comprehension, (15 more...)

arXiv.org Artificial Intelligence

2108.0136

Country:

Asia > China > Beijing > Beijing (0.05)
Oceania > Australia (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Deep Natural Language Processing for LinkedIn Search Systems

Guo, Weiwei, Liu, Xiaowei, Wang, Sida, Kazi, Michaeel, Fu, Zhoutong, Gao, Huiji, Jia, Jun, Zhang, Liang, Long, Bo

arXiv.org Artificial Intelligence Jul-30-2021

Many search systems work with large amounts of natural language data, e.g., search queries, user profiles and documents, where deep learning based natural language processing techniques (deep NLP) can be of great help. In this paper, we introduce a comprehensive study of applying deep NLP techniques to five representative tasks in search engines. Through the model design and experiments of the five tasks, readers can find answers to three important questions: (1) When is deep NLP helpful/not helpful in search systems? (2) How to address latency challenges? (3) How to ensure model robustness? This work builds on existing efforts of LinkedIn search, and is tested at scale on a commercial search engine. We believe our experiences can provide useful insights for the industry and research communities.

baseline, experiment, query, (14 more...)

arXiv.org Artificial Intelligence

2108.08252

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
