AITopics | Information Retrieval

Information Retrieval

Our accustomed systems of retrieving particular bits of information no longer meet the needs of many people. Searching traditional indexes of print publications has been aided by computerized databases, but it still usually requires time-consuming serial searching of one database after another, and then moving on to other methods of searching for internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where widely different formats contain the information being sought, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.

Bing SEO: Website Optimization Guide & Free SEO Tools

#artificialintelligence | Feb-4-2022, 12:45:19 GMT

Bing, previously known as Microsoft Live Search, is a search engine with over 40% market share in the US. Bing's SEO guide is aimed at helping small business owners better optimize their websites for traffic and leads. If you're already using Google's SEO solutions (Analytics, Search Console, etc.), then Bing's guide will give you an edge over your competitors by showing you what additional steps to take and which tools to use to increase your website traffic. The free SEO analysis tool is an extension for Chrome that automatically analyzes any page it loads and tells you how well optimized the page is for Bing. Based on its results, you can then use Bing's SEO guide to optimize your website further.

bing, keyword, relevant keyword, (12 more...)

#artificialintelligence

Country: North America > United States (0.26)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.56)

Google opens up about how Maps review moderation works – Search Engine Land

#artificialintelligence | Feb-3-2022, 05:55:58 GMT

Review moderation powered by machine learning. User reviews are sent to Google's moderation system as soon as they're submitted.

search engine land

#artificialintelligence

Industry: Media > News (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.52)
Information Technology > Information Management > Search (0.40)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)

Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model

Kiyohara, Haruka, Saito, Yuta, Matsuhiro, Tatsuya, Narita, Yusuke, Shimizu, Nobuyuki, Yamamoto, Yasuo

arXiv.org Machine Learning | Feb-3-2022

In real-world recommender systems and search engines, optimizing ranking decisions to present a ranked list of relevant items is critical. Off-policy evaluation (OPE) for ranking policies is thus gaining a growing interest because it enables performance estimation of new ranking policies using only logged data. Although OPE in contextual bandits has been studied extensively, its naive application to the ranking setting faces a critical variance issue due to the huge item space. To tackle this problem, previous studies introduce some assumptions on user behavior to make the combinatorial item space tractable. However, an unrealistic assumption may, in turn, cause serious bias. Therefore, appropriately controlling the bias-variance tradeoff by imposing a reasonable assumption is the key for success in OPE of ranking policies. To achieve a well-balanced bias-variance tradeoff, we propose the Cascade Doubly Robust estimator building on the cascade assumption, which assumes that a user interacts with items sequentially from the top position in a ranking. We show that the proposed estimator is unbiased in more cases compared to existing estimators that make stronger assumptions. Furthermore, compared to a previous estimator based on the same cascade assumption, the proposed estimator reduces the variance by leveraging a control variate. Comprehensive experiments on both synthetic and real-world data demonstrate that our estimator leads to more accurate OPE than existing estimators in a variety of settings.
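As background for the abstract above, here is a minimal sketch of off-policy evaluation in the plain contextual-bandit setting, using inverse propensity scoring (IPS). It is not the Cascade Doubly Robust estimator the paper proposes; the function name `ips_estimate`, the tuple layout of the log, and the toy data are all assumptions made for illustration.

```python
def ips_estimate(logged_data, target_policy):
    """Inverse propensity scoring estimate of a target policy's expected reward.

    logged_data: iterable of (context, action, reward, behavior_prob) tuples,
        where behavior_prob is the logging policy's probability of the action.
    target_policy: function (context, action) -> probability under the new policy.
    """
    total = 0.0
    n = 0
    for context, action, reward, behavior_prob in logged_data:
        # Reweight each logged reward by how much more (or less) likely the
        # target policy is to take the logged action than the logging policy.
        weight = target_policy(context, action) / behavior_prob
        total += weight * reward
        n += 1
    return total / n


# Hypothetical log: a uniform logging policy over two items; item 0 is always clicked.
logged = [
    ("user", 0, 1.0, 0.5),
    ("user", 1, 0.0, 0.5),
    ("user", 0, 1.0, 0.5),
    ("user", 1, 0.0, 0.5),
]


def always_item_0(context, action):
    # Hypothetical target policy: deterministically show item 0.
    return 1.0 if action == 0 else 0.0


estimate = ips_estimate(logged, always_item_0)
```

In a ranking setting the "action" is an entire ranked list, so the importance weights range over a combinatorial space and can become very large. That is the variance problem the abstract describes, and the reason behavioral assumptions such as the cascade model are introduced.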

assumption, cascade assumption, estimator, (14 more...)

arXiv.org Machine Learning

doi: 10.1145/3488560.3498380

2202.01562

Country:

North America > United States > Arizona > Maricopa County > Tempe (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.34)

Two minutes NLP -- Learn TF-IDF with easy examples

#artificialintelligence | Jan-29-2022, 04:05:15 GMT

TF-IDF (Term Frequency-Inverse Document Frequency) is a way of measuring how relevant a word is to a document in a collection of documents. TF-IDF has many uses, such as in information retrieval, text analysis, keyword extraction, and as a way of obtaining numeric features from text for machine learning algorithms. TF-IDF was first designed for document search and information retrieval, where a query is run and the system has to find the most relevant documents. Suppose the query is the text "The bug". The system would give each document a score proportional to the frequencies of the query words found in the document, weighting rare words like "bug" more heavily than common words like "the".
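The scoring idea described above can be sketched in a few lines. This is a minimal illustration, assuming whitespace tokenization, term frequency normalized by document length, and an unsmoothed logarithmic IDF; real systems often use smoothed variants. The function name `tf_idf_scores` and the three sample documents are made up for the example.

```python
import math
from collections import Counter


def tf_idf_scores(query, documents):
    """Score each document by summing TF-IDF over the query terms."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if counts[term] == 0:
                continue  # term absent from this document
            tf = counts[term] / len(tokens)      # term frequency
            idf = math.log(n_docs / df[term])    # rare terms get a larger weight
            score += tf * idf
        scores.append(score)
    return scores


docs = [
    "the bug crashed the server",
    "the cat sat on the mat",
    "the report is on the desk",
]
scores = tf_idf_scores("The bug", docs)
```

Because "the" occurs in every document, its IDF is log(3/3) = 0 and it contributes nothing. Only the first document contains "bug", so it is the only one with a non-zero score.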

average tf-idf, tf-idf, tf-idf score, (10 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.78)

5 Alternatives to Search Engine Optimization - DataScienceCentral.com

#artificialintelligence | Jan-25-2022, 13:51:00 GMT

It is not a coincidence that search engine optimization is the 'holy cow' of internet traffic. It is responsible for more than half of it. Every second, Google alone processes nearly 100,000 search queries. Therefore, content creators do their best to exploit SEO tricks and gimmicks to their advantage and push their websites to the top of the search. People with deep knowledge of search engine optimization can easily find jobs all around the world.

search engine optimization, traffic, website, (8 more...)

#artificialintelligence

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.84)

Reinforcement Learning Based Query Vertex Ordering Model for Subgraph Matching

Wang, Hanchen, Zhang, Ying, Qin, Lu, Wang, Wei, Zhang, Wenjie, Lin, Xuemin

arXiv.org Artificial Intelligence | Jan-24-2022

Subgraph matching is a fundamental problem in various fields that use graph structured data. Subgraph matching algorithms enumerate all isomorphic embeddings of a query graph q in a data graph G. An important branch of matching algorithms exploits the backtracking search approach, which recursively extends intermediate results following a matching order of query vertices. It has been shown that the matching order plays a critical role in the time efficiency of these backtracking based subgraph matching algorithms. In recent years, many advanced techniques for query vertex ordering (i.e., matching order generation) have been proposed to reduce the unpromising intermediate results according to preset heuristic rules. In this paper, for the first time we apply Reinforcement Learning (RL) and Graph Neural Network (GNN) techniques to generate a high-quality matching order for subgraph matching algorithms. Instead of using fixed heuristics to generate the matching order, our model can capture and make full use of the graph information, and thus determine the query vertex order with an adaptive learning-based rule that can significantly reduce the number of redundant enumerations. With the help of the reinforcement learning framework, our model is able to consider the long-term benefits rather than only considering the local information at the current ordering step. Extensive experiments on six real-life data graphs demonstrate that our proposed matching order generation technique can reduce query processing time by up to two orders of magnitude compared to state-of-the-art algorithms.
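To make the role of the matching order concrete, the following is a minimal sketch of backtracking subgraph matching that takes the query vertex order as a parameter. It is not the paper's RL-based model; it assumes unlabeled, undirected graphs stored as adjacency dictionaries and non-induced matching, and the function name `count_embeddings` and the toy graphs are hypothetical.

```python
def count_embeddings(query_adj, data_adj, order):
    """Enumerate embeddings of the query graph by backtracking along `order`.

    Returns (number of embeddings found, number of partial mappings explored).
    """
    found = 0
    explored = 0
    mapping = {}  # query vertex -> data vertex

    def extend(depth):
        nonlocal found, explored
        if depth == len(order):
            found += 1
            return
        u = order[depth]
        for v in data_adj:
            if v in mapping.values():
                continue  # each data vertex may be used only once
            # Every already-matched neighbour of u must map to a neighbour of v.
            if all(mapping[w] in data_adj[v] for w in query_adj[u] if w in mapping):
                explored += 1
                mapping[u] = v
                extend(depth + 1)
                del mapping[u]

    extend(0)
    return found, explored


# Query: a path a - b - c. Data graph: a star with centre 0 and leaves 1..4.
query_adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
data_adj = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}

centre_first = count_embeddings(query_adj, data_adj, ["b", "a", "c"])
centre_last = count_embeddings(query_adj, data_adj, ["a", "c", "b"])
```

Both orders find the same 12 embeddings, but starting from the middle vertex `b` explores 25 partial mappings while leaving it for last explores 37, because adjacency constraints can prune earlier. On large graphs this gap grows quickly, which is why order generation matters.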

graph, rl-qvo, vertex, (16 more...)

arXiv.org Artificial Intelligence

2201.11251

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(10 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.51)

Less is Less: When Are Snippets Insufficient for Human vs Machine Relevance Estimation?

Kazai, Gabriella, Mitra, Bhaskar, Dong, Anlei, Craswell, Nick, Yang, Linjun

arXiv.org Artificial Intelligence | Jan-21-2022

Traditional information retrieval (IR) ranking models process the full text of documents. Newer models based on Transformers, however, would incur a high computational cost when processing long texts, so they typically use only snippets from the document instead. The model's input based on a document's URL, title, and snippet (UTS) is akin to the summaries that appear on a search engine results page (SERP) to help searchers decide which result to click. This raises questions about when such summaries are sufficient for relevance estimation by the ranking model or the human assessor, and whether humans and machines benefit from the document's full text in similar ways. To answer these questions, we study human and neural model based relevance assessments on 12k query-documents sampled from Bing's search logs. We compare changes in the relevance assessments when only the document summaries are exposed to assessors and when the full text is also exposed, studying a range of query and document properties, e.g., query type, snippet length. Our findings show that the full text is beneficial for humans and a BERT model for similar query and document types, e.g., tail, long queries. A closer look, however, reveals that humans and machines respond to the additional input in very different ways. Adding the full text can also hurt the ranker's performance, e.g., for navigational queries.

assessor, body text, query, (13 more...)

arXiv.org Artificial Intelligence

2201.08721

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Redmond (0.04)
Asia (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

How Artificial Intelligence Is Powering Search Engines - DataScienceCentral.com

#artificialintelligence | Jan-19-2022, 09:35:12 GMT

Whether you are a customer searching for your favorite products online, a writer looking for the latest statistics, or a business owner learning SEO skills, you are using a search engine to get answers. And search engines are pretty interesting! You open up your favorite one, add some related keywords and click to search. Within a fraction of a second, you get thousands of results for your entered keyword. Search engines can perform the way they do because of the algorithms they have and a lot of brilliant people powering them.

artificial intelligence, intelligence, search engine, (12 more...)

#artificialintelligence

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

Benchmark datasets driving artificial intelligence development fail to capture the needs of medical professionals

Blagec, Kathrin, Kraiger, Jakob, Frühwirt, Wolfgang, Samwald, Matthias

arXiv.org Artificial Intelligence | Jan-18-2022

Publicly accessible benchmarks that allow for assessing and comparing model performances are important drivers of progress in artificial intelligence (AI). While recent advances in AI capabilities hold the potential to transform medical practice by assisting and augmenting the cognitive processes of healthcare professionals, the coverage of clinically relevant tasks by AI benchmarks is largely unclear. Furthermore, there is a lack of systematized meta-information that allows clinical AI researchers to quickly determine accessibility, scope, content and other characteristics of datasets and benchmark datasets relevant to the clinical domain. To address these issues, we curated and released a comprehensive catalogue of datasets and benchmarks pertaining to the broad domain of clinical and biomedical natural language processing (NLP), based on a systematic review of literature and online resources. A total of 450 NLP datasets were manually systematized and annotated with rich metadata, such as targeted tasks, clinical applicability, data types, performance metrics, accessibility and licensing information, and availability of data splits. We then compared tasks covered by AI benchmark datasets with relevant tasks that medical practitioners reported as highly desirable targets for automation in a previous empirical study. Our analysis indicates that AI benchmarks of direct clinical relevance are scarce and fail to cover most work activities that clinicians want to see addressed. In particular, tasks associated with routine documentation and patient data administration workflows are not represented despite significant associated workloads. Thus, currently available AI benchmarks are improperly aligned with desired targets for AI automation in clinical settings, and novel benchmarks should be created to fill these gaps.

benchmark, benchmark dataset, dataset, (15 more...)

arXiv.org Artificial Intelligence

2201.0704

Country:

North America > United States (0.28)
Europe > Austria > Vienna (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Content metadata: why keyword extraction requires automated labelling -- EDIA

#artificialintelligence | Jan-16-2022, 09:15:25 GMT

Keywords are not a science but an art. There is no such thing as 'the right keyword,' as we're talking about a core concept incorporated into a piece of content in the broadest form. Texts don't necessarily need to contain an exact keyword. For example, if the term 'European Union' is used several times, 'European Commission' may be a suitable keyword even though the writer never uses the term. Despite this fluid definition, keywords should be understandable to those who try to find the right ones.

content metadata, keyword, keyword extraction require, (3 more...)

#artificialintelligence

Country: Europe (0.59)

Industry: Government > Regional Government > Europe Government (0.59)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.43)
