AITopics | Information Retrieval

Information Retrieval

Our accustomed systems for retrieving particular pieces of information no longer meet the needs of many people. Computerized databases have aided the searching of traditional indexes of print publications, but such searching still usually requires time-consuming serial searches of one database after another, followed by other methods of searching for internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where widely different formats contain the information being sought, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.

Privacy-focused, rewarded ads browser Brave tops 10M monthly active users - Search Engine Land

#artificialintelligence | Dec-7-2019, 14:33:49 GMT

Brave said it has seen a surge in user adoption since releasing version 1.0 of the privacy-centric browser on November 13, 2019. Monthly active users (MAU) have doubled in a year to 10.4 million as of the end of last month. Daily active users of the browser, created by Mozilla co-founder Brendan Eich, have tripled in the last year to 3.3 million, the company said Friday. Brave Ads are served only to users who opt in to the Brave Rewards program and agree to see ads. Users can then accumulate Brave's Basic Attention Token (BAT), a blockchain-based token.

active user, monthly active user, search engine land, (5 more...)

Industry: Information Technology (0.76)

Technology:

Information Technology > Information Management > Search (0.43)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.43)
Information Technology > Communications > Social Media (0.40)

Rethinking Search Engines and Recommendation Systems

Communications of the ACM | Dec-6-2019, 03:30:56 GMT

In her popular book, Weapons of Math Destruction, data scientist Cathy O'Neil elegantly describes to the general population the danger of the data science revolution in decision making. She describes how the US News ranking of universities, which orders universities based on 15 measured properties, created new dynamics in university behavior, as they adapted to these measures, ultimately resulting in decreased social welfare. Unfortunately, the idea that data science-related algorithms, such as ranking, cause changes in behavior, and that this dynamic may lead to socially inferior outcomes, is dominant in our new online economy. Ranking also plays a crucial role in search engines and recommendation systems--two prominent data science applications that we focus on in this article. Recommendation systems endorse items by ranking them using information induced from some context--for example, the Web page a user is currently browsing, a specific application the user is running on her mobile phone, or the time of day.

application, rethinking search engine, search engine and recommendation system, (1 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.91)
Information Technology > Information Management > Search (0.71)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.71)

Keyword Aware Influential Community Search in Large Attributed Graphs

Islam, Md. Saiful, Ali, Mohammed Eunus, Kang, Yong-Bin, Sellis, Timos, Choudhury, Farhana M.

arXiv.org Artificial Intelligence | Dec-4-2019

We introduce a novel keyword-aware influential community query (KICQ) that finds the most influential communities in an attributed graph, where an influential community is defined as a closely connected group of vertices that has some dominance over other groups of vertices and whose expertise (a set of keywords) matches the query terms (words or phrases). We first design the KICQ so that users can issue an influential community search (CS) query intuitively by using a set of query terms and predicates (AND or OR). In this context, we propose a novel word-embedding-based similarity model that enables semantic community search, which substantially alleviates the limitations of exact keyword-based community search. Next, we propose a new influence measure for a community that considers both the cohesiveness and influence of the community and eliminates the need to specify values for internal parameters of a network. Finally, we propose two efficient algorithms for searching influential communities in large attributed graphs. We present detailed experiments and a case study to demonstrate the effectiveness and efficiency of the proposed approaches.
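
The abstract above describes queries made of terms joined by AND or OR predicates over a closely connected group of vertices. As a rough illustration only, the sketch below filters vertices by exact keyword match and then keeps the k-core of what remains. The k-core cohesiveness model, the function names, and the toy graph are all assumptions made for this example; the paper's own influence measure and word-embedding similarity model are not reproduced here.

```python
def matches(keywords, terms, mode):
    """True if a vertex's keyword set satisfies the AND/OR predicate."""
    hits = [t in keywords for t in terms]
    return all(hits) if mode == "AND" else any(hits)


def k_core(adj, k):
    """Repeatedly drop vertices with fewer than k surviving neighbours."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            if sum(1 for u in adj[v] if u in alive) < k:
                alive.discard(v)
                changed = True
    return alive


def keyword_community(adj, attrs, terms, mode="AND", k=2):
    """Keep keyword-matching vertices, then take the k-core of that subgraph."""
    keep = {v for v in adj if matches(attrs[v], terms, mode)}
    induced = {v: [u for u in adj[v] if u in keep] for v in keep}
    return k_core(induced, k)


# Hypothetical toy graph: a triangle a-b-c, a pendant vertex d, and e tied to b and c.
ADJ = {
    "a": ["b", "c", "d"],
    "b": ["a", "c", "e"],
    "c": ["a", "b", "e"],
    "d": ["a"],
    "e": ["b", "c"],
}
ATTRS = {"a": {"ir"}, "b": {"ir"}, "c": {"ir"}, "d": {"ir"}, "e": {"ml"}}

ir_only = keyword_community(ADJ, ATTRS, ["ir"], "AND", k=2)
ir_or_ml = keyword_community(ADJ, ATTRS, ["ir", "ml"], "OR", k=2)
print(sorted(ir_only), sorted(ir_or_ml))
```

With the AND query, vertex e is filtered out for lacking the keyword and d is pruned for having only one neighbour; the OR query lets e rejoin the community. Exact matching of this kind is the limitation the abstract says its embedding-based similarity is meant to relax.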

graph, keyword, vertex, (17 more...)

doi: 10.1016/j.is.2021.101914

1912.02114

Country:

Oceania > Australia (0.04)
Asia > Bangladesh (0.04)
North America > United States (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
(4 more...)

Forward and Backward Feature Selection for Query Performance Prediction

Déjean, Sébastien, Ionescu, Radu Tudor, Mothe, Josiane, Ullah, Md Zia

arXiv.org Machine Learning | Dec-4-2019

The goal of query performance prediction (QPP) is to automatically estimate the effectiveness of a search result for any given query, without relevance judgements. Post-retrieval features have been shown to be more effective for this task while being more expensive to compute than pre-retrieval features. Combining multiple post-retrieval features is even more effective, but state-of-the-art QPP methods are impossible to interpret because of the black-box nature of the employed machine learning models. However, interpretation is useful for understanding the predictive model and providing more answers about its behavior. Moreover, combining many post-retrieval features is not practical in real-world cases, since query running time is of utmost importance. In this paper, we investigate a new framework for feature selection in which the trained model explains the prediction well. We introduce a step-wise (forward and backward) model selection approach where different subsets of query features are used to fit different models from which the system selects the best one. We evaluate our approach on four TREC collections using standard QPP features. We also develop two QPP features to address the issue of query-drift in the query feedback setting. We found that: (1) our model based on a limited number of selected features is as good as more complex models for QPP and better than non-selective models; (2) our model is more efficient than complex models during inference time since it requires fewer features; (3) the predictive model is readable and understandable; and (4) one of our new QPP features is consistently selected across different collections, proving its usefulness.
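
To make the step-wise idea in the abstract above concrete, here is a minimal sketch of greedy forward selection and backward elimination over feature subsets. The scoring function is a stand-in: the abstract does not say which model-selection criterion is used, so `toy_score`, the feature names, and the penalty value are hypothetical and exist only to make the example runnable.

```python
def forward_select(candidates, score_fn, min_gain=1e-9):
    """Start empty; repeatedly add the feature that raises the score most."""
    selected, remaining = [], list(candidates)
    best = score_fn(selected)
    while remaining:
        top_score, top_f = max(
            ((score_fn(selected + [f]), f) for f in remaining),
            key=lambda t: t[0],
        )
        if top_score - best < min_gain:
            break  # no candidate improves the criterion enough
        selected.append(top_f)
        remaining.remove(top_f)
        best = top_score
    return selected, best


def backward_eliminate(candidates, score_fn, min_gain=1e-9):
    """Start with all features; repeatedly drop the one whose removal helps most."""
    selected = list(candidates)
    best = score_fn(selected)
    while selected:
        top_score, top_f = max(
            ((score_fn([g for g in selected if g != f]), f) for f in selected),
            key=lambda t: t[0],
        )
        if top_score - best < min_gain:
            break  # every removal hurts (or barely helps) the criterion
        selected.remove(top_f)
        best = top_score
    return selected, best


# Hypothetical criterion: summed usefulness minus a per-feature complexity penalty.
USEFULNESS = {"a": 0.30, "b": 0.20, "c": 0.01, "d": 0.00}
PENALTY = 0.02


def toy_score(subset):
    return sum(USEFULNESS[f] for f in subset) - PENALTY * len(subset)


FEATURES = ["a", "b", "c", "d"]
fwd, fwd_score = forward_select(FEATURES, toy_score)
bwd, bwd_score = backward_eliminate(FEATURES, toy_score)
print(fwd, bwd)
```

In practice `score_fn` would fit a model on the chosen subset and return a penalised goodness-of-fit value, so that a small, readable feature set can win over a larger one.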

effectiveness, prediction, qpp feature, (13 more...)

1912.04107

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.05)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.67)

Information Retrieval and Its Sister Disciplines

Yang, Grace Hui

arXiv.org Artificial Intelligence | Dec-4-2019

This article presents a summary graph that shows the relationships between Information Retrieval (IR) and other related disciplines. The figure shows the key differences between them and the conditions under which one would transition into another. When I studied Machine Learning (ML), my favorite figure of all was "The Table of Common Distributions" in Casella and Berger's 2002 book "Statistical Inference". It appeared in the book's appendix. Every time I saw this figure, I was in awe.

graph, information retrieval, ir and recommendation, (12 more...)

1912.02346

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > California > Monterey County > Pacific Grove (0.05)
Africa > South Africa > Western Cape > Cape Town (0.05)

Genre:

Research Report (0.50)
Overview (0.35)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.74)

A Contextual-Bandit Approach to Online Learning to Rank for Relevance and Diversity

Li, Chang, Feng, Haoyun, de Rijke, Maarten

arXiv.org Machine Learning | Dec-3-2019

Online learning to rank (LTR) focuses on learning a policy from user interactions that builds a list of items sorted in decreasing order of the item utility. It is a core area in modern interactive systems, such as search engines, recommender systems, or conversational assistants. Previous online LTR approaches either assume the relevance of an item in the list to be independent of other items in the list or the relevance of an item to be a submodular function of the utility of the list. The former type of approach may result in a list of low diversity that has relevant items covering the same aspects, while the latter approaches may lead to a highly diversified list but with some non-relevant items. In this paper, we study an online LTR problem that considers both item relevance and topical diversity. We assume cascading user behavior, where a user browses the displayed list of items from top to bottom and clicks the first attractive item and stops browsing the rest. We propose a hybrid contextual bandit approach, called CascadeHybrid, for solving this problem. CascadeHybrid models item relevance and topical diversity using two independent functions and simultaneously learns those functions from user click feedback. We derive a gap-free bound on the n-step regret of CascadeHybrid. We conduct experiments to evaluate CascadeHybrid on the MovieLens and Yahoo music datasets. Our experimental results show that CascadeHybrid outperforms the baselines on both datasets.
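
The cascading user behavior assumed in the abstract above (scan top to bottom, click the first attractive item, then stop) can be sketched in a few lines. This illustrates only the click model, not CascadeHybrid itself; the attraction probabilities and function names below are hypothetical.

```python
import random


def click_distribution(attraction):
    """Return (P(click at each position), P(no click)) under the cascade model."""
    probs = []
    reach = 1.0  # probability that the user examines the current position
    for a in attraction:
        probs.append(reach * a)  # examined and found attractive
        reach *= 1.0 - a         # examined, not attractive, keeps browsing
    return probs, reach


def simulate_click(attraction, rng):
    """Sample one session: index of the clicked item, or None if no click."""
    for pos, a in enumerate(attraction):
        if rng.random() < a:
            return pos
    return None


def observed_feedback(n_items, click):
    """Per-position labels: 0 = skipped, 1 = clicked, None = not examined."""
    if click is None:
        return [0] * n_items
    return [0] * click + [1] + [None] * (n_items - click - 1)


ATTRACTION = [0.5, 0.5, 0.2]  # hypothetical attraction probabilities
probs, no_click = click_distribution(ATTRACTION)
session = simulate_click(ATTRACTION, random.Random(0))
labels = observed_feedback(len(ATTRACTION), session)
print(probs, no_click, labels)
```

The `None` labels matter for learning: items below the click were never examined, so a bandit learner under this model gets no evidence about them from that session.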

attraction probability, cascadehybrid, diversity, (14 more...)

1912.00508

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > Maryland > Baltimore (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Education > Educational Setting > Online (0.62)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.69)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.62)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

80% of Major US E-commerce Stores Use JavaScript for Crucial Content - Search Engine Journal

#artificialintelligence | Dec-2-2019, 12:07:47 GMT

According to new data, up to 80% of popular US-based e-commerce stores use JavaScript on crucial content such as product descriptions. That's an alarming number, considering that any time JavaScript is used to generate important content, that content risks not being indexed in search results. Google is getting better at crawling and rendering JavaScript, but it's not perfect. The company still recommends using static HTML as much as possible. The study, from software company Onely, found that 25% of web pages analyzed contained crucial JavaScript content that was unindexed by Google.

javascript, javascript content, us e-commerce store use javascript, (6 more...)

Industry: Information Technology > Services > e-Commerce Services (0.67)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Information Management > Search (0.94)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.44)

scikit-hubness: Hubness Reduction and Approximate Neighbor Search

Feldbauer, Roman, Rattei, Thomas, Flexer, Arthur

arXiv.org Machine Learning | Dec-2-2019

This paper introduces scikit-hubness, a Python package for efficient nearest neighbor search in high-dimensional spaces. Hubness is an aspect of the curse of dimensionality, and is known to impair various learning tasks, including classification, clustering, and visualization. scikit-hubness provides algorithms for hubness analysis ("Is my data affected by hubness?"), hubness reduction ("How can we improve neighbor retrieval in high dimensions?"), and approximate neighbor search ("Does it work for large data sets?"). It is integrated into the scikit-learn environment, enabling rapid adoption by Python-based machine learning researchers and practitioners. Users will find all functionality of the scikit-learn neighbors package, plus additional support for transparent hubness reduction and approximate nearest neighbor search. scikit-hubness is developed using several quality assessment tools and principles, such as PEP8 compliance, unit tests with high code coverage, continuous integration on all major platforms (Linux, MacOS, Windows), and additional checks by LGTM. The source code is available at https://github.com/VarIr/scikit-hubness under the BSD 3-clause license. Install from the Python package index with $ pip install scikit-hubness.
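
The hubness analysis question posed in the abstract above ("Is my data affected by hubness?") is commonly answered by looking at the skewness of the k-occurrence distribution, that is, how often each point shows up in other points' k-nearest-neighbor lists. The sketch below computes that quantity with the standard library only. It does not use or mirror the scikit-hubness API; the function names and the toy data are assumptions for illustration.

```python
import math


def k_occurrence(points, k):
    """Count how often each point appears among the others' k nearest neighbours."""
    counts = [0] * len(points)
    for i, p in enumerate(points):
        # Brute-force neighbour list; distance ties are broken by index.
        nearest = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        for _, j in nearest[:k]:
            counts[j] += 1
    return counts


def skewness(values):
    """Third standardised moment; larger positive values suggest stronger hubs."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


# Hypothetical toy data: a centre point surrounded by four satellites.
STAR = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
counts = k_occurrence(STAR, k=1)
hubness = skewness(counts)
print(counts, round(hubness, 3))
```

Here the centre is the nearest neighbor of every satellite, so it collects most of the k-occurrences and the distribution is right-skewed. The brute-force search is quadratic in the number of points, which is the cost that approximate neighbor search is meant to avoid on large data sets.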

hubness reduction, neighbor, reduction, (12 more...)

1912.00706

Country: Europe > Austria > Vienna (0.15)

Genre: Research Report (0.41)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.69)

Latent Semantic Search and Information Extraction Architecture

Kolonin, Anton

arXiv.org Artificial Intelligence | Nov-30-2019

This paper presents the motivation, concept, design and implementation of latent semantic search. Existing search engines have limited semantic search, entity extraction and property attribution features, have insufficient accuracy and response time for latent search, may raise privacy concerns, and their search results are unavailable in offline mode for robotic search operations. The suggested alternative is an autonomous search engine with adaptive storage consumption, configurable search scope and latent search response time, and built-in options for entity extraction and property attribution, available as an open source platform for mobile, desktop and server solutions. The suggested architecture attempts to implement artificial general intelligence (AGI) principles as far as autonomous behaviour constrained by limited resources is concerned, and it is applied to the specific task of enabling Web search for artificial agents implementing the AGI.

agent, application, architecture, (11 more...)

1912.0018

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.28)
Asia > Russia > Siberian Federal District > Novosibirsk Oblast > Novosibirsk (0.05)
Europe > Russia (0.05)
Oceania > Australia > Queensland > Brisbane (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.87)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.87)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.70)

Semantic Search Engine & Search Analytics Platform for Business

#artificialintelligence | Nov-29-2019, 12:33:40 GMT

Don't be limited by a search engine that doesn't understand the user intent or the context. Enjoy the power of highly targeted, intuitive and conceptual search and exploration. To help you easily skim through the results, we offer a wide variety of options like Clustering, Semantic Cloud and Intuitive Facets. You could also get exploratory with our Concept Search. We promise a quicker and better search every time!

engine & search analytic platform, semantic search engine

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.74)
