AITopics | Information Retrieval

Information Retrieval

Our accustomed systems of retrieving particular bits of information no longer meet the needs of many people. Searching traditional indexes of print publications has been aided by computerized databases, but it still usually requires time-consuming serial searching of one database after another, followed by other methods of searching for internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where the information being sought is held in widely different formats, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.

Think Search Is Solved? Think Again

#artificialintelligence, Nov-25-2020, 20:00:42 GMT

Search is one of the oldest technologies around. Ever since the dawn of the World Wide Web, a search engine has been the portal through which we obtain information. The search for a better search engine index kick-started the Hadoop craze, and it continues to drive Google to push the limits of technology. But don't for a second think that search has been solved. "Search is far from being solved. It's the hardest thing we do. It's the hardest thing everybody does."

engine, information, search engine, (13 more...)

Country:

North America > United States > New York (0.05)
North America > Canada > Quebec > Capitale-Nationale Region > Québec (0.05)
North America > Canada > Quebec > Capitale-Nationale Region > Quebec City (0.05)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.65)
Information Technology > Data Science > Data Mining > Big Data (0.50)

Can Artificial Intelligence be friends of Humans - OnPassive

#artificialintelligence, Nov-24-2020, 12:55:05 GMT

The question arises: how would AI become self-aware and realize that humans stand in its way? Artificial intelligence is the capability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings. Robots and AI make it possible to produce things faster, better, and cheaper, with higher consistency. AI is very disruptive for low-cost countries that provide low-cost manufacturing for international companies, since robots do this work cheaply. It is also disruptive to countries with higher salary levels, but not to the same degree. Our forefathers had the same concerns during the industrial revolutions.

artificial intelligence, productivity, robot, (5 more...)

Industry: Health & Medicine (0.31)

Technology:

Information Technology > Artificial Intelligence > Robots (0.86)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.31)

Code Search Intent Classification Using Weak Supervision

Rao, Nikitha, Bansal, Chetan, Guan, Joe

arXiv.org Artificial Intelligence, Nov-24-2020

Developers use search for various tasks such as finding code, documentation, and debugging information. In particular, web search is heavily used by developers for finding code examples and snippets during the coding process. Recently, natural-language-based code search has been an active area of research. However, the lack of real-world large-scale datasets is a significant bottleneck. In this work, we propose a weak-supervision-based approach for detecting code search intent in search queries for the C# and Java programming languages. We evaluate the approach against several baselines on a real-world dataset comprising over 1 million queries mined from the Bing web search engine and show that the CNN-based model can achieve an accuracy of 77% and 76% for C# and Java respectively. Furthermore, we are also releasing the first large-scale real-world dataset of code search queries mined from the Bing web search engine. We hope that the dataset will aid future research on code search.
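
To make the idea of weak supervision concrete, here is a minimal Python sketch of how noisy heuristic "labeling functions" could assign a code-intent label to a query without hand annotation. It is not the paper's pipeline: the three heuristics, their cue words, and the majority-vote combiner are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

CODE_INTENT, NO_CODE_INTENT, ABSTAIN = 1, 0, -1

# Hypothetical labeling functions: each one votes on a query or abstains.
def lf_has_code_cue(query):
    """Vote CODE_INTENT when the query contains a phrase typical of code lookups."""
    cues = ("example", "snippet", "how to", "sample code")
    return CODE_INTENT if any(c in query.lower() for c in cues) else ABSTAIN

def lf_has_api_like_token(query):
    """Vote CODE_INTENT for dotted or call-like tokens such as string.format or add()."""
    return CODE_INTENT if re.search(r"\b\w+\.\w+\b|\w+\(\)", query) else ABSTAIN

def lf_non_code_cue(query):
    """Vote NO_CODE_INTENT for queries that look like downloads, pricing, and so on."""
    cues = ("download", "install", "price", "salary")
    return NO_CODE_INTENT if any(c in query.lower() for c in cues) else ABSTAIN

LABELING_FUNCTIONS = [lf_has_code_cue, lf_has_api_like_token, lf_non_code_cue]

def weak_label(query):
    """Combine the non-abstaining votes by simple majority; abstain on ties or no votes."""
    votes = [v for v in (lf(query) for lf in LABELING_FUNCTIONS) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]

if __name__ == "__main__":
    for q in ("c# string.format example", "download visual studio", "java jobs near me"):
        print(q, "->", weak_label(q))
```

Queries that receive a non-abstaining weak label could then serve as training data for a downstream classifier such as the CNN mentioned in the abstract; real weak-supervision systems usually replace the majority vote with a learned model of each heuristic's accuracy.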

code search intent, dataset, query, (12 more...)

2011.1195

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report (0.51)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Ghostery's New Search Engine Will Be Entirely Ad-Free

WIRED, Nov-18-2020, 14:00:00 GMT

The internet runs on advertising, and that includes search engines. Google brought in $26 billion of search revenue in the most recent quarter alone. As that business has grown, it's reshaped what search looks like. Year after year, ads have gobbled up more space on its results pages, pushing organic results further out of view. Which is why using Ghostery's new ad-free search engine and desktop browser, even in their pre-beta form, feels at once like a throwback to a simpler internet and a glimpse of a future where browsing that puts results ahead of revenue is once again possible.

ghostery, google, search engine, (4 more...)

Technology:

Information Technology > Information Management > Search (0.87)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.87)
Information Technology > Communications (0.55)

Non-Linear Multiple Field Interactions Neural Document Ranking

Takiguchi, Kentaro, Twomey, Niall, Vaquero, Luis M.

arXiv.org Artificial Intelligence, Nov-18-2020

Ranking tasks are usually based on the text of the main body of the page and the actions (clicks) of users on the page. There are other elements that could be leveraged to better contextualise the ranking experience (e.g. text in other fields, query made by the user, images, etc). We present one of the first in-depth analyses of field interaction for multiple field ranking in two separate datasets. While some works have taken advantage of full document structure, some aspects remain unexplored. In this work we build on previous analyses to show how query-field interactions, non-linear field interactions, and the architecture of the underlying neural model affect performance.

field interaction, interaction, query-field interaction, (13 more...)

2011.0958

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.28)
North America > Canada > Quebec > Montreal (0.06)
North America > United States > New York > New York County > New York City (0.05)
(5 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Information Management > Search (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.49)

Instagram finally lets you search for posts by keyword

Engadget, Nov-17-2020, 19:35:00 GMT

Enhancements to Guides aren't the only thing Instagram users can look forward to checking out today. The app has started rolling out a new, more robust search tool to users in the US, UK, Canada, Ireland and two other English-speaking countries that allows you to look for posts using keywords. If you're someone who wants to grow their followers, this change should ideally help with discoverability since you won't need to be so exacting with the hashtags you add to a post. As for how the new tool goes about surfacing the content it does, an Instagram spokesperson told The Verge the new algorithm considers several factors, including when someone shared the post, the accompanying caption and the photo or video that's on display. Instagram also says it's using machine learning to put forward "the highest quality content that's relevant to you."

instagram, keyword

Country:

North America > United States (0.58)
North America > Canada (0.28)
Europe > Ireland (0.28)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)

FLAT: Fast, Lightweight and Accurate Method for Cardinality Estimation

Zhu, Rong, Wu, Ziniu, Han, Yuxing, Zeng, Kai, Pfadler, Andreas, Qian, Zhengping, Zhou, Jingren, Cui, Bin

arXiv.org Artificial Intelligence, Nov-17-2020

Query optimizers rely on accurate cardinality estimation (CardEst) to produce good execution plans. The core problem of CardEst is how to model the rich joint distribution of attributes in an accurate and compact manner. Despite decades of research, existing methods either oversimplify the models by using only independent factorization, which leads to inaccurate estimates and suboptimal query plans, or overcomplicate them with lossless conditional factorization without any independence assumption, which results in slow probability computation. In this paper, we propose FLAT, a CardEst method that is simultaneously fast in probability computation, lightweight in model size and accurate in estimation quality. The key idea of FLAT is a novel unsupervised graphical model, called FSPN. It utilizes both independent and conditional factorization to adaptively model different levels of attribute correlations, and thus subsumes all existing CardEst models and dovetails their advantages. FLAT supports efficient online probability computation in near-linear time on the underlying FSPN model, and provides effective offline model construction. It can estimate cardinality for both single-table queries and multi-table join queries. An extensive experimental study demonstrates the superiority of FLAT over existing CardEst methods on well-known benchmarks: FLAT achieves 1 to 5 orders of magnitude better accuracy, 1 to 3 orders of magnitude faster probability computation speed (around 0.2ms) and 1 to 2 orders of magnitude lower storage cost (only tens of KB).
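
The abstract's point about independent factorization can be seen on a toy table. The Python sketch below is not FLAT or FSPN; it only contrasts a textbook independence-based estimate with the true cardinality on made-up, strongly correlated data (the `rows` table and the helper names are assumptions for illustration).

```python
# Made-up table with two strongly correlated attributes: (city, country).
rows = [
    ("Paris", "France"), ("Paris", "France"), ("Lyon", "France"),
    ("Berlin", "Germany"), ("Berlin", "Germany"), ("Munich", "Germany"),
]

def selectivity(rows, col, value):
    """Fraction of rows whose attribute at index `col` equals `value`."""
    return sum(1 for r in rows if r[col] == value) / len(rows)

def estimate_independent(rows, city, country):
    """Estimate |city AND country| assuming the two attributes are independent."""
    return len(rows) * selectivity(rows, 0, city) * selectivity(rows, 1, country)

def true_cardinality(rows, city, country):
    """Exact number of rows matching both predicates."""
    return sum(1 for r in rows if r == (city, country))

if __name__ == "__main__":
    for city, country in (("Paris", "France"), ("Paris", "Germany")):
        print(city, country,
              "estimate:", estimate_independent(rows, city, country),
              "true:", true_cardinality(rows, city, country))
```

Because city determines country in this table, the independence estimate is too low for a consistent pair and too high for an impossible one, which is the kind of error that modeling attribute correlations is meant to remove.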

node, probability, query, (15 more...)

2011.09022

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
(2 more...)

Practical Guide to Entity Resolution -- part 4

#artificialintelligence, Nov-11-2020, 19:11:11 GMT

This is part 4 of a mini-series on entity resolution. Candidate pair generation is a fairly straightforward part of ER, as it is essentially a self-join on the blocking keys. The next step after candidate pair generation is to score the match likelihood of each candidate pair. This is crucial to removing non-matches and creating the final resolved entities. This step is again fairly open-ended, and one can be very creative about the specific scoring functions and features to implement.
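
The two steps described above, a self-join on blocking keys followed by pair scoring, can be sketched in a few lines of Python. This is an illustrative toy rather than the article's implementation: the sample `records`, the three-character `blocking_key`, and the use of `difflib.SequenceMatcher` as the scoring function are all assumptions.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Made-up records: id -> company name.
records = {
    1: "Acme Corporation",
    2: "ACME Corp.",
    3: "Apex Industries",
    4: "Beta Labs",
}

def blocking_key(name):
    """Hypothetical blocking key: the first three characters, lowercased."""
    return name.lower()[:3]

def candidate_pairs(records):
    """Self-join on the blocking key: pair up records that share a block."""
    blocks = defaultdict(list)
    for rid, name in records.items():
        blocks[blocking_key(name)].append(rid)
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs

def score(a, b):
    """Similarity in [0, 1] from difflib; one of many possible scoring functions."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

if __name__ == "__main__":
    for left, right in sorted(candidate_pairs(records)):
        print(left, right, round(score(records[left], records[right]), 2))
```

Blocking keeps the number of scored pairs far below the all-pairs count, at the risk of missing true matches that land in different blocks; pairs scoring above a chosen threshold would then be kept as likely matches.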

entity resolution, iteration, practical guide, (5 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.64)

Artificial Intelligence Decision Support for Medical Triage

Marchiori, Chiara, Dykeman, Douglas, Girardi, Ivan, Ivankay, Adam, Thandiackal, Kevin, Zusag, Mario, Giovannini, Andrea, Karpati, Daniel, Saenz, Henri

arXiv.org Artificial Intelligence, Nov-9-2020

Applying state-of-the-art machine learning and natural language processing to approximately one million teleconsultation records, we developed a triage system, now certified and in use at the largest European telemedicine provider. The system evaluates care alternatives through interactions with patients via a mobile application. Reasoning on an initial set of provided symptoms, the triage application generates AI-powered, personalized questions to better characterize the problem and recommends the most appropriate point of care and time frame for a consultation. The underlying technology was developed to meet the needs for performance, transparency, user acceptance and ease of use, which are central to the adoption of AI-based decision support systems. Providing such remote guidance at the beginning of the chain of care has significant potential for improving cost efficiency, patient experience and outcomes. Being remote, always available and highly scalable, this service is fundamental in high-demand situations, such as the current COVID-19 outbreak.

Introduction: Shortage of physicians and increasing healthcare costs have created a need for digital solutions to better optimize medical resources. In addition, patient expectations for mobile, fast and easy 24/7 access to doctors and health services drive the development of patient-centered solutions.

medical concept, ontology, symptom, (16 more...)

2011.04548

Country:

North America > United States (0.14)
Europe > Switzerland > Basel-City > Basel (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre:

Research Report (0.50)
Overview (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Health Care Technology > Telehealth (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Sampling-Decomposable Generative Adversarial Recommender

Jin, Binbin, Lian, Defu, Liu, Zheng, Liu, Qi, Ma, Jianhui, Xie, Xing, Chen, Enhong

arXiv.org Artificial Intelligence, Nov-2-2020

Recommendation techniques are important approaches for alleviating information overload. Because they are often trained on implicit user feedback, many recommenders suffer from the sparsity challenge due to the lack of explicitly negative samples. GAN-style recommenders (e.g., IRGAN) address the challenge by learning a generator and a discriminator adversarially, such that the generator produces increasingly difficult samples for the discriminator to accelerate optimizing the discrimination objective. However, producing samples from the generator is very time-consuming, and our empirical study shows that the discriminator performs poorly in top-k item recommendation. To this end, a theoretical analysis is made of the GAN-style algorithms, showing that a generator of limited capacity diverges from the optimal generator. This may explain the limitation of the discriminator's performance. Based on these findings, we propose a Sampling-Decomposable Generative Adversarial Recommender (SD-GAR). In the framework, the divergence between a given generator and the optimum is compensated for by self-normalized importance sampling; the efficiency of sample generation is improved with a sampling-decomposable generator, such that each sample can be generated in O(1) with the Vose-Alias method. Interestingly, due to the decomposability of sampling, the generator can be optimized with closed-form solutions in an alternating manner, unlike the policy gradient used in the GAN-style algorithms. We extensively evaluate the proposed algorithm on five real-world recommendation datasets. The results show that SD-GAR outperforms IRGAN by 12.4% and the SOTA recommender by 10% on average. Moreover, discriminator training can be 20x faster on the dataset with more than 120K items.
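
The O(1) sampling claim rests on the alias method. Below is a minimal Python sketch of Vose's alias method for a fixed discrete distribution; it illustrates the general technique only and is not the SD-GAR generator. The function names and the example distribution are assumptions.

```python
import random

def build_alias_table(probs):
    """Preprocess a discrete distribution (probabilities summing to 1) in O(n)."""
    n = len(probs)
    scaled = [p * n for p in probs]
    prob = [0.0] * n   # chance of keeping column i when it is drawn
    alias = [0] * n    # fallback item for column i
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l
        # The large item donates the mass needed to fill column s up to 1.
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    # Leftover columns are full up to floating-point error.
    for i in large + small:
        prob[i] = 1.0
    return prob, alias

def sample(prob, alias, rng=random):
    """Draw one item in O(1): pick a column uniformly, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]

if __name__ == "__main__":
    prob, alias = build_alias_table([0.5, 0.3, 0.2])
    rng = random.Random(0)
    counts = [0, 0, 0]
    for _ in range(10000):
        counts[sample(prob, alias, rng)] += 1
    print(counts)
```

Table construction costs O(n) once, after which every draw needs only one uniform index and one uniform float, which is what makes repeated sampling from a fixed distribution cheap.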

generator, information retrieval, machine learning, (18 more...)

2011.00956

Country:

Asia > China (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.48)
(2 more...)
