AITopics | Information Retrieval

Information Retrieval

Our accustomed systems of retrieving particular bits of information no longer meet the needs of many people. Searching traditional indexes of print publications has been aided by computerized databases, but it still usually requires time-consuming serial searching of one database after another, followed by other methods of searching for internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where widely different formats contain the information being sought, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.

Google Says it Doesn't Require Fixing Structured Data Warnings - Search Engine Journal

#artificialintelligence, Aug-26-2019, 10:01:36 GMT

In a Webmaster Hangout, an eCommerce publisher complained about structured data warnings regarding data fields that are inappropriate to their product. They refused to create fake information to get a passing score. John Mueller responded that there's a difference between warnings and errors. The person asking the question sold custom handmade products, which did not have a global identifier.

artificial intelligence, information retrieval, natural language, (8 more...)

Industry: Information Technology > Services (0.72)

Technology:

Information Technology > Information Management > Search (0.40)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)

Real-world Conversational AI for Hotel Bookings

Li, Bai, Jiang, Nanyi, Sham, Joey, Shi, Henry, Fazal, Hussein

arXiv.org Machine Learning, Aug-26-2019

Hussein Fazal, SnapTravel, Toronto, Canada (hussein@snaptravel.com). Abstract: In this paper, we present a real-world conversational AI system to search for and book hotels through text messaging. Our architecture consists of a frame-based dialogue management system, which calls machine learning models for intent classification, named entity recognition, and information retrieval subtasks. Our chatbot has been deployed on a commercial scale, handling tens of thousands of hotel searches every day. We describe the various opportunities and challenges of developing a chatbot in the travel industry. Index Terms: conversational AI, task-oriented chatbot, named entity recognition, information retrieval. I. Introduction: Task-oriented chatbots have recently been applied to many areas in e-commerce.
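
To make the frame-based architecture in this abstract more concrete, here is a minimal sketch of a dialogue frame whose slots are filled from each incoming message. It is an illustration under stated assumptions, not the authors' system: the slot names, the `HotelSearchFrame` class, and the keyword-based `classify_intent` and `extract_entities` functions are hypothetical stand-ins for the learned models the paper describes, and the `"search"` action merely marks where an information retrieval step would be called.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HotelSearchFrame:
    """Slots the dialogue manager tries to fill before searching (assumed)."""
    city: Optional[str] = None
    check_in: Optional[str] = None
    nights: Optional[int] = None

    def missing_slots(self) -> list:
        # Slots are reported in field-definition order.
        return [name for name, value in vars(self).items() if value is None]


def classify_intent(message: str) -> str:
    # Hypothetical stand-in for a learned intent classifier.
    return "search_hotel" if "hotel" in message.lower() else "other"


def extract_entities(message: str) -> dict:
    # Hypothetical stand-in for a named entity recognition model.
    words = message.replace(",", " ").split()
    entities = {}
    for i, word in enumerate(words[:-1]):
        nxt = words[i + 1]
        if word.lower() == "in":
            entities["city"] = nxt
        elif word.isdigit() and nxt.lower().startswith("night"):
            entities["nights"] = int(word)
    return entities


def handle_message(frame: HotelSearchFrame, message: str) -> str:
    """Update the frame from one message and return the next action."""
    entities = extract_entities(message)
    if classify_intent(message) == "other" and not entities:
        return "fallback"
    for slot, value in entities.items():
        setattr(frame, slot, value)
    missing = frame.missing_slots()
    # Ask for the first unfilled slot, or hand off to retrieval when complete.
    return f"ask:{missing[0]}" if missing else "search"
```

For example, calling `handle_message(HotelSearchFrame(), "I need a hotel in Toronto for 2 nights")` fills `city` and `nights` and asks for the remaining `check_in` slot. Keeping the frame separate from the models means each stand-in could be swapped for a trained component without changing the control flow.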

information retrieval, machine learning, natural language, (17 more...)

arXiv:1908.10001

Country: North America > Canada > Ontario > Toronto (0.26)

Genre: Research Report (0.40)

Industry:

Consumer Products & Services > Travel (1.00)
Consumer Products & Services > Hotels (0.90)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.98)

Nearest Neighbor Search-Based Bitwise Source Separation Using Discriminant Winner-Take-All Hashing

Kim, Sunwoo, Kim, Minje

arXiv.org Artificial Intelligence, Aug-26-2019

We propose an iteration-free source separation algorithm based on Winner-Take-All (WTA) hash codes, which is a faster, yet accurate alternative to a complex machine learning model for single-channel source separation in a resource-constrained environment. We first generate random permutations with WTA hashing to encode the shape of the multidimensional audio spectrum to a reduced bitstring representation. A nearest neighbor search on the hash codes of an incoming noisy spectrum as the query string results in the closest matches among the hashed mixture spectra. Using the indices of the matching frames, we obtain the corresponding ideal binary mask vectors for denoising. Since both the training data and the search operation are bitwise, the procedure can be done efficiently in hardware implementations. Experimental results show that the WTA hash codes are discriminant and provide an affordable dictionary search mechanism that leads to a competent performance compared to a comprehensive model and oracle masking.
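
As a rough illustration of the pipeline this abstract describes, the sketch below hashes a spectrum frame with WTA codes, finds the closest hashed dictionary frames by counting mismatched codes (a Hamming-style distance), and averages their binary masks. The random toy data, the function names `wta_hash` and `estimate_mask`, and the parameters `n_perms`, `k`, and `n_neighbors` are assumptions for illustration; this is not the authors' implementation, which operates on bitstrings of real mixture spectra.

```python
import numpy as np


def wta_hash(x, perms, k):
    """Winner-Take-All hash of a 1-D vector.

    For each permutation, keep the first k permuted entries and record
    the position (0..k-1) of the largest one.
    """
    return np.argmax(x[perms[:, :k]], axis=1)


rng = np.random.default_rng(0)
n_features, n_perms, k = 16, 32, 4

# Random permutations shared by the dictionary and every query.
perms = np.stack([rng.permutation(n_features) for _ in range(n_perms)])

# Toy "training" dictionary: random stand-ins for mixture spectra and
# their ideal binary masks (assumed data, for illustration only).
train = rng.random((50, n_features))
train_masks = (rng.random((50, n_features)) > 0.5).astype(float)
train_codes = np.stack([wta_hash(frame, perms, k) for frame in train])


def estimate_mask(query, n_neighbors=5):
    """Average the masks of the dictionary frames nearest to the query."""
    code = wta_hash(query, perms, k)
    # Distance = number of permutations whose winning index differs.
    dist = np.count_nonzero(train_codes != code, axis=1)
    nearest = np.argsort(dist, kind="stable")[:n_neighbors]
    return train_masks[nearest].mean(axis=0), nearest
```

Because a WTA code depends only on the relative ordering of the values, scaling a frame by a positive constant leaves its code unchanged, which is one reason such codes can capture the shape of a spectrum rather than its level.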

information retrieval, machine learning, natural language, (21 more...)

arXiv:1908.09799

Country: North America > United States > Indiana (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.75)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.64)
(2 more...)

WordPress 3 Search Engine Optimization - Programmer Books

#artificialintelligence, Aug-25-2019, 01:31:35 GMT

WordPress is a powerful platform for creating feature-rich and attractive websites and blogs, but with a little extra tweaking and effort your WordPress site can dominate the search engines and bring thousands of new customers to your blog or business. WordPress 3.0 Search Engine Optimization will show you the secrets that professional SEO companies use to take websites to the top of search results and grow their business. You'll be able to take your WordPress blog or site to the next level, as well as brush aside even the stiffest competition with this book in hand. We'll begin with a typical WordPress installation and, with a variety of simple techniques, turn it into a powerful website that search engines will reward with high rankings. We'll go further: with advanced plug-ins we'll connect your WordPress site to popular social media sites and expand the reach of your site to bring more visitors.

artificial intelligence, information retrieval, natural language, (7 more...)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

Automatic Language Identification in Texts: A Survey

Jauhiainen, Tommi, Lui, Marco, Zampieri, Marcos, Baldwin, Timothy, Lindén, Krister

Journal of Artificial Intelligence Research, Aug-25-2019

Language identification ("LI") is the problem of determining the natural language that a document or part thereof is written in. Automatic LI has been extensively researched for over fifty years. Today, LI is a key part of many text processing pipelines, as text processing techniques generally assume that the language of the input text is known. Research in this area has recently been especially active. This article provides a brief history of LI research, and an extensive survey of the features and methods used in the LI literature. We describe the features and methods using a unified notation, to make the relationships between methods clearer. We discuss evaluation methods, applications of LI, as well as off-the-shelf LI systems that do not require training by the end user. Finally, we identify open issues, survey the work to date on each issue, and propose future directions for research in LI.

pattern recognition association, text-based language identification, word-level language identification, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.11675

AI Access Foundation

11675

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
(135 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.92)

Industry:

Information Technology > Services (1.00)
Education (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(10 more...)

Google's John Mueller Answers if Linking Out is Good for SEO - Search Engine Journal

#artificialintelligence, Aug-24-2019, 13:33:10 GMT

Google launched a new video series in which each episode answers a single question. The first episode was about links, but in my opinion it did not adequately answer the question: "Does linking to other websites help or hurt SEO?" The SEO community has thought of outbound links as ranking signals since at least 2002. I hope to show you how and why the idea of outbound links for SEO was invented.

information retrieval, natural language, outbound link, (16 more...)

Genre: Personal > Opinion (0.37)

Technology:

Information Technology > Information Management > Search (0.40)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)

As Search Engines Increasingly Turn To AI They Are Harming Search

#artificialintelligence, Aug-22-2019, 07:06:23 GMT

For more than half a century our digital search engines have relied upon the humble keyword. Yet over the past few years, search engines of all kinds have increasingly turned to deep learning-powered categorization and recommendation algorithms to augment and slowly replace the traditional keyword search. Behavioral and interest-based personalization has further eroded the impact of keyword searches, meaning that if ten people all search for the same thing, they may all get different results. As search engines deprecate traditional raw "search" in favor of AI-assisted navigation, the concept of informational access is being harmed and our digital world is being redefined by the limitations of today's AI. At first glance, the evolution of search from simple TF-IDF keyword queries into today's AI-powered personalized digital navigation is a positive step towards making the digital world more accessible to the general public.

information retrieval, machine learning, natural language, (17 more...)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

Intent term selection and refinement in e-commerce queries

Manchanda, Saurav, Sharma, Mohit, Karypis, George

arXiv.org Artificial Intelligence, Aug-22-2019

In e-commerce, a user tends to search for the desired product by issuing a query to the search engine and examining the retrieved results. If the search engine correctly understands the user's query, it will return results that correspond to the products whose attributes match the terms in the query that are representative of the query's product intent. However, the search engine may fail to retrieve results that satisfy the query's product intent, thus degrading the user experience, because of two issues in query processing: (i) when multiple terms are present in a query, it may fail to determine the relevant terms that are representative of the query's product intent, and (ii) it may suffer from a vocabulary gap between the terms in the query and the product's description, i.e., terms used in the query are semantically similar to but different from the terms in the product description. Hence, identifying the terms that describe the query's product intent, and predicting additional terms that describe that intent better than the existing query terms, is an essential task in e-commerce search. In this paper, we leverage the historical query reformulation logs of a major e-commerce retailer to develop distant-supervised approaches to solve both these problems. Our approaches exploit the fact that the significance of a term depends on the context (other terms in the neighborhood) in which it is used in order to learn the importance of the term towards the query's product intent. We show that identifying and emphasizing the terms that define the query's product intent leads to a 3% improvement in ranking. Moreover, for the tasks of identifying the important terms in a query and predicting the additional terms that represent product intent, experiments illustrate that our approaches outperform the non-contextual baselines.

information retrieval, machine learning, natural language, (17 more...)

arXiv:1908.08564

Country: North America > United States (0.93)

Genre: Research Report (1.00)

Industry:

Retail (1.00)
Information Technology > Services > e-Commerce Services (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Towards Effective Device-Aware Federated Learning

Anelli, Vito Walter, Deldjoo, Yashar, Di Noia, Tommaso, Ferrara, Antonio

arXiv.org Machine Learning, Aug-20-2019

With the wealth of information produced by social networks, smartphones, and medical or financial applications, concerns have been raised about the sensitivity of such data in terms of users' personal privacy and data security. To address these issues, Federated Learning (FL) has recently been proposed as a means to leave data and computational resources distributed over a large number of nodes (clients), where a central coordinating server aggregates only locally computed updates without knowing the original data. In this work, we extend the FL framework by pushing forward the state of the art in the field on several dimensions: (i) unlike the original FedAvg approach, which relies solely on a single criterion (i.e., local dataset size), a suite of domain- and client-specific criteria constitutes the basis to compute each local client's contribution, (ii) the multi-criteria contribution of each device is computed in a prioritized fashion by leveraging a priority-aware aggregation operator used in the field of information retrieval, and (iii) a mechanism is proposed for online adjustment of the aggregation operator parameters via a local search strategy with backtracking. Extensive experiments on a publicly available dataset indicate the merits of the proposed approach compared to the standard FedAvg baseline.
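
For readers unfamiliar with the baseline named in this abstract, the following minimal sketch shows FedAvg-style aggregation, in which the server averages client parameter vectors weighted only by local dataset size. The function name `fedavg` and the flat-vector representation of model parameters are assumptions for illustration; the paper's multi-criteria, priority-aware operator is not reproduced here.

```python
import numpy as np


def fedavg(client_params, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    client_params: list of equal-length 1-D arrays, one per client.
    client_sizes:  number of local training examples per client.
    """
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()          # each client's share of the data
    stacked = np.stack(client_params)     # shape: (n_clients, n_params)
    return np.tensordot(coeffs, stacked, axes=1)


# A client holding three times as much data pulls the average towards itself.
global_params = fedavg(
    [np.array([0.0, 0.0]), np.array([4.0, 8.0])],
    client_sizes=[1, 3],
)
```

The single weighting criterion is visible in `coeffs`: replacing it with a score computed from several client-specific criteria is the kind of extension the abstract proposes.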

criteria, information retrieval, machine learning, (18 more...)

arXiv:1908.0742

Country: Europe (0.28)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.54)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.54)

Learning Representations and Agents for Information Retrieval

Nogueira, Rodrigo

arXiv.org Artificial Intelligence, Aug-16-2019

A goal shared by artificial intelligence and information retrieval is to create an oracle, that is, a machine that can answer our questions, no matter how difficult they are. A more limited, but still instrumental, version of this oracle is a question-answering system, in which an open-ended question is given to the machine, and an answer is produced based on the knowledge it has access to. Such systems already exist and are increasingly capable of answering complicated questions. This progress can be partially attributed to the recent success of machine learning and to efficient methods for storing and retrieving information, most notably through web search engines. One can imagine that this general-purpose question-answering system could be built as a billion-parameter neural network trained end-to-end on a large number of question-answer pairs. We argue, however, that although this approach has been very successful for tasks such as machine translation, storing the world's knowledge as parameters of a learning machine can be very hard. A more efficient way is to train an artificial agent to use an external retrieval system to collect relevant information. This agent can leverage the effort that has been put into designing and running efficient storage and retrieval systems by learning how to best utilize them to accomplish a task. ...

information retrieval, machine learning, question answering, (19 more...)

arXiv:1908.06132

Country:

Europe (0.93)
North America > United States > California (0.27)

Genre: Research Report > New Finding (0.92)

Industry:

Leisure & Entertainment > Games (1.00)
Media (0.93)
Health & Medicine (0.93)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
