AITopics | Information Retrieval


Information Retrieval

Our accustomed systems of retrieving particular bits of information no longer fill the needs of many people. Searching traditional indexes of print publications has been aided by computerized databases, but still usually requires time-consuming serial searching of one database after the other, and then moving on to other methods of searching for internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where widely different formats contain the information being sought, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.


Natural Language Processing of Clinical Notes on Chronic Diseases: Systematic Review

Sheikhalishahi, Seyedmostafa, Miotto, Riccardo, Dudley, Joel T, Lavelli, Alberto, Rinaldi, Fabio, Osmani, Venet

arXiv.org Artificial Intelligence, Aug-15-2019

Of the 2652 articles considered, 106 met the inclusion criteria. Review of the included papers resulted in identification of 43 chronic diseases, which were then further classified into 10 disease categories using ICD-10. The majority of studies focused on diseases of the circulatory system (n=38) while endocrine and metabolic diseases were fewest (n=14). This was due to the structure of clinical records related to metabolic diseases, which typically contain much more structured data, compared with medical records for diseases of the circulatory system, which focus more on unstructured data and consequently have seen a stronger focus of NLP. The review has shown that there is a significant increase in the use of machine learning methods compared to rule-based approaches; however, deep learning methods remain emergent (n=3). Consequently, the majority of works focus on classification of disease phenotype with only a handful of papers addressing extraction of comorbidities from the free text or integration of clinical notes with structured data. There is a notable use of relatively simple methods, such as shallow classifiers (or combination with rule-based methods), due to the interpretability of predictions, which still represents a significant issue for more complex methods. Finally, scarcity of publicly available data may also have contributed to insufficient development of more advanced methods, such as extraction of word embeddings from clinical notes. Further efforts are still required to improve (1) progression of clinical NLP methods from extraction toward understanding; (2) recognition of relations among entities rather than entities in isolation; (3) temporal extraction to understand past, current, and future clinical events; (4) exploitation of alternative sources of clinical knowledge; and (5) availability of large-scale, de-identified clinical corpora.

information retrieval, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.2196/12239

1908.0578

Country:

North America > United States (0.46)
Europe > Switzerland (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(2 more...)


Reasoning-Driven Question-Answering for Natural Language Understanding

Khashabi, Daniel

arXiv.org Artificial Intelligence, Aug-13-2019

Natural language understanding (NLU) of text is a fundamental challenge in AI, and it has received significant attention throughout the history of NLP research. This primary goal has been studied under different tasks, such as Question Answering (QA) and Textual Entailment (TE). In this thesis, we investigate the NLU problem through the QA task and focus on the aspects that make it a challenge for the current state-of-the-art technology. This thesis is organized into three main parts: In the first part, we explore multiple formalisms to improve existing machine comprehension systems. We propose a formulation for abductive reasoning in natural language and show its effectiveness, especially in domains with limited training data. Additionally, to help reasoning systems cope with irrelevant or redundant information, we create a supervised approach to learn and detect the essential terms in questions. In the second part, we propose two new challenge datasets. In particular, we create two datasets of natural language questions where (i) the first one requires reasoning over multiple sentences; (ii) the second one requires temporal common sense reasoning. We hope that the two proposed datasets will motivate the field to address more complex problems. In the final part, we present the first formal framework for multi-step reasoning algorithms, in the presence of a few important properties of language use, such as incompleteness, ambiguity, etc. We apply this framework to prove fundamental limitations for reasoning algorithms. These theoretical results provide extra intuition into the existing empirical evidence in the field.

comprehension task, machine reading comprehension, reading comprehension dataset, (16 more...)

arXiv.org Artificial Intelligence

1908.04926

Country:

North America > United States > California > San Francisco County > San Francisco (0.13)
North America > United States > New York (0.04)
North America > United States > Pennsylvania (0.04)
(18 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.92)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Education > Educational Setting > K-12 Education (0.92)
Health & Medicine (0.92)
Education > Assessment & Standards > Student Performance (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(13 more...)


A Survey of Cross-lingual Word Embedding Models

Ruder, Sebastian, Vulić, Ivan, Søgaard, Anders

Journal of Artificial Intelligence Research, Aug-12-2019

Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing models for low-resource languages. In this survey, we provide a comprehensive typology of cross-lingual word embedding models. We compare their data requirements and objective functions. The recurring theme of the survey is that many of the models presented in the literature optimize for the same objectives, and that seemingly different models are often equivalent, modulo optimization strategies, hyper-parameters, and such. We also discuss the different ways cross-lingual word embeddings are evaluated, as well as future challenges and research horizons.

cross-lingual word, proceedings, representation, (17 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.11640

AI Access Foundation

11640


Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.27)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > Middle East > Jordan (0.04)
(3 more...)

Genre: Overview (1.00)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(4 more...)


Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query Complexity

Huang, Feihu, Gao, Shangqian, Pei, Jian, Huang, Heng

arXiv.org Machine Learning, Jul-29-2019

Zeroth-order (gradient-free) methods are a powerful class of optimization tools for many machine learning problems because they need only function values (not gradients) during optimization. In particular, zeroth-order methods are well suited to complex problems such as black-box attacks and bandit feedback, whose explicit gradients are difficult or infeasible to obtain. Although many zeroth-order methods have been developed recently, these approaches still have two main drawbacks: 1) high function query complexity; 2) poor suitability for problems with complex penalties and constraints. To address these drawbacks, in this paper we propose a novel fast zeroth-order stochastic alternating direction method of multipliers (ADMM), i.e., ZO-SPIDER-ADMM, with lower function query complexity for solving nonconvex problems with multiple nonsmooth penalties. Moreover, we prove that our ZO-SPIDER-ADMM has the optimal function query complexity of $O(dn + dn^{\frac{1}{2}}\epsilon^{-1})$ for finding an $\epsilon$-approximate local solution, where $n$ and $d$ denote the sample size and dimension of data, respectively. In particular, ZO-SPIDER-ADMM improves the existing best nonconvex zeroth-order ADMM methods by a factor of $O(d^{\frac{1}{3}}n^{\frac{1}{6}})$. Moreover, we propose a fast online ZO-SPIDER-ADMM (i.e., ZOO-SPIDER-ADMM). Our theoretical analysis shows that ZOO-SPIDER-ADMM has a function query complexity of $O(d\epsilon^{-\frac{3}{2}})$, which improves the existing best result by a factor of $O(\epsilon^{-\frac{1}{2}})$. Finally, we use a structured adversarial attack task on black-box deep neural networks to demonstrate the efficiency of our algorithms.
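To make the idea of optimizing from function values alone concrete, here is a minimal sketch of a coordinate-wise central-difference gradient estimate plugged into plain gradient descent. It is not the ZO-SPIDER-ADMM algorithm from the paper; the names `zo_gradient`, `zo_descent`, and `quadratic`, the smoothing parameter `mu`, and the step size are all illustrative assumptions.

```python
from typing import Callable, List


def zo_gradient(f: Callable[[List[float]], float],
                x: List[float],
                mu: float = 1e-4) -> List[float]:
    """Estimate the gradient of f at x using only function values.

    Uses a central difference along each coordinate, so one estimate
    costs 2 * len(x) function queries.
    """
    grad = []
    for j in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[j] += mu
        backward[j] -= mu
        grad.append((f(forward) - f(backward)) / (2.0 * mu))
    return grad


def zo_descent(f: Callable[[List[float]], float],
               x0: List[float],
               step: float = 0.1,
               iters: int = 200,
               mu: float = 1e-4) -> List[float]:
    """Gradient descent driven by the zeroth-order estimate (hypothetical helper)."""
    x = list(x0)
    for _ in range(iters):
        g = zo_gradient(f, x, mu)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def quadratic(x: List[float]) -> float:
    """Toy black-box objective with its minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


if __name__ == "__main__":
    print(zo_gradient(quadratic, [0.0, 0.0]))
    print(zo_descent(quadratic, [0.0, 0.0]))
```

Each gradient estimate here costs two function queries per coordinate, which illustrates why the dimension $d$ appears in query-complexity bounds for this kind of method.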

inequality hold, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

1907.13463

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)


5 Ways AI Has Changed Ecommerce - Search Engine Journal

#artificialintelligence, Jul-26-2019, 15:08:47 GMT

This is a sponsored post written by Atomic Reach. The opinions expressed in this article are the sponsor's own. When it comes to shopping, many customers have decided to take their business online. Statista has estimated that 1.92 billion global buyers will participate in ecommerce activities in 2019. The number is expected to rise to more than 2 billion by 2021. This demand for online goods has caused companies to be more creative in how they reach audiences online.

customer, information retrieval, natural language, (17 more...)

#artificialintelligence

Industry:

Marketing (0.72)
Information Technology > Services > e-Commerce Services (0.69)

Technology:

Information Technology > Information Management > Search (0.86)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.52)


The Green Google: Berlin Search Engine Uses Profits to Plant Trees

Der Spiegel International, Jul-25-2019, 18:25:45 GMT

At first glance, the Berlin startup doesn't seem so different from others: a factory floor in the rear courtyard of a building in the city's Neukölln district, stacked preserving jars filled with muesli in the kitchen, a discarded ping-pong surface repurposed as a conference table. The employees are young, relaxed and very international. The company's head and founder, Christian Kroll, is 35 years old, the same age as Mark Zuckerberg. The two men also share a quirk: To avoid wasting time in the mornings choosing an outfit, he always wears the same thing -- in his case, blank white T-shirts made from organic cotton. Zuckerberg's favorite color, by contrast, is gray.

information retrieval, kroll, natural language, (16 more...)

Der Spiegel International

Country:

Europe > Germany > Saxony-Anhalt (0.05)
Asia > Indonesia (0.05)
South America > Brazil (0.04)
(8 more...)

Industry:

Energy > Renewable (0.47)
Energy > Power Industry (0.47)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.44)


Production Ranking Systems: A Review

Iqbal, Murium, Subedi, Nishan, Aryafar, Kamelia

arXiv.org Machine Learning, Jul-24-2019

The problem of ranking is a multi-billion dollar problem. In this paper we present an overview of several production quality ranking systems. We show that due to conflicting goals of employing the most effective machine learning models and responding to users in real time, ranking systems have evolved into a system of systems, where each subsystem can be viewed as a component layer. We view these layers as being data processing, representation learning, candidate selection and online inference. Each layer employs different algorithms and tools, with every end-to-end ranking system spanning multiple architectures. Our goal is to familiarize the general audience with a working knowledge of ranking at scale, the tools and algorithms employed and the challenges introduced by adopting a layered approach.

data mining, information retrieval, machine learning, (19 more...)

arXiv.org Machine Learning

1907.12372

Country: Europe > France (0.16)

Genre: Research Report (1.00)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
(4 more...)


Is Google creating a voice-activated search engine for TODDLERS?

Daily Mail - Science & tech, Jul-22-2019, 23:34:33 GMT

Google is potentially creating a search engine for toddlers, despite recent privacy scandals. The tech giant has filed a European patent, entitled Gamifying Voice Search Experience for Children, which gives it exclusive rights to develop the concept. Aimed at nursery-age youngsters, the prospective product would use a child-friendly bubble-interface to engage with infants. This would be separate to Google Assistant, which already allows people to conduct voice-activated searches on their devices. However, education experts have raised concerns over the risk of potential privacy violations, such as those associated with Amazon's Echo Device, plus the dangers of making children addicted to technology.

artificial intelligence, information retrieval, natural language, (15 more...)

Daily Mail - Science & tech

Country: Asia (0.30)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.73)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.73)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.62)


Benefits of Enabling Enterprise Search in your digital Workplace eXo

#artificialintelligence, Jul-22-2019, 15:59:00 GMT

A disconnected/disengaged workforce, broken business processes and an overall decrease in efficiency represent the most recurrent challenges facing organizations today. As a result, digital workplace solutions have grown in popularity as they offer a holistic solution capable of integrating different tools and applications. A typical digital workplace includes a knowledge management system (KMS), an enterprise social network (ESN), an intranet portal, instant messaging and more. It also integrates different third-party software used internally, from CRM to Human Resources Information Systems (HRIS). For better usage and efficiency, a digital workplace needs to collect data from all these data sources and make it widely accessible to users in a centralized place – thus the importance of the enterprise search engine.

artificial intelligence, information retrieval, natural language, (15 more...)

#artificialintelligence

Industry: Information Technology (0.69)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.83)


Learning to Rank Broad and Narrow Queries in E-Commerce

Devapujula, Siddhartha, Arora, Sagar, Borar, Sumit

arXiv.org Machine Learning, Jul-15-2019

Search is a prominent channel for discovering products on an e-commerce platform. Ranking the products retrieved by search is crucial for addressing customers' needs and optimizing business metrics. While Learning to Rank (LETOR) models have been extensively studied and have demonstrated efficacy in web search, they are a relatively new research area in e-commerce. In this paper, we present a framework for building a LETOR model for an e-commerce platform. We analyze user queries and propose a mechanism to segment queries into broad and narrow based on the user's intent. We discuss different types of features (query, product and query-product) and the challenges in using them. We show that sparsity in product features can be tackled through a denoising auto-encoder, while skip-gram based word embeddings help solve the query-product sparsity issues. We also present various target metrics that can be employed for evaluating search results and compare their robustness. Further, we build and compare the performance of both pointwise and pairwise LETOR models on a fashion category data set. We also build and compare distinct models for broad and narrow queries, analyze feature importance across these, and show that these specialized models perform better than a combined model in the fashion world.
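The distinction between pointwise and pairwise LETOR objectives can be sketched as follows. The abstract does not state which loss functions the authors used, so the squared-error pointwise loss and logistic pairwise loss below are common textbook choices assumed for illustration, and the function names and toy scores are hypothetical.

```python
import math
from typing import List


def pointwise_loss(scores: List[float], labels: List[float]) -> float:
    """Mean squared error between each item's score and its own relevance label."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)


def pairwise_loss(scores: List[float], labels: List[float]) -> float:
    """Mean logistic loss over ordered pairs (i, j) where item i is more relevant.

    Only score differences matter, so the loss penalizes mis-ordered pairs
    rather than the absolute value of any single score.
    """
    losses = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                margin = scores[i] - scores[j]
                losses.append(math.log1p(math.exp(-margin)))
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    labels = [2.0, 1.0, 0.0]         # toy relevance grades for one query
    good_order = [3.0, 2.0, 1.0]     # scores that rank the items correctly
    bad_order = [1.0, 2.0, 3.0]      # scores that rank the items in reverse
    print(pointwise_loss(good_order, labels), pointwise_loss(bad_order, labels))
    print(pairwise_loss(good_order, labels), pairwise_loss(bad_order, labels))
```

In this sketch the pointwise loss still penalizes the correctly ordered scores because they are offset from the labels, while the pairwise loss depends only on the relative order and margins within the query.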

information retrieval, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

1907.01549

Genre: Research Report (0.50)

Industry: Information Technology > Services > e-Commerce Services (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.48)
