Information Retrieval
Optimization of Retrieval Algorithms on Large Scale Knowledge Graphs
Dörpinghaus, Jens, Stefan, Andreas
Knowledge graphs have been shown to play an important role in recent knowledge mining and discovery, for example in the life sciences or bioinformatics. Although a lot of research has been done in the fields of query optimization, query transformation and, of course, storing and retrieving large-scale knowledge graphs, algorithmic optimization is still a major challenge and a vital factor in using graph databases. Few researchers have addressed the problem of optimizing algorithms on large-scale labeled property graphs. Here, we present two optimization approaches and compare them with a naive approach of directly querying the graph database. The aim of our work is to determine limiting factors of graph databases like Neo4j, and we describe a novel solution to tackle these challenges. For this, we suggest a classification schema to distinguish the complexity of problems on a graph database. We evaluate our optimization approaches on a test system containing a knowledge graph derived from biomedical publication data enriched with text mining data. This dense graph has more than 71M nodes and 850M relationships. The results are very encouraging and, depending on the problem, we were able to show a speedup by a factor of between 44 and 3839.
Search Medical Device and BioPharma Regulatory Guidance
You need either "Information" or "Knowledge". HUNT from Vistaar is an "AI driven" search of our internal repository of over 980K regulations and guidance documents from global health authorities, which is updated daily. So get your "Authoritative" documents from different countries in one place and in one search. Be sure about the documents.
Message Passing for Query Answering over Knowledge Graphs
Logic-based systems for query answering over knowledge graphs return only answers that rely on information explicitly represented in the graph. To improve recall, recent works have proposed the use of embeddings to predict additional information like missing links or labels. These embeddings enable scoring entities in the graph as the answer to a query, without being fully dependent on the graph structure. In its simplest case, answering a query in such a setting requires predicting a link between two entities. However, link prediction is not sufficient to address complex queries that involve multiple entities and variables. To solve this task, we propose to apply a message passing mechanism to a graph representation of the query, where nodes correspond to variables and entities. This results in an embedding of the query, such that answering entities are close to it in the embedding space. The general formulation of our method allows it to encode a more diverse set of query types in comparison to previous work. We evaluate our method by answering queries that rely on edges not seen during training, obtaining competitive performance. In contrast with previous work, we show that our method can generalize from training for the single-hop, link prediction task, to answering queries with more complex structures. A qualitative analysis reveals that the learned embeddings successfully capture the notion of different entity types.
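To make the idea of embedding a query graph by message passing more concrete, here is a deliberately simplified sketch. It is not the authors' architecture: the mean-based update rule, the toy embedding values, and every name (`ENTITY_EMB`, `RELATION_EMB`, `embed_query`, `score`) are assumptions made for illustration, and in a real system the embeddings would be learned.

```python
# Toy, untrained embeddings (hypothetical values for illustration only).
ENTITY_EMB = {"aspirin": [1.0, 0.0], "headache": [0.0, 1.0]}
RELATION_EMB = {"treats": [0.5, 0.5]}


def add(u, v):
    """Element-wise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]


def mean(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def embed_query(nodes, edges, steps=1):
    """Embed a query graph.

    nodes: dict mapping node name -> initial vector (entities use their
           embedding, variables start at zero).
    edges: list of (source, relation, target) triples.
    Each step, a node averages its own state with the messages
    (source state + relation embedding) arriving on its incoming edges.
    The query embedding is the mean of the final node states.
    """
    state = dict(nodes)
    for _ in range(steps):
        new_state = {}
        for node in state:
            messages = [
                add(state[src], RELATION_EMB[rel])
                for src, rel, dst in edges
                if dst == node
            ]
            new_state[node] = mean([state[node]] + messages)
        state = new_state
    return mean(list(state.values()))


def score(query_vec, entity):
    """Dot product between the query embedding and an entity embedding."""
    return sum(q * e for q, e in zip(query_vec, ENTITY_EMB[entity]))


# Query: "which ?x does aspirin treat?" -> one entity node, one variable node.
query_nodes = {"aspirin": ENTITY_EMB["aspirin"], "?x": [0.0, 0.0]}
query_edges = [("aspirin", "treats", "?x")]
query_vec = embed_query(query_nodes, query_edges, steps=1)
```

With trained embeddings, candidate entities would be ranked by `score(query_vec, entity)`; with the toy values above, the scores carry no meaning beyond showing the data flow.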
Concept Embedding for Information Retrieval
Concepts are used to solve the term-mismatch problem. However, we need an effective similarity measure between concepts. Word embedding presents a promising solution. In this study, we present three approaches to building concept vectors based on word vectors. We use a vector-based measure to estimate inter-concept similarity. Our experiments show promising results. Furthermore, words and concepts become comparable. This could be used to improve the conceptual indexing process.
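The abstract does not say which three approaches the study uses, so the following is only a minimal sketch of one plausible way to build a concept vector from word vectors: averaging the vectors of the words that label the concept, then comparing concepts (or a word and a concept) with cosine similarity. The word vectors are made-up toy values, and the names `WORD_VECTORS`, `concept_vector`, and `cosine` are hypothetical.

```python
import math

# Toy word vectors (made-up values; real ones would come from a trained model).
WORD_VECTORS = {
    "heart": [0.9, 0.1, 0.0],
    "attack": [0.7, 0.3, 0.1],
    "myocardial": [0.8, 0.2, 0.0],
    "infarction": [0.8, 0.1, 0.1],
    "banana": [0.0, 0.1, 0.9],
}


def concept_vector(terms, word_vectors):
    """Average the vectors of the known words that label a concept."""
    vectors = [word_vectors[t] for t in terms if t in word_vectors]
    if not vectors:
        raise ValueError("no known words for this concept")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


heart_attack = concept_vector(["heart", "attack"], WORD_VECTORS)
infarction = concept_vector(["myocardial", "infarction"], WORD_VECTORS)
fruit = concept_vector(["banana"], WORD_VECTORS)

sim_related = cosine(heart_attack, infarction)    # concept vs. concept
sim_unrelated = cosine(heart_attack, fruit)       # concept vs. concept
sim_word_concept = cosine(WORD_VECTORS["heart"], infarction)  # word vs. concept
```

Because concept vectors live in the same space as word vectors, the same `cosine` function compares a word with a concept, which is the comparability the abstract points to.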
Your reputation depends on a solid (and legal) online review strategy - Search Engine Land
Consumers rely on search results, social media and peer reviews to perform research and gather feedback on businesses they are considering visiting or products they're thinking of purchasing. And while it can be easy to turn a blind eye to the reviews your business receives, simply ignoring those review sites can be damaging to your bottom line. While this probably comes as no surprise, 95% of shoppers read online reviews before making a purchase. As they seek out peer reviews on brands or products that they're considering doing business with, they're looking for specific things. Consumers actually look for negative reviews to discover authentic feedback from real customers.
Traveloka: Using Data to Build a Universal Search Engine - Lionbridge AI
Traveloka is an online travel company that provides a one-stop platform for a range of ticketing services, including flights, accommodation, and attractions. As one of Southeast Asia's "unicorn" startups valued at over $1 billion, Traveloka is constantly searching for ways to improve their user experience. As part of this initiative, Traveloka has invested heavily in a number of artificial intelligence and machine learning projects. With an expanding list of 19 core product offerings, improving search capabilities was key to their continued growth. To do this, Traveloka built a search function to make it easy for users to browse the full range of products from a single search bar.
An Introduction to Neural Information Retrieval - Microsoft Research
Neural ranking models for information retrieval (IR) use shallow or deep neural networks to rank search results in response to a query. Traditional learning to rank models employ supervised machine learning (ML) techniques--including neural networks--over hand-crafted IR features. By contrast, more recently proposed neural models learn representations of language from raw text that can bridge the gap between query and document vocabulary. Unlike classical learning to rank models and non-neural approaches to IR, these new ML techniques are data-hungry, requiring large scale training data before they can be deployed. This tutorial introduces basic concepts and intuitions behind neural IR models, and places them in the context of classical non-neural approaches to IR.
How Artificial Intelligence Is Changing SEO - Pipefy
Over time, the pace of technological change has left some content marketers in chaos. With AI evolving fast, however, the pillars of SEO are expected to become vital for future business. The developing systems are expected to make work easier and more reliable. So far, SEO has received considerable attention from marketers, as it has become one of the primary tools for boosting sales. However, considerable attention should also be given to the integration of AI into SEO and how it will impact its functionality.
BETO: Spanish BERT
Transformer-based models are having a tremendous impact on NLP, as they have proven to be effective in a wide range of tasks such as POS tagging, machine translation, named-entity recognition, and a series of text classification tasks. This year saw the introduction of a whole family of transformer-based language models such as BERT, Transformer-XL, and GPT-2, among others. Language models, in general, offer desirable properties that can be leveraged in a transfer learning setting where you train a model with large-scale data to learn the properties of language in an unsupervised setting. The resulting model and weights can then be fine-tuned and applied in low-resource regimes to address different NLP tasks. In particular, it's exciting to see the use of BERT in different domains such as text classification, text summarization, text generation, and information retrieval.
Building a Video Search Engine
A natural progression in the field of computer vision, following unprecedented progress in image classification tasks, is towards video and video understanding, especially how it relates to identifying human subjects and activities. A number of datasets and benchmarks are being established in this area. In parallel, further progress is being made in 2D image-related computer vision tasks such as fine-grained classification, image segmentation, 3D image construction, robot vision, scene flow estimation and human pose estimation. As part of my final Data Science project at Metis bootcamp, I've decided to marry these two parallel tracks, specifically video and human pose estimation, to create a content-based video search engine. Since applying 2D human pose estimation to video search is a novel idea with "no proof of concept", I have simplified my approach by selecting single-performer, fixed-location, single-camera footage of salsa dance videos.