 Information Retrieval


Search Engine Optimization All-in-One For Dummies, 3rd Edition - Programmer Books

#artificialintelligence

In Search Engine Optimization All-in-One For Dummies, 3rd Edition, Bruce Clay, whose search engine consultancy predates Google, shares everything you need to know about SEO. In minibooks that cover the entire topic, you'll discover how search engines work, how to apply effective keyword strategies, ways to use SEO to position yourself competitively, the latest on international SEO practices, and more. If SEO makes your head spin, this no-nonsense guide makes it easier. You'll get the lowdown on how to use search engine optimization to improve the quality and volume of traffic to your website via search engine results. Cutting through technical jargon, it gets you up to speed quickly on how to use SEO to get your website to the top of the rankings, target different kinds of searches, and win more industry-specific vertical search engine results!


How AI can help you boost your WordPress site? - Part I

#artificialintelligence

Maintaining a WordPress site takes consistency and tenacity if it is to be a profitable enterprise. The work can become herculean, time-consuming, and excruciating if you have to manage every WordPress plug-in manually, without any assistance. With the advent of Artificial Intelligence (AI), most of the tasks handled through these plug-ins are simplified, which boosts the quality of your WordPress site and, in turn, increases your profitability. This is part I of a two-part article.


Future search engines will help you find information you don't even know you need - University of Helsinki

#artificialintelligence

The research surrounding methods of information retrieval is an entire field of science whose specialists aim to provide us with even better search results – a necessity as the amount of data constantly keeps growing. To succeed in their quest, researchers are focusing on the interaction between humans and computers, connecting methods of machine learning to this interaction. One of these researchers is Dorota Głowacka, who assumed an assistant professorship in machine learning and data science at the Helsinki Centre for Data Science HiDATA at the beginning of 2019. Głowacka is studying what people search for and how they interact with search engines, with a particular focus on exploratory search. This is a search method that helps find matters relevant to the person looking for information, even if they are not entirely certain about what they are looking for to begin with.


SMX Overtime: Here's how to make SEO gains through data science - Search Engine Land

#artificialintelligence

I am a senior data scientist at LinkedIn working on SEO and guest experience. I presented at SMX London last month about how to apply data science in SEO. The session covered topics including metrics, A/B testing, SEO vs. SEM cannibalization testing and machine learning for content quality. Here are a few questions from session attendees with my responses. For A/B testing, do you use any specific tools/processes?


Representation Learning for Words and Entities

arXiv.org Artificial Intelligence

This thesis presents new methods for unsupervised learning of distributed representations of words and entities from text and knowledge bases. The first algorithm presented in the thesis is a multi-view algorithm for learning representations of words called Multiview Latent Semantic Analysis (MVLSA). By incorporating up to 46 different types of co-occurrence statistics for the same vocabulary of English words, I show that MVLSA outperforms other state-of-the-art word embedding models. Next, I focus on learning entity representations for search and recommendation and present the second method of this thesis, Neural Variational Set Expansion (NVSE). NVSE is also an unsupervised learning method, but it is based on the Variational Autoencoder framework. Evaluations with human annotators show that NVSE can facilitate better search and recommendation of information gathered from noisy, automatic annotation of unstructured natural language corpora. Finally, I move from unstructured data and focus on structured knowledge graphs. I present novel approaches for learning embeddings of vertices and edges in a knowledge graph that obey logical constraints.


Automated Machine Learning: State-of-The-Art and Open Challenges

arXiv.org Machine Learning

With the continuous and vast increase in the amount of data in our digital world, it has been acknowledged that the number of knowledgeable data scientists cannot scale to address these challenges. Thus, there was a crucial need for automating the process of building good machine learning models. In the last few years, several techniques and frameworks have been introduced to tackle the challenge of automating the process of Combined Algorithm Selection and Hyper-parameter tuning (CASH) in the machine learning domain. The main aim of these techniques is to reduce the role of the human in the loop and fill the gap for non-expert machine learning users by playing the role of the domain expert. In this paper, we present a comprehensive survey of the state-of-the-art efforts in tackling the CASH problem. In addition, we highlight the research work of automating the other steps of the full complex machine learning pipeline (AutoML) from data understanding to model deployment. Furthermore, we provide comprehensive coverage of the various tools and frameworks that have been introduced in this domain. Finally, we discuss some of the research directions and open challenges that need to be addressed in order to achieve the vision and goals of the AutoML process.


Detecting Everyday Scenarios in Narrative Texts

arXiv.org Artificial Intelligence

Script knowledge consists of detailed information on everyday activities. Such information is often taken for granted in text and needs to be inferred by readers. Therefore, script knowledge is a central component to language comprehension. Previous work on representing scripts is mostly based on extensive manual work or limited to scenarios that can be found with sufficient redundancy in large corpora. We introduce the task of scenario detection, in which we identify references to scripts. In this task, we address a wide range of different scripts (200 scenarios) and we attempt to identify all references to them in a collection of narrative texts. We present a first benchmark data set and a baseline model that tackles scenario detection using techniques from topic segmentation and text classification.


'How tall is the tower in Paris?' How Bing knows it's about the Eiffel Tower

#artificialintelligence

Only a few years ago, web search was simple. Users typed a few words and waded through pages of results. Today, those same users may instead snap a picture on a phone and drop it into a search box or use an intelligent assistant to ask a question without physically touching a device at all. They may also type a question and expect an actual reply, not a list of pages with likely answers. These tasks challenge traditional search engines, which are based around an inverted index system that relies on keyword matches to produce results.
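To make the keyword-matching limitation concrete, here is a minimal Python sketch of an inverted index. It is an illustration under stated assumptions, not a description of how Bing or any production engine is built: the three documents, the whitespace tokenizer, and the function names `build_inverted_index` and `keyword_search` are all hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: naive whitespace tokenization, no stemming or stop words.
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def keyword_search(index, query):
    """Return ids of documents that contain every query term (boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical toy corpus, invented for this example.
docs = {
    1: "the eiffel tower is very tall",
    2: "paris is the capital of france",
    3: "how tall is the tower in pisa",
}

index = build_inverted_index(docs)
print(keyword_search(index, "tall tower"))        # documents 1 and 3
print(keyword_search(index, "tall tower paris"))  # empty set
```

In this toy corpus the query "tall tower paris" matches nothing, even though document 1 is about the tower the user means: no single document contains all three keywords. Bridging that gap between the words typed and the entity intended is the kind of task that pure keyword matching struggles with.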


Compositional Questions Do Not Necessitate Multi-hop Reasoning

arXiv.org Artificial Intelligence

Multi-hop reading comprehension (RC) questions are challenging because they require reading and reasoning over multiple paragraphs. We argue that it can be difficult to construct large multi-hop RC datasets. For example, even highly compositional questions can be answered with a single hop if they target specific entity types, or the facts needed to answer them are redundant. Our analysis is centered on HotpotQA, where we show that single-hop reasoning can solve much more of the dataset than previously thought. We introduce a single-hop BERT-based RC model that achieves 67 F1, comparable to state-of-the-art multi-hop models. We also design an evaluation setting where humans are not shown all of the necessary paragraphs for the intended multi-hop reasoning but can still answer over 80% of questions. Together with detailed error analysis, these results suggest there should be an increasing focus on the role of evidence in multi-hop reasoning and possibly even a shift towards information retrieval style evaluations with large and diverse evidence collections.


Ambiverse - an amazing open-source suite for natural language understanding

#artificialintelligence

While doing performance benchmarks for Named Entity Linking solutions for our AI/FinTech start-up Risklio, I stumbled upon a very powerful, only just open-sourced framework called AmbiverseNLU. It was developed by Ambiverse and is based on work previously done at the Max Planck Institute¹. The components it uses are more well-known: entity recognition from KnowNER², open information extraction using ClausIE³ and AIDA, an entity detection and disambiguation tool⁴. You can have a look at the demo here. For the former one you can choose whether to use Apache Cassandra or PostgreSQL as a backend, while the last one uses Neo4j.