Information Retrieval
Does The Meta Description Tag Affect SEO & Search Engine Rankings?
There's a lot of confusion when it comes to meta descriptions and SEO. Do they affect search engine rankings? Is it worth spending the time to write a good meta description? Well, in theory, meta descriptions do not affect SEO. This is an official statement from Google, released in 2009. However, since meta descriptions show in the search engine results, they can affect click-through rates (CTRs), which are linked to SEO and rankings. So, in practice, meta descriptions might have an impact on SEO.
Which Social Network Drives the Most Traffic to Your Website? [POLL] - Search Engine Journal
Facebook typically drives around 60 percent of all the traffic Search Engine Journal gets from social media networks. This isn't too surprising, considering how big Facebook is with over 2.2 billion monthly active users. Even if social media won't directly help your organic search rankings, posting engaging content that attracts lots of traffic, shares, likes, and comments helps you increase your reach, visibility, and linking potential. So which social networks have the highest potential to send lots of traffic to your website? We asked our Twitter community what trends they're seeing.
Ranking in Genealogy: Search Results Fusion at Ancestry
Jiang, Peng, Yang, Yingrui, Bierner, Gann, Li, Fengjie Alex, Wang, Ruhan, Moghtaderi, Azadeh
Genealogy research is the study of family history using available resources such as historical records. Ancestry provides its customers with one of the world's largest online genealogical indexes, with billions of records from a wide range of sources, including vital records such as birth and death certificates, census records, and court and probate records, among many others. Search at Ancestry aims to return relevant records from various record types, allowing our subscribers to build their family trees, research their family history, and make meaningful discoveries about their ancestors from diverse perspectives. In a modern search engine designed for genealogical study, the appropriate ranking of search results to provide highly relevant information represents a daunting challenge. In particular, the disparity in historical records makes it inherently difficult to score records in an equitable fashion. Herein, we provide an overview of our solutions to overcome such record disparity problems in the Ancestry search engine. Specifically, we introduce customized coordinate ascent (customized CA) to speed up ranking within a specific record type. We then propose stochastic search (SS), which linearly combines ranked results federated across contents from various record types. Furthermore, we propose a novel information retrieval metric, normalized cumulative entropy (NCE), to measure the diversity of results. We demonstrate the effectiveness of these two algorithms in terms of relevance (by NDCG) and diversity (by NCE), where applicable, in offline experiments using real customer data at Ancestry.
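As a rough illustration of the relevance metric named in the abstract, the sketch below computes NDCG for a single ranked list of graded relevance labels. The exponential gain, the log2 discount, and the function names `dcg` and `ndcg` are assumptions chosen for illustration; the abstract does not say which NDCG variant the authors use. NCE is not sketched because its definition is not given in this excerpt.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of a ranked list of graded relevance labels.

    Assumes the common exponential-gain form: (2^rel - 1) / log2(rank + 1),
    with ranks starting at 1.
    """
    return sum(
        (2 ** rel - 1) / math.log2(position + 2)
        for position, rel in enumerate(relevances)
    )


def ndcg(relevances, k=None):
    """DCG of the list as ranked, divided by the DCG of the ideal ordering.

    If k is given, both rankings are cut off at the top k results.
    Returns 0.0 when no result has positive relevance.
    """
    ranked = relevances if k is None else relevances[:k]
    ideal = sorted(relevances, reverse=True)
    ideal = ideal if k is None else ideal[:k]
    best = dcg(ideal)
    return dcg(ranked) / best if best > 0 else 0.0


# Hypothetical labels for five returned records (3 = best match, 0 = irrelevant).
print(ndcg([3, 2, 3, 0, 1], k=5))
```

A list that is already sorted by relevance scores 1.0, and any misordering pushes the value below 1.0, which is what makes the metric usable for comparing ranking methods on the same queries.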
Google's John Mueller on Ranking for Featured Snippets - Search Engine Journal
Someone asked John Mueller in a Webmaster Hangout about Schema structured data and ranking for featured snippets. Structured Data is useful for communicating a deep amount of precise data. John Mueller answered the question by describing what it takes to make it easier for Google to use your page for featured snippets. The question asked was about the use of structured data for ranking in featured snippets. It was also about showing up for voice search via the Google Assistant.
Empowering Elasticsearch with Exact and Fast $r$-Neighbor Search in Hamming Space
Mu, Cun, Zhao, Jun, Yang, Guang, Yang, Binwei, Yan, Zheng
A growing interest has been witnessed recently in building nearest neighbor search solutions within Elasticsearch--one of the most popular full-text search engines. In this paper, we focus specifically on Hamming space nearest neighbor search using Elasticsearch. By combining three techniques (bit operation, substring filtering, and data preprocessing with permutation), we develop a novel approach called FENSHSES (Fast Exact Neighbor Search in Hamming Space on Elasticsearch), which achieves dramatic speed-ups over the existing term match baseline. This will empower Elasticsearch with the capability of fast information retrieval even when documents (e.g., texts, images and sounds) are represented with binary codes--a common practice in today's semantic representation learning.
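Two of the three techniques named in the abstract can be sketched in plain Python, outside Elasticsearch. This is a minimal sketch, not the FENSHSES implementation: the function names, the chunking scheme, and the linear scan are all assumptions, and the permutation preprocessing step is omitted. The bit operation is XOR followed by a popcount. The substring filter relies on the pigeonhole principle: if two codes differ in at most r bits and each is split into m chunks, at least one pair of aligned chunks differs in at most floor(r / m) bits, so codes failing that test can be skipped without losing any true neighbor.

```python
def hamming(a, b):
    """Hamming distance between two binary codes stored as Python ints."""
    return bin(a ^ b).count("1")


def split_code(code, n_bits, m):
    """Split an n_bits-wide code into m contiguous chunks, lowest bits first.

    Chunk widths differ by at most one bit when n_bits is not divisible by m.
    """
    base, extra = divmod(n_bits, m)
    chunks, shift = [], 0
    for i in range(m):
        width = base + (1 if i < extra else 0)
        chunks.append((code >> shift) & ((1 << width) - 1))
        shift += width
    return chunks


def r_neighbors(query, codes, n_bits, r, m):
    """Return every code within Hamming distance r of the query (exact result)."""
    query_chunks = split_code(query, n_bits, m)
    budget = r // m  # pigeonhole bound on the closest pair of aligned chunks
    hits = []
    for code in codes:
        chunks = split_code(code, n_bits, m)
        # Cheap filter: keep only codes with at least one close-enough chunk.
        if any(hamming(q, c) <= budget for q, c in zip(query_chunks, chunks)):
            # Full verification on the surviving candidates.
            if hamming(query, code) <= r:
                hits.append(code)
    return hits


codes = [0b00000000, 0b00000111, 0b11110000, 0b00010001]
print(r_neighbors(0b00000001, codes, n_bits=8, r=2, m=3))  # [0, 7, 17]
```

In this sketch the filter is evaluated by scanning every code, so it only shows that the filter is lossless. In a search engine the chunk test would presumably be answered by an index lookup, which is where the speed-up would come from.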
LDA for Text Summarization and Topic Detection - DZone AI
Machine learning clustering techniques are not the only way to extract topics from a text data set. Text mining literature has proposed a number of statistical models, known as probabilistic topic models, to detect topics from an unlabeled set of documents. One of the most popular models is the latent Dirichlet allocation (LDA) algorithm developed by Blei, Ng, and Jordan [i]. LDA is a generative unsupervised probabilistic algorithm that isolates the top K topics in a data set as described by the most relevant N keywords. In other words, the documents in the data set are represented as random mixtures of latent topics, where each topic is characterized by a probability distribution over a fixed vocabulary.
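The "random mixtures of latent topics" idea can be made concrete by running LDA's generative story forward. The sketch below is illustrative only: the vocabulary, the two hand-written topics, the `alpha` values, and the function names are all hypothetical, and it generates documents rather than inferring topics from them, which is the hard part that real LDA implementations solve.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution by normalising gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word, alpha, n_words, vocab, rng):
    """Generate one document following the LDA generative process.

    topic_word[k][j] is the probability of vocab[j] under topic k.
    Returns the document's topic mixture and its list of words.
    """
    theta = sample_dirichlet(alpha, rng)  # per-document topic mixture
    topic_ids = range(len(topic_word))
    words = []
    for _ in range(n_words):
        z = rng.choices(topic_ids, weights=theta)[0]  # pick a topic for this word
        words.append(rng.choices(vocab, weights=topic_word[z])[0])  # pick the word
    return theta, words


# Hypothetical toy setup: a six-word vocabulary and two hand-written topics.
vocab = ["query", "index", "ranking", "goal", "match", "league"]
topic_word = [
    [0.40, 0.30, 0.25, 0.02, 0.02, 0.01],  # a "search" topic
    [0.01, 0.02, 0.02, 0.35, 0.30, 0.30],  # a "sports" topic
]
rng = random.Random(0)
theta, words = generate_document(topic_word, alpha=[0.5, 0.5], n_words=8, vocab=vocab, rng=rng)
print(theta, words)
```

Fitting LDA means going in the opposite direction: given only the words, estimate the topic mixtures and the per-topic word probabilities that most plausibly produced them.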
RACE: Sub-Linear Memory Sketches for Approximate Near-Neighbor Search on Streaming Data
Coleman, Benjamin, Shrivastava, Anshumali, Baraniuk, Richard G.
We demonstrate the first possibility of a sub-linear memory sketch for solving the approximate near-neighbor search problem. In particular, we develop an online sketching algorithm that can compress $N$ vectors into a tiny sketch consisting of small arrays of counters whose size scales as $O(N^{b}\log^2{N})$, where $b < 1$ depends on the stability of the near-neighbor search. This sketch is sufficient to identify the top-$v$ near-neighbors with high probability. To the best of our knowledge, this is the first near-neighbor search algorithm that breaks the linear memory ($O(N)$) barrier. We achieve sub-linear memory by combining advances in locality sensitive hashing (LSH) based estimation, especially the recently published ACE algorithm, with compressed sensing and heavy hitter techniques. We provide strong theoretical guarantees; in particular, our analysis sheds new light on the memory-accuracy tradeoff in the near-neighbor search setting and the role of sparsity in compressed sensing, which could be of independent interest. We rigorously evaluate our framework, which we call RACE (Repeated ACE) data structures, on a friend recommendation task on the Google Plus graph with more than 100,000 high-dimensional vectors. RACE provides compression that is orders of magnitude better than the random projection based alternative, which is unsurprising given the theoretical advantage. We anticipate that RACE will enable both new theoretical perspectives on near-neighbor search and new methodologies for applications like high-speed data mining, internet-of-things (IoT), and beyond.
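The "small arrays of counters" building block can be illustrated with a toy LSH-indexed counter sketch. This is not the RACE algorithm: it shows only repeated counter arrays addressed by a locality sensitive hash, and it leaves out the compressed sensing and heavy hitter machinery the abstract relies on to recover neighbors. The class name, the choice of signed random projections as the hash, and all parameters are assumptions made for illustration.

```python
import random


class CounterSketch:
    """Repeated arrays of counters, each indexed by its own LSH function.

    Memory is n_arrays * 2**n_bits counters, regardless of how many
    vectors are inserted. The hash is a signed random projection.
    """

    def __init__(self, dim, n_arrays, n_bits, seed=0):
        rng = random.Random(seed)
        # One set of n_bits random hyperplanes per array.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_arrays)
        ]
        self.counts = [[0] * (1 << n_bits) for _ in range(n_arrays)]

    @staticmethod
    def _bucket(planes, x):
        """Concatenate the sign bits of x against each hyperplane into an index."""
        index = 0
        for plane in planes:
            dot = sum(p * v for p, v in zip(plane, x))
            index = (index << 1) | (1 if dot >= 0 else 0)
        return index

    def add(self, x):
        """Stream one vector into the sketch; the vector itself is not stored."""
        for planes, counts in zip(self.planes, self.counts):
            counts[self._bucket(planes, x)] += 1

    def estimate(self, q):
        """Average count in the query's buckets: higher means more similar data seen."""
        total = sum(
            counts[self._bucket(planes, q)]
            for planes, counts in zip(self.planes, self.counts)
        )
        return total / len(self.counts)


sketch = CounterSketch(dim=3, n_arrays=4, n_bits=2)
for _ in range(5):
    sketch.add([1.0, 0.5, -0.2])
print(sketch.estimate([1.0, 0.5, -0.2]), sketch.estimate([-1.0, -0.5, 0.2]))
```

The point of the sketch is the memory profile: inserting more vectors changes counter values but never the number of counters, which is the property that makes sub-linear memory conceivable in the first place.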
Recovering SEO traffic and rankings after a website redesign - Search Engine Land
When building a new website, retaining and improving your SEO and organic traffic should be a key design goal. This requires a clear understanding of how SEO and website design work together and careful planning for the site migration. If everything is done correctly, you should retain (and improve) rankings and traffic. Unfortunately, in the real world, this is often not what happens. And then panic sets in.
Why A Product Search Engine And A Recommendation Engine Can't Replace Each Other
There is no denying that everyone who spends time shopping online has seen both search engines and recommender systems many times, and the differences between the two seem fairly obvious. A search engine can be spotted easily -- it is a query box where you type in what you're looking for and the system shows a list of results. On the other hand, you also see products that are relevant to you appear on your screen even though you haven't requested them; that is the magic of a recommendation engine. But what makes a product search engine different from a recommendation engine? And where exactly do deep learning, machine learning, and artificial intelligence come into the picture?