Every day, users from all over the world perform hundreds of millions of search queries with Bing in more than 100 languages. Whether this is the first or the millionth time we see a query, and whether the best results for a query change every hour or barely change at all, our users expect an immediate answer that serves their needs. Bing web search is truly an example of AI at Scale at Microsoft, showcasing the next generation of AI capabilities and experiences. Over the past few years, Bing and Microsoft Research have been developing and deploying large neural network models such as MT-DNN, Unicoder, and UniLM to improve the search experience for our customers. The best of those learnings have been open sourced as the Microsoft Turing language models.
To most computers, the word "Jaguar" printed on an otherwise blank screen is simply a string of characters. You, however, see a word associated with a big cat, a large mammal. Given the context of valet parking, it might also bring to mind a luxury brand similar to Mercedes and BMW. Put another way, you have a collection of ideas, or concepts, of what "Jaguar" means, and the mental agility to use context to infer which concept the writer intended to convey. On Tuesday, a team of scientists from Microsoft Research Asia, Microsoft's research lab in Beijing, China, announced the public release of technology designed to help computers conceptualize in a humanlike fashion.
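The idea of conceptualization can be illustrated with a toy sketch: map an ambiguous surface form like "Jaguar" to candidate concepts, and let the surrounding context words vote for the intended one. This is a hypothetical illustration, not the released Microsoft system; the concept inventory and cue words below are invented for demonstration.

```python
# Toy conceptualization: pick the most likely concept for an ambiguous
# word by counting overlap between its context and hand-written cue words.
# The concept inventory below is invented for illustration only.

CONCEPT_CUES = {
    "big cat": {"animal", "mammal", "jungle", "prey", "spots"},
    "luxury car brand": {"valet", "parking", "mercedes", "bmw", "sedan"},
}

def conceptualize(context_words):
    """Return the concept whose cue words best overlap the context."""
    scores = {
        concept: len(cues & {w.lower() for w in context_words})
        for concept, cues in CONCEPT_CUES.items()
    }
    return max(scores, key=scores.get)

print(conceptualize("the valet parking attendant drove the jaguar".split()))
# -> luxury car brand
```

A real system replaces the hand-written cue sets with a large automatically mined concept graph and probabilistic scoring, but the inference step is the same: context selects the concept.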
Editor's Note: This deep dive companion to our high-level FAQ piece is a 30-minute read, so get comfortable! You'll learn the backstory and nuances of BERT's evolution, how the algorithm works to improve human language understanding for machines, and what it means for SEO and the work we do every day.

If you have been keeping an eye on SEO Twitter over the past week, you'll likely have noticed an uptick in the number of gifs and images featuring the character Bert (and sometimes Ernie) from Sesame Street. This is because last week Google announced that an imminent algorithmic update would be rolling out, impacting 10% of queries in search results, and also affecting featured snippet results in the countries where they were present; which is not trivial. The update is named Google BERT (hence the Sesame Street connection, and the gifs). Google describes BERT as the largest change to its search system since the company introduced RankBrain almost five years ago, and probably one of the largest changes in search ever. The news of BERT's arrival and its impending impact has caused a stir in the SEO community, along with some confusion as to what BERT does and what it means for the industry overall. With this in mind, let's take a look at what BERT is, BERT's background, the need for BERT and the challenges it aims to resolve, and the current situation:

- The BERT backstory
- How search engines learn language
- Problems with language learning methods
- How BERT improves search engine language understanding
- What does BERT mean for SEO?

BERT is a technologically ground-breaking natural language processing model/framework which has taken the machine learning world by storm since its release as an academic research paper. The research paper is entitled BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al., 2018). Following the paper's publication, the Google AI Research team announced BERT as an open-source contribution.
A year later, Google announced the Google BERT algorithmic update rolling out in production search. Google linked the algorithmic update to the BERT research paper, emphasizing BERT's importance for contextual language understanding in content and queries, and therefore intent, particularly for conversational search. BERT is described as a pre-trained deep learning natural language framework that has given state-of-the-art results on a wide variety of natural language processing tasks. Whilst in the research stages, and prior to being added to production search systems, BERT achieved state-of-the-art results on 11 different natural language processing tasks. These tasks include, amongst others, sentiment analysis, named entity recognition, textual entailment, next sentence prediction, semantic role labeling, text classification and coreference resolution. BERT also helps with the disambiguation of words with multiple meanings, known as polysemous words, in context.
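The contrast that makes this polysemy handling possible can be sketched in miniature: a static embedding assigns "jaguar" one fixed vector regardless of use, while a contextual representation mixes in the surrounding words, so the same word ends up with different vectors in different sentences. The tiny two-dimensional vectors below are invented for illustration, and the window average is a crude stand-in; real BERT representations come from a deep bidirectional transformer, not an average.

```python
# Minimal sketch of static vs. contextual word representations.
# Static: one fixed vector per word. "Contextual" is crudely approximated
# here by averaging the word's vector with its whole sentence -- real BERT
# uses deep bidirectional self-attention instead. Vectors are invented.

STATIC = {  # made-up 2-d embeddings, dim 0 ~ "animal-ness", dim 1 ~ "product-ness"
    "jaguar": [0.5, 0.5],
    "stalked": [0.9, 0.1], "its": [0.5, 0.5], "prey": [0.95, 0.05],
    "the": [0.5, 0.5], "dealership": [0.1, 0.9], "sold": [0.2, 0.8],
}

def contextual(word, sentence):
    """Blend the word's static vector with the average of its sentence."""
    vecs = [STATIC[w] for w in sentence]
    ctx = [sum(v[i] for v in vecs) / len(vecs) for i in range(2)]
    return [(STATIC[word][i] + ctx[i]) / 2 for i in range(2)]

animal = contextual("jaguar", ["the", "jaguar", "stalked", "its", "prey"])
car = contextual("jaguar", ["the", "dealership", "sold", "the", "jaguar"])
print(animal, car)  # same word, different contexts -> different vectors
```

Even this caricature shows the key property: the representation of "jaguar" shifts toward the animal sense in one sentence and the product sense in the other, which is what lets downstream tasks disambiguate polysemous words.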
For the past several years, Microsoft researchers have been focused on finding ways to make commercial use of machine-reading technology. It looks like some of that work is about to be commercialized in the form of machine-reading comprehension in search results. Based on information in Microsoft's Ignite conference session list, Microsoft may be ready to show this off as soon as next week. Machine-reading comprehension involves the automatic understanding of text, drawing on computer vision, natural-language understanding and other technologies.
It took me a long time to realise that search is the biggest problem in NLP. Just look at Google, Amazon and Bing: these are multi-billion-dollar businesses possible only because of their powerful search engines. My initial thoughts on search were centered around unsupervised ML, but when I participated in the Microsoft Hackathon 2018 for Bing, I learned about the various ways a search engine can be built with deep learning.