If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."
However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …
Natural language processing (NLP) is one of the most important technologies to arise in recent years. Specifically, 2019 has been a big year for NLP with the introduction of the revolutionary BERT language representation model. There is a large variety of underlying tasks and machine learning models powering NLP applications. Recently, deep learning approaches have obtained very high performance across many different NLP tasks. Convolutional Neural Networks (CNNs) are typically associated with computer vision, but more recently they have also been applied to problems in NLP.
Natural language models typically have to solve two tough problems: mapping sentence prefixes to fixed-sized representations and using those representations to predict the next word in the text. In a recent paper, researchers at Facebook AI Research assert that the first problem -- the mapping problem -- might be easier than the prediction problem, a hypothesis they build upon to augment language models with a "nearest neighbors" retrieval mechanism. They say it allows rare patterns to be memorized and that it achieves state-of-the-art perplexity (a measure of how well a model predicts text, where lower is better) with no additional training. As the researchers explain, language models assign probabilities to sequences of words: given a context sequence of tokens (e.g., words), they estimate a distribution (the probabilities of the different possible outcomes) over target tokens. The proposed approach -- kNN-LM -- maps a context to a fixed-length mathematical representation computed by the pre-trained language model.
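The retrieval idea can be illustrated with a small sketch. It assumes, following the published kNN-LM formulation as I understand it, that the nearest stored contexts vote for their recorded next tokens with weights that shrink with distance, and that the result is blended with the base model's distribution. The names `knn_distribution`, `interpolate`, `datastore` and `lam`, along with the toy vectors and probabilities, are hypothetical and chosen only for illustration.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_distribution(query, datastore, k):
    """Turn the k stored contexts nearest to `query` into a next-token distribution.

    `datastore` is a list of (context_vector, next_token) pairs (hypothetical layout).
    Each neighbour contributes exp(-distance) to the token it was followed by.
    """
    nearest = sorted(datastore, key=lambda item: squared_distance(query, item[0]))[:k]
    weights = {}
    for vector, token in nearest:
        weight = math.exp(-squared_distance(query, vector))
        weights[token] = weights.get(token, 0.0) + weight
    total = sum(weights.values())
    return {token: weight / total for token, weight in weights.items()}


def interpolate(lm_probs, knn_probs, lam):
    """Blend the two distributions: lam * kNN + (1 - lam) * language model."""
    tokens = set(lm_probs) | set(knn_probs)
    return {
        t: lam * knn_probs.get(t, 0.0) + (1 - lam) * lm_probs.get(t, 0.0)
        for t in tokens
    }


# Toy data: two stored contexts close to the query were followed by "paris".
datastore = [([0.0, 0.0], "paris"), ([0.1, 0.0], "paris"), ([5.0, 5.0], "london")]
knn_probs = knn_distribution([0.0, 0.0], datastore, k=2)
lm_probs = {"paris": 0.4, "london": 0.6}  # made-up base model output
mixed = interpolate(lm_probs, knn_probs, lam=0.25)
```

Because the datastore is only read at prediction time, nothing here is trained, which is consistent with the "no additional training" claim. A real system would use an approximate nearest-neighbour index rather than sorting every entry.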
Artificial intelligence (AI) has made its way from university laboratories into industry and found business use cases over the last two decades. During this time, the main focus of AI in business has been data-driven. In 2020, in my opinion, AI will move dramatically towards a people-focused perspective. Designing solutions from this perspective still needs data, but the overall aim of using AI will be the direct benefit of ordinary people and not just big businesses or retail majors. This will free people from many mundane daily tasks so that they can do work better aligned with who they are as individuals and as members of the society they live in. This may look like a gross generalization of the impact of AI, and of technology development in general, but the change will seem almost too good to be true in multiple areas of everyday life.
The attention mechanism is now an established technique in many NLP tasks. I've heard about it often, but wanted to dig a bit deeper and understand the details. In this first blog post (I plan to publish a few more on the subject of attention) I give an introduction by focusing on the first proposal of the attention mechanism, as applied to the task of neural machine translation. To the best of my knowledge, the attention mechanism within the context of NLP was first presented in "Neural Machine Translation by Jointly Learning to Align and Translate" at ICLR 2015 (Bahdanau et al. 2015). It was proposed in the context of machine translation, where, given a sentence in one language, the model has to produce a translation of that sentence in another language.
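To make the idea concrete before going into details, here is a minimal sketch of additive attention in plain Python. It assumes the scoring form I understand the paper to use: each encoder state is scored against the current decoder state through a small tanh layer, the scores are normalised with a softmax, and the context vector is the weighted sum of encoder states. The parameter names `W`, `U` and `v` and the toy values are illustrative assumptions, not trained weights.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def additive_attention(decoder_state, encoder_states, W, U, v):
    """Return (attention weights, context vector).

    score_j = v . tanh(W @ decoder_state + U @ encoder_state_j)
    """
    projected_state = matvec(W, decoder_state)
    scores = []
    for h in encoder_states:
        projected_h = matvec(U, h)
        hidden = [math.tanh(a + b) for a, b in zip(projected_state, projected_h)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)
    ]
    return weights, context


# Toy example with identity projections (assumed values, for illustration only).
identity = [[1.0, 0.0], [0.0, 1.0]]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, context = additive_attention(
    [1.0, 0.0], encoder_states, W=identity, U=identity, v=[1.0, 1.0]
)
```

The weights sum to one, so the context vector is a soft selection over source positions rather than a hard choice of a single word.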
We all know the old adage, "When all you have is a hammer, everything looks like a nail." But not everything is a nail, especially when it comes to documents and content. When we work on a document, we must understand its format and what it is about. If I start working at the DMV, for example, I have to quickly understand their forms and what problems they address, the questions drivers have and the most appropriate answers to those questions. If I work for the Department of Agriculture, then I need a completely different set of tools.
Stemming is one of the most common data pre-processing operations in almost all Natural Language Processing (NLP) projects. If you're new to this space, you may not know exactly what it is, even if you have come across the word. You might also confuse stemming with lemmatization, a similar operation. In this post, we'll see exactly what stemming is, with a few examples along the way. I hope I'll be able to explain the process in simple words for you.
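As a first taste, here is a deliberately naive suffix-stripping stemmer. It is only a sketch of the general idea: the suffix list and the `min_stem_length` guard are assumptions made for this example, and established algorithms such as the Porter stemmer apply many more rules.

```python
# Suffixes are tried in this order; the first one that fits is removed.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word, min_stem_length=3):
    """Strip one common English suffix, keeping at least `min_stem_length` letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_length:
            return word[: -len(suffix)]
    return word


stems = [naive_stem(w) for w in ["jumping", "jumped", "jumps", "boxes", "sing"]]
# stems == ["jump", "jump", "jump", "box", "sing"]
```

Note that `naive_stem("running")` gives `"runn"`, not `"run"`. A stemmer only chops characters by rule and does not look words up in a dictionary, which is the main difference from lemmatization.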
Word embeddings enable knowledge representation where a vector represents a word. This improves the ability of neural networks to learn from a textual dataset. Before word embeddings became the de facto standard for natural language processing, a common approach to dealing with words was one-hot vectorisation. Each word corresponds to a column in the vector space, and each sentence is a vector of ones and zeros. This leads to a huge and sparse representation, because there are many more zeros than ones.
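The sparsity is easy to see in a small sketch of the encoding described above. The helper names `build_vocabulary` and `encode` and the two example sentences are made up for illustration.

```python
def build_vocabulary(sentences):
    """Assign each distinct lowercase word a column index, in first-seen order."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(sentence, vocab):
    """Return a 0/1 vector with a 1 in the column of every known word present."""
    vector = [0] * len(vocab)
    for word in sentence.lower().split():
        if word in vocab:
            vector[vocab[word]] = 1
    return vector


sentences = ["the cat sat on the mat", "the dog barked"]
vocab = build_vocabulary(sentences)
vectors = [encode(s, vocab) for s in sentences]
# vectors[1] == [1, 0, 0, 0, 0, 1, 1]
```

With only seven distinct words the vectors are short, but a realistic vocabulary has tens of thousands of columns, and a typical sentence sets only a handful of them to one.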
Researchers from the MIT-IBM Watson AI Lab, Tulane University and the University of Illinois this week unveiled research that allows a computer to more closely replicate human reading comprehension and inference. The researchers have created what they termed "a breakthrough neuro-symbolic approach" to infusing knowledge into natural language processing. The approach was announced at the AAAI-20 Conference taking place all week in New York City. Reasoning and inference are central to both humans and artificial intelligence, yet many enterprise AI systems still struggle to comprehend human language and textual entailment, which is defined as the relationship between two natural language sentences, according to IBM. There have been two schools of thought or "camps" since the beginning of AI: one has focused on the use of neural networks/deep learning, which have been very effective and successful in the past several years, said David Cox, director of the MIT-IBM Watson AI Lab.
New model has "real business impact"
Microsoft has unveiled the world's largest deep learning language model to date: a 17 billion-parameter "Turing Natural Language Generation (T-NLG)" model that the company believes will pave the way for more fluent chatbots and digital assistants. The T-NLG "outperforms the state of the art" on several benchmarks, including summarisation and question answering, Microsoft claimed in a new research blog, as the company stakes its claim to a potentially dominant position in one of the most closely watched new technologies, natural language processing. Deep learning language models like BERT, developed by Google, have hugely improved the powers of natural language processing, by training on colossal data sets with billions of parameters to learn the contextual relations between words. Bigger is not always better, as those working on language models may recognise, but Microsoft scientist Corby Rosset said his team "have observed that the bigger the model and the more diverse and comprehensive the pretraining data, the better it performs at generalizing to multiple downstream tasks even with fewer training examples." He emphasised: "Therefore, we believe it is more efficient to train a large centralized multi-task model and share its capabilities across numerous tasks."