
Lecture 8: Recurrent Neural Networks and Language Models


Lecture 8 covers traditional language models, RNNs, and RNN language models. It also reviews important training problems and tricks, RNNs for other sequence tasks, and bidirectional and deep RNNs. The lecture series gives a thorough introduction to current research in deep learning applied to NLP, an approach that has recently achieved very high performance across many NLP tasks, including question answering and machine translation. It emphasizes how to implement, train, debug, visualize, and design neural network models, and covers the main technologies: word vectors, feed-forward models, recurrent neural networks, recursive neural networks, convolutional neural networks, and recent models involving a memory component.

Deep Recursive Neural Networks for Compositionality in Language

Neural Information Processing Systems

Recursive neural networks comprise a class of architectures that can operate on structured input. They have previously been applied successfully to model compositionality in natural language using parse-tree-based structural representations. Even though these architectures are deep in structure, they lack the capacity for hierarchical representation that exists in conventional deep feed-forward networks as well as in recently investigated deep recurrent neural networks. In this work we introduce a new architecture, a deep recursive neural network (deep RNN), constructed by stacking multiple recursive layers. We evaluate the proposed model on the task of fine-grained sentiment classification.
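
The abstract gives no equations, so the following is only a minimal sketch of one plausible reading of "stacking multiple recursive layers". It assumes that each layer combines its own states at a node's two children with the state of the layer below at the same node, and that only the first layer sees word vectors at the leaves. The names `RecursiveLayer`, `encode`, `WORD_VECTORS`, and `HIDDEN_DIM`, the tanh nonlinearity, and the random weights are all hypothetical choices for illustration, not the paper's implementation.

```python
import math
import random

HIDDEN_DIM = 4
# Toy 2-D word vectors, invented for illustration.
WORD_VECTORS = {"not": [0.1, -0.3], "bad": [-0.7, 0.2], "movie": [0.4, 0.5]}


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class RecursiveLayer:
    """One recursive layer: mixes left child, right child, and the layer below."""

    def __init__(self, rng, in_dim, hid_dim):
        self.hid_dim = hid_dim
        self.w_left = rand_matrix(rng, hid_dim, hid_dim)
        self.w_right = rand_matrix(rng, hid_dim, hid_dim)
        self.w_below = rand_matrix(rng, hid_dim, in_dim)

    def combine(self, left, right, below):
        total = [0.0] * self.hid_dim
        for matrix, vec in ((self.w_left, left), (self.w_right, right), (self.w_below, below)):
            if vec is not None:  # leaves have no children; layer 0 has no layer below at internal nodes
                total = [t + p for t, p in zip(total, matvec(matrix, vec))]
        return [math.tanh(t) for t in total]


def encode(node, layers, word_vectors):
    """Return one hidden state per layer for `node` (a word or a (left, right) tuple)."""
    if isinstance(node, str):
        left = right = None
        below = word_vectors[node]  # first layer reads the word vector at a leaf
    else:
        left = encode(node[0], layers, word_vectors)
        right = encode(node[1], layers, word_vectors)
        below = None  # first layer has no input from below at internal nodes
    states = []
    for i, layer in enumerate(layers):
        l = left[i] if left is not None else None
        r = right[i] if right is not None else None
        hidden = layer.combine(l, r, below)
        states.append(hidden)
        below = hidden  # the next layer up reads this layer's state at the same node
    return states


rng = random.Random(0)
layers = [RecursiveLayer(rng, 2, HIDDEN_DIM)] + [
    RecursiveLayer(rng, HIDDEN_DIM, HIDDEN_DIM) for _ in range(2)
]
tree = (("not", "bad"), "movie")  # binary parse tree as nested tuples
root_states = encode(tree, layers, WORD_VECTORS)
print(len(root_states), "layers, top-layer root state:", root_states[-1])
```

In a sentiment classifier, the top-layer state at each node would typically feed a softmax output; that step and all training are omitted here.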

Language, trees, and geometry in neural networks


Left image in each pair: a traditional parse tree view, except that the vertical length of each branch represents embedding distance. Right image in each pair: a PCA projection of context embeddings, where color shows deviation from expected distance.

Has anyone tried to mine all the types of analogies possible using word embeddings (word2vec)? • /r/MachineLearning


We know of a few types of word analogies, like "France capital Paris" and "US currency dollar", but has anyone tried to search for all the possible analogies that can be deduced from word2vec? They would have to find modifiers that have multiple matches, like "word1 modifier word2". An algorithm could be to cluster all the difference vectors (word1-word2, for all word pairs) and select words that are close to the centers of dense clusters. Even if we don't find all modifiers, we can infer more by combining with ontologies/WordNet. If we find all the types of analogy, we could make a large test dataset to benchmark how capable the various word embeddings are of representing analogies.
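
The clustering idea above can be sketched roughly as follows. This is a toy illustration, not a tested method: the 2-D `EMBEDDINGS` are invented rather than taken from word2vec, the greedy radius-based clustering is just one simple stand-in for "cluster all the difference vectors", and the `radius` and `min_size` values are arbitrary assumptions. The last step from the post, finding a modifier word near each cluster center, is left out.

```python
import math
from itertools import combinations

# Toy 2-D embeddings, invented for illustration (not real word2vec vectors).
EMBEDDINGS = {
    "France": (1.0, 0.2),
    "Paris": (1.0, 1.2),
    "Japan": (3.5, -0.4),
    "Tokyo": (3.5, 0.65),
    "Italy": (6.1, 0.9),
    "Rome": (6.15, 1.85),
}


def difference_vectors(embeddings):
    """Return (word_a, word_b, vec_b - vec_a) for every unordered word pair."""
    diffs = []
    for (a, va), (b, vb) in combinations(embeddings.items(), 2):
        diffs.append((a, b, tuple(y - x for x, y in zip(va, vb))))
    return diffs


def cluster_differences(diffs, radius):
    """Greedy clustering: join the first cluster whose centroid is within radius."""
    clusters = []  # each cluster: {"centroid": tuple, "pairs": [(a, b), ...]}
    for a, b, vec in diffs:
        for cluster in clusters:
            if math.dist(cluster["centroid"], vec) <= radius:
                cluster["pairs"].append((a, b))
                n = len(cluster["pairs"])
                # Running mean keeps the centroid at the average of its members.
                cluster["centroid"] = tuple(
                    c + (v - c) / n for c, v in zip(cluster["centroid"], vec)
                )
                break
        else:
            clusters.append({"centroid": vec, "pairs": [(a, b)]})
    return clusters


def dense_clusters(clusters, min_size):
    """Keep only clusters with enough member pairs to suggest a shared relation."""
    return [c for c in clusters if len(c["pairs"]) >= min_size]


candidates = dense_clusters(
    cluster_differences(difference_vectors(EMBEDDINGS), radius=0.15), min_size=3
)
for cluster in candidates:
    print(cluster["centroid"], cluster["pairs"])
```

On this toy data the only cluster with three or more pairs is the country-to-capital one. With a real vocabulary the number of pairs grows quadratically, so some sampling or approximate nearest-neighbor search would likely be needed.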

Creating Explainable AI With Rules


Explainability issues arise because machine learning outputs are numerical; deep neural networks are so opaque that users don't necessarily know which factor contributed to what aspect of the resulting score. There are several emergent techniques for increasing explainability and interpretability of machine learning results. After organizations gain insight into the black box of intricate machine learning models, the best way to explain those results to customers, regulators and legal entities is to translate them into rules that, by their very definition, offer full transparency for explainable AI. Rules can also highlight points of bias in models.