AITopics | Chandar, Sarath

Collaborating Authors

Chandar, Sarath

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Deep Reinforcement Learning Chatbot (Short Version)

Serban, Iulian V., Sankar, Chinnadhurai, Germain, Mathieu, Zhang, Saizheng, Lin, Zhouhan, Subramanian, Sandeep, Kim, Taesup, Pieper, Michael, Chandar, Sarath, Ke, Nan Rosemary, Rajeswar, Sai, de Brebisson, Alexandre, Sotelo, Jose M. R., Suhubdy, Dendi, Michalski, Vincent, Nguyen, Alexandre, Pineau, Joelle, Bengio, Yoshua

arXiv.org Machine LearningJan-20-2018

We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including neural network and template-based models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than other systems. The results highlight the potential of coupling ensemble systems with deep reinforcement learning as a fruitful path for developing real-world, open-domain conversational agents.

artificial intelligence, experiment, natural language, (19 more...)

arXiv.org Machine Learning

1801.067

Country:

North America > Canada > Quebec > Montreal (0.69)
North America > United States (0.68)

Genre:

Personal > Interview (0.68)
Research Report > New Finding (0.48)

Industry:

Media > Film (0.69)
Leisure & Entertainment (0.47)
Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

A Deep Reinforcement Learning Chatbot

Serban, Iulian V., Sankar, Chinnadhurai, Germain, Mathieu, Zhang, Saizheng, Lin, Zhouhan, Subramanian, Sandeep, Kim, Taesup, Pieper, Michael, Chandar, Sarath, Ke, Nan Rosemary, Rajeshwar, Sai, de Brebisson, Alexandre, Sotelo, Jose M. R., Suhubdy, Dendi, Michalski, Vincent, Nguyen, Alexandre, Pineau, Joelle, Bengio, Yoshua

arXiv.org Machine LearningNov-5-2017

We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including template-based models, bag-of-words models, sequence-to-sequence neural network and latent variable neural network models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than many competing systems. Due to its machine learning architecture, the system is likely to improve with additional data.

candidate response, deep learning, neural network, (22 more...)

arXiv.org Machine Learning

1709.02349

Country:

North America > United States (1.00)
North America > Canada > Quebec > Montreal (0.68)

Genre:

Research Report > New Finding (1.00)
Personal > Interview (1.00)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Government > Regional Government > North America Government > United States Government (0.92)
(2 more...)

Add feedback

Memory Augmented Neural Networks with Wormhole Connections

Gulcehre, Caglar, Chandar, Sarath, Bengio, Yoshua

arXiv.org Machine LearningJan-30-2017

Recent empirical results on long-term dependency tasks have shown that neural networks augmented with an external memory can learn the long-term dependency tasks more easily and achieve better generalization than vanilla recurrent neural networks (RNN). We suggest that memory augmented neural networks can reduce the effects of vanishing gradients by creating shortcut (or wormhole) connections. Based on this observation, we propose a novel memory augmented neural network model called TARDIS (Temporal Automatic Relation Discovery in Sequences). The controller of TARDIS can store a selective set of embeddings of its own previous hidden states into an external memory and revisit them as and when needed. For TARDIS, memory acts as a storage for wormhole connections to the past to propagate the gradients more effectively and it helps to learn the temporal dependencies. The memory structure of TARDIS has similarities to both Neural Turing Machines (NTM) and Dynamic Neural Turing Machines (D-NTM), but both read and write operations of TARDIS are simpler and more efficient. We use discrete addressing for read/write operations which helps to substantially to reduce the vanishing gradient problem with very long sequences. Read and write operations in TARDIS are tied with a heuristic once the memory becomes full, and this makes the learning problem simpler when compared to NTM or D-NTM type of architectures. We provide a detailed analysis on the gradient propagation in general for MANNs. We evaluate our models on different long-term dependency tasks and report competitive results in all of them.

controller, deep learning, neural network, (16 more...)

arXiv.org Machine Learning

1701.08718

Country: North America > Canada > Quebec > Montreal (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Hierarchical Memory Networks

Chandar, Sarath, Ahn, Sungjin, Larochelle, Hugo, Vincent, Pascal, Tesauro, Gerald, Bengio, Yoshua

arXiv.org Machine LearningMay-24-2016

Memory networks are neural networks with an explicit memory component that can be both read and written to by the network. The memory is often addressed in a soft way using a softmax function, making end-to-end training with backpropagation possible. However, this is not computationally scalable for applications which require the network to read from extremely large memories. On the other hand, it is well known that hard attention mechanisms based on reinforcement learning are challenging to train successfully. In this paper, we explore a form of hierarchical memory network, which can be considered as a hybrid between hard and soft attention memory networks. The memory is organized in a hierarchical structure such that reading from it is done with less computation than soft attention over a flat memory, while also being easier to train than hard attention over a flat memory. Specifically, we propose to incorporate Maximum Inner Product Search (MIPS) in the training and inference procedures for our hierarchical memory network. We explore the use of various state-of-the art approximate MIPS techniques and report results on SimpleQuestions, a challenging large scale factoid question answering task.

deep learning, memory network, neural network, (18 more...)

arXiv.org Machine Learning

1605.07427

Country:

North America > United States (0.14)
North America > Canada (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Clustering is Efficient for Approximate Maximum Inner Product Search

Auvolat, Alex, Chandar, Sarath, Vincent, Pascal, Larochelle, Hugo, Bengio, Yoshua

arXiv.org Machine LearningNov-29-2015

Efficient Maximum Inner Product Search (MIPS) is an important task that has a wide applicability in recommendation systems and classification with a large number of classes. Solutions based on locality-sensitive hashing (LSH) as well as tree-based solutions have been investigated in the recent literature, to perform approximate MIPS in sublinear time. In this paper, we compare these to another extremely simple approach for solving approximate MIPS, based on variants of the k-means clustering algorithm. Specifically, we propose to train a spherical k-means, after having reduced the MIPS problem to a Maximum Cosine Similarity Search (MCSS). Experiments on two standard recommendation system benchmarks as well as on large vocabulary word embeddings, show that this simple approach yields much higher speedups, for the same retrieval precision, than current state-of-the-art hashing-based and tree-based methods. This simple method also yields more robust retrievals when the query is corrupted by noise.

artificial intelligence, natural language, vector, (18 more...)

arXiv.org Machine Learning

1507.0591

Country: North America > Canada > Quebec (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.87)

Add feedback

Correlational Neural Networks

Chandar, Sarath, Khapra, Mitesh M., Larochelle, Hugo, Ravindran, Balaraman

arXiv.org Machine LearningOct-12-2015

Common Representation Learning (CRL), wherein different descriptions (or views) of the data are embedded in a common subspace, is receiving a lot of attention recently. Two popular paradigms here are Canonical Correlation Analysis (CCA) based approaches and Autoencoder (AE) based approaches. CCA based approaches learn a joint representation by maximizing correlation of the views when projected to the common subspace. AE based methods learn a common representation by minimizing the error of reconstructing the two views. Each of these approaches has its own advantages and disadvantages. For example, while CCA based approaches outperform AE based approaches for the task of transfer learning, they are not as scalable as the latter. In this work we propose an AE based approach called Correlational Neural Network (CorrNet), that explicitly maximizes correlation among the views when projected to the common subspace. Through a series of experiments, we demonstrate that the proposed CorrNet is better than the above mentioned approaches with respect to its ability to learn correlated common representations. Further, we employ CorrNet for several cross language tasks and show that the representations learned using CorrNet perform better than the ones learned using other state of the art approaches.

neural network, representation, survey article, (18 more...)

arXiv.org Machine Learning

1504.07225

Country: North America > United States > Maryland > Baltimore (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback