
Collaborating Authors

 Sordoni, Alessandro


Twin Networks: Matching the Future for Sequence Generation

arXiv.org Machine Learning

We propose a simple technique for encouraging generative RNNs to plan ahead. We train a "backward" recurrent network to generate a given sequence in reverse order, and we encourage states of the forward model to predict cotemporal states of the backward model. The backward network is used only during training, and plays no role during sampling or inference. We hypothesize that our approach eases modeling of long-term dependencies by implicitly forcing the forward states to hold information about the longer-term future (as contained in the backward states). We show empirically that our approach achieves a 9% relative improvement on a speech recognition task and a significant improvement on a COCO caption generation task.
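
The following is a minimal sketch of a twin-network-style regularizer, assuming a PyTorch GRU language model. The module names, the affine matching layer, the mean-squared-error penalty, its 0.5 weight, and the stop-gradient on the backward states are illustrative assumptions, not the authors' exact configuration.

```python
# Hypothetical sketch: a forward LM regularized to predict cotemporal backward states.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwinLM(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.fwd_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)   # kept for sampling/inference
        self.bwd_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)   # training-only twin
        self.fwd_out = nn.Linear(hid_dim, vocab_size)
        self.bwd_out = nn.Linear(hid_dim, vocab_size)
        self.match = nn.Linear(hid_dim, hid_dim)                    # maps forward state toward backward state

    def forward(self, tokens):                                      # tokens: (batch, time) of ids
        emb = self.embed(tokens)
        h_f, _ = self.fwd_rnn(emb)                                  # forward states, h_f[:, t] has seen x_1..x_t
        h_b_rev, _ = self.bwd_rnn(torch.flip(emb, dims=[1]))        # run the twin on the reversed sequence
        h_b = torch.flip(h_b_rev, dims=[1])                         # re-align: h_b[:, t] has seen x_t..x_T
        # Next-token / previous-token losses for the two directions (shifted by one step).
        nll_f = F.cross_entropy(self.fwd_out(h_f[:, :-1]).transpose(1, 2), tokens[:, 1:])
        nll_b = F.cross_entropy(self.bwd_out(h_b[:, 1:]).transpose(1, 2), tokens[:, :-1])
        # Twin penalty: forward state predicts the cotemporal backward state (no gradient into the twin).
        twin = F.mse_loss(self.match(h_f), h_b.detach())
        return nll_f + nll_b + 0.5 * twin

model = TwinLM(vocab_size=100)
loss = model(torch.randint(0, 100, (4, 20)))
loss.backward()
```

After training, the backward GRU and the matching layer would be discarded; only the forward RNN and its output layer are needed for generation, consistent with the abstract.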


Z-Forcing: Training Stochastic Recurrent Networks

Neural Information Processing Systems

Many efforts have been devoted to training generative latent variable models with autoregressive decoders, such as recurrent neural networks (RNN). Stochastic recurrent models have been successful in capturing the variability observed in natural sequential data such as speech. We unify successful ideas from recently proposed architectures into a stochastic recurrent model: each step in the sequence is associated with a latent variable that is used to condition the recurrent dynamics for future steps. Training is performed with amortised variational inference where the approximate posterior is augmented with an RNN that runs backward through the sequence. In addition to maximizing the variational lower bound, we ease training of the latent variables by adding an auxiliary cost which forces them to reconstruct the state of the backward recurrent network. This provides the latent variables with a task-independent objective that enhances the performance of the overall model. We found this strategy to perform better than alternative approaches such as KL annealing. Although conceptually simple, our model achieves state-of-the-art results on standard speech benchmarks such as TIMIT and Blizzard and competitive performance on sequential MNIST. Finally, we apply our model to language modeling on the IMDB dataset where the auxiliary cost helps in learning interpretable latent variables.
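
As a rough sketch of the structure described above (a per-step latent conditioning the forward recurrence, a backward posterior RNN, and an auxiliary cost that makes the latent reconstruct the backward state), here is a hypothetical PyTorch module. The Gaussian reconstruction term, layer sizes, and unit loss weights are assumptions, not the published model.

```python
# Hypothetical Z-Forcing-style training step for real-valued sequences.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ZForcingCell(nn.Module):
    def __init__(self, x_dim=32, h_dim=64, z_dim=16):
        super().__init__()
        self.fwd = nn.GRUCell(x_dim + z_dim, h_dim)           # forward recurrence, conditioned on z_t
        self.bwd = nn.GRU(x_dim, h_dim, batch_first=True)     # posterior helper RNN over the reversed sequence
        self.prior = nn.Linear(h_dim, 2 * z_dim)              # p(z_t | h_{t-1})
        self.post = nn.Linear(2 * h_dim, 2 * z_dim)           # q(z_t | h_{t-1}, b_t)
        self.dec = nn.Linear(h_dim + z_dim, x_dim)            # reconstructs x_t from (h_{t-1}, z_t)
        self.aux = nn.Linear(z_dim, h_dim)                    # auxiliary head: z_t predicts the backward state b_t

    def forward(self, x):                                     # x: (batch, time, x_dim)
        B, T, _ = x.shape
        b = torch.flip(self.bwd(torch.flip(x, dims=[1]))[0], dims=[1])   # b[:, t] summarizes x_t..x_T
        h = x.new_zeros(B, self.fwd.hidden_size)
        nll = kl = aux = 0.0
        for t in range(T):
            mu_p, logv_p = self.prior(h).chunk(2, dim=-1)
            mu_q, logv_q = self.post(torch.cat([h, b[:, t]], dim=-1)).chunk(2, dim=-1)
            z = mu_q + torch.randn_like(mu_q) * (0.5 * logv_q).exp()     # reparameterized sample from q
            nll = nll + F.mse_loss(self.dec(torch.cat([h, z], dim=-1)), x[:, t])
            kl = kl + 0.5 * (logv_p - logv_q
                             + ((mu_q - mu_p) ** 2 + logv_q.exp()) / logv_p.exp() - 1).sum(-1).mean()
            aux = aux + F.mse_loss(self.aux(z), b[:, t].detach())        # auxiliary backward-state reconstruction
            h = self.fwd(torch.cat([x[:, t], z], dim=-1), h)
        return nll + kl + aux

loss = ZForcingCell()(torch.randn(2, 10, 32))
loss.backward()
```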


Z-Forcing: Training Stochastic Recurrent Networks

arXiv.org Machine Learning

Many efforts have been devoted to training generative latent variable models with autoregressive decoders, such as recurrent neural networks (RNN). Stochastic recurrent models have been successful in capturing the variability observed in natural sequential data such as speech. We unify successful ideas from recently proposed architectures into a stochastic recurrent model: each step in the sequence is associated with a latent variable that is used to condition the recurrent dynamics for future steps. Training is performed with amortized variational inference where the approximate posterior is augmented with an RNN that runs backward through the sequence. In addition to maximizing the variational lower bound, we ease training of the latent variables by adding an auxiliary cost which forces them to reconstruct the state of the backward recurrent network. This provides the latent variables with a task-independent objective that enhances the performance of the overall model. We found this strategy to perform better than alternative approaches such as KL annealing. Although conceptually simple, our model achieves state-of-the-art results on standard speech benchmarks such as TIMIT and Blizzard and competitive performance on sequential MNIST. Finally, we apply our model to language modeling on the IMDB dataset where the auxiliary cost helps in learning interpretable latent variables. Source Code: \url{https://github.com/anirudh9119/zforcing_nips17}
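
The comparison with KL annealing can be illustrated with a hypothetical objective: the KL term is ramped up over training while the auxiliary backward-state reconstruction cost keeps a constant weight. The linear schedule, warm-up length, and weights below are assumptions for illustration only.

```python
# Illustrative only: linear KL annealing (the baseline the abstract compares against)
# alongside a fixed-weight auxiliary cost.
def kl_weight(step, warmup_steps=10000):
    """Linear KL annealing: ramp the KL term from 0 to 1 over warmup_steps."""
    return min(1.0, step / warmup_steps)

AUX_WEIGHT = 1.0  # the auxiliary cost is kept at a constant weight rather than annealed

def total_loss(nll, kl, aux, step):
    # ELBO-style objective with an annealed KL term plus the un-annealed auxiliary cost.
    return nll + kl_weight(step) * kl + AUX_WEIGHT * aux

print(total_loss(2.0, 1.0, 0.5, step=2500))   # KL weight is 0.25 at this point
```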


A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues

AAAI Conferences

Sequential data often possesses hierarchical structures with complex dependencies between sub-sequences, such as those found between the utterances in a dialogue. To model these dependencies in a generative framework, we propose a neural network-based generative architecture, with stochastic latent variables that span a variable number of time steps. We apply the proposed model to the task of dialogue response generation and compare it with other recent neural-network architectures. We evaluate the model performance through a human evaluation study. The experiments demonstrate that our model improves upon recently proposed models and that the latent variables facilitate both generating meaningful, long, and diverse responses and maintaining dialogue state.
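
A rough, hypothetical sketch of this hierarchical latent-variable structure in PyTorch is given below: a word-level utterance encoder, an utterance-level context RNN, and a per-utterance latent variable sampled from a context-conditioned prior that initializes the response decoder. Layer sizes, the single-latent simplification, and the teacher-forced decoding are assumptions rather than the published configuration.

```python
# Hypothetical hierarchical latent-variable encoder-decoder for dialogue.
import torch
import torch.nn as nn

class HierLatentDialog(nn.Module):
    def __init__(self, vocab=1000, emb=64, utt_h=128, ctx_h=128, z_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.utt_enc = nn.GRU(emb, utt_h, batch_first=True)        # word-level encoder per utterance
        self.ctx_rnn = nn.GRU(utt_h, ctx_h, batch_first=True)      # utterance-level (dialogue) RNN
        self.prior = nn.Linear(ctx_h, 2 * z_dim)                   # p(z | context) for the next utterance
        self.decoder = nn.GRU(emb, ctx_h + z_dim, batch_first=True)
        self.out = nn.Linear(ctx_h + z_dim, vocab)

    def forward(self, dialogue):                                   # dialogue: (batch, n_utterances, n_words)
        B, U, W = dialogue.shape
        # Encode each utterance, then run the context RNN over utterance summaries.
        words = self.embed(dialogue.view(B * U, W))
        _, utt_last = self.utt_enc(words)                          # (1, B*U, utt_h)
        ctx, _ = self.ctx_rnn(utt_last.reshape(B, U, -1))          # (B, U, ctx_h)
        # Sample a latent variable for the response from the prior; it spans the whole utterance.
        mu, logv = self.prior(ctx[:, -1]).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logv).exp()
        init = torch.cat([ctx[:, -1], z], dim=-1).unsqueeze(0)     # decoder initial state
        dec_in = self.embed(dialogue[:, -1])                       # teacher-forced target utterance (illustrative)
        dec_out, _ = self.decoder(dec_in, init)
        return self.out(dec_out)                                   # word logits for the response

logits = HierLatentDialog()(torch.randint(0, 1000, (2, 3, 12)))
print(logits.shape)                                                # (2, 12, 1000)
```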


Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models

AAAI Conferences

We investigate the task of building open-domain, conversational dialogue systems based on large dialogue corpora using generative models. Generative models produce system responses that are autonomously generated word-by-word, opening up the possibility for realistic, flexible interactions. In support of this goal, we extend the recently proposed hierarchical recurrent encoder-decoder neural network to the dialogue domain, and demonstrate that this model is competitive with state-of-the-art neural language models and back-off n-gram models. We investigate the limitations of this and similar approaches, and show how the model's performance can be improved by bootstrapping the learning from a larger question-answer pair corpus and from pretrained word embeddings.
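
To make the "generated word-by-word" point concrete, here is a small, hypothetical greedy decoding loop that produces a response one token at a time from a dialogue context vector; the generic GRU decoder and token ids are stand-ins, not the paper's exact HRED decoder.

```python
# Illustrative word-by-word greedy decoding from a context vector.
import torch
import torch.nn as nn

vocab, emb_dim, hid = 1000, 64, 128
embed = nn.Embedding(vocab, emb_dim)
decoder = nn.GRUCell(emb_dim, hid)
readout = nn.Linear(hid, vocab)

def generate(context_state, bos_id=1, eos_id=2, max_len=20):
    """Greedy decoding: feed each generated word back in until EOS or max_len."""
    h, word = context_state, torch.tensor([bos_id])
    response = []
    for _ in range(max_len):
        h = decoder(embed(word), h)
        word = readout(h).argmax(dim=-1)            # pick the most likely next word
        if word.item() == eos_id:
            break
        response.append(word.item())
    return response

print(generate(torch.zeros(1, hid)))                # a list of generated token ids
```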


Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models

arXiv.org Artificial Intelligence

We investigate the task of building open-domain, conversational dialogue systems based on large dialogue corpora using generative models. Generative models produce system responses that are autonomously generated word-by-word, opening up the possibility for realistic, flexible interactions. In support of this goal, we extend the recently proposed hierarchical recurrent encoder-decoder neural network to the dialogue domain, and demonstrate that this model is competitive with state-of-the-art neural language models and back-off n-gram models. We investigate the limitations of this and similar approaches, and show how the model's performance can be improved by bootstrapping the learning from a larger question-answer pair corpus and from pretrained word embeddings.
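
The bootstrapping from pretrained word embeddings mentioned above can be sketched as copying available pretrained vectors into the model's embedding matrix before training on the dialogue corpus; the toy vocabulary, random stand-in vectors, and 64-dimensional size below are assumptions for illustration.

```python
# Illustrative initialization of an embedding layer from pretrained word vectors.
import numpy as np
import torch
import torch.nn as nn

vocab = {"<unk>": 0, "hello": 1, "bot": 2}
pretrained = {"hello": np.random.randn(64).astype(np.float32)}   # stand-in for word2vec/GloVe lookups

embedding = nn.Embedding(len(vocab), 64)
with torch.no_grad():
    for word, idx in vocab.items():
        if word in pretrained:                                   # words missing from the pretrained corpus
            embedding.weight[idx] = torch.from_numpy(pretrained[word])   # keep their random initialization
print(embedding(torch.tensor([1])).shape)                        # (1, 64)
```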


Compact Aspect Embedding for Diversified Query Expansions

AAAI Conferences

Diversified query expansion (DQE) based approaches aim to select a set of expansion terms with little redundancy among them while covering as many query aspects as possible. Recently, they have experimentally demonstrated their effectiveness for the task of search result diversification. One challenge faced by existing DQE approaches is how to ensure aspect coverage. In this paper, we propose a novel method for DQE, called compact aspect embedding, which exploits trace norm regularization to learn a low-rank vector space for the query, with each eigenvector of the learnt vector space representing an aspect and the absolute value of its corresponding eigenvalue representing the association strength of that aspect to the query. Meanwhile, each expansion term is mapped into the vector space as well. Based on this novel representation of the query aspects and expansion terms, we design a greedy selection strategy to choose a set of expansion terms that explicitly covers all possible aspects of the query. We test our method on several TREC diversification data sets, and show that our method significantly outperforms state-of-the-art search result diversification approaches.
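
A loose sketch of the selection idea is given below, with an SVD standing in for the trace-norm-regularized learning and entirely synthetic data; the coverage discount and the number of aspects are assumptions for illustration only.

```python
# Hypothetical greedy aspect-coverage selection over a low-rank term space.
import numpy as np

rng = np.random.default_rng(0)
terms = ["jaguar", "car", "engine", "cat", "habitat", "speed"]
features = rng.standard_normal((len(terms), 20))          # synthetic term co-occurrence features

U, s, _ = np.linalg.svd(features, full_matrices=False)
k = 3                                                     # keep a low-rank space: k "aspects"
aspect_weight = s[:k].copy()                              # association strength of each aspect
term_proj = U[:, :k]                                      # each term mapped into the aspect space

selected, remaining = [], aspect_weight.copy()
for _ in range(3):                                        # greedily pick 3 expansion terms
    scores = np.abs(term_proj) @ remaining                # uncovered aspect mass a term would cover
    scores[[terms.index(t) for t in selected]] = -np.inf  # do not pick a term twice
    best = int(np.argmax(scores))
    selected.append(terms[best])
    remaining *= 1.0 - np.minimum(np.abs(term_proj[best]), 1.0)   # discount the aspects now covered
print(selected)
```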


Learning Concept Embeddings for Query Expansion by Quantum Entropy Minimization

AAAI Conferences

In web search, users' queries are formulated using only a few terms, and term-matching retrieval functions can fail to retrieve relevant documents. Given a user query, the technique of query expansion (QE) consists in selecting related terms that could enhance the likelihood of retrieving relevant documents. Selecting such expansion terms is challenging and requires a computational framework capable of encoding complex semantic relationships. In this paper, we propose a novel method for learning, in a supervised way, semantic representations for words and phrases. By embedding queries and documents in special matrices, our model has greater representational power than existing approaches that adopt a vector representation. We show that our model produces high-quality query expansion terms. Our expansion terms improve IR measures beyond those obtained with current word-embedding models and well-established traditional QE methods.
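
The matrix-representation idea can be illustrated, very loosely, by building a density-matrix-like representation of the query from word vectors and ranking candidate expansion terms by trace similarity; the random vectors, unit-trace construction, and trace score below are assumptions, not the learned embeddings or scoring of the paper.

```python
# Hypothetical matrix (density-matrix-like) representation for query expansion ranking.
import numpy as np

rng = np.random.default_rng(1)
dim = 8
word_vecs = {w: rng.standard_normal(dim) for w in ["apple", "fruit", "iphone", "orchard", "mac"]}

def density(vectors):
    """Average of normalized outer products: a PSD matrix representation with unit trace."""
    rho = sum(np.outer(v, v) / np.dot(v, v) for v in vectors)
    return rho / len(vectors)

query_rho = density([word_vecs["apple"], word_vecs["fruit"]])
candidates = ["iphone", "orchard", "mac"]
scores = {t: float(np.trace(query_rho @ density([word_vecs[t]]))) for t in candidates}
print(sorted(scores, key=scores.get, reverse=True))   # candidate expansion terms, best first
```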