AITopics | Michalski, Vincent

Behaviour Discovery and Attribution for Explainable Reinforcement Learning

Rishav, Rishav, Nath, Somjit, Michalski, Vincent, Kahou, Samira Ebrahimi

arXiv.org Artificial Intelligence, Mar-19-2025

Explaining the decisions made by reinforcement learning (RL) agents is critical for building trust and ensuring reliability in real-world applications. Traditional approaches to explainability often rely on saliency analysis, which can be limited in providing actionable insights. Recently, there has been growing interest in attributing RL decisions to specific trajectories within a dataset. However, these methods often generalize explanations to long trajectories, potentially involving multiple distinct behaviors. Often, providing multiple, more fine-grained explanations would improve clarity. In this work, we propose a framework for behavior discovery and action attribution to behaviors in offline RL trajectories. Our method identifies meaningful behavioral segments, enabling more precise and granular explanations associated with high-level agent behaviors. This approach is adaptable across diverse environments with minimal modifications, offering a scalable and versatile solution for behavior discovery and attribution for explainable RL.

artificial intelligence, machine learning, reinforcement learning, (11 more...)

arXiv.org Artificial Intelligence

2503.14973

Country:

North America > United States > Massachusetts (0.14)
North America > Canada > Quebec (0.14)

Genre: Research Report (0.42)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

On the Limits of Multi-modal Meta-Learning with Auxiliary Task Modulation Using Conditional Batch Normalization

Armengol-Estapé, Jordi, Michalski, Vincent, Kumar, Ramnath, St-Charles, Pierre-Luc, Precup, Doina, Kahou, Samira Ebrahimi

arXiv.org Artificial Intelligence, May-30-2024

Few-shot learning aims to learn representations that can tackle novel tasks given a small number of examples. Recent studies show that cross-modal learning can improve representations for few-shot classification. More specifically, language is a rich modality that can be used to guide visual learning. In this work, we experiment with a multi-modal architecture for few-shot learning that consists of three components: a classifier, an auxiliary network, and a bridge network. While the classifier performs the main classification task, the auxiliary network learns to predict language representations from the same input, and the bridge network transforms high-level features of the auxiliary network into modulation parameters for layers of the few-shot classifier using conditional batch normalization. The bridge should encourage a form of lightweight semantic alignment between language and vision which could be useful for the classifier. However, after evaluating the proposed approach on two popular few-shot classification benchmarks we find that a) the improvements do not reproduce across benchmarks, and b) when they do, the improvements are due to the additional compute and parameters introduced by the bridge network. We contribute insights and recommendations for future work in multi-modal meta-learning, especially when using language representations.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2405.18751

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Quebec > Montreal (0.14)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Accounting for Variance in Machine Learning Benchmarks

Bouthillier, Xavier, Delaunay, Pierre, Bronzi, Mirko, Trofimov, Assya, Nichyporuk, Brennan, Szeto, Justin, Sepah, Naz, Raff, Edward, Madan, Kanika, Voleti, Vikram, Kahou, Samira Ebrahimi, Michalski, Vincent, Serdyuk, Dmitriy, Arbel, Tal, Pal, Chris, Varoquaux, Gaël, Vincent, Pascal

arXiv.org Machine Learning, Mar-1-2021

Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameter choices. This is prohibitively expensive, and corners are cut to reach conclusions. We model the whole benchmarking process, revealing that variance due to data sampling, parameter initialization, and hyperparameter choice markedly impacts the results. We analyze the predominant comparison methods used today in the light of this variance. We show a counter-intuitive result that adding more sources of variation to an imperfect estimator better approaches the ideal estimator at a 51 times reduction in compute cost. Building on these results, we study the error rate of detecting improvements on five different deep-learning tasks/architectures. This study leads us to propose recommendations for performance comparisons.

deep learning, neural network, variance, (19 more...)

arXiv.org Machine Learning

2103.03098

Country:

Europe (0.93)
North America > United States > Maryland (0.28)
North America > Canada > Quebec > Montreal (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Modeling Deep Temporal Dependencies with Recurrent Grammar Cells

Michalski, Vincent, Memisevic, Roland, Konda, Kishore

Neural Information Processing Systems, Feb-14-2020, 08:57:59 GMT

We propose modeling time series by representing the transformations that take a frame at time t to a frame at time t+1. To this end we show how a bi-linear model of transformations, such as a gated autoencoder, can be turned into a recurrent network, by training it to predict future frames from the current one and the inferred transformation using backprop-through-time. We also show how stacking multiple layers of gating units in a recurrent pyramid makes it possible to represent the "syntax" of complicated time series, and that it can outperform standard recurrent neural networks in terms of prediction accuracy on a variety of tasks. Papers published at the Neural Information Processing Systems Conference.

artificial intelligence, modeling deep temporal dependency, neural network, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Towards Deep Conversational Recommendations

Li, Raymond, Kahou, Samira Ebrahimi, Schulz, Hannes, Michalski, Vincent, Charlin, Laurent, Pal, Chris

Neural Information Processing Systems, Dec-31-2018

There has been growing interest in using neural networks and deep learning techniques to create dialogue systems. Conversational recommendation is an interesting setting for the scientific exploration of dialogue with natural language as the associated discourse involves goal-driven dialogue that often transforms naturally into more free-form chat. This paper provides two contributions. First, until now there has been no publicly available large-scale data set consisting of real-world dialogues centered around recommendations. To address this issue and to facilitate our exploration here, we have collected ReDial, a data set consisting of over 10,000 conversations centered around the theme of providing movie recommendations. We make this data available to the community for further research. Second, we use this dataset to explore multiple facets of conversational recommendations. In particular we explore new neural architectures, mechanisms and methods suitable for composing conversational recommendation systems. Our dataset allows us to systematically probe model sub-components addressing different parts of the overall problem domain ranging from: sentiment analysis and cold-start recommendation generation to detailed aspects of how natural language is used in this setting in the real world. We combine such sub-components into a full-blown dialogue system and examine its behavior.

deep learning, neural network, recommendation, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.14)
Europe > Germany (0.14)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Towards Deep Conversational Recommendations

Li, Raymond, Kahou, Samira, Schulz, Hannes, Michalski, Vincent, Charlin, Laurent, Pal, Chris

arXiv.org Machine Learning, Dec-18-2018

There has been growing interest in using neural networks and deep learning techniques to create dialogue systems. Conversational recommendation is an interesting setting for the scientific exploration of dialogue with natural language as the associated discourse involves goal-driven dialogue that often transforms naturally into more free-form chat. This paper provides two contributions. First, until now there has been no publicly available large-scale dataset consisting of real-world dialogues centered around recommendations. To address this issue and to facilitate our exploration here, we have collected ReDial, a dataset consisting of over 10,000 conversations centered around the theme of providing movie recommendations. We make this data available to the community for further research. Second, we use this dataset to explore multiple facets of conversational recommendations. In particular we explore new neural architectures, mechanisms, and methods suitable for composing conversational recommendation systems. Our dataset allows us to systematically probe model sub-components addressing different parts of the overall problem domain ranging from: sentiment analysis and cold-start recommendation generation to detailed aspects of how natural language is used in this setting in the real world. We combine such sub-components into a full-blown dialogue system and examine its behavior.

deep learning, neural network, seeker, (20 more...)

arXiv.org Machine Learning

1812.07617

Country:

North America > Canada (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.40)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Deep Reinforcement Learning Chatbot (Short Version)

Serban, Iulian V., Sankar, Chinnadhurai, Germain, Mathieu, Zhang, Saizheng, Lin, Zhouhan, Subramanian, Sandeep, Kim, Taesup, Pieper, Michael, Chandar, Sarath, Ke, Nan Rosemary, Rajeswar, Sai, de Brebisson, Alexandre, Sotelo, Jose M. R., Suhubdy, Dendi, Michalski, Vincent, Nguyen, Alexandre, Pineau, Joelle, Bengio, Yoshua

arXiv.org Machine Learning, Jan-20-2018

We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including neural network and template-based models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than other systems. The results highlight the potential of coupling ensemble systems with deep reinforcement learning as a fruitful path for developing real-world, open-domain conversational agents.

artificial intelligence, experiment, natural language, (19 more...)

arXiv.org Machine Learning

1801.067

Country:

North America > Canada > Quebec > Montreal (0.69)
North America > United States (0.68)

Genre:

Personal > Interview (0.68)
Research Report > New Finding (0.48)

Industry:

Media > Film (0.69)
Leisure & Entertainment (0.47)
Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Deep Reinforcement Learning Chatbot

Serban, Iulian V., Sankar, Chinnadhurai, Germain, Mathieu, Zhang, Saizheng, Lin, Zhouhan, Subramanian, Sandeep, Kim, Taesup, Pieper, Michael, Chandar, Sarath, Ke, Nan Rosemary, Rajeshwar, Sai, de Brebisson, Alexandre, Sotelo, Jose M. R., Suhubdy, Dendi, Michalski, Vincent, Nguyen, Alexandre, Pineau, Joelle, Bengio, Yoshua

arXiv.org Machine Learning, Nov-5-2017

We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including template-based models, bag-of-words models, sequence-to-sequence neural network and latent variable neural network models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than many competing systems. Due to its machine learning architecture, the system is likely to improve with additional data.

candidate response, deep learning, neural network, (22 more...)

arXiv.org Machine Learning

1709.02349

Country:

North America > United States (1.00)
North America > Canada > Quebec > Montreal (0.68)

Genre:

Research Report > New Finding (1.00)
Personal > Interview (1.00)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Government > Regional Government > North America Government > United States Government (0.92)
(2 more...)

Modeling Deep Temporal Dependencies with Recurrent Grammar Cells

Michalski, Vincent, Memisevic, Roland, Konda, Kishore

Neural Information Processing Systems, Dec-31-2014

We propose modeling time series by representing the transformations that take a frame at time t to a frame at time t+1. To this end we show how a bi-linear model of transformations, such as a gated autoencoder, can be turned into a recurrent network, by training it to predict future frames from the current one and the inferred transformation using backprop-through-time. We also show how stacking multiple layers of gating units in a recurrent pyramid makes it possible to represent the "syntax" of complicated time series, and that it can outperform standard recurrent neural networks in terms of prediction accuracy on a variety of tasks.

deep learning, neural network, transformation, (17 more...)

Neural Information Processing Systems

Country: Europe > Germany (0.47)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
