AITopics | Rabusseau, Guillaume

Collaborating Authors

Rabusseau, Guillaume

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Neural Architecture Search for Class-incremental Learning

Huang, Shenyang, François-Lavet, Vincent, Rabusseau, Guillaume

arXiv.org Machine LearningSep-14-2019

In class-incremental learning, a model learns continuously from a sequential data stream in which new classes occur. Existing methods often rely on static architectures that are manually crafted. These methods can be prone to capacity saturation because a neural network's ability to generalize to new concepts is limited by its fixed capacity. To understand how to expand a continual learner, we focus on the neural architecture design problem in the context of class-incremental learning: at each time step, the learner must optimize its performance on all classes observed so far by selecting the most competitive neural architecture. To tackle this problem, we propose Continual Neural Architecture Search (CNAS): an autoML approach that takes advantage of the sequential nature of class-incremental learning to efficiently and adaptively identify strong architectures in a continual learning setting. We employ a task network to perform the classification task and a reinforcement learning agent as the meta-controller for architecture search. In addition, we apply network transformations to transfer weights from previous learning step and to reduce the size of the architecture search space, thus saving a large amount of computational resources. We evaluate CNAS on the CIFAR-100 dataset under varied incremental learning scenarios with limited computational power (1 GPU). Experimental results demonstrate that CNAS outperforms architectures that are optimized for the entire dataset. In addition, CNAS is at least an order of magnitude more efficient than naively using existing autoML methods.

architecture, artificial intelligence, neural network, (18 more...)

arXiv.org Machine Learning

1909.06686

Country: North America > Canada > Quebec > Montreal (0.14)

Genre:

Workflow (0.68)
Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability

Francois-Lavet, Vincent, Rabusseau, Guillaume, Pineau, Joelle, Ernst, Damien, Fonteneau, Raphael

Journal of Artificial Intelligence ResearchMay-5-2019

This paper provides an analysis of the tradeoff between asymptotic bias (suboptimality with unlimited data) and overfitting (additional suboptimality due to limited data) in the context of reinforcement learning with partial observability. Our theoretical analysis formally characterizes that while potentially increasing the asymptotic bias, a smaller state representation decreases the risk of overfitting. This analysis relies on expressing the quality of a state representation by bounding $L_1$ error terms of the associated belief states. Theoretical results are empirically illustrated when the state representation is a truncated history of observations, both on synthetic POMDPs and on a large-scale POMDP in the context of smartgrids, with real-world data. Finally, similarly to known results in the fully observable setting, we also briefly discuss and empirically illustrate how using function approximators and adapting the discount factor may enhance the tradeoff between asymptotic bias and overfitting in the partially observable context.

machine learning, overfitting and asymptotic bias, reinforcement learning, (18 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.11478

AI Access Foundation

11478

Journal of Artificial Intelligence Research

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > Canada > Alberta (0.14)

Genre: Research Report (0.67)

Industry:

Energy > Power Industry (0.68)
Energy > Renewable > Hydrogen (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Clustering-Oriented Representation Learning with Attractive-Repulsive Loss

Kenyon-Dean, Kian, Cianflone, Andre, Page-Caccia, Lucas, Rabusseau, Guillaume, Cheung, Jackie Chi Kit, Precup, Doina

arXiv.org Artificial IntelligenceDec-18-2018

The standard loss function used to train neural network classifiers, categorical cross-entropy (CCE), seeks to maximize accuracy on the training data; building useful representations is not a necessary byproduct of this objective. In this work, we propose clustering-oriented representation learning (COREL) as an alternative to CCE in the context of a generalized attractive-repulsive loss framework. COREL has the consequence of building latent representations that collectively exhibit the quality of natural clustering within the latent space of the final hidden layer, according to a predefined similarity function. Despite being simple to implement, COREL variants outperform or perform equivalently to CCE in a variety of scenarios, including image and news article classification using both feed-forward and convolutional neural networks. Analysis of the latent spaces created with different similarity functions facilitates insights on the different use cases COREL variants can satisfy, where the Cosine-COREL variant makes a consistently clusterable latent space, while Gaussian-COREL consistently obtains better classification accuracy than CCE.

artificial intelligence, neural network, representation, (18 more...)

arXiv.org Artificial Intelligence

1812.07627

Country: North America > Canada > Quebec (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Hierarchical Methods of Moments

Ruffini, Matteo, Rabusseau, Guillaume, Balle, Borja

arXiv.org Machine LearningOct-17-2018

Spectral methods of moments provide a powerful tool for learning the parameters of latent variable models. Despite their theoretical appeal, the applicability of these methods to real data is still limited due to a lack of robustness to model misspecification. In this paper we present a hierarchical approach to methods of moments to circumvent such limitations. Our method is based on replacing the tensor decomposition step used in previous algorithms with approximate joint diagonalization. Experiments on topic modeling show that our method outperforms previous tensor decomposition methods in terms of speed and model quality.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1810.07468

Country: North America > United States > California (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback

Sequential Coordination of Deep Models for Learning Visual Arithmetic

Crawford, Eric, Rabusseau, Guillaume, Pineau, Joelle

arXiv.org Machine LearningSep-13-2018

Achieving machine intelligence requires a smooth integration of perception and reasoning, yet models developed to date tend to specialize in one or the other; sophisticated manipulation of symbols acquired from rich perceptual spaces has so far proved elusive. Consider a visual arithmetic task, where the goal is to carry out simple arithmetical algorithms on digits presented under natural conditions (e.g. hand-written, placed randomly). We propose a two-tiered architecture for tackling this problem. The lower tier consists of a heterogeneous collection of information processing modules, which can include pre-trained deep neural networks for locating and extracting characters from the image, as well as modules performing symbolic transformations on the representations extracted by perception. The higher tier consists of a controller, trained using reinforcement learning, which coordinates the modules in order to solve the high-level task. For instance, the controller may learn in what contexts to execute the perceptual networks and what symbolic transformations to apply to their outputs. The resulting model is able to solve a variety of tasks in the visual arithmetic domain, and has several advantages over standard, architecturally homogeneous feedforward networks including improved sample efficiency.

deep learning, interface, neural network, (15 more...)

arXiv.org Machine Learning

1809.04988

Country: North America > Canada > Quebec > Montreal (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning

Rabusseau, Guillaume, Li, Tianyu, Precup, Doina

arXiv.org Machine LearningJul-3-2018

In this paper, we unravel a fundamental connection between weighted finite automata~(WFAs) and second-order recurrent neural networks~(2-RNNs): in the case of sequences of discrete symbols, WFAs and 2-RNNs with linear activation functions are expressively equivalent. Motivated by this result, we build upon a recent extension of the spectral learning algorithm to vector-valued WFAs and propose the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors. This algorithm relies on estimating low rank sub-blocks of the so-called Hankel tensor, from which the parameters of a linear 2-RNN can be provably recovered. The performances of the proposed method are assessed in a simulation study.

deep learning, neural network, sequence, (20 more...)

arXiv.org Machine Learning

1807.01406

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)

Add feedback

Learning Graph Weighted Models on Pictures

Amortila, Philip, Rabusseau, Guillaume

arXiv.org Machine LearningJun-21-2018

Graph Weighted Models (GWMs) have recently been proposed as a natural generalization of weighted automata over strings and trees to arbitrary families of labeled graphs (and hypergraphs). A GWM generically associates a labeled graph with a tensor network and computes a value by successive contractions directed by its edges. In this paper, we consider the problem of learning GWMs defined over the graph family of pictures (or 2-dimensional words). As a proof of concept, we consider regression and classification tasks over the simple Bars & Stripes and Shifting Bits picture languages and provide an experimental study investigating whether these languages can be learned in the form of a GWM from positive and negative examples using gradient-based methods. Our results suggest that this is indeed possible and that investigating the use of gradient-based methods to learn picture series and functions computed by GWMs over other families of graphs could be a fruitful direction.

deep learning, gwm, neural network, (15 more...)

arXiv.org Machine Learning

1806.08297

Country: Europe (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Media (0.34)
Leisure & Entertainment (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Hierarchical Methods of Moments

Ruffini, Matteo, Rabusseau, Guillaume, Balle, Borja

Neural Information Processing SystemsDec-31-2017

algorithm, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback

Multitask Spectral Learning of Weighted Automata

Rabusseau, Guillaume, Balle, Borja, Pineau, Joelle

Neural Information Processing SystemsDec-31-2017

We consider the problem of estimating multiple related functions computed by weighted automata (WFA). We first present a natural notion of relatedness between WFAs by considering to which extent several WFAs can share a common underlying representation.We then introduce the novel model of vector-valued WFA which conveniently helps us formalize this notion of relatedness. Finally, we propose a spectral learning algorithm for vector-valued WFAs to tackle the multitask learning problem. By jointly learning multiple tasks in the form of a vector-valued WFA, our algorithm enforces the discovery of a representation space shared between tasks. The benefits of the proposed multitask approach are theoretically motivated and showcased through experiments on both synthetic and real world datasets.

artificial intelligence, machine learning, wfa, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Tensor Regression Networks with various Low-Rank Tensor Approximations

Cao, Xingwei, Rabusseau, Guillaume, Pineau, Joelle

arXiv.org Machine LearningDec-27-2017

Tensor regression networks achieve high rate of compression of model parameters in multilayer perceptrons (MLP) while having slight impact on performances. Tensor regression layer imposes low-rank constraints on the tensor regression layer which replaces the flattening operation of traditional MLP. We investigate tensor regression networks using various low-rank tensor approximations, aiming to leverage the multi-modal structure of high dimensional data by enforcing efficient low-rank constraints. We provide a theoretical analysis giving insights on the choice of the rank parameters. We evaluated performance of proposed model with state-of-the-art deep convolutional models. For CIFAR-10 dataset, we achieved the compression rate of 0.018 with the sacrifice of accuracy less than 1%.

artificial intelligence, decomposition, neural network, (18 more...)

arXiv.org Machine Learning

1712.0952

Country: North America > Canada > Quebec > Montreal (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Add feedback