Toni, Laura
Jensen-Shannon Information Based Characterization of the Generalization Error of Learning Algorithms
Aminian, Gholamali, Toni, Laura, Rodrigues, Miguel R. D.
Machine learning-based approaches are increasingly adopted to solve various prediction problems in a wide range of applications such as computer vision, speech recognition, speech translation, and many more [1], [2]. In particular, supervised machine learning approaches learn a predictor - also known as a hypothesis - mapping some input variable to an output variable using some algorithm that leverages a series of input-output examples drawn from some underlying (and unknown) distribution [1]. It is therefore critical to [...] Bounds based on Wasserstein distances [12], [13] and bounds based on other divergences [14] are also known. In this work, we also concentrate on the characterization of the generalization ability of (supervised) machine learning algorithms by making a series of contributions: 1) First, we offer a new approach to bound the (expected) generalization error of learning algorithms based on the use of auxiliary distributions imposed both on the data generation and the hypothesis generation processes.
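For context, a canonical result in this information-theoretic family is the mutual-information bound of Xu and Raginsky, which divergence-based bounds of the kind studied here refine; stated informally (our paraphrase, with standard notation: hypothesis W, training set S of n i.i.d. samples, algorithm P_{W|S}):

```latex
% If the loss \ell(w, Z) is \sigma-sub-Gaussian under the data distribution,
% the expected generalization error of the algorithm P_{W|S} satisfies
\left| \mathbb{E}\!\left[ \mathrm{gen}(W, S) \right] \right|
  \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(W; S)} .
```

Bounds based on the Jensen-Shannon divergence remain finite even when the mutual information I(W; S) diverges, which is one motivation for replacing it with other divergences.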
Graph signal processing for machine learning: A review and new perspectives
Dong, Xiaowen, Thanou, Dorina, Toni, Laura, Bronstein, Michael, Frossard, Pascal
The effective representation, processing, analysis, and visualization of large-scale structured data, especially those related to complex domains such as networks and graphs, are among the key questions in modern machine learning. Graph signal processing (GSP), a vibrant branch of signal processing models and algorithms that aims at handling data supported on graphs, opens new paths of research to address this challenge. In this article, we review a few important contributions made by GSP concepts and tools, such as graph filters and transforms, to the development of novel machine learning algorithms. In particular, our discussion focuses on the following three aspects: exploiting data structure and relational priors, improving data and computational efficiency, and enhancing model interpretability. Furthermore, we provide new perspectives on the future development of GSP techniques that may serve as a bridge between applied mathematics and signal processing on one side, and machine learning and network science on the other. Cross-fertilization across these different disciplines may help unlock the numerous challenges of complex data analysis in the modern age.
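As a toy illustration of the basic GSP primitive the article reviews, the sketch below (our own example, not code from the article) applies a polynomial graph filter, a weighted sum of Laplacian powers applied to a graph signal:

```python
import numpy as np

def graph_filter(W, x, h):
    """Apply the polynomial graph filter sum_k h[k] * L^k to signal x,
    where L = D - W is the combinatorial Laplacian of adjacency W."""
    L = np.diag(W.sum(axis=1)) - W
    out = np.zeros_like(x, dtype=float)
    Lx = np.asarray(x, dtype=float)  # running power L^k @ x, starting at k = 0
    for c in h:
        out += c * Lx
        Lx = L @ Lx
    return out
```

Choosing the coefficients h shapes the filter's spectral response; low-order polynomials with decaying coefficients act as low-pass (smoothing) filters on the graph.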
Differentiable Linear Bandit Algorithm
Yang, Kaige, Toni, Laura
Upper Confidence Bound (UCB) is arguably the most commonly used method for linear multi-arm bandit problems. While conceptually and computationally simple, this method relies heavily on the confidence bounds, failing to strike the optimal exploration-exploitation trade-off if these bounds are not properly set. In the literature, confidence bounds are typically derived from concentration inequalities based on assumptions on the reward distribution, e.g., sub-Gaussianity. The validity of these assumptions, however, is unknown in practice. In this work, we aim at learning the confidence bound in a data-driven fashion, making it adaptive to the actual problem structure. Specifically, noting that existing UCB-type algorithms are not differentiable with respect to the confidence bound, we first propose a novel differentiable linear bandit algorithm. Then, we introduce a gradient estimator, which allows the confidence bound to be learned via gradient ascent. Theoretically, we show that the proposed algorithm achieves a $\tilde{\mathcal{O}}(\hat{\beta}\sqrt{dT})$ upper bound on the $T$-round regret, where $d$ is the dimension of arm features and $\hat{\beta}$ is the learned size of the confidence bound. Empirical results show that $\hat{\beta}$ is significantly smaller than its theoretical upper bound and that the proposed algorithm outperforms baseline ones on both simulated and real-world datasets.
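To make the role of the confidence-bound size concrete, here is a minimal fixed-beta LinUCB sketch (our illustration; the paper's contribution is precisely to replace the hand-set beta below with one learned by gradient ascent):

```python
import numpy as np

def linucb(arms, theta_true, beta=1.0, T=2000, noise=0.1, seed=0):
    """Minimal fixed-beta LinUCB: ridge estimate plus a confidence-width
    bonus of size beta; returns the final estimate of theta."""
    rng = np.random.default_rng(seed)
    d = arms.shape[1]
    A = np.eye(d)               # regularized Gram matrix (ridge term)
    b = np.zeros(d)
    for _ in range(T):
        theta_hat = np.linalg.solve(A, b)
        A_inv = np.linalg.inv(A)
        # confidence width per arm: sqrt(x^T A^{-1} x)
        widths = np.sqrt(np.einsum('ij,jk,ik->i', arms, A_inv, arms))
        a = int(np.argmax(arms @ theta_hat + beta * widths))
        x = arms[a]
        r = x @ theta_true + noise * rng.standard_normal()
        A += np.outer(x, x)     # rank-one Gram-matrix update
        b += r * x
    return np.linalg.solve(A, b)
```

An oversized beta wastes rounds on exploration, while an undersized one can lock onto a suboptimal arm; this sensitivity is what motivates learning beta from data.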
State2vec: Off-Policy Successor Features Approximators
Madjiheurem, Sephora, Toni, Laura
A major challenge in reinforcement learning (RL) is the design of agents that are able to generalize across tasks that share common dynamics. A viable solution is meta-reinforcement learning, which identifies common structures among past tasks to be then generalized to new tasks (meta-test). In meta-training, the RL agent learns state representations that encode prior information from a set of tasks, used to generalize the value function approximation. This has been proposed in the literature as successor representation approximators. While promising, these methods do not generalize well across optimal policies, leading to sample inefficiency during meta-test phases. In this paper, we propose state2vec, an efficient and low-complexity framework for learning successor features which (i) generalize across policies and (ii) ensure sample efficiency during meta-test. We extend the well-known node2vec framework to learn state embeddings that account for the discounted future state transitions in RL. The proposed off-policy state2vec captures the geometry of the underlying state space, yielding good basis functions for linear value function approximation.
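The classic successor representation that such features approximate has a closed form for a fixed policy; the sketch below (our illustrative code, not the state2vec implementation) computes it on a small tabular MDP:

```python
import numpy as np

def successor_representation(P, gamma=0.9):
    """Closed-form successor representation M = (I - gamma * P)^{-1} for a
    fixed policy with transition matrix P. Row s collects the expected
    discounted future occupancies of every state -- the quantity that
    successor-feature methods approximate in large state spaces."""
    n = P.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * P)
```

Because V = M @ r solves the Bellman equation V = r + gamma * P @ V exactly, successor-style features make natural basis functions for linear value function approximation.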
Error Analysis on Graph Laplacian Regularized Estimator
Yang, Kaige, Dong, Xiaowen, Toni, Laura
We provide a theoretical analysis of the representation learning problem aimed at learning the latent variables (design matrix) $\Theta$ of observations $Y$ with the knowledge of the coefficient matrix $X$. The design matrix is learned under the assumption that the latent variables $\Theta$ are smooth with respect to a (known) topological structure $\mathcal{G}$. To learn such latent variables, we study a graph Laplacian regularized estimator, which is the penalized least squares estimator with penalty term proportional to a Laplacian quadratic form. This type of estimator has recently received considerable attention due to its capability of incorporating the underlying topological graph structure of variables into the learning process. While the estimation problem can be solved efficiently by state-of-the-art optimization techniques, its statistical consistency properties have been largely overlooked. In this work, we develop a non-asymptotic bound on the estimation error under the classical statistical setting, where the sample size is larger than the ambient dimension of the latent variables. This bound illustrates theoretically the impact of the alignment between the data and the graph structure as well as the graph spectrum on the estimation accuracy. It also provides theoretical evidence of the advantage, in terms of convergence rate, of the graph Laplacian regularized estimator over classical ones (that ignore the graph structure) in case of a smoothness prior. Finally, we provide empirical results of the estimation error to corroborate the theoretical analysis.
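As a toy instance of such an estimator (our sketch, taking the coefficient matrix to be the identity so the penalized least squares has a simple closed form):

```python
import numpy as np

def graph_regularized_estimate(Y, W, alpha):
    """Graph Laplacian regularized least squares in the X = I special case:
    argmin_theta ||Y - theta||^2 + alpha * theta^T L theta, whose closed
    form theta = (I + alpha * L)^{-1} Y acts as a graph low-pass smoother."""
    L = np.diag(W.sum(axis=1)) - W   # combinatorial Laplacian of adjacency W
    return np.linalg.solve(np.eye(L.shape[0]) + alpha * L, Y)
```

When the true signal is smooth on the graph, the estimate attains a lower error than the raw observations, in line with the convergence-rate advantage the paper quantifies.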
Representation Learning on Graphs: A Reinforcement Learning Application
Madjiheurem, Sephora, Toni, Laura
In this work, we study value function approximation in reinforcement learning (RL) problems with high dimensional state or action spaces via a generalized version of representation policy iteration (RPI). We consider the limitations of proto-value functions (PVFs) at accurately approximating the value function in low dimensions and we highlight the importance of feature learning for an improved low-dimensional value function approximation. Then, we adopt different representation learning algorithms on graphs to learn the basis functions that best represent the value function. We empirically show that node2vec, an algorithm for scalable feature learning in networks, and the Variational Graph Auto-Encoder consistently outperform the commonly used smooth proto-value functions in low-dimensional feature space.
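For reference, the PVF baseline that the learned embeddings are compared against can be sketched as follows (our illustration): the basis consists of the k smoothest eigenvectors of the state-graph Laplacian, and the value function is fit by least squares in that basis.

```python
import numpy as np

def pvf_basis(W, k):
    """Proto-value functions: the k eigenvectors of the state-graph
    Laplacian with smallest eigenvalues (the smoothest graph modes)."""
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)        # eigenvalues returned in ascending order
    return vecs[:, :k]

def fit_values(Phi, V):
    """Least-squares projection of the value function V onto the basis Phi."""
    w, *_ = np.linalg.lstsq(Phi, V, rcond=None)
    return Phi @ w
```

Smooth value functions are well captured by a few such modes, while sharply varying ones are not, which is exactly the limitation that motivates learned representations.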
Graph-Based Recommendation System
Yang, Kaige, Toni, Laura
In this work, we study recommendation systems modelled as contextual multi-armed bandit (MAB) problems. We propose a graph-based recommendation system that learns and exploits the geometry of the user space to create meaningful clusters in the user domain. This reduces the dimensionality of the recommendation problem while preserving the accuracy of MAB. We then study the effect of graph sparsity and cluster size on the MAB performance and provide exhaustive simulation results on both synthetic and real-world datasets. Simulation results show improvements with respect to state-of-the-art MAB algorithms.
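A minimal version of the clustering step (our sketch; the paper's system is richer) splits users by the sign of the Fiedler vector of the user graph, after which one bandit instance can be run per cluster instead of per user:

```python
import numpy as np

def fiedler_clusters(W):
    """Two-way spectral cut of the user graph: label users by the sign of
    the Fiedler vector (the eigenvector of the second-smallest Laplacian
    eigenvalue), a standard relaxation of the minimum graph cut."""
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)
    return (vecs[:, 1] > 0).astype(int)
```

Pooling the feedback of all users in a cluster reduces the effective dimensionality of the bandit problem, at the cost of the within-cluster heterogeneity that the sparsity/cluster-size study quantifies.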