Statistical Learning
Learning Dynamical Systems via Koopman Operator Regression in Reproducing Kernel Hilbert Spaces
We study a class of dynamical systems modelled as Markov chains that admit an invariant distribution via the corresponding transfer, or Koopman, operator. While data-driven algorithms to reconstruct such operators are well known, their relationship with statistical learning is largely unexplored. We formalize a framework to learn the Koopman operator from finite data trajectories of the dynamical system. We consider the restriction of this operator to a reproducing kernel Hilbert space and introduce a notion of risk, from which different estimators naturally arise. We link the risk with the estimation of the spectral decomposition of the Koopman operator. These observations motivate a reduced-rank operator regression (RRR) estimator. We derive learning bounds for the proposed estimator, holding in both i.i.d. and non-i.i.d. settings.
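To make the idea of regressing an operator from trajectory data concrete, here is a minimal sketch. It is not the RRR estimator of the abstract: it assumes a finite-dimensional linear feature map (the state itself) in place of an RKHS, and it uses plain ridge regression with no rank constraint. The function name `ridge_koopman`, the toy matrix `A`, and the noise level are all illustrative assumptions.

```python
import numpy as np

def ridge_koopman(features_now, features_next, reg=1e-6):
    """Ridge estimate of a Koopman matrix acting on feature coefficients.

    Rows of `features_now` are features of x_t, rows of `features_next`
    are features of x_{t+1}. Minimises the regularised empirical risk
    (1/n) * ||features_next - features_now @ G||_F^2 + reg * ||G||_F^2.
    """
    n, d = features_now.shape
    cov = features_now.T @ features_now / n      # empirical covariance
    cross = features_now.T @ features_next / n   # empirical cross-covariance
    return np.linalg.solve(cov + reg * np.eye(d), cross)

# Toy stable linear system with additive noise (an assumption, for illustration).
rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1],
              [0.0, 0.5]])
states = np.empty((501, 2))
states[0] = rng.standard_normal(2)
for t in range(500):
    states[t + 1] = A @ states[t] + 0.1 * rng.standard_normal(2)

# With identity features, G estimates A transposed, so its eigenvalues
# approximate those of A (the spectral information the abstract refers to).
G = ridge_koopman(states[:-1], states[1:])
eigenvalues = np.linalg.eigvals(G)
```

Under these assumptions the eigenvalues of `G` serve as rough estimates of the Koopman eigenvalues restricted to linear observables; a reduced-rank variant would additionally constrain the rank of `G`.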
Efficient Data Subset Selection to Generalize Training Across Models: Transductive and Inductive Networks
Existing subset selection methods for efficient learning predominantly employ discrete combinatorial and model-specific approaches, which lack generalizability. For an unseen architecture, one cannot use the subset chosen for a different model. To tackle this problem, we propose SUBSELNET, a trainable subset selection framework that generalizes across architectures. Here, we first introduce an attention-based neural gadget that leverages the graph structure of architectures and acts as a surrogate to trained deep neural networks for quick model prediction. Then, we use these predictions to build subset samplers.
UCB-based Algorithms for Multinomial Logistic Regression Bandits
Out of the rich family of generalized linear bandits, perhaps the most well studied are logistic bandits, which are used in problems with binary rewards: for instance, when the learner aims to maximize the profit over a user that can select one of two possible outcomes (e.g., 'click' vs 'no-click'). Despite remarkable recent progress and improved algorithms for logistic bandits, existing works do not address practical situations where the number of outcomes that can be selected by the user is larger than two (e.g., 'click', 'show me later', 'never show again', 'no click'). In this paper, we study such an extension. We use multinomial logit (MNL) to model the probability of each one of K+1 ≥ 2 possible outcomes (+1 stands for the 'not click' outcome): we assume that for a learner's action x_t, the user selects one of K+1 ≥ 2 outcomes, say outcome i, with an MNL probabilistic model with corresponding unknown parameter θ_i. Each outcome i is also associated with a revenue parameter ρ_i, and the goal is to maximize the expected revenue. For this problem, we present MNL-UCB, an upper confidence bound (UCB)-based algorithm that achieves regret Õ(dK√T) with small dependency on problem-dependent constants that can otherwise be arbitrarily large and lead to loose regret bounds. We present numerical simulations that corroborate our theoretical results.
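The MNL outcome model and the expected revenue described above can be sketched as follows. This illustrates the probabilistic model only, not the MNL-UCB algorithm. It assumes the standard MNL parameterization in which the 'not click' outcome has utility fixed to zero, and it further assumes that outcome earns zero revenue; the function names are hypothetical.

```python
import numpy as np

def mnl_probabilities(theta, x):
    """Outcome probabilities under a multinomial logit model.

    theta: array of shape (K, d), one parameter vector per outcome.
    x:     array of shape (d,), the learner's action.
    Returns an array of shape (K + 1,); index 0 is the 'not click'
    outcome, whose utility is fixed to zero (an assumption).
    """
    logits = np.concatenate(([0.0], theta @ x))
    logits = logits - logits.max()   # shift for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

def expected_revenue(theta, rho, x):
    """Expected revenue sum_i rho_i * P(outcome i | x) over the K outcomes.

    The 'not click' outcome is assumed to earn zero revenue.
    """
    probs = mnl_probabilities(theta, x)
    return float(rho @ probs[1:])
```

A learner with known parameters would pick the action maximizing `expected_revenue`; a UCB-style method replaces the unknown parameters with optimistic estimates built from observed outcomes.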