Mixability made efficient: Fast online multiclass logistic regression
Mixability has been shown to be a powerful tool to obtain algorithms with optimal regret. However, the resulting methods often suffer from high computational complexity, which has limited their practical applicability. For example, in the case of multiclass logistic regression, the aggregating forecaster (Foster et al., 2018) achieves a regret of O(log(Bn)), whereas Online Newton Step achieves O(e^B log(n)), where B bounds the norm of the comparators.
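For intuition, here is a minimal sketch of Online Newton Step for the binary logistic case; the step size gamma, the regularizer eps, and the omission of the projection step are simplifying assumptions, not the paper's method (which targets the multiclass setting).

```python
# Minimal sketch of Online Newton Step (ONS) for online binary logistic regression.
# Assumptions: binary labels, fixed gamma/eps, projection step omitted for brevity.
import numpy as np

def ons_logistic(X, y, eps=1.0, gamma=0.5):
    """X: (n, d) features; y: (n,) labels in {-1, +1}. Returns the final weights."""
    n, d = X.shape
    w = np.zeros(d)
    A = eps * np.eye(d)                                   # A_t = eps*I + sum_s g_s g_s^T
    for t in range(n):
        x, label = X[t], y[t]
        g = -label * x / (1.0 + np.exp(label * (w @ x)))  # gradient of log(1+exp(-y<w,x>))
        A += np.outer(g, g)
        w -= (1.0 / gamma) * np.linalg.solve(A, g)        # Newton-style step
    return w
```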
Linearly Converging Error Compensated SGD
In this paper, we propose a unified analysis of variants of distributed SGD with arbitrary compression and delayed updates. Our framework is general enough to cover different variants of quantized SGD, Error-Compensated SGD (EC-SGD), and SGD with delayed updates (D-SGD). Via a single theorem, we derive complexity results for all methods that fit our framework; for the existing methods, this theorem recovers the best-known complexity results. Moreover, using our general scheme, we develop new variants of SGD that combine variance reduction or arbitrary sampling with error feedback and quantization, and derive convergence rates for these methods that beat the state-of-the-art results. To illustrate the strength of our framework, we develop 16 new methods that fit it. In particular, we propose the first method, called EC-SGD-DIANA, that is based on error feedback for biased compression operators and quantization of gradient differences, and we prove convergence guarantees showing that EC-SGD-DIANA converges to the exact optimum asymptotically in expectation with a constant learning rate for both convex and strongly convex objectives when workers compute full gradients of their loss functions. Moreover, for the case when the loss function of each worker has the form of a finite sum, we modify the method to obtain a new one, called EC-LSVRG-DIANA, which is the first distributed stochastic method with error feedback and variance reduction that converges to the exact optimum asymptotically in expectation with a constant learning rate.
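For intuition about the error-feedback mechanism the paper analyzes, here is a minimal single-worker sketch of error-compensated SGD with a biased top-k compressor; the compressor, step size, and quadratic example are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of error-compensated SGD (error feedback) with top-k compression.
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest (biased compression)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ec_sgd(grad, x0, lr=0.1, k=2, steps=100):
    x, e = x0.copy(), np.zeros_like(x0)      # e accumulates the compression error
    for _ in range(steps):
        g = grad(x)
        v = top_k(e + lr * g, k)             # compress the error-corrected update
        e = e + lr * g - v                   # feed back what the compressor dropped
        x = x - v
    return x

# Example: minimize ||x - b||^2 / 2 in R^5.
b = np.arange(5, dtype=float)
x_star = ec_sgd(lambda x: x - b, np.zeros(5))
```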
Video game industry makes it easier to find accessible games for disabled players
The Entertainment Software Association (ESA), a national video game industry trade association, unveiled a new Accessible Games Initiative this week, intended to standardize information for players with disabilities and backed by major names in the world of gaming, including Electronic Arts (EA), Nintendo, and Ubisoft. Announced at this year's Game Developers Conference, the accessibility initiative includes 24 new tags and associated criteria that elucidate in-game features or controls, helping players better understand whether they'll be able to play a game before they buy it. Examples include "clear text," "large and clear subtitles," and "narrated menus," which enable access for people who are blind or have low vision. Tags like "playable with buttons only," "playable without touch controls," and "stick inversion" support players with a range of motor abilities. According to the ESA, standardized tags will make it easier for players with disabilities to find and assess games with built-in accessibility features or assistive-device compatibility, and even provide useful information for parents and teachers seeking out games for young children.
Efficient Bayesian Learning Curve Extrapolation using Prior-Data Fitted Networks
Learning curve extrapolation aims to predict model performance in later epochs of training, based on the performance in earlier epochs. In this work, we argue that, while the inherent uncertainty in the extrapolation of learning curves warrants a Bayesian approach, existing methods are (i) overly restrictive, and/or (ii) computationally expensive. We describe the first application of prior-data fitted neural networks (PFNs) in this context. A PFN is a transformer, pre-trained on data generated from a prior, to perform approximate Bayesian inference in a single forward pass. We propose LC-PFN, a PFN trained to extrapolate artificial right-censored learning curves generated from a parametric prior proposed in prior art using MCMC. We demonstrate that LC-PFN can approximate the posterior predictive distribution over learning curves more accurately than MCMC, while being over 10 000 times faster. We also show that the same LC-PFN achieves competitive performance when extrapolating a total of 20 000 real learning curves from four learning curve benchmarks (LCBench, NAS-Bench-201, Taskset, and PD1) that stem from training a wide range of model architectures (MLPs, CNNs, RNNs, and Transformers) on 53 different datasets with varying input modalities (tabular, image, text, and protein data). Finally, we investigate its potential in the context of model selection and find that a simple LC-PFN-based predictive early-stopping criterion obtains 2-6x speed-ups on 45 of these datasets, at virtually no overhead.
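To illustrate what "pre-training on data generated from a prior" can look like, here is a minimal sketch that samples right-censored synthetic learning curves from a simple parametric family; the pow3 form f(t) = c - a * t^(-alpha) and all hyperparameter ranges are assumptions for illustration, as the paper's prior is richer.

```python
# Minimal sketch: sampling right-censored learning curves from a simple parametric prior.
import numpy as np

rng = np.random.default_rng(0)

def sample_curve(T=50):
    c = rng.uniform(0.5, 1.0)                # asymptotic performance
    a = rng.uniform(0.1, 0.5)                # initial gap to the asymptote
    alpha = rng.uniform(0.3, 1.5)            # convergence rate
    t = np.arange(1, T + 1)
    curve = c - a * t ** (-alpha) + rng.normal(0, 0.01, T)   # observation noise
    cutoff = rng.integers(5, T)              # right-censoring: only a prefix is observed
    return curve[:cutoff], curve             # (observed prefix, full target curve)

observed, full = sample_curve()              # one (input, target) pair for pre-training
```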
Appendix
At the beginning of this Appendix, we give the overall organization of the Appendix and a notation table for the paper. We then describe the lower bound on the warm-up duration and briefly comment on its role in achieving the regret result. We also provide Lemma A.1, which shows that any stabilizing policy can be well-approximated by an element of M. In particular, in Appendix E.1 we show persistence of excitation during the warm-up; in Appendix E.2 we formally define the persistence of excitation property of the controllers in M, i.e., (43); and finally in Appendix E.3 we show that the deployed control policies satisfy this property. In Appendix G, we state the formal regret result of the paper, Theorem 5, and provide its proof. Appendix H briefly considers the case where the loss functions are convex. Finally, in Appendix I, we provide the supporting technical theorems and lemmas.
Logarithmic Regret Bound in Partially Observable Linear Dynamical Systems
We study the problem of system identification and adaptive control in partially observable linear dynamical systems. Adaptive and closed-loop system identification is a challenging problem due to the correlations introduced by data collection. In this paper, we present the first model estimation method with finite-time guarantees in both open- and closed-loop system identification.
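As a concrete point of reference, here is a minimal sketch of ridge-regularized least-squares estimation of Markov parameters from open-loop input-output data, a common building block in finite-time identification of partially observable LDS; the regressor construction and regularization are illustrative assumptions, not the paper's exact estimator.

```python
# Minimal sketch: estimate Markov parameters G_k in y_t ~ sum_{k=1..H} G_k u_{t-k}.
import numpy as np

def estimate_markov_parameters(u, y, H, reg=1e-6):
    """u: (T, m) inputs, y: (T, p) outputs; returns G of shape (H, p, m)."""
    T, m = u.shape
    p = y.shape[1]
    # Build regressors phi_t = [u_{t-1}; ...; u_{t-H}] for t >= H.
    Phi = np.stack([u[t - H:t][::-1].ravel() for t in range(H, T)])  # (T-H, H*m)
    Y = y[H:]                                                        # (T-H, p)
    # Ridge-regularized least squares.
    G_flat = np.linalg.solve(Phi.T @ Phi + reg * np.eye(H * m), Phi.T @ Y)
    return G_flat.T.reshape(p, H, m).transpose(1, 0, 2)
```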
We propose the first finite-time system identification algorithm for partially observable linear dynamical systems (LDS).
We thank the reviewers for their effort and insightful comments during these unprecedented times. LQR & LQG are among the few continuous settings where optimal policies exist (and largely admit closed forms) [1]. Therefore, we do not see why this paper would be less relevant to our community. If PE is absent, we provide two general algorithms stated in Cor. The agent uses a warm-up period of O(√T), after which it commits to a controller, yielding a regret of O(√T). R1: We are delighted to hear the kind words of R1 about our novel results.
Learning State Representations from Random Deep Action-conditional Predictions
Our main contribution in this work is an empirical finding that random General Value Functions (GVFs), i.e., deep action-conditional predictions that are random both in which features of the observations they predict and in the sequences of actions the predictions are conditioned upon, form good auxiliary tasks for reinforcement learning (RL) problems. In particular, we show that random deep action-conditional predictions, when used as auxiliary tasks, yield state representations that produce control performance competitive with state-of-the-art hand-crafted auxiliary tasks such as value prediction, pixel control, and CURL in both Atari and DeepMind Lab tasks. In another set of experiments, we stop the gradients from the RL part of the network to the state-representation learning part and show, perhaps surprisingly, that the auxiliary tasks alone are sufficient to learn state representations good enough to outperform an end-to-end trained actor-critic baseline.
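To make the auxiliary-task construction concrete, here is a minimal one-step sketch of a random action-conditional prediction loss; the network shapes, the fixed random projection used as the cumulant, and the reduction to one-step predictions (the paper uses multi-step, tree-structured GVFs) are all illustrative assumptions.

```python
# Minimal sketch: one-step random action-conditional predictions as an auxiliary loss.
# Each head answers "what will random feature f of the observation be if I take action a?"
# and is trained only on steps where action a was actually taken.
import torch

num_actions, feat_dim, obs_dim, hid = 4, 8, 64, 128
proj = torch.randn(obs_dim, feat_dim)                  # fixed random feature of the observation
trunk = torch.nn.Sequential(torch.nn.Linear(obs_dim, hid), torch.nn.ReLU())
heads = torch.nn.Linear(hid, num_actions * feat_dim)   # one prediction per (action, feature)

def aux_loss(obs, act, next_obs):
    """obs: (B, obs_dim), act: (B,) action indices, next_obs: (B, obs_dim)."""
    pred = heads(trunk(obs)).view(-1, num_actions, feat_dim)
    target = next_obs @ proj                           # random cumulant at t+1
    taken = pred[torch.arange(len(act)), act]          # only the head of the action taken
    return ((taken - target.detach()) ** 2).mean()
```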