


Online learning with kernel losses

arXiv.org Machine Learning

We present a generalization of the adversarial linear bandits framework, where the underlying losses are kernel functions (with an associated reproducing kernel Hilbert space) rather than linear functions. We study a version of the exponential weights algorithm and bound its regret in this setting. Under conditions on the eigendecay of the kernel we provide a sharp characterization of the regret for this algorithm. When we have polynomial eigendecay $\mu_j \le \mathcal{O}(j^{-\beta})$, we find that the regret is bounded by $\mathcal{R}_n \le \mathcal{O}(n^{\beta/(2(\beta-1))})$; while under the assumption of exponential eigendecay $\mu_j \le \mathcal{O}(e^{-\beta j })$, we get an even tighter bound on the regret $\mathcal{R}_n \le \mathcal{O}(n^{1/2}\log(n)^{1/2})$. We also study the full information setting when the underlying losses are kernel functions and present an adapted exponential weights algorithm and a conditional gradient descent algorithm.


Real-Time Energy Disaggregation of a Distribution Feeder's Demand Using Online Learning

arXiv.org Machine Learning

Though distribution system operators have been adding more sensors to their networks, they still often lack an accurate real-time picture of the behavior of distributed energy resources such as demand responsive electric loads and residential solar generation. Such information could improve system reliability, economic efficiency, and environmental impact. Rather than installing additional, costly sensing and communication infrastructure to obtain additional real-time information, it may be possible to use existing sensing capabilities and leverage knowledge about the system to reduce the need for new infrastructure. In this paper, we disaggregate a distribution feeder's demand measurements into: 1) the demand of a population of air conditioners, and 2) the demand of the remaining loads connected to the feeder. We use an online learning algorithm, Dynamic Fixed Share (DFS), that uses the real-time distribution feeder measurements as well as models generated from historical building- and device-level data. We develop two implementations of the algorithm and conduct case studies using real demand data from households and commercial buildings to investigate the effectiveness of the algorithm. The case studies demonstrate that DFS can effectively perform online disaggregation and that the choice and construction of the models included in the algorithm affect its accuracy, which is comparable to that of a set of Kalman filters.


Online Learning Rate Adaptation with Hypergradient Descent

arXiv.org Machine Learning

We introduce a general method for improving the convergence rate of gradient-based optimizers that is easy to implement and works well in practice. We demonstrate the effectiveness of the method in a range of optimization problems by applying it to stochastic gradient descent, stochastic gradient descent with Nesterov momentum, and Adam, showing that it significantly reduces the need for the manual tuning of the initial learning rate for these commonly used algorithms. Our method works by dynamically updating the learning rate during optimization using the gradient with respect to the learning rate of the update rule itself. Computing this "hypergradient" needs little additional computation, requires only one extra copy of the original gradient to be stored in memory, and relies upon nothing more than what is provided by reverse-mode automatic differentiation.
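As a rough illustration of the idea described above, the following is a minimal sketch of hypergradient descent applied to plain gradient descent. It is not the authors' implementation: the function name `sgd_hd`, the hyper-learning-rate `beta`, the default values, and the quadratic test objective are all assumptions made for this example. The learning rate is nudged by the dot product of the current gradient and the previous one, so only one extra gradient copy is kept.

```python
import numpy as np

def sgd_hd(grad, theta0, alpha0=0.01, beta=1e-3, steps=200):
    """Gradient descent whose learning rate is adapted by a hypergradient.

    grad   : callable returning the gradient at theta (hypothetical interface)
    alpha0 : initial learning rate
    beta   : step size for the learning-rate update (assumed name)
    """
    theta = np.asarray(theta0, dtype=float).copy()
    alpha = alpha0
    prev_grad = np.zeros_like(theta)  # the one extra gradient copy
    for _ in range(steps):
        g = grad(theta)
        # d(loss)/d(alpha) is -g . prev_grad, so descending on alpha adds it
        alpha = alpha + beta * float(g @ prev_grad)
        theta = theta - alpha * g
        prev_grad = g
    return theta, alpha

def quad_grad(theta):
    # gradient of the toy objective 0.5 * ||theta||^2
    return theta

theta, alpha = sgd_hd(quad_grad, [3.0, -2.0])
```

On this toy objective the learning rate grows from its small initial value while successive gradients point in the same direction, which is the behaviour that reduces sensitivity to the initial choice.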


The Many Faces of Exponential Weights in Online Learning

arXiv.org Machine Learning

A standard introduction to online learning might place Online Gradient Descent at its center and then proceed to develop generalizations and extensions like Online Mirror Descent and second-order methods. Here we explore the alternative approach of putting exponential weights (EW) first. We show that many standard methods and their regret bounds then follow as a special case by plugging in suitable surrogate losses and playing the EW posterior mean. For instance, we easily recover Online Gradient Descent by using EW with a Gaussian prior on linearized losses, and, more generally, all instances of Online Mirror Descent based on regular Bregman divergences also correspond to EW with a prior that depends on the mirror map. Furthermore, appropriate quadratic surrogate losses naturally give rise to Online Gradient Descent for strongly convex losses and to Online Newton Step. We further interpret several recent adaptive methods (iProd, Squint, and a variation of Coin Betting for experts) as a series of closely related reductions to exp-concave surrogate losses that are then handled by Exponential Weights. Finally, a benefit of our EW interpretation is that it opens up the possibility of sampling from the EW posterior distribution instead of playing the mean. As already observed by Bubeck and Eldan, this recovers the best-known rate in Online Bandit Linear Optimization.
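The Gaussian-prior claim above can be checked with a short calculation. This is a sketch under simplifying assumptions that are not taken from the abstract: an unconstrained domain, a prior $N(w_1, \sigma^2 I)$, learning rate $\eta$, and linearized losses $\langle g_s, w \rangle$. The EW posterior after $t$ rounds is

$$
P_{t+1}(w) \;\propto\; \exp\!\Big(-\tfrac{1}{2\sigma^2}\|w - w_1\|^2 - \eta \sum_{s=1}^{t} \langle g_s, w \rangle\Big)
\;\propto\; \exp\!\Big(-\tfrac{1}{2\sigma^2}\Big\|w - \Big(w_1 - \eta\sigma^2 \sum_{s=1}^{t} g_s\Big)\Big\|^2\Big),
$$

where the second form follows by completing the square. The posterior is therefore Gaussian with mean $w_{t+1} = w_1 - \eta\sigma^2 \sum_{s=1}^{t} g_s = w_t - \eta\sigma^2 g_t$, which is the unconstrained Online Gradient Descent update with step size $\eta\sigma^2$.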



Advanced Data Mining Projects with R – Udemy

@machinelearnbot

Advanced Data Mining Projects with R takes you one step ahead in understanding the most complex data mining algorithms and implementing them in the popular R language. A follow-up to our course Data Mining Projects in R, this course will teach you how to build your own recommendation engine. You will also implement dimensionality reduction and use it to build a real-world project. Going ahead, you will be introduced to the concept of neural networks and learn how to apply them for predictions, classifications, and forecasting. Finally, you will use ggplot2, plotly and aspects of geomapping to create your own data visualization projects. By the end of this course, you will be well versed in all the advanced data mining techniques and how to implement them using R in any real-world scenario.


Finding Heavily-Weighted Features in Data Streams

arXiv.org Machine Learning

We introduce a new sub-linear space data structure, the Weight-Median Sketch, that captures the most heavily weighted features in linear classifiers trained over data streams. This enables memory-limited execution of several statistical analyses over streams, including online feature selection, streaming data explanation, relative deltoid detection, and streaming estimation of pointwise mutual information. In contrast with related sketches that capture the most commonly occurring features (or items) in a data stream, the Weight-Median Sketch captures the features that are most discriminative of one stream (or class) compared to another. The Weight-Median Sketch adopts the core data structure used in the Count-Sketch, but, instead of sketching counts, it captures sketched gradient updates to the model parameters. We provide a theoretical analysis of this approach that establishes recovery guarantees in the online learning setting, and demonstrate substantial empirical improvements in accuracy-memory trade-offs over alternatives, including count-based sketches and feature hashing.
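For readers unfamiliar with the Count-Sketch that the abstract builds on, here is a minimal generic version. It is only a sketch of the underlying data structure, not the paper's Weight-Median Sketch: the class name, the `depth` and `width` defaults, and the use of `hashlib.blake2b` for hashing are all assumptions made for illustration. Because `update` accepts an arbitrary real-valued `delta`, the same table could in principle accumulate gradient-style updates rather than counts.

```python
import hashlib
import statistics

class CountSketch:
    """Generic Count-Sketch: depth rows of width signed counters."""

    def __init__(self, depth=5, width=256):
        self.depth = depth
        self.width = width
        self.table = [[0.0] * width for _ in range(depth)]

    def _bucket_and_sign(self, row, key):
        # Derive a per-row bucket index and a +/-1 sign from one hash value.
        digest = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        bucket = (h >> 1) % self.width
        sign = 1.0 if h & 1 else -1.0
        return bucket, sign

    def update(self, key, delta):
        # Add a signed copy of delta to one counter in every row.
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(row, key)
            self.table[row][bucket] += sign * delta

    def estimate(self, key):
        # The median across rows limits the effect of hash collisions.
        values = []
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(row, key)
            values.append(sign * self.table[row][bucket])
        return statistics.median(values)

sketch = CountSketch()
sketch.update("feature_a", 1.0)
sketch.update("feature_a", 2.0)
estimate = sketch.estimate("feature_a")
```

Memory use is fixed at `depth * width` counters regardless of how many distinct keys are seen, which is the property that makes this family of sketches attractive for streams.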


Interested in Machine Learning? – Udacity Inc – Medium

#artificialintelligence

Then we invite you to check out this very friendly introduction we made at Udacity! There are actually 19 videos included in this playlist, covering topics like Linear Regression, Neural Networks, Hierarchical Clustering, and more. Really got the Machine Learning fever? Then consider enrolling in our Machine Learning Nanodegree program. It's the best way to learn everything you need to know to become a successful Machine Learning Engineer!


Tableau 10 and Tableau 9.3 Desktop, Server & Data Science

@machinelearnbot

This course teaches Tableau, a business intelligence and analytics tool that has held a leader's position for the past four years. Business Intelligence, Analytics, Data Visualisation, Tableau Desktop, Tableau Server, Tableau & Hadoop, and Tableau & R are the common terms used to find this course. We have included the course content as PowerPoint presentations, the datasets used for visualisation, 2 live case study projects for download, interview questions, and sample resumes/profiles for job seekers. The course is exhaustive and lasts more than 25 hours. It starts with an introduction to the tool and the principles behind data visualisation. From there, Tableau Desktop is explained thoroughly, including the analytical concepts behind the applicable visualisations. Finally, the course ends with an explanation of Tableau Server and the final 2 use cases as projects, along with interview questions for job seekers. Jobs are abundant for Tableau, and salaries are very promising and the highest in this domain. The course also covers the Statistics, Forecasting, Regression models, K-means Clustering, Text Mining, Hadoop and R required for Tableau, and it includes Tableau Desktop and Server concepts in one course.