 Mathematical & Statistical Methods


Free Book: Probability and Statistics Cookbook

#artificialintelligence

The format is very similar to a BIG cheat sheet. It is based on literature and in-class material from courses of the statistics department at the University of California, Berkeley, but is also influenced by other sources. To read the PDF version, click here.


10 Examples of Linear Algebra in Machine Learning - Machine Learning Mastery

@machinelearnbot

Linear algebra is a sub-field of mathematics concerned with vectors, matrices, and linear transforms. It is a key foundation of the field of machine learning, from the notation used to describe the operation of algorithms to the implementation of algorithms in code. Although linear algebra is integral to machine learning, the tight relationship is often left unexplained or explained using abstract concepts such as vector spaces or specific matrix operations. In this post, you will discover 10 common, concrete examples from machine learning that you may be familiar with and that use, require, and are best understood using linear algebra.


A Stochastic Semismooth Newton Method for Nonsmooth Nonconvex Optimization

arXiv.org Machine Learning

In this work, we present a globalized stochastic semismooth Newton method for solving stochastic optimization problems involving smooth nonconvex and nonsmooth convex terms in the objective function. We assume that only noisy gradient and Hessian information of the smooth part of the objective function is available via calling stochastic first- and second-order oracles. The proposed method can be seen as a hybrid approach combining stochastic semismooth Newton steps and stochastic proximal gradient steps. Two inexact growth conditions are incorporated to monitor the convergence and the acceptance of the semismooth Newton steps, and it is shown that the algorithm converges globally to stationary points in expectation. Moreover, under standard assumptions and utilizing random matrix concentration inequalities, we prove that the proposed approach locally turns into a pure stochastic semismooth Newton method and converges r-superlinearly with high probability. We present numerical results and comparisons on $\ell_1$-regularized logistic regression and nonconvex binary classification that demonstrate the efficiency of our algorithm.
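The proximal gradient step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it uses an exact gradient of a toy smooth term, 0.5 * ||x - c||^2, where a stochastic variant would substitute a noisy gradient estimate. The names `soft_threshold`, `prox_grad_step`, `c`, `lam` and `step` are hypothetical and chosen only for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_grad_step(x, grad, step, lam):
    """One proximal gradient step for smooth(x) + lam * ||x||_1."""
    return soft_threshold(x - step * grad, step * lam)

# Toy smooth part (an assumption for this sketch): 0.5 * ||x - c||^2,
# whose gradient is x - c. A stochastic method would use a noisy estimate.
c = np.array([3.0, -0.5, 1.5])
lam = 1.0
step = 0.5

x = np.zeros_like(c)
for _ in range(100):
    x = prox_grad_step(x, x - c, step, lam)

# For this toy problem the minimizer is known in closed form.
x_star = soft_threshold(c, lam)
print(x, x_star)
```

The zero in the second coordinate shows the sparsity that the $\ell_1$ term induces.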


Curious Mathematical Object: Hyperlogarithms

@machinelearnbot

Logarithms turn a product of numbers into a sum of numbers: log(xy) = log(x) + log(y). Hyperlogarithms generalize the concept as follows: Hlog(XY) = Hlog(X) + Hlog(Y), where X and Y are any kind of objects, and the product and sum are replaced by operators in some arbitrary space. Here we focus exclusively on operations on sets: XY becomes the intersection of the sets X and Y, and X + Y the union of X and Y. The question is: which functions satisfy Hlog(XY) = Hlog(X) + Hlog(Y)? We assume here that the argument for Hlog is a set X, and the returned value Hlog(X) = Y is another set Y from the same set of sets. Let E = {X, Y, ... } be the set of all potential arguments for Hlog.
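As a small illustration of the defining identity, the sketch below checks candidate functions by brute force on a tiny universe. It assumes E is the power set of a finite set U, which is our choice and not stated in the teaser. The set complement passes the check by De Morgan's law; it is only one candidate and not necessarily the answer the article has in mind. All function names here are hypothetical.

```python
from itertools import combinations

def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def make_complement(universe):
    """Build the map X -> U minus X for a fixed finite universe U."""
    u = frozenset(universe)
    return lambda x: u - x

def is_hyperlog(hlog, sets):
    """Check Hlog(X & Y) == Hlog(X) | Hlog(Y) for every pair in `sets`."""
    return all(hlog(x & y) == hlog(x) | hlog(y)
               for x in sets for y in sets)

U = {1, 2, 3}                  # assumed finite universe
E = powerset(U)                # assumed domain: all subsets of U
complement = make_complement(U)
identity = lambda x: x

print(is_hyperlog(complement, E))  # True, by De Morgan's law
print(is_hyperlog(identity, E))    # False: X & Y != X | Y when X != Y
```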


No Bullshit Guide To Linear Algebra Review - Machine Learning Mastery

#artificialintelligence

There are many books that provide an introduction to the field of linear algebra. Most are textbooks targeted at undergraduate students and are full of theoretical digressions that are barely relevant and mostly distracting to a beginner or practitioner in the field. In this post, you will discover the book "No bullshit guide to linear algebra" that provides a gentle introduction to the field of linear algebra and assumes no prior mathematical knowledge. The book provides an introduction to linear algebra, comparable to an undergraduate university course on the subject.


Computational Optimal Transport

arXiv.org Machine Learning

Optimal Transport (OT) is a mathematical gem at the interface between probability, analysis and optimization. The goal of that theory is to define geometric tools that are useful to compare probability distributions. Earlier contributions originated from Monge's work in the 18th century, to be later rediscovered under a different formalism by Tolstoi in the 1920's, Kantorovich, Hitchcock and Koopmans in the 1940's. The problem was solved numerically by Dantzig in 1949 and others in the 1950's within the framework of linear programming, paving the way for major industrial applications in the second half of the 20th century. OT was later rediscovered under a different light by analysts in the 90's, following important work by Brenier and others, as well as in the computer vision/graphics fields under the name of earth mover's distances. Recent years have witnessed yet another revolution in the spread of OT, thanks to the emergence of approximate solvers that can scale to sizes and dimensions that are relevant to data sciences. Thanks to this newfound scalability, OT is being increasingly used to unlock various problems in imaging sciences (such as color or texture processing), computer vision and graphics (for shape manipulation) or machine learning (for regression, classification and density fitting). This short book reviews OT with a bias toward numerical methods and their applications in data sciences, and sheds light on the theoretical properties of OT that make it particularly useful for some of these applications.
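To make "approximate solvers" concrete, here is a minimal sketch of one well-known approach, entropy-regularized OT solved with Sinkhorn iterations. The abstract does not name a specific solver, so treating Sinkhorn as the example is our assumption. The function name `sinkhorn`, the regularization strength `eps`, and the toy histograms are all illustrative choices.

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.5, n_iters=500):
    """Approximate the OT plan between histograms a and b with cost matrix C.

    Alternately rescales rows and columns of the Gibbs kernel exp(-C / eps)
    so that the resulting plan has marginals a and b.
    """
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)      # match row marginals
        v = b / (K.T @ u)    # match column marginals
    return u[:, None] * K * v[None, :]

# Toy data (assumed): two discrete distributions on the real line.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.5, 1.5])
a = np.array([0.5, 0.3, 0.2])
b = np.array([0.6, 0.4])
C = (x[:, None] - y[None, :]) ** 2   # squared-distance ground cost

P = sinkhorn(a, b, C)
cost = float(np.sum(P * C))          # transport cost of the regularized plan
print(P.sum(axis=1), P.sum(axis=0), cost)
```

Smaller values of `eps` bring the plan closer to the unregularized solution but slow convergence and can underflow, so production code typically works in the log domain.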


Top Resources for Learning Linear Algebra for Machine Learning - Machine Learning Mastery

#artificialintelligence

Linear algebra is a field of mathematics and an important pillar of the field of machine learning. It can be a challenging topic for beginners, or for practitioners who have not looked at the topic in decades. In this post, you will discover how to get help with linear algebra for machine learning.



Linear Algebra Cheat Sheet for Machine Learning - Machine Learning Mastery

#artificialintelligence

The Python numerical computation library NumPy provides many linear algebra functions that may be useful to a machine learning practitioner. In this tutorial, you will discover the key functions for working with vectors and matrices. This is a cheat sheet: all examples are short and assume you are familiar with the operation being performed. You may want to bookmark this page for future reference.
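For a flavor of what such a cheat sheet covers, here is a short sketch of common NumPy linear algebra calls. The selection and the toy matrix are ours, not taken from the tutorial.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

At = A.T                               # transpose
Ab = A @ b                             # matrix-vector product
norm_b = np.linalg.norm(b)             # Euclidean (L2) norm
det_A = np.linalg.det(A)               # determinant
A_inv = np.linalg.inv(A)               # inverse
x = np.linalg.solve(A, b)              # solve A x = b without forming the inverse
eigvals, eigvecs = np.linalg.eigh(A)   # eigendecomposition of a symmetric matrix
U, s, Vt = np.linalg.svd(A)            # singular value decomposition

print(x, det_A, eigvals, s)
```

Preferring `np.linalg.solve` over multiplying by `np.linalg.inv(A)` is the usual choice for linear systems, since it is generally faster and more accurate.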


Projection-Free Online Optimization with Stochastic Gradient: From Convexity to Submodularity

arXiv.org Artificial Intelligence

Online optimization has been a successful framework for solving large-scale problems under computational constraints and partial information. Current methods for online convex optimization require either a projection or exact gradient computation at each step, both of which can be prohibitively expensive for large-scale applications. At the same time, there is a growing trend of non-convex optimization in the machine learning community and a need for online methods. Continuous submodular functions, which exhibit a natural diminishing returns condition, have recently been proposed as a broad class of non-convex functions which may be efficiently optimized. Although online methods have been introduced, they suffer from similar problems. In this work, we propose Meta-Frank-Wolfe, the first online projection-free algorithm that uses stochastic gradient estimates. The algorithm relies on a careful sampling of gradients in each round and achieves the optimal $O(\sqrt{T})$ adversarial regret bounds for convex and continuous submodular optimization. We also propose One-Shot Frank-Wolfe, a simpler algorithm which requires only a single stochastic gradient estimate in each round and achieves an $O(T^{2/3})$ stochastic regret bound for convex and continuous submodular optimization. We apply our methods to develop a novel "lifting" framework for online discrete submodular maximization and also see that they outperform current state-of-the-art techniques on an extensive set of experiments.
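To show what "projection-free" means, here is a minimal sketch of the classical offline Frank-Wolfe method on the probability simplex. It is not Meta-Frank-Wolfe or One-Shot Frank-Wolfe from the paper: it uses exact gradients and a single fixed objective, whereas those algorithms work online with stochastic gradient estimates. The function name, the toy objective and `target` are assumptions made for illustration.

```python
import numpy as np

def frank_wolfe_simplex(grad, dim, n_iters=2000):
    """Minimize a smooth convex function over the probability simplex.

    Each iteration calls a linear minimization oracle instead of a projection:
    over the simplex, the minimizer of <g, s> is the vertex with the smallest
    gradient coordinate.
    """
    x = np.full(dim, 1.0 / dim)           # start at the simplex center
    for t in range(n_iters):
        g = grad(x)                       # a stochastic variant would use an estimate
        s = np.zeros(dim)
        s[np.argmin(g)] = 1.0             # linear minimization oracle
        gamma = 2.0 / (t + 2.0)           # standard diminishing step size
        x = (1.0 - gamma) * x + gamma * s # convex combination stays feasible
    return x

# Toy objective (assumed): squared distance to a point inside the simplex.
target = np.array([0.2, 0.5, 0.3])
grad = lambda x: 2.0 * (x - target)

x_hat = frank_wolfe_simplex(grad, dim=3)
print(x_hat)
```

Because every iterate is a convex combination of simplex vertices, feasibility holds without any projection step.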