 Mathematical & Statistical Methods


Stochastic modified equations for the asynchronous stochastic gradient descent

arXiv.org Machine Learning

We propose stochastic modified equations (SME) for modeling asynchronous stochastic gradient descent (ASGD) algorithms. The resulting SME of Langevin type extracts more information about the ASGD dynamics and elucidates the relationship between different types of stochastic gradient algorithms. We show the convergence of ASGD to the SME in the continuous time limit, as well as the SME's precise prediction of the trajectories of ASGD with various forcing terms. As an application of the SME, we propose an optimal mini-batching strategy for ASGD by solving the optimal control problem of the associated SME.


Learning with Non-Convex Truncated Losses by SGD

arXiv.org Machine Learning

Learning with a convex loss function has been a dominating paradigm for many years. It remains an interesting question how non-convex loss functions help improve the generalization of learning with broad applicability. In this paper, we study a family of objective functions formed by truncating traditional loss functions, which is applicable to both shallow learning and deep learning. Truncating loss functions has the potential to be less vulnerable and more robust to large noise in observations that could be adversarial. More importantly, it is a generic technique that does not assume knowledge of the noise distribution. To justify non-convex learning with truncated losses, we establish excess risk bounds of empirical risk minimization based on truncated losses for heavy-tailed output, and the statistical error of an approximate stationary point found by the stochastic gradient descent (SGD) method. Our experiments for shallow and deep learning for regression with outliers, corrupted data and heavy-tailed noise further justify the proposed method.
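As a rough illustration of why truncation limits the influence of outliers, the sketch below caps a squared loss at a threshold. The hard cap `min(loss, tau)`, the threshold value, and the toy data are assumptions made for this example; the family of truncation functions studied in the paper may differ.

```python
def squared_loss(prediction, target):
    """Ordinary (convex) squared loss."""
    return (prediction - target) ** 2


def truncated_loss(prediction, target, tau=4.0):
    """Squared loss capped at tau (hypothetical hard truncation).

    Once the residual is large enough, the loss stops growing, so a single
    corrupted observation cannot dominate the average. The cap also makes
    the loss non-convex in the prediction.
    """
    return min(squared_loss(prediction, target), tau)


def mean(values):
    return sum(values) / len(values)


# Toy data: the last target is a gross outlier.
targets = [1.0, 2.0, 3.0, 100.0]
predictions = [1.1, 1.9, 3.2, 3.0]

plain_risk = mean([squared_loss(p, t) for p, t in zip(predictions, targets)])
capped_risk = mean([truncated_loss(p, t) for p, t in zip(predictions, targets)])

print("plain empirical risk: ", plain_risk)
print("capped empirical risk:", capped_risk)
```

With the plain loss the outlier contributes almost all of the empirical risk, while with the capped loss its contribution is bounded by `tau`.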


Impact of Random Number Generation on Parallel Genetic Algorithms

AAAI Conferences

In this paper, we present a parallel genetic algorithm (pGA) with adaptive control parameters and permutation representation for weighted tardiness scheduling with sequence-dependent setups, an NP-Hard problem. This pGA provides a linear to slightly superlinear speedup relative to its sequential counterpart. As part of our research, we explore the effects of different random number generation algorithms on the runtimes of both sequential and parallel GAs. GAs and other forms of evolutionary computation rely heavily on random number generation. Our results show that we can obtain a 20% increase in the speed of a pGA, and an over 25% increase in the speed of a sequential GA, simply by careful choice of random number generator: both the underlying generator and the algorithms for specific number types, such as the Gaussian values often needed for mutating real-valued genes.
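To make the point about Gaussian generation concrete, here is a minimal sketch of Gaussian mutation in which the sampling algorithm is a swappable parameter. The Box-Muller transform and Python's `random.gauss` are used only as two familiar alternatives; they are not necessarily the generators compared in the paper, and the function names are hypothetical.

```python
import math
import random


def box_muller(rng):
    """One standard normal sample via the Box-Muller transform."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], so log(u1) is always defined
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def mutate(genes, sigma, gaussian):
    """Add zero-mean Gaussian noise with scale sigma to every real-valued gene.

    `gaussian` is any zero-argument callable returning a standard normal
    sample, so the sampling algorithm can be changed without touching the GA.
    """
    return [g + sigma * gaussian() for g in genes]


rng = random.Random(42)
genes = [0.5, 1.5, -2.0]

child_a = mutate(genes, 0.1, lambda: box_muller(rng))
child_b = mutate(genes, 0.1, lambda: rng.gauss(0.0, 1.0))

print(child_a)
print(child_b)
```

Keeping the sampler behind a single callable makes it straightforward to time alternative generators with the standard `timeit` module on the same mutation code.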


Linear Algebra for Deep Learning – Towards Data Science

#artificialintelligence

Linear algebra, probability and calculus are the 'languages' in which machine learning is formulated. Learning these topics will contribute to a deeper understanding of the underlying algorithmic mechanics and allow the development of new algorithms. At the lowest level, everything behind deep learning is math. So it is essential to understand basic linear algebra before getting started with deep learning and programming it. The core data structures behind deep learning are scalars, vectors, matrices and tensors.
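The four data structures named above differ only in how many axes they have. A minimal sketch using plain nested Python lists follows; the `shape` helper is a hypothetical function written for this example, and in practice an array library would normally be used instead.

```python
def shape(x):
    """Return the shape of a regularly nested list as a tuple.

    A bare number has shape (), a flat list has one axis, a list of
    equal-length lists has two axes, and so on.
    """
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        if not x:  # an empty list has no first element to descend into
            break
        x = x[0]
    return tuple(dims)


scalar = 3.5                                   # 0 axes
vector = [1, 2, 3]                             # 1 axis
matrix = [[1, 2, 3], [4, 5, 6]]                # 2 axes (rows, columns)
tensor = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]  # 3 axes

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("tensor", tensor)]:
    print(name, "has", len(shape(value)), "axes and shape", shape(value))
```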


Matrix Algebra - Linear Algebra for Deep Learning (Part 2)

#artificialintelligence

Last week I posted an article, which formed the first part in a series on Linear Algebra For Deep Learning. The response to the article was extremely positive, in terms of feedback and article views as well as more broadly on social media. Many of you commented that there was "an appetite" for introductory mathematical content, and this only confirms the results of the QuantStart 2017 Content Survey. Hence I've decided to write more introductory articles, not only continuing with Linear Algebra, but also on the topics of Calculus and Probability, which are fundamental topics for machine learning and for quantitative finance more broadly. In the previous article we introduced the three basic entities that will be used in linear algebra, namely the scalar, the vector and the matrix.


Introduction to Graph Theory Coursera

@machinelearnbot

About this course: We invite you to a fascinating journey into Graph Theory, an area that connects the elegance of painting and the rigor of mathematics and is simple, but not unsophisticated. Graph Theory gives us both an easy way to pictorially represent many major mathematical results and insights into the deep theories behind them. In this course, among other intriguing applications, we will see how GPS systems find shortest routes, how engineers design integrated circuits, how biologists assemble genomes, and why a political map can always be colored using a few colors. We will study Ramsey Theory, which proves that in a large system, complete disorder is impossible! By the end of the course, we will implement an algorithm which finds an optimal assignment of students to schools.


Top KDnuggets tweets, May 02-08: Boost your data science skills. Learn linear algebra.

@machinelearnbot

The most popular @KDnuggets tweet for May 02-08 (most retweeted, most favorited, most viewed and most clicked) was "Boost your data science skills. Learn linear algebra." Also among the top 10 most engaging tweets: "Deep Conversations: Mathematician Lisha Li on how she thrives as a VC at Amplify Partners to identify, invest and nurture the right #startups in #MachineLearning and #Distributed Systems" https://t.co/9h9VeNfgV0


Boost your data science skills. Learn linear algebra.

@machinelearnbot

Graphical representation is also very helpful for understanding linear algebra. I tried to tie the concepts to plots (and the code to produce them). The representation I liked most while writing this series is that any matrix can be seen as a linear transformation of the space. In several chapters we will extend this idea and see how it can be useful for understanding eigendecomposition, Singular Value Decomposition (SVD) or Principal Components Analysis (PCA). In addition, I noticed that creating and reading examples is really helpful for understanding the theory. That is why I built Python notebooks.
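The idea that a matrix is a linear transformation of the space can be sketched without any plotting by tracking where the corners of the unit square land. The 2x2 shear matrix and the `apply` helper below are illustrative choices for this example, not code from the series' notebooks.

```python
def apply(matrix, point):
    """Multiply a 2x2 matrix (two rows of two numbers) by a 2D point."""
    (a, b), (c, d) = matrix
    x, y = point
    return (a * x + b * y, c * x + d * y)


# Corners of the unit square, listed counter-clockwise.
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]

# A horizontal shear: each point is pushed right in proportion to its height.
shear = [[1, 1],
         [0, 1]]

sheared = [apply(shear, p) for p in unit_square]
print(sheared)

# Linearity: transforming a sum of points equals the sum of the transforms.
p, q = (2, 3), (-1, 4)
lhs = apply(shear, (p[0] + q[0], p[1] + q[1]))
rhs = tuple(u + v for u, v in zip(apply(shear, p), apply(shear, q)))
print(lhs == rhs)
```

The square becomes a parallelogram: the bottom edge stays in place while the top edge slides one unit to the right.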


Number Theory and Cryptography Coursera

@machinelearnbot

About this course: We all learn numbers from childhood. Some of us like to count, others hate it, but everyone uses numbers every day to buy things, pay for services, and estimate time and necessary resources. People have been wondering about the properties of numbers for thousands of years. And for thousands of years it was more or less just a game that was only interesting for pure mathematicians. The famous 20th-century mathematician G.H. Hardy once said "The Theory of Numbers has always been regarded as one of the most obviously useless branches of Pure Mathematics".


A Simple Introduction to Complex Stochastic Processes

@machinelearnbot

Stochastic processes have many applications, including in finance and physics. They are an interesting model for representing many phenomena. Unfortunately the theory behind them is very difficult, making it accessible only to a few 'elite' data scientists, and not popular in business contexts. One of the simplest examples is the random walk, which is indeed easy to understand with no mathematical background. However, time-continuous stochastic processes are always defined and studied using advanced and abstract mathematical tools such as measure theory, martingales, and filtrations.
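The random walk mentioned above fits in a few lines. The sketch below assumes the simplest variant, a symmetric walk on the integers that moves up or down by one with equal probability at each step; the function name and the seed are choices made for this example.

```python
import random


def random_walk(n_steps, seed=None):
    """Simulate a simple symmetric random walk starting at 0.

    At each step the position moves by +1 or -1 with equal probability.
    Returns the full path, including the starting point.
    """
    rng = random.Random(seed)
    position = 0
    path = [position]
    for _ in range(n_steps):
        position += rng.choice((-1, 1))
        path.append(position)
    return path


path = random_walk(20, seed=1)
print(path)
```

Passing a fixed `seed` makes the path reproducible, which helps when comparing several simulated trajectories.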