Goto

Collaborating Authors

 Mathematical & Statistical Methods


9 Off-the-beaten-path Statistical Science Topics with Interesting Applications

@machinelearnbot

You will find here nine interesting topics that you won't learn in college classes. Most have interesting applications in business and elsewhere. They are not especially difficult, and I explain them in simple English. Yet they are not part of the traditional statistical curriculum, and even many experienced data scientists with a PhD degree have not heard about some of these concepts. This is a well known model, used as a base stochastic process to model the logarithm of stock prices, yet it has interesting properties (depending on dimension) that few people know about.


Communication-Avoiding Optimization Methods for Distributed Massive-Scale Sparse Inverse Covariance Estimation

arXiv.org Machine Learning

Across a variety of scientific disciplines, sparse inverse covariance estimation is a popular tool for capturing the underlying dependency relationships in multivariate data. Unfortunately, most estimators are not scalable enough to handle the sizes of modern high-dimensional data sets (often on the order of terabytes), and assume Gaussian samples. To address these deficiencies, we introduce HP-CONCORD, a highly scalable optimization method for estimating a sparse inverse covariance matrix based on a regularized pseudolikelihood framework, without assuming Gaussianity. Our parallel proximal gradient method uses a novel communication-avoiding linear algebra algorithm and runs across a multi-node cluster with up to 1k nodes (24k cores), achieving parallel scalability on problems with up to ~819 billion parameters (1.28 million dimensions); even on a single node, HP-CONCORD demonstrates scalability, outperforming a state-of-the-art method. We also use HP-CONCORD to estimate the underlying dependency structure of the brain from fMRI data, and use the result to identify functional regions automatically. The results show good agreement with a clustering from the neuroscience literature.


Mathematics for Machine Learning: Linear Algebra Coursera

#artificialintelligence

About this course: In this course on Linear Algebra we look at what linear algebra is and how it relates to vectors and matrices. Then we look through what vectors and matrices are and how to work with them, including the knotty problem of eigenvalues and eigenvectors, and how to use these to solve problems. Finally we look at how to use these to do fun things with datasets - like how to rotate images of faces and how to extract eigenvectors to look at how the Pagerank algorithm works. Since we're aiming at data-driven applications, we'll be implementing some of these ideas in code, not just on pencil and paper. Towards the end of the course, you'll write code blocks and encounter Jupyter notebooks in Python, but don't worry, these will be quite short, focussed on the concepts, and will guide you through if you've not coded before.


Learning the Influence Structure between Partially Observed Stochastic Processes Using IoT Sensor Data

AAAI Conferences

The recent widespread of availability of sensors, as part of the IoT, presents the opportunity to learn the properties of compound distributions in practical applications. Understanding temporal distributions by observations collected from the IoT can advance many intelligent applications. In this paper we develop an algorithm to learn influence between stochastic processes using observations obtained from the IoT. The proposed method learns these processes using temporal models independently, and then attempts to recover the underlying distribution of influence between them. Experimental results are provided which demonstrate the effectiveness of our method. This approach is useful in applications that require an understanding of how partially observed high-level processes can influence each other given a set of observations at different times.


Approximating Hamiltonian dynamics with the Nystr\"om method

arXiv.org Machine Learning

Simulating the time-evolution of quantum mechanical systems is BQP-hard and expected to be one of the foremost applications of quantum computers. We consider the approximation of Hamiltonian dynamics using subsampling methods from randomized numerical linear algebra. We propose conditions for the efficient approximation of state vectors evolving under a given Hamiltonian. As an immediate application, we show that sample based quantum simulation, a type of evolution where the Hamiltonian is a density matrix, can be efficiently classically simulated under specific structural conditions. Our main technical contribution is a randomized algorithm for approximating Hermitian matrix exponentials. The proof leverages the Nystr\"om method to obtain low-rank approximations of the Hamiltonian. We envisage that techniques from randomized linear algebra will bring further insights into the power of quantum computation.


Average performance analysis of the stochastic gradient method for online PCA

arXiv.org Machine Learning

This paper studies the complexity of the stochastic gradient algorithm for PCA when the data are observed in a streaming setting. We also propose an online approach for selecting the learning rate. Simulation experiments confirm the practical relevance of the plain stochastic gradient approach and that drastic improvements can be achieved by learning the learning rate.


Stochastic Processes and New Tests of Randomness - Application to Cool Number Theory Problem

@machinelearnbot

This article is intended for practitioners who might not necessarily be statisticians or statistically-savvy. The mathematical level is kept as simple as possible, yet I present an original, simple approach to test for randomness, with an interesting application to illustrate the methodology. This material is not something usually discussed in textbooks or classrooms (even for statistical students), offering a fresh perspective, and out-of-the-box tools that are useful in many contexts, as an addition or alternative to traditional tests that are widely used. This article is written as a tutorial, but it also features an interesting research result in the last section. The example used in this tutorial shows how intuiting can be wrong, and why you need data science.


Boost your data science skills. Learn linear algebra.

@machinelearnbot

This chapter is mainly on the dot product (vector and/or matrix multiplication). We will also see some of its properties. Then, we will see how to synthesize a system of linear equations using matrix notation. This is a major process for the following chapters.


Stochastic Processes and New Tests of Randomness - Application to Cool Number Theory Problem

@machinelearnbot

This article is intended for practitioners who might not necessarily be statisticians or statistically-savvy. The mathematical level is kept as simple as possible, yet I present an original, simple approach to test for randomness, with an interesting application to illustrate the methodology. This material is not something usually discussed in textbooks or classrooms (even for statistical students), offering a fresh perspective, and out-of-the-box tools that are useful in many contexts, as an addition or alternative to traditional tests that are widely used. This article is written as a tutorial, but it also features an interesting research result in the last section. Let us assume that you are dealing with a time series with discrete time increments (for instance, daily observations) as opposed to a time-continuous process.


Linear Algebra for Deep Learning – Towards Data Science

@machinelearnbot

Linear Algebra is a continuous form of mathematics and it is applied throughout science and engineering because it allows you to model natural phenomena and to compute them efficiently. Because it is a form of continuous and not discrete mathematics, a lot of computer scientists don't have a lot of experience with it. Linear Algebra is also central to almost all areas of mathematics like geometry and functional analysis. Its concepts are a crucial prerequisite for understanding the theory behind Machine Learning, especially if you are working with Deep Learning Algorithms. You don't need to understand Linear Algebra before you get started with Machine Learning but at some point, you want to gain a better intuition for how the different machine learning algorithms really work under the hood.