Mathematical & Statistical Methods
Asymptotic normality and optimality in nonsmooth stochastic approximation
Davis, Damek, Drusvyatskiy, Dmitriy, Jiang, Liwei
Polyak and Juditsky [30] famously showed that the stochastic gradient method for minimizing smooth and strongly convex functions enjoys a central limit theorem: the error between the running average of the iterates and the minimizer, normalized by the square root of the iteration counter, converges to a normal random vector. Moreover, the covariance matrix of the limiting distribution is in a precise sense "optimal" among all estimation procedures. A long-standing open question is whether similar guarantees (asymptotic normality and optimality) exist for nonsmooth optimization and, more generally, for equilibrium problems. In this work, we obtain such guarantees under mild conditions that hold both in concrete circumstances (e.g.
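The classical smooth setting that this abstract starts from can be illustrated with a small sketch. The one-dimensional quadratic, the noise level, and the step-size exponent below are illustrative assumptions, not taken from the paper; the code only shows stochastic gradient steps combined with a running average of the iterates.

```python
import random


def averaged_sgd(num_steps=20000, seed=0):
    """SGD with a running average on the toy problem f(x) = 0.5 * (x - 3)^2."""
    rng = random.Random(seed)
    minimizer = 3.0        # assumed toy minimizer (hypothetical example)
    x = 0.0                # current iterate
    running_avg = 0.0      # average of the iterates x_1, ..., x_k
    for k in range(1, num_steps + 1):
        # Exact gradient (x - minimizer) plus unit-variance Gaussian noise.
        noisy_grad = (x - minimizer) + rng.gauss(0.0, 1.0)
        step = k ** -0.75  # slowly decaying step size (an assumption)
        x -= step * noisy_grad
        running_avg += (x - running_avg) / k  # incremental mean update
    return x, running_avg


last_iterate, average = averaged_sgd()
print(f"last-iterate error: {abs(last_iterate - 3.0):.4f}")
print(f"averaged error:     {abs(average - 3.0):.4f}")
```

In this toy problem the curvature and the noise variance are both 1, so the averaged error multiplied by the square root of the iteration count should behave roughly like a standard normal variable.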
Interpretable and Scalable Graphical Models for Complex Spatio-temporal Processes
This thesis focuses on data that has complex spatio-temporal structure and on probabilistic graphical models that learn the structure in an interpretable and scalable manner. We target two research areas of interest: Gaussian graphical models for tensor-variate data and summarization of complex time-varying texts using topic models. This work advances the state-of-the-art in several directions. First, it introduces a new class of tensor-variate Gaussian graphical models via the Sylvester tensor equation. Second, it develops an optimization technique based on a fast-converging proximal alternating linearized minimization method, which scales tensor-variate Gaussian graphical model estimations to modern big-data settings. Third, it connects Kronecker-structured (inverse) covariance models with spatio-temporal partial differential equations (PDEs) and introduces a new framework for ensemble Kalman filtering that is capable of tracking chaotic physical systems. Fourth, it proposes a modular and interpretable framework for unsupervised and weakly-supervised probabilistic topic modeling of time-varying data that combines generative statistical models with computational geometric methods. Throughout, practical applications of the methodology are considered using real datasets. This includes brain-connectivity analysis using EEG data, space weather forecasting using solar imaging data, longitudinal analysis of public opinions using Twitter data, and mining of mental health related issues using TalkLife data. We show in each case that the graphical modeling framework introduced here leads to improved interpretability, accuracy, and scalability.
Active manifolds, stratifications, and convergence to local minima in nonsmooth optimization
Davis, Damek, Drusvyatskiy, Dmitriy, Jiang, Liwei
We show that the subgradient method converges only to local minimizers when applied to generic Lipschitz continuous and subdifferentially regular functions that are definable in an o-minimal structure. At a high level, the argument we present is appealingly transparent: we interpret the nonsmooth dynamics as an approximate Riemannian gradient method on a certain distinguished submanifold that captures the nonsmooth activity of the function. In the process, we develop new regularity conditions in nonsmooth analysis that parallel the stratification conditions of Whitney, Kuo, and Verdier and extend stochastic processes techniques of Pemantle.
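As a rough illustration of the picture sketched in this abstract, consider the toy function f(x, y) = |x| + (y^2 - 1)^2 / 4. The function, the starting point, and the step sizes are assumptions chosen for illustration and do not come from the paper. Here the nonsmooth activity lives on the line x = 0, the origin is a critical point that is not a local minimizer, and the minimizers are (0, 1) and (0, -1).

```python
def subgradient_method(x=1.0, y=0.01, num_steps=5000):
    """Subgradient method on f(x, y) = |x| + (y^2 - 1)^2 / 4 (toy example)."""
    for k in range(num_steps):
        step = 0.5 / (k + 1) ** 0.5   # diminishing step size (an assumption)
        gx = (x > 0) - (x < 0)        # a subgradient of |x|; 0 is picked at x = 0
        gy = y * (y * y - 1.0)        # derivative of the smooth part in y
        x, y = x - step * gx, y - step * gy
    return x, y


x_final, y_final = subgradient_method()
print(f"x = {x_final:.4f}, y = {y_final:.4f}")
```

Started slightly off the line y = 0, the x-coordinate oscillates around zero with shrinking amplitude while the y-coordinate behaves like gradient descent along the line x = 0 and moves away from the origin toward a minimizer. This is a single deterministic run, not a demonstration of the paper's theorem.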
Springer has released 65 Machine Learning and Data books for free
Springer has released hundreds of free books to the general public. The list, which includes 408 books in total, covers a wide range of scientific and technological topics. To save you some time, I have compiled a single list of the 65 books that are relevant to the data and Machine Learning field. Among them, you will find books dealing with the mathematical side of the domain (Algebra, Statistics, and more), along with more advanced books on Deep Learning and related topics. You can also find some good books on various programming languages such as Python, R, and MATLAB.
The Importance of Mathematics for Machine Learning -- The ML Enthusiast's Blog
Mathematics plays a vital role in the field of machine learning. It provides the tools and framework for understanding and solving problems in this rapidly growing field. From linear algebra and calculus to probability and statistics, math is an essential component of machine learning. Linear algebra is used to represent and manipulate data in machine learning algorithms. It deals with linear equations and their transformations and is crucial for understanding how algorithms work and how to optimize them.
How is Linear Algebra Applied for Machine Learning?
First, let's address the building blocks of linear algebra: scalar, vector, matrix, and tensor. To implement them, we can use NumPy's np.array() in Python. Let's look at the shape of the vector, matrix, and tensor we generated above. Just as we perform operations on numbers, the same logic works for matrices and vectors. Note, however, that element-wise operations such as addition require the two matrices to be the same size.
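A minimal sketch of what such code might look like follows. The specific values are illustrative assumptions; only np.array() and the shape inspection come from the text.

```python
import numpy as np

# The four building blocks, built with np.array().
scalar = np.array(5)                              # 0-dimensional
vector = np.array([1, 2, 3])                      # 1-dimensional
matrix = np.array([[1, 2, 3], [4, 5, 6]])         # 2-dimensional
tensor = np.array([[[1, 2], [3, 4]],
                   [[5, 6], [7, 8]]])             # 3-dimensional

for name, arr in [("scalar", scalar), ("vector", vector),
                  ("matrix", matrix), ("tensor", tensor)]:
    print(f"{name}: ndim={arr.ndim}, shape={arr.shape}")

# Element-wise operations on two matrices of the same shape.
other = np.array([[10, 20, 30], [40, 50, 60]])
matrix_sum = matrix + other
matrix_product = matrix * other   # element-wise, not matrix multiplication
print(matrix_sum)
print(matrix_product)
```

The shapes printed are (), (3,), (2, 3), and (2, 2, 2) respectively.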
Practical Linear Algebra for Data Science: From Core Concepts to Applications Using Python, by Mike Cohen (ISBN 9781098120610)
The purpose of this book is to teach you modern linear algebra. But this is not about memorizing some key equations and slugging through abstract proofs; the purpose is to teach you how to think about matrices, vectors, and operations acting upon them. You will develop a geometric intuition for why linear algebra is the way it is. And you will understand how to implement linear algebra concepts in Python code, with a focus on applications in machine learning and data science. Many traditional linear algebra textbooks avoid numerical examples in the interest of generalizations, expect you to derive difficult proofs on your own, and teach myriad concepts that have little or no relevance to application or implementation in computers.
Python for Data Science: A Look at the Top Libraries
Python is a popular language for data science due to its powerful libraries and tools for data manipulation, visualization, machine learning, and statistical analysis. In this listicle, we will introduce some of the top Python libraries for data science and provide a quick and cool way to get started with them. NumPy is a library for working with large, multi-dimensional arrays and matrices of numerical data. It provides functions for performing mathematical operations on arrays, such as linear algebra, statistical analysis, and random number generation. Pandas provides functions for reading in data from various sources, cleaning and wrangling data, and performing aggregations and transformations. Matplotlib is a library for creating static, animated, and interactive visualizations in Python.
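As a quick way to get started with the NumPy features mentioned above, here is a small sketch covering random number generation, basic statistics, and linear algebra. The data and the linear system are made up for illustration.

```python
import numpy as np

# Random number generation: 1000 samples of 3 features, mean 10 and std 2.
rng = np.random.default_rng(seed=0)
data = rng.normal(loc=10.0, scale=2.0, size=(1000, 3))

# Statistical analysis: per-column mean and standard deviation.
col_means = data.mean(axis=0)
col_stds = data.std(axis=0)
print("column means:", col_means)
print("column stds: ", col_stds)

# Linear algebra: solve the system A @ x = b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)
print("solution:", x)   # satisfies 2*x0 + x1 = 3 and x0 + 3*x1 = 5
```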
Receding Horizon Control on the Broadcast of Information in Stochastic Networks
Silva, Thales C., Shen, Li, Yu, Xi, Hsieh, M. Ani
This paper focuses on the broadcast of information on robot networks with stochastic network interconnection topologies. Problematic communication networks are almost unavoidable in areas where we wish to deploy multi-robotic systems, usually due to a lack of environmental consistency, accessibility, and structure. We tackle this problem by modeling the broadcast of information in a multi-robot communication network as a stochastic process with random arrival times, which can be produced by irregular robot movements, wireless attenuation, and other environmental factors. Using this model, we provide and analyze a receding horizon control strategy to control the statistics of the information broadcast. The resulting strategy compels the robots to re-direct their communication resources to different neighbors according to the current propagation process to fulfill global broadcast requirements. Based on this method, we provide an approach to compute the expected time to broadcast the message to all nodes. Numerical examples are provided to illustrate the results.
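To give a feel for the idea of an expected broadcast time under random arrival times, here is a deliberately simplified sketch. It is not the paper's model or its receding horizon controller. It assumes a fully connected network in which every informed/uninformed pair of robots makes contact after an independent exponential waiting time with a common rate; the function names are hypothetical. Under that assumption, with k informed robots out of n, the next robot is informed at total rate rate * k * (n - k), which gives a closed-form expectation that a Monte Carlo simulation can be checked against.

```python
import random


def expected_broadcast_time(n, rate):
    """Closed-form expected time until all n nodes are informed (toy model)."""
    return sum(1.0 / (rate * k * (n - k)) for k in range(1, n))


def simulate_broadcast_time(n, rate, rng):
    """One random broadcast: sum the waiting times between successive arrivals."""
    elapsed = 0.0
    for informed in range(1, n):
        # Time until any informed node reaches any uninformed node.
        elapsed += rng.expovariate(rate * informed * (n - informed))
    return elapsed


rng = random.Random(0)
n_robots, contact_rate, trials = 5, 1.0, 20000
estimate = sum(simulate_broadcast_time(n_robots, contact_rate, rng)
               for _ in range(trials)) / trials
print(f"analytic:  {expected_broadcast_time(n_robots, contact_rate):.4f}")
print(f"simulated: {estimate:.4f}")
```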
Learning Transition Operators From Sparse Space-Time Samples
Kümmerle, Christian, Maggioni, Mauro, Tang, Sui
We consider the nonlinear inverse problem of learning a transition operator $\mathbf{A}$ from partial observations at different times, in particular from sparse observations of entries of its powers $\mathbf{A},\mathbf{A}^2,\dots,\mathbf{A}^{T}$. This Spatio-Temporal Transition Operator Recovery problem is motivated by the recent interest in learning time-varying graph signals that are driven by graph operators depending on the underlying graph topology. We address the nonlinearity of the problem by embedding it into a higher-dimensional space of suitable block-Hankel matrices, where it becomes a low-rank matrix completion problem, even if $\mathbf{A}$ is of full rank. For both a uniform and an adaptive random space-time sampling model, we quantify the recoverability of the transition operator via suitable measures of incoherence of these block-Hankel embedding matrices. For graph transition operators these measures of incoherence depend on the interplay between the dynamics and the graph topology. We develop a suitable non-convex iterative reweighted least squares (IRLS) algorithm, establish its quadratic local convergence, and show that, in optimal scenarios, no more than $\mathcal{O}(rn \log(nT))$ space-time samples are sufficient to ensure accurate recovery of a rank-$r$ operator $\mathbf{A}$ of size $n \times n$. This establishes that spatial samples can be substituted by a comparable number of space-time samples. We provide an efficient implementation of the proposed IRLS algorithm with space complexity of order $\mathcal{O}(rnT)$ and per-iteration time complexity linear in $n$. Numerical experiments for transition operators based on several graph models confirm that the theoretical findings accurately track empirical phase transitions, and illustrate the applicability and scalability of the proposed algorithm.
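The claim that a block-Hankel embedding is low-rank even when the operator has full rank can be checked numerically. The sketch below uses one simple block-Hankel arrangement of the powers, which is an assumption: the paper's exact embedding may differ, and the helper names are hypothetical. Placing $\mathbf{A}^{i+j+1}$ in block $(i, j)$ gives a matrix that factors as the stacked column $[\mathbf{A}; \mathbf{A}^2; \mathbf{A}^3]$ times the row $[\mathbf{I}, \mathbf{A}, \mathbf{A}^2]$, so its rank is at most $n$ although its size is $3n \times 3n$.

```python
import numpy as np


def block_hankel(A, num_block_rows, num_block_cols):
    """Block (i, j) holds A^(i + j + 1), constant along block anti-diagonals."""
    num_powers = num_block_rows + num_block_cols - 1
    powers = [A]                       # powers[p] == A^(p + 1)
    for _ in range(num_powers - 1):
        powers.append(powers[-1] @ A)
    rows = [[powers[i + j] for j in range(num_block_cols)]
            for i in range(num_block_rows)]
    return np.block(rows)


def numerical_rank(M, rel_tol=1e-9):
    """Count singular values above rel_tol times the largest one."""
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))


rng = np.random.default_rng(0)
n = 4
# A well-conditioned, full-rank toy operator (illustrative assumption).
A = 0.5 * np.eye(n) + 0.05 * rng.standard_normal((n, n))
H = block_hankel(A, num_block_rows=3, num_block_cols=3)  # uses A^1 .. A^5

print("H shape:", H.shape)
print("rank of A:", numerical_rank(A))
print("rank of H:", numerical_rank(H))
```

The embedded matrix has 144 entries but the same rank as the 4 x 4 operator, which is what makes a low-rank completion formulation plausible.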