Mathematical & Statistical Methods
Applied Linear Algebra
This textbook develops the essential tools of linear algebra, with the goal of imparting technique alongside contextual understanding. Applications go hand-in-hand with theory, each reinforcing and explaining the other. This approach encourages students to develop not only the technical proficiency needed to go on to further study, but an appreciation for when, why, and how the tools of linear algebra can be used across modern applied mathematics. No previous knowledge of linear algebra is needed to approach this text, with single-variable calculus as the only formal prerequisite. However, the reader will need to draw upon some mathematical maturity to engage in the increasing abstraction inherent to the subject.
Linear Algebra. Points matching with SVD in 3D space
We need to find the best rotation and translation parameters between two sets of points in 3D space. This type of transformation is called Euclidean because it preserves sizes. There is almost always more than one way to solve a problem, but here we will use SVD. Why? Because a matrix is basically a transformation, and with SVD it can be decomposed into Rotation (U), Scaling (Σ), Rotation (V) (in the special case when A is an m×m matrix), which gives us a clue. To get the rotation, we first need to find its center.
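The steps hinted at above (find the centers, then use SVD to recover the rotation) can be sketched as follows. This is a minimal illustration and not necessarily the post's own code: it assumes the two point sets are already in one-to-one correspondence, uses the well-known Kabsch procedure, and the name `rigid_transform_3d` is hypothetical.

```python
import numpy as np

def rigid_transform_3d(P, Q):
    """Estimate R (3x3) and t (3,) so that Q is approximately P @ R.T + t.

    P and Q are (N, 3) arrays of corresponding points (an assumption of this sketch).
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)

    # Centers of both point clouds: the rotation is estimated around them.
    center_p = P.mean(axis=0)
    center_q = Q.mean(axis=0)

    # 3x3 cross-covariance of the centered point sets.
    H = (P - center_p).T @ (Q - center_q)

    # SVD of the cross-covariance: H = U @ diag(S) @ Vt.
    U, S, Vt = np.linalg.svd(H)

    # Guard against a reflection (determinant -1) by flipping the last axis.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # The translation maps the rotated center of P onto the center of Q.
    t = center_q - R @ center_p
    return R, t
```

With noise-free data generated from a known rotation and translation, the function should recover both up to floating-point error. With noisy data it returns a least-squares fit.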
Discrete Mathematics
Discrete mathematics is quickly becoming one of the most important areas of mathematical research, with applications to cryptography, linear programming, coding theory and the theory of computing. This book is aimed at undergraduate mathematics and computer science students interested in developing a feeling for what mathematics is all about, where mathematics can be helpful, and what kinds of questions mathematicians work on. The authors discuss a number of selected results and methods of discrete mathematics, mostly from the areas of combinatorics and graph theory, with a little number theory, probability, and combinatorial geometry. Wherever possible, the authors use proofs and problem solving to help students understand the solutions to problems. In addition, there are numerous examples, figures and exercises spread throughout the book.
Linear Algebra: Vector addition and scalar-vector multiplication
Highlights: In this post we are going to talk about vectors. They are the fundamental building blocks of Linear Algebra. We will give an intuitive definition of what vectors are, where we use them, and how we add them and multiply them by scalars. We provide code examples to demonstrate how to work with vectors in Python. So, what exactly is a vector?
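As a quick illustration of the two operations named above, here is a minimal sketch in plain Python. It is not taken from the post: the helper names `add` and `scale` are hypothetical, and vectors are represented as ordinary lists.

```python
def add(u, v):
    """Add two vectors component by component."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return [a + b for a, b in zip(u, v)]

def scale(c, v):
    """Multiply every component of vector v by the scalar c."""
    return [c * a for a in v]

u = [1, 2, 3]
v = [4, 5, 6]

total = add(u, v)       # [5, 7, 9]
doubled = scale(2, u)   # [2, 4, 6]
```

Both operations act on each component independently, which is why the two vectors being added must have the same number of components.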
Cross-Validation for Correlated Data
Rabinowicz, Assaf, Rosset, Saharon
Datasets with correlation structures are common in modern statistical applications in various fields, such as geostatistics (Goovaerts 1999), genetics (Maddison 1990) and ecology (Roberts et al. 2017). Different modeling methods address the correlation structure differently. Some modeling methods, such as Gaussian process regression (Rasmussen and Williams 2006, GPR) and generalized least squares (Hansen 2007, GLS), utilize explicitly the correlation structure for achieving better prediction accuracy. Other predictive models, like random forest (Breiman 2001, RF), gradient boosting machines (Friedman 2002, GBM) and other machine learning models, do not consider explicitly the correlation structure but are still potentially able to utilize the correlation implicitly. The analysis in this paper mainly focuses on correlation that appears due to latent objects, such as random effects and random fields as appear in generalized linear mixed models (Verbeke 1997, GLMM) and generalized Gaussian process regression (Rasmussen and Williams 2006, GGPR) in clustered, temporal and spatial datasets.
Stochastic modeling of non-linear adsorption with Gaussian kernel density estimators
Rahbaralam, Maryam, Abdollahi, Amir, Fernàndez-Garcia, Daniel, Sanchez-Vila, Xavier
Adsorption is a relevant process in many fields, such as product manufacturing or pollution remediation in porous materials. Adsorption takes place at the molecular scale, amenable to be modeled by Lagrangian numerical methods. We have proposed a chemical diffusion-reaction model for the simulation of adsorption, based on the combination of a random walk particle tracking method involving the use of Gaussian Kernel Density Estimators. The main feature of the proposed model is that it can effectively reproduce the nonlinear behavior characteristic of the Langmuir and Freundlich isotherms. In the former, it is enough to add a finite number of sorption sites of homogeneous sorption properties, and to set the process as the combination of the forward and the backward reactions, each one of them with a prespecified reaction rate. To model the Freundlich isotherm instead, typical of low to intermediate range of solute concentrations, there is a need to assign a different equilibrium constant to each specific sorption site, provided they are all drawn from a truncated power-law distribution. Both nonlinear models can be combined in a single framework to obtain a typical observed behavior for a wide range of concentration values.
#000 Linear Algebra for Machine Learning Master Data Science 14.03.2020
Highlights: Linear algebra is a branch of mathematics concerned with linear equations, linear functions, and their representations through matrices and vector spaces. Basically, it is the mathematics that underpins many Data Science algorithms and applications. A grasp of linear algebra fundamentals is an essential prerequisite for fully understanding machine learning. In this post, you will discover what exactly linear algebra is and what its main applications in machine learning are. Linear algebra is a foundation of machine learning. Before you start studying machine learning, you need a solid knowledge and understanding of this field. If you are a fan and a practitioner of machine learning, this post will help you see where linear algebra is applied, and you can benefit from these insights. In machine learning, data is most often represented as vectors, matrices or tensors. Therefore, machine learning relies heavily on linear algebra.
At the Interface of Algebra and Statistics
This thesis takes inspiration from quantum physics to investigate mathematical structure that lies at the interface of algebra and statistics. The starting point is a passage from classical probability theory to quantum probability theory. The quantum version of a probability distribution is a density operator, the quantum version of marginalizing is an operation called the partial trace, and the quantum version of a marginal probability distribution is a reduced density operator. Every joint probability distribution on a finite set can be modeled as a rank one density operator. By applying the partial trace, we obtain reduced density operators whose diagonals recover classical marginal probabilities. In general, these reduced densities will have rank higher than one, and their eigenvalues and eigenvectors will contain extra information that encodes subsystem interactions governed by statistics. We decode this information, and show it is akin to conditional probability, and then investigate the extent to which the eigenvectors capture "concepts" inherent in the original joint distribution. The theory is then illustrated with an experiment that exploits these ideas. Turning to a more theoretical application, we also discuss a preliminary framework for modeling entailment and concept hierarchy in natural language, namely, by representing expressions in the language as densities. Finally, initial inspiration for this thesis comes from formal concept analysis, which finds many striking parallels with the linear algebra. The parallels are not coincidental, and a common blueprint is found in category theory. We close with an exposition on free (co)completions and how the free-forgetful adjunctions in which they arise strongly suggest that in certain categorical contexts, the "fixed points" of a morphism with its adjoint encode interesting information.
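The passage from a joint distribution to reduced density operators can be checked numerically on a toy example. The sketch below is illustrative only. It assumes the rank-one density operator is built from the unit vector whose entries are the square roots of the joint probabilities (the abstract does not spell out the construction), and the 2×3 distribution `p` is made up for the demonstration.

```python
import numpy as np

# Hypothetical joint distribution p(x, y) on a 2 x 3 set; entries sum to 1.
p = np.array([[0.1, 0.2, 0.0],
              [0.3, 0.0, 0.4]])
nx, ny = p.shape

# Assumed construction: unit vector with amplitudes sqrt(p(x, y)).
psi = np.sqrt(p).reshape(-1)

# Rank-one density operator (orthogonal projection onto psi), trace 1.
rho = np.outer(psi, psi)

# Partial trace: view rho with indices (x, y, x', y') and sum over
# the subsystem being traced out.
rho4 = rho.reshape(nx, ny, nx, ny)
rho_X = np.einsum('iaja->ij', rho4)   # trace out Y
rho_Y = np.einsum('aiaj->ij', rho4)   # trace out X

# Diagonals of the reduced densities versus the classical marginals.
marginal_X = p.sum(axis=1)
marginal_Y = p.sum(axis=0)
print(np.diag(rho_X), marginal_X)
print(np.diag(rho_Y), marginal_Y)

# The reduced density generally has rank higher than one.
print(np.linalg.matrix_rank(rho), np.linalg.matrix_rank(rho_X))
```

The off-diagonal entries of `rho_X` and `rho_Y`, and hence their eigenvectors, are where the extra information about subsystem interactions lives. The diagonals alone only reproduce the marginals.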
Heuristics for Link Prediction in Multiplex Networks
Tillman, Robert E., Potluru, Vamsi K., Chen, Jiahao, Reddy, Prashant, Veloso, Manuela
Link prediction, or the inference of future or missing connections between entities, is a well-studied problem in network analysis. A multitude of heuristics exist for link prediction in ordinary networks with a single type of connection. However, link prediction in multiplex networks, or networks with multiple types of connections, is not a well understood problem. We propose a novel general framework and three families of heuristics for multiplex network link prediction that are simple, interpretable, and take advantage of the rich connection type correlation structure that exists in many real world networks. We further derive a theoretical threshold for determining when to use a different connection type based on the number of links that overlap with an Erdos-Renyi random graph. Through experiments with simulated and real world scientific collaboration, transportation and global trade networks, we demonstrate that the proposed heuristics show increased performance with the richness of connection type correlation structure and significantly outperform their baseline heuristics for ordinary networks with a single connection type.
An Improved Cutting Plane Method for Convex Optimization, Convex-Concave Games and its Applications
Jiang, Haotian, Lee, Yin Tat, Song, Zhao, Wong, Sam Chiu-wai
Given a separation oracle for a convex set $K \subset \mathbb{R}^n$ that is contained in a box of radius $R$, the goal is to either compute a point in $K$ or prove that $K$ does not contain a ball of radius $\epsilon$. We propose a new cutting plane algorithm that uses an optimal $O(n \log (\kappa))$ evaluations of the oracle and an additional $O(n^2)$ time per evaluation, where $\kappa = nR/\epsilon$. $\bullet$ This improves upon Vaidya's $O( \text{SO} \cdot n \log (\kappa) + n^{\omega+1} \log (\kappa))$ time algorithm [Vaidya, FOCS 1989a] in terms of polynomial dependence on $n$, where $\omega < 2.373$ is the exponent of matrix multiplication and $\text{SO}$ is the time for oracle evaluation. $\bullet$ This improves upon Lee-Sidford-Wong's $O( \text{SO} \cdot n \log (\kappa) + n^3 \log^{O(1)} (\kappa))$ time algorithm [Lee, Sidford and Wong, FOCS 2015] in terms of dependence on $\kappa$. For many important applications in economics, $\kappa = \Omega(\exp(n))$ and this leads to a significant difference between $\log(\kappa)$ and $\mathrm{poly}(\log (\kappa))$. We also provide evidence that the $n^2$ time per evaluation cannot be improved and thus our running time is optimal. A bottleneck of previous cutting plane methods is to compute leverage scores, a measure of the relative importance of past constraints. Our result is achieved by a novel multi-layered data structure for leverage score maintenance, which is a sophisticated combination of diverse techniques such as random projection, batched low-rank update, inverse maintenance, polynomial interpolation, and fast rectangular matrix multiplication. Interestingly, our method requires a combination of different fast rectangular matrix multiplication algorithms.