Goto

Collaborating Authors

 Mathematical & Statistical Methods


SINDy-PI: A Robust Algorithm for Parallel Implicit Sparse Identification of Nonlinear Dynamics

arXiv.org Machine Learning

Accurately modeling the nonlinear dynamics of a system from measurement data is a challenging yet vital topic. The sparse identification of nonlinear dynamics (SINDy) algorithm is one approach to discover dynamical systems models from data. Although extensions have been developed to identify implicit dynamics, or dynamics described by rational functions, these extensions are extremely sensitive to noise. In this work, we develop SINDy-PI (parallel, implicit), a robust variant of the SINDy algorithm to identify implicit dynamics and rational nonlinearities. The SINDy-PI framework includes multiple optimization algorithms and a principled approach to model selection. We demonstrate the ability of this algorithm to learn implicit ordinary and partial differential equations and conservation laws from limited and noisy data. In particular, we show that the proposed approach is several orders of magnitude more noise robust than previous approaches, and may be used to identify a class of complex ODE and PDE dynamics that were previously unattainable with SINDy, including for the double pendulum dynamics and the Belousov Zhabotinsky (BZ) reaction.


Genius triumphs: Japanese mathematician's solution to number theory riddle validated

The Japan Times

KYOTO โ€“ A proof by mathematician Shinichi Mochizuki of a major conundrum in number theory that went unresolved for over 30 years has finally been validated, Kyoto University said Friday following a controversy over his method, which was often labeled too novel or complicated to understand. Accepted for publication by the university's Research Institute for Mathematical Sciences was Mochizuki's 600-page proof of the abc conjecture, which provides immediate proofs for many other famous mathematical problems, including Fermat's last theorem, which took almost 350 years to be demonstrated. The abc conjecture, proposed by European mathematicians in 1985, is an equation of three integers a, b, and c composed of different prime numbers, where a b c, and describing the relationship between the product of the prime numbers and c. "There are a number of new notions and it was hard to understand them," Masaki Kashiwara, head of the team that examined the professor's theory, said at a news conference. He proved the abc conjecture with a "totally new, innovative theory," said fellow professor Akio Tamagawa. "His achievement creates a huge impact in the field of number theory."


Kernel autocovariance operators of stationary processes: Estimation and convergence

arXiv.org Machine Learning

We consider autocovariance operators of a stationary stochastic process on a Polish space that is embedded into a reproducing kernel Hilbert space. We investigate how empirical estimates of these operators converge along realizations of the process under various conditions. In particular, we examine ergodic and strongly mixing processes and prove several asymptotic results as well as finite sample error bounds with a detailed analysis for the Gaussian kernel. We provide applications of our theory in terms of consistency results for kernel PCA with dependent data and the conditional mean embedding of transition probabilities. Finally, we use our approach to examine the nonparametric estimation of Markov transition operators and highlight how our theory can give a consistency analysis for a large family of spectral analysis methods including kernel-based dynamic mode decomposition.


On Two Distinct Sources of Nonidentifiability in Latent Position Random Graph Models

arXiv.org Machine Learning

The statistical analysis of network data is important for fields such as neuroscience (Vogelstein et al., 2012), sociology (Hoff et al., 2002), and physics (Newman and Girvan, 2004; Bickel and Chen, 2009). Recently, network data have become ubiquitous in the the modern data-science landscape, and a large literature on statistical methods for analyzing these data has developed. Popular statistical models for conditionally independent random graphs include, but are not limited to, the stochastic block model (Holland et al., 1983), the random dot product graph (Young and Scheinerman, 2007; Athreya et al., 2017), and graphons (Lovรกsz, 2012; Diaconis and Janson, 2007). Both the stochastic block model and the random dot product graph are examples of latent position random graphs (Hoff et al., 2002), a graph model that is motivated by the idea that individual nodes have latent positions whose values determine their propensity to form connections. The purpose of this manuscript is to explain a curious phenomenon that arises in latent position random graph settings.


Information-Theoretic Lower Bounds for Zero-Order Stochastic Gradient Estimation

arXiv.org Machine Learning

In this paper we analyze the necessary number of samples to estimate the gradient of any multidimensional smooth (possibly non-convex) function in a zero-order stochastic oracle model. In this model, an estimator has access to noisy values of the function, in order to produce the estimate of the gradient. We also provide an analysis on the sufficient number of samples for the finite difference method, a classical technique in numerical linear algebra. For $T$ samples and $d$ dimensions, our information-theoretic lower bound is $\Omega(\sqrt{d/T})$. We show that the finite difference method has rate $O(d^{4/3}/\sqrt{T})$ for functions with zero third and higher order derivatives. Thus, the finite difference method is not minimax optimal, and therefore there is space for the development of better gradient estimation methods.


Generalized Canonical Correlation Analysis: A Subspace Intersection Approach

arXiv.org Machine Learning

Generalized Canonical Correlation Analysis (GCCA) is an important tool that finds numerous applications in data mining, machine learning, and artificial intelligence. It aims at finding `common' random variables that are strongly correlated across multiple feature representations (views) of the same set of entities. CCA and to a lesser extent GCCA have been studied from the statistical and algorithmic points of view, but not as much from the standpoint of linear algebra. This paper offers a fresh algebraic perspective of GCCA based on a (bi-)linear generative model that naturally captures its essence. It is shown that from a linear algebra point of view, GCCA is tantamount to subspace intersection; and conditions under which the common subspace of the different views is identifiable are provided. A novel GCCA algorithm is proposed based on subspace intersection, which scales up to handle large GCCA tasks. Synthetic as well as real data experiments are provided to showcase the effectiveness of the proposed approach.


Efficient Tensor Kernel methods for sparse regression

arXiv.org Machine Learning

Recently, classical kernel methods have been extended by the introduction of suitable tensor kernels so to promote sparsity in the solution of the underlying regression problem. Indeed, they solve an lp-norm regularization problem, with p=m/(m-1) and m even integer, which happens to be close to a lasso problem. However, a major drawback of the method is that storing tensors requires a considerable amount of memory, ultimately limiting its applicability. In this work we address this problem by proposing two advances. First, we directly reduce the memory requirement, by intriducing a new and more efficient layout for storing the data. Second, we use a Nystrom-type subsampling approach, which allows for a training phase with a smaller number of data points, so to reduce the computational cost. Experiments, both on synthetic and read datasets, show the effectiveness of the proposed improvements. Finally, we take case of implementing the cose in C++ so to further speed-up the computation.


Spectral Clustering Revisited: Information Hidden in the Fiedler Vector

arXiv.org Machine Learning

We are interested in the clustering problem on graphs: it is known that if there are two underlying clusters, then the signs of the eigenvector corresponding to the second largest eigenvalue of the adjacency matrix can reliably reconstruct the two clusters. We argue that the vertices for which the eigenvector has the largest and the smallest entries, respectively, are unusually strongly connected to their own cluster and more reliably classified than the rest. This can be regarded as a discrete version of the Hot Spots conjecture and should be useful in applications. We give a rigorous proof for the stochastic block model and several examples.


5 Most Important Skills of a Data Scientist

#artificialintelligence

Data scientist was coined the sexiest job of the 21st century and with good reason. In Linkedin 2020 Emerging Jobs Reports, Artificial intelligence was named the'Jobs of Tomorrow' due to its strong presence. Furthermore, the potential application of data science in multiple industries has attracted people from all backgrounds into this field. Here I present the top 5 most important skills of a data scientist that is essential for their work in data science. Probability and Statistics are two mathematics concepts that are closely related.


Detection and skeletonization of single neurons and tracer injections using topological methods

arXiv.org Machine Learning

Neuroscientific data analysis has traditionally relied on linear algebra and stochastic process theory. However, the tree-like shapes of neurons cannot be described easily as points in a vector space (the subtraction of two neuronal shapes is not a meaningful operation), and methods from computational topology are better suited to their analysis. Here we introduce methods from Discrete Morse (DM) Theory to extract the tree-skeletons of individual neurons from volumetric brain image data, and to summarize collections of neurons labelled by tracer injections. Since individual neurons are topologically trees, it is sensible to summarize the collection of neurons using a consensus tree-shape that provides a richer information summary than the traditional regional 'connectivity matrix' approach. The conceptually elegant DM approach lacks hand-tuned parameters and captures global properties of the data as opposed to previous approaches which are inherently local. For individual skeletonization of sparsely labelled neurons we obtain substantial performance gains over state-of-the-art non-topological methods (over 10% improvements in precision and faster proofreading). The consensus-tree summary of tracer injections incorporates the regional connectivity matrix information, but in addition captures the collective collateral branching patterns of the set of neurons connected to the injection site, and provides a bridge between single-neuron morphology and tracer-injection data.