 Mathematical & Statistical Methods


Beautiful Number Theory Problem and Sandbox for Data Scientists

@machinelearnbot

The Waring conjecture, which is really a problem associated with a number of conjectures, many of them now solved, is one of the most fascinating mathematical problems. This article covers new aspects of the problem: a generalization, new conjectures (some with a tentative solution), and a new framework for tackling it. It is nevertheless written in simple English and accessible to the layman. I also review a number of famous related mathematical conjectures, including one with a $1 million award still waiting for a solution, as well as Goldbach's conjecture, which remains unproved as of today. Many curious properties of the floor function are also listed. The emphasis is on machine learning and efficient, computer-intensive algorithms for finding surprising results, which then need to be formally proved or disproved.
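
As a small illustration of the computer-driven exploration described above, the sketch below counts the fewest positive k-th powers needed to write each integer up to a limit. It is not taken from the article: the function name `min_powers` and the dynamic-programming approach are assumptions chosen for brevity, and the search only produces empirical evidence, not a proof.

```python
def min_powers(limit, k):
    """Return counts where counts[n] is the fewest positive k-th powers summing to n.

    Hypothetical helper for illustration; covers 0 <= n <= limit, with limit >= 1.
    """
    # All k-th powers b**k that do not exceed the limit (always includes 1).
    powers = []
    base = 1
    while base ** k <= limit:
        powers.append(base ** k)
        base += 1

    # Classic "fewest coins" dynamic program with the k-th powers as coins.
    counts = [0] + [limit + 1] * limit
    for n in range(1, limit + 1):
        counts[n] = 1 + min(counts[n - p] for p in powers if p <= n)
    return counts


if __name__ == "__main__":
    squares = min_powers(100, 2)
    cubes = min_powers(300, 3)
    print("most squares needed up to 100:", max(squares))
    print("most cubes needed up to 300:", max(cubes))
    print("integers needing that many cubes:",
          [n for n, c in enumerate(cubes) if c == max(cubes)])
```

Raising `limit` or changing `k` gives a quick way to look for patterns that could then be stated as conjectures and examined formally.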


Number Theory: Nice Generalization of the Waring Conjecture

@machinelearnbot


Essence of linear algebra preview

#artificialintelligence

This introduces the "Essence of linear algebra" series, aimed at animating the geometric intuitions underlying many of the topics taught in a standard linear algebra course. Error corrections: (1) At one point I mistakenly allude to calculators using the Taylor expansion of sine for their computations, when in reality most use CORDIC (or something like it). (2) Around 30 seconds in, there is a typo in how the determinant is written; it should be ad - bc. 3blue1brown is a channel about animating math, in all senses of the word animate. If you are new to this channel and want to see more, a good place to start is this playlist: http://3b1b.co/recommended


Fascinating Chaotic Sequences with Cool Applications

@machinelearnbot

Here we describe well-known chaotic sequences, including new generalizations, with applications to random number generation, highly non-linear auto-regressive models for time series, simulation, and random permutations. We also cover the use of big numbers (libraries available in programming languages to work with numbers with hundreds of decimals), because standard computer precision almost always produces completely erroneous results after a few iterations. This fact is rarely if ever mentioned in the scientific literature, but it is illustrated here, together with a solution. It is possible that all scientists who published on chaotic processes used faulty numbers because of this issue. This article is accessible to non-experts, even though we solve a special stochastic equation for the first time, providing an unexpected exact solution for a new chaotic process that generalizes the logistic map. We also describe a general framework for continuous random number generators, and investigate the interesting auto-correlation structure associated with some of these sequences. References are provided, as well as fast source code to process big numbers accurately, and even an elegant mathematical proof in the last section.
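
The precision issue can be sketched with the classic logistic map x(n+1) = 4 x(n) (1 - x(n)). The code below is an illustrative assumption, not the article's own source code: it uses Python's standard `decimal` module as the big-number library, and the starting value 0.3, the 200-digit precision, and the function names are arbitrary choices.

```python
from decimal import Decimal, localcontext


def logistic_float(x0, steps):
    """Iterate x -> 4x(1 - x) in ordinary double precision."""
    x = float(x0)
    orbit = [x]
    for _ in range(steps):
        x = 4.0 * x * (1.0 - x)
        orbit.append(x)
    return orbit


def logistic_decimal(x0, steps, digits):
    """Iterate the same map with `digits` significant decimal digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        x = Decimal(x0)  # x0 is passed as a string so it is stored exactly
        four, one = Decimal(4), Decimal(1)
        orbit = [x]
        for _ in range(steps):
            x = four * x * (one - x)
            orbit.append(x)
    return orbit


if __name__ == "__main__":
    low = logistic_float("0.3", 100)
    high = logistic_decimal("0.3", 100, 200)
    # The gap between the two orbits grows until the double-precision
    # values no longer track the high-precision ones.
    for n in (10, 30, 50, 70, 100):
        print(n, abs(low[n] - float(high[n])))
```

Because the map can amplify a small error by up to a factor of 4 per step, the number of digits carried limits how many iterations can be trusted. Rerunning with more digits and checking that the results agree is one way to validate a run.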


Book Reviews

AI Magazine

In his new book Alternate Realities: Mathematical Models of Nature and Man (New York: John Wiley and Sons, 1989, 493 pages, $34.95), John L. Casti gives us an impressive, up-to-date look at several areas of mathematics that are being applied to the study of biological and sociological systems. These areas, including cellular automata theory, catastrophe theory, nonlinear dynamics and chaos, game theory, and control theory, are finding use on the frontiers of scientific research. Although these areas and their applications are described in various other sources, at the level of both the scientist and the layperson, I know of no other book that brings them all together to show how they can be used in scientific research. However, this book suffers from being written for mathematical specialists, which limits its potential readership. An opportunity to educate more scientists in the use of mathematical models is regrettably missed.


A Simple Introduction to Complex Stochastic Processes

@machinelearnbot

Stochastic processes have many applications, including in finance and physics. They are interesting models for representing many phenomena. Unfortunately, the theory behind them is very difficult, making them accessible only to a few 'elite' data scientists and unpopular in business contexts. One of the simplest examples is a random walk, which is indeed easy to understand with no mathematical background. However, time-continuous stochastic processes are always defined and studied using advanced and abstract mathematical tools such as measure theory, martingales, and filtrations.
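
The random walk mentioned above can be sketched in a few lines. This is a minimal illustration assuming the simplest case, a symmetric walk on the integers with steps of +1 or -1. The function name `random_walk` and the `seed` parameter are hypothetical, not taken from the article.

```python
import random


def random_walk(steps, seed=None):
    """Return the positions of a symmetric +/-1 random walk starting at 0."""
    rng = random.Random(seed)  # private generator so runs can be reproduced
    position = 0
    path = [position]
    for _ in range(steps):
        position += rng.choice((-1, 1))  # each direction has probability 1/2
        path.append(position)
    return path


if __name__ == "__main__":
    path = random_walk(1000, seed=42)
    print("final position:", path[-1])
    print("farthest excursion from the origin:", max(abs(p) for p in path))
```

Running the walk many times with different seeds and plotting the final positions gives an informal feel for how the spread grows with the number of steps.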


Scalable Log Determinants for Gaussian Process Kernel Learning

Neural Information Processing Systems

For applications as varied as Bayesian neural networks, determinantal point processes, elliptical graphical models, and kernel learning for Gaussian processes (GPs), one must compute a log determinant of an n by n positive definite matrix, and its derivatives, leading to prohibitive O(n^3) computations. We propose novel O(n) approaches to estimating these quantities from only fast matrix vector multiplications (MVMs). These stochastic approximations are based on Chebyshev, Lanczos, and surrogate models, and converge quickly even for kernel matrices that have challenging spectra. We leverage these approximations to develop a scalable Gaussian process approach to kernel learning. We find that Lanczos is generally superior to Chebyshev for kernel learning, and that a surrogate approach can be highly efficient and accurate with popular kernels.
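
For context, the exact O(n^3) baseline that the abstract calls prohibitive can be sketched as follows. This is not the paper's method: it is the standard Cholesky route, where A = L L^T gives log det A = 2 * sum(log L_ii), written in plain Python with a hypothetical function name.

```python
import math


def cholesky_logdet(a):
    """Exact log determinant of a symmetric positive definite matrix.

    `a` is a list of rows. The triple loop makes the cost O(n^3), which is
    the scaling that MVM-based estimators aim to avoid.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]  # lower-triangular Cholesky factor
    for i in range(n):
        for j in range(i + 1):
            partial = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                pivot = a[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(pivot)
            else:
                low[i][j] = (a[i][j] - partial) / low[j][j]
    # det(A) = prod(L_ii)^2, so log det(A) = 2 * sum(log L_ii).
    return 2.0 * sum(math.log(low[i][i]) for i in range(n))


if __name__ == "__main__":
    matrix = [[4.0, 2.0], [2.0, 3.0]]  # determinant is 4*3 - 2*2 = 8
    print(cholesky_logdet(matrix), math.log(8.0))
```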


Fast Black-box Variational Inference through Stochastic Trust-Region Optimization

Neural Information Processing Systems

We introduce TrustVI, a fast second-order algorithm for black-box variational inference based on trust-region optimization and the reparameterization trick. At each iteration, TrustVI proposes and assesses a step based on minibatches of draws from the variational distribution. The algorithm provably converges to a stationary point. We implemented TrustVI in the Stan framework and compared it to two alternatives: Automatic Differentiation Variational Inference (ADVI) and Hessian-free Stochastic Gradient Variational Inference (HFSGVI). The former is based on stochastic first-order optimization. The latter uses second-order information, but lacks convergence guarantees. TrustVI typically converged at least one order of magnitude faster than ADVI, demonstrating the value of stochastic second-order information. TrustVI often found substantially better variational distributions than HFSGVI, demonstrating that our convergence theory can matter in practice.


Variational Inference for Gaussian Process Models with Linear Complexity

Neural Information Processing Systems

Large-scale Gaussian process inference has long faced practical challenges due to time and space complexity that is superlinear in dataset size. While sparse variational Gaussian process models are capable of learning from large-scale data, standard strategies for sparsifying the model can prevent the approximation of complex functions. In this work, we propose a novel variational Gaussian process model that decouples the representation of mean and covariance functions in reproducing kernel Hilbert space. We show that this new parametrization generalizes previous models. Furthermore, it yields a variational inference problem that can be solved by stochastic gradient ascent with time and space complexity that is only linear in the number of mean function parameters, regardless of the choice of kernels, likelihoods, and inducing points. This strategy makes the adoption of large-scale expressive Gaussian process models possible. We run several experiments on regression tasks and show that this decoupled approach greatly outperforms previous sparse variational Gaussian process inference procedures.


A Disentangled Recognition and Nonlinear Dynamics Model for Unsupervised Learning

Neural Information Processing Systems

This paper takes a step towards temporal reasoning in a dynamically changing video, not in the pixel space that constitutes its frames, but in a latent space that describes the non-linear dynamics of the objects in its world. We introduce the Kalman variational auto-encoder, a framework for unsupervised learning of sequential data that disentangles two latent representations: an object's representation, coming from a recognition model, and a latent state describing its dynamics. As a result, the evolution of the world can be imagined and missing data imputed, both without the need to generate high-dimensional frames at each time step. The model is trained end-to-end on videos of a variety of simulated physical systems, and outperforms competing methods in generative and missing data imputation tasks.