Collaborating Authors

An exact kernel framework for spatio-temporal dynamics Machine Learning

A kernel-based framework for spatio-temporal data analysis is introduced that applies in situations when the underlying system dynamics are governed by a dynamic equation. The key ingredient is a representer theorem that involves time-dependent kernels. Such kernels occur commonly in the expansion of solutions of partial differential equations. The representer theorem is applied to find among all solutions of a dynamic equation the one that minimizes the error with given spatio-temporal samples. This is motivated by the fact that very often a differential equation is given a priori (e.g.~by the laws of physics) and a practitioner seeks the best solution that is compatible with her noisy measurements. Our guiding example is the Fokker-Planck equation, which describes the evolution of density in stochastic diffusion processes. A regression and density estimation framework is introduced for spatio-temporal modeling under Fokker-Planck dynamics with initial and boundary conditions.

Boundary Estimation from Point Clouds: Algorithms, Guarantees and Applications Machine Learning

We investigate identifying the boundary of a domain from sample points in the domain. We introduce new estimators for the normal vector to the boundary, distance of a point to the boundary, and a test for whether a point lies within a boundary strip. The estimators can be efficiently computed and are more accurate than the ones present in the literature. We provide rigorous error estimates for the estimators. Furthermore we use the detected boundary points to solve boundary-value problems for PDE on point clouds. We prove error estimates for the Laplace and eikonal equations on point clouds. Finally we provide a range of numerical experiments illustrating the performance of our boundary estimators, applications to PDE on point clouds, and tests on image data sets.

Higher-Order Orthogonal Causal Learning for Treatment Effect Machine Learning

Most existing studies on the double/debiased machine learning method concentrate on the causal parameter estimation recovering from the first-order orthogonal score function. In this paper, we will construct the $k^{\mathrm{th}}$-order orthogonal score function for estimating the average treatment effect (ATE) and present an algorithm that enables us to obtain the debiased estimator recovered from the score function. Such a higher-order orthogonal estimator is more robust to the misspecification of the propensity score than the first-order one does. Besides, it has the merit of being applicable with many machine learning methodologies such as Lasso, Random Forests, Neural Nets, etc. We also undergo comprehensive experiments to test the power of the estimator we construct from the score function using both the simulated datasets and the real datasets.

Model Selection for High-Dimensional Regression under the Generalized Irrepresentability Condition Machine Learning

In the high-dimensional regression model a response variable is linearly related to $p$ covariates, but the sample size $n$ is smaller than $p$. We assume that only a small subset of covariates is `active' (i.e., the corresponding coefficients are non-zero), and consider the model-selection problem of identifying the active covariates. A popular approach is to estimate the regression coefficients through the Lasso ($\ell_1$-regularized least squares). This is known to correctly identify the active set only if the irrelevant covariates are roughly orthogonal to the relevant ones, as quantified through the so called `irrepresentability' condition. In this paper we study the `Gauss-Lasso' selector, a simple two-stage method that first solves the Lasso, and then performs ordinary least squares restricted to the Lasso active set. We formulate `generalized irrepresentability condition' (GIC), an assumption that is substantially weaker than irrepresentability. We prove that, under GIC, the Gauss-Lasso correctly recovers the active set.

On pattern recovery of the fused Lasso Machine Learning

We study the property of the Fused Lasso Signal Approximator (FLSA) for estimating a blocky signal sequence with additive noise. We transform the FLSA to an ordinary Lasso problem. By studying the property of the design matrix in the transformed Lasso problem, we find that the irrepresentable condition might not hold, in which case we show that the FLSA might not be able to recover the signal pattern. We then apply the newly developed preconditioning method -- Puffer Transformation [Jia and Rohe, 2012] on the transformed Lasso problem. We call the new method the preconditioned fused Lasso and we give non-asymptotic results for this method. Results show that when the signal jump strength (signal difference between two neighboring groups) is big and the noise level is small, our preconditioned fused Lasso estimator gives the correct pattern with high probability. Theoretical results give insight on what controls the signal pattern recovery ability -- it is the noise level {instead of} the length of the sequence. Simulations confirm our theorems and show significant improvement of the preconditioned fused Lasso estimator over the vanilla FLSA.