Regression
Joint Gaussian Processes for Biophysical Parameter Retrieval
Svendsen, Daniel Heestermans, Martino, Luca, Campos-Taberner, Manuel, García-Haro, Francisco Javier, Camps-Valls, Gustau
Solving inverse problems is central to geosciences and remote sensing. Radiative transfer models (RTMs) represent mathematically the physical laws which govern the phenomena in remote sensing applications (forward models). The numerical inversion of the RTM equations is a challenging and computationally demanding problem, and for this reason, often the application of a nonlinear statistical regression is preferred. In general, regression models predict the biophysical parameter of interest from the corresponding received radiance. However, this approach does not employ the physical information encoded in the RTMs. An alternative strategy, which attempts to include the physical knowledge, consists in learning a regression model trained using data simulated by an RTM code. In this work, we introduce a nonlinear nonparametric regression model which combines the benefits of the two aforementioned approaches. The inversion is performed taking into account jointly both real observations and RTM-simulated data. The proposed Joint Gaussian Process (JGP) provides a solid framework for exploiting the regularities between the two types of data. The JGP automatically detects the relative quality of the simulated and real data, and combines them accordingly. This occurs by learning an additional hyper-parameter w.r.t. a standard GP model, and fitting parameters through maximizing the pseudo-likelihood of the real observations. The resulting scheme is both simple and robust, i.e., capable of adapting to different scenarios. The advantages of the JGP method compared to benchmark strategies are shown considering RTM-simulated and real observations in different experiments. Specifically, we consider leaf area index (LAI) retrieval from Landsat data combined with simulated data generated by the PROSAIL model.
Sparse High-Dimensional Linear Regression. Algorithmic Barriers and a Local Search Algorithm
We consider a sparse high dimensional regression model where the goal is to recover a k-sparse unknown vector \beta^* from n noisy linear observations of the form Y=X\beta^*+W \in R^n where X \in R^{n \times p} has iid N(0,1) entries and W \in R^n has iid N(0,\sigma^2) entries. Under certain assumptions on the parameters, an intriguing asymptotic gap appears between the minimum value of n, call it n^*, for which the recovery is information theoretically possible, and the minimum value of n, call it n_{alg}, for which an efficient algorithm is known to provably recover \beta^*. In a recent paper it was conjectured that the gap is not artificial, in the sense that for sample sizes n \in [n^*,n_{alg}] the problem is algorithmically hard. We support this conjecture in two ways. Firstly, we show that a well known recovery mechanism called Basis Pursuit Denoising Scheme provably fails to \ell_2-stably recover the vector when n \in [n^*,c n_{alg}], for some sufficiently small constant c>0. Secondly, we establish that n_{alg}, up to a multiplicative constant factor, is a phase transition point for the appearance of a certain Overlap Gap Property (OGP) over the space of k-sparse vectors. The presence of such an Overlap Gap Property phase transition, which originates in statistical physics, is known to provide evidence of algorithmic hardness. Finally, we show that if n>C n_{alg} for some large enough constant C>0, a very simple algorithm based on a local search improvement is able to correctly infer the support of the unknown vector \beta^*, adding it to the list of provably successful algorithms for the high dimensional linear regression problem.
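As a rough illustration of the setting Y=X\beta^*+W and of what a swap-based local search over k-sparse supports can look like, here is a minimal Python sketch. It is not the algorithm analysed in the paper: the swap rule, the least-squares refit, the starting support, and all names and problem sizes (`local_search_support`, `residual_norm`, n=100, p=30, k=3) are assumptions chosen for illustration only.

```python
import numpy as np

def residual_norm(X, Y, support):
    """Least-squares residual norm when only the columns in `support` are used."""
    cols = sorted(support)
    coef, *_ = np.linalg.lstsq(X[:, cols], Y, rcond=None)
    return float(np.linalg.norm(Y - X[:, cols] @ coef))

def local_search_support(X, Y, k, max_iters=100):
    """Hypothetical swap-based local search over supports of size k.

    Repeatedly swaps one index inside the support for one outside it
    whenever the swap lowers the residual; stops at a local minimum.
    """
    p = X.shape[1]
    support = set(range(k))              # arbitrary starting support (assumption)
    best = residual_norm(X, Y, support)
    for _ in range(max_iters):
        improved = False
        for i in sorted(support):
            for j in range(p):
                if j in support:
                    continue
                candidate = (support - {i}) | {j}
                value = residual_norm(X, Y, candidate)
                if value < best:         # accept the first improving swap
                    support, best, improved = candidate, value, True
                    break
            if improved:
                break
        if not improved:
            break
    return support, best

# Toy instance of Y = X beta* + W with iid Gaussian X and W (sizes are illustrative).
rng = np.random.default_rng(0)
n, p, k, sigma = 100, 30, 3, 0.1
X = rng.normal(size=(n, p))
beta_star = np.zeros(p)
beta_star[[5, 12, 20]] = 1.0
Y = X @ beta_star + sigma * rng.normal(size=n)

initial = residual_norm(X, Y, set(range(k)))
support, final = local_search_support(X, Y, k)
print(sorted(support), initial, final)
```

Each accepted swap strictly lowers the residual, so the search terminates at a support that no single swap can improve. Whether that support equals the true one depends on n, p, k and the noise level, which is the question the abstract addresses.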
Tensor Decompositions for Modeling Inverse Dynamics
Modeling inverse dynamics is crucial for accurate feedforward robot control. The model computes the joint torques necessary to perform a desired movement. The highly non-linear inverse function of the dynamical system can be approximated using regression techniques. We propose as regression method a tensor decomposition model that exploits the inherent three-way interaction of positions x velocities x accelerations. Most work in tensor factorization has addressed the decomposition of dense tensors. In this paper, we build upon the decomposition of sparse tensors, with only small amounts of nonzero entries. The decomposition of sparse tensors has successfully been used in relational learning, e.g., the modeling of large knowledge graphs. Recently, the approach has been extended to multi-class classification with discrete input variables. Representing the data in high dimensional sparse tensors enables the approximation of complex highly non-linear functions. In this paper we show how the decomposition of sparse tensors can be applied to regression problems. Furthermore, we extend the method to continuous inputs, by learning a mapping from the continuous inputs to the latent representations of the tensor decomposition, using basis functions. We evaluate our proposed model on a dataset with trajectories from a seven degrees of freedom SARCOS robot arm. Our experimental results show superior performance of the proposed functional tensor model, compared to challenging state-of-the-art methods.
Using Phone Sensors and an Artificial Neural Network to Detect Gait Changes During Drinking Episodes in the Natural Environment
Suffoletto, Brian, Gharani, Pedram, Chung, Tammy, Karimi, Hassan
Phone sensors could be useful in assessing changes in gait that occur with alcohol consumption. This study determined (1) feasibility of collecting gait-related data during drinking occasions in the natural environment, and (2) how gait-related features measured by phone sensors relate to estimated blood alcohol concentration (eBAC). Ten young adult heavy drinkers were prompted to complete a 5-step gait task every hour from 8pm to 12am over four consecutive weekends. We collected 3-axis accelerometer, gyroscope, and magnetometer data from phone sensors, and computed 24 gait-related features using a sliding window technique. eBAC levels were calculated at each time point based on Ecological Momentary Assessment (EMA) of alcohol use. We used an artificial neural network model to analyze associations between sensor features and eBACs in training (70% of the data) and validation and test (30% of the data) datasets. We analyzed 128 data points where both eBAC and gait-related sensor data were captured, either when not drinking (n=60), while eBAC was ascending (n=55) or eBAC was descending (n=13). 21 data points were captured at times when the eBAC was greater than the legal limit (0.08 mg/dl). Using a Bayesian regularized neural network, gait-related phone sensor features showed a high correlation with eBAC (Pearson's r > 0.9), and >95% of estimated eBAC would fall between -0.012 and +0.012 of actual eBAC. It is feasible to collect gait-related data from smartphone sensors during drinking occasions in the natural environment. Sensor-based features can be used to infer gait changes associated with elevated blood alcohol content.
Machine Learning Crash Course, Part I: Supervised Machine Learning IoT For All
When you type 'machine learning' into Google News, the first link you see is a Forbes Magazine piece called "What's The Difference Between Machine Learning And Artificial Intelligence?" This article contained so many flowery, grandiose descriptions about ML and AI technology that I couldn't help but laugh. With all the nonsense the media uses to describe machine learning (ML) and artificial intelligence (AI), it's time we do a deep dive into what these technologies actually do. First, we need to learn the difference between AI and ML. Fortunately, a fellow writer has already written an excellent explanation here.
Implementing a Neural Network from Scratch in Python – an Introduction
Get the code: To follow along, all the code is also available as an iPython notebook on Github. In this post we will implement a simple 3-layer neural network from scratch. We won't derive all the math that's required, but I will try to give an intuitive explanation of what we are doing. I will also point to resources for you to read up on the details. Here I'm assuming that you are familiar with basic Calculus and Machine Learning concepts, e.g.
Linear Models Don't have to Fit Exactly for P-Values To Be Accurate, Right, and Useful
There is no need to confuse multiple linear regression, the generalized linear model, and general linear methods. The general linear model, or multivariate regression model, is a statistical linear model and is written as Y = XB + U. The term covers a number of different statistical models, such as ANOVA, ANCOVA, MANOVA, MANCOVA, ordinary linear regression, the t-test and the F-test. The general linear model is a generalization of multiple linear regression to the case of more than one dependent variable. So if Y, B, and U are column vectors, the matrix equation above describes a multiple linear regression. What are the key assumptions made in a multiple linear regression analysis?
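To make the matrix form concrete, here is a small sketch that simulates data from Y = XB + U with two dependent variables and recovers B by ordinary least squares. The dimensions, the coefficient values in `B_true`, and the noise scale are made up for illustration; `np.linalg.lstsq` is used here as one convenient way to fit the model.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p, m = 200, 3, 2            # observations, predictors, dependent variables
X = rng.normal(size=(n, p))    # design matrix
B_true = np.array([[1.0, -2.0],
                   [0.5,  0.0],
                   [3.0,  1.5]])  # illustrative coefficients, one column per response
U = 0.01 * rng.normal(size=(n, m))  # small Gaussian errors
Y = X @ B_true + U                  # general linear model: Y = XB + U

# Ordinary least squares estimate of B; each column of Y is fit jointly.
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(B_hat)
```

With m = 1, Y, B and U collapse to column vectors and the same call fits an ordinary multiple linear regression.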
TensorFlow: What Parameters to Optimize?
This article targets readers who have a basic understanding of the TensorFlow Core API. Learning the TensorFlow Core API, which is the lowest-level API in TensorFlow, is a very good first step in learning TensorFlow because it lets you understand the kernel of the library. Here is a very simple example of the TensorFlow Core API in which we create and train a linear regression model. The loss returned is 53.76. The existence of error, especially a large error, means that the parameters used must be updated.
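The article's own TensorFlow listing is not reproduced in this excerpt. As a stand-in, the following NumPy sketch shows the idea the paragraph describes: compute a squared-error loss for a linear model W*x + b, then update W and b by gradient descent until the loss shrinks. The data, starting values, learning rate, and function names are assumptions for illustration, and the code does not attempt to reproduce the 53.76 figure.

```python
import numpy as np

def squared_error_loss(W, b, x, y):
    """Sum of squared differences between the linear model W*x + b and y."""
    return float(np.sum((W * x + b - y) ** 2))

def fit(x, y, W=0.0, b=0.0, lr=0.01, steps=2000):
    """Plain gradient descent on the squared-error loss (illustrative settings)."""
    for _ in range(steps):
        residual = W * x + b - y
        grad_W = 2.0 * np.sum(residual * x)   # d(loss)/dW
        grad_b = 2.0 * np.sum(residual)       # d(loss)/db
        W -= lr * grad_W
        b -= lr * grad_b
    return W, b

# Illustrative data lying exactly on the line y = -x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, -1.0, -2.0, -3.0])

loss_before = squared_error_loss(0.0, 0.0, x, y)
W, b = fit(x, y)
loss_after = squared_error_loss(W, b, x, y)
print(loss_before, W, b, loss_after)
```

The parameters being optimized are exactly the ones the loss depends on, here W and b. A large initial loss is the signal that they are far from values that fit the data.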
A Sparse Graph-Structured Lasso Mixed Model for Genetic Association with Confounding Correction
Ye, Wenting, Liu, Xiang, Wang, Haohan, Xing, Eric P.
While the linear mixed model (LMM) has shown competitive performance in correcting spurious associations raised by population stratification, family structures, and cryptic relatedness, more challenges are still to be addressed regarding the complex structure of genotypic and phenotypic data. For example, geneticists have discovered that some clusters of phenotypes are more co-expressed than others. Hence, a joint analysis that can utilize such relatedness information in a heterogeneous data set is crucial for genetic modeling. We propose the sparse graph-structured linear mixed model (sGLMM), which can incorporate the relatedness information from traits in a dataset with confounding correction. Our method is capable of uncovering the genetic associations of a large number of phenotypes together while considering the relatedness of these phenotypes. Through extensive simulation experiments, we show that the proposed model outperforms other existing approaches and can model correlation from both population structure and shared signals. Further, we validate the effectiveness of sGLMM on real-world genomic datasets from two different species, plants and humans. In Arabidopsis thaliana data, sGLMM performs better than all other baseline models for 63.4% of traits. We also discuss the potential causal genetic variation of human Alzheimer's disease discovered by our model and justify some of the most important genetic loci.
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
Wang, Yu-Xiang, Agarwal, Alekh, Dudik, Miroslav
We study the off-policy evaluation problem---estimating the value of a target policy using data collected by another policy---under the contextual bandit model. We consider the general (agnostic) setting without access to a consistent model of rewards and establish a minimax lower bound on the mean squared error (MSE). The bound is matched up to constants by the inverse propensity scoring (IPS) and doubly robust (DR) estimators. This highlights the difficulty of the agnostic contextual setting, in contrast with multi-armed bandits and contextual bandits with access to a consistent reward model, where IPS is suboptimal. We then propose the SWITCH estimator, which can use an existing reward model (not necessarily consistent) to achieve a better bias-variance tradeoff than IPS and DR. We prove an upper bound on its MSE and demonstrate its benefits empirically on a diverse collection of data sets, often outperforming prior work by orders of magnitude.
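For readers unfamiliar with the two baseline estimators named above, here is a minimal sketch of textbook inverse propensity scoring (IPS) and doubly robust (DR) value estimates. The array layout, function names, and toy numbers are assumptions for illustration; the SWITCH estimator itself is not implemented here.

```python
import numpy as np

def ips_estimate(rewards, actions, target_policy, logging_probs):
    """IPS: reweight each logged reward by pi_target(a|x) / pi_logging(a|x)."""
    idx = np.arange(len(rewards))
    weights = target_policy[idx, actions] / logging_probs
    return float(np.mean(weights * rewards))

def dr_estimate(rewards, actions, target_policy, logging_probs, reward_model):
    """DR: reward-model estimate plus an importance-weighted residual correction."""
    idx = np.arange(len(rewards))
    # Expected model reward under the target policy for each context.
    direct = np.sum(target_policy * reward_model, axis=1)
    weights = target_policy[idx, actions] / logging_probs
    correction = weights * (rewards - reward_model[idx, actions])
    return float(np.mean(direct + correction))

# Toy log: 4 contexts, 2 actions, uniform logging policy (all values illustrative).
actions = np.array([0, 1, 0, 1])                  # logged actions
rewards = np.array([1.0, 0.0, 0.0, 1.0])          # observed rewards
logging_probs = np.full(4, 0.5)                   # probability of each logged action
target_policy = np.tile([0.0, 1.0], (4, 1))       # target always picks action 1
zero_model = np.zeros((4, 2))                     # a deliberately uninformative model

ips_value = ips_estimate(rewards, actions, target_policy, logging_probs)
dr_value = dr_estimate(rewards, actions, target_policy, logging_probs, zero_model)
print(ips_value, dr_value)
```

With an all-zero reward model the DR estimate reduces to IPS, which shows that DR differs from IPS only through what the reward model contributes.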