Regression
Machine Learning and Optimization Techniques for Solving Inverse Kinematics in a 7-DOF Robotic Arm
As the pace of AI technology continues to accelerate, more tools have become available to researchers to solve longstanding problems. Hybrid approaches available today continue to push the computational limits of efficiency and precision. One such problem is the inverse kinematics of redundant systems. This paper examines the complexities of a 7-degree-of-freedom manipulator and evaluates 13 optimization techniques for solving its inverse kinematics. Additionally, a novel approach is proposed to contribute to the field of algorithmic research. It was found to be over 200 times faster than the well-known traditional Particle Swarm Optimization technique. This new method may serve as a new field of search that combines the explorative capabilities of Machine Learning with the exploitative capabilities of numerical methods.
Predicting the energetic proton flux with a machine learning regression algorithm
Stumpo, Mirko, Laurenza, Monica, Benella, Simone, Marcucci, Maria Federica
ABSTRACT The need for real-time monitoring and alerting systems for Space Weather hazards has grown significantly in the last two decades. One of the most important challenges for space mission operations and planning is the prediction of solar proton events (SPEs). In this context, artificial intelligence and machine learning techniques have opened a new frontier, providing a new paradigm for statistical forecasting algorithms. The great majority of these models aim to predict the occurrence of an SPE, i.e., they are based on the classification approach. In this work we present a simple and efficient machine learning regression algorithm that is able to forecast the energetic proton flux up to 1 hour ahead by exploiting features derived from the electron flux only. This approach could help improve systems that monitor the radiation risk in both deep space and near-Earth environments. The model is very relevant for mission operations and planning, especially when flare characteristics and source location are not available in real time, as at Mars distance.
INTRODUCTION Solar Proton Events (SPEs) are pronounced enhancements of the energetic proton flux measured by instruments placed on different space probes across the Heliosphere. Solar protons can reach high energies, up to tens of GeV, as a consequence of different acceleration processes occurring at the Sun in association with transient phenomena like solar flares and coronal mass ejections (CMEs; Kahler et al. 1984; Shea & Smart 1990; Aschwanden 2002; Iucci et al. 2005). The particles then travel along interplanetary magnetic field lines and can produce a geoeffective SPE that can be detected by instruments placed on Earth-orbiting satellites, such as the Geostationary Operational Environmental Satellite (GOES).
Joint Optimization of Piecewise Linear Ensembles
Raymond, Matt, Violi, Angela, Scott, Clayton
Tree ensembles achieve state-of-the-art performance on numerous prediction tasks. We propose Joint Optimization of Piecewise Linear ENsembles (JOPLEN), which jointly fits piecewise linear models at all leaf nodes of an existing tree ensemble. In addition to enhancing the expressiveness of an ensemble, JOPLEN allows several common penalties, including sparsity-promoting matrix norms and subspace-norms, to be applied to nonlinear prediction. We demonstrate the performance of JOPLEN on over 100 regression and classification datasets and with a variety of penalties. JOPLEN leads to improved prediction performance relative to not only standard random forest and gradient boosted tree ensembles, but also other methods for enhancing tree ensembles. We demonstrate that JOPLEN with a nuclear norm penalty learns subspace-aligned functions. Additionally, JOPLEN combined with a Dirty LASSO penalty is an effective feature selection method for nonlinear prediction in multitask learning.
A variational Bayes approach to debiased inference for low-dimensional parameters in high-dimensional linear regression
Castillo, Ismaël, L'Huillier, Alice, Ray, Kolyan, Travis, Luke
We propose a scalable variational Bayes method for statistical inference for a single or low-dimensional subset of the coordinates of a high-dimensional parameter in sparse linear regression. Our approach relies on assigning a mean-field approximation to the nuisance coordinates and carefully modelling the conditional distribution of the target given the nuisance. This requires only a preprocessing step and preserves the computational advantages of mean-field variational Bayes, while ensuring accurate and reliable inference for the target parameter, including for uncertainty quantification. We investigate the numerical performance of our algorithm, showing that it performs competitively with existing methods. We further establish accompanying theoretical guarantees for estimation and uncertainty quantification in the form of a Bernstein--von Mises theorem.
How to regularize your regression
Can we learn how to set the regularization parameter from similar domain-specific data? Perhaps the simplest relation between a real dependent variable and a vector of features is a linear model. Given some training examples or datapoints consisting of pairs of features and dependent variables, we would like to learn the model parameters that give the best prediction given the features of an unseen example. This process of fitting a linear model to the datapoints is called linear regression. This simple yet effective model finds ubiquitous applications, ranging from biological, behavioral, and social sciences to environmental studies and financial forecasting, to make reliable predictions on future data.
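To make the role of the regularization parameter concrete, the following is a minimal sketch of ridge (L2-regularized) linear regression, with the parameter chosen on held-out data. It is an illustration only, not the method the entry above describes: the function names `ridge_fit` and `pick_lambda`, the synthetic data, and the candidate grid are all assumptions introduced here.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: w = (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def pick_lambda(X_train, y_train, X_val, y_val, grid):
    """Return the grid value with the lowest validation mean squared error."""
    def val_mse(lam):
        w = ridge_fit(X_train, y_train, lam)
        return float(np.mean((X_val @ w - y_val) ** 2))
    return min(grid, key=val_mse)

# Synthetic data (assumed for illustration): a noisy linear model.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(200, 3))
y = X @ w_true + 0.1 * rng.normal(size=200)

# Hold out the last 50 points to choose the regularization strength.
grid = [0.01, 0.1, 1.0, 10.0]
best_lam = pick_lambda(X[:150], y[:150], X[150:], y[150:], grid)
w_hat = ridge_fit(X[:150], y[:150], best_lam)
```

With `lam = 0` the fit reduces to ordinary least squares; larger values shrink the coefficient vector toward zero, which is why the choice of this parameter matters.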
Perceptron Collaborative Filtering
While multivariate logistic regression classifiers are a great way of implementing collaborative filtering (a method of making automatic predictions about the interests of a user by collecting preferences or taste information from many other users), we can also achieve similar results using neural networks. A recommender system is a subclass of information filtering system that provides suggestions for the items most pertinent to a particular user. A perceptron or a neural network is a machine learning model designed for fitting complex datasets using backpropagation and gradient descent. When coupled with advanced optimization techniques, the model may prove to be a great substitute for classical logistic classifiers. The optimizations include feature scaling, mean normalization, regularization, hyperparameter tuning, and using stochastic/mini-batch gradient descent instead of regular gradient descent. In this use case, we will use the perceptron in the recommender system to fit the parameters to the data from a multitude of users and use them to predict the preference/interest of a particular user.
Model-Based Inference and Experimental Design for Interference Using Partial Network Data
Reeves, Steven Wilkins, Lubold, Shane, Chandrasekhar, Arun G., McCormick, Tyler H.
The stable unit treatment value assumption states that the outcome of an individual is not affected by the treatment statuses of others; however, in many real-world applications, treatments can affect many others beyond those immediately treated. Interference can generically be thought of as mediated through some network structure. In many empirically relevant situations, however, complete network data (required to adjust for these spillover effects) are too costly or logistically infeasible to collect. Partially or indirectly observed network data (e.g., subsamples, aggregated relational data (ARD), egocentric sampling, or respondent-driven sampling) reduce the logistical and financial burden of collecting network data, but the statistical properties of treatment effect adjustments from these design strategies are only beginning to be explored. In this paper, we present a framework for the estimation and inference of treatment effect adjustments using partial network data through the lens of structural causal models. We also illustrate procedures to assign treatments using only partial network data, with the goal of either minimizing estimator variance or optimally seeding. We derive single network asymptotic results applicable to a variety of choices for an underlying graph model. We validate our approach using simulated experiments on observed graphs with applications to information diffusion in India and Malawi.
Minimax Linear Regression under the Quantile Risk
Hanchi, Ayoub El, Maddison, Chris J., Erdogdu, Murat A.
We study the problem of designing minimax procedures in linear regression under the quantile risk. We start by considering the realizable setting with independent Gaussian noise, where for any given noise level and distribution of inputs, we obtain the exact minimax quantile risk for a rich family of error functions and establish the minimaxity of OLS. This improves on the lower bounds obtained by Lecué and Mendelson (2016) and Mendelson (2017) for the special case of square error, and provides us with a lower bound on the minimax quantile risk over larger sets of distributions. Under the square error and a fourth moment assumption on the distribution of inputs, we show that this lower bound is tight over a larger class of problems. Specifically, we prove a matching upper bound on the worst-case quantile risk of a variant of the procedure proposed by Lecué and Lerasle (2020), thereby establishing its minimaxity, up to absolute constants. We illustrate the usefulness of our approach by extending this result to all p-th power error functions for $p \in (2, \infty)$. Along the way, we develop a generic analogue to the classical Bayesian method for lower bounding the minimax risk when working with the quantile risk, as well as a tight characterization of the quantiles of the smallest eigenvalue of the sample covariance matrix.
An Optimal Transport Approach for Network Regression
Zalles, Alex G., Hung, Kai M., Finneran, Ann E., Beaudrot, Lydia, Uribe, César A.
We study the problem of network regression, where one is interested in how the topology of a network changes as a function of Euclidean covariates. We build upon recent developments in generalized regression models on metric spaces based on Fréchet means and propose a network regression method using the Wasserstein metric. We show that when representing graphs as multivariate Gaussian distributions, the network regression problem requires the computation of a Riemannian center of mass (i.e., a Fréchet mean). Fréchet means with non-negative weights translate into a barycenter problem and can be efficiently computed using fixed-point iterations. Although the convergence guarantees of fixed-point iterations for the computation of Wasserstein affine averages remain an open problem, we provide evidence of convergence in a large number of synthetic and real-data scenarios. Extensive numerical results show that the proposed approach improves on existing procedures by accurately accounting for graph size, topology, and sparsity in synthetic experiments. Additionally, real-world experiments using the proposed approach result in higher Coefficient of Determination ($R^{2}$) values and lower mean squared prediction error (MSPE), cementing improved prediction capabilities in practice.
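The barycenter computation mentioned above can be illustrated with a minimal sketch of a widely used fixed-point update for the covariance of a Wasserstein barycenter of zero-mean Gaussians with non-negative weights. Whether this matches the authors' implementation is an assumption; the function names `sqrtm_psd` and `gaussian_barycenter_cov`, the identity initial guess, and the example covariances are hypothetical choices made here for illustration.

```python
import numpy as np

def sqrtm_psd(A):
    """Symmetric square root of a positive semi-definite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(A)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def gaussian_barycenter_cov(covs, weights, n_iter=100, tol=1e-10):
    """Fixed-point iteration for the barycenter covariance of zero-mean Gaussians.

    Assumes positive definite covariances and non-negative weights summing to one.
    """
    S = np.eye(covs[0].shape[0])  # assumed initial guess
    for _ in range(n_iter):
        root = sqrtm_psd(S)
        inv_root = np.linalg.inv(root)
        # T = sum_i w_i (S^{1/2} C_i S^{1/2})^{1/2}
        T = sum(w * sqrtm_psd(root @ C @ root) for w, C in zip(weights, covs))
        # Update: S <- S^{-1/2} T^2 S^{-1/2}
        S_next = inv_root @ T @ T @ inv_root
        if np.linalg.norm(S_next - S) < tol:
            return S_next
        S = S_next
    return S

# Two commuting (diagonal) covariances with equal weights.
covs = [np.diag([1.0, 4.0]), np.diag([9.0, 16.0])]
bary = gaussian_barycenter_cov(covs, [0.5, 0.5])
```

For commuting covariances the barycenter has the closed form $(\sum_i w_i C_i^{1/2})^2$, so in this example the iteration returns the diagonal matrix with entries 4 and 9. The sketch covers only non-negative weights; the affine (possibly negative) weights discussed in the abstract are outside its scope.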
Enriching the Machine Learning Workloads in BigBench
Polag, Matthias, Ivanov, Todor, Eichhorn, Timo
In the era of Big Data and the growing support for Machine Learning, Deep Learning and Artificial Intelligence algorithms in current software systems, there is an urgent need for standardized application benchmarks that stress test and evaluate these new technologies. Relying on the standardized BigBench (TPCx-BB) benchmark, this work enriches the improved BigBench V2 with three new workloads and expands the coverage of machine learning algorithms. Our workloads utilize multiple algorithms and compare different implementations for the same algorithm across several popular libraries like MLlib, SystemML, Scikit-learn and Pandas, demonstrating the relevance and usability of our benchmark extension.