Goto

Collaborating Authors

 Regression


The Incremental Proximal Method: A Probabilistic Perspective

arXiv.org Machine Learning

In this work, we highlight a connection between the incremental proximal method and stochastic filters. We begin by showing that the proximal operators coincide, and hence can be realized with, Bayes updates. We give the explicit form of the updates for the linear regression problem and show that there is a one-to-one correspondence between the proximal operator of the least-squares regression and the Bayes update when the prior and the likelihood are Gaussian. We then carry out this observation to a general sequential setting: We consider the incremental proximal method, which is an algorithm for large-scale optimization, and show that, for a linear-quadratic cost function, it can naturally be realized by the Kalman filter. We then discuss the implications of this idea for nonlinear optimization problems where proximal operators are in general not realizable. In such settings, we argue that the extended Kalman filter can provide a systematic way for the derivation of practical procedures.


One-Class Kernel Spectral Regression for Outlier Detection

arXiv.org Machine Learning

The paper introduces a new efficient nonlinear one-class classifier formulated as the Rayleigh quotient criterion. The method, operating in a reproducing kernel Hilbert subspace, minimises the scatter of target distribution along an optimal projection direction while at the same time keeping projections of positive observations as distant as possible from the mean of the negative class. We provide a graph embedding view of the problem which can then be solved efficiently using the spectral regression approach. In this sense, unlike previous similar methods which often require costly eigen-computations of dense matrices, the proposed approach casts the problem under consideration into a regression framework which avoids eigen-decomposition computations. In particular, it is shown that the dominant complexity of the proposed method is the complexity of computing the kernel matrix. Additional appealing characteristics of the proposed one-class classifier are: 1-the ability to be trained in an incremental fashion (allowing for application in streaming data scenarios while also reducing computational complexity in a non-streaming operation mode); 2-being unsupervised while also providing the functionality for refining the solution using negative training examples, in case available; And last but not least 3-the deployment of the kernel trick allowing for nonlinearly mapping the data into a high-dimensional feature space. Extensive experiments conducted on several datasets verify the merits of the proposed approach in comparison with some other alternatives.


Orthogonal Random Forest for Heterogeneous Treatment Effect Estimation

arXiv.org Machine Learning

We study the problem of estimating heterogeneous treatment effects from observational data, where the treatment policy on the collected data was determined by potentially many confounding observable variables. We propose orthogonal random forest, an algorithm that combines orthogonalization, a technique that effectively removes the confounding effect in two-stage estimation, with generalized random forests [Athey et al., 2017], a flexible method for estimating treatment effect heterogeneity. We prove a consistency rate result of our estimator in the partially linear regression model, and en route we provide a consistency analysis for a general framework of performing generalized method of moments (GMM) estimation. We also provide a comprehensive empirical evaluation of our algorithms, and show that they consistently outperform baseline approaches.


Proactive Intervention to Downtrend Employee Attrition using Artificial Intelligence Techniques

arXiv.org Machine Learning

To predict the employee attrition beforehand and to enable management to take individualized preventive action. Using Ensemble classification modeling techniques and Linear Regression. Model could predict over 91% accurate employee prediction, lead-time in separation and individual reasons causing attrition. Prior intimation of employee attrition enables manager to take preventive actions to retain employee or to manage the business consequences of attrition. Once deployed this will model can help in downtrend Employee Attrition, will help manager to manage team more effectively. Model does not cover the natural calamities, and unforeseen events occurring at an individual level like accident, death etc.


Stable Prediction across Unknown Environments

arXiv.org Machine Learning

In many important machine learning applications, the training distribution used to learn a probabilistic classifier differs from the testing distribution on which the classifier will be used to make predictions. Traditional methods correct the distribution shift by reweighting the training data with the ratio of the density between test and training data. In many applications training takes place without prior knowledge of the testing distribution on which the algorithm will be applied in the future. Recently, methods have been proposed to address the shift by learning causal structure, but those methods rely on the diversity of multiple training data to a good performance, and have complexity limitations in high dimensions. In this paper, we propose a novel Deep Global Balancing Regression (DGBR) algorithm to jointly optimize a deep auto-encoder model for feature selection and a global balancing model for stable prediction across unknown environments. The global balancing model constructs balancing weights that facilitate estimating of partial effects of features (holding fixed all other features), a problem that is challenging in high dimensions, and thus helps to identify stable, causal relationships between features and outcomes. The deep auto-encoder model is designed to reduce the dimensionality of the feature space, thus making global balancing easier. We show, both theoretically and with empirical experiments, that our algorithm can make stable predictions across unknown environments. Our experiments on both synthetic and real world datasets demonstrate that our DGBR algorithm outperforms the state-of-the-art methods for stable prediction across unknown environments.


A Hierarchical Bayesian Linear Regression Model with Local Features for Stochastic Dynamics Approximation

arXiv.org Machine Learning

One of the challenges in model-based control of stochastic dynamical systems is that the state transition dynamics are involved, and it is not easy or efficient to make good-quality predictions of the states. Moreover, there are not many representational models for the majority of autonomous systems, as it is not easy to build a compact model that captures the entire dynamical subtleties and uncertainties. In this work, we present a hierarchical Bayesian linear regression model with local features to learn the dynamics of a micro-robotic system as well as two simpler examples, consisting of a stochastic mass-spring damper and a stochastic double inverted pendulum on a cart. The model is hierarchical since we assume non-stationary priors for the model parameters. These non-stationary priors make the model more flexible by imposing priors on the priors of the model. To solve the maximum likelihood (ML) problem for this hierarchical model, we use the variational expectation maximization (EM) algorithm, and enhance the procedure by introducing hidden target variables. The algorithm yields parsimonious model structures, and consistently provides fast and accurate predictions for all our examples involving large training and test sets. This demonstrates the effectiveness of the method in learning stochastic dynamics, which makes it suitable for future use in a paradigm, such as model-based reinforcement learning, to compute optimal control policies in real time.


Fully Nonparametric Bayesian Additive Regression Trees

arXiv.org Machine Learning

Bayesian Additive Regression Trees (BART) is a fully Bayesian approach to modeling with ensembles of trees. BART can uncover complex regression functions with high dimensional regressors in a fairly automatic way and provide Bayesian quantification of the uncertainty through the posterior. However, BART assumes IID normal errors. This strong parametric assumption can lead to misleading inference and uncertainty quantification. In this paper, we use the classic Dirichlet process mixture (DPM) mechanism to nonparametrically model the error distribution. A key strength of BART is that default prior settings work reasonably well in a variety of problems. The challenge in extending BART is to choose the parameters of the DPM so that the strengths of the standard BART approach is not lost when the errors are close to normal, but the DPM has the ability to adapt to non-normal errors.


Logistic Regression, Neural Networks and Dempster-Shafer Theory: a New Perspective

arXiv.org Machine Learning

We revisit logistic regression and its nonlinear extensions, including multilayer feedforward neural networks, by showing that these classifiers can be viewed as converting input or higher-level features into Dempster-Shafer mass functions and aggregating them by Dempster's rule of combination. The probabilistic outputs of these classifiers are the normalized plausibilities corresponding to the underlying combined mass function. This mass function is more informative than the output probability distribution. In particular, it makes it possible to distinguish between lack of evidence (when none of the features provides discriminant information) from conflicting evidence (when different features support different classes). This expressivity of mass functions allows us to gain insight into the role played by each input feature in logistic regression, and to interpret hidden unit outputs in multilayer neural networks. It also makes it possible to use alternative decision rules, such as interval dominance, which select a set of classes when the available evidence does not unambiguously point to a single class, thus trading reduced error rate for higher imprecision.


Best machine learning, deep learning, ai & ios courses online

#artificialintelligence

It covers both the theoretical aspects of Statisticalconcepts and the practical implementation using R. Real life examples: Every concept is explained with the help of examples, case studies and source code in R wherever necessary. The examples cover a wide array of topics and range from A/B testing in an Internet company context to the Capital Asset Pricing Model in a quant finance context. What you will learn Harness R and R packages to read, process and visualize data Understand linear regression and use it confidently to build models Understand the intricacies of all the different data structures in R Use Linear regression in R to overcome the difficulties of LINEST() in Excel Draw inferences from data and support them using tests of significance Use descriptive statistics to perform a quick study of some data and present results Click here To join us for more information, get in touch keep enhancing Complete iOS 11 Machine Learning Masterclass 3. If you want to learn how to start building professional, career-boosting mobile apps and use Machine Learning to take things to the next level, then this course is for you. The Complete iOS Machine Learning Masterclass is the only course that you need for machine learning on iOS. Machine Learning is a fast-growing field that is revolutionizing many industries with tech giants like Google and IBM taking the lead. In this course, you'll use the most cutting-edge iOS Machine Learning technology stacks to add a layer of intelligence and polish to your mobile apps. We're approaching a new era where only apps and games that are considered "smart" will survive.


Accurate Uncertainties for Deep Learning Using Calibrated Regression

arXiv.org Machine Learning

Methods for reasoning under uncertainty are a key building block of accurate and reliable machine learning systems. Bayesian methods provide a general framework to quantify uncertainty. However, because of model misspecification and the use of approximate inference, Bayesian uncertainty estimates are often inaccurate -- for example, a 90% credible interval may not contain the true outcome 90% of the time. Here, we propose a simple procedure for calibrating any regression algorithm; when applied to Bayesian and probabilistic models, it is guaranteed to produce calibrated uncertainty estimates given enough data. Our procedure is inspired by Platt scaling and extends previous work on classification. We evaluate this approach on Bayesian linear regression, feedforward, and recurrent neural networks, and find that it consistently outputs well-calibrated credible intervals while improving performance on time series forecasting and model-based reinforcement learning tasks.