 Regression


Regression Vs Classification In Machine Learning

#artificialintelligence

Beginners in machine learning often confuse regression with classification, and that confusion makes it hard for them to choose the right method for a prediction problem. Regression and classification are both types of supervised machine learning, in which a model is trained on correctly labeled data. Let's understand each type of algorithm first. Regression algorithms estimate a continuous value based on the input variables.
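To make the distinction concrete, here is a minimal sketch in plain Python. The toy data (house sizes, prices, and the "small"/"large" labels) and the helper names are illustrative assumptions, not taken from the article. The regression model returns a continuous number, while the classifier returns one of a fixed set of labels.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b for one input variable."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, y_bar - a * x_bar

def fit_centroids(xs, labels):
    """Nearest-centroid classifier: store the mean input of each class."""
    return {c: mean(x for x, l in zip(xs, labels) if l == c) for c in set(labels)}

def classify(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))

# Hypothetical toy data: size in square metres, price in thousands.
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]
labels = ["small", "small", "large", "large"]

# Regression: the output is a continuous value.
a, b = fit_line(sizes, prices)
predicted_price = a * 90.0 + b

# Classification: the output is a discrete label.
centroids = fit_centroids(sizes, labels)
predicted_label = classify(centroids, 90.0)

print(predicted_price, predicted_label)
```

Both models are trained on the same labeled inputs; only the kind of target (a number versus a category) differs.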


Creating an Algorithmic Trading Strategy Using Python and Logistic Regression

#artificialintelligence

Obtaining historical data on the stocks that we want to observe is a two-step process. The library get-all-tickers allows us to compile a list of stock tickers by filtering companies on aspects like market cap or exchange. For this example, I am looking at companies that have a market cap between $150,000 and $10,000,000 (in millions). You will notice that I also included a line of code to print the number of tickers we are using. Make sure that you are not targeting more than 2,000 tickers, because the yfinance API has a limit of 2,000 API calls per hour.
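One way to stay under the hourly limit described above is to split the ticker list into batches before requesting any data. The sketch below is plain Python and does not call get-all-tickers or yfinance. The helper `batch_tickers`, the placeholder symbols, and the assumption of one API call per ticker are all hypothetical.

```python
def batch_tickers(tickers, hourly_limit=2000):
    """Split tickers into batches no larger than the hourly call budget.

    Assumes one API call per ticker, which may not match real usage.
    """
    if hourly_limit <= 0:
        raise ValueError("hourly_limit must be positive")
    return [tickers[i:i + hourly_limit]
            for i in range(0, len(tickers), hourly_limit)]

# Placeholder symbols standing in for a real filtered ticker list.
tickers = [f"TICK{i}" for i in range(4500)]
print(len(tickers))  # print the ticker count, as the article does

batches = batch_tickers(tickers)
batch_sizes = [len(batch) for batch in batches]
print(batch_sizes)
```

Each batch could then be processed in its own hour, so that no single hour exceeds the budget.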


Intro to Regularization With Ridge And Lasso Regression with Sklearn

#artificialintelligence

Ordinary Least Squares is one of the easiest and most widely used ML algorithms. But it suffers from a fatal flaw -- it is super easy for the algorithm to overfit the training data. But as the number of predictor variables (or dimensions) increases, the coefficients β_i also tend to get very large. With large coefficients, it is easy to predict nearly everything -- you just take the relevant combination of individual slopes (βs) and you get the answer. That's why it is common for linear regression models to overfit the training data.
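The shrinking effect of the two penalties can be seen in the simplest possible case: one predictor and no intercept. The sketch below is an illustration in plain Python, not the article's Sklearn code. It assumes the objectives sum((y - β·x)²) + λ·β² for ridge and sum((y - β·x)²) + λ·|β| for lasso. Libraries scale the penalty differently, so the λ values here are not comparable to a library's regularization parameter.

```python
def ols_coef(xs, ys):
    """Unpenalized least-squares slope for y = beta * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def ridge_coef(xs, ys, lam):
    """Minimizer of sum((y - beta*x)^2) + lam * beta^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def lasso_coef(xs, ys, lam):
    """Minimizer of sum((y - beta*x)^2) + lam * |beta| (soft-thresholding)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)  # shrink towards zero
    return (shrunk if sxy >= 0 else -shrunk) / sxx

# Hypothetical toy data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

beta_ols = ols_coef(xs, ys)
beta_ridge = ridge_coef(xs, ys, lam=14.0)
beta_lasso = lasso_coef(xs, ys, lam=28.0)
beta_lasso_strong = lasso_coef(xs, ys, lam=56.0)

print(beta_ols, beta_ridge, beta_lasso, beta_lasso_strong)
```

Ridge pulls the coefficient towards zero but never makes it exactly zero for a finite penalty, whereas a strong enough lasso penalty sets it to exactly zero.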


Panel semiparametric quantile regression neural network for electricity consumption forecasting

arXiv.org Machine Learning

China has made great achievements in its electric power industry during the long-term deepening of reform and opening up. However, owing to complex regional economic, social and natural conditions, electricity resources are not evenly distributed, which accounts for the electricity deficiency in some regions of China. It is desirable to develop a robust electricity forecasting model. Motivated by this, we propose a Panel Semiparametric Quantile Regression Neural Network (PSQRNN) by utilizing the artificial neural network and semiparametric quantile regression. The PSQRNN can explore potential linear and nonlinear relationships among the variables, interpret the unobserved provincial heterogeneity, and maintain the interpretability of parametric models simultaneously. The PSQRNN is trained by combining penalized quantile regression with LASSO, ridge regression and the backpropagation algorithm. To evaluate the prediction accuracy, an empirical analysis is conducted to analyze the provincial electricity consumption from 1999 to 2018 in China based on three scenarios. From this analysis, one finds that the PSQRNN model performs better for electricity consumption forecasting by considering the economic and climatic factors. Finally, forecasts of provincial electricity consumption in China for the next 5 years (2019-2023) are reported.


Feedback Coding for Active Learning

arXiv.org Machine Learning

Active learning is an area of modern machine learning that studies how data points can be sequentially selected for labeling to train a model with as few labeled examples as possible (Settles, 2009). Minimizing the number of labeled examples is critical in any learning scenario where labels are expensive to obtain, such as in healthcare applications where a medical expert must hand-label each training example (Liu, 2004), or where only a limited number of examples can be evaluated, such as in drug discovery (Warmuth et al., 2003). The active selection of data points shares many technical parallels with channel coding with feedback, where a message is encoded into a sequence of symbols transmitted across a noisy channel and each symbol is selected based on the message and past channel outputs. In active learning, the optimal classifier parameters play the role of the "message" while the sequence of examples with noisy labels plays the role of "channel outputs" available through feedback to select the next example for labeling. Both feedback channel coding and active learning seek to minimize the number of encoder actions, leverage a history of noisy observations to select the next most informative action, must account for observation noise, and should operate in a computationally efficient manner. Although there exists a large literature studying the intersection of information theory with machine learning (Xu and Raginsky, 2017) and specifically active learning (Naghshvar et al., 2015), there remain open questions about the best ways to directly leverage techniques in channel coding for active example selection. The main contribution of this work is a formulation of general active learning problems in terms of a feedback coding system, and a demonstration of this approach through the application and analysis of active learning in logistic regression. 
To motivate this approach, we first examine active learning through the lens of feedback channel coding by identifying communications system components, including a deterministic encoder, noisy channel, channel input constraints, and capacity-achieving distribution. With these components identified, we show how typical structural constraints in active learning problems prevent the direct application of existing feedback coding approaches such as posterior matching (Ma and Coleman, 2011).


Supervised Learning in the Presence of Concept Drift: A modelling framework

arXiv.org Machine Learning

We present a modelling framework for the investigation of supervised learning in non-stationary environments. Specifically, we model two example types of learning systems: prototype-based Learning Vector Quantization (LVQ) for classification and shallow, layered neural networks for regression tasks. We investigate so-called student teacher scenarios in which the systems are trained from a stream of high-dimensional, labeled data. Properties of the target task are considered to be non-stationary due to drift processes while the training is performed. Different types of concept drift are studied, which affect the density of example inputs only, the target rule itself, or both. By applying methods from statistical physics, we develop a modelling framework for the mathematical analysis of the training dynamics in non-stationary environments. Our results show that standard LVQ algorithms are already suitable for the training in non-stationary environments to a certain extent. However, the application of weight decay as an explicit mechanism of forgetting does not improve the performance under the considered drift processes. Furthermore, we investigate gradient-based training of layered neural networks with sigmoidal activation functions and compare with the use of rectified linear units (ReLU). Our findings show that the sensitivity to concept drift and the effectiveness of weight decay differ significantly between the two types of activation function.


Bayesian Thinking & Estimating Posterior Distribution for Linear Regression @ Data Ketchup…

#artificialintelligence

One of the major motivations of this research is the increasing focus on deep model interpretability that has come with the advent of more and more complex models. The more complex the model, the harder it is to interpret its outputs, and a lot of research is going on in the field of Bayesian thinking and learning. But before we can understand and appreciate Bayesian methods in deep neural models, we should be well versed in Bayesian thinking in linear models, for example Bayesian linear regression. There are very few good materials available online that, taken together, give a clear motivation for and understanding of Bayesian linear regression. This was one of the major motivations for this blog, and here I will try to give an understanding of how to approach linear regression from a Bayesian analysis standpoint.


Machine Learning Exercises In Python, Part 5

#artificialintelligence

This post is part of a series covering the exercises from Andrew Ng's machine learning class on Coursera. The original code, exercise text, and data files for this post are available here. In part four we wrapped up our implementation of logistic regression by extending our solution to handle multi-class classification and testing it on the hand-written digits data set. Using just logistic regression we were able to hit a classification accuracy of about 97.5%, which is reasonably good but pretty much maxes out what we can achieve with a linear model. In this blog post we'll again tackle the hand-written digits data set, but this time using a feed-forward neural network with backpropagation.


Beware of the Simulated DAG! Varsortability in Additive Noise Models

arXiv.org Machine Learning

Additive noise models are a class of causal models in which each variable is defined as a function of its causes plus independent noise. In such models, the ordering of variables by marginal variances may be indicative of the causal order. We introduce varsortability as a measure of agreement between the ordering by marginal variance and the causal order. We show how varsortability dominates the performance of continuous structure learning algorithms on synthetic data. On real-world data, varsortability is an implausible and untestable assumption and we find no indication of high varsortability. We aim to raise awareness that varsortability easily occurs in simulated additive noise models. We provide a baseline method that explicitly exploits varsortability and advocate reporting varsortability in benchmarking data.


Learning Prediction Intervals for Regression: Generalization and Calibration

arXiv.org Machine Learning

We study the generation of prediction intervals in regression for uncertainty quantification. This task can be formalized as an empirical constrained optimization problem that minimizes the average interval width while maintaining the coverage accuracy across data. We strengthen the existing literature by studying two aspects of this empirical optimization. First is a general learning theory to characterize the optimality-feasibility tradeoff that encompasses Lipschitz continuity and VC-subgraph classes, which are exemplified in regression trees and neural networks. Second is a calibration machinery and the corresponding statistical theory to optimally select the regularization parameter that manages this tradeoff, which bypasses the overfitting issues in previous approaches in coverage attainment. We empirically demonstrate the strengths of our interval generation and calibration algorithms in terms of testing performances compared to existing benchmarks.
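The two quantities traded off in the abstract, empirical coverage and average interval width, are simple to compute for a given set of intervals. The sketch below is a plain Python illustration with made-up numbers. The function name and the data are assumptions, and it does not implement the paper's learning or calibration procedure.

```python
def coverage_and_width(lowers, uppers, ys):
    """Return (fraction of targets inside their interval, mean interval width)."""
    if not (len(lowers) == len(uppers) == len(ys)) or not ys:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(ys)
    covered = sum(1 for lo, hi, y in zip(lowers, uppers, ys) if lo <= y <= hi)
    avg_width = sum(hi - lo for lo, hi in zip(lowers, uppers)) / n
    return covered / n, avg_width

# Hypothetical intervals [lower, upper] and observed targets.
lowers = [0.0, 1.0, 2.0, 3.0]
uppers = [2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.5, 2.5, 4.0]

coverage, avg_width = coverage_and_width(lowers, uppers, ys)
print(coverage, avg_width)
```

Narrower intervals reduce the average width but tend to lower coverage, which is the tension the constrained optimization in the abstract is meant to manage.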