 Regression


Logistic Regression Algorithm in Java

#artificialintelligence

Regression analysis is a technique used to determine the relationship between a dependent variable and one or more independent variables for prediction purposes. It is a good tool for data modelling and analysis. There are different regression techniques. Our focus will be on Logistic Regression. Logistic Regression is suitable when there is more than one independent variable in a dataset.
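As a rough illustration of the technique named in this snippet, the following is a minimal sketch of logistic regression with several independent variables, fitted by plain batch gradient descent. It is written in Python rather than Java, and the function names (`sigmoid`, `fit_logistic`, `predict_proba`), the learning rate, the epoch count, and the toy data are all assumptions made for this example, not taken from the linked article.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Estimated probability that the outcome is 1 for feature vector x."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights w and intercept b by batch gradient descent on the log-loss.

    lr and epochs are illustrative defaults, not tuned values.
    """
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # gradient of log-loss w.r.t. the linear score
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy data (made up for illustration): two independent variables, binary outcome.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 2], [3, 3]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
print(predict_proba(w, b, [3, 3]), predict_proba(w, b, [0, 0]))
```

The model outputs a probability rather than a class label; a label is obtained only after choosing a cut-off such as 0.5.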


Introduction to Linear Regression

#artificialintelligence

Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data. If the observed data covers the complete population, the regression is a population regression. In this case, the determined values of the coefficients are exactly those describing the regression model for the given population. However, if the data is only a sample, it is called a sample regression. The determined values of the regression coefficients then describe a model that is only more or less "representative" of the population, and thus provide just a more or less uncertain estimate.
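To make "fitting a linear equation to observed data" concrete, here is a minimal sketch of an ordinary least-squares fit for one independent variable, using the standard closed-form formulas for slope and intercept. The function name `fit_line` and the example values are assumptions chosen for illustration.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Illustrative data lying exactly on y = 1 + 2x.
intercept, slope = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(intercept, slope)
```

Applied to a sample rather than the whole population, the same formulas yield estimates of the coefficients, and a different sample would generally give slightly different values.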


Top 6 Machine Learning Algorithms for Classification

#artificialintelligence

The easiest way to distinguish supervised learning from unsupervised learning is to see whether the data is labelled or not. Supervised learning learns a function that predicts a defined label from the input data. The task can be either classifying data into a category (a classification problem) or forecasting a numeric outcome (a regression problem). Reinforcement learning is another type of machine learning, in which agents learn to take actions based on their interaction with the environment, with the aim of maximizing rewards. It is the most similar to the human learning process, following a trial-and-error approach.


Multivariate Analysis using SAS

#artificialintelligence

As a first method, it gives you a good idea about the level of multicollinearity involved and how much the two specified sets relate to themselves and each other. Do not forget -- CCA is mainly used for prediction, not interpretation. And the last of the trio is the Discriminant Function Analysis (DFA) which is used to answer the question: Can a combination of variables be used to predict group membership? Because, if a set of variables predicts group membership, it is also connected to that group. DFA is a dimension-reduction technique related to Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA).


A Class of Geometric Structures in Transfer Learning: Minimax Bounds and Optimality

arXiv.org Machine Learning

We study the problem of transfer learning, observing that previous efforts to understand its information-theoretic limits do not fully exploit the geometric structure of the source and target domains. In contrast, our study first illustrates the benefits of incorporating a natural geometric structure within a linear regression model, which corresponds to the generalized eigenvalue problem formed by the Gram matrices of both domains. We next establish a finite-sample minimax lower bound, propose a refined model interpolation estimator that enjoys a matching upper bound, and then extend our framework to multiple source domains and generalized linear models. Surprisingly, as long as information is available on the distance between the source and target parameters, negative-transfer does not occur. Simulation studies show that our proposed interpolation estimator outperforms state-of-the-art transfer learning methods in both moderate- and high-dimensional settings.


Deep Learning Prerequisites: Linear Regression in Python

#artificialintelligence

Deep Learning Prerequisites: Linear Regression in Python. Data science: learn linear regression from scratch and build your own working program in Python for data analysis. Created by Lazy Programmer Inc.


The harm of class imbalance corrections for risk prediction models: illustration and simulation using logistic regression

#artificialintelligence

Methods to correct class imbalance, i.e. imbalance between the frequency of outcome events and non-events, are receiving increasing interest for developing prediction models. We examined the effect of imbalance correction on the performance of standard and penalized (ridge) logistic regression models in terms of discrimination, calibration, and classification. We examined random undersampling, random oversampling and SMOTE using Monte Carlo simulations and a case study on ovarian cancer diagnosis. The results indicated that all imbalance correction methods led to poor calibration (strong overestimation of the probability to belong to the minority class), but not to better discrimination in terms of the area under the receiver operating characteristic curve. Imbalance correction improved classification in terms of sensitivity and specificity, but similar results were obtained by shifting the probability threshold instead. Our study shows that outcome imbalance is not a problem in itself, and that imbalance correction may even worsen model performance.
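The point about shifting the probability threshold can be illustrated with a small sketch that computes sensitivity and specificity at two cut-offs. The function name `sensitivity_specificity`, the predicted probabilities, and the labels below are made up for this example and are not results from the study.

```python
def sensitivity_specificity(probs, labels, threshold):
    """Classify as positive when prob >= threshold; return (sensitivity, specificity)."""
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= threshold)
    fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p < threshold)
    tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < threshold)
    fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted risks from a model trained on imbalanced data
# (2 events among 10 cases); these numbers are illustrative only.
probs = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.08, 0.12, 0.25, 0.45]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

print(sensitivity_specificity(probs, labels, 0.5))  # default cut-off
print(sensitivity_specificity(probs, labels, 0.2))  # cut-off near the event rate
```

In this toy case, lowering the cut-off trades specificity for sensitivity without retraining or resampling, and the predicted probabilities themselves are left unchanged.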




Generalized Bayesian Additive Regression Trees Models: Beyond Conditional Conjugacy

arXiv.org Machine Learning

Bayesian additive regression trees have seen increased interest in recent years due to their ability to combine machine learning techniques with principled uncertainty quantification. The Bayesian backfitting algorithm used to fit BART models, however, limits their application to a small class of models for which conditional conjugacy exists. In this article, we greatly expand the domain of applicability of BART to arbitrary generalized BART models by introducing a very simple, tuning-parameter-free, reversible jump Markov chain Monte Carlo algorithm. Our algorithm requires only that the user be able to compute the likelihood and (optionally) its gradient and Fisher information. The potential applications are very broad; we consider examples in survival analysis, structured heteroskedastic regression, and gamma shape regression.