 Regression


Deep Learning Prerequisites: Logistic Regression in Python

#artificialintelligence

Udemy online course, created by Lazy Programmer Inc. Data science techniques for professionals and students: learn the theory behind logistic regression and code it in Python. This course is a lead-in to deep learning and neural networks. It covers a popular and fundamental technique used in machine learning, data science and statistics: logistic regression. We cover the theory from the ground up, including the derivation of the solution and applications to real-world problems. We show you how to code your own logistic regression module in Python. This course does not require any external materials. Everything needed (Python and some Python libraries) can be obtained for free.
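As a rough illustration of what coding your own logistic regression module might look like, here is a minimal sketch using NumPy and batch gradient descent on the mean cross-entropy loss. It is not taken from the course: the function names, the learning rate, the step count, and the synthetic two-cluster data are all assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    """Map real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_regression(X, y, learning_rate=0.1, n_steps=2000):
    """Fit weights and bias by batch gradient descent on the mean cross-entropy loss."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_steps):
        p = sigmoid(X @ w + b)           # predicted probabilities
        error = p - y                    # gradient of the loss w.r.t. the scores
        w -= learning_rate * (X.T @ error) / n_samples
        b -= learning_rate * error.mean()
    return w, b

def predict(X, w, b):
    """Classify as 1 when the predicted probability is at least 0.5."""
    return (sigmoid(X @ w + b) >= 0.5).astype(int)

# Synthetic data (an assumption for illustration): two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(-2.0, 1.0, size=(100, 2)),
    rng.normal(2.0, 1.0, size=(100, 2)),
])
y = np.concatenate([np.zeros(100), np.ones(100)])

w, b = fit_logistic_regression(X, y)
accuracy = (predict(X, w, b) == y).mean()
print(f"training accuracy: {accuracy:.3f}")
```

The update rule follows from the fact that the gradient of the cross-entropy loss with respect to the linear score is simply the prediction minus the target.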


Multivariate Adaptive Regression Splines (MARS) in Python

#artificialintelligence

Multivariate Adaptive Regression Splines, or MARS, is an algorithm for complex non-linear regression problems. The algorithm involves finding a set of simple linear functions that in aggregate result in the best predictive performance. In this way, MARS is a type of ensemble of simple linear functions and can achieve good performance on challenging regression problems with many input variables and complex non-linear relationships. In this tutorial, you will discover how to develop Multivariate Adaptive Regression Spline models in Python.
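The "simple linear functions" in MARS are hinge functions of the form max(0, x - c) and max(0, c - x), combined by least squares. The sketch below shows only that building block in NumPy: it tries a handful of candidate knots for one input variable and keeps the one with the lowest squared error. It is not the full MARS algorithm (there is no forward/backward pass and no interaction terms), and the function names and toy data are assumptions for this example.

```python
import numpy as np

def hinge_basis(x, knot):
    """Return the mirrored hinge pair max(0, x - knot) and max(0, knot - x)."""
    return np.maximum(0.0, x - knot), np.maximum(0.0, knot - x)

def fit_single_knot(x, y, knot):
    """Least-squares fit of intercept + two hinge terms at a fixed knot."""
    right, left = hinge_basis(x, knot)
    design = np.column_stack([np.ones_like(x), right, left])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    sse = float(np.sum((design @ coef - y) ** 2))
    return coef, sse

def best_knot(x, y, candidates):
    """Pick the candidate knot whose hinge fit has the lowest squared error."""
    fits = [(fit_single_knot(x, y, c), c) for c in candidates]
    (coef, sse), knot = min(fits, key=lambda item: item[0][1])
    return knot, coef, sse

# Toy data (an assumption): a noiseless piecewise-linear target with a kink at x = 1.
x = np.linspace(-3.0, 5.0, 81)
y = 2.0 + 3.0 * np.maximum(0.0, x - 1.0) + 0.5 * np.maximum(0.0, 1.0 - x)

knot, coef, sse = best_knot(x, y, np.linspace(-2.0, 4.0, 13))
print("knot:", knot, "coefficients:", np.round(coef, 3))
```

On this toy target the search recovers the kink location and the three coefficients, because the data were generated from exactly this basis.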


Learning to extrapolate using continued fractions: Predicting the critical temperature of superconductor materials

arXiv.org Artificial Intelligence

In Artificial Intelligence we often seek to identify an unknown target function of many variables $y=f(\mathbf{x})$ given a limited set of instances $S=\{(\mathbf{x^{(i)}},y^{(i)})\}$ with $\mathbf{x^{(i)}} \in D$, where $D$ is a domain of interest. We refer to $S$ as the training set, and the final quest is to identify the mathematical model that approximates this target function for new $\mathbf{x}$ drawn from a set $T=\{ \mathbf{x^{(j)}} \} \subset D$ with $T \neq S$ (i.e., testing the model's generalisation). However, for some applications, the main interest is approximating the unknown function well on a larger domain $D'$ that contains $D$. In cases involving the design of new structures, for instance, we may be interested in maximizing $f$; thus, the model derived from $S$ alone should also generalize well in $D'$ for samples with values of $y$ larger than the largest observed in $S$. In that sense, the AI system would provide important information that could guide the design process, e.g., using the learned model as a surrogate function to design new lab experiments. We introduce a method for multivariate regression based on iterative fitting of a continued fraction by incorporating additive spline models. We compared it with established methods such as AdaBoost, Kernel Ridge, Linear Regression, Lasso Lars, Linear Support Vector Regression, Multi-Layer Perceptrons, Random Forests, Stochastic Gradient Descent and XGBoost. We tested the performance on the important problem of predicting the critical temperature of superconductors based on physical-chemical characteristics.
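For orientation, a continued fraction whose terms are functions of the input has the generic shape below. This is only the general form, truncated at some depth $k$; the abstract does not spell out the exact representation, so the symbols $g_i$, $h_i$ and $k$ are assumptions introduced here, with each $g_i$ and $h_i$ standing for a simple model such as an additive spline.

```latex
f(\mathbf{x}) \approx g_0(\mathbf{x})
  + \cfrac{h_0(\mathbf{x})}{g_1(\mathbf{x})
  + \cfrac{h_1(\mathbf{x})}{g_2(\mathbf{x})
  + \cfrac{h_2(\mathbf{x})}{\ddots + \cfrac{h_{k-1}(\mathbf{x})}{g_k(\mathbf{x})}}}}
```

Under this reading, iterative fitting would mean estimating the terms one depth at a time.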


CYPUR-NN: Crop Yield Prediction Using Regression and Neural Networks

arXiv.org Artificial Intelligence

Our recent study uses historic data on paddy yield and associated conditions, including humidity, luminescence, and temperature. By incorporating regression models and neural networks (NN), one can produce highly satisfactory forecasts of paddy yield. Simulations indicate that our model can predict paddy yield with high accuracy while concurrently detecting diseases that may exist but are invisible to the human eye. Crop Yield Prediction Using Regression and Neural Networks (CYPUR-NN) is developed here as a system that will help agriculturists and farmers predict yield from a picture or by entering values via a web interface. CYPUR-NN has been tested on stock images and the experimental results are promising.


Feature space approximation for kernel-based supervised learning

arXiv.org Machine Learning

We propose a method for the approximation of high- or even infinite-dimensional feature vectors, which play an important role in supervised learning. The goal is to reduce the size of the training data, resulting in lower storage consumption and computational complexity. Furthermore, the method can be regarded as a regularization technique, which improves the generalizability of learned target functions. We demonstrate significant improvements in comparison to the computation of data-driven predictions involving the full training data set. The method is applied to classification and regression problems from different application areas such as image recognition, system identification, and oceanographic time series analysis.


Machine Learning & Linear Regression

#artificialintelligence

This course is aimed at beginner Python developers who want to kickstart their journey in machine learning. In this course, we use a linear regression model from the scikit-learn library in Python to predict the total number of positive COVID-19 cases in a particular state in India. After completing this course, you'll be able to:


Complete Linear Regression Analysis in Python

#artificialintelligence

In this section we will learn what machine learning means and what the different terms associated with machine learning mean. You will see some examples so that you understand what machine learning actually is. The section also covers the steps involved in building a machine learning model, not just linear models but any machine learning model.


Reinforced optimal control

arXiv.org Machine Learning

Least squares Monte Carlo methods are a popular class of numerical approximation methods for solving stochastic control problems. Based on dynamic programming, their key feature is the approximation of the conditional expectation of future rewards by linear least squares regression. Hence, the choice of basis functions is crucial for the accuracy of the method. Earlier work by some of us [Belomestny, Schoenmakers, Spokoiny, Zharkynbay. Commun. Math. Sci., 18(1):109-121, 2020] proposes to "reinforce" the basis functions in the case of optimal stopping problems by already computed value functions for later times, thereby considerably improving the accuracy with limited additional computational cost. We extend the reinforced regression method to a general class of stochastic control problems, while considerably improving the method's efficiency, as demonstrated by substantial numerical examples as well as theoretical analysis.
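The core regression step described above, approximating a conditional expectation of a future reward by linear least squares on basis functions of the current state, can be sketched in a few lines of NumPy. This is a single, generic regression step under stated assumptions (a one-dimensional Gaussian random walk, a quadratic reward, and a polynomial basis), not the reinforced method of the paper; all names and parameters are hypothetical.

```python
import numpy as np

def polynomial_basis(x, degree=2):
    """Basis functions 1, x, x^2, ... evaluated at the current state."""
    return np.column_stack([x ** k for k in range(degree + 1)])

def regress_conditional_expectation(x_now, future_reward, degree=2):
    """Least-squares coefficients approximating E[future_reward | x_now]."""
    design = polynomial_basis(x_now, degree)
    coef, *_ = np.linalg.lstsq(design, future_reward, rcond=None)
    return coef

# Simulated paths (assumptions): x_next = x_now + sigma * noise, reward = x_next^2.
rng = np.random.default_rng(42)
n_paths = 200_000
sigma = 0.5
x_now = rng.normal(0.0, 1.0, size=n_paths)
x_next = x_now + sigma * rng.normal(size=n_paths)
reward = x_next ** 2

coef = regress_conditional_expectation(x_now, reward)
# In this toy model E[x_next^2 | x_now] = x_now^2 + sigma^2,
# so the fitted coefficients should be close to [sigma^2, 0, 1].
print(np.round(coef, 3))
```

Here the quadratic basis happens to contain the true conditional expectation, which is why the fit is accurate. With a poorly chosen basis the approximation degrades, which is the sensitivity the abstract refers to.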


Classification supporting COVID-19 diagnostics based on patient survey data

arXiv.org Artificial Intelligence

Distinguishing COVID-19 from other flu-like illnesses can be difficult because the symptoms are ambiguous and doctors' experience with the disease is still limited. At the same time, it is crucial to filter out those sick patients who do not need to be tested for SARS-CoV-2 infection, especially when the number of cases rises overwhelmingly. As part of the presented research, logistic regression and XGBoost classifiers that allow for effective screening of patients for COVID-19 were built. Each of the methods was tuned to achieve an assumed acceptable threshold of negative predictive value during classification. Additionally, an explanation of the obtained classification models is presented, which enables users to understand the basis of the decisions made by the model. The obtained classification models provided the basis for the DECODE service (decode.polsl.pl), which can serve as support in screening patients with COVID-19 disease. Moreover, the data set constituting the basis for the analyses performed is made available to the research community. This data set, consisting of more than 3,000 examples, is based on questionnaires collected at a hospital in Poland.
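One plausible way to tune a classifier toward a target negative predictive value (NPV) is to scan decision thresholds on its predicted scores and keep the largest threshold that still meets the target, so that as many patients as possible are screened out. The abstract does not say how the tuning was done, so the procedure, the function names, and the made-up scores below are assumptions for illustration only.

```python
def negative_predictive_value(scores, labels, threshold):
    """NPV = TN / (TN + FN) over samples predicted negative (score < threshold).

    Returns None when no sample is predicted negative.
    """
    predicted_negative = [label for score, label in zip(scores, labels) if score < threshold]
    if not predicted_negative:
        return None
    true_negatives = sum(1 for label in predicted_negative if label == 0)
    return true_negatives / len(predicted_negative)

def pick_threshold(scores, labels, target_npv):
    """Return the largest observed score usable as a threshold with NPV >= target.

    NPV is not monotone in the threshold, so every candidate is checked.
    Returns None when no candidate reaches the target.
    """
    best = None
    for threshold in sorted(set(scores)):
        npv = negative_predictive_value(scores, labels, threshold)
        if npv is not None and npv >= target_npv:
            best = threshold
    return best

# Made-up validation scores and labels (1 = COVID-19 positive), purely illustrative.
scores = [0.05, 0.10, 0.20, 0.30, 0.45, 0.60, 0.70, 0.85, 0.90, 0.95]
labels = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

strict = pick_threshold(scores, labels, target_npv=0.95)
relaxed = pick_threshold(scores, labels, target_npv=0.80)
print("threshold for NPV >= 0.95:", strict)
print("threshold for NPV >= 0.80:", relaxed)
```

A stricter NPV target yields a lower threshold, so fewer patients are classified as not needing a test. In practice the threshold would be chosen on held-out data rather than on the training set.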


Mini-DDSM: Mammography-based Automatic Age Estimation

arXiv.org Artificial Intelligence

Age estimation has attracted attention for its various medical applications. There are many studies on human age estimation from biomedical images. However, as far as we know, no research has been done on age estimation from mammograms. The purpose of this study is to devise an AI-based model for estimating age from mammogram images. Due to the lack of public mammography data sets that have the age attribute, we resort to using a web crawler to download thumbnail mammographic images and their age fields from a public data set, the Digital Database for Screening Mammography. Unfortunately, the original images in this data set can only be retrieved with software that is broken. Subsequently, we extracted deep learning features from the collected data set and used them to build a Random Forests regressor that estimates age automatically. Performance was assessed using the mean absolute error. The average error over 10 tests on random selections of samples was around 8 years. In this paper, we show the merits of this approach for filling in missing age values. We ran logistic and linear regression models on another independent data set to further validate the advantage of our proposed work. This paper also introduces the free-access Mini-DDSM data set.