 Regression


How to start career in Data Science and Machine Learning

#artificialintelligence

It does not matter how much experience you have; anybody can start in or switch to data science and machine learning. The only important thing is how eager you are for it and what it means to you. If you are keen to work in this field, nobody can stop you. There might be some short-term hurdles, but if you are focused enough and know where you want to see yourself after a certain number of years, you will be able to overcome them.


Machine learning-based dynamic mortality prediction after traumatic brain injury

#artificialintelligence

Our aim was to create simple and largely scalable machine learning-based algorithms that could predict mortality in a real-time fashion during intensive care after traumatic brain injury (TBI). We performed an observational multicenter study including adult TBI patients who were monitored for intracranial pressure (ICP) for at least 24 h in three ICUs. We used machine learning-based logistic regression modeling to create two algorithms (based on ICP, mean arterial pressure [MAP], cerebral perfusion pressure [CPP] and Glasgow Coma Scale [GCS]) to predict 30-day mortality. We used a stratified cross-validation technique for internal validation. Of 472 included patients, 92 patients (19%) died within 30 days.
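
As a rough illustration of the stratified cross-validation mentioned above, the following sketch splits a label vector into folds that each keep roughly the overall 19% mortality rate. The function name `stratified_k_fold`, the choice of five folds, and the toy label vector are assumptions made for this example; the abstract does not state how many folds were used or how the splits were built.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs whose class ratios mirror the full set."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a near-equal share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Toy labels matching the reported counts: 92 deaths among 472 patients.
# k=5 is an assumed value, not taken from the study.
labels = [1] * 92 + [0] * 380
splits = list(stratified_k_fold(labels, k=5))
deaths_per_fold = [sum(labels[i] for i in test) for _, test in splits]
```

Each test fold here ends up with 18 or 19 deaths, so a model evaluated on any fold sees about the same class balance as the full cohort.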



Top 10 Machine Learning Algorithms for Beginners Machine Learning Tutorial [Data Science]

#artificialintelligence

This Machine Learning Algorithms Tutorial video by Learnaholic India explains what machine learning is, the various machine learning problems and the algorithms used to solve them, and the key algorithms with simple examples. The algorithms discussed in detail are Linear Regression, Logistic Regression, Decision Tree, Random Forest and KNN. The video covers: 1) Types of Machine Learning Algorithms (00:25) 2) Supervised Learning Algorithms (00:30) 3) Unsupervised Learning Algorithms (1:59) 4) Reinforcement Learning Algorithms (3:38) 5) Top 10 Machine Learning Algorithms for Beginners (4:33). Towards the end, you will learn how to prepare a data set for model creation and validation and how to create a model using any machine learning algorithm.


Qini-based Uplift Regression

arXiv.org Machine Learning

This article proposes a methodology that identifies characteristics associated with a home insurance policy that can be used to infer the link between marketing intervention and policy renewal rate. Using the resulting statistical model, the goal is to predict which customers the company should focus on when deploying future retention campaigns. A subscription-based company loses customers when they stop doing business with it. Also known as customer attrition, customer churn can be a drag on business growth. It is less expensive to retain existing customers than to acquire new ones, so businesses put effort into marketing strategies to reduce customer attrition. Customer loyalty, on the other hand, is usually more profitable because the company has already earned the trust and loyalty of existing customers. Businesses mostly have a defined strategy for fighting customer churn over a period of time. Organizations are able to determine their success rate in customer loyalty and identify improvement strategies using available data and learning about churn.


Reading The Markets -- Machine Learning Versus The Financial News

#artificialintelligence

Suffice it to say that neural networks are a form of non-linear regression tool whose underlying design found inspiration in a simplification of the basic architecture of the human brain. Many of the great advances that we have experienced in Machine Learning over the last few years make use of neural networks. The basic algorithm has been around for decades -- but it has come into its own as processing power and data availability have steadily increased. For this project we implemented our neural network in Python using the popular TensorFlow library from Google. The characteristics of our neural network, and in particular its complexity, were chosen to balance precision and generalization.


Adaptive Estimation of Multivariate Piecewise Polynomials and Bounded Variation Functions by Optimal Decision Trees

arXiv.org Machine Learning

Proposed by Donoho (1997), Dyadic CART is a nonparametric regression method which computes a globally optimal dyadic decision tree and fits piecewise constant functions. In this article we define and study Dyadic CART and a closely related estimator, namely Optimal Regression Tree (ORT), in the context of estimating piecewise smooth functions in general dimensions. More precisely, these optimal decision tree estimators fit piecewise polynomials of any given degree. Like Dyadic CART in two dimensions, we reason that these estimators can also be computed in polynomial time in the sample size via dynamic programming. We prove oracle inequalities for the finite sample risk of Dyadic CART and ORT which imply tight risk bounds for several function classes of interest. Firstly, they imply that the finite sample risk of ORT of order $r \geq 0$ is always bounded by $C k \frac{\log N}{N}$ ($N$ is the sample size) whenever the regression function is piecewise polynomial of degree $r$ on some reasonably regular axis aligned rectangular partition of the domain with at most $k$ rectangles. Beyond the univariate case, such guarantees are scarcely available in the literature for computationally efficient estimators. Secondly, our oracle inequalities uncover optimality and adaptivity of the Dyadic CART estimator for function spaces with bounded variation. We consider two function spaces of recent interest where multivariate total variation denoising and univariate trend filtering are the state of the art methods. We show that Dyadic CART enjoys certain advantages over these estimators while still maintaining all their known guarantees.
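
To make the dynamic-programming idea concrete, here is a minimal one-dimensional sketch of a Dyadic CART style fit with piecewise constant pieces. It is not the estimator analyzed in the paper, which works in general dimensions and with polynomials of any degree. The function name `dyadic_cart_1d`, the penalty parameter `lam`, and the requirement that the sample size be a power of two are assumptions made for this illustration.

```python
def dyadic_cart_1d(y, lam):
    """Penalized least-squares fit over recursive dyadic partitions of y.

    Minimizes (sum of squared errors) + lam * (number of pieces).
    Assumes len(y) is a power of two. Returns (fitted values, cost).
    """
    n = len(y)
    if n == 0 or n & (n - 1):
        raise ValueError("len(y) must be a power of two")

    # Prefix sums give the squared error of any interval in O(1).
    s1, s2 = [0.0], [0.0]
    for v in y:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def solve(lo, hi):
        # Cost of fitting a single constant (the mean) on [lo, hi).
        total = s1[hi] - s1[lo]
        mean = total / (hi - lo)
        sse = (s2[hi] - s2[lo]) - total * mean
        keep_cost = sse + lam
        if hi - lo == 1:
            return keep_cost, [mean]
        # Otherwise compare against the best fits of the two dyadic halves.
        mid = (lo + hi) // 2
        left_cost, left_fit = solve(lo, mid)
        right_cost, right_fit = solve(mid, hi)
        if left_cost + right_cost < keep_cost:
            return left_cost + right_cost, left_fit + right_fit
        return keep_cost, [mean] * (hi - lo)

    cost, fit = solve(0, n)
    return fit, cost


# Toy signal with one jump at the midpoint.
y = [1.0, 1.1, 0.9, 1.0, 5.0, 5.2, 4.8, 5.0]
fit, cost = dyadic_cart_1d(y, lam=0.5)
```

Each dyadic interval is visited once, and the optimal tree for an interval is assembled from the optimal trees of its two halves. For the toy signal, the fit keeps one constant piece on each half because splitting further would cost more in penalty than it saves in squared error.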


Recursive Prediction of Graph Signals with Incoming Nodes

arXiv.org Machine Learning

Kernel and linear regression have been recently explored in the prediction of graph signals as the output, given arbitrary input signals that are agnostic to the graph. In many real-world problems, the graph expands over time as new nodes get introduced. Keeping this premise in mind, we propose a method to recursively obtain the optimal prediction or regression coefficients for the recently proposed Linear Regression over Graphs (LRG), as the graph expands with incoming nodes. This comes as a natural consequence of the structure of the regression problem, and obviates the need to solve a new regression problem each time a new node is added. Experiments with real-world graph signals show that our approach results in good prediction performance which tends to be close to that obtained from knowing the entire graph a priori.


High Dimensional M-Estimation with Missing Outcomes: A Semi-Parametric Framework

arXiv.org Machine Learning

We consider high dimensional $M$-estimation in settings where the response $Y$ is possibly missing at random and the covariates $\mathbf{X} \in \mathbb{R}^p$ can be high dimensional compared to the sample size $n$. The parameter of interest $\boldsymbol{\theta}_0 \in \mathbb{R}^d$ is defined as the minimizer of the risk of a convex loss, under a fully non-parametric model, and $\boldsymbol{\theta}_0$ itself is high dimensional which is a key distinction from existing works. Standard high dimensional regression and series estimation with possibly misspecified models and missing $Y$ are included as special cases, as well as their counterparts in causal inference using 'potential outcomes'. Assuming $\boldsymbol{\theta}_0$ is $s$-sparse ($s \ll n$), we propose an $L_1$-regularized debiased and doubly robust (DDR) estimator of $\boldsymbol{\theta}_0$ based on a high dimensional adaptation of the traditional double robust (DR) estimator's construction. Under mild tail assumptions and arbitrarily chosen (working) models for the propensity score (PS) and the outcome regression (OR) estimators, satisfying only some high-level conditions, we establish finite sample performance bounds for the DDR estimator showing its (optimal) $L_2$ error rate to be $\sqrt{s (\log d)/ n}$ when both models are correct, and its consistency and DR properties when only one of them is correct. Further, when both the models are correct, we propose a desparsified version of our DDR estimator that satisfies an asymptotic linear expansion and facilitates inference on low dimensional components of $\boldsymbol{\theta}_0$. Finally, we discuss various choices of high dimensional parametric/semi-parametric working models for the PS and OR estimators. All results are validated via detailed simulations.


Cognitive Assessment Estimation from Behavioral Responses in Emotional Faces Evaluation Task -- AI Regression Approach for Dementia Onset Prediction in Aging Societies

arXiv.org Machine Learning

We present a practical health-themed machine learning (ML) application in the 'AI for social good' domain for the 'Producing Good Outcomes' track. In particular, the solution addresses the problem of predicting potential dementia onset in elderly adults in aging societies. The paper discusses our approach and the encouraging preliminary results of behavioral response analysis in a working memory-based emotional evaluation experiment. We focus on the development of digital biomarkers for detecting and monitoring dementia progress. We present a behavioral data collection concept for a subsequent AI-based application, together with a range of encouraging regression results for Montreal Cognitive Assessment (MoCA) scores in a leave-one-subject-out cross-validation setup. The regressor input variables include the experimental subjects' emotional valence and arousal recognition responses and reaction times, together with self-reported education levels and ages, obtained from a group of twenty older adults taking part in the reported data collection project. The presented results showcase the potential social benefits of artificial intelligence applications for the elderly and are a step toward ML approaches that apply simple, objective behavioral testing to dementia onset diagnostics, replacing the subjective MoCA.
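
The leave-one-subject-out setup mentioned above can be sketched as follows: every fold holds out all trials from one subject, so the regressor is always scored on a person it has never seen. The helper name `leave_one_subject_out` and the layout of three trials per subject are hypothetical; only the group size of twenty comes from the abstract.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical layout: 20 subjects with 3 trials each.
subject_ids = [subject for subject in range(20) for _ in range(3)]
folds = list(leave_one_subject_out(subject_ids))
```

Splitting by subject rather than by trial avoids leaking one person's response pattern from the training set into the test set.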