 Regression


Linear Regression Vs Logistic Regression

#artificialintelligence

Logistic regression belongs to the supervised learning category; it measures the relationship between a categorical dependent variable and one or more independent variables by estimating probabilities using a logistic (sigmoid) function. In spite of the name 'logistic regression,' it is not used for regression problems, where the task is to predict a real-valued output. It is a classification method used to predict a binary outcome (1/0, -1/1, True/False) given a set of independent variables. In linear regression, you predict a real-valued output y based on a weighted sum of the input variables x1, x2, …, xn: y = c + w1x1 + w2x2 + … + wnxn. The aim of linear regression is to estimate values for the model coefficients c, w1, w2, w3, …, wn that fit the training data with minimum error, and then to use them to predict the output y.
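The contrast between the two models can be sketched in a few lines of Python. This is a minimal illustration, not a fitting procedure: the coefficient values, the input vector, and the 0.5 decision threshold are assumptions chosen for the example, and the function names are hypothetical.

```python
import math

def linear_predict(c, weights, x):
    """Linear regression output: intercept c plus the weighted sum of inputs."""
    return c + sum(w * xi for w, xi in zip(weights, x))

def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logistic_predict_proba(c, weights, x):
    """Logistic regression: squash the same weighted sum into a probability."""
    return sigmoid(linear_predict(c, weights, x))

def logistic_predict(c, weights, x, threshold=0.5):
    """Binary class label; the 0.5 threshold is an assumed default."""
    return 1 if logistic_predict_proba(c, weights, x) >= threshold else 0

# Illustrative (assumed) coefficients and a single input example.
c, weights = -1.0, [2.0, 0.5]
x = [1.0, 2.0]

y_linear = linear_predict(c, weights, x)       # real-valued output
p_class1 = logistic_predict_proba(c, weights, x)  # probability in (0, 1)
label = logistic_predict(c, weights, x)        # binary outcome
```

Both models share the weighted sum; the only difference in this sketch is that logistic regression passes it through the sigmoid and thresholds the result.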


57 Best Machine Learning Course Online & Tutorial Digital Learning Land

#artificialintelligence

Data visualization: In this section, you will learn how to create simple plots such as scatter plots, histograms, and bar charts. Data manipulation: You will learn about data manipulation in detail. GUI Programming: This section is a combination of live instructor-led training and self-paced learning. Developing web maps and representing information using plots: In this section, you will understand how to design Python applications. Computer vision using OpenCV and visualization using Bokeh: You will also learn to design Python applications in this section.


LMLFM: Longitudinal Multi-Level Factorization Machine

arXiv.org Machine Learning

We consider the problem of learning predictive models from longitudinal data, consisting of irregularly repeated, sparse observations from a set of individuals over time. Such data often exhibit longitudinal correlation (LC) (correlations among observations for each individual over time), cluster correlation (CC) (correlations among individuals that have similar characteristics), or both. These correlations are often accounted for using mixed effects models that include fixed effects and random effects, where the fixed effects capture the regression parameters that are shared by all individuals, whereas random effects capture those parameters that vary across individuals. However, the current state-of-the-art methods are unable to select the most predictive fixed effects and random effects from a large number of variables, while accounting for complex correlation structure in the data and non-linear interactions among the variables. We propose Longitudinal Multi-Level Factorization Machine (LMLFM), to the best of our knowledge, the first model to address these challenges in learning predictive models from longitudinal data. We establish the convergence properties, and analyze the computational complexity, of LMLFM. We present results of experiments with both simulated and real-world longitudinal data which show that LMLFM outperforms the state-of-the-art methods in terms of predictive accuracy, variable selection ability, and scalability to data with large number of variables. The code and supplemental material are available at https://github.com/junjieliang672/LMLFM.


Response Transformation and Profit Decomposition for Revenue Uplift Modeling

arXiv.org Machine Learning

Uplift models support decision-making in marketing campaign planning. Estimating the causal effect of a marketing treatment, an uplift model facilitates targeting communication to responsive customers and efficient allocation of marketing budgets. Research into uplift models focuses on conversion models to maximize incremental sales. The paper introduces uplift modeling strategies for maximizing incremental revenues. If customers differ in their spending behavior, revenue maximization is a more plausible business objective compared to maximizing conversions. The proposed methodology entails a transformation of the prediction target, customer-level revenues, that facilitates implementing a causal uplift model using standard machine learning algorithms. The distribution of campaign revenues is typically zero-inflated because of many non-buyers. Remedies to this modeling challenge are incorporated in the proposed revenue uplift strategies in the form of two-stage models. Empirical experiments using real-world e-commerce data confirm the merits of the proposed revenue uplift strategy over relevant alternatives including uplift models for conversion and recently developed causal machine learning algorithms. To quantify the degree to which improved targeting decisions raise return on marketing, the paper develops a decomposition of campaign profit. Applying the decomposition to a digital coupon targeting campaign, the paper provides evidence that revenue uplift modeling, as well as causal machine learning, can improve campaign profit substantially.


Consistent Robust Adversarial Prediction for General Multiclass Classification

arXiv.org Machine Learning

Some examples of the task are zero-one loss classification, where the predictor suffers a loss of one when making an incorrect prediction and zero otherwise, and ordinal classification (also known as ordinal regression), where the predictor suffers a loss that increases as the prediction moves away from the true label. Empirical risk minimization (ERM) (Vapnik, 1992) is a standard approach for solving general multiclass classification problems by finding the classifier that minimizes a loss metric over the training data. However, since directly minimizing this loss over training data within the ERM framework is generally NP-hard (Steinwart and Christmann, 2008), convex surrogate losses that can be efficiently optimized are employed to approximate the loss. Constructing surrogate losses for binary classification has been well studied, resulting in surrogate losses that enjoy desirable theoretical properties and good performance in practice. Among the popular examples are the logarithmic loss, which is minimized by the logistic regression classifier (McCullagh and Nelder, 1989), and the hinge loss, which is minimized by the support vector machine (SVM) (Boser et al., 1992; Cortes and Vapnik, 1995).
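The losses named above can be written down directly. The sketch below is illustrative only: the function names are hypothetical, the ordinal loss uses the absolute label difference as one assumed example of a loss that grows with distance from the true label, and the two surrogate losses use the common binary convention of labels in {-1, +1} with a real-valued score.

```python
import math

def zero_one_loss(y_true, y_pred):
    """1 for an incorrect prediction, 0 otherwise."""
    return 0 if y_true == y_pred else 1

def ordinal_absolute_loss(y_true, y_pred):
    """Assumed ordinal loss: grows as the prediction moves away from the label."""
    return abs(y_true - y_pred)

def logistic_loss(y, score):
    """Logarithmic surrogate log(1 + exp(-y * score)) for y in {-1, +1}."""
    margin = y * score
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    # Equivalent form that avoids overflow when the margin is very negative.
    return -margin + math.log1p(math.exp(margin))

def hinge_loss(y, score):
    """Hinge surrogate max(0, 1 - y * score) for y in {-1, +1}."""
    return max(0.0, 1.0 - y * score)
```

Unlike the zero-one loss, both surrogates are convex in the score, which is what makes them tractable to minimize.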


Bayesian Curiosity for Efficient Exploration in Reinforcement Learning

arXiv.org Machine Learning

Balancing exploration and exploitation is a fundamental part of reinforcement learning, yet most state-of-the-art algorithms use a naive exploration protocol like $\epsilon$-greedy. This contributes to the problem of high sample complexity, as the algorithm wastes effort by repeatedly visiting parts of the state space that have already been explored. We introduce a novel method based on Bayesian linear regression and latent space embedding to generate an intrinsic reward signal that encourages the learning agent to seek out unexplored parts of the state space. This method is computationally efficient, simple to implement, and can extend any state-of-the-art reinforcement learning algorithm. We evaluate the method on a range of algorithms and challenging control tasks, on both simulated and physical robots, demonstrating how the proposed method can significantly improve sample complexity.
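For reference, the naive $\epsilon$-greedy protocol mentioned above can be sketched as follows. This shows only the baseline exploration rule, not the proposed Bayesian curiosity method; the function name and the `q_values` list of per-action value estimates are assumptions made for illustration.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one.

    q_values: assumed list of estimated action values, one per action.
    rng: any object with random() and randrange(), e.g. random.Random(seed).
    """
    if rng.random() < epsilon:
        # Explore: uniform over all actions, regardless of what was seen before.
        return rng.randrange(len(q_values))
    # Exploit: index of the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Example call with a seeded generator for reproducibility.
action = epsilon_greedy([0.1, 0.7, 0.3], epsilon=0.1, rng=random.Random(0))
```

Because the exploratory branch ignores visit history, it can keep returning to already-explored states, which is the inefficiency the abstract points to.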


Estimation of the yield curve for Costa Rica using combinatorial optimization metaheuristics applied to nonlinear regression

arXiv.org Machine Learning

The term structure of interest rates, or yield curve, is a function relating the interest rate to its term. The nonlinear regression models of Nelson-Siegel and Svensson were used to estimate the yield curve using a sample of historical data supplied by the National Stock Exchange of Costa Rica. The optimization problem involved in estimating the model parameters is addressed using four well-known combinatorial optimization metaheuristics: ant colony optimization, genetic algorithm, particle swarm optimization, and simulated annealing. The aim of the study is to improve on the local minima obtained by a classical quasi-Newton optimization method using a descent direction. Good results are achieved with at least two metaheuristics: particle swarm optimization and simulated annealing.
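As a rough illustration of the regression problem, the sketch below evaluates a Nelson-Siegel curve and the sum of squared errors that an optimizer would minimize. It assumes one common parameterization (level `beta0`, slope `beta1`, curvature `beta2`, decay `lam` dividing the maturity); the paper may use a different one, and the function names and example values are hypothetical.

```python
import math

def nelson_siegel(tau, beta0, beta1, beta2, lam):
    """Nelson-Siegel yield at maturity tau > 0 (assumed parameterization)."""
    if tau <= 0 or lam <= 0:
        raise ValueError("tau and lam must be positive")
    x = tau / lam
    # (1 - exp(-x)) / x, computed with expm1 for accuracy at small x.
    loading = -math.expm1(-x) / x
    return beta0 + beta1 * loading + beta2 * (loading - math.exp(-x))

def sum_squared_error(params, maturities, observed_yields):
    """Objective a metaheuristic or quasi-Newton method would minimize."""
    beta0, beta1, beta2, lam = params
    return sum(
        (nelson_siegel(t, beta0, beta1, beta2, lam) - y) ** 2
        for t, y in zip(maturities, observed_yields)
    )

# Synthetic check: yields generated from known parameters give zero error.
true_params = (0.05, -0.02, 0.01, 2.0)
maturities = [0.5, 1.0, 2.0, 5.0, 10.0]
synthetic = [nelson_siegel(t, *true_params) for t in maturities]
error_at_truth = sum_squared_error(true_params, maturities, synthetic)
```

The objective is nonlinear in `lam`, so it can have several local minima, which is why the choice of optimizer matters for this kind of fit.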


TITAN: A Spatiotemporal Feature Learning Framework for Traffic Incident Duration Prediction

arXiv.org Machine Learning

Identifying critical incident stages and reasonably predicting traffic incident duration are essential in traffic incident management. In this paper, we propose a traffic incident duration prediction model that simultaneously predicts the impact of traffic incidents and identifies the critical groups of temporal features via a multi-task learning framework. First, we formulate a sparsity optimization problem that extracts low-level temporal features based on traffic speed readings and then generalizes higher-level features as phases of traffic incidents. Second, we propose novel constraints on feature similarity that exploit prior knowledge about the spatial connectivity of the road network to predict the incident duration. The proposed problem is challenging to solve due to the orthogonality constraints, non-convex objective, and non-smooth penalties. We develop an algorithm based on the alternating direction method of multipliers (ADMM) framework to solve the proposed formulation. Extensive experiments and comparisons to other models on real-world traffic data and traffic incident records justify the efficacy of our model.


Heterogeneous Deep Graph Infomax

arXiv.org Machine Learning

Graph representation learning aims to learn universal node representations that preserve both node attributes and structural information. The derived node representations can be used to serve various downstream tasks, such as node classification and node clustering. When a graph is heterogeneous, the problem becomes more challenging than the homogeneous graph node learning problem. Inspired by emerging information-theoretic learning algorithms, in this paper we propose an unsupervised graph neural network, Heterogeneous Deep Graph Infomax (HDGI), for heterogeneous graph representation learning. We use the meta-path structure to analyze the connections involving semantics in heterogeneous graphs and utilize a graph convolution module and a semantic-level attention mechanism to capture local representations. By maximizing local-global mutual information, HDGI effectively learns high-level node representations that can be utilized in downstream graph-related tasks. Experimental results show that HDGI remarkably outperforms state-of-the-art unsupervised graph representation learning methods on both classification and clustering tasks. By feeding the learned representations into a parametric model, such as logistic regression, we even achieve performance in node classification tasks comparable to that of state-of-the-art supervised end-to-end GNN models.


Predicting overweight and obesity in later life from childhood data: A review of predictive modeling approaches

arXiv.org Machine Learning

Background: Overweight and obesity are an increasing phenomenon worldwide. Reliably predicting future overweight or obesity in early childhood could enable a successful intervention by experts. While a lot of research has been done using explanatory modeling methods, the capabilities of machine learning, and predictive modeling in particular, remain largely unexplored. In predictive modeling, models are validated with previously unseen examples, giving a more accurate estimate of their performance and generalization ability in real-life scenarios. Objective: To find and review existing overweight or obesity research from the perspective of employing childhood data and predictive modeling methods. Methods: The initial phase included bibliographic searches using relevant search terms in PubMed, the IEEE database, and Google Scholar. The second phase consisted of iteratively searching the references of potential studies and recent research that cites the potential studies. Results: Eight research articles and three review articles were identified as relevant for this review. Conclusions: Prediction models with high performance either have a relatively short time period to predict, are based on late childhood data, or both. Logistic regression is currently the most often used method in forming the prediction models. In addition to the child's own weight and height information, maternal weight status or body mass index were often used as predictors in the models.