Predict Customer Churn with Gradient Boosting


Customer churn is a key predictor of the long-term success or failure of a business. But with so much customer data available, what's the best model to use? This post shows that gradient boosting is the most accurate way of predicting customer attrition. I'll show you how you can create your own data analysis using gradient boosting to identify and save those at-risk customers! Customer retention should be a top priority of any business, as acquiring new customers is often far more expensive than keeping existing ones.
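To give a feel for the technique, here is a minimal sketch of gradient boosting for a churn label, written from scratch with decision stumps and log loss. It is not the post's own code: the two features (`tenure_months`, `support_calls`), the toy data, and all function names are assumptions made for illustration, and a real analysis would more likely use a library implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Pick the single (feature, threshold) split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            err = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, left_mean, right_mean)
    return best[1:]

def stump_predict(stump, row):
    j, t, left_mean, right_mean = stump
    return left_mean if row[j] <= t else right_mean

def fit_gradient_boosting(X, y, n_rounds=30, learning_rate=0.3):
    # Start from the log-odds of the overall churn rate.
    rate = sum(y) / len(y)
    base_score = math.log(rate / (1.0 - rate))
    scores = [base_score] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of log loss with respect to the score.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + learning_rate * stump_predict(stump, row)
                  for s, row in zip(scores, X)]
    return base_score, stumps, learning_rate

def predict_churn_probability(model, row):
    base_score, stumps, learning_rate = model
    score = base_score + learning_rate * sum(stump_predict(s, row) for s in stumps)
    return sigmoid(score)

# Hypothetical toy data: [tenure_months, support_calls] -> churned (1) or stayed (0).
X = [[1, 5], [2, 4], [3, 6], [5, 5], [24, 1], [36, 0], [48, 2], [60, 1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

model = fit_gradient_boosting(X, y)
at_risk = predict_churn_probability(model, [2, 6])   # new, unhappy customer
loyal = predict_churn_probability(model, [40, 1])    # long-standing customer
```

Each round fits a stump to the errors left by the previous rounds and adds a small fraction of it to the running score, which is the core idea behind boosting.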

Accurate and Diverse Sampling of Sequences based on a "Best of Many" Sample Objective Machine Learning

For autonomous agents to successfully operate in the real world, anticipation of future events and states of their environment is a key competence. This problem has been formalized as a sequence extrapolation problem, where a number of observations are used to predict the sequence into the future. Real-world scenarios demand a model of uncertainty of such predictions, as predictions become increasingly uncertain, in particular over long time horizons. While impressive results have been shown on point estimates, scenarios that induce multi-modal distributions over future sequences remain challenging. Our work addresses these challenges in a Gaussian Latent Variable model for sequence prediction. Our core contribution is a "Best of Many" sample objective that leads to more accurate and more diverse predictions that better capture the true variations in real-world sequence data. Beyond our analysis of improved model fit, our models also empirically outperform prior work on three diverse tasks ranging from traffic scenes to weather data.
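One way to read the "Best of Many" idea is that, out of several sampled futures, only the sample closest to the ground truth is penalised, rather than the average over all samples. The sketch below illustrates that reading on the reconstruction term alone. The squared-error measure, the function names, and the toy sequences are assumptions, and any regularisation term on the latent variable is left out.

```python
def squared_error(prediction, target):
    """Sum of squared differences between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target))

def mean_sample_loss(samples, target):
    """Average error over all sampled futures (the usual sample-average objective)."""
    return sum(squared_error(s, target) for s in samples) / len(samples)

def best_of_many_loss(samples, target):
    """Error of the single best sample; the other samples are not penalised."""
    return min(squared_error(s, target) for s in samples)

# Hypothetical bimodal case: the agent may turn left or right, and it turned left.
target = [-1.0, -2.0]
samples = [[-1.0, -2.0],   # a "left" sample that matches the observed future
           [1.0, 2.0]]     # a "right" sample, plausible but not what happened

average_loss = mean_sample_loss(samples, target)
best_loss = best_of_many_loss(samples, target)
```

Under the average, the plausible "right" sample is punished for not matching the observed "left" future, which pushes samples toward a single mode. Taking the best sample leaves the remaining samples free to cover other modes.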

Introducing Inbox Samples: saving your data for future training samples MonkeyLearn Blog


Today we're launching Inbox Samples, an exciting new feature that will make it much easier to improve the machine learning models built on our platform. Now, whenever you send a new text to be analyzed by MonkeyLearn (via our API, integrations or user interface), the system will save your data within the Inbox of your module. Later on, you can use the texts in your Inbox as new training samples and improve your models over time. Training samples saved in the inbox of a classifier. You can see the category predicted by the model by clicking on Categories and selecting Unassigned Samples.

High-low level support vector regression prediction approach (HL-SVR) for data modeling with input parameters of unequal sample sizes Machine Learning

Support vector regression (SVR) has been widely used to reduce the high computational cost of computer simulation. SVR assumes the input parameters have equal sample sizes, but unequal sample sizes are often encountered in engineering practice. To solve this issue, this paper proposes a new SVR-based prediction approach, the high-low-level SVR approach (HL-SVR), for data modeling with input parameters of unequal sample sizes. The proposed approach consists of low-level SVR models for the input parameters of larger sample sizes and a high-level SVR model for the input parameters of smaller sample sizes. For each training point of the input parameters of smaller sample sizes, one low-level SVR model is built based on its corresponding input parameters of larger sample sizes and their responses of interest. The high-level SVR model is built based on the responses obtained from the low-level SVR models and the input parameters of smaller sample sizes. Several numerical examples are used to validate the performance of HL-SVR. The experimental results indicate that HL-SVR can produce more accurate predictions than conventional SVR. The proposed approach is applied to the stress analysis of a dental implant, in which the structural parameters have massive samples but the implant material can only be selected from a handful of options, namely Ti and its alloys. The prediction performance of the proposed approach is much better than that of conventional SVR. The proposed approach can be used for the design, optimization and analysis of engineering systems with input parameters of unequal sample sizes.
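The two-level structure described above can be sketched as follows. To keep the example dependency-free, an ordinary least-squares line is swapped in for each SVR model, so this shows only the high-low-level wiring, not SVR itself. The function names, the single large-sample input, the single small-sample input, and the toy response are all assumptions.

```python
def fit_line(xs, ys):
    """One-dimensional least squares; a stand-in for an SVR model in this sketch."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return lambda x: mean_y + slope * (x - mean_x)

def fit_low_level(groups):
    """Build one low-level model per training value of the small-sample input.

    groups maps each small-sample value to (large_inputs, responses).
    """
    return {small: fit_line(xs, ys) for small, (xs, ys) in groups.items()}

def predict_high_low(low_models, x_large, x_small):
    """Query every low-level model at x_large, then fit the high-level model
    on (small-sample value, low-level response) pairs and evaluate it at x_small."""
    small_values = sorted(low_models)
    responses = [low_models[s](x_large) for s in small_values]
    high_model = fit_line(small_values, responses)
    return high_model(x_small)

# Hypothetical data: many samples of x_large, only three values of x_small,
# and a made-up response 2 * x_large + 3 * x_small.
large_inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
groups = {
    small: (large_inputs, [2.0 * x + 3.0 * small for x in large_inputs])
    for small in (1.0, 2.0, 3.0)
}

low_models = fit_low_level(groups)
prediction = predict_high_low(low_models, x_large=2.5, x_small=1.5)
```

In this reading the high-level model is refit for each query, because its training responses depend on where the low-level models are evaluated.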

Deep pNML: Predictive Normalized Maximum Likelihood for Deep Neural Networks Machine Learning

The Predictive Normalized Maximum Likelihood (pNML) scheme has been recently suggested for universal learning in the individual setting, where both the training and test samples are individual data. The goal of universal learning is to compete with a "genie" or reference learner that knows the data values, but is restricted to use a learner from a given model class. The pNML minimizes the associated regret for any possible value of the unknown label. Furthermore, its min-max regret can serve as a pointwise measure of learnability for the specific training and data sample. In this work we examine the pNML and its associated learnability measure for the Deep Neural Network (DNN) model class. As shown, the pNML outperforms the commonly used Empirical Risk Minimization (ERM) approach and provides robustness against adversarial attacks. Together with its learnability measure it can detect out-of-distribution test examples, be tolerant to noisy labels and serve as a confidence measure for the ERM. Finally, we extend the pNML to a "twice universal" solution that provides universality for model class selection and generates a learner competing with the best one from all model classes.
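The pNML procedure can be illustrated on a much simpler model class than a DNN. The sketch below assumes a lookup-table learner whose ERM solution is the empirical label frequency for each input value. For every candidate label it refits the learner with the test point included, records the probability that the refit learner gives to that label, and normalises over labels. The log of the normaliser is the regret. The function names and toy data are assumptions.

```python
import math
from collections import Counter

def fit_table(data):
    """ERM for a lookup-table model: p(label | x) is the empirical frequency."""
    pair_counts = Counter(data)                 # (x, label) -> count
    input_counts = Counter(x for x, _ in data)  # x -> count
    return lambda x, label: pair_counts[(x, label)] / input_counts[x]

def pnml(train, x_test, labels):
    """Return the pNML prediction for x_test and its regret."""
    genie_probs = {}
    for label in labels:
        # The "genie" learner sees the test point together with a candidate label.
        model = fit_table(train + [(x_test, label)])
        genie_probs[label] = model(x_test, label)
    normalizer = sum(genie_probs.values())
    prediction = {label: p / normalizer for label, p in genie_probs.items()}
    regret = math.log(normalizer)
    return prediction, regret

# Hypothetical training set of (input, label) pairs.
train = [("a", 1), ("a", 1), ("a", 0), ("b", 0)]

seen_prediction, seen_regret = pnml(train, "a", labels=[0, 1])
unseen_prediction, unseen_regret = pnml(train, "c", labels=[0, 1])
```

For an input that never appeared in training, every candidate label can be fit perfectly, so the regret is at its largest. This is the sense in which the regret can act as a pointwise learnability or confidence signal.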