 training and test error


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

Note: References present in the paper are referred to by their numerical citations as used in the paper. Summary of Paper: The paper seeks to establish a connection between algorithmic stability and generalization performance. Notions of algorithmic stability have been proposed before and linked to the generalization performance of learning algorithms [6,11,13,14], and have also been shown to be crucial for learnability [14]. The paper first establishes that, for Vapnik's general setting of learning, a probabilistic notion of stability is necessary and sufficient for the training losses to converge to the test losses uniformly over all distributions. It then discusses how this notion of stability can be interpreted to give results in terms of the capacity of the function class or the size of the population.
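To make the idea of stability concrete, here is a minimal sketch that estimates a replace-one stability score for ridge regression: how much the per-point loss on held-out data changes when a single training example is swapped for a fresh draw. This is one common textbook notion, not necessarily the probabilistic notion the paper uses; the function names, the ridge learner, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Ridge regression weights: (X^T X + lam I)^{-1} X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def squared_loss(w, X, y):
    """Per-example squared loss of the linear predictor w."""
    return (X @ w - y) ** 2

def replace_one_stability(X, y, X_new, y_new, X_eval, y_eval, lam):
    """Average absolute change in held-out loss when training example i
    is replaced by the fresh example (X_new[i], y_new[i]), averaged over i."""
    base = squared_loss(fit_ridge(X, y, lam), X_eval, y_eval)
    changes = []
    for i in range(len(y)):
        Xi, yi = X.copy(), y.copy()
        Xi[i], yi[i] = X_new[i], y_new[i]  # swap a single training example
        perturbed = squared_loss(fit_ridge(Xi, yi, lam), X_eval, y_eval)
        changes.append(np.mean(np.abs(perturbed - base)))
    return float(np.mean(changes))

def sample(rng, n, w_true):
    """Hypothetical data model: linear signal plus Gaussian noise."""
    X = rng.normal(size=(n, len(w_true)))
    y = X @ w_true + 0.5 * rng.normal(size=n)
    return X, y

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
X_eval, y_eval = sample(rng, 500, w_true)

stab = {}
for n in (20, 400):
    X, y = sample(rng, n, w_true)
    X_new, y_new = sample(rng, n, w_true)
    stab[n] = replace_one_stability(X, y, X_new, y_new, X_eval, y_eval, lam=1.0)
    print(f"n={n}: replace-one stability estimate = {stab[n]:.5f}")
```

In this setup the estimate should shrink as the training set grows, which matches the intuition that a stable algorithm's training losses track its test losses more closely.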


Behavior of linear L2-boosting algorithms in the vanishing learning rate asymptotic

arXiv.org Machine Learning

In the past decades, boosting has become a major and powerful prediction method in machine learning. The success of the classification algorithm AdaBoost by Freund and Schapire (1999) demonstrated that many weak learners can be combined sequentially to produce better predictions, with widespread applications in gene expression (Dudoit et al., 2002) or music genre identification (Bergstra et al., 2006), to name only a few. Friedman et al. (2000) identified a wider statistical framework that led to gradient boosting (Friedman, 2001), where a weak learner (e.g., regression trees) is used to optimize a loss function in a sequential procedure akin to gradient descent. Choosing the loss function according to the statistical problem at hand results in a versatile and efficient tool that can handle classification, regression, quantile regression or survival analysis, among other tasks. The popularity of gradient boosting is also due to its efficient implementation in the R package gbm by Ridgeway (2007). Alongside the methodological developments, strong theoretical results have justified the good performance of boosting. Consistency of boosting algorithms, i.e. their ability to achieve the optimal Bayes error rate for large samples, is considered in Breiman (2004), Zhang and Yu (2005) and Bartlett and Traskin (2007). The present paper is strongly influenced by Bühlmann and Yu (2003), who propose an analysis of regression boosting algorithms built on linear base learners, relying on explicit formulas for the boosted predictor and its error rate. In this paper, we focus on gradient boosting for regression with square loss and we briefly describe the corresponding algorithm.
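As a rough illustration of L2-boosting with a linear base learner, the sketch below repeatedly fits the current residuals with a linear smoother S and adds a shrunken copy of the fit to the predictor. Starting from F = 0, the update F <- F + nu * S (y - F) gives residuals (I - nu S)^m y after m steps, so the boosted predictor has the explicit form (I - (I - nu S)^m) y. The ridge smoother, the learning rate `nu`, the synthetic data, and all function names are assumptions chosen for the example, not the paper's exact setup.

```python
import numpy as np

def ridge_smoother(X, lam):
    """Hat matrix S = X (X^T X + lam I)^{-1} X^T of ridge regression,
    used here as an example of a linear base learner."""
    p = X.shape[1]
    return X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)

def l2_boost(S, y, nu, n_steps):
    """Iterative L2-boosting: fit the residuals with S, then take a
    step of size nu towards that fit. Starts from F = 0."""
    F = np.zeros_like(y, dtype=float)
    for _ in range(n_steps):
        residuals = y - F              # negative gradient of the square loss
        F = F + nu * (S @ residuals)   # shrunken base-learner update
    return F

def l2_boost_closed_form(S, y, nu, n_steps):
    """Explicit formula F_m = (I - (I - nu S)^m) y for the same procedure."""
    n = S.shape[0]
    shrink = np.linalg.matrix_power(np.eye(n) - nu * S, n_steps)
    return (np.eye(n) - shrink) @ y

# Hypothetical synthetic regression problem.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

S = ridge_smoother(X, lam=1.0)
F_iter = l2_boost(S, y, nu=0.1, n_steps=200)
F_closed = l2_boost_closed_form(S, y, nu=0.1, n_steps=200)

print("max |iterative - closed form|:", np.max(np.abs(F_iter - F_closed)))
print("training MSE after 200 steps:", np.mean((y - F_iter) ** 2))
```

The iterative and closed-form predictors should agree up to floating-point error. Because the ridge smoother is symmetric with eigenvalues in [0, 1), the training error cannot increase as the number of steps grows.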


Classify A Rare Event Using 5 Machine Learning Algorithms - KDnuggets

#artificialintelligence

Supervised Learning is the crown jewel of Machine Learning. A couple of years ago, Harvard Business Review released an article titled "Data Scientist: The Sexiest Job of the 21st Century." Ever since its release, Data Science and Statistics departments have become widely pursued by college students, and Data Scientists (Nerds) are, for the first time, referred to as being sexy. In some industries, Data Scientists have reshaped the corporate structure and reallocated much of the decision-making to "front-line" workers. Generating useful business insights from data has never been so easy.


Classifying Rare Events Using Five Machine Learning Techniques

#artificialintelligence

Supervised learning is the crown jewel of Machine Learning. Supervised learning is the machine learning task or process of producing a function that predicts output variables. It has been adopted widely in the industry. For example, banks apply supervised models to detect credit card fraud. Quantitative traders make purchase decisions based on ML model predictions.


Limits on Learning Machine Accuracy Imposed by Data Quality

Neural Information Processing Systems

Random errors and insufficiencies in databases limit the performance of any classifier trained from and applied to the database. In this paper we propose a method to estimate the limiting performance of classifiers imposed by the database. We demonstrate this technique on the task of predicting failure in telecommunication paths. 1 Introduction Data collection for a classification or regression task is prone to random errors, e.g.

