multiple classification
McRank: Learning to Rank Using Multiple Classification and Gradient Boosting
We cast the ranking problem as (1) multiple classification ("Mc") and (2) multiple ordinal classification, which lead to computationally tractable learning algorithms for relevance ranking in Web search. We consider the DCG criterion (discounted cumulative gain), a standard quality measure in information retrieval. Our approach is motivated by the fact that perfect classifications result in perfect DCG scores and the DCG errors are bounded by classification errors. We propose using the Expected Relevance to convert class probabilities into ranking scores. The class probabilities are learned using a gradient boosting tree algorithm.
Making \emph{ordinary least squares} linear classifiers more robust
In statistics and machine learning, the sum-of-squares error, commonly referred to as \emph{ordinary least squares}, is a convenient choice of cost function because of its many nice analytical properties, though it is not always the best choice. In particular, it has long been known that \emph{ordinary least squares} is not robust to outliers. Several attempts to resolve this problem led to alternative methods that either did not fully resolve the \emph{outlier problem} or were computationally difficult. In this paper, we provide a very simple solution that makes \emph{ordinary least squares} less sensitive to outliers in data classification: \emph{scaling the augmented input vector by its length}. We give a mathematical exposition of the \emph{outlier problem} using approximations and geometrical techniques, and we present numerical results to support the efficacy of our method.
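The scaling idea in this abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so the details here are assumptions: the augmented input vector is taken to be the input with a constant 1 prepended, its length is taken to be the Euclidean norm, the class targets are -1 and +1, and the function names (`augment`, `fit_ols_classifier`, `predict`) and the toy data are hypothetical.

```python
import numpy as np

def augment(X, scale_by_length=False):
    """Prepend a constant 1 to every row; optionally divide each row by its length."""
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    if scale_by_length:
        # Assumption: "length" means the Euclidean norm of the augmented vector.
        # The norm is at least 1 because of the prepended constant, so no division by zero.
        A = A / np.linalg.norm(A, axis=1, keepdims=True)
    return A

def fit_ols_classifier(X, y, scale_by_length=False):
    """Least-squares fit of targets y (-1 or +1) on the augmented inputs."""
    A = augment(X, scale_by_length)
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict(w, X, scale_by_length=False):
    """Classify by the sign of the linear score."""
    return np.sign(augment(X, scale_by_length) @ w)

# Hypothetical 1-D toy data: two points per class plus one far-away point in class +1.
X_toy = np.array([[-2.0], [-1.0], [1.0], [2.0], [100.0]])
y_toy = np.array([-1.0, -1.0, 1.0, 1.0, 1.0])

w_plain = fit_ols_classifier(X_toy, y_toy)
w_scaled = fit_ols_classifier(X_toy, y_toy, scale_by_length=True)
pred_plain = predict(w_plain, X_toy)
pred_scaled = predict(w_scaled, X_toy, scale_by_length=True)
```

On this toy set the plain fit is pulled toward the point at x = 100 and places x = 1 on the wrong side of the boundary, while the scaled fit classifies all five points correctly. This is only an illustration of the general effect and is not a reproduction of the paper's experiments.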
McRank: Learning to Rank Using Multiple Classification and Gradient Boosting
Li, Ping, Wu, Qiang, Burges, Christopher J.
We cast the ranking problem as (1) multiple classification ("Mc") and (2) multiple ordinal classification, which lead to computationally tractable learning algorithms for relevance ranking in Web search. We consider the DCG criterion (discounted cumulative gain), a standard quality measure in information retrieval. Our approach is motivated by the fact that perfect classifications result in perfect DCG scores and the DCG errors are bounded by classification errors. We propose using the Expected Relevance to convert class probabilities into ranking scores. The class probabilities are learned using a gradient boosting tree algorithm. Evaluations on large-scale datasets show that our approach can improve on LambdaRank [5] and the regression-based ranker [6] in terms of the (normalized) DCG scores. An efficient implementation of the boosting tree algorithm is also presented.
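The conversion from class probabilities to ranking scores can be sketched as follows. The abstract does not spell out the formulas, so this block rests on assumptions: relevance grades are the integers 0 to K-1, the Expected Relevance of a document is the probability-weighted sum of its grades, and DCG uses the commonly seen gain 2^rel - 1 with a 1/log2(1 + rank) discount. The function names and the example probabilities are hypothetical, and the probabilities stand in for the output of a trained classifier such as a boosted tree model.

```python
import numpy as np

def expected_relevance(class_probs):
    """Score each document by sum_k p_k * k, with grades k = 0..K-1 (assumed)."""
    class_probs = np.asarray(class_probs, dtype=float)
    grades = np.arange(class_probs.shape[1])
    return class_probs @ grades

def dcg(relevance_in_ranked_order):
    """DCG with gain 2^rel - 1 and discount 1/log2(1 + rank) (a common form, assumed)."""
    rel = np.asarray(relevance_in_ranked_order, dtype=float)
    positions = np.arange(1, rel.size + 1)
    return float(np.sum((2.0 ** rel - 1.0) / np.log2(positions + 1)))

def ndcg(scores, true_relevance):
    """DCG of the score-induced ranking divided by the DCG of the ideal ranking."""
    true_relevance = np.asarray(true_relevance)
    order = np.argsort(-np.asarray(scores), kind="stable")  # highest score first
    ideal = dcg(np.sort(true_relevance)[::-1])
    return dcg(true_relevance[order]) / ideal if ideal > 0 else 0.0

# Hypothetical class probabilities for three documents over grades 0, 1, 2.
probs = np.array([[0.1, 0.2, 0.7],
                  [0.8, 0.1, 0.1],
                  [0.2, 0.6, 0.2]])
true_grades = np.array([2, 0, 1])

scores = expected_relevance(probs)
quality = ndcg(scores, true_grades)
```

Here the scores order the documents by their true grades, so the normalized DCG equals 1, which matches the abstract's observation that perfect classifications yield perfect DCG scores.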