McRank: Learning to Rank Using Multiple Classification and Gradient Boosting

Abstract

We cast the ranking problem as (1) multiple classification (“Mc”) and (2) multiple ordinal classification, both of which lead to computationally tractable learning algorithms for relevance ranking in Web search. We consider the DCG (discounted cumulative gain) criterion, a standard quality measure in information retrieval. Our approach is motivated by the fact that perfect classifications result in perfect DCG scores and that DCG errors are bounded by classification errors. We propose using the Expected Relevance to convert class probabilities into ranking scores. The class probabilities are learned using a gradient boosting tree algorithm. Evaluations on large-scale datasets show that our approach can improve on LambdaRank [5] and the regression-based ranker [6] in terms of (normalized) DCG scores. An efficient implementation of the boosting tree algorithm is also presented.
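
As a rough illustration of the scoring step described in the abstract, the sketch below turns per-document class probabilities into ranking scores and evaluates the resulting order with DCG. The abstract gives no formulas, so several details are assumptions: Expected Relevance is taken as the probability-weighted mean of the integer relevance levels 0..K-1, DCG uses the common gain 2^y - 1 with a log2 position discount, and all function names are hypothetical. The class probabilities are supplied by hand here; in the paper they come from a gradient boosting tree classifier.

```python
import math

def expected_relevance(class_probs):
    """Assumed form: score = sum over k of k * P(label = k)."""
    return sum(k * p for k, p in enumerate(class_probs))

def rank_by_expected_relevance(probs_per_doc):
    """Return document indices sorted by descending expected relevance."""
    scores = [expected_relevance(p) for p in probs_per_doc]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

def dcg(labels_in_ranked_order, truncation=None):
    """Assumed DCG form: gain (2^y - 1), discount 1 / log2(position + 1)."""
    labels = labels_in_ranked_order[:truncation]
    return sum((2 ** y - 1) / math.log2(i + 2) for i, y in enumerate(labels))

def ndcg(true_labels, order, truncation=None):
    """DCG of the given order divided by the DCG of the ideal order."""
    ranked = [true_labels[i] for i in order]
    ideal = sorted(true_labels, reverse=True)
    best = dcg(ideal, truncation)
    return dcg(ranked, truncation) / best if best > 0 else 0.0

# Hand-made class probabilities for three documents over labels {0, 1, 2}.
probs = [
    [0.7, 0.2, 0.1],  # mostly irrelevant
    [0.1, 0.3, 0.6],  # mostly highly relevant
    [0.2, 0.6, 0.2],  # mostly somewhat relevant
]
true_labels = [0, 2, 1]

order = rank_by_expected_relevance(probs)
print(order, ndcg(true_labels, order))
```

In this toy case the predicted class probabilities agree with the true labels, so the induced order matches the ideal order and the normalized DCG reaches its maximum, which mirrors the abstract's point that perfect classification yields a perfect DCG score.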

Keywords

Boosting (machine learning), Gradient boosting, Learning to rank, Ranking (information retrieval), Artificial intelligence, Machine learning, Computer science, Bounded function, Mathematics, Pattern recognition (psychology), Data mining, Random forest

Related Publications

LightGBM: A Highly Efficient Gradient Boosting Decision Tree

Guolin Ke, Qi Meng, Thomas Finley +5 more

Gradient Boosting Decision Tree (GBDT) is a popular machine learning algorithm, and has quite a few effective implementations such as XGBoost and pGBRT. Although many engineerin...

2017 HAL (Le Centre pour la Communication ... 9477 citations

ada: An R Package for Stochastic Boosting

Mark V. Culp, Kjell Johnson, George Michailidis

Boosting is an iterative algorithm that combines simple classification rules with "mediocre" performance in terms of misclassification error rate to produce a highly accurate cl...

2006 Journal of Statistical Software 92 citations

Parallel boosted regression trees for web search ranking

Stephen Tyree, Kilian Q. Weinberger, Kunal Agrawal +1 more

Gradient Boosted Regression Trees (GBRT) are the current state-of-the-art learning paradigm for machine learned web-search ranking - a domain notorious for very large data sets....

2011 164 citations

Generalized Boosted Models: A guide to the gbm package

Greg Ridgeway

Boosting takes on various forms with different programs using different loss functions, different base models, and different optimization schemes. The gbm package takes the appr...

2006 769 citations

A Communication-Efficient Parallel Algorithm for Decision Tree

Qi Meng, Guolin Ke, Taifeng Wang +4 more

Decision tree (and its extensions such as Gradient Boosting Decision Trees and Random Forest) is a widely used machine learning algorithm, due to its practical effectiveness and...

2016 arXiv (Cornell University) 69 citations

Publication Info

Year: 2007
Type: article
Volume: 20
Pages: 897-904
Citations: 434
Access: Closed

Cite This

APA Style

                            
Ping Li, Qiang Wu, Christopher J. C. Burges (2007). McRank: Learning to Rank Using Multiple Classification and Gradient Boosting. 20, 897-904.