Abstract
Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.
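For context, the system described in the paper is available as the open-source XGBoost library. The sketch below shows a minimal training run with its Python API; the synthetic data and parameter values are illustrative assumptions, not taken from the paper. The `tree_method="approx"` setting selects the approximate split-finding path, which is the part of the library that relies on quantile-sketch machinery of the kind the abstract mentions.

```python
# Minimal sketch of training a gradient boosted tree model with the
# XGBoost Python package. Data and hyperparameters are illustrative.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 20))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# DMatrix is XGBoost's internal data container; it also accepts sparse (CSR) input.
dtrain = xgb.DMatrix(X[:800], label=y[:800])
dvalid = xgb.DMatrix(X[800:], label=y[800:])

params = {
    "objective": "binary:logistic",
    "max_depth": 6,
    "eta": 0.3,
    "tree_method": "approx",  # approximate split finding
}

# Train 50 boosting rounds, reporting validation loss every 10 rounds.
bst = xgb.train(params, dtrain, num_boost_round=50,
                evals=[(dvalid, "valid")], verbose_eval=10)
preds = bst.predict(dvalid)
```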
Related Publications
A Communication-Efficient Parallel Algorithm for Decision Tree
The decision tree (and its extensions such as Gradient Boosting Decision Trees and Random Forests) is a widely used machine learning algorithm, due to its practical effectiveness and...
Publication Info
- Year: 2016
- Type: article
- Pages: 785-794
- Citations: 41264
- Access: Closed
Identifiers
- DOI: 10.1145/2939672.2939785