Abstract

Random forests (Breiman, 2001, Machine Learning 45: 5–32) is a statistical- or machine-learning algorithm for prediction. In this article, we introduce a corresponding new Stata command, rforest. We give an overview of the random forest algorithm and illustrate its use with two examples. The first example is a classification problem that predicts whether a credit card holder will default on his or her debt. The second example is a regression problem that predicts the log-scaled number of shares of online news articles. We conclude with a discussion that summarizes the key points demonstrated in the examples.

Keywords

Random forest; Computer science; Key (lock); Artificial intelligence; Credit card; Machine learning; Algorithm; Statistical learning; World Wide Web

Publication Info

Year: 2020
Type: article
Volume: 20
Issue: 1
Pages: 3-29
Citations: 1126
Access: Closed

Citation Metrics

1126 (OpenAlex)

Cite This

Matthias Schonlau, Rosie Yuyan Zou (2020). The random forest algorithm for statistical learning. The Stata Journal: Promoting communications on statistics and Stata, 20(1), 3-29. https://doi.org/10.1177/1536867x20909688

Identifiers

DOI: 10.1177/1536867x20909688