Abstract

The past 3 years of work in NLP have been characterized by the development and deployment of ever larger language models, especially for English. BERT, its variants, GPT-2/3, and others, most recently Switch-C, have pushed the boundaries of the possible both through architectural innovations and through sheer size. Using these pretrained models and the methodology of fine-tuning them for specific tasks, researchers have extended the state of the art on a wide array of tasks as measured by leaderboards on specific benchmarks for English. In this paper, we take a step back and ask: How big is too big? What are the possible risks associated with this technology and what paths are available for mitigating those risks? We provide recommendations including weighing the environmental and financial costs first, investing resources into curating and carefully documenting datasets rather than ingesting everything on the web, carrying out pre-development exercises evaluating how the planned approach fits into research and development goals and supports stakeholder values, and encouraging research directions beyond ever larger language models.

Keywords

Software deployment, Computer science, Stakeholder, Data science, Big data, Artificial intelligence, Software engineering, Data mining, Management


Publication Info

Year
2021
Type
article
Pages
610-623
Citations
4223
Access
Closed



Cite This

Emily M. Bender, Timnit Gebru, Angelina McMillan-Major et al. (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT '21), 610-623. https://doi.org/10.1145/3442188.3445922

Identifiers

DOI
10.1145/3442188.3445922