Abstract

Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve state-of-the-art performance on a variety of machine learning tasks. Several researchers have recently proposed schemes to parallelize SGD, but all require performance-destroying memory locking and synchronization. This work shows, through novel theoretical analysis, algorithms, and implementation, that SGD can be implemented without any locking. We present an update scheme called HOGWILD!, which allows processors to access shared memory with the possibility of overwriting each other's work. We show that when the associated optimization problem is sparse, meaning most gradient updates modify only small parts of the decision variable, HOGWILD! achieves a nearly optimal rate of convergence. We demonstrate experimentally that HOGWILD! outperforms alternative schemes that use locking by an order of magnitude.
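The update scheme the abstract describes is simple enough to sketch directly: each thread repeatedly samples an example, computes a sparse gradient, and writes it into the shared model with no locks at all. Below is a minimal illustrative sketch in C++ on a synthetic sparse least-squares problem. All names (sgd_worker, Example), constants, and the data generation are assumptions made for this example, not the authors' actual code; note also that the unsynchronized writes are formally a data race, whereas the paper's analysis models atomic per-coordinate updates.

```cpp
// hogwild_sketch.cpp -- illustrative only, not the paper's implementation.
// build: g++ -O2 -std=c++17 -pthread hogwild_sketch.cpp
#include <functional>
#include <iostream>
#include <random>
#include <thread>
#include <vector>

constexpr int DIM = 1000;  // size of the shared decision variable (illustrative)
constexpr int NNZ = 10;    // nonzero features per example (the "sparse" part)

struct Example {
    int    idx[NNZ];  // which coordinates this example touches
    double val[NNZ];  // feature values at those coordinates
    double y;         // regression target
};

double x[DIM] = {};  // shared model, read and written with NO locks

// One worker thread: plain SGD on random examples, writing straight into x.
void sgd_worker(const std::vector<Example>& data, double step, int iters,
                unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<size_t> pick(0, data.size() - 1);
    for (int t = 0; t < iters; ++t) {
        const Example& e = data[pick(rng)];
        // Gradient of 0.5*(a_e . x - y_e)^2 touches only e's coordinates.
        double pred = 0.0;
        for (int k = 0; k < NNZ; ++k) pred += x[e.idx[k]] * e.val[k];
        const double err = pred - e.y;
        // Unsynchronized write-back: other threads may interleave with, or
        // even overwrite, these updates. HOGWILD! tolerates this because the
        // updates are sparse, so collisions on any one coordinate are rare.
        for (int k = 0; k < NNZ; ++k)
            x[e.idx[k]] -= step * err * e.val[k];
    }
}

int main() {
    // Synthetic sparse least-squares data so the sketch runs end to end.
    std::mt19937 rng(42);
    std::uniform_int_distribution<int> coord(0, DIM - 1);
    std::normal_distribution<double> gauss(0.0, 1.0);

    std::vector<double> w_true(DIM);  // planted model to recover
    for (auto& w : w_true) w = gauss(rng);

    std::vector<Example> data(10000);
    for (auto& e : data) {
        e.y = 0.0;
        for (int k = 0; k < NNZ; ++k) {
            e.idx[k] = coord(rng);
            e.val[k] = gauss(rng);
            e.y += w_true[e.idx[k]] * e.val[k];
        }
    }

    // Launch workers that all hammer on the same array x concurrently.
    std::vector<std::thread> pool;
    for (unsigned p = 0; p < 8; ++p)
        pool.emplace_back(sgd_worker, std::cref(data), 1e-3, 200000, p);
    for (auto& t : pool) t.join();

    double gap = 0.0;  // squared distance from the planted model
    for (int i = 0; i < DIM; ++i)
        gap += (x[i] - w_true[i]) * (x[i] - w_true[i]);
    std::cout << "||x - w_true||^2 = " << gap << "\n";
}
```

The design point the sketch illustrates is the sparsity condition from the abstract: each update touches only NNZ of the DIM coordinates, so two threads rarely write the same coordinate at the same time, and the occasional overwritten update costs far less than locking every update would.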

Keywords

Stochastic gradient descent, Gradient descent, Parallel computing, Computer science, Artificial intelligence, Engineering

Publication Info

Year: 2011
Type: preprint
Citations: 1224
Access: Closed

Citation Metrics

Citations: 1224 (source: OpenAlex)

Cite This

Feng Niu, Benjamin Recht, Christopher Ré et al. (2011). HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent. arXiv (Cornell University). https://doi.org/10.48550/arxiv.1106.5730

Identifiers

DOI
10.48550/arxiv.1106.5730