Abstract

Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve state-of-the-art performance on a variety of machine learning tasks. Several researchers have recently proposed schemes to parallelize SGD, but all require performance-destroying memory locking and synchronization. This work shows, through novel theoretical analysis, algorithms, and implementation, that SGD can be implemented without any locking. We present an update scheme called HOGWILD!, which allows processors to access shared memory with the possibility of overwriting each other's work. We show that when the associated optimization problem is sparse, meaning most gradient updates modify only small parts of the decision variable, HOGWILD! achieves a nearly optimal rate of convergence. We demonstrate experimentally that HOGWILD! outperforms alternative schemes that use locking by an order of magnitude.
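The update scheme the abstract describes is simple enough to sketch directly: each thread repeatedly samples an example, computes a sparse gradient, and writes it into the shared model with no locks at all. Below is a minimal illustrative sketch in C++ on a synthetic sparse least-squares problem. All names (sgd_worker, Example), constants, and the data generation are assumptions made for this example, not the authors' actual code; note also that the unsynchronized writes are formally a data race, whereas the paper's analysis models atomic per-coordinate updates.

```cpp
// hogwild_sketch.cpp -- illustrative only, not the paper's implementation.
// build: g++ -O2 -std=c++17 -pthread hogwild_sketch.cpp
#include <functional>
#include <iostream>
#include <random>
#include <thread>
#include <vector>

constexpr int DIM = 1000;  // size of the shared decision variable (illustrative)
constexpr int NNZ = 10;    // nonzero features per example (the "sparse" part)

struct Example {
    int    idx[NNZ];  // which coordinates this example touches
    double val[NNZ];  // feature values at those coordinates
    double y;         // regression target
};

double x[DIM] = {};  // shared model, read and written with NO locks

// One worker thread: plain SGD on random examples, writing straight into x.
void sgd_worker(const std::vector<Example>& data, double step, int iters,
                unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<size_t> pick(0, data.size() - 1);
    for (int t = 0; t < iters; ++t) {
        const Example& e = data[pick(rng)];
        // Gradient of 0.5*(a_e . x - y_e)^2 touches only e's coordinates.
        double pred = 0.0;
        for (int k = 0; k < NNZ; ++k) pred += x[e.idx[k]] * e.val[k];
        const double err = pred - e.y;
        // Unsynchronized write-back: other threads may interleave with, or
        // even overwrite, these updates. HOGWILD! tolerates this because the
        // updates are sparse, so collisions on any one coordinate are rare.
        for (int k = 0; k < NNZ; ++k)
            x[e.idx[k]] -= step * err * e.val[k];
    }
}

int main() {
    // Synthetic sparse least-squares data so the sketch runs end to end.
    std::mt19937 rng(42);
    std::uniform_int_distribution<int> coord(0, DIM - 1);
    std::normal_distribution<double> gauss(0.0, 1.0);

    std::vector<double> w_true(DIM);  // planted model to recover
    for (auto& w : w_true) w = gauss(rng);

    std::vector<Example> data(10000);
    for (auto& e : data) {
        e.y = 0.0;
        for (int k = 0; k < NNZ; ++k) {
            e.idx[k] = coord(rng);
            e.val[k] = gauss(rng);
            e.y += w_true[e.idx[k]] * e.val[k];
        }
    }

    // Launch workers that all hammer on the same array x concurrently.
    std::vector<std::thread> pool;
    for (unsigned p = 0; p < 8; ++p)
        pool.emplace_back(sgd_worker, std::cref(data), 1e-3, 200000, p);
    for (auto& t : pool) t.join();

    double gap = 0.0;  // squared distance from the planted model
    for (int i = 0; i < DIM; ++i)
        gap += (x[i] - w_true[i]) * (x[i] - w_true[i]);
    std::cout << "||x - w_true||^2 = " << gap << "\n";
}
```

The design point the sketch illustrates is the sparsity condition from the abstract: each update touches only NNZ of the DIM coordinates, so two threads rarely write the same coordinate at the same time, and the occasional overwritten update costs far less than locking every update would.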

Keywords

Stochastic gradient descent, Gradient descent, Parallel computing, Computer science, Artificial intelligence, Engineering

Publication Info

Year: 2011
Type: preprint
Citations: 1224
Access: Closed

Citation Metrics

Citations: 1224 (source: OpenAlex)

Cite This

Feng Niu, Benjamin Recht, Christopher Ré et al. (2011). HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent. arXiv (Cornell University). https://doi.org/10.48550/arxiv.1106.5730

Identifiers

DOI
10.48550/arxiv.1106.5730