Abstract

Neural architecture search (NAS) has had a great impact by automatically designing effective neural network architectures. However, the prohibitive computational demand of conventional NAS algorithms (e.g. $10^4$ GPU hours) makes it difficult to directly search for architectures on large-scale tasks (e.g. ImageNet). Differentiable NAS can reduce the cost in GPU hours via a continuous representation of the network architecture, but it suffers from high GPU memory consumption, which grows linearly with the candidate set size. As a result, these methods need to use proxy tasks, such as training on a smaller dataset, learning with only a few blocks, or training for just a few epochs. Architectures optimized on proxy tasks are not guaranteed to be optimal on the target task. In this paper, we present ProxylessNAS, which can directly learn architectures for large-scale target tasks and target hardware platforms. We address the high memory consumption of differentiable NAS and reduce the computational cost (GPU hours and GPU memory) to the same level as regular training while still allowing a large candidate set. Experiments on CIFAR-10 and ImageNet demonstrate the effectiveness of directness and specialization. On CIFAR-10, our model achieves 2.08% test error with only 5.7M parameters, better than the previous state-of-the-art architecture AmoebaNet-B, while using $6\times$ fewer parameters. On ImageNet, our model achieves 3.1% better top-1 accuracy than MobileNetV2, while being $1.2\times$ faster in measured GPU latency. We also apply ProxylessNAS to specialize neural architectures for hardware using direct hardware metrics (e.g. latency) and provide insights for efficient CNN architecture design.
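
The memory argument above can be made concrete with a small sketch. It is a toy illustration under stated assumptions, not the authors' implementation: the candidate operations, the function names, and the list-based "feature maps" are all hypothetical. The first function mimics a continuous relaxation, where the output is a softmax-weighted sum over every candidate, so all N candidate outputs are held at once. The second function runs only one sampled candidate, which is one way (assumed here, since the abstract does not spell out the mechanism) to keep memory at the level of training a single network.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations acting on a toy "feature map" (a list).
CANDIDATE_OPS = [
    lambda v: list(v),             # identity
    lambda v: [2.0 * t for t in v],  # scale
    lambda v: [-t for t in v],     # negate
    lambda v: [t + 1.0 for t in v],  # shift
]


def mixed_op_all_paths(x, candidate_ops, arch_logits):
    """Continuous relaxation: weighted sum over every candidate.

    Returns the output and the number of candidate outputs that had to be
    kept alive simultaneously (this count grows linearly with N).
    """
    weights = softmax(arch_logits)
    outputs = [op(x) for op in candidate_ops]  # N feature maps live at once
    y = [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(x))
    ]
    return y, len(outputs)


def mixed_op_single_path(x, candidate_ops, arch_logits, rng):
    """Sample one candidate from the softmax distribution and run only it.

    Only one candidate output is materialized, independent of N.
    """
    weights = softmax(arch_logits)
    idx = rng.choices(range(len(candidate_ops)), weights=weights, k=1)[0]
    return candidate_ops[idx](x), 1


if __name__ == "__main__":
    x = [1.0, 2.0]
    logits = [0.0] * len(CANDIDATE_OPS)
    _, held_all = mixed_op_all_paths(x, CANDIDATE_OPS, logits)
    _, held_one = mixed_op_single_path(x, CANDIDATE_OPS, logits, random.Random(0))
    print("outputs held (all paths):", held_all)
    print("outputs held (single path):", held_one)
```

In this toy setting the count of live candidate outputs stands in for GPU activation memory; a real search would also need a way to update the architecture parameters when only one path is evaluated, which this sketch deliberately leaves out.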

Keywords

Task (project management), Computer architecture, Architecture, Computer science, Computer hardware, Embedded system, Artificial intelligence, Engineering, Geography, Systems engineering

Publication Info

Year
2018
Type
preprint
Citations
1280
Access
Closed

Citation Metrics

1280
OpenAlex

Cite This

Han Cai, Ligeng Zhu, Song Han (2018). ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware. arXiv (Cornell University). https://doi.org/10.48550/arxiv.1812.00332

Identifiers

DOI
10.48550/arxiv.1812.00332