Abstract

The superior performance of Deformable Convolutional Networks arises from their ability to adapt to the geometric variations of objects. Through an examination of their adaptive behavior, we observe that while the spatial support for their neural features conforms more closely to object structure than that of regular ConvNets, this support may nevertheless extend well beyond the region of interest, causing features to be influenced by irrelevant image content. To address this problem, we present a reformulation of Deformable ConvNets that improves their ability to focus on pertinent image regions, through increased modeling power and stronger training. The modeling power is enhanced through a more comprehensive integration of deformable convolution within the network, and by introducing a modulation mechanism that expands the scope of deformation modeling. To effectively harness this enriched modeling capability, we guide network training via a proposed feature mimicking scheme that helps the network to learn features that reflect the object focus and classification power of R-CNN features. With the proposed contributions, this new version of Deformable ConvNets yields significant performance gains over the original model and produces leading results on the COCO benchmark for object detection and instance segmentation.
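
The abstract does not spell out the modulation mechanism, so the following is only a minimal sketch of how a modulated deformable convolution is commonly described: each of the 3x3 kernel taps samples the input at its regular grid position plus a learned offset, and the sampled value is scaled by a learned modulation scalar in [0, 1]. The function names (`bilinear_sample`, `modulated_deform_conv_at`), the single-channel list-of-lists input, and the zero padding outside the feature map are illustrative assumptions, not the authors' implementation. In a real network the offsets and modulation scalars would be predicted by a separate convolution; here they are passed in by hand.

```python
import math


def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2D grid at fractional (y, x); zero outside the grid."""
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                # Each neighbour's weight falls off linearly with distance.
                weight = (1.0 - abs(y - yi)) * (1.0 - abs(x - xi))
                value += weight * feature[yi][xi]
    return value


def modulated_deform_conv_at(feature, weights, offsets, modulation, py, px):
    """Output of a 3x3 modulated deformable convolution at location (py, px).

    weights:    9 kernel weights, row-major over the 3x3 grid
    offsets:    9 (dy, dx) pairs, one learned offset per kernel tap (assumed given)
    modulation: 9 scalars in [0, 1], one per kernel tap (assumed given)
    """
    grid = [(gy, gx) for gy in (-1, 0, 1) for gx in (-1, 0, 1)]
    out = 0.0
    for k, (gy, gx) in enumerate(grid):
        dy, dx = offsets[k]
        sample = bilinear_sample(feature, py + gy + dy, px + gx + dx)
        out += weights[k] * sample * modulation[k]
    return out


if __name__ == "__main__":
    # Hypothetical 5x5 single-channel feature map with values 0..24.
    feature = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
    ones = [1.0] * 9
    zero_offsets = [(0.0, 0.0)] * 9
    # Zero offsets and unit modulation reduce to a regular 3x3 convolution.
    print(modulated_deform_conv_at(feature, ones, zero_offsets, ones, 2, 2))
```

Setting a tap's modulation scalar to zero removes that sampling location's contribution entirely, which is how this kind of mechanism can suppress samples that land on irrelevant image content.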

Keywords

Computer science; Focus (optics); Artificial intelligence; Benchmark (surveying); Convolution (computer science); Feature (linguistics); Segmentation; Object detection; Convolutional neural network; Feature extraction; Pattern recognition (psychology); Artificial neural network; Object (grammar); Computer vision; Image segmentation; Image (mathematics)

Publication Info

Year: 2019
Type: preprint
Pages: 9300-9308
Citations: 2431
Access: Closed

Citation Metrics

OpenAlex: 2431
Influential: 178
CrossRef: 1978

Cite This

Xizhou Zhu, Han Hu, Stephen Lin et al. (2019). Deformable ConvNets V2: More Deformable, Better Results. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 9300-9308. https://doi.org/10.1109/cvpr.2019.00953

Identifiers

DOI: 10.1109/cvpr.2019.00953
arXiv: 1811.11168
