Pixelwise Instance Segmentation with a Dynamically Instantiated Network

Abstract

Semantic segmentation and object detection research have recently achieved rapid progress. However, the former task has no notion of different instances of the same object, and the latter operates at a coarse, bounding-box level. We propose an Instance Segmentation system that produces a segmentation map where each pixel is assigned an object class and instance identity label. Most approaches adapt object detectors to produce segments instead of boxes. In contrast, our method is based on an initial semantic segmentation module, which feeds into an instance subnetwork. This subnetwork uses the initial category-level segmentation, along with cues from the output of an object detector, within an end-to-end CRF to predict instances. This part of our model is dynamically instantiated to produce a variable number of instances per image. Our end-to-end approach requires no post-processing and considers the image holistically, instead of processing independent proposals. Therefore, unlike some related work, a pixel cannot belong to multiple instances. Furthermore, far more precise segmentations are achieved, as shown by our substantial improvements at high AP^r thresholds.
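To make the idea of a per-image, variable number of instance labels concrete, the following is a minimal sketch and not the authors' model: it replaces the end-to-end CRF with a plain per-pixel argmax, and the function name `assign_instances`, the input layout, and the scoring rule (detector score times the semantic probability inside the box) are all assumptions made for illustration. It shows only two properties named in the abstract: the number of label channels depends on the number of detections, and each pixel receives exactly one instance label.

```python
import numpy as np


def assign_instances(sem_probs, detections):
    """Toy instance assignment (hypothetical helper, not the paper's CRF).

    sem_probs:  array of shape (C, H, W) with per-pixel class probabilities;
                channel 0 is assumed to be background.
    detections: list of (class_id, score, (x0, y0, x1, y1)) with x1/y1 exclusive.
    Returns (instance_map, class_map), both of shape (H, W).
    """
    _, height, width = sem_probs.shape
    # One channel per detection plus background, so the label space is
    # built per image rather than fixed in advance.
    scores = np.zeros((len(detections) + 1, height, width))
    scores[0] = sem_probs[0]
    for k, (class_id, det_score, (x0, y0, x1, y1)) in enumerate(detections, start=1):
        # Assumed cue: semantic probability of the detected class, weighted
        # by the detector score, and only inside the detection box.
        scores[k, y0:y1, x0:x1] = det_score * sem_probs[class_id, y0:y1, x0:x1]
    # A single argmax per pixel means no pixel can belong to two instances.
    instance_map = scores.argmax(axis=0)
    class_of_channel = np.array([0] + [d[0] for d in detections])
    class_map = class_of_channel[instance_map]
    return instance_map, class_map


if __name__ == "__main__":
    sem = np.zeros((2, 4, 6))
    sem[0], sem[1] = 0.1, 0.9
    boxes = [(1, 0.8, (0, 0, 4, 4)), (1, 0.7, (2, 0, 6, 4))]
    instance_map, class_map = assign_instances(sem, boxes)
    print(instance_map)
```

In the overlapping columns of the two boxes, the higher-scoring detection wins, so the output is still a single label map over the whole image rather than a set of independent, possibly overlapping proposals.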

Keywords

Subnetwork, Segmentation, Computer science, Artificial intelligence, Minimum bounding box, Object (grammar), Pixel, Object detection, Image segmentation, Computer vision, Segmentation-based object categorization, Pattern recognition (psychology), Scale-space segmentation, Bounding overwatch, Task (project management), Class (philosophy), Image (mathematics)

Affiliated Institutions

University of Oxford GB

Publication Info

Year: 2017
Type: article
Pages: 879-888
Citations: 241
Access: Closed

Cite This

APA Style

Anurag Arnab, Philip H. S. Torr (2017). Pixelwise Instance Segmentation with a Dynamically Instantiated Network. pp. 879-888. https://doi.org/10.1109/cvpr.2017.100

Identifiers

DOI: 10.1109/cvpr.2017.100