Abstract
Most successful object recognition systems rely on binary classification, deciding only if an object is present or not, but not providing information on the actual object location. To estimate the object's location, one can take a sliding window approach, but this strongly increases the computational cost because the classifier or similarity function has to be evaluated over a large set of candidate subwindows. In this paper, we propose a simple yet powerful branch and bound scheme that allows efficient maximization of a large class of quality functions over all possible subimages. It converges to a globally optimal solution typically in linear or even sublinear time, in contrast to the quadratic scaling of exhaustive or sliding window search. We show how our method is applicable to different object detection and image retrieval scenarios. The achieved speedup allows the use of classifiers for localization that formerly were considered too slow for this task, such as SVMs with a spatial pyramid kernel or nearest-neighbor classifiers based on the \\chi;2 distance. We demonstrate state-of-the-art localization performance of the resulting systems on the UIUC Cars data set, the PASCAL VOC 2006 data set, and in the PASCAL VOC 2007 competition.
Keywords
Affiliated Institutions
Related Publications
How good are detection proposals, really?
Current top performing Pascal VOC object detectors employ detection proposals to guide the search for objects thereby avoiding exhaustive sliding window search across images. De...
What Makes for Effective Detection Proposals?
Current top performing object detectors employ detection proposals to guide the search for objects, thereby avoiding exhaustive sliding window search across images. Despite the ...
The Role of Context for Object Detection and Semantic Segmentation in the Wild
In this paper we study the role of context in existing state-of-the-art detection and segmentation approaches. Towards this goal, we label every pixel of PASCAL VOC 2010 detecti...
YOLO9000: Better, Faster, Stronger
We introduce YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories. First we propose various improvements to the YOLO detec...
Object Detection via a Multi-region and Semantic Segmentation-Aware CNN Model
We propose an object detection system that relies on a multi-region deep convolutional neural network (CNN) that also encodes semantic segmentation-aware features. The resulting...
Publication Info
- Year
- 2009
- Type
- article
- Volume
- 31
- Issue
- 12
- Pages
- 2129-2142
- Citations
- 347
- Access
- Closed
External Links
Social Impact
Social media, news, blog, policy document mentions
Citation Metrics
Cite This
Identifiers
- DOI
- 10.1109/tpami.2009.144