The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale

Abstract

We present Open Images V4, a dataset of 9.2M images with unified annotations for image classification, object detection, and visual relationship detection. The images carry a Creative Commons Attribution license that permits sharing and adapting the material. They were collected from Flickr without a predefined list of class names or tags, which leads to natural class statistics and avoids an initial design bias. Open Images V4 offers large scale across several dimensions: 30.1M image-level labels for 19.8k concepts, 15.4M bounding boxes for 600 object classes, and 375k visual relationship annotations involving 57 classes. For object detection in particular, we provide 15x more bounding boxes than the next largest datasets (15.4M boxes on 1.9M images). The images often show complex scenes with several objects (8 annotated objects per image on average). We annotated visual relationships between these objects, which supports visual relationship detection, an emerging task that requires structured reasoning. We provide in-depth statistics about the dataset, validate the quality of the annotations, study how the performance of several modern models evolves with increasing amounts of training data, and demonstrate two applications made possible by having unified annotations of multiple types coexisting in the same images. We hope that the scale, quality, and variety of Open Images V4 will foster further research and innovation, even beyond the areas of image classification, object detection, and visual relationship detection.
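The headline figures in the abstract can be cross-checked with simple arithmetic. The sketch below is only an illustration built from the rounded numbers quoted above; the constant and function names are hypothetical and are not part of the dataset or its tooling.

```python
# Rounded figures as quoted in the abstract (not read from the dataset itself).
TOTAL_IMAGES = 9.2e6          # images in Open Images V4
IMAGES_WITH_BOXES = 1.9e6     # images that carry bounding boxes
BOUNDING_BOXES = 15.4e6       # bounding boxes over 600 object classes
IMAGE_LEVEL_LABELS = 30.1e6   # image-level labels
LABELLED_CONCEPTS = 19.8e3    # concepts covered by image-level labels


def summarize():
    """Derive simple ratios from the abstract's rounded figures (hypothetical helper)."""
    return {
        "boxes_per_boxed_image": BOUNDING_BOXES / IMAGES_WITH_BOXES,
        "fraction_of_images_with_boxes": IMAGES_WITH_BOXES / TOTAL_IMAGES,
        "labels_per_concept": IMAGE_LEVEL_LABELS / LABELLED_CONCEPTS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

Dividing 15.4M boxes by 1.9M images gives roughly 8.1 boxes per image, which agrees with the "8 annotated objects per image on average" stated in the abstract. The same figures suggest that about a fifth of the 9.2M images carry box annotations.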

Keywords

Computer science, Artificial intelligence, Bounding overwatch, Minimum bounding box, Object detection, Object (grammar), Pattern recognition (psychology), Scale (ratio), Class (philosophy), Variety (cybernetics), Image (mathematics), Task (project management), Computer vision, Geography

Publication Info

Year: 2018
Type: article
Citations: 1429
Access: Closed

Citation Metrics

OpenAlex: 1429
Influential: 129

Cite This

APA Style

Alina Kuznetsova, Hassan Rom, Neil Alldrin et al. (2018). The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale. arXiv (Cornell University). https://doi.org/10.1007/s11263-020-01316-z

Identifiers

DOI: 10.1007/s11263-020-01316-z
arXiv: 1811.00982
