What, where and who? Classifying events by scene and object recognition

Abstract

We propose a first attempt to classify events in static images by integrating scene and object categorizations. We define an event in a static image as a human activity taking place in a specific environment. In this paper, we use a number of sports, such as snowboarding, rock climbing, and badminton, to demonstrate event classification. Our goal is to classify the event in the image and to assign semantic labels to the objects and the scene environment within the image. For example, given a rowing scene, our algorithm recognizes the event as rowing by classifying the environment as a lake and recognizing the critical objects in the image as athletes, rowing boat, water, etc. We achieve this integrative and holistic recognition through a generative graphical model. We have assembled a highly challenging database of 8 widely varied sport events. We show that our system is capable of classifying these event classes at 73.4% accuracy. Each component of the model contributes to the final recognition, but neither the scene nor the objects alone achieve this performance.
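As a rough illustration of what integrating scene and object evidence can look like, the sketch below scores each event by combining a prior, a scene likelihood, and per-object likelihoods with Bayes' rule. This is a naive-Bayes-style toy and not the generative graphical model of the paper; the event, scene, and object names echo the rowing example in the abstract, while every probability and the function name `event_posterior` are hypothetical values chosen for illustration.

```python
import math

# Hypothetical toy probabilities for illustration only; none come from the paper.
PRIOR = {"rowing": 0.5, "rock climbing": 0.5}

# P(scene | event): how likely each environment is for each event.
P_SCENE = {
    "rowing": {"lake": 0.8, "mountain": 0.2},
    "rock climbing": {"lake": 0.1, "mountain": 0.9},
}

# P(object | event): how likely each object label is for each event.
P_OBJECT = {
    "rowing": {"athlete": 0.4, "rowing boat": 0.3, "water": 0.25, "rock": 0.05},
    "rock climbing": {"athlete": 0.4, "rowing boat": 0.02, "water": 0.08, "rock": 0.5},
}


def event_posterior(scene, objects):
    """Return P(event | scene, objects), assuming objects are independent given the event."""
    log_scores = {}
    for event, prior in PRIOR.items():
        score = math.log(prior) + math.log(P_SCENE[event][scene])
        for obj in objects:
            score += math.log(P_OBJECT[event][obj])
        log_scores[event] = score

    # Normalize in log space (log-sum-exp) to avoid underflow with many objects.
    top = max(log_scores.values())
    total = sum(math.exp(s - top) for s in log_scores.values())
    return {event: math.exp(s - top) / total for event, s in log_scores.items()}


posterior = event_posterior("lake", ["athlete", "rowing boat", "water"])
best = max(posterior, key=posterior.get)
print(best, posterior)
```

With these made-up numbers the scene label and the object labels both favor the same event, so the combined posterior is more peaked than either source of evidence alone would make it.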

Keywords

Rowing, Event (particle physics), Computer science, Artificial intelligence, Object (grammar), Component (thermodynamics), Generative grammar, Image (mathematics), Generative model, Computer vision, Pattern recognition (psychology), Machine learning

Publication Info

Year: 2007
Type: article
Pages: 1-8
Citations: 792
Access: Closed

Cite This

APA Style

Li-Jia Li, Li Fei-Fei (2007). What, where and who? Classifying events by scene and object recognition. 1-8. https://doi.org/10.1109/iccv.2007.4408872

Identifiers

DOI: 10.1109/iccv.2007.4408872