Bags of Spacetime Energies for Dynamic Scene Recognition

Abstract

This paper presents a unified bag-of-visual-words (BoW) framework for dynamic scene recognition. The approach builds on primitive features that uniformly capture the spatial and temporal orientation structure of the imagery (e.g., video), extracted by applying a bank of spatiotemporally oriented filters. Various feature encoding techniques are investigated to abstract the primitives into an intermediate representation that is best suited to dynamic scene representation. Further, a novel approach to adaptive pooling of the encoded features is presented; it captures the spatial layout of the scene while remaining robust to situations where camera motion and scene dynamics are confounded. The resulting overall approach has been evaluated on two standard, publicly available dynamic scene datasets. The results show that, in comparison to a representative set of alternatives, the proposed approach outperforms the previous state of the art in classification accuracy by 10%.
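The encode-then-pool structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and every name in it is hypothetical. It assumes random vectors in place of spatiotemporal oriented filter responses, hard nearest-codeword assignment in place of the encodings the paper compares, and average pooling over a fixed 2x2 grid in place of the paper's adaptive pooling.

```python
import numpy as np

def encode_hard(descriptors, codebook):
    """Assign each descriptor to its nearest codeword (one-hot codes)."""
    # Squared Euclidean distance between every descriptor and every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    codes = np.zeros((descriptors.shape[0], codebook.shape[0]))
    codes[np.arange(descriptors.shape[0]), nearest] = 1.0
    return codes

def pool_grid(codes, positions, grid=(2, 2)):
    """Average-pool codes inside each cell of a fixed spatial grid."""
    rows, cols = grid
    pooled = np.zeros((rows, cols, codes.shape[1]))
    # positions are (y, x) pairs in [0, 1); map each one to a grid cell.
    r = np.minimum((positions[:, 0] * rows).astype(int), rows - 1)
    c = np.minimum((positions[:, 1] * cols).astype(int), cols - 1)
    for i in range(rows):
        for j in range(cols):
            mask = (r == i) & (c == j)
            if mask.any():
                pooled[i, j] = codes[mask].mean(axis=0)
    # Concatenate the per-cell histograms into one scene descriptor.
    return pooled.reshape(-1)

# Assumption: random stand-ins for local features and their locations.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(200, 8))
positions = rng.random(size=(200, 2))
# Assumption: a codebook sampled from the data instead of a learned one.
codebook = descriptors[rng.choice(200, size=5, replace=False)]

codes = encode_hard(descriptors, codebook)
feature = pool_grid(codes, positions)
print(feature.shape)  # (20,) = 2 * 2 grid cells * 5 codewords
```

The concatenated vector is what a classifier would consume. Pooling per grid cell is what preserves coarse spatial layout, which is the property the abstract attributes to its pooling stage.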

Keywords

Computer science, Artificial intelligence, Representation (politics), Pooling, Set (abstract data type), Computer vision, Pattern recognition (psychology), Feature (linguistics), Encoding (memory)

Publication Info

Year: 2014
Type: article
Pages: 2681-2688
Citations: 61
Access: Closed

Cite This

APA Style

Christoph Feichtenhofer, Axel Pinz, Richard P. Wildes (2014). Bags of Spacetime Energies for Dynamic Scene Recognition. 2681-2688. https://doi.org/10.1109/cvpr.2014.343

Identifiers

DOI: 10.1109/cvpr.2014.343