Abstract

We introduce UCF101, currently the largest dataset of human actions. It consists of 101 action classes, over 13k clips, and 27 hours of video data. The database consists of realistic user-uploaded videos containing camera motion and cluttered backgrounds. Additionally, we provide baseline action recognition results on this new dataset using a standard bag-of-words approach, with an overall accuracy of 44.5%. To the best of our knowledge, UCF101 is currently the most challenging dataset of actions due to its large number of classes, large number of clips, and the unconstrained nature of those clips.
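The bag-of-words baseline mentioned in the abstract can be sketched as a three-step pipeline: build a visual vocabulary by clustering local descriptors, encode each clip as a histogram of visual-word counts, then train a classifier on the histograms. The following is a minimal sketch only, using synthetic random descriptors as a stand-in for the actual spatio-temporal features and an SVM as an assumed classifier choice; it is not the authors' exact implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Synthetic stand-in for a clip's local spatio-temporal descriptors
# (the real pipeline would extract these from video frames).
def extract_descriptors(label, n=50, dim=16):
    # Shift the mean per class so the toy problem is learnable.
    return rng.normal(loc=label, scale=1.0, size=(n, dim))

train_labels = [0, 1] * 10
train_desc = [extract_descriptors(y) for y in train_labels]

# 1) Build a visual vocabulary by k-means over all training descriptors.
vocab = KMeans(n_clusters=8, n_init=10, random_state=0).fit(np.vstack(train_desc))

# 2) Encode each clip as a normalized histogram of visual-word counts.
def encode(desc):
    words = vocab.predict(desc)
    hist = np.bincount(words, minlength=8).astype(float)
    return hist / hist.sum()

X_train = np.array([encode(d) for d in train_desc])

# 3) Train a linear SVM on the bag-of-words histograms.
clf = LinearSVC(C=1.0).fit(X_train, train_labels)

# Evaluate on held-out synthetic clips.
test_labels = [0, 1] * 5
X_test = np.array([encode(extract_descriptors(y)) for y in test_labels])
accuracy = clf.score(X_test, test_labels)
```

On real data, the vocabulary size, descriptor type, and classifier kernel are the main knobs that determine the reported overall accuracy.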

Keywords

Clips, Upload, Computer science, Action recognition, Artificial intelligence, Action (physics), Motion (physics), Baseline (sea), Human motion, Pattern recognition (psychology), Machine learning, Class (philosophy), World Wide Web

Publication Info

Year: 2024
Type: preprint
Volume: 141
Issue: 5
Pages: 676-7
Citations: 4432
Access: Closed

Citation Metrics

4432 citations (source: OpenAlex)
Cite This

Khurram Soomro, Amir Zamir, Mubarak Shah (2024). UCF-101: A dataset of 101 human actions classes from videos in the wild. arXiv (Cornell University), 141(5), 676-7. https://doi.org/10.57702/46wax343

Identifiers

DOI: 10.57702/46wax343