Abstract

We present AVOD, an Aggregate View Object Detection network for autonomous driving scenarios. The proposed neural network architecture uses LIDAR point clouds and RGB images to generate features that are shared by two subnetworks: a region proposal network (RPN) and a second-stage detector network. The proposed RPN uses a novel architecture capable of performing multimodal feature fusion on high-resolution feature maps to generate reliable 3D object proposals for multiple object classes in road scenes. Using these proposals, the second-stage detection network performs accurate oriented 3D bounding box regression and category classification to predict the extents, orientation, and classification of objects in 3D space. Our proposed architecture is shown to produce state-of-the-art results on the KITTI 3D object detection benchmark [1] while running in real time with a low memory footprint, making it a suitable candidate for deployment on autonomous vehicles. Code is available at: https://github.com/kujason/avod.

Keywords

Computer science, Object detection, Minimum bounding box, Artificial intelligence, Benchmark (surveying), Point cloud, Memory footprint, Computer vision, Feature (linguistics), Orientation (vector space), Feature extraction, Lidar, Object (grammar), Code (set theory), Pattern recognition (psychology), Image (mathematics)

Publication Info

Year: 2018
Type: article
Citations: 1593
Access: Closed

Citation Metrics

OpenAlex: 1593
Influential: 175
CrossRef: 1196

Cite This

Jason S. Ku, Melissa Mozifian, Jungwook Lee et al. (2018). Joint 3D Proposal Generation and Object Detection from View Aggregation. 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). https://doi.org/10.1109/iros.2018.8594049

Identifiers

DOI: 10.1109/iros.2018.8594049
arXiv: 1712.02294
