Abstract

We address the problem of real-time 3D object detection from point clouds in the context of autonomous driving. Speed is critical because detection is a necessary component for safety. Existing approaches are, however, computationally expensive due to the high dimensionality of point clouds. We utilize the 3D data more efficiently by representing the scene from the Bird's Eye View (BEV), and propose PIXOR, a proposal-free, single-stage detector that outputs oriented 3D object estimates decoded from pixel-wise neural network predictions. The input representation, network architecture, and model optimization are specially designed to balance high accuracy and real-time efficiency. We validate PIXOR on two datasets: the KITTI BEV object detection benchmark and a large-scale 3D vehicle detection benchmark. On both datasets we show that the proposed detector notably surpasses other state-of-the-art methods in terms of Average Precision (AP), while still running at 10 FPS.
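To make the Bird's Eye View idea concrete, here is a minimal sketch of how a point cloud might be discretized into a top-down occupancy grid. It is not the authors' implementation: the function name `bev_occupied_cells`, the region-of-interest ranges, and the cell size are all illustrative assumptions, and the grid is stored sparsely as a set of occupied cell indices.

```python
import math


def bev_occupied_cells(points,
                       x_range=(0.0, 70.0),    # assumed forward extent, metres
                       y_range=(-40.0, 40.0),  # assumed lateral extent, metres
                       z_range=(-2.5, 1.0),    # assumed height extent, metres
                       cell=0.1):              # assumed cell size, metres
    """Return the set of (row, col, channel) cells that contain a point.

    Rows index the forward axis, columns the lateral axis, and channels
    the height slices, so each (row, col) pair is one BEV pixel.
    """
    occupied = set()
    for x, y, z in points:
        # Drop points that fall outside the region of interest.
        inside = (x_range[0] <= x < x_range[1]
                  and y_range[0] <= y < y_range[1]
                  and z_range[0] <= z < z_range[1])
        if not inside:
            continue
        row = int(math.floor((x - x_range[0]) / cell))
        col = int(math.floor((y - y_range[0]) / cell))
        chan = int(math.floor((z - z_range[0]) / cell))
        occupied.add((row, col, chan))
    return occupied


# Two nearby points share one cell; the third lies outside the region.
cells = bev_occupied_cells(
    [(10.2, 0.3, -1.0), (10.3, 0.4, -0.9), (100.0, 0.0, 0.0)], cell=0.5)
```

A dense version of this grid can be treated as a multi-channel 2D image, which is what lets a 2D convolutional network produce pixel-wise predictions over the scene.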

Keywords

Benchmark (surveying), Computer science, Object detection, Point cloud, Artificial intelligence, Context (archaeology), Computation, Detector, Computer vision, Representation (politics), Object (grammar), Pixel, Curse of dimensionality, Point (geometry), Pattern recognition (psychology), Algorithm

Publication Info

Year: 2018
Type: preprint
Pages: 7652-7660
Citations: 1284
Access: Closed

Citation Metrics

1284
OpenAlex

Cite This

Bin Yang, Wenjie Luo, Raquel Urtasun (2018). PIXOR: Real-time 3D Object Detection from Point Clouds. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 7652-7660. https://doi.org/10.1109/cvpr.2018.00798

Identifiers

DOI: 10.1109/cvpr.2018.00798