Abstract

Building discriminative representations for 3D data has been an important task in computer graphics and computer vision research. Convolutional Neural Networks (CNNs) have been shown to operate on 2D images with great success for a variety of tasks. Lifting convolution operators to 3D (3D CNNs) therefore seems like a plausible and promising next step. Unfortunately, the computational complexity of 3D CNNs grows cubically with respect to voxel resolution. Moreover, since most 3D geometry representations are boundary based, occupied regions do not grow proportionately with the size of the discretization, resulting in wasted computation. In this work, we represent 3D spaces as volumetric fields and propose a novel design that employs field probing filters to efficiently extract features from them. Each field probing filter is a set of probing points -- sensors that perceive the space. Our learning algorithm optimizes not only the weights associated with the probing points, but also their locations, which deforms the shape of the probing filters and adaptively distributes them in 3D space. The optimized probing points sense the 3D space intelligently, rather than operating blindly over the entire domain. We show that field probing is significantly more efficient than 3D CNNs, while providing state-of-the-art performance on classification tasks for 3D object recognition benchmark datasets.
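
To make the probing mechanism concrete, the sketch below (a minimal illustration, not the paper's implementation; all names, shapes, and hyperparameters are assumptions) computes the forward pass of a single field probing filter in Python/NumPy: it samples a scalar volumetric field at a set of continuous 3D probing-point locations via trilinear interpolation and takes a weighted sum of the samples. In the full method, both the point locations and the per-point weights would be optimized by backpropagation.

    import numpy as np

    def trilinear_sample(field, points):
        """Sample a volumetric field (D x H x W) at continuous 3D points.

        points: (N, 3) array of (x, y, z) coordinates in voxel units.
        Returns an (N,) array of trilinearly interpolated field values.
        """
        D, H, W = field.shape
        # Clamp so that the +1 neighbor stays inside the volume.
        p = np.clip(points, 0, np.array([W, H, D]) - 1.001)
        x, y, z = p[:, 0], p[:, 1], p[:, 2]
        x0 = np.floor(x).astype(int)
        y0 = np.floor(y).astype(int)
        z0 = np.floor(z).astype(int)
        x1, y1, z1 = x0 + 1, y0 + 1, z0 + 1
        fx, fy, fz = x - x0, y - y0, z - z0  # fractional offsets

        # Interpolate along x, then y, then z (field is indexed [z, y, x]).
        c00 = field[z0, y0, x0] * (1 - fx) + field[z0, y0, x1] * fx
        c01 = field[z1, y0, x0] * (1 - fx) + field[z1, y0, x1] * fx
        c10 = field[z0, y1, x0] * (1 - fx) + field[z0, y1, x1] * fx
        c11 = field[z1, y1, x0] * (1 - fx) + field[z1, y1, x1] * fx
        c0 = c00 * (1 - fy) + c10 * fy
        c1 = c01 * (1 - fy) + c11 * fy
        return c0 * (1 - fz) + c1 * fz

    # A single "field probing filter": learnable point locations and weights.
    rng = np.random.default_rng(0)
    resolution = 32
    field = rng.random((resolution, resolution, resolution))  # stand-in volumetric field
    locations = rng.uniform(0, resolution - 1, size=(8, 3))   # 8 probing points (learnable)
    weights = rng.standard_normal(8)                          # per-point weights (learnable)

    response = weights @ trilinear_sample(field, locations)   # scalar filter response
    print(response)

Because trilinear interpolation is differentiable with respect to the point coordinates, gradients can flow to the probing locations as well as to the weights; this is what allows the filters to deform and redistribute themselves in 3D space during training.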

Keywords

Computer science, Convolutional neural network, Artificial intelligence, Benchmark (surveying), Discriminative model, Field (mathematics), Convolution (computer science), Computer graphics, Pattern recognition (psychology), Computer vision, Voxel, Computation, Filter (signal processing), Set (abstract data type), Artificial neural network, Algorithm, Mathematics

Publication Info

Year: 2016
Type: article
Volume: 29
Pages: 307-315
Citations: 121
Access: Closed

Citation Metrics

121 (OpenAlex)

Cite This

Yangyan Li, Sören Pirk, Hao Su et al. (2016). FPNN: Field Probing Neural Networks for 3D Data. arXiv (Cornell University), 29, 307-315.