Generalized Convolutional Neural Networks for Point Cloud Data
Over the past half decade, sensors capable of precisely measuring distances have dropped dramatically in price. RGB-D (RGB-Depth) cameras such as the Microsoft Kinect can assign a distance to each individual pixel, and LIDAR (Light Detection and Ranging) scanners have become more effective and more affordable. The combination of these hardware advances with research into SLAM (Simultaneous Localization and Mapping) has allowed robots and self-driving cars to stitch individual images together into maps of their environment. Whereas 2D image-based object detection and segmentation have seen plenty of advancement, the processing of point cloud data still lags behind. This can be attributed partly to the ubiquity of 2D images and the relative scarcity of point cloud data, and partly to the convenient structure of RGB images: spatial relationships between pixels are encoded implicitly by the indices of the pixels in the image matrix. CNNs exploit this efficiently, because individual pixels can be matched with individual weights, resulting in a computationally cheap operation. In a point cloud, however, an individual point can sit at any position in the array, and its spatial information is encoded explicitly alongside its other attributes. A map generated from an RGB-D camera would consist of points each structured as [X, Y, Z, R, G, B].
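The contrast between the two representations can be illustrated with a minimal sketch. It is not taken from the report: the array sizes, the random data, and the helper name `nearest_neighbour` are assumptions made for illustration only. The sketch shows that a pixel's neighbor is found by index arithmetic, whereas a point's neighbor must be found by comparing the explicit X, Y, Z columns, and that the order of rows in the cloud carries no meaning.

```python
import numpy as np

rng = np.random.default_rng(0)

# 2D image: spatial relationships are implied by the array indices.
image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
right_of_center = image[1, 2]  # the pixel to the right of image[1, 1]

# Point cloud: one row per point, laid out as [X, Y, Z, R, G, B].
# (Hypothetical data: 8 random points with random colors.)
xyz = rng.uniform(-1.0, 1.0, size=(8, 3))
rgb = rng.integers(0, 256, size=(8, 3)).astype(np.float64)
cloud = np.hstack([xyz, rgb])


def nearest_neighbour(points, i):
    """Return the row index of the point closest to row i, using XYZ only."""
    distances = np.linalg.norm(points[:, :3] - points[i, :3], axis=1)
    distances[i] = np.inf  # exclude the query point itself
    return int(np.argmin(distances))


# Shuffling the rows describes the same cloud: adjacency in the array
# says nothing about adjacency in space.
perm = rng.permutation(len(cloud))
shuffled = cloud[perm]
```

A brute-force search like this costs O(N) per query, which hints at why matching points to weights is more expensive than the fixed index offsets a CNN uses on an image.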
Jul-20-2017