In this post, I want to review a technique that works directly with point clouds to detect a grasp configuration, i.e. the position and orientation of the gripper. The following picture shows a general overview of the approach. To summarize, the key contributions of this work are:
• A network, based on the PointNet architecture, that evaluates grasp quality by analyzing geometry directly from a 3D point cloud. Compared with other CNN-based methods, it better exploits the 3D geometric information in the depth data without any hand-crafted features, while keeping a relatively small number of parameters for efficient learning and inference.
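As a rough illustration of the PointNet-style evaluation described above, the sketch below scores a point cloud with a shared per-point MLP followed by order-invariant max pooling. The weights, layer sizes, and the `grasp_quality` helper are all hypothetical stand-ins for the trained network, not the authors' actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

def shared_mlp(points, W1, b1, W2, b2):
    # Apply the same small MLP to every point independently (PointNet-style).
    h = np.maximum(points @ W1 + b1, 0.0)        # (N, hidden), ReLU
    return np.maximum(h @ W2 + b2, 0.0)          # (N, feat)

def grasp_quality(points, params):
    W1, b1, W2, b2, w_out, b_out = params
    feats = shared_mlp(points, W1, b1, W2, b2)
    global_feat = feats.max(axis=0)              # symmetric max pooling: order-invariant
    # Map the global feature to a scalar grasp-quality score in (0, 1).
    return 1.0 / (1.0 + np.exp(-(global_feat @ w_out + b_out)))

# Random weights stand in for trained ones.
params = (rng.normal(size=(3, 16)), np.zeros(16),
          rng.normal(size=(16, 32)), np.zeros(32),
          rng.normal(size=32), 0.0)

cloud = rng.normal(size=(128, 3))                # 128 points in the gripper closing region
score = grasp_quality(cloud, params)

# Permuting the points must not change the score -- the key PointNet property.
assert np.isclose(score, grasp_quality(cloud[rng.permutation(128)], params))
```

The max pooling is what makes the score independent of point ordering, so the network can consume a raw, unordered cloud rather than a gridded depth image.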
The digital factory undoubtedly offers great potential for future production systems in terms of efficiency and effectiveness. A key step toward realizing the digital copy of a real factory is understanding complex indoor environments on the basis of 3D data. In order to generate an accurate factory model including the major components, i.e. building parts, product assets and process details, the 3D data collected during digitalization can be processed with advanced deep learning methods. In this work, we propose a fully Bayesian and an approximate Bayesian neural network for point cloud segmentation. This allows us to analyze how different ways of estimating uncertainty in these networks improve segmentation results on raw 3D point clouds. We achieve superior model performance for both the Bayesian and the approximate Bayesian model compared to the frequentist one. This performance difference becomes even more striking when incorporating the networks' uncertainty into their predictions. For evaluation we use the public S3DIS data set as well as a data set collected by the authors at a German automotive production plant. The methods proposed in this work lead to more accurate segmentation results, and the incorporation of uncertainty information makes this approach especially applicable to safety-critical applications.
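One common way to build an approximate Bayesian network of the kind mentioned above is Monte Carlo dropout: keep dropout active at test time and average over stochastic forward passes. The toy sketch below (random weights, a hypothetical `mc_dropout_predict` helper, not the authors' architecture) shows how this yields both a point-wise class prediction and an uncertainty score:

```python
import numpy as np

rng = np.random.default_rng(1)

def mc_dropout_predict(x, W1, W2, n_samples=200, p_drop=0.5):
    # Dropout stays on at inference; averaging the stochastic softmax outputs
    # approximates the Bayesian predictive distribution.
    probs = []
    for _ in range(n_samples):
        mask = (rng.random(W1.shape[1]) > p_drop) / (1.0 - p_drop)
        h = np.maximum(x @ W1, 0.0) * mask          # ReLU layer with dropout
        logits = h @ W2
        e = np.exp(logits - logits.max())
        probs.append(e / e.sum())                   # softmax class probabilities
    mean = np.mean(probs, axis=0)                   # predictive distribution
    # Predictive entropy as a per-point uncertainty score.
    entropy = -np.sum(mean * np.log(mean + 1e-12))
    return mean, entropy

# Random weights stand in for a trained segmentation head; 4 toy classes
# (e.g. building part / product asset / process detail / clutter).
W1 = rng.normal(size=(3, 32))
W2 = rng.normal(size=(32, 4))
point = np.array([0.2, -1.0, 0.5])                  # features of one 3D point
probs, uncertainty = mc_dropout_predict(point, W1, W2)
```

In a safety-critical setting, points whose predictive entropy exceeds a threshold could be flagged for review instead of being trusted blindly.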
Data can take on a variety of forms. For processing visual information, images are extremely common. Images store a two-dimensional grid of pixels that often represents our three-dimensional world. Some of the most successful advances in machine learning have come from problems involving images. However, when capturing data in 3D directly, it is far less common to use a three-dimensional array of voxels representing a full volume.
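To make the contrast concrete, here is a minimal sketch of turning an unordered point cloud into the volumetric analogue of a pixel image, a dense 3D occupancy grid. The voxel size and grid shape are arbitrary choices for illustration:

```python
import numpy as np

def voxelize(points, voxel_size, grid_shape):
    # Convert an unordered (N, 3) point cloud into a dense occupancy grid:
    # each cell is 1 if at least one point falls inside it, else 0.
    grid = np.zeros(grid_shape, dtype=np.uint8)
    idx = np.floor(points / voxel_size).astype(int)
    # Keep only points that fall inside the grid bounds.
    ok = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    grid[tuple(idx[ok].T)] = 1
    return grid

pts = np.array([[0.05, 0.05, 0.05],
                [0.05, 0.06, 0.04],   # lands in the same voxel as the first point
                [0.95, 0.95, 0.95]])
grid = voxelize(pts, voxel_size=0.1, grid_shape=(10, 10, 10))
print(grid.sum())   # → 2 occupied voxels
```

Note the trade-off this exposes: the grid grows cubically with resolution and is mostly empty, which is exactly why methods that consume raw point clouds are attractive.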
In multimodal traffic monitoring, we gather traffic statistics for distinct transportation modes, such as pedestrians, cars and bicycles, in order to analyze and improve people's daily mobility in terms of safety and convenience. On account of its robustness to poor lighting and adverse weather conditions, and its inherent ability to measure speed, the radar sensor is a suitable option for this application. However, the sparse data from conventional commercial radars makes transportation mode classification extremely challenging. Thus, we propose to use a high-resolution millimeter-wave (mmWave) radar sensor to obtain a richer radar point cloud representation for a traffic monitoring scenario. Based on a new feature vector, we use a multivariate Gaussian mixture model (GMM) to perform radar point cloud segmentation, i.e. "point-wise" classification, in an unsupervised setting. In our experiment, we collected radar point clouds for pedestrians and cars, which also contained the inevitable clutter from the surroundings. The experimental results of applying the GMM to the new feature vector demonstrated good segmentation performance in terms of the intersection-over-union (IoU) metric. The detailed methodology and validation metrics are presented and discussed.
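The sketch below shows the general shape of such an unsupervised GMM segmentation: fit a mixture of multivariate Gaussians to per-point feature vectors with EM, assign each point to its most responsible component, and score the result with IoU. The (x, y, Doppler) feature triple, the synthetic clusters, and the `gmm_segment` helper are all assumptions for illustration, not the paper's actual feature vector or pipeline:

```python
import numpy as np

rng = np.random.default_rng(2)

def gmm_segment(X, k=2, iters=50):
    # Point-wise segmentation with a multivariate Gaussian mixture fitted by
    # EM -- a minimal stand-in for an off-the-shelf GMM implementation.
    n, d = X.shape
    means = X[rng.choice(n, k, replace=False)]
    covs = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(k)])
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = np.empty((n, k))
        for j in range(k):
            diff = X - means[j]
            mahal = np.einsum('ni,ij,nj->n', diff, np.linalg.inv(covs[j]), diff)
            norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(covs[j]))
            resp[:, j] = weights[j] * np.exp(-0.5 * mahal) / norm
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixture parameters from the responsibilities.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + 1e-6 * np.eye(d)
    return resp.argmax(axis=1)        # hard point-wise labels

# Toy radar cloud: a hypothetical (x, y, Doppler) feature per point.
pedestrian = rng.normal([0.0, 0.0, 1.2], 0.3, size=(200, 3))    # slow movers
car = rng.normal([8.0, 0.0, 10.0], 0.8, size=(200, 3))          # fast movers
labels = gmm_segment(np.vstack([pedestrian, car]))

# IoU of the pedestrian cluster against the pedestrian ground truth.
ped = np.bincount(labels[:200], minlength=2).argmax()
iou = (labels[:200] == ped).sum() / (200 + (labels[200:] == ped).sum())
```

Because the model is unsupervised, the cluster-to-class mapping is recovered after the fact (here by majority vote against the known split), which is also how an IoU can be computed without training labels.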
3D object recognition is becoming a key capability for many computer vision systems, such as autonomous vehicles, service robots and surveillance drones, to operate more effectively in unstructured environments. These real-time systems require effective classification methods that are robust to sampling resolution, measurement noise, and the pose configuration of objects. Previous research has shown that sparsity, rotational and positional variance of points can lead to a significant drop in the performance of point-cloud-based classification techniques. In this regard, we propose a novel approach for 3D classification that takes sparse point clouds as input and learns a model that is robust to rotational and positional variance as well as point sparsity. To this end, we introduce new feature descriptors which are fed as input to our proposed neural network in order to learn a robust latent representation of the 3D object. We show that such latent representations can significantly improve the performance of object classification and retrieval. Further, we show that our approach outperforms PointNet and 3DmFV by 34.4% and 27.4% respectively in classification tasks using sparse point clouds of only 16 points under arbitrary SO(3) rotations.
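To illustrate what rotation- and position-invariant descriptors look like in the simplest case, the sketch below uses a histogram of point distances to the cloud's centroid. This is only an illustrative stand-in; the paper's actual descriptors are richer, and the `rotation_invariant_descriptor` helper and bin settings are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

def rotation_invariant_descriptor(points, bins=8):
    # Histogram of distances to the centroid: centering removes positional
    # variance, and distances are unchanged by any rotation of the cloud.
    centered = points - points.mean(axis=0)
    dists = np.linalg.norm(centered, axis=1)
    hist, _ = np.histogram(dists, bins=bins, range=(0.0, 10.0))
    return hist / hist.sum()

def random_orthogonal(rng):
    # Haar-random orthogonal matrix via QR; distances are invariant under
    # any rotation (or reflection), so the descriptor is unaffected.
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    return q * np.sign(np.diag(r))

cloud = rng.normal(size=(16, 3))       # sparse cloud of only 16 points
R = random_orthogonal(rng)
d1 = rotation_invariant_descriptor(cloud)
d2 = rotation_invariant_descriptor(cloud @ R.T + np.array([5.0, -2.0, 1.0]))
assert np.allclose(d1, d2)             # invariant to rotation + translation
```

Feeding such descriptors (instead of raw coordinates) to the network means invariance is guaranteed by construction rather than learned through data augmentation, which matters most in the sparse 16-point regime the abstract reports on.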