It is widely recognized that combining multiple classification or regression models typically provides superior results compared to using a single, well-tuned model. However, there are no well known approaches to combining multiple nonhierarchical clusterings. The idea of combining cluster labelings without accessing the original features leads us to a general knowledge reuse framework that we call cluster ensembles. Our contribution in this paper is to formally define the cluster ensemble problem as an optimization problem and to propose three effective and efficient combiners for solving it based on a hypergraph model. Results on synthetic as well as real data sets are given to show that cluster ensembles can (i) improve quality and robustness, and (ii) enable distributed clustering.
This paper presents a novel people detection and tracking method based on a multi-modal sensor fusion approach that utilizes 2D laser range and camera data. The data points in the laser scans are clustered using a novel graph-based method and an SVM based version of the cascaded AdaBoost classifier is trained with a set of geometrical features of these clusters. In the detection phase, the classified laser data is projected into the camera image to define a region of interest for the vision-based people detector. This detector is a fast version of the Implicit Shape Model (ISM) that learns an appearance codebook of local SIFT descriptors from a set of hand-labeled images of pedestrians and uses them in a voting scheme to vote for centers of detected people. The extension consists in a fast and detailed analysis of the spatial distribution of voters per detected person. Each detected person is tracked using a greedy data association method and multiple Extended Kalman Filters that use different motion models. This way, the filter can cope with a variety of different motion patterns. The tracker is asynchronously updated by the detections from the laser and the camera data. Experiments conducted in real-world outdoor scenarios with crowds of pedestrians demonstrate the usefulness of our approach.
In this paper, we address the problem of continually parsing a stream of 3D point cloud data acquired from a laser sensor mounted on a road vehicle. We leverage an online star clustering algorithm coupled with an incremental belief update in an evolving undirected graphical model. The fusion of these techniques allows the robot to parse streamed data and to continually improve its understanding of the world. The core competency produced is an ability to infer object classes from similarities based on appearance and shape features, and to concurrently combine that with a spatial smoothing algorithm incorporating geometric consistency. This formulation of feature-space star clustering modulating the potentials of a spatial graphical model is entirely novel. In our method, the two sources of information: feature similarity and geometrical consistency are fed continu- ally into the system, improving the belief over the class distributions as new data arrives. The algorithm obviates the need for hand-labeled training data and makes no apriori assumptions on the number or characteristics of object categories. Rather, they are learnt incrementally over time from streamed input data. In experiments per- formed on real 3D laser data from an outdoor scene, we show that our approach is capable of obtaining an ever- improving unsupervised scene categorization.
For many of the state-of-the-art computer vision algorithms, image segmentation is an important preprocessing step. As such, several image segmentation algorithms have been proposed, however, with certain reservation due to high computational load and many hand-tuning parameters. Correlation clustering, a graph-partitioning algorithm often used in natural language processing and document clustering, has the potential to perform better than previously proposed image segmentation algorithms. We improve the basic correlation clustering formulation by taking into account higher-order cluster relationships. This improves clustering in the presence of local boundary ambiguities. We first apply the pairwise correlation clustering to image segmentation over a pairwise superpixel graph and then develop higher-order correlation clustering over a hypergraph that considers higher-order relations among superpixels. Fast inference is possible by linear programming relaxation, and also effective parameter learning framework by structured support vector machine is possible. Experimental results on various datasets show that the proposed higher-order correlation clustering outperforms other state-of-the-art image segmentation algorithms.
This paper addresses the task of unsupervised feature learning for three-dimensional occupancy mapping, as a way to segment higher-level structures based on raw unorganized point cloud data. In particular, we focus on detecting planar surfaces, which are common in most structured or semi-structured environments. This segmentation is then used to minimize the amount of parameters necessary to properly create a 3D occupancy model of the surveyed space, thus increasing computational speed and decreasing memory requirements. As the 3D modeling tool, an extension to Hilbert Maps was selected, since it naturally uses a feature-based representation of the environment to achieve real-time performance. Experiments conducted in simulated and real large-scale datasets show a substantial gain in performance, while decreasing the amount of stored information by orders of magnitude without sacrificing accuracy.