A variety of real-world tasks involve the classification of images into pre-determined categories. Designing image classification algorithms that exhibit robustness to acquisition noise and image distortions, particularly when the available training data are insufficient to learn accurate models, is a significant challenge. This dissertation explores the development of discriminative models for robust image classification that exploit underlying signal structure, via probabilistic graphical models and sparse signal representations. Probabilistic graphical models are widely used in many applications to approximate high-dimensional data in a reduced complexity set-up. Learning graphical structures to approximate probability distributions is an area of active research. Recent work has focused on learning graphs in a discriminative manner with the goal of minimizing classification error. In the first part of the dissertation, we develop a discriminative learning framework that exploits the complementary yet correlated information offered by multiple representations (or projections) of a given signal/image. Specifically, we propose a discriminative tree-based scheme for feature fusion by explicitly learning the conditional correlations among such multiple projections in an iterative manner. Experiments reveal the robustness of the resulting graphical model classifier to training insufficiency.
This research involves a method and system that integrates multimodal human-computer interaction with reactive planning to operate a telerobot for use as an assistive device. The Multimodal User Supervised Interface and Intelligent Control (MUSIIC) strategy is a novel approach for an intelligent assistive telerobotic system. This approach to robotic interaction is both a step towards addressing the problem of allowing individuals with physical disabilities to operate a robot in an unstructured environment and an illustration of general principles of integrating speech-deictic gesture control with a knowledge-driven reactive planner and a stereo-vision system.
Introduction
While interfacing and control are areas of rehabilitation robotics that have been significantly researched [Foulds, 1986; Gilbert and Trefsger, 1990], none of the resulting prototypes has met the requirements of the user community. To achieve effective use by individuals with disabilities, the prototypes have taken two interface approaches: command-based and control-based. In command-based interfaces, the robot is programmed with predefined movements, and the items that the robot manipulates are expected to be in predetermined locations [Seamone and Schmeisser, 1986; Fu, 1986; Gilbert and Foulds, 1987; Van der loos et al., 1990; Van der loos et al., 1991; Hammel, Van der loos, Perkash, 1991; Beitler, Stanger, Howell, 1994], which limits their use to a preset workstation-type environment.
[...] widely advocated for image classification problems. To further sharpen their discriminative capabilities, most state-of-the-art DL methods have additional constraints included in the learning stages. These various constraints, however, lead to additional computational complexity. We hence propose an efficient Discriminative Convolutional Analysis Dictionary Learning (DCADL) method, as a lower cost Discriminative DL framework, to both characterize the image structures and refine the interclass structure representations. The proposed DCADL jointly learns a convolutional analysis dictionary and a universal classifier, while greatly reducing the time complexity in both training and testing phases, and achieving a [...]
Yang et al. used Fisher Information criterion in their class-specific reconstruction errors to compose their approach. Besides SDL, Analysis Dictionary Learning (ADL) [8, 9] has recently been of interest on account of its fast encoding and stability attributes. ADL provides a linear transformation of a signal to a nearly sparse representation. Inspired by the SDL methodology in image classification, ADL has also been adapted to the supervised learning problems by promoting discriminative sparse representations [10, 11]. In , Guo et al. incorporated both a topological structure and a representation similarity constraint to encourage a suitable class-selective representation for a 1-Nearest Neighbor classifier.
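The statement that ADL provides a linear transformation of a signal to a nearly sparse representation can be illustrated with a small sketch. This is not the DCADL method: the operator below is a hand-picked first-difference matrix standing in for a learned (or convolutional) analysis dictionary, and the names `analyze`, `hard_threshold`, `difference_operator` and the threshold `tau` are hypothetical choices made for illustration only.

```python
def difference_operator(n):
    """(n-1) x n first-difference matrix, an assumed stand-in for a learned dictionary."""
    return [
        [-1.0 if j == i else 1.0 if j == i + 1 else 0.0 for j in range(n)]
        for i in range(n - 1)
    ]

def analyze(omega, x):
    """Apply the analysis operator: one inner product per dictionary row."""
    return [sum(w * v for w, v in zip(row, x)) for row in omega]

def hard_threshold(z, tau):
    """Zero every coefficient whose magnitude does not exceed tau."""
    return [c if abs(c) > tau else 0.0 for c in z]

# A nearly piecewise-constant signal with one jump between index 2 and 3.
x = [2.0, 2.0, 2.05, 5.0, 5.0, 4.95]
omega = difference_operator(len(x))
z = hard_threshold(analyze(omega, x), tau=0.1)
```

For this signal, only the coefficient located at the jump survives thresholding, so encoding costs a single matrix-vector product followed by an elementwise operation.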
Sparse representation based classification (SRC) has achieved great success in image recognition. Motivated by the fact that the kernel trick can capture the nonlinear similarity of features, which may help improve the separability and margin between nearby data points, we propose Euler SRC for image classification, which is essentially SRC with Euler sparse representation. Specifically, it first maps the images into the complex space by Euler representation, which is largely insensitive to outliers and illumination changes, and then performs complex SRC with Euler representation. The major advantage of our method is that the Euler representation is explicit and does not increase the dimensionality of the image space, which makes the technique easy to deploy in real applications. To solve Euler SRC, we present an efficient algorithm that is fast and converges well. Extensive experimental results illustrate that Euler SRC outperforms traditional SRC in image classification.
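The mapping step can be sketched as follows. The abstract does not state the formula, so this sketch assumes the commonly used form z = exp(i·alpha·pi·x)/sqrt(2) for intensities x in [0, 1]; the value `alpha=1.9` and the names `euler_map` and `squared_distance` are likewise assumptions. Only the mapping is shown, not the complex sparse coding or the classifier.

```python
import cmath
import math

def euler_map(pixels, alpha=1.9):
    """Map each intensity p in [0, 1] to exp(i*alpha*pi*p) / sqrt(2) (assumed form)."""
    return [cmath.exp(1j * alpha * math.pi * p) / math.sqrt(2) for p in pixels]

def squared_distance(u, v):
    """Squared Euclidean distance between two complex vectors."""
    return sum(abs(a - b) ** 2 for a, b in zip(u, v))

clean = [0.2, 0.4, 0.6, 0.8]
corrupted = [0.2, 0.4, 0.6, 0.0]  # last pixel replaced by an outlier

z_clean = euler_map(clean)
z_corrupted = euler_map(corrupted)

# Same number of entries as the input: the mapping is explicit and
# does not enlarge the feature dimension.
same_dimension = len(z_clean) == len(clean)

# Each pixel contributes 1 - cos(alpha*pi*(a - b)) <= 2 to the squared
# distance, so a single corrupted pixel has bounded influence.
gap = squared_distance(z_clean, z_corrupted)
```

Under this assumed form, every mapped entry has modulus 1/sqrt(2), and the per-pixel contribution to the distance saturates instead of growing with the size of the corruption.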
The Gromov-Hausdorff distance provides a metric on the set of isometry classes of compact metric spaces. Unfortunately, computing this metric directly is believed to be computationally intractable. Motivated by applications in shape matching and point-cloud comparison, we study a semidefinite programming relaxation of the Gromov-Hausdorff metric. This relaxation can be computed in polynomial time, and somewhat surprisingly is itself a pseudometric. We describe the induced topology on the set of compact metric spaces. Finally, we demonstrate the numerical performance of various algorithms for computing the relaxed distance and apply these algorithms to several relevant data sets. In particular we propose a greedy algorithm for finding the best correspondence between finite metric spaces that can handle hundreds of points.
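For finite metric spaces, the Gromov-Hausdorff distance equals half the smallest distortion over all correspondences between the two point sets. The sketch below illustrates that quantity only. It is neither the semidefinite relaxation nor the greedy algorithm described above: it searches bijections exhaustively, so it assumes equal-size spaces, returns an upper bound on the distance, and is practical only for a handful of points. The function names are hypothetical.

```python
from itertools import permutations

def distortion(dx, dy, pairs):
    """Largest mismatch |dx[i][k] - dy[j][l]| over matched pairs (i, j), (k, l)."""
    return max(abs(dx[i][k] - dy[j][l]) for i, j in pairs for k, l in pairs)

def gh_upper_bound(dx, dy):
    """Half the smallest distortion over all bijections (equal sizes assumed)."""
    n = len(dx)
    if n != len(dy):
        raise ValueError("this sketch only handles spaces of equal size")
    best = min(
        distortion(dx, dy, list(enumerate(perm)))
        for perm in permutations(range(n))
    )
    return best / 2

# Distance matrices of two 3-point spaces: equilateral triangles
# with side length 1 and side length 2.
dx = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
dy = [[0, 2, 2], [2, 0, 2], [2, 2, 0]]

bound = gh_upper_bound(dx, dy)
```

The factorial cost of the exhaustive search is one way to see why relaxations and greedy heuristics are attractive once the spaces contain more than a few points.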