
Deep Representation Learning with Target Coding

AAAI Conferences

We consider the problem of learning deep representations when target labels are available. In this paper, we show that there exists an intrinsic relationship between target coding and feature representation learning in deep networks. Specifically, we found that a distributed binary code with error-correcting capability is more capable of encouraging discriminative features than the 1-of-K coding that is typically used in supervised deep learning. This new finding reveals an additional benefit of using error-correcting codes for deep model learning, apart from their well-known error-correcting property. Extensive experiments are conducted on popular visual benchmark datasets.
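The contrast between 1-of-K targets and a distributed error-correcting code can be illustrated with a small sketch. It assumes a Hadamard code built by the Sylvester construction as the error-correcting code; the abstract does not name a specific code, and the function names `hadamard`, `one_of_k`, `hadamard_targets`, and `min_hamming` are hypothetical. The sketch only compares the minimum Hamming distance between class codewords and does not train a network.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # [[H, H], [H, -H]] doubles the order at each step.
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def one_of_k(k):
    """Conventional 1-of-K (one-hot) target codes for k classes."""
    return [[1 if i == j else 0 for j in range(k)] for i in range(k)]


def hadamard_targets(k, length):
    """Assumed target code: k binary codewords taken from a Hadamard matrix.

    The all-ones first row and the constant first column are dropped,
    and the remaining entries are mapped from {-1, +1} to {0, 1}.
    """
    if k > length - 1:
        raise ValueError("not enough codewords for k classes")
    h = hadamard(length)
    return [[(x + 1) // 2 for x in row[1:]] for row in h[1:k + 1]]


def min_hamming(codes):
    """Smallest Hamming distance between any two distinct codewords."""
    return min(
        sum(a != b for a, b in zip(u, v))
        for i, u in enumerate(codes)
        for v in codes[i + 1:]
    )


if __name__ == "__main__":
    print(min_hamming(one_of_k(4)))             # distance between one-hot codes
    print(min_hamming(hadamard_targets(4, 8)))  # distance between Hadamard codes
```

A larger minimum distance means more output bits may be wrong before two classes are confused, which is the usual error-correcting argument; the abstract's claim about more discriminative features is a separate finding that this sketch does not test.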


Complex Event Detection via Event Oriented Dictionary Learning

AAAI Conferences

Complex event detection is a retrieval task with the goal of finding videos of a particular event in a large-scale unconstrained internet video archive, given example videos and text descriptions. Nowadays, different multimodal fusion schemes of low-level and high-level features are extensively investigated and evaluated for the complex event detection task. However, how to effectively select the high-level semantically meaningful concepts from a large pool to assist complex event detection is rarely studied in the literature. In this paper, we propose two novel strategies to automatically select semantically meaningful concepts for the event detection task based on both the events-kit text descriptions and the concepts' high-level feature descriptions. Moreover, we introduce a novel event oriented dictionary representation based on the selected semantic concepts. Towards this goal, we incorporate training samples of selected concepts from the Semantic Indexing (SIN) dataset, which has a pool of 346 concepts, into a novel supervised multi-task dictionary learning framework. Extensive experimental results on the TRECVID Multimedia Event Detection (MED) dataset demonstrate the efficacy of our proposed method.


Multi-View Point Registration via Alternating Optimization

AAAI Conferences

Multi-view point registration is a less studied problem than two-view point registration. Directly applying pairwise registration often leads to matching discrepancies, as the mapping between two point sets can be determined either by direct correspondences or by any intermediate point set. Also, local two-view registration tends to be sensitive to noise. We propose a novel multi-view registration method, where the optimal registration is achieved via an efficient and effective alternating concave minimization process. We further extend our solution to the general case, common in practice, of registration among point sets with different cardinalities. Extensive empirical evaluations of peer methods on both synthetic data and real images suggest our method is robust to large disturbances. In particular, it is shown that our method outperforms peer point matching methods and performs competitively against graph matching approaches. The latter approaches utilize additional second-order information at the cost of exponentially increased run-time, and are thus usually less efficient.


Robust Subspace Clustering via Thresholding Ridge Regression

AAAI Conferences

Given a data set from a union of multiple linear subspaces, a robust subspace clustering algorithm fits each group of data points with a low-dimensional subspace and then clusters these data even though they are grossly corrupted or sampled from the union of dependent subspaces. Under the framework of spectral clustering, recent works using sparse representation, low rank representation and their extensions achieve robust clustering results by formulating the errors (e.g., corruptions) into their objective functions so that the errors can be removed from the inputs. However, these approaches suffer from the limitation that the structure of the errors must be known as prior knowledge. In this paper, we present a new method of robust subspace clustering by eliminating the effect of the errors from the projection space (representation) rather than from the input space. We first prove that ℓ1-, ℓ2-, and ℓ∞-norm-based linear projection spaces share the property of intra-subspace projection dominance, i.e., the coefficients over intra-subspace data points are larger than those over inter-subspace data points. Based on this property, we propose a robust and efficient subspace clustering algorithm, called Thresholding Ridge Regression (TRR). TRR calculates the ℓ2-norm-based coefficients of a given data set and applies a hard thresholding operator; the coefficients are then used to build a similarity graph for clustering. Experimental studies show that TRR outperforms the state-of-the-art methods with respect to clustering quality, robustness, and running time.
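The three steps named in the abstract (ℓ2-norm-based coefficients, hard thresholding, similarity graph) can be sketched as follows. This is a minimal illustration under assumptions the abstract does not state: each point is regressed on all other points with ridge penalty `lam`, hard thresholding keeps the `k` largest-magnitude coefficients, and the graph weight is the symmetrized absolute coefficient. All function names are hypothetical, and the final clustering step on the graph is omitted.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def ridge_coefficients(points, i, lam):
    """Ridge regression of points[i] on all other points.

    Solves (X^T X + lam * I) c = X^T x_i, where the columns of X are
    the other points. A zero is inserted at position i (no self-link).
    """
    others = [p for j, p in enumerate(points) if j != i]
    gram = [
        [dot(u, v) + (lam if r == c else 0.0) for c, v in enumerate(others)]
        for r, u in enumerate(others)
    ]
    rhs = [dot(u, points[i]) for u in others]
    coef = solve(gram, rhs)
    coef.insert(i, 0.0)
    return coef


def hard_threshold(coef, k):
    """Keep the k largest-magnitude coefficients and zero the rest (assumed rule)."""
    order = sorted(range(len(coef)), key=lambda j: abs(coef[j]), reverse=True)
    keep = set(order[:k])
    return [c if j in keep else 0.0 for j, c in enumerate(coef)]


def trr_similarity(points, lam=0.1, k=1):
    """Symmetric similarity graph built from thresholded ridge coefficients."""
    n = len(points)
    c = [hard_threshold(ridge_coefficients(points, i, lam), k) for i in range(n)]
    return [[abs(c[i][j]) + abs(c[j][i]) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Two points on the x-axis and two on the y-axis: two 1-D subspaces.
    pts = [(1.0, 0.0), (2.0, 0.0), (0.0, 1.0), (0.0, 3.0)]
    for row in trr_similarity(pts):
        print([round(w, 3) for w in row])
```

In this toy input the two subspaces are orthogonal, so the cross-subspace coefficients vanish even before thresholding; the thresholding step matters when subspaces are dependent or the data are corrupted, which is the setting the abstract targets.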


Automatic Topic Discovery for Multi-Object Tracking

AAAI Conferences

This paper proposes a new approach to multi-object tracking by semantic topic discovery. We dynamically cluster frame-by-frame detections and treat objects as topics, allowing the application of the Dirichlet Process Mixture Model (DPMM). The tracking problem is cast as a topic-discovery task where the video sequence is treated analogously to a document. This formulation addresses tracking issues such as object exclusivity constraints as well as cannot-link constraints which are integrated without the need for heuristic thresholds. The video is temporally segmented into epochs to model the dynamics of word (superpixel) co-occurrences and to model the temporal damping effect. In experiments on public data sets we demonstrate the effectiveness of the proposed algorithm.


Sparse Deep Stacking Network for Image Classification

AAAI Conferences

Sparse coding can learn representations that are robust to noise and can model higher-order representations for image classification. However, the inference algorithm is computationally expensive even though supervised signals are used to learn compact and discriminative dictionaries in sparse coding techniques. Fortunately, a simplified neural network module (SNNM) has been proposed to directly learn the discriminative dictionaries and avoid the expensive inference. However, the SNNM module ignores sparse representations. Therefore, we propose a sparse SNNM module by adding mixed-norm regularization (l1/l2 norm). The sparse SNNM modules are further stacked to build a sparse deep stacking network (S-DSN). In the experiments, we evaluate S-DSN on four databases: Extended YaleB, AR, 15 scene and Caltech101. Experimental results show that our model outperforms related classification methods with only a linear classifier. It is worth noting that we reach 98.8% recognition accuracy on 15 scene.
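The mixed-norm (l1/l2) regularizer mentioned above can be illustrated with a short sketch. It assumes the common definition, in which each row of hidden activations is split into consecutive, non-overlapping groups of `group_size` units, an l2 norm is taken within each group, and the group norms are summed (an l1 norm across groups). The grouping scheme and the name `mixed_norm` are assumptions; the abstract does not specify them.

```python
import math


def mixed_norm(activations, group_size):
    """Sum of per-group l2 norms (l1 across groups, l2 within each group).

    `activations` is a list of rows, one row of hidden-unit values per sample.
    """
    total = 0.0
    for row in activations:
        for start in range(0, len(row), group_size):
            group = row[start:start + group_size]
            total += math.sqrt(sum(v * v for v in group))
    return total


if __name__ == "__main__":
    # Same values, different placement across two groups of two units.
    print(mixed_norm([[3.0, 4.0, 0.0, 0.0]], 2))  # activity in one group
    print(mixed_norm([[3.0, 0.0, 4.0, 0.0]], 2))  # activity in two groups
```

Under this definition the penalty is smaller when the nonzero activations fall within one group than when they are spread across groups, so minimizing it tends to switch off whole groups of units.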


A Local Sparse Model for Matching Problem

AAAI Conferences

The feature matching problem that incorporates pairwise constraints is usually formulated as a quadratic assignment problem (QAP). Since it is NP-hard, relaxation models are required. In this paper, we first formulate the QAP from the match selection point of view, and then propose a local sparse model for the matching problem. Our local sparse matching (LSM) method has the following advantages: (1) it is parameter-free; (2) it generates a local sparse solution that is closer to a discrete matrix than those of most other continuous relaxation methods for the matching problem; (3) the one-to-one matching constraints are better maintained in the LSM solution. Promising experimental results show the effectiveness of the proposed LSM method.


Compute Less to Get More: Using ORC to Improve Sparse Filtering

AAAI Conferences

Sparse Filtering is a popular feature learning algorithm for image classification pipelines. In this paper, we connect the performance of Sparse Filtering with spectral properties of the corresponding feature matrices. This connection provides new insights into Sparse Filtering; in particular, it suggests early stopping of Sparse Filtering. We therefore introduce the Optimal Roundness Criterion (ORC), a novel stopping criterion for Sparse Filtering. We show that this stopping criterion is related to pre-processing procedures such as Statistical Whitening and demonstrate that it can make image classification with Sparse Filtering considerably faster and more accurate.


Learning Predictable and Discriminative Attributes for Visual Recognition

AAAI Conferences

Utilizing attributes for visual recognition has attracted increasing interest because attributes can effectively bridge the semantic gap between low-level visual features and high-level semantic labels. In this paper, we propose a novel method for learning predictable and discriminative attributes. Specifically, we require that the learned attributes can be reliably predicted from visual features and that they uncover the inherent discriminative structure of the data. In addition, we propose to exploit the intra-category locality of data to overcome the intra-category variance in visual data. We conduct extensive experiments on the Animals with Attributes (AwA) and Caltech256 datasets, and the results demonstrate that the proposed method achieves state-of-the-art performance.


Building Effective Representations for Sketch Recognition

AAAI Conferences

With the growing popularity of touch-screen devices, understanding a user's hand-drawn sketch has become an increasingly important research topic in artificial intelligence and computer vision. However, unlike natural images, hand-drawn sketches are often highly abstract, with sparse visual information and large intra-class variance, making the problem more challenging. In this work, we study how to build effective representations for sketch recognition. First, to capture saliency patterns of different scales and spatial arrangements, a Gabor-based low-level representation is proposed. Then, based on this representation, to discover more complex patterns in a sketch, a Hybrid Multilayer Sparse Coding (HMSC) model is proposed to learn mid-level representations. An improved dictionary learning algorithm is also leveraged in HMSC to reduce overfitting to common but trivial patterns. Extensive experiments show that the proposed representations are highly discriminative and lead to large improvements over the state of the art.