Goto

Collaborating Authors

 University of Electronic Science and Technology of China


Deep Region Hashing for Generic Instance Search from Images

AAAI Conferences

Instance Search (INS) is a fundamental problem for many applications, while it is more challenging comparing to traditional image search since the relevancy is defined at the instance level. Existing works have demonstrated the success of many complex ensemble systems that are typically conducted by firstly generating object proposals, and then extracting handcrafted and/or CNN features of each proposal for matching. However, object bounding box proposals and feature extraction are often conducted in two separated steps, thus the effectiveness of these methods collapses. Also, due to the large amount of generated proposals, matching speed becomes the bottleneck that limits its application to large-scale datasets. To tackle these issues, in this paper we propose an effective and efficient Deep Region Hashing (DRH) approach for large-scale INS using an image patch as the query. Specifically, DRH is an end-to-end deep neural network which consists of object proposal, feature extraction, and hash code generation. DRH shares full-image convolutional feature map with the region proposal network, thus enabling nearly cost-free region proposals. Also, each high-dimensional, real-valued region features are mapped onto a low-dimensional, compact binary codes for the efficient object region level matching on large-scale dataset. Experimental results on four datasets show that our DRH can achieve even better performance than the state-of-the-arts in terms of mAP, while the efficiency is improved by nearly 100 times.


Sparse Modeling-Based Sequential Ensemble Learning for Effective Outlier Detection in High-Dimensional Numeric Data

AAAI Conferences

The large proportion of irrelevant or noisy features in real-life high-dimensional data presents a significant challenge to subspace/feature selection-based high-dimensional outlier detection (a.k.a. outlier scoring) methods. These methods often perform the two dependent tasks: relevant feature subset search and outlier scoring independently, consequently retaining features/subspaces irrelevant to the scoring method and downgrading the detection performance. This paper introduces a novel sequential ensemble-based framework SEMSE and its instance CINFO to address this issue. SEMSE learns the sequential ensembles to mutually refine feature selection and outlier scoring by iterative sparse modeling with outlier scores as the pseudo target feature. CINFO instantiates SEMSE by using three successive recurrent components to build such sequential ensembles. Given outlier scores output by an existing outlier scoring method on a feature subset, CINFO first defines a Cantelli's inequality-based outlier thresholding function to select outlier candidates with a false positive upper bound. It then performs lasso-based sparse regression by treating the outlier scores as the target feature and the original features as predictors on the outlier candidate set to obtain a feature subset that is tailored for the outlier scoring method. Our experiments show that two different outlier scoring methods enabled by CINFO (i) perform significantly better on 11 real-life high-dimensional data sets, and (ii) have much better resilience to noisy features, compared to their bare versions and three state-of-the-art competitors. The source code of CINFO is available at https://sites.google.com/site/gspangsite/sourcecode.


Discovering and Distinguishing Multiple Visual Senses for Polysemous Words

AAAI Conferences

To reduce the dependence on labeled data, there have been increasing research efforts on learning visual classifiers by exploiting web images. One issue that limits their performance is the problem of polysemy. To solve this problem, in this work, we present a novel framework that solves the problem of polysemy by allowing sense-specific diversity in search results. Specifically, we first discover a list of possible semantic senses to retrieve sense-specific images. Then we merge visual similar semantic senses and prune noises by using the retrieved images. Finally, we train a visual classifier for each selected semantic sense and use the learned sense-specific classifiers to distinguish multiple visual senses. Extensive experiments on classifying images into sense-specific categories and re-ranking search results demonstrate the superiority of our proposed approach.


High Rank Matrix Completion With Side Information

AAAI Conferences

We address the problem of high-rank matrix completion with side information. In contrast to existing work dealing with side information, which assume that the data matrix is low-rank, we consider the more general scenario where the columns of the data matrix are drawn from a union of low-dimensional subspaces, which can lead to a high rank matrix. Our goal is to complete the matrix while taking advantage of the side information. To do so, we use the self-expressive property of the data, searching for a sparse representation of each column of matrix as a combination of a few other columns. More specifically, we propose a factorization of the data matrix as the product of side information matrices with an unknown interaction matrix, under which each column of the data matrix can be reconstructed using a sparse combination of other columns. As our proposed optimization, searching for missing entries and sparse coefficients, is non-convex and NP-hard, we propose a lifting framework, where we couple sparse coefficients and missing values and define an equivalent optimization that is amenable to convex relaxation. We also propose a fast implementation of our convex framework using a Linearized Alternating Direction Method. By extensive experiments on both synthetic and real data, and, in particular, by studying the problem of multi-label learning, we demonstrate that our method outperforms existing techniques in both low-rank and high-rank data regimes.


Unified Spectral Clustering With Optimal Graph

AAAI Conferences

Spectral clustering has found extensive use in many areas. Most traditional spectral clustering algorithms work in three separate steps: similarity graph construction; continuous labels learning; discretizing the learned labels by k-means clustering. Such common practice has two potential flaws, which may lead to severe information loss and performance degradation. First, predefined similarity graph might not be optimal for subsequent clustering. It is well-accepted that similarity graph highly affects the clustering results. To this end, we propose to automatically learn similarity information from data and simultaneously consider the constraint that the similarity matrix has exact c connected components if there are c clusters. Second, the discrete solution may deviate from the spectral solution since k-means method is well-known as sensitive to the initialization of cluster centers. In this work, we transform the candidate solution into a new one that better approximates the discrete one. Finally, those three subtasks are integrated into a unified framework, with each subtask iteratively boosted by using the results of the others towards an overall optimal solution. It is known that the performance of a kernel method is largely determined by the choice of kernels. To tackle this practical problem of how to select the most suitable kernel for a particular data set, we further extend our model to incorporate multiple kernel learning ability. Extensive experiments demonstrate the superiority of our proposed method as compared to existing clustering approaches.


Learning With Incomplete Labels

AAAI Conferences

For many real-world tagging problems, training labels are usually obtained through social tagging and are notoriously incomplete. Consequently, handling data with incomplete labels has become a difficult challenge, which usually leads to a degenerated performance on label prediction. To improve the generalization performance, in this paper, we first propose the Improved Cross-View learning (referred as ICVL) model, which considers both global and local patterns of label relationship to enrich the original label set. Further, by extending the ICVL model with an outlier detection mechanism, we introduce the Improved Cross-View learning with Outlier Detection (referred as ICVL-OD) model to remove the abnormal tags resulting from label enrichment. Extensive evaluations on three benchmark datasets demonstrate that ICVL and ICVL-OD outstand with superior performances in comparison with the competing methods.


A Fast Algorithm to Compute Maximum k -Plexes in Social Network Analysis

AAAI Conferences

A clique model is one of the most important techniques on the cohesive subgraph detection; however, its applications are rather limited due to restrictive conditions of the model. Hence much research resorts to k -plex โ€” a graph in which any vertex is adjacent to all but at most k vertices โ€” which is a relaxation model of the clique. In this paper, we study the maximum k -plex problem and propose a fast algorithm to compute maximum k -plexes by exploiting structural properties of the problem. In an n -vertex graph, the algorithm computes optimal solutions in c n n O(1) time for a constant c < 2 depending only on k . To the best of our knowledge, this is the first algorithm that breaks the trivial theoretical bound of 2 n for each k โ‰ฅ 3. We also provide experimental results over multiple real-world social network instances in support.


Event Video Mashup: From Hundreds of Videos to Minutes of Skeleton

AAAI Conferences

The explosive growth of video content on the Web has been revolutionizing the way people share, exchange and perceive information, such as events. While an individual video usually concerns a specific aspect of an event, the videos that are uploaded by different users at different locations and times can embody different emphasis and compensate each other in describing the event. Combining these videos from different sources together can unveil a more complete picture of the event. Simply concatenating videos together is an intuitive solution, but it may degrade user experience since it is time-consuming and tedious to view those highly redundant, noisy and disorganized content. Therefore, we develop a novel approach, termed event video mashup (EVM), to automatically generate a unified short video from a collection of Web videos to describe the storyline of an event. We propose a submodular based content selection model that embodies both importance and diversity to depict the event from comprehensive aspects in an efficient way. Importantly, the video content is organized temporally and semantically conforming to the event evolution. We evaluate our approach on a real-world YouTube event dataset collected by ourselves. The extensive experimental results demonstrate the effectiveness of the proposed framework.


Discrete Personalized Ranking for Fast Collaborative Filtering from Implicit Feedback

AAAI Conferences

Personalized ranking is usually considered as an ultimate goal of recommendation systems, but it suffers from efficiency issues when making recommendations. To this end, we propose a learning-based hashing framework called Discrete Personalized Ranking (DPR), to map users and items to a Hamming space, where user-item affinity can be efficiently calculated via Hamming distance. Due to the existence of discrete constraints, it is possible to exploit a two-stage learning procedure for learning binary codes according to most existing methods. This two-stage procedure consists of relaxed optimization by discarding discrete constraints and subsequent binary quantization. However, such a procedure has been shown resulting in a large quantization loss, so that longer binary codes would be required. To this end, DPR directly tackles the discrete optimization problem of personalized ranking. And the balance and un-correlation constraints of binary codes are imposed to derive compact but informatics binary codes. Based on the evaluation on several datasets, the proposed framework shows consistent superiority to the competing baselines even though only using shorter binary code.


Maximizing the Probability of Arriving on Time: A Practical Q-Learning Method

AAAI Conferences

The stochastic shortest path problem is of crucial importance for the development of sustainable transportation systems. Existing methods based on the probability tail model seek for the path that maximizes the probability of arriving at the destination before a deadline. However, they suffer from low accuracy and/or high computational cost. We design a novel Q-learning method where the converged Q-values have the practical meaning as the actual probabilities of arriving on time so as to improve accuracy. By further adopting dynamic neural networks to learn the value function, our method can scale well to large road networks with arbitrary deadlines. Experimental results on real road networks demonstrate the significant advantages of our method over other counterparts.