Goto

Collaborating Authors

 Harbin Institute of Technology


Learning Heterogeneous Dictionary Pair with Feature Projection Matrix for Pedestrian Video Retrieval via Single Query Image

AAAI Conferences

Person re-identification (re-id) plays an important role in video surveillance and forensics applications. In many cases, person re-id needs to be conducted between an image and a video clip, e.g., re-identifying a suspect from large quantities of pedestrian videos given a single image of him. We refer to re-id in this scenario as image-to-video person re-id (IVPR). In practice, images and videos are usually represented with different features, and there usually exist large variations between frames within each video. These factors make matching between image and video very challenging. In this paper, we propose a joint feature projection matrix and heterogeneous dictionary pair learning (PHDL) approach for IVPR. Specifically, PHDL jointly learns an intra-video projection matrix and a pair of heterogeneous image and video dictionaries. With the learned projection matrix, the influence of variations within each video on the matching can be reduced. With the learned dictionary pair, the heterogeneous image and video features can be transformed into coding coefficients of the same dimension, so that matching can be conducted on the coding coefficients. Furthermore, to ensure that the obtained coding coefficients have favorable discriminability, PHDL designs a point-to-set coefficient discriminant term. Experiments on the public iLIDS-VID and PRID 2011 datasets demonstrate the effectiveness of the proposed approach.
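The matching pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's optimization: the dictionaries `D_img` and `D_vid` and the projection `P` are assumed already learned, and ridge-regularized coding stands in for the sparse coding an actual dictionary-learning method would use; all names are hypothetical.

```python
import numpy as np

def code(D, x, lam=0.1):
    """Ridge-regularized coding: argmin_a ||x - D a||^2 + lam ||a||^2."""
    k = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ x)

def match_image_to_videos(x_img, video_feats, D_img, D_vid, P, lam=0.1):
    """Rank videos for a query image by distance between coding coefficients.

    x_img:       (d_img,) image feature
    video_feats: list of (d_vid, n_frames) per-video frame features
    D_img:       (d_img, k) image dictionary
    D_vid:       (p, k)     video dictionary (acts on projected frames)
    P:           (p, d_vid) intra-video projection matrix
    """
    a_img = code(D_img, x_img, lam)
    dists = []
    for V in video_feats:
        # project frames to suppress intra-video variation, then average
        v = (P @ V).mean(axis=1)
        a_vid = code(D_vid, v, lam)
        dists.append(np.linalg.norm(a_img - a_vid))
    return np.argsort(dists)  # video indices, best match first
```

Because both modalities land in the same k-dimensional coefficient space, a plain Euclidean distance suffices for ranking.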


Affective Computing and Applications of Image Emotion Perceptions

AAAI Conferences

Images can convey rich semantics and evoke strong emotions in viewers. My PhD thesis focuses on image emotion computing (IEC), which aims to predict the emotion perceptions of given images. The development of IEC is greatly constrained by two main challenges: the affective gap and subjective evaluation. Previous works mainly focused on finding features that can better express emotions to bridge the affective gap, such as elements-of-art-based features and shape features. According to the emotion representation models, including categorical emotion states (CES) and dimensional emotion space (DES), three different tasks are traditionally performed in IEC: affective image classification, regression, and retrieval. The state-of-the-art methods on the three tasks above are image-centric, focusing on the dominant emotions for the majority of viewers. For my PhD thesis, I plan to answer the following questions: (1) Compared to the low-level elements-of-art-based features, can we find higher-level features that are more interpretable and have a stronger link to emotions? (2) Are the emotions that an image evokes in different viewers subjective and different? If they are, how can we tackle user-centric emotion prediction? (3) For image-centric emotion computing, can we predict the emotion distribution instead of the dominant emotion category?


Two-Stream Contextualized CNN for Fine-Grained Image Classification

AAAI Conferences

The human cognitive system suggests that context information provides a potentially powerful clue when recognizing objects. However, for fine-grained image classification, the contribution of context may vary over different images, and sometimes the context even confuses the classification result. To alleviate this problem, we develop a novel approach, the two-stream contextualized Convolutional Neural Network, which provides a simple but efficient context-content joint classification model under a deep learning framework. The network requires merely the raw image and a coarse segmentation as input to extract both content and context features, without the need for human interaction. Moreover, our network adopts a weighted fusion scheme to combine the content and context classifiers, while a subnetwork is introduced to adaptively determine the weight for each image. According to our experiments on public datasets, our approach achieves considerably high recognition accuracy without any tedious human involvement, compared with state-of-the-art approaches.
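The weighted fusion step can be illustrated with a small sketch. It assumes the weight subnetwork's raw output is available as a single gating score, which a sigmoid maps to a per-image weight w in [0, 1]; this is a simplification of the paper's adaptive subnetwork, and all names are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def fused_prediction(content_logits, context_logits, gate_score):
    """Combine content- and context-stream classifiers with an adaptive weight.

    gate_score stands in for the output of the weight subnetwork; a sigmoid
    maps it to w in [0, 1], and the fused posterior is
    w * p_content + (1 - w) * p_context.
    """
    w = 1.0 / (1.0 + np.exp(-gate_score))
    return w * softmax(content_logits) + (1.0 - w) * softmax(context_logits)
```

A large positive gate score lets the content stream dominate for images where context is misleading; a large negative score does the opposite.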


BRBA: A Blocking-Based Association Rule Hiding Method

AAAI Conferences

Privacy preservation in association mining is an important research topic in the database security field. This paper proposes a blocking-based method to solve the association rule hiding problem for data sharing. It aims at reducing undesirable side effects and increasing desirable side effects, while ensuring that all sensitive rules are concealed. Candidate transactions are selected for sanitization based on their relations with border rules. Comparative experiments on real datasets demonstrate that the proposed method achieves its goals.
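The blocking idea, replacing item values with unknowns ("?") so that the support and confidence of a sensitive rule become indeterminate, can be sketched as follows. The border-rule-based transaction selection is simplified away here; this illustrates blocking itself, not the paper's full BRBA method, and all names are hypothetical.

```python
def block_sensitive_rule(transactions, rule, k):
    """Block the consequent item with '?' in up to k supporting transactions,
    so the confidence of a sensitive rule antecedent -> consequent can no
    longer be computed with certainty.

    transactions: list of sets of items (mutated in place)
    rule: (antecedent_item, consequent_item)
    Returns the number of transactions actually blocked.
    """
    a, c = rule
    blocked = 0
    for t in transactions:
        if blocked >= k:
            break
        if a in t and c in t:
            t.remove(c)   # hide the true value ...
            t.add('?')    # ... behind an unknown marker
            blocked += 1
    return blocked
```

Unlike distortion-based hiding, the "?" marker tells the data recipient that a value was suppressed rather than silently falsified.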


Write-righter: An Academic Writing Assistant System

AAAI Conferences

Writing academic articles in English is a challenging task for non-native speakers, who must spend extra effort polishing their language expressions. This paper presents an academic writing assistant system called Write-righter, which provides real-time hints and recommendations by analyzing the input context. To achieve this goal, several novel strategies are proposed, e.g., semantic-extension-based sentence retrieval and LDA-based sentence structure identification. Write-righter is expected to help people express their ideas correctly by recommending the top-N most likely expressions.
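As an illustration of the retrieval step, the following sketch ranks candidate sentences by cosine similarity to the current writing context, assuming sentence and context embeddings are already available. This is a generic retrieval baseline under those assumptions, not Write-righter's actual semantic-extension strategy.

```python
import numpy as np

def top_n_sentences(query_vec, sentence_vecs, n=3):
    """Rank candidate sentences by cosine similarity to the input context.

    query_vec:     (d,) embedding of the writer's current context
    sentence_vecs: (m, d) embeddings of corpus sentences
    Returns indices of the n most similar sentences, best first.
    """
    q = query_vec / np.linalg.norm(query_vec)
    S = sentence_vecs / np.linalg.norm(sentence_vecs, axis=1, keepdims=True)
    sims = S @ q                      # cosine similarity per sentence
    return np.argsort(-sims)[:n]
```

A real system would precompute and index `sentence_vecs` so that ranking stays fast enough for real-time hints.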


Coupled Dictionary Learning for Unsupervised Feature Selection

AAAI Conferences

With the rapid development of terminals and social networks, mountains of high-dimensional data explosively emerge and grow. The curse of dimensionality leads to a great storage burden, high time complexity, and failure of the classic learning machines (Wolf and Shashua 2005). Feature selection searches for the most representative and discriminative features by keeping the data properties and removing redundancy. According to the availability of label information, feature selection can be categorized into unsupervised (He, Cai, and Niyogi 2005), semi-supervised (Benabdeslem and Hindawi 2014), and supervised settings. Hence, manifold regularization is used in unsupervised feature selection algorithms to preserve sample similarity (Li et al. 2012; Tang and Liu 2012; Wang, Tang, and Liu 2015). Similar to the class labels in supervised cases, cluster structure indicates the affiliation relations of samples, and it can be discovered by spectral clustering (SPEC (Zhao and Liu 2007), MCFS (Cai, Zhang, and He 2010)), matrix factorization (NDFS (Li et al. 2012), RUFS (Qian and Zhai 2013), EUFS (Wang, Tang, and Liu 2015)), or linear predictors (UDFS (Yang et al. 2011), JELSR (Hou et al. 2011)).
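As a concrete instance of using sample similarity for unsupervised feature selection, the following sketch computes the Laplacian score, a standard baseline that ranks features by how well they preserve local structure on a kNN similarity graph. It illustrates the manifold-regularization idea in the passage above, not this paper's coupled dictionary method.

```python
import numpy as np

def laplacian_score(X, n_neighbors=3):
    """Laplacian score per feature (lower = better preserves local structure).

    X: (n_samples, n_features). Builds a binary kNN similarity graph, then
    scores each feature f as (f~^T L f~) / (f~^T D f~), where f~ is f
    centered under the degree weighting, L = D - W is the graph Laplacian,
    and D is the diagonal degree matrix.
    """
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(d2[i])[1:n_neighbors + 1]  # skip self at index 0
        W[i, idx] = 1.0
    W = np.maximum(W, W.T)        # symmetrize the kNN graph
    D = np.diag(W.sum(1))
    L = D - W
    ones = np.ones(n)
    scores = []
    for f in X.T:
        fh = f - (f @ D @ ones) / (ones @ D @ ones)
        denom = fh @ D @ fh
        scores.append((fh @ L @ fh) / denom if denom > 0 else np.inf)
    return np.array(scores)
```

A feature that is nearly constant within graph neighborhoods (i.e., respects the cluster structure) gets a low score and is kept; a noisy feature gets a high score.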


User-Centric Affective Computing of Image Emotion Perceptions

AAAI Conferences

We propose to predict the personalized emotion perceptions of images for each viewer. Different factors that may influence emotion perceptions, including visual content, social context, temporal evolution, and location influence, are jointly investigated via the presented rolling multi-task hypergraph learning. For evaluation, we set up a large-scale image emotion dataset from Flickr, named Image-Emotion-Social-Net, with over 1 million images and about 8,000 users. Experiments conducted on this dataset demonstrate the superiority of the proposed method compared to the state of the art.


A Representation Learning Framework for Multi-Source Transfer Parsing

AAAI Conferences

Cross-lingual model transfer has been a promising approach for inducing dependency parsers for low-resource languages where annotated treebanks are not available. The major obstacles for the model transfer approach are two-fold: (1) lexical features are not directly transferable across languages; (2) target language-specific syntactic structures are difficult to recover. To address these two challenges, we present a novel representation learning framework for multi-source transfer parsing. Our framework allows multi-source transfer parsing with full lexical features straightforwardly. Evaluating on the Google universal dependency treebanks (v2.0), our best models yield an absolute improvement of 6.53% in averaged labeled attachment score, compared with delexicalized multi-source transfer models. We also significantly outperform the most recently proposed state-of-the-art transfer system.


User Modeling with Neural Network for Review Rating Prediction

AAAI Conferences

We present a neural network method for review rating prediction in this paper. Existing neural network methods for sentiment prediction typically capture only the semantics of texts, but ignore the user who expresses the sentiment. This is not desirable for review rating prediction, as each user influences how the textual content of a review should be interpreted. For example, the same word (e.g., good) might indicate different sentiment strengths when written by different users. We address this issue by developing a new neural network that takes user information into account. The intuition is to factor in user-specific modification to the meaning of a certain word. Specifically, we extend lexical semantic composition models and introduce a user-word composition vector model (UWCVM), which effectively captures how a user acts as a function affecting the continuous word representation. We integrate UWCVM into a supervised learning framework for review rating prediction and conduct experiments on two benchmark review datasets. Experimental results demonstrate the effectiveness of our method, which shows superior performance over several strong baseline methods.
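The user-word composition idea, a user-specific transformation applied to a word's vector, can be sketched as follows. This toy version uses a full per-user matrix with a tanh nonlinearity; the paper's actual UWCVM parameterization may differ (e.g., low-rank user matrices), and all names are hypothetical.

```python
import numpy as np

def user_word_compose(w, U):
    """tanh(U @ w): the user matrix U reshapes the meaning of word vector w,
    so the same word yields different representations for different users."""
    return np.tanh(U @ w)

def review_representation(word_vecs, U):
    """Average the user-composed word vectors of a review; a rating
    predictor would then be trained on top of this vector."""
    return np.mean([user_word_compose(w, U) for w in word_vecs], axis=0)
```

With distinct user matrices, the word "good" written by a generous reviewer and by a harsh one maps to different points in the representation space, which is exactly the user-dependent sentiment strength the abstract describes.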


Modeling Mention, Context and Entity with Neural Networks for Entity Disambiguation

AAAI Conferences

Given a query consisting of a mention (name string) and a background document, entity disambiguation calls for linking the mention to an entity in a reference knowledge base such as Wikipedia. Existing studies typically use hand-crafted features to represent mention, context, and entity, which is labor-intensive and poorly suited to discovering explanatory factors of the data. In this paper, we address this problem by presenting a new neural network approach. The model takes into consideration the semantic representations of mention, context, and entity, encodes them in a continuous vector space, and effectively leverages them for entity disambiguation. Specifically, we model variable-sized contexts with a convolutional neural network, and embed the positions of context words to factor in the distance between a context word and the mention. Furthermore, we employ a neural tensor network to model the semantic interactions between context and mention. We conduct experiments on entity disambiguation with two benchmark datasets from TAC-KBP 2009 and 2010. Experimental results show that our method yields state-of-the-art performance on both datasets.
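The neural tensor interaction between context and mention can be sketched with a generic neural tensor network scoring function, s = u^T tanh(c^T W^{[1:k]} m + V [c; m] + b). The dimensions and parameterization here are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def ntn_score(c, m, W, V, b, u):
    """Neural tensor interaction between context vector c and mention vector m.

    c, m: (d,) vectors
    W:    (k, d, d) tensor slices for the bilinear terms c^T W^[i] m
    V:    (k, 2d) linear weights on the concatenation [c; m]
    b:    (k,) bias; u: (k,) output weights
    """
    bilinear = np.einsum('i,kij,j->k', c, W, m)   # one bilinear form per slice
    linear = V @ np.concatenate([c, m])
    return float(u @ np.tanh(bilinear + linear + b))
```

The k tensor slices let the model capture multiplicative context-mention interactions that a purely linear layer on [c; m] cannot express.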