Plotting

 Hefei University of Technology


Movie Question Answering: Remembering the Textual Cues for Layered Visual Contents

AAAI Conferences

Movies provide us with a mass of visual content as well as attracting stories. Existing methods have illustrated that understanding movie stories through only visual content is still a hard problem. In this paper, for answering questions about movies, we put forward a Layered Memory Network (LMN) that represents frame-level and clip-level movie content by the Static Word Memory module and the Dynamic Subtitle Memory module, respectively. Particularly, we firstly extract words and sentences from the training movie subtitles. Then the hierarchically formed movie representations, which are learned from LMN, not only encode the correspondence between words and visual content inside frames, but also encode the temporal alignment between sentences and frames inside movie clips. We also extend our LMN model into three variant frameworks to illustrate the good extendable capabilities. We conduct extensive experiments on the MovieQA dataset. With only visual content as inputs, LMN with frame-level representation obtains a large performance improvement. When incorporating subtitles into LMN to form the clip-level representation, we achieve the state-of-the-art performance on the online evaluation task of 'Video+Subtitles'. The good performance successfully demonstrates that the proposed framework of LMN is effective and the hierarchically formed movie representations have good potential for the applications of movie question answering.


Hierarchical LSTM for Sign Language Translation

AAAI Conferences

Continuous Sign Language Translation (SLT) is a challenging task due to its specific linguistics under sequential gesture variation without word alignment. Current hybrid HMM and CTC (Connectionist temporal classification) based models are proposed to solve frame or word level alignment. They may fail to tackle the cases with messing word order corresponding to visual content in sentences. To solve the issue, this paper proposes a hierarchical-LSTM (HLSTM) encoder-decoder model with visual content and word embedding for SLT. It tackles different granularities by conveying spatio-temporal transitions among frames, clips and viseme units. It firstly explores spatio-temporal cues of video clips by 3D CNN and packs appropriate visemes by online key clip mining with adaptive variable-length. After pooling on recurrent outputs of the top layer of HLSTM, a temporal attention-aware weighting mechanism is proposed to balance the intrinsic relationship among viseme source positions. At last, another two LSTM layers are used to separately recurse viseme vectors and translate semantic. After preserving original visual content by 3D CNN and the top layer of HLSTM, it shortens the encoding time step of the bottom two LSTM layers with less computational complexity while attaining more nonlinearity. Our proposed model exhibits promising performance on singer-independent test with seen sentences and also outperforms the comparison algorithms on unseen sentences.


Keyphrase Extraction with Sequential Pattern Mining

AAAI Conferences

Existing studies show that extracting a complete keyphrase candidate set is the first and crucial step to extract high quality keyphrases from documents. Based on a common sense that words do not repeatedly appear in an effective keyphrase, we propose a novel algorithm named KCSP for document-specific keyphrase candidate search using sequential pattern mining with gap constraints, which only needs to scan a document once and automatically specifies appropriate gap constraints for words without usersโ€™ participation. The experimental results confirm that it helps improve the quality of keyphrase extraction.


Semantic Connection Based Topic Evolution

AAAI Conferences

Contrary to previous studies on topic evolution that directly extract topics by topic modeling and preset the number of topics, we propose a method of topic evolution based on semantic connection for an adaptive number of topics and rapid responses to the changes of contents. Semantic connection not only indicates the content similarity between documents but also shows the time decay, so semantic connection features can be used to visualize topic evolution, which makes the analyses of changes much easier. Preliminary experimental results demonstrate that our method performs well compared to a state-of-the-art baseline on both qualities of topics and the sensitivity of changes.


Text Simplification Using Neural Machine Translation

AAAI Conferences

Text simplification (TS) is the technique of reducing the lexical, syntactical complexity of text. Existing automatic TS systems can simplify text only by lexical simplification or by manually defined rules. Neural Machine Translation (NMT) is a recently proposed approach for Machine Translation (MT) that is receiving a lot of research interest. In this paper, we regard original English and simplified English as two languages, and apply a NMT modelโ€“Recurrent Neural Network (RNN) encoder-decoder on TS to make the neural network to learn text simplification rules by itself. Then we discuss challenges and strategies about how to apply a NMT model to the task of text simplification.


Fortune Teller: Predicting Your Career Path

AAAI Conferences

People go to fortune tellers in hopes of learning things about their future. A future career path is one of the topics most frequently discussed. But rather than rely on "black arts" to make predictions, in this work we scientifically and systematically study the feasibility of career path prediction from social network data. In particular, we seamlessly fuse information from multiple social networks to comprehensively describe a user and characterize progressive properties of his or her career path. This is accomplished via a multi-source learning framework with fused lasso penalty, which jointly regularizes the source and career-stage relatedness. Extensive experiments on real-world data confirm the accuracy of our model.


Imbalanced Multiple Noisy Labeling for Supervised Learning

AAAI Conferences

When labeling objects via Internet-based outsourcing systems, the labelers may have bias, because they lack expertise, dedication and personal preference. These reasons cause Imbalanced Multiple Noisy Labeling. To deal with the imbalance labeling issue, we propose an agnostic algorithm PLAT (Positive LAbel frequency Threshold) which does not need any information about quality of labelers and underlying class distribution. Simulations on eight real-world datasets with different underlying class distributions demonstrate that PLAT not only effectively deals with the imbalanced multiple noisy labeling problem that off-the-shelf agnostic methods cannot cope with, but also performs nearly the same as majority voting under the circumstances that labelers have no bias.


Subchloroplast Location Prediction via Homolog Knowledge Transfer and Feature Selection

AAAI Conferences

The accuracy of subchloroplast location prediction algorithms often depends on predictive and succinct features derived from proteins. Thus, to improve the prediction accuracy, this paper proposes a novel SubChloroplast location prediction method, called SCHOTS, which integrates the HOmolog knowledge Transfer and feature Selection methods. SCHOTS contains two stages. First, discriminating features are generated by WS-LCHI, a Weighted Gene Ontology (GO) transfer model based on bit-Score of proteins and Logarithmic transformation of CHI-square. Second, the more informative GO terms are selected from the features. Extensive studies conducted on three real datasets demonstrate that SCHOTS outperforms three off-the-shelf subchloroplast prediction methods.


Learning from Concept Drifting Data Streams with Unlabeled Data

AAAI Conferences

Contrary to the previous beliefs that all arrived streaming data are labeled and the class labels are immediately availa- ble, we propose a Semi-supervised classification algorithm for data streams with concept drifts and UNlabeled data, called SUN. SUN is based on an evolved decision tree. In terms of deviation between history concept clusters and new ones generated by a developed clustering algorithm of k-Modes, concept drifts are distinguished from noise at leaves. Extensive studies on both synthetic and real data demonstrate that SUN performs well compared to several known online algorithms on unlabeled data. A conclusion is hence drawn that a feasible reference framework is provided for tackling concept drifting data streams with unlabeled data.