Tu, Wenting
Attention Based LSTM for Target Dependent Sentiment Classification
Yang, Min (The University of Hong Kong) | Tu, Wenting (The University of Hong Kong) | Wang, Jingxuan (The University of Hong Kong) | Xu, Fei (Chinese Academy of Sciences) | Chen, Xiaojun (Shenzhen University)
We present an attention-based bidirectional LSTM approach to improve target-dependent sentiment classification. Our method learns the alignment between the target entities and the most distinguishing features. We conduct extensive experiments on a real-life dataset. The experimental results show that our model achieves state-of-the-art results.
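As a rough illustration of the alignment idea, the sketch below computes attention weights over a sequence of hidden states with respect to a target vector. It is not the authors' model: the dot-product scoring function, the names `target_attention`, `hidden_states`, and `target_vector`, and the toy numbers are all assumptions made for the example, and in a real system the hidden states would come from a trained bidirectional LSTM.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def target_attention(hidden_states, target_vector):
    """Weight each hidden state by its similarity to the target.

    Assumption: similarity is a plain dot product; the abstract does
    not say which scoring function the paper uses.
    """
    scores = [
        sum(h_i * t_i for h_i, t_i in zip(h, target_vector))
        for h in hidden_states
    ]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Context vector: attention-weighted sum of the hidden states.
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, context


# Toy hidden states for a three-word sentence (hypothetical values).
hidden_states = [[0.1, 0.0], [0.9, 0.8], [0.2, 0.1]]
target_vector = [1.0, 1.0]
weights, context = target_attention(hidden_states, target_vector)
```

The word whose hidden state is most similar to the target receives the largest weight, so the context vector passed to a classifier is dominated by the features most relevant to that target.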
Time-Sensitive Opinion Mining for Prediction
Tu, Wenting (The University of Hong Kong) | Cheung, David (The University of Hong Kong) | Mamoulis, Nikos (The University of Hong Kong)
Users commonly use Web 2.0 platforms to post their opinions and their predictions about future events (e.g., the movement of a stock). Therefore, opinion mining can be used as a tool for predicting future events. Previous work on opinion mining extracts from the text only the polarity of opinions as sentiment indicators. We observe that a typical opinion post also contains temporal references which can improve prediction. This short paper presents our preliminary work on extracting reference time tags and integrating them into an opinion mining model, in order to improve the accuracy of future event prediction. We conduct an experimental evaluation using a collection of microblogs posted by investors to demonstrate the effectiveness of our approach.
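To make the notion of a reference time tag concrete, the sketch below pulls a coarse prediction horizon out of a post using a small phrase table. This is only an assumed, simplified stand-in: the abstract does not describe how the paper extracts time tags, and the phrase list, the day counts, and the function name `extract_time_tag` are hypothetical.

```python
import re

# Hypothetical phrase table mapping a temporal expression to a horizon in days.
TIME_PATTERNS = {
    "tomorrow": 1,
    "next week": 7,
    "next month": 30,
}


def extract_time_tag(post):
    """Return the horizon (in days) of the first known phrase found, else None."""
    text = post.lower()
    for phrase, days in TIME_PATTERNS.items():
        # Word boundaries avoid matching inside longer words.
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            return days
    return None


tag_with_reference = extract_time_tag("I think this stock will rise next week")
tag_without_reference = extract_time_tag("Great earnings report")
```

A prediction model could then pair each post's polarity with its horizon, so that an opinion about next week is not treated as evidence about tomorrow.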
Improving Microblog Retrieval from Exterior Corpus by Automatically Constructing Microblogging Corpus
Tu, Wenting (The University of Hong Kong) | Cheung, David (The University of Hong Kong) | Mamoulis, Nikos (The University of Hong Kong)
A large-scale training corpus consisting of microblogs belonging to a desired category is important for high-accuracy microblog retrieval. Obtaining such a large-scale microblogging corpus manually is very time-consuming and labor-intensive. Therefore, some models for the automatic retrieval of microblogs from an exterior corpus have been proposed. However, these approaches may fail to consider microblog-specific features. To alleviate this issue, we propose a methodology that constructs a simulated microblogging corpus rather than directly building a model from the exterior corpus. Our model performs better because the retrieval model ultimately uses the microblog-specific knowledge of the microblogging corpus. Experimental results on real-world microblogs demonstrate the superiority of our technique compared to the previous approaches.
Ordering-Sensitive and Semantic-Aware Topic Modeling
Yang, Min (The University of Hong Kong) | Cui, Tianyi (Zhejiang University) | Tu, Wenting (The University of Hong Kong)
Topic modeling of textual corpora is an important and challenging problem. Most previous work makes the “bag-of-words” assumption, which ignores the ordering of words. This assumption simplifies the computation, but it discards the ordering information and the semantics of words in context. In this paper, we present a Gaussian Mixture Neural Topic Model (GMNTM) which incorporates both the ordering of words and the semantic meaning of sentences into topic modeling. Specifically, we represent each topic as a cluster of multi-dimensional vectors and embed the corpus into a collection of vectors generated by the Gaussian mixture model. Each word is affected not only by its topic, but also by the embedding vector of its surrounding words and the context. The Gaussian mixture components and the topics of documents, sentences and words can be learnt jointly. Extensive experiments show that our model can learn better topics and more accurate word distributions for each topic. Quantitatively, compared to state-of-the-art topic modeling approaches, GMNTM obtains significantly better performance in terms of perplexity, retrieval accuracy and classification accuracy.
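The idea of a topic as a cluster of vectors can be illustrated with the posterior probability that an embedding vector belongs to each Gaussian component. The sketch below is not GMNTM itself: it assumes spherical Gaussians with fixed, hand-picked parameters, whereas the paper learns the mixture components jointly with the embeddings. The function names and toy values are hypothetical.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a spherical Gaussian with variance `var` per dimension."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return -0.5 * (d * math.log(2 * math.pi * var) + sq_dist / var)


def topic_responsibilities(x, means, variances, priors):
    """Posterior probability of each topic (mixture component) given vector x."""
    logs = [
        math.log(p) + log_gaussian(x, m, v)
        for m, v, p in zip(means, variances, priors)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    mx = max(logs)
    exps = [math.exp(value - mx) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


# Two toy topics in a 2-dimensional embedding space (hypothetical values).
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [1.0, 1.0]
priors = [0.5, 0.5]
resp = topic_responsibilities([0.2, -0.1], means, variances, priors)
```

A vector close to the first cluster centre is assigned almost entirely to the first topic, which is the soft assignment a mixture model provides for each word, sentence, or document vector.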