Collaborating Authors

 Li, Lin


The combination of context information to enhance simple question answering

arXiv.org Artificial Intelligence

Abstract: With the rapid development of knowledge bases, question answering over knowledge bases has become an active research topic. In this paper, we focus on answering single-relation factoid questions based on a knowledge base. We build a question answering system and study the effect of context information, such as an entity's notable type and out-degree, on fact selection. Experimental results show that context information can improve the results of simple question answering. Question answering (QA) is a classic natural language processing task, which aims at building systems that automatically answer questions formulated in natural language [1]. In recent years, several large-scale general-purpose knowledge bases (KBs) have been constructed, including Freebase [2], YAGO [3], DBpedia [4] and Wikidata [5].


Enhancing RNN Based OCR by Transductive Transfer Learning From Text to Images

AAAI Conferences

This paper presents a novel approach for optical character recognition (OCR) that accelerates training and avoids underfitting by transferring knowledge from text. Previously proposed OCR models typically take a long time to train and require a large amount of labelled data to avoid underfitting. In contrast, our method has no such requirement. The challenge lies in transferring the sequential relationships between characters from text to OCR. We build a model based on transductive transfer learning to achieve domain adaptation from text to images. We thoroughly evaluate our approach on different datasets, including a general one and a relatively small one. We also compare the performance of our model with a general OCR model under different circumstances. We show that (1) our approach reduces the time cost of the training phase by 20-30%; and (2) our approach can avoid underfitting when the model is trained on a small dataset.


Modeling Group Dynamics Using Probabilistic Tensor Decompositions

arXiv.org Machine Learning

We propose a probabilistic modeling framework for learning the dynamic patterns in the collective behaviors of social agents and developing profiles for different behavioral groups, using data collected from multiple information sources. The proposed model is based on a hierarchical Bayesian process, in which each observation is a finite mixture of a set of latent groups and the mixture proportions (i.e., group probabilities) are drawn randomly. Each group is associated with a set of distributions over a finite set of outcomes. Moreover, as time evolves, the structure of these groups also changes; we model the change in the group structure by a hidden Markov model (HMM) with a fixed transition probability. We present an efficient inference method based on tensor decompositions and the expectation-maximization (EM) algorithm for parameter estimation.
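As a rough illustration of two ingredients named in the abstract, the sketch below propagates a distribution over hidden states through a Markov chain with a fixed transition matrix and computes the marginal outcome distribution of a finite mixture of groups. It is only one simplified reading: the hidden state is treated directly as the group, and every name and number here (`step`, `mixture_outcome_probs`, `TRANSITION`, `GROUP_OUTCOMES`) is a hypothetical choice for illustration, not taken from the paper. It does not implement the tensor-decomposition or EM inference.

```python
# Illustrative sketch only: all names and values below are assumptions,
# not the model or parameters used in the paper.

def step(state_probs, transition):
    """Advance a hidden-state distribution one time step: p_{t+1} = p_t A."""
    n = len(transition)
    return [
        sum(state_probs[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]


def mixture_outcome_probs(group_probs, group_outcome_dists):
    """Marginal outcome distribution of a finite mixture of latent groups."""
    n_outcomes = len(group_outcome_dists[0])
    return [
        sum(w * dist[k] for w, dist in zip(group_probs, group_outcome_dists))
        for k in range(n_outcomes)
    ]


# Fixed (time-invariant) transition matrix between two hypothetical groups;
# each row sums to 1.
TRANSITION = [
    [0.9, 0.1],
    [0.2, 0.8],
]

# Each hypothetical group has its own distribution over three outcomes.
GROUP_OUTCOMES = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
]

if __name__ == "__main__":
    probs = [1.0, 0.0]  # start with all mass on the first group
    for t in range(3):
        print(t, probs, mixture_outcome_probs(probs, GROUP_OUTCOMES))
        probs = step(probs, TRANSITION)
```

Because the transition matrix is fixed, the group probabilities at any later time depend only on the starting distribution and the number of steps taken.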


Sensing Subjective Well-being from Social Media

arXiv.org Artificial Intelligence

Subjective well-being (SWB), which refers to how people experience the quality of their lives, is of great use to public policy-makers as well as to economic and sociological research. Traditionally, the measurement of SWB relies on time-consuming and costly self-report questionnaires. Nowadays, people are motivated to share their experiences and feelings on social media, so we propose to sense SWB from the vast amount of user-generated data on social media. Using social media data from 1785 users with SWB labels, we train machine learning models that are able to "sense" individual SWB from users' social media. Our model, which attains state-of-the-art prediction accuracy, can then be used to identify the SWB of a large population of social media users in a timely manner and at very low cost.


Identifying Domain-Dependent Influential Microblog Users: A Post-Feature Based Approach

AAAI Conferences

Users of a social network like to follow the posts published by influential users. Such posts are usually delivered quickly and thus have a strong influence on public opinion. In this paper, we focus on the problem of identifying domain-dependent influential users (or topic experts). Some traditional approaches identify influential users based on the content of users' posts, which may be biased by spammers who produce posts related to certain topics through simple copy and paste. Others make use of user authentication information given by a service platform, or of a user's self-description (introduction or label), to find influential users. However, what users have published is not necessarily related to what they have registered and described. In addition, if there are no comments from other users, it is harder to assess a user's post quality objectively. To improve the effectiveness of recognizing influential users in a microblog topic, we propose a post-feature based approach that is supplementary to post-content based approaches. Our experimental results show that the post-feature based approach produces relatively higher precision than the content based approach.


Recommending Related Microblogs: A Comparison Between Topic and WordNet based Approaches

AAAI Conferences

Computing similarity between short microblogs is an important step in microblog recommendation. In this paper, we investigate a topic based approach and a WordNet based approach to estimate similarity scores between microblogs and recommend the top related ones to users. An empirical study compares their recommendation effectiveness using two evaluation measures. The results show that the WordNet based approach has relatively higher precision than the topic based approach on a dataset of 548 tweets. In addition, the Kendall tau distance between the two lists recommended by the WordNet and topic approaches is calculated. Its average over all 548 pairs of lists indicates that the two approaches disagree relatively strongly in their ranking of related tweets.
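For readers unfamiliar with the Kendall tau distance mentioned above, the following minimal sketch counts the item pairs that two rankings place in opposite order. It assumes both lists rank exactly the same items with no ties, which may not match how the paper handles recommendation lists that differ in content; the function names are hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(ranking_a, ranking_b):
    """Count item pairs that the two rankings order differently.

    Assumes both rankings contain the same items, each exactly once.
    """
    if len(ranking_a) != len(ranking_b) or len(set(ranking_a)) != len(ranking_a):
        raise ValueError("rankings must have equal length and no duplicates")
    if set(ranking_a) != set(ranking_b):
        raise ValueError("rankings must contain the same items")

    # Position of every item in each ranking.
    pos_a = {item: i for i, item in enumerate(ranking_a)}
    pos_b = {item: i for i, item in enumerate(ranking_b)}

    discordant = 0
    for x, y in combinations(ranking_a, 2):
        # A pair is discordant when the two rankings disagree on its order.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0:
            discordant += 1
    return discordant


def normalized_kendall_tau_distance(ranking_a, ranking_b):
    """Scale the distance to [0, 1]: 0 is identical order, 1 is fully reversed."""
    n = len(ranking_a)
    if n < 2:
        return 0.0
    return kendall_tau_distance(ranking_a, ranking_b) / (n * (n - 1) / 2)
```

Averaging the normalized distance over many list pairs gives a single disagreement score in the spirit of the average reported in the abstract, though the exact normalization used there is not stated.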