Collaborating Authors

 West Virginia University


Utilizing Social Media to Combat Opioid Addiction Epidemic: Automatic Detection of Opioid Users from Twitter

AAAI Conferences

Opioid (e.g., heroin and morphine) addiction has become one of the largest and deadliest epidemics in the United States. To combat this deadly epidemic, in this paper we propose a novel framework named AutoOPU to automatically detect opioid users from Twitter, which will help sharpen our understanding of the behavioral process of opioid addiction and treatment. In AutoOPU, to model the users and posted tweets as well as their rich relationships, we first introduce a heterogeneous information network (HIN) for representation. Then we use a meta-structure-based approach to characterize the semantic relatedness over users. Afterwards, we integrate content-based similarity and the relatedness depicted by each meta-structure to formulate a similarity measure over users. Further, we aggregate the different similarities using multi-kernel learning, each of which is automatically weighted by the learning algorithm to make predictions. To the best of our knowledge, this is the first work to use multi-kernel learning based on meta-structures over HIN for biomedical knowledge mining, especially in the drug-addiction domain. Comprehensive experiments on real sample collections from Twitter are conducted to validate the effectiveness of our developed system AutoOPU in opioid user detection by comparison with alternative methods.
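The aggregation step described above can be illustrated with a minimal sketch. It is not the AutoOPU implementation: the function name `combine_kernels`, the toy matrices, and the fixed weights are all assumptions made for illustration, whereas the abstract states that the weights are learned automatically.

```python
import numpy as np

def combine_kernels(kernels, weights):
    """Return a convex combination of same-shaped kernel matrices.

    Hypothetical helper: the weights are supplied by hand here, while a
    multi-kernel learner would fit them from labeled data.
    """
    w = np.asarray(weights, dtype=float)
    if w.ndim != 1 or len(w) != len(kernels):
        raise ValueError("need exactly one weight per kernel")
    if np.any(w < 0) or w.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    w = w / w.sum()  # normalize so the weights sum to 1
    return sum(wi * np.asarray(K, dtype=float) for wi, K in zip(w, kernels))

# Toy data for three users (assumed, not from the paper).
# Content-based similarity: linear kernel over made-up content features.
content = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
K_content = content @ content.T

# Relatedness: users linked through shared tweets (user x tweet incidence).
user_tweet_links = np.array([[1.0, 1.0, 0.0],
                             [0.0, 1.0, 0.0],
                             [0.0, 0.0, 1.0]])
K_related = user_tweet_links @ user_tweet_links.T

# Weights 3:1 are arbitrary; they normalize to 0.75 and 0.25.
K = combine_kernels([K_content, K_related], weights=[3.0, 1.0])
```

Because a non-negative combination of positive semidefinite matrices is itself positive semidefinite, `K` can be passed to any kernel classifier that accepts a precomputed kernel.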


Instilling Social to Physical: Co-Regularized Heterogeneous Transfer Learning

AAAI Conferences

Ubiquitous computing tasks, such as human activity recognition (HAR), are enabling a wide spectrum of applications, ranging from healthcare to environment monitoring. The success of a ubiquitous computing task relies on sufficient physical sensor data with ground-truth labels, which are always scarce due to the expensive annotation process. Meanwhile, social media platforms provide a lot of social or semantic context information. People frequently share what they are doing and where they are in the messages they post. This rich set of socially shared activities motivates us to transfer knowledge from social media to address the sparsity of labelled physical sensor data. To transfer the knowledge of social and semantic context, we propose a Co-Regularized Heterogeneous Transfer Learning (CoHTL) model, which builds a common semantic space derived from two heterogeneous domains. Our proposed method outperforms state-of-the-art methods on two ubiquitous computing tasks, namely human activity recognition and region function discovery.


Text Classification with Heterogeneous Information Network Kernels

AAAI Conferences

Text classification is an important problem with many applications. Traditional approaches represent text as a bag-of-words and build classifiers based on this representation. Compared with words alone, entity phrases, the relations between the entities, and the types of the entities and relations carry much more information for representing texts. This paper presents a novel text-as-network classification framework, which introduces 1) a structured and typed heterogeneous information network (HIN) representation of texts, and 2) a meta-path-based approach to link texts. We show that with the new representation and links of texts, the structured and typed information of entities and relations can be incorporated into kernels. In particular, we develop both a simple linear kernel and an indefinite kernel based on meta-paths in the HIN representation of texts, which we call HIN-kernels. Using Freebase, a well-known world knowledge base, to construct HINs for texts, our experiments on two benchmark datasets show that the indefinite HIN-kernel based on weighted meta-paths outperforms the state-of-the-art methods and other HIN-kernels.
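As a rough illustration of how a meta-path can link texts, the sketch below counts instances of the simplest meta-path, Document-Entity-Document, and uses the counts as a linear kernel. This is an assumed toy construction, not the paper's HIN-kernels; the function name `metapath_count_kernel` and the incidence matrix are hypothetical.

```python
import numpy as np

def metapath_count_kernel(doc_entity):
    """Count Document-Entity-Document path instances between documents.

    doc_entity[i, j] is the number of times document i mentions entity j.
    The product A @ A.T gives, for each document pair, the number of
    two-hop paths through a shared entity, and it is a valid linear kernel.
    """
    A = np.asarray(doc_entity, dtype=float)
    return A @ A.T

# Toy incidence matrix (assumed): 3 documents, 3 entities.
doc_entity = np.array([[1, 1, 0],
                       [0, 1, 1],
                       [0, 0, 1]])
K_ded = metapath_count_kernel(doc_entity)
```

Here documents 0 and 1 share exactly one entity, so `K_ded[0, 1]` is 1, while documents 0 and 2 share none, so `K_ded[0, 2]` is 0. Longer meta-paths would chain further typed adjacency matrices in the same way.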


Tracking Idea Flows between Social Groups

AAAI Conferences

In many applications, ideas that are described by a set of words often flow between different groups. To help users analyze the flow, we present a method for modeling flow behavior that aims to identify the lead-lag relationships between word clusters of different user groups. In particular, an improved Bayesian conditional cointegration based on dynamic time warping is employed to learn links between words in different groups. A tensor-based technique is developed to cluster these linked words into different clusters (ideas) and track the flow of ideas. The main feature of the tensor representation is that we introduce two additional dimensions to represent both time and lead-lag relationships. Experiments on both synthetic and real datasets show that our method is more effective than methods based on traditional clustering techniques and achieves better accuracy. A case study was conducted to demonstrate the usefulness of our method in helping users understand the flow of ideas between different user groups on social media.
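The dynamic time warping component mentioned above can be sketched in its textbook form. This is plain DTW with an absolute-difference cost, not the improved Bayesian conditional cointegration the abstract describes; the function name `dtw_distance` and the sample series are assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D series.

    D[i, j] holds the minimal cumulative cost of aligning the first i
    points of a with the first j points of b.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: match, step in a only, step in b only.
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    return float(D[n, m])

# Hypothetical word-frequency series for the same word in two groups:
# the second series repeats one value, which DTW absorbs at no cost.
lead = [1, 2, 3]
lag = [1, 2, 2, 3]
d_same_shape = dtw_distance(lead, lag)
d_offset = dtw_distance([0, 0], [1, 1])
```

Because warping lets one point align with several, series that differ only in timing get a small distance, which is why DTW is a natural building block when one group's usage of a word trails another's.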