Social media provide a platform for users to express their opinions and share information. Understanding public health opinions on social media, such as Twitter, offers a unique approach to characterizing common health issues such as diabetes, diet, exercise, and obesity (DDEO), however, collecting and analyzing a large scale conversational public health data set is a challenging research task. The goal of this research is to analyze the characteristics of the general public's opinions in regard to diabetes, diet, exercise and obesity (DDEO) as expressed on Twitter. A multi-component semantic and linguistic framework was developed to collect Twitter data, discover topics of interest about DDEO, and analyze the topics. From the extracted 4.5 million tweets, 8% of tweets discussed diabetes, 23.7% diet, 16.6% exercise, and 51.7% obesity. The strongest correlation among the topics was determined between exercise and obesity. Other notable correlations were: diabetes and obesity, and diet and obesity DDEO terms were also identified as subtopics of each of the DDEO topics. The frequent subtopics discussed along with Diabetes, excluding the DDEO terms themselves, were blood pressure, heart attack, yoga, and Alzheimer. The non-DDEO subtopics for Diet included vegetarian, pregnancy, celebrities, weight loss, religious, and mental health, while subtopics for Exercise included computer games, brain, fitness, and daily plan. Non-DDEO subtopics for Obesity included Alzheimer, cancer, and children. With 2.67 billion social media users in 2016, publicly available data such as Twitter posts can be utilized to support clinical providers, public health experts, and social scientists in better understanding common public opinions in regard to diabetes, diet, exercise, and obesity.
Social media analytics allows us to extract, analyze, and establish semantic from user-generated contents in social media platforms. This study utilized a mixed method including a three-step process of data collection, topic modeling, and data annotation for recognizing exercise related patterns. Based on the findings, 86% of the detected topics were identified as meaningful topics after conducting the data annotation process. The most discussed exercise-related topics were physical activity (18.7%), lifestyle behaviors (6.6%), and dieting (4%). The results from our experiment indicate that the exploratory data analysis is a practical approach to summarizing the various characteristics of text data for different health and medical applications.
Analyzing user messages in social media can measure different population characteristics, including public health measures. For example, recent work has correlated Twitter messages with influenza rates in the United States; but this has largely been the extent of mining Twitter for public health. In this work, we consider a broader range of public health applications for Twitter. We apply the recently introduced Ailment Topic Aspect Model to over one and a half million health related tweets and discover mentions of over a dozen ailments, including allergies, obesity and insomnia. We introduce extensions to incorporate prior knowledge into this model and apply it to several tasks: tracking illnesses over times (syndromic surveillance), measuring behavioral risk factors, localizing illnesses by geographic region, and analyzing symptoms and medication usage. We show quantitative correlations with public health data and qualitative evaluations of model output. Our results suggest that Twitter has broad applicability for public health research.
A lack of information exists about the health issues of lesbian, gay, bisexual, transgender, and queer (LGBTQ) people who are often excluded from national demographic assessments, health studies, and clinical trials. As a result, medical experts and researchers lack a holistic understanding of the health disparities facing these populations. Fortunately, publicly available social media data such as Twitter data can be utilized to support the decisions of public health policy makers and managers with respect to LGBTQ people. This research employs a computational approach to collect tweets from gay users on health-related topics and model these topics. To determine the nature of health-related information shared by men who have sex with men on Twitter, we collected thousands of tweets from 177 active users. We sampled these tweets using a framework that can be applied to other LGBTQ sub-populations in future research. We found 11 diseases in 7 categories based on ICD 10 that are in line with the published studies and official reports.
Although there are millions of transgender people in the world, a lack of information exists about their health issues. This issue has consequences for the medical field, which only has a nascent understanding of how to identify and meet this population's health-related needs. Social media sites like Twitter provide new opportunities for transgender people to overcome these barriers by sharing their personal health experiences. Our research employs a computational framework to collect tweets from self-identified transgender users, detect those that are health-related, and identify their information needs. This framework is significant because it provides a macro-scale perspective on an issue that lacks investigation at national or demographic levels. Our findings identified 54 distinct health-related topics that we grouped into 7 broader categories. Further, we found both linguistic and topical differences in the health-related information shared by transgender men (TM) as com-pared to transgender women (TW). These findings can help inform medical and policy-based strategies for health interventions within transgender communities. Also, our proposed approach can inform the development of computational strategies to identify the health-related information needs of other marginalized populations.