Knowledge Graphs (KGs) have gained considerable attention recently from both academia and industry. In fact, incorporating graph technology and the copious of various graph datasets have led the research community to build sophisticated graph analytics tools. Therefore, the application of KGs has extended to tackle a plethora of real-life problems in dissimilar domains. Despite the abundance of the currently proliferated generic KGs, there is a vital need to construct domain-specific KGs. Further, quality and credibility should be assimilated in the process of constructing and augmenting KGs, particularly those propagated from mixed-quality resources such as social media data. This paper presents a novel credibility domain-based KG Embedding framework. This framework involves capturing a fusion of data obtained from heterogeneous resources into a formal KG representation depicted by a domain ontology. The proposed approach makes use of various knowledge-based repositories to enrich the semantics of the textual contents, thereby facilitating the interoperability of information. The proposed framework also embodies a credibility module to ensure data quality and trustworthiness. The constructed KG is then embedded in a low-dimension semantically-continuous space using several embedding techniques. The utility of the constructed KG and its embeddings is demonstrated and substantiated on link prediction, clustering, and visualisation tasks.
While social networks can provide an ideal platform for up-to-date information from individuals across the world, it has also proved to be a place where rumours fester and accidental or deliberate misinformation often emerges. In this article, we aim to support the task of making sense from social media data, and specifically, seek to build an autonomous message-classifier that filters relevant and trustworthy information from Twitter. For our work, we collected about 100 million public tweets, including users' past tweets, from which we identified 72 rumours (41 true, 31 false). We considered over 80 trustworthiness measures including the authors' profile and past behaviour, the social network connections (graphs), and the content of tweets themselves. We ran modern machine-learning classifiers over those measures to produce trustworthiness scores at various time windows from the outbreak of the rumour. Such time-windows were key as they allowed useful insight into the progression of the rumours. From our findings, we identified that our model was significantly more accurate than similar studies in the literature. We also identified critical attributes of the data that give rise to the trustworthiness scores assigned. Finally we developed a software demonstration that provides a visual user interface to allow the user to examine the analysis.
With the increasing use of online social networks as a source of news and information, the propensity for a rumor to disseminate widely and quickly poses a great concern, especially in disaster situations where users do not have enough time to fact-check posts before making the informed decision to react to a post that appears to be credible. In this study, we explore the propagation pattern of rumors on Twitter by exploring the dynamics of microscopic-level misinformation spread, based on the latent message and user interaction attributes. We perform supervised learning for feature selection and prediction. Experimental results with real-world data sets give the models' prediction accuracy at about 90\% for the diffusion of both True and False topics. Our findings confirm that rumor cascades run deeper and that rumor masked as news, and messages that incite fear, will diffuse faster than other messages. We show that the models for True and False message propagation differ significantly, both in the prediction parameters and in the message features that govern the diffusion. Finally, we show that the diffusion pattern is an important metric in identifying the credibility of a tweet.
Nowadays, Internet is a primary source of attaining health information. Massive fake health news which is spreading over the Internet, has become a severe threat to public health. Numerous studies and research works have been done in fake news detection domain, however, few of them are designed to cope with the challenges in health news. For instance, the development of explainable is required for fake health news detection. To mitigate these problems, we construct a comprehensive repository, FakeHealth, which includes news contents with rich features, news reviews with detailed explanations, social engagements and a user-user social network. Moreover, exploratory analyses are conducted to understand the characteristics of the datasets, analyze useful patterns and validate the quality of the datasets for health fake news detection. We also discuss the novel and potential future research directions for the health fake news detection.
As research on hate speech becomes more and more relevant every day, most of it is still focused on hate speech detection. By attempting to replicate a hate speech detection experiment performed on an existing Twitter corpus annotated for hate speech, we highlight some issues that arise from doing research in the field of hate speech, which is essentially still in its infancy. We take a critical look at the training corpus in order to understand its biases, while also using it to venture beyond hate speech detection and investigate whether it can be used to shed light on other facets of research, such as popularity of hate tweets.