Mohtarami, Mitra
Interpretable Propaganda Detection in News Articles
Yu, Seunghak, Martino, Giovanni Da San, Mohtarami, Mitra, Glass, James, Nakov, Preslav
Online users today are exposed to misleading and propagandistic news articles and media posts on a daily basis. To counter this, a number of approaches have been designed to promote healthier and safer online news and media consumption. Automatic systems are able to support humans in detecting such content; yet, a major impediment to their broad adoption is that, besides being accurate, the decisions of such systems also need to be interpretable in order to be trusted. Since misleading and propagandistic content influences readers through a number of deception techniques, we propose to detect and to show the use of such techniques as a way to offer interpretability. In particular, we define qualitatively descriptive features, and we analyze their suitability for detecting deception techniques. We further show that our interpretable features can be easily combined with pre-trained language models, yielding state-of-the-art results.
Automatic Fact-Checking Using Context and Discourse Information
Atanasova, Pepa, Nakov, Preslav, Màrquez, Lluís, Barrón-Cedeño, Alberto, Karadzhov, Georgi, Mihaylova, Tsvetomila, Mohtarami, Mitra, Glass, James
We study the problem of automatic fact-checking, paying special attention to the impact of contextual and discourse information. We address two related tasks: (i) detecting check-worthy claims, and (ii) fact-checking claims. We develop supervised systems based on neural networks, kernel-based support vector machines, and combinations thereof, which make use of rich input representations in terms of discourse cues and contextual features. For the check-worthiness estimation task, we focus on political debates, and we model the target claim in the context of the full intervention of a participant and the previous and the following turns in the debate, taking into account contextual meta information. For the fact-checking task, we focus on answer verification in a community forum, and we model the veracity of the answer with respect to the entire question-answer thread in which it occurs as well as with respect to other related posts from the entire forum. We develop annotated datasets for both tasks and we run extensive experimental evaluation, confirming that both types of information, but especially contextual features, play an important role.
FAKTA: An Automatic End-to-End Fact Checking System
Nadeem, Moin, Fang, Wei, Xu, Brian, Mohtarami, Mitra, Glass, James
With the rapid increase of fake news in social media and its negative influence on people and public opinion (Mihaylov et al., 2015; Mihaylov and Nakov, 2016; Vosoughi et al., 2018), various organizations are now performing manual fact checking on suspicious claims. Then, the stance detection component detects the stance/perspective of each relevant document with respect to the claim, typically modeled using labels such as agree, disagree and discuss. This component further provides rationales at the sentence level for explaining model predictions.
SemEval-2019 Task 8: Fact Checking in Community Question Answering Forums
Mihaylova, Tsvetomila, Karadjov, Georgi, Atanasova, Pepa, Baly, Ramy, Mohtarami, Mitra, Nakov, Preslav
We present SemEval-2019 Task 8 on Fact Checking in Community Question Answering Forums, which features two subtasks. Subtask A is about deciding whether a question asks for factual information vs. an opinion/advice vs. just socializing. Subtask B asks to predict whether an answer to a factual question is true, false or not a proper answer. We received 17 official submissions for Subtask A and 11 official submissions for Subtask B. For Subtask A, all systems improved over the majority class baseline. For Subtask B, all systems were below a majority class baseline, but several systems were very close to it. The leaderboard and the data from the competition can be found at http://competitions.codalab.org/competitions/20022
Team QCRI-MIT at SemEval-2019 Task 4: Propaganda Analysis Meets Hyperpartisan News Detection
Saleh, Abdelrhman, Baly, Ramy, Barrón-Cedeño, Alberto, Martino, Giovanni Da San, Mohtarami, Mitra, Nakov, Preslav, Glass, James
In this paper, we describe our submission to SemEval-2019 Task 4 on Hyperpartisan News Detection. Our system relies on a variety of engineered features originally used to detect propaganda. This is based on the assumption that biased messages are propagandistic in the sense that they promote a particular political cause or viewpoint. We trained a logistic regression model with features ranging from simple bag-of-words to vocabulary richness and text readability features. Our system achieved 72.9% accuracy on the manually annotated test data and 60.8% on the test data annotated via distant supervision. Additional experiments showed that significant performance improvements can be achieved with better feature pre-processing.
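The abstract mentions bag-of-words and vocabulary-richness features but gives no implementation details; the minimal sketch below (function names and the toy vocabulary are illustrative, not taken from the paper) shows how such features might be extracted before being fed to a linear classifier such as logistic regression:

```python
from collections import Counter

def bow_features(text, vocab):
    """Bag-of-words: raw term counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]

def vocabulary_richness(text):
    """Type-token ratio: distinct tokens divided by total tokens."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

# Toy example: a combined feature vector for one document.
vocab = ["the", "election", "scandal"]
doc = "The scandal around the election was the story"
features = bow_features(doc, vocab) + [vocabulary_richness(doc)]
```

In practice such sparse count features are usually normalized (e.g. tf-idf) before training, which is one kind of feature pre-processing the abstract alludes to.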
Adversarial Domain Adaptation for Stance Detection
Xu, Brian, Mohtarami, Mitra, Glass, James
This paper studies the problem of stance detection, which aims to predict the perspective (or stance) of a given document with respect to a given claim. Stance detection is a major component of automated fact checking. As annotating stances in different domains is a tedious and costly task, automatic methods based on machine learning are viable alternatives. In this paper, we focus on adversarial domain adaptation for stance detection, where we assume there exists sufficient labeled data in the source domain and limited labeled data in the target domain. Extensive experiments on publicly available datasets show the effectiveness of our domain adaptation model in transferring knowledge for accurate stance detection across domains.
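The abstract does not spell out the adversarial mechanism; a common realization of adversarial domain adaptation (the DANN-style gradient reversal layer, an assumption here rather than a detail from this paper) passes features through unchanged in the forward pass and flips the sign of the gradient in the backward pass, so the feature extractor learns representations that the domain classifier cannot distinguish:

```python
def grad_reverse_forward(features):
    """Forward pass: identity; the domain classifier sees features as-is."""
    return features

def grad_reverse_backward(grad, lam=1.0):
    """Backward pass: negate (and scale by lam) the gradient flowing back
    from the domain classifier, pushing the feature extractor to confuse it."""
    return [-lam * g for g in grad]
```

Training then optimizes the stance loss normally while the reversed domain-classification gradient encourages domain-invariant features, which is what makes transfer from a labeled source domain to a sparsely labeled target domain possible.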
Fact Checking in Community Forums
Mihaylova, Tsvetomila (Sofia University "St. Kliment Ohridski") | Nakov, Preslav (Qatar Computing Research Institute, HBKU) | Màrquez, Lluís (Qatar Computing Research Institute, HBKU) | Barrón-Cedeño, Alberto (Qatar Computing Research Institute, HBKU) | Mohtarami, Mitra (Massachusetts Institute of Technology) | Karadzhov, Georgi (Sofia University "St. Kliment Ohridski") | Glass, James (Massachusetts Institute of Technology)
Community Question Answering (cQA) forums are very popular nowadays, as they represent effective means for communities around particular topics to share information. Unfortunately, this information is not always factual. Thus, here we explore a new dimension in the context of cQA, which has been ignored so far: checking the veracity of answers to particular questions in cQA forums. As this is a new problem, we create a specialized dataset for it. We further propose a novel multi-faceted model, which captures information from the answer content (what is said and how), from the author profile (who says it), from the rest of the community forum (where it is said), and from external authoritative sources of information (external support). Evaluation results show a MAP value of 86.54, which is 21 points absolute above the baseline.
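The reported MAP of 86.54 is mean average precision over ranked answer lists; as a reference, this standard metric (the definition below is generic, not code from the paper) can be computed as:

```python
def average_precision(relevance):
    """AP for one ranked list of binary relevance judgments (1 = factual)."""
    hits, precisions = 0, []
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)  # precision at each relevant hit
    return sum(precisions) / hits if hits else 0.0

def mean_average_precision(ranked_lists):
    """MAP: mean of AP over all queries (here, all questions)."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)
```

A MAP of 86.54 thus means that, averaged over questions, correct answers are ranked near the top of each list.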
From Semantic to Emotional Space in Probabilistic Sense Sentiment Analysis
Mohtarami, Mitra (National University of Singapore) | Lan, Man (Institute for Infocomm Research) | Tan, Chew Lim (National University of Singapore)
This paper proposes an effective approach to model the emotional space of words to infer their Sense Sentiment Similarity (SSS). SSS reflects the distance between words with respect to their senses and underlying sentiments. We propose a probabilistic approach built on a hidden emotional model in which the basic human emotions are considered as hidden. This allows us to predict a vector of emotions for each sense of a word, and then to infer the sense sentiment similarity. The effectiveness of the proposed approach is investigated in two Natural Language Processing tasks: Indirect yes/no Question Answer Pairs Inference and Sentiment Orientation Prediction.
Sense Sentiment Similarity: An Analysis
Mohtarami, Mitra (National University of Singapore) | Amiri, Hadi (National University of Singapore) | Lan, Man (Institute for Infocomm Research) | Tran, Thanh Phu (National University of Singapore) | Tan, Chew Lim (National University of Singapore)
This paper describes an emotion-based approach to acquire the sentiment similarity of word pairs with respect to their senses. Sentiment similarity indicates the similarity between two words based on their underlying sentiments. Our approach is built on a model which maps from senses of words to vectors of twelve basic emotions. The emotional vectors are used to measure the sentiment similarity of word pairs. We show the utility of measuring sentiment similarity in two main natural language processing tasks, namely, indirect yes/no question answer pairs (IQAP) inference and sentiment orientation (SO) prediction. Extensive experiments demonstrate that our approach can effectively capture the sentiment similarity of word pairs and utilize this information to address the above-mentioned tasks.
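The abstract maps word senses to twelve-dimensional emotion vectors and compares them; one plausible similarity measure over such vectors (the three-dimensional toy vectors below are illustrative, not the paper's actual emotion model or its exact measure) is cosine similarity:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two emotion vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Toy 3-dim emotion vectors (joy, sadness, anger); the paper uses twelve.
glad = [0.9, 0.1, 0.0]
happy = [0.8, 0.2, 0.1]
furious = [0.0, 0.2, 0.9]
```

Under this measure, words whose senses load on the same emotions (glad/happy) score higher than words with opposing emotional profiles (glad/furious), which is the intuition behind sentiment similarity.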