Restricted Boltzmann Machines for Robust and Fast Latent Truth Discovery

arXiv.org Machine Learning

We address the problem of latent truth discovery, LTD for short, where the goal is to discover the underlying true values of entity attributes in the presence of noisy, conflicting or incomplete information. Despite the multitude of algorithms for the LTD problem that can be found in the literature, little is known about their overall performance with respect to effectiveness (in terms of truth discovery capabilities), efficiency and robustness. A practical LTD approach should satisfy all of these characteristics so that it can be applied to heterogeneous datasets of varying quality and degrees of cleanliness. We propose a novel algorithm for LTD that satisfies the above requirements. The proposed model is based on Restricted Boltzmann Machines and is thus coined LTD-RBM. In extensive experiments on various heterogeneous and publicly available datasets, LTD-RBM is superior to state-of-the-art LTD techniques in terms of an overall consideration of effectiveness, efficiency and robustness.
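
The abstract does not spell out the LTD setup itself; as a rough, hedged illustration of the problem structure (and not of the paper's RBM-based model), the following sketch builds a small set of conflicting source claims and resolves them with a generic iterative, reliability-weighted voting baseline. All source names, attributes and values are hypothetical.

```python
# Illustrative sketch only: a generic iterative, reliability-weighted voting
# baseline for latent truth discovery. This is NOT the LTD-RBM model from the
# paper; it merely shows the input/output structure of the LTD problem.
from collections import defaultdict

# Hypothetical observations: (source, entity_attribute, claimed_value)
claims = [
    ("src_A", "city:population", "8.4M"),
    ("src_B", "city:population", "8.4M"),
    ("src_C", "city:population", "7.9M"),   # conflicting claim
    ("src_A", "city:mayor", "Smith"),
    ("src_C", "city:mayor", "Smith"),
]

sources = {s for s, _, _ in claims}
reliability = {s: 1.0 for s in sources}          # start with uniform trust

for _ in range(10):                              # simple fixed-point iteration
    # 1) score each candidate value by the reliability of its supporters
    scores = defaultdict(float)
    for s, attr, val in claims:
        scores[(attr, val)] += reliability[s]
    truth = {}
    for (attr, val), sc in scores.items():
        if attr not in truth or sc > scores[(attr, truth[attr])]:
            truth[attr] = val
    # 2) re-estimate reliability as the fraction of a source's claims
    #    that agree with the current truth estimate
    agree, total = defaultdict(int), defaultdict(int)
    for s, attr, val in claims:
        total[s] += 1
        agree[s] += int(truth[attr] == val)
    reliability = {s: agree[s] / total[s] for s in sources}

print(truth)        # e.g. {'city:population': '8.4M', 'city:mayor': 'Smith'}
print(reliability)  # per-source trust scores in [0, 1]
```

The paper's contribution is to replace this kind of ad-hoc iteration with an RBM-based model evaluated for effectiveness, efficiency and robustness; the sketch above only fixes the data format in which such methods operate.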


Combining Restricted Boltzmann Machines with Neural Networks for Latent Truth Discovery

arXiv.org Artificial Intelligence

Latent truth discovery, LTD for short, refers to the problem of aggregating multiple claims from various sources in order to estimate the plausibility of statements about entities. In the absence of a ground truth, this problem is highly challenging when some sources provide conflicting claims and others no claims at all. In this work we provide an unsupervised stochastic inference procedure on top of a model that combines restricted Boltzmann machines with feed-forward neural networks to accurately infer the reliability of sources as well as the plausibility of statements about entities. In comparison to prior work, our approach stands out (1) by allowing the incorporation of arbitrary features about sources and claims, (2) by generalizing from a reliability per source towards a reliability function, and thus (3) enabling the estimation of source reliability even for sources that have provided no or very few claims, (4) by building on efficient and scalable stochastic inference algorithms, and (5) by outperforming the state of the art by a considerable margin.
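
The combined RBM plus feed-forward architecture and its stochastic inference procedure cannot be reconstructed from the abstract alone; as a minimal sketch of the RBM building block only, the following NumPy code trains a binary restricted Boltzmann machine with one step of contrastive divergence (CD-1). Layer sizes, learning rate and the toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Binary restricted Boltzmann machine trained with CD-1 (illustrative)."""
    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible biases
        self.b_h = np.zeros(n_hidden)    # hidden biases
        self.lr = lr

    def cd1(self, v0):
        """One contrastive-divergence update on a batch of visible vectors."""
        # positive phase: sample hidden units given the data
        p_h0 = sigmoid(v0 @ self.W + self.b_h)
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        # negative phase: one Gibbs step back to the visible layer and up again
        p_v1 = sigmoid(h0 @ self.W.T + self.b_v)
        p_h1 = sigmoid(p_v1 @ self.W + self.b_h)
        # approximate gradient of the log-likelihood
        n = len(v0)
        self.W   += self.lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / n
        self.b_v += self.lr * (v0 - p_v1).mean(axis=0)
        self.b_h += self.lr * (p_h0 - p_h1).mean(axis=0)

# Toy usage: binary claim vectors (hypothetical sizes and data)
data = (rng.random((64, 20)) < 0.3).astype(float)
rbm = RBM(n_visible=20, n_hidden=8)
for epoch in range(50):
    rbm.cd1(data)
```

The paper's model additionally feeds source and claim features through a feed-forward network into the reliability function; that coupling is not shown here.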


Using Social Network Information in Bayesian Truth Discovery

arXiv.org Machine Learning

We investigate the problem of truth discovery based on opinions from multiple agents who may be unreliable or biased. We consider the case where agents' reliabilities or biases are correlated if they belong to the same community, which defines a group of agents with similar opinions regarding a particular event. An agent can belong to different communities for different events, and these communities are unknown a priori. We incorporate knowledge of the agents' social network in our truth discovery framework and develop Laplace variational inference methods to estimate agents' reliabilities, communities, and the event states. We also develop a stochastic variational inference method to scale our model to large social networks. Simulations and experiments on real data suggest that when observations are sparse, our proposed methods perform better than several other inference methods, including majority voting, the popular Bayesian Classifier Combination (BCC) method, and the Community BCC method.
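
The community structure and Laplace/stochastic variational machinery are specific to the paper; as a hedged illustration of the underlying idea of jointly estimating event states and agent reliabilities from sparse opinions, here is a stripped-down loop for binary events with a Beta prior on reliability. All sizes, priors and the simulated data are assumptions, and this is a point-estimate stand-in, not the paper's variational method.

```python
# Illustrative sketch: basic Bayesian-flavored truth discovery for binary
# events, without the social-network/community modeling of the paper.
import numpy as np

rng = np.random.default_rng(1)

# opinions[a, e] in {0, 1, -1}: agent a's vote on event e (-1 = no opinion)
n_agents, n_events = 5, 8
true_state = rng.integers(0, 2, n_events)
true_reliab = np.array([0.9, 0.85, 0.6, 0.55, 0.5])      # hidden ground truth
opinions = np.full((n_agents, n_events), -1)
for a in range(n_agents):
    for e in range(n_events):
        if rng.random() < 0.7:                            # sparse observations
            correct = rng.random() < true_reliab[a]
            opinions[a, e] = true_state[e] if correct else 1 - true_state[e]

reliab = np.full(n_agents, 0.7)                           # initial guess
alpha0, beta0 = 2.0, 1.0                                  # Beta prior on reliability

for _ in range(20):
    # E-like step: posterior probability that each event is 1
    post = np.full(n_events, 0.5)
    for e in range(n_events):
        log_odds = 0.0
        for a in range(n_agents):
            if opinions[a, e] == 1:
                log_odds += np.log(reliab[a] / (1 - reliab[a]))
            elif opinions[a, e] == 0:
                log_odds -= np.log(reliab[a] / (1 - reliab[a]))
        post[e] = 1 / (1 + np.exp(-log_odds))
    # M-like step: Beta-posterior mean of each agent's reliability
    for a in range(n_agents):
        obs = opinions[a] >= 0
        agree = np.where(opinions[a][obs] == 1, post[obs], 1 - post[obs])
        reliab[a] = (alpha0 + agree.sum()) / (alpha0 + beta0 + obs.sum())

print(np.round(post))        # estimated event states
print(np.round(reliab, 2))   # estimated agent reliabilities
```

The paper goes further by letting reliabilities be correlated within latent communities inferred from the social network, which is what the variational methods are for.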


Theme-Relevant Truth Discovery on Twitter: An Estimation Theoretic Approach

AAAI Conferences

Twitter has emerged as a new application paradigm of sensing the physical environment by using humans as sensors. These human-sensed observations are often viewed as binary claims (either true or false). A fundamental challenge on Twitter is how to ascertain the credibility of claims and the reliability of sources without prior knowledge of either. This challenge is referred to as truth discovery. An important limitation exists in current Twitter-based truth discovery solutions: they do not consider the theme relevance of claims, so the correct claims they identify can be completely irrelevant to the theme of interest. In this paper, we present a new analytical model that explicitly considers the theme relevance of claims in the solution of the truth discovery problem on Twitter. The new model solves a bi-dimensional estimation problem to jointly estimate the correctness and theme relevance of claims as well as the reliability and theme awareness of sources. The new model is compared with the truth discovery solutions in the current literature using three real-world datasets collected from Twitter during recent disaster and emergency events: the Paris attacks, the Oregon shooting, and the Baltimore riots, all in 2015. The new model is shown to be effective at finding claims that are both correct and relevant.
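
The bi-dimensional (correctness and theme relevance) model is the paper's contribution and is not reproduced here; as a hedged sketch of the single-dimension estimation-theoretic style of EM that such Twitter truth discovery work typically builds on, the following code jointly estimates claim correctness and per-source reporting rates from a source-claim assertion matrix. The data and parameter names are illustrative assumptions.

```python
# Illustrative sketch of a single-dimension, estimation-theoretic EM for
# binary claims on social media (sources either assert a claim or stay silent).
# The paper's second, theme-relevance dimension is not modeled here.
import numpy as np

rng = np.random.default_rng(2)

# S[i, j] = 1 if source i asserted claim j, else 0 (hypothetical data)
n_sources, n_claims = 6, 12
S = (rng.random((n_sources, n_claims)) < 0.4).astype(float)

a = np.full(n_sources, 0.5)   # a_i = P(source i asserts claim | claim is true)
b = np.full(n_sources, 0.2)   # b_i = P(source i asserts claim | claim is false)
d = 0.5                       # prior probability that a claim is true

for _ in range(50):
    # E-step: posterior Z_j = P(claim j is true | S, a, b, d)
    log_true  = (S * np.log(a[:, None]) + (1 - S) * np.log(1 - a[:, None])).sum(0)
    log_false = (S * np.log(b[:, None]) + (1 - S) * np.log(1 - b[:, None])).sum(0)
    num = np.exp(log_true) * d
    Z = num / (num + np.exp(log_false) * (1 - d))
    # M-step: re-estimate per-source assert rates and the claim prior
    a = np.clip((S * Z).sum(1) / Z.sum(), 1e-3, 1 - 1e-3)
    b = np.clip((S * (1 - Z)).sum(1) / (1 - Z).sum(), 1e-3, 1 - 1e-3)
    d = np.clip(Z.mean(), 1e-3, 1 - 1e-3)

print(np.round(Z, 2))   # posterior correctness of each claim
print(np.round(a, 2))   # how readily each source reports true claims
```

The paper extends this kind of model with a second latent dimension per claim (theme relevance) and per source (theme awareness), estimated jointly with correctness and reliability.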


Even the best AI for spotting fake news is still terrible

MIT Technology Review

When Facebook chief executive Mark Zuckerberg promised Congress that AI would help solve the problem of fake news, he revealed little in the way of how. New research brings us one step closer to figuring that out. In an extensive study that will be presented at a conference later this month, researchers from MIT, Qatar Computing Research Institute (QCRI), and Sofia University in Bulgaria tested over 900 possible variables for predicting a media outlet's trustworthiness--probably the largest set ever proposed. The researchers then trained a machine-learning model on different combinations of the variables to see which would produce the most accurate results. The best model accurately labeled news outlets with "low," "medium," or "high" factuality just 65% of the time.
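
The article only summarizes the study's setup; as a loose, hypothetical illustration of what it describes (many outlet-level variables, a three-way low/medium/high factuality label, cross-validated accuracy), here is a small scikit-learn sketch on synthetic data. The feature counts, model choice and data below are assumptions and do not reproduce the MIT/QCRI work.

```python
# Hypothetical sketch of the reported setup: predict a news outlet's
# "low" / "medium" / "high" factuality label from outlet-level features.
# Features and labels are synthetic; this does not reproduce the study.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n_outlets, n_features = 300, 50          # the study reports ~900 candidate variables
X = rng.standard_normal((n_outlets, n_features))
y = rng.choice(["low", "medium", "high"], size=n_outlets)   # random labels here

clf = GradientBoostingClassifier()
acc = cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
print(f"cross-validated accuracy: {acc:.2f}")   # roughly chance on random labels
```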