Godbout, Jean-François
Epistemic Integrity in Large Language Models
Ghafouri, Bijean, Mohammadzadeh, Shahrad, Zhou, James, Nair, Pratheeksha, Tian, Jacob-Junqi, Goel, Mayank, Rabbany, Reihaneh, Godbout, Jean-François, Pelrine, Kellin
Large language models are increasingly relied upon as sources of information, but their propensity for generating false or misleading statements with high confidence poses risks for users and society. In this paper, we confront the critical problem of epistemic miscalibration, where a model's linguistic assertiveness fails to reflect its true internal certainty. We introduce a new human-labeled dataset and a novel method for measuring the linguistic assertiveness of Large Language Models (LLMs) that cuts error rates by over 50% relative to previous benchmarks. Validated across multiple datasets, our method reveals a stark misalignment between how confidently models linguistically present information and their actual accuracy. Further human evaluations confirm the severity of this miscalibration. This evidence underscores the urgent risk that the overstated certainty of LLMs may mislead users on a massive scale. Our framework provides a crucial step forward in diagnosing this miscalibration, offering a path toward correcting it and toward more trustworthy AI across domains.

Large Language Models (LLMs) have markedly transformed how humans seek and consume information, becoming integral across diverse fields such as public health (Ali et al., 2023), coding (Zambrano et al., 2023), and education (Whalen et al., 2023). Despite their growing influence, LLMs are not without shortcomings. One notable issue is their potential to generate responses that, while convincing, may be inaccurate or nonsensical, a long-standing phenomenon often referred to as "hallucinations" (Jo, 2023; Huang et al., 2023; Zhou et al., 2024b). This raises concerns about the reliability and trustworthiness of these models. A critical aspect of trustworthiness in LLMs is epistemic calibration: the alignment between a model's internal confidence in its outputs and the way it expresses that confidence through natural language. Misalignment between internal certainty and external expression can leave users misled by overconfident or underconfident statements, posing significant risks in high-stakes domains such as legal advice, medical diagnosis, and misinformation detection. While of great normative concern, how LLMs express linguistic uncertainty has received relatively little attention to date (Sileo & Moens, 2023; Belem et al., 2024).

Figures 1 and 5 illustrate the issue of epistemic calibration and how certainty operates in human interactions with LLMs. Distinct roles of certainty: internal certainty and linguistic assertiveness serve distinct functions within LLM interactions that shape individual beliefs. Human access to LLM certainty: linguistic assertiveness is the primary form of certainty available to users.
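As a rough illustration of the calibration gap this paper measures, the sketch below compares a model's internal confidence with a score of how assertively its answer is phrased. The keyword-based assertiveness_score here is a hypothetical placeholder for the learned scorer the paper introduces, and the example numbers are invented.

```python
# Minimal sketch of measuring an epistemic calibration gap, assuming a
# hypothetical assertiveness_score() in place of the paper's trained scorer.

HEDGES = {"might", "may", "possibly", "perhaps", "unsure", "likely"}
BOOSTERS = {"certainly", "definitely", "clearly", "undoubtedly", "always"}

def assertiveness_score(text: str) -> float:
    """Toy proxy for linguistic assertiveness in [0, 1]; the paper uses a
    learned scorer, so this keyword heuristic is only illustrative."""
    tokens = text.lower().split()
    hedge = sum(t in HEDGES for t in tokens)
    boost = sum(t in BOOSTERS for t in tokens)
    return max(0.0, min(1.0, 0.5 + 0.1 * (boost - hedge)))

def calibration_gap(answer_text: str, internal_confidence: float) -> float:
    """Signed gap between how assertively the model speaks and how confident
    it actually is (e.g., a verbalized or token-level probability)."""
    return assertiveness_score(answer_text) - internal_confidence

# Example: an emphatic answer backed by modest internal certainty is overconfident.
print(calibration_gap("This is definitely true.", internal_confidence=0.55))  # > 0
```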
A Guide to Misinformation Detection Datasets
Thibault, Camille, Peloquin-Skulski, Gabrielle, Tian, Jacob-Junqi, Laflamme, Florence, Guan, Yuxiang, Rabbany, Reihaneh, Godbout, Jean-François, Pelrine, Kellin
Misinformation is a complex societal issue, and mitigating solutions are difficult to create due to data deficiencies. To address this problem, we have curated the largest collection of (mis)information datasets in the literature, totaling 75. From these, we evaluate the quality of all 36 datasets that consist of statements or claims. We assess these datasets to identify those with solid foundations for empirical work and those with flaws that could result in misleading and non-generalizable results, such as insufficient label quality, spurious correlations, or political bias. We further provide state-of-the-art baselines on all these datasets, but show that regardless of label quality, categorical labels may no longer give an accurate evaluation of detection model performance. We discuss alternatives to mitigate this problem. Overall, this guide aims to provide a roadmap for obtaining higher-quality data and conducting more effective evaluations, ultimately improving research in misinformation detection. All datasets and other artifacts are available at https://misinfo-datasets.complexdatalab.com/.
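For readers who want to run a baseline-style evaluation on one of these claim datasets, a minimal sketch follows. The classify_claim function and the two example statements are illustrative stand-ins, not part of the released artifacts, and the categorical macro-F1 evaluation shown is exactly the kind the guide argues may no longer be sufficient on its own.

```python
# Minimal sketch of scoring a claim-level detector on a small labeled sample,
# assuming a hypothetical classify_claim() stand-in for any baseline model.
from sklearn.metrics import classification_report, f1_score

def classify_claim(statement: str) -> int:
    """Placeholder detector: 1 = misinformation, 0 = not. Swap in a real model."""
    return int("miracle cure" in statement.lower())

dataset = [
    ("Drinking bleach is a miracle cure for viruses.", 1),
    ("The city council meets on the first Monday of each month.", 0),
]
preds = [classify_claim(text) for text, _ in dataset]
labels = [y for _, y in dataset]

# Macro F1 weights each class equally, which matters for imbalanced claim datasets.
print(f1_score(labels, preds, average="macro"))
print(classification_report(labels, preds, zero_division=0))
```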
A Simulation System Towards Solving Societal-Scale Manipulation
Touzel, Maximilian Puelma, Sarangi, Sneheel, Welch, Austin, Krishnakumar, Gayatri, Zhao, Dan, Yang, Zachary, Yu, Hao, Kosak-Hine, Ethan, Gibbs, Tom, Musulan, Andreea, Thibault, Camille, Gurbuz, Busra Tugce, Rabbany, Reihaneh, Godbout, Jean-François, Pelrine, Kellin
The rise of AI-driven manipulation poses significant risks to societal trust and democratic processes. Yet studying these effects in real-world settings at scale is ethically and logistically impractical, highlighting the need for simulation tools that can model these dynamics in controlled settings and enable experimentation with possible defenses. We present a simulation environment designed to address this need. We extend the Concordia framework, which simulates offline, 'real life' activity, by adding online social media interactions via the integration of a Mastodon server. We improve simulation efficiency and information flow, and add a set of measurement tools, particularly longitudinal surveys. We demonstrate the simulator with a tailored example in which we track agents' political positions and show how partisan manipulation of agents can affect election results.
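A minimal sketch of the social-media side of such a setup, assuming the Mastodon.py client and a placeholder local instance; the simulator described here layers Concordia agents, efficiency improvements, and longitudinal surveys on top of this kind of read-then-post loop.

```python
# Minimal sketch of wiring a simulated agent to a Mastodon server, assuming the
# Mastodon.py client library and placeholder credentials for a local test instance.
from mastodon import Mastodon

client = Mastodon(
    access_token="AGENT_ACCESS_TOKEN",          # placeholder credential
    api_base_url="https://mastodon.localhost",  # placeholder local instance
)

def agent_step(persona: str, feed_limit: int = 5) -> None:
    """One simulation tick: read the public timeline, then post a reaction."""
    timeline = client.timeline_public(limit=feed_limit)
    # In the full system, an LLM conditioned on the persona and the feed writes the post.
    client.status_post(f"[{persona}] reacting to {len(timeline)} recent posts.")

agent_step("moderate voter")
```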
Combining Confidence Elicitation and Sample-based Methods for Uncertainty Quantification in Misinformation Mitigation
Rivera, Mauricio, Godbout, Jean-François, Rabbany, Reihaneh, Pelrine, Kellin
Large Language Models have emerged as prime candidates to tackle misinformation mitigation. However, existing approaches struggle with hallucinations and overconfident predictions. We propose an uncertainty quantification framework that leverages both direct confidence elicitation and sample-based consistency methods to provide better calibration for NLP misinformation mitigation solutions. We first investigate the calibration of sample-based consistency methods that exploit distinct features of consistency across sample sizes and stochasticity levels. Next, we evaluate the performance and distributional shift of a robust numeric verbalization prompt across single- versus two-step confidence elicitation procedures. We also compare the performance of the same prompt across different versions of GPT and different numerical scales. Finally, we combine the sample-based consistency and verbalized methods to propose a hybrid framework that yields better uncertainty estimates for GPT models. Overall, our work proposes novel uncertainty quantification methods that will improve the reliability of Large Language Models in misinformation mitigation applications.
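The combination of the two signal types can be sketched as follows. query_llm is a hypothetical wrapper around whichever GPT endpoint is used, and the simple averaging at the end is illustrative rather than the paper's exact combination rule.

```python
# Minimal sketch of a hybrid uncertainty estimate: direct confidence elicitation
# plus sample-based consistency, assuming a hypothetical query_llm() wrapper.
from collections import Counter
from statistics import mean

def query_llm(prompt: str, temperature: float = 1.0) -> str:
    raise NotImplementedError("plug in your model client here")

def verbalized_confidence(claim: str) -> float:
    """Direct elicitation: ask the model for a 0-100 confidence and rescale."""
    reply = query_llm(f"Rate your confidence from 0 to 100 that this claim is true: {claim}", 0.0)
    return float(reply.strip()) / 100.0

def consistency_confidence(claim: str, n_samples: int = 10) -> float:
    """Sample-based consistency: agreement rate of repeated stochastic verdicts."""
    votes = [query_llm(f"Answer True or False only: {claim}", 1.0).strip()
             for _ in range(n_samples)]
    return Counter(votes).most_common(1)[0][1] / n_samples

def hybrid_confidence(claim: str) -> float:
    """Average of the two signals; richer combinations are possible."""
    return mean([verbalized_confidence(claim), consistency_confidence(claim)])
```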
Uncertainty Resolution in Misinformation Detection
Orlovskiy, Yury, Thibault, Camille, Imouza, Anne, Godbout, Jean-François, Rabbany, Reihaneh, Pelrine, Kellin
Misinformation poses a variety of risks, such as undermining public trust and distorting factual discourse. Large Language Models (LLMs) like GPT-4 have been shown to be effective in mitigating misinformation, particularly when handling statements for which enough context is provided. However, they struggle to accurately assess ambiguous or context-deficient statements. This work introduces a new method to resolve uncertainty in such statements. We propose a framework to categorize missing information and publish category labels for the LIAR-New dataset, which is adaptable to cross-domain content with missing information. We then leverage this framework to generate effective user queries for missing context. Compared to baselines, our method improves the rate at which generated questions are answerable by the user by 38 percentage points, and classification performance by over 10 percentage points of macro F1. Thus, this approach may provide a valuable component for future misinformation mitigation pipelines.
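A minimal sketch of the query-generation idea: first categorize what context is missing, then ask the user one answerable question. query_llm is a hypothetical wrapper, and the category list is illustrative rather than the taxonomy published with LIAR-New.

```python
# Minimal sketch of categorize-then-query for context-deficient claims,
# assuming a hypothetical query_llm() wrapper and an illustrative category list.
MISSING_INFO_CATEGORIES = ["who", "when", "where", "source", "quantity"]

def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model client here")

def clarifying_question(statement: str) -> str:
    """Identify the missing context, then ask the user one answerable question."""
    category = query_llm(
        "Which single piece of context is missing from this claim? "
        f"Choose one of {MISSING_INFO_CATEGORIES}: {statement}"
    )
    return query_llm(
        f"The claim '{statement}' is missing '{category}' information. "
        "Write one short question a user could answer to supply it."
    )
```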
Towards Reliable Misinformation Mitigation: Generalization, Uncertainty, and GPT-4
Pelrine, Kellin, Imouza, Anne, Thibault, Camille, Reksoprodjo, Meilina, Gupta, Caleb, Christoph, Joel, Godbout, Jean-François, Rabbany, Reihaneh
Misinformation poses a critical societal challenge, and current approaches have yet to produce an effective solution. We propose focusing on generalization, uncertainty, and how to leverage recent large language models in order to create more practical tools to evaluate information veracity in contexts where perfect classification is impossible. We first demonstrate that GPT-4 can outperform prior methods in multiple settings and languages. Next, we explore generalization, revealing that GPT-4 and RoBERTa-large exhibit different failure modes. Third, we propose techniques to handle uncertainty that can detect impossible examples and strongly improve outcomes. We also discuss results on other language models, temperature, prompting, versioning, explainability, and web retrieval, each providing practical insights and directions for future research. Finally, we publish the LIAR-New dataset with novel paired English and French misinformation data and Possibility labels that indicate whether there is sufficient context for veracity evaluation. Overall, this research lays the groundwork for future tools that can drive real-world progress to combat misinformation.
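One of the uncertainty-handling ideas, abstaining on likely "impossible" (context-deficient) examples instead of forcing a verdict, can be sketched as follows; predict_label, estimate_confidence, and the 0.7 threshold are hypothetical stand-ins, not the paper's exact pipeline.

```python
# Minimal sketch of uncertainty-aware veracity evaluation with abstention,
# assuming hypothetical predict_label() and estimate_confidence() components.
def predict_label(claim: str) -> str:
    raise NotImplementedError("e.g., a GPT-4 true/false veracity prompt")

def estimate_confidence(claim: str) -> float:
    raise NotImplementedError("e.g., verbalized or consistency-based confidence")

def evaluate_claim(claim: str, threshold: float = 0.7) -> str:
    """Abstain on low-confidence examples (often context-deficient and
    effectively impossible to verify) rather than guessing."""
    if estimate_confidence(claim) < threshold:
        return "abstain: insufficient certainty or context"
    return predict_label(claim)
```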
Party Prediction for Twitter
Pelrine, Kellin, Imouza, Anne, Yang, Zachary, Tian, Jacob-Junqi, Lévy, Sacha, Desrosiers-Brisebois, Gabrielle, Feizi, Aarash, Amadoro, Cécile, Blais, André, Godbout, Jean-François, Rabbany, Reihaneh
A large number of studies on social media compare the behaviour of users from different political parties. As a basic step, they employ a predictive model for inferring users' political affiliation. The accuracy of this model can significantly change the conclusions of a downstream analysis, yet the choice between different models often appears to be made arbitrarily. In this paper, we provide a comprehensive survey and an empirical comparison of current party prediction practices and propose several new approaches that are competitive with or outperform state-of-the-art methods while requiring fewer computational resources. Party prediction models rely on the content generated by users (e.g., tweet texts), the relations they have (e.g., who they follow), or their activities and interactions (e.g., which tweets they like). We examine all of these and compare their signal strength for the party prediction task. Our results let practitioners select from a wide range of data types that all give strong performance. Finally, we conduct extensive experiments on different aspects of these methods, such as data collection speed and transfer capabilities, which can provide further insights for both applied and methodological research.
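As an example of the content-based family of approaches, the sketch below trains a simple TF-IDF plus logistic regression classifier on tweet text. The tweets and party labels are invented, and this pipeline is only an illustration of the signal type, not the paper's strongest method.

```python
# Minimal sketch of a content-based party predictor using tweet text;
# the training examples and party labels below are invented placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tweets = [
    "Proud to support our party's new climate plan.",
    "Lower taxes and smaller government, that's the way forward.",
]
parties = ["A", "B"]  # placeholder party labels

# Word and bigram TF-IDF features feeding a linear classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(tweets, parties)
print(model.predict(["We need bold climate action now."]))
```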