Russo, Giuseppe
Self-Recognition in Language Models
Davidson, Tim R., Surkov, Viacheslav, Veselovsky, Veniamin, Russo, Giuseppe, West, Robert, Gulcehre, Caglar
A rapidly growing number of applications rely on a small set of closed-source language models (LMs). This dependency might introduce novel security risks if LMs develop self-recognition capabilities. Inspired by human identity verification methods, we propose a novel approach for assessing self-recognition in LMs using model-generated "security questions". Our test can be externally administered to keep track of frontier models as it does not require access to internal model parameters or output probabilities. We use our test to examine self-recognition in ten of the most capable open- and closed-source LMs currently publicly available. Our extensive experiments found no empirical evidence of general or consistent self-recognition in any examined LM. Instead, our results suggest that given a set of alternatives, LMs seek to pick the "best" answer, regardless of its origin. Moreover, we find indications that preferences about which models produce the best answers are consistent across LMs. We additionally uncover novel insights on position bias considerations for LMs in multiple-choice settings.
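The abstract does not spell out the test protocol, so the following is only a minimal sketch of how a multiple-choice self-recognition trial could be scored. It assumes that each model has already answered every security question, and that a hypothetical callable `choose` stands in for the judge LM picking one option. The function name, data layout, and the toy `pick_longest` judge are illustrative assumptions, not the authors' implementation.

```python
import random

def self_recognition_accuracy(questions, answers_by_model, judge, choose, seed=0):
    """Fraction of trials in which `judge` picks its own answer.

    answers_by_model: {model_name: {question: answer}} (assumed layout)
    choose: hypothetical callable (question, options) -> index of the chosen
            option; in practice this would wrap a call to the judge LM.
    """
    rng = random.Random(seed)
    hits = 0
    for q in questions:
        options = [(m, answers_by_model[m][q]) for m in answers_by_model]
        rng.shuffle(options)  # randomize option order to average out position bias
        picked = choose(q, [text for _, text in options])
        hits += options[picked][0] == judge
    return hits / len(questions)

# Toy data: two hypothetical models, two questions.
questions = ["q1", "q2"]
answers = {
    "A": {"q1": "short", "q2": "a much longer answer"},
    "B": {"q1": "a longer answer", "q2": "tiny"},
}

# Stand-in judge that ignores origin and always picks the longest option.
def pick_longest(question, options):
    return max(range(len(options)), key=lambda i: len(options[i]))

acc = self_recognition_accuracy(questions, answers, "A", pick_longest)
print(acc)
```

With two options per question, chance level is 0.5. A judge that picks by answer quality rather than origin, as the toy judge does here, would be expected to score near chance only when its own answers are the preferred ones about half the time.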
ACTI at EVALITA 2023: Overview of the Conspiracy Theory Identification Task
Russo, Giuseppe, Stoehr, Niklas, Ribeiro, Manoel Horta
Automatic Conspiracy Theory Identification (ACTI) is a new shared task proposed for the first time at the EVALITA 2023 evaluation campaign. ACTI is based on a new, manually labeled dataset of comments scraped from conspiratorial Telegram channels and consists of two subtasks: (1) identifying conspiratorial content (conspiratorial content classification); and (2) classifying content into specific conspiracy theories (conspiratorial category classification). A total of 15 teams participated in the task with 81 submissions. In this task summary, we discuss the data and task, and outline the best-performing approaches, which are largely based on large language models. We conclude with a brief discussion of the application of large language models to counter the spread of misinformation on online platforms.
Understanding Online Migration Decisions Following the Banning of Radical Communities
Russo, Giuseppe, Ribeiro, Manoel Horta, Casiraghi, Giona, Verginer, Luca
The proliferation of radical online communities and their violent offshoots has sparked great societal concern. However, the current practice of banning such communities from mainstream platforms has unintended consequences: (i) the further radicalization of their members on the fringe platforms to which they migrate; and (ii) the spillover of harmful content from fringe back onto mainstream platforms. Here, in a large observational study on two banned subreddits, r/The_Donald and r/fatpeoplehate, we examine how factors associated with the RECRO radicalization framework relate to users' migration decisions. Specifically, we quantify how these factors affect users' decisions to post on fringe platforms and, for those who do, whether they continue posting on the mainstream platform. Our results show that individual-level factors, those relating to the behavior of users, are associated with the decision to post on the fringe platform, whereas social-level factors, users' connection with the radical community, only affect the propensity to be coactive on both platforms. Overall, our findings pave the way for evidence-based moderation policies, as the decisions to migrate and remain coactive amplify unintended consequences of community bans.
A Framework to Induce Self-Regulation Through a Metacognitive Tutor
Cannella, Vincenzo (University of Palermo) | Pipitone, Arianna (University of Palermo) | Russo, Giuseppe (University of Palermo) | Pirrone, Roberto (University of Palermo)
A new architectural framework for a metacognitive tutoring system is presented that aims to stimulate self-regulatory behavior in the learner. The new framework extends the cognitive architecture of TutorJ, which has already been proposed by some of the authors. TutorJ relies mainly on dialogic interaction with the user, and makes use of a statistical dialogue planner implemented through a Partially Observable Markov Decision Process (POMDP). A suitable two-level structure has been designed for the statistical reasoner to cope with measuring and stimulating metacognitive skills in the user. Suitable actions have been designed for this purpose, starting from the analysis of the main questionnaires proposed in the literature. Our reasoner has been designed to model the relation between each item in a questionnaire and the related metacognitive skill, so that the proper action can be selected by the tutoring agent. The complete framework is detailed, the reasoner structure is discussed, and a simple application scenario is presented.
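The abstract gives no details of the POMDP itself. As a rough illustration of how such a reasoner can relate a questionnaire item to a metacognitive skill, the sketch below applies the standard POMDP belief update to a hypothetical two-state model (the learner either has or lacks a skill). The state names, the `ask_item` action, and all probabilities are assumptions made for this example and are not taken from TutorJ.

```python
def belief_update(belief, transition, observation_prob, action, observation):
    """Standard POMDP belief update.

    b'(s') is proportional to O(o | s', a) * sum_s T(s' | s, a) * b(s).
    """
    states = list(belief)
    new_belief = {}
    for s_next in states:
        # Prediction step: probability of reaching s_next under the action.
        predicted = sum(transition[action][s][s_next] * belief[s] for s in states)
        # Correction step: weight by the likelihood of the observation.
        new_belief[s_next] = observation_prob[action][s_next][observation] * predicted
    total = sum(new_belief.values())
    if total == 0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in new_belief.items()}

# Hypothetical model: asking a questionnaire item does not change the skill.
transition = {
    "ask_item": {
        "skilled": {"skilled": 1.0, "unskilled": 0.0},
        "unskilled": {"skilled": 0.0, "unskilled": 1.0},
    }
}
# Hypothetical likelihood of a positive or negative answer given each state.
observation_prob = {
    "ask_item": {
        "skilled": {"positive": 0.8, "negative": 0.2},
        "unskilled": {"positive": 0.3, "negative": 0.7},
    }
}

prior = {"skilled": 0.5, "unskilled": 0.5}
updated = belief_update(prior, transition, observation_prob, "ask_item", "positive")
print(updated)
```

After a positive answer, the belief that the learner has the skill rises from 0.5 to 0.4 / 0.55, roughly 0.73. A tutoring agent could use such a belief to decide which action to select next.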
Acquisition Of New Knowledge In TutorJ
Russo, Giuseppe (University of Palermo DINFO) | Pirrone, Roberto | Pipitone, Arianna
This paper presents a methodology to acquire new knowledge in TutorJ using external information sources. TutorJ is an intelligent tutoring system (ITS) whose architecture is inspired by the HIPM cognitive model, while meta-cognition principles have been used to design the knowledge acquisition process. The system is intended to increase its own knowledge as a consequence of its interaction with users. The implemented methodology uses external links and services to capture new knowledge from contents related to discussion topics and transforms these contents into structured knowledge that is stored inside an ontology. The purpose of the proposed methodology is to lower the effort of creating the system's scaffolding and to increase the level of interaction with users. The focus is on self-regulated learners, while meta-cognitive strategies have to be defined to adapt and to increase the effectiveness of tutoring actions.