Collaborating Authors

 Ciampaglia, Giovanni Luca


Factuality Challenges in the Era of Large Language Models

arXiv.org Artificial Intelligence

The emergence of tools based on Large Language Models (LLMs), such as OpenAI's ChatGPT, Microsoft's Bing Chat, and Google's Bard, has garnered immense public attention. These incredibly useful, natural-sounding tools mark significant advances in natural language generation, yet they exhibit a propensity to generate false, erroneous, or misleading content -- commonly referred to as "hallucinations." Moreover, LLMs can be exploited for malicious applications, such as generating false but credible-sounding content and profiles at scale. This poses a significant challenge to society in terms of the potential deception of users and the increasing dissemination of inaccurate information. In light of these risks, we explore the kinds of technological innovations, regulatory reforms, and AI literacy initiatives needed from fact-checkers, news organizations, and the broader research and policy communities. By identifying the risks, the imminent threats, and some viable solutions, we seek to shed light on navigating various aspects of veracity in the era of generative AI.


HONEM: Network Embedding Using Higher-Order Patterns in Sequential Data

arXiv.org Machine Learning

Representation learning offers a powerful alternative to the oft painstaking process of manual feature engineering, and as a result, has enjoyed considerable success in recent years. This success is especially striking in the context of graph mining, since networks can take advantage of vast troves of sequential data to encode information about interactions between entities of interest. But how do we learn embeddings on networks that have higher-order and sequential dependencies? Existing network embedding methods naively assume the Markovian property (first-order dependency) for node interactions, which may not capture the time-dependent and longer-range underlying complex interactions of the raw data. To address the limitation of current methods, we propose a network embedding method for higher-order networks (HON). We demonstrate that the higher-order network embedding (HONEM) method is able to extract higher-order dependencies from HON to construct the higher-order neighborhood matrix of the network, while existing methods are not able to capture these higher-order dependencies. We show that our method outperforms other state-of-the-art methods in node classification, network reconstruction, link prediction, and visualization.
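To illustrate what the abstract means by first-order versus higher-order dependency, here is a minimal sketch that estimates next-step probabilities from raw sequences under different context lengths. It is not the HONEM method or the HON construction. The function name `transition_probs` and the toy sequences are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def transition_probs(sequences, order=1):
    """Estimate P(next item | previous `order` items) from raw sequences.

    Hypothetical helper for illustration; not part of HONEM.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(order, len(seq)):
            context = tuple(seq[i - order:i])  # the last `order` items seen
            counts[context][seq[i]] += 1
    return {
        ctx: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
        for ctx, nxts in counts.items()
    }


# Toy data (assumed): where a path goes after "c" depends on where it came from.
sequences = [["a", "c", "d"], ["b", "c", "e"]]

first = transition_probs(sequences, order=1)   # Markovian view
second = transition_probs(sequences, order=2)  # one extra step of history

print(first[("c",)])        # "d" and "e" look equally likely after "c"
print(second[("a", "c")])   # after a -> c, only "d" follows
print(second[("b", "c")])   # after b -> c, only "e" follows
```

In this toy example, the first-order estimate merges the two paths through "c", while the second-order estimate keeps them apart. This is the kind of information a purely Markovian assumption can discard.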


Research Challenges of Digital Misinformation: Toward a Trustworthy Web

AI Magazine

The deluge of online and offline misinformation is overloading the exchange of ideas upon which democracies depend. Fake news, conspiracy theories, and deceptive social bots proliferate, facilitating the manipulation of public opinion. Countering misinformation while protecting freedom of speech will require collaboration across industry, journalism, and academia. The Workshop on Digital Misinformation — held in May 2017 in conjunction with the International Conference on Web and Social Media in Montréal, Québec, Canada — was intended to foster these efforts. The meeting brought together more than 100 stakeholders from academia, media, and tech companies to discuss the research challenges implicit in building a trustworthy Web. Below we outline the main findings from the discussion.


Empirical Analysis of User Participation in Online Communities: the Case of Wikipedia

AAAI Conferences

We study the distribution of the activity period of users in five of the largest localized versions of the free, online encyclopedia Wikipedia. We find it to be consistent with a mixture of two truncated log-normal distributions. Using this model, the temporal evolution of these systems can be analyzed, showing that the statistical description is consistent over time.
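As a rough sketch of the model family named in the abstract above, the density of a mixture of two truncated log-normal distributions can be written with the standard library alone. The parameter values, the truncation bounds, and the function names below are assumptions chosen for illustration. They are not the fitted values or the estimation procedure from the paper.

```python
import math


def lognormal_pdf(x, mu, sigma):
    """Density of an (untruncated) log-normal distribution at x > 0."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def lognormal_cdf(x, mu, sigma):
    """Cumulative distribution function of a log-normal at x > 0."""
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def truncated_lognormal_pdf(x, mu, sigma, lower, upper):
    """Log-normal density renormalized to the interval [lower, upper]."""
    if x < lower or x > upper:
        return 0.0
    mass = lognormal_cdf(upper, mu, sigma) - lognormal_cdf(lower, mu, sigma)
    return lognormal_pdf(x, mu, sigma) / mass


def mixture_pdf(x, weight, comp1, comp2, lower, upper):
    """Two-component mixture; `weight` is the share of the first component."""
    p1 = truncated_lognormal_pdf(x, comp1[0], comp1[1], lower, upper)
    p2 = truncated_lognormal_pdf(x, comp2[0], comp2[1], lower, upper)
    return weight * p1 + (1.0 - weight) * p2


# Hypothetical parameters: activity periods in days, truncated to [1, 1000].
LOWER, UPPER = 1.0, 1000.0
SHORT_LIVED = (0.5, 1.0)   # (mu, sigma) of the first component, assumed
LONG_LIVED = (5.0, 1.0)    # (mu, sigma) of the second component, assumed
WEIGHT = 0.6               # assumed share of the first component

for days in (2.0, 30.0, 365.0):
    density = mixture_pdf(days, WEIGHT, SHORT_LIVED, LONG_LIVED, LOWER, UPPER)
    print(f"density at {days:6.1f} days: {density:.6f}")
```

Truncation matters because activity periods can only be observed within a finite window. Each component is renormalized so that the mixture still integrates to one over the truncation interval.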