AITopics | Neves, Leonardo

Neves, Leonardo

Enhancing Item Tokenization for Generative Recommendation through Self-Improvement

Chen, Runjin, Ju, Mingxuan, Bui, Ngoc, Antypas, Dimosthenis, Cai, Stanley, Wu, Xiaopeng, Neves, Leonardo, Wang, Zhangyang, Shah, Neil, Zhao, Tong

arXiv.org Artificial Intelligence, Dec-22-2024

Generative recommendation systems, driven by large language models (LLMs), present an innovative approach to predicting user preferences by modeling items as token sequences and generating recommendations in a generative manner. A critical challenge in this approach is the effective tokenization of items, ensuring that they are represented in a form compatible with LLMs. Current item tokenization methods include using text descriptions, numerical strings, or sequences of discrete tokens. While text-based representations integrate seamlessly with LLM tokenization, they are often too lengthy, leading to inefficiencies and complicating accurate generation. Numerical strings, while concise, lack semantic depth and fail to capture meaningful item relationships. Tokenizing items as sequences of newly defined tokens has gained traction, but it often requires external models or algorithms for token assignment. These external processes may not align with the LLM's internal pretrained tokenization schema, leading to inconsistencies and reduced model performance. To address these limitations, we propose a self-improving item tokenization method that allows the LLM to refine its own item tokenizations during the training process. Our approach starts with item tokenizations generated by any external model and periodically adjusts these tokenizations based on the LLM's learned patterns. This alignment process ensures consistency between the tokenization and the LLM's internal understanding of the items, leading to more accurate recommendations. Furthermore, our method is simple to implement and can be integrated as a plug-and-play enhancement into existing generative recommendation systems. Experimental results on multiple datasets and using various initial tokenization strategies demonstrate the effectiveness of our method, with an average improvement of 8% in recommendation performance.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2412.17171

Country: North America > United States > Texas (0.28)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster

Calabrese, Agostina, Neves, Leonardo, Shah, Neil, Bos, Maarten W., Ross, Björn, Lapata, Mirella, Barbieri, Francesco

arXiv.org Artificial Intelligence, Jun-6-2024

Content moderators play a key role in keeping the conversation on social media healthy. While the high volume of content they need to judge represents a bottleneck in the moderation pipeline, no studies have explored how models could help them make faster decisions. There is, by now, a vast body of research into detecting hate speech, sometimes explicitly motivated by a desire to help improve content moderation, but published research using real content moderators is scarce. In this work we investigate the effect of explanations on the speed of real-world moderators. Our experiments show that while generic explanations do not affect their speed and are often ignored, structured explanations lower moderators' decision-making time by 7.4%.

artificial intelligence, explanation, natural language, (17 more...)

arXiv.org Artificial Intelligence

2406.04106

Country:

North America > United States (0.14)
North America > Canada (0.14)
Europe > United Kingdom (0.14)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology > Services (0.46)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

USE: Dynamic User Modeling with Stateful Sequence Models

Zhou, Zhihan, Fang, Qixiang, Neves, Leonardo, Barbieri, Francesco, Liu, Yozen, Liu, Han, Bos, Maarten W., Dotsch, Ron

arXiv.org Artificial Intelligence, Mar-20-2024

User embeddings play a crucial role in user engagement forecasting and personalized services. Recent advances in sequence modeling have sparked interest in learning user embeddings from behavioral data. Yet behavior-based user embedding learning faces the unique challenge of dynamic user modeling. As users continuously interact with the apps, user embeddings should be periodically updated to account for users' recent and long-term behavior patterns. Existing methods rely heavily on stateless sequence models that lack memory of historical behavior. They have to either discard historical data and use only the most recent data or reprocess the old and new data jointly. Both cases incur substantial computational overhead. To address this limitation, we introduce User Stateful Embedding (USE). USE generates user embeddings and reflects users' evolving behaviors without the need for exhaustive reprocessing by storing previous model states and revisiting them in the future. Furthermore, we introduce a novel training objective named future W-behavior prediction to transcend the limitations of next-token prediction by forecasting a broader horizon of upcoming user behaviors. By combining it with the Same User Prediction, a contrastive learning-based objective that predicts whether different segments of behavior sequences belong to the same user, we further improve the embeddings' distinctiveness and representativeness. We conducted experiments on 8 downstream tasks using Snapchat users' behavioral logs in both static (i.e., fixed user behavior sequences) and dynamic (i.e., periodically updated user behavior sequences) settings. We demonstrate USE's superior performance over established baselines. The results underscore USE's effectiveness and efficiency in integrating historical and recent user behavior sequences into user embeddings in dynamic user modeling.
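The idea of storing a model state and resuming from it, rather than reprocessing old and new behavior jointly, can be illustrated with a toy recurrent update. This is a minimal sketch, not the USE architecture: the exponential-decay update, the `encode` function, and the feature vectors are all hypothetical stand-ins chosen only to show that a saved state lets new events be folded in without revisiting history.

```python
from typing import List, Optional, Sequence


def encode(
    events: Sequence[Sequence[float]],
    state: Optional[Sequence[float]] = None,
    decay: float = 0.9,
) -> List[float]:
    """Fold behavior events into a running state vector.

    Hypothetical stand-in for a stateful sequence model: pass a previously
    saved `state` to resume instead of re-encoding the full history.
    """
    if state is not None:
        hidden = list(state)
    elif events:
        hidden = [0.0] * len(events[0])
    else:
        return []
    for event in events:
        # Simple decayed running average; a real model would use a learned update.
        hidden = [decay * h + (1.0 - decay) * x for h, x in zip(hidden, event)]
    return hidden


# Assumed toy data: each event is a 2-dimensional feature vector.
history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
recent = [[0.5, 0.5], [0.0, 2.0]]

saved = encode(history)                 # computed once and stored
resumed = encode(recent, state=saved)   # stateful update: only new events are processed
full = encode(history + recent)         # stateless alternative: reprocess everything
```

In this toy setting `resumed` and `full` hold the same values, while the stateful path touches only the two new events.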

behavior sequence, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2403.13344

Country:

North America > United States (0.14)
Europe > Italy (0.14)
Asia > China (0.14)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
(4 more...)

Context-aware Adversarial Attack on Named Entity Recognition

Chen, Shuguang, Neves, Leonardo, Solorio, Thamar

arXiv.org Artificial Intelligence, Feb-2-2024

In recent years, large pre-trained language models (PLMs) have achieved remarkable performance on many natural language processing benchmarks. Despite their success, prior studies have shown that PLMs are vulnerable to attacks from adversarial examples. In this work, we focus on the named entity recognition task and study context-aware adversarial attack methods to examine the model's robustness. Specifically, we propose perturbing the most informative words for recognizing entities to create adversarial examples and investigate different candidate replacement methods to generate natural and plausible adversarial examples. Experiments and analyses show that our methods are more effective in deceiving the model into making wrong predictions than strong baselines.

artificial intelligence, computational linguistic, natural language, (16 more...)

arXiv.org Artificial Intelligence

2309.08999

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (0.65)
Government > Military (0.65)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Tweet Insights: A Visualization Platform to Extract Temporal Insights from Twitter

Loureiro, Daniel, Rezaee, Kiamehr, Riahi, Talayeh, Barbieri, Francesco, Neves, Leonardo, Anke, Luis Espinosa, Camacho-Collados, Jose

arXiv.org Artificial Intelligence, Aug-4-2023

This paper introduces a large collection of time series data derived from Twitter, postprocessed using word embedding techniques, as well as specialized fine-tuned language models. This data comprises the past five years and captures changes in n-gram frequency, similarity, sentiment and topic distribution. The interface built on top of this data enables temporal analysis for detecting and characterizing shifts in meaning, including complementary information to trending metrics, such as sentiment and topic association over time. We release an online demo for easy experimentation, and we share code and the underlying aggregated data for future work. In this paper, we also discuss three case studies unlocked thanks to our platform, showcasing its potential for temporal linguistic analysis.

machine learning, natural language, tweet, (17 more...)

arXiv.org Artificial Intelligence

2308.02142

Country:

Europe > Middle East > Malta (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)

Genre: Research Report (0.50)

Industry:

Media (1.00)
Information Technology > Services (0.69)
Government > Regional Government (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.47)

SemEval 2023 Task 9: Multilingual Tweet Intimacy Analysis

Pei, Jiaxin, Silva, Vítor, Bos, Maarten, Liu, Yozon, Neves, Leonardo, Jurgens, David, Barbieri, Francesco

arXiv.org Artificial Intelligence, Feb-3-2023

… R model trained over the Twitter dataset (XLM-T) performs the best on 7 languages. While the pre-trained language models are able to achieve promising performance, zero-shot prediction of unseen languages remains challenging, especially for Korean and Hindi. Intimacy has long been viewed as a primary dimension of human relationships and interpersonal interactions (Maslow, 1981; Sullivan, 2013; Prager, 1995). Existing studies suggest that intimacy is an …

machine learning, natural language, tweet, (16 more...)

arXiv.org Artificial Intelligence

2210.01108

Country: North America > United States > Michigan (0.29)

Genre: Research Report (0.84)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

TimeLMs: Diachronic Language Models from Twitter

Loureiro, Daniel, Barbieri, Francesco, Neves, Leonardo, Anke, Luis Espinosa, Camacho-Collados, Jose

arXiv.org Artificial Intelligence, Feb-8-2022

Despite its importance, the time variable has been largely neglected in the NLP and language model literature. In this paper, we present TimeLMs, a set of language models specialized on diachronic Twitter data. We show that a continual learning strategy contributes to enhancing Twitter-based language models' capacity to deal with future and out-of-distribution tweets, while making them competitive with standardized and more monolithic benchmarks. We also perform a number of qualitative analyses showing how they cope with trends and peaks in activity involving specific named entities or concept drift.

artificial intelligence, information technology services, natural language, (18 more...)

arXiv.org Artificial Intelligence

2202.03829

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Services (0.48)
Health & Medicine (0.48)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.34)

Efficiently Mitigating Classification Bias via Transfer Learning

Jin, Xisen, Barbieri, Francesco, Davani, Aida Mostafazadeh, Kennedy, Brendan, Neves, Leonardo, Ren, Xiang

arXiv.org Machine Learning, Oct-24-2020

Prediction bias in machine learning models refers to unintended model behaviors that discriminate against inputs mentioning or produced by certain groups; for example, hate speech classifiers predict more false positives for neutral text mentioning specific social groups. Mitigating bias for each task or domain is inefficient, as it requires repetitive model training, data annotation (e.g., demographic information), and evaluation. In pursuit of a more accessible solution, we propose the Upstream Bias Mitigation for Downstream Fine-Tuning (UBM) framework, which mitigates one or multiple bias factors in downstream classifiers by transfer learning from an upstream model. In the upstream bias mitigation stage, explanation regularization and adversarial training are applied to mitigate multiple bias factors. In the downstream fine-tuning stage, the classifier layer of the model is re-initialized, and the entire model is fine-tuned to downstream tasks in potentially novel domains without any further bias mitigation. We expect downstream classifiers to be less biased by transfer learning from de-biased upstream models. We conduct extensive experiments varying the similarity between the source and target data, as well as varying the number of dimensions of bias (e.g., discrimination against specific social groups or dialects). Our results indicate the proposed UBM framework can effectively reduce bias in downstream classifiers.

artificial intelligence, bias mitigation, upstream oil & gas, (19 more...)

arXiv.org Machine Learning

2010.12864

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.66)

Industry: Energy > Oil & Gas (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.83)

Data Augmentation for Graph Neural Networks

Zhao, Tong, Liu, Yozen, Neves, Leonardo, Woodford, Oliver, Jiang, Meng, Shah, Neil

arXiv.org Machine Learning, Jun-11-2020

Data augmentation has been widely used to improve generalizability of machine learning models. However, comparatively little work studies data augmentation for graphs. This is largely due to the complex, non-Euclidean structure of graphs, which limits possible manipulation operations. Augmentation operations commonly used in vision and language have no analogs for graphs. Our work studies graph data augmentation for graph neural networks (GNNs) in the context of improving semi-supervised node-classification. We discuss practical and theoretical motivations, considerations and strategies for graph data augmentation. Our work shows that neural edge predictors can effectively encode class-homophilic structure to promote intra-class edges and demote inter-class edges in given graph structure, and our main contribution introduces the GAug graph data augmentation framework, which leverages these insights to improve performance in GNN-based node classification via edge prediction. Extensive experiments on multiple benchmarks show that augmentation via GAug improves performance across GNN architectures and datasets.
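The promote/demote idea in the abstract above can be sketched with a toy edge predictor. This is an illustrative assumption, not the GAug implementation: the inner-product-plus-sigmoid scorer, the `augment_edges` helper, the fixed add/remove counts, and the hand-written embeddings are all hypothetical. Here the most probable missing edges are added and the least probable existing edges are dropped.

```python
import math
from itertools import combinations
from typing import List, Sequence, Set, Tuple

Edge = Tuple[int, int]


def edge_probability(u: Sequence[float], v: Sequence[float]) -> float:
    """Hypothetical edge predictor: sigmoid of the embedding inner product."""
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 / (1.0 + math.exp(-dot))


def augment_edges(
    embeddings: List[Sequence[float]],
    edges: Sequence[Edge],
    num_add: int,
    num_remove: int,
) -> Set[Edge]:
    """Add the likeliest missing edges and drop the least likely existing ones."""
    existing = {(min(a, b), max(a, b)) for a, b in edges}
    scores = {
        (i, j): edge_probability(embeddings[i], embeddings[j])
        for i, j in combinations(range(len(embeddings)), 2)
    }
    # Rank non-edges from most to least probable, existing edges from least to most.
    missing = sorted((p for p in scores if p not in existing), key=scores.get, reverse=True)
    present = sorted(existing, key=scores.get)
    return (existing - set(present[:num_remove])) | set(missing[:num_add])


# Assumed toy graph: nodes 0 and 1 share one class, nodes 2 and 3 the other.
node_embeddings = [[1.0, 0.0], [1.0, 0.1], [-1.0, 0.0], [-1.0, 0.1]]
original_edges = [(0, 2), (2, 3)]  # (0, 2) is an inter-class edge
augmented = augment_edges(node_embeddings, original_edges, num_add=1, num_remove=1)
```

With these toy embeddings the inter-class edge (0, 2) is removed and the intra-class edge (0, 1) is added. A GNN would then be trained on the modified edge set.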

graph, health & medicine, neural network, (16 more...)

arXiv.org Machine Learning

2006.0683

Country: North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.67)
Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
