Ponzetto, Simone Paolo
Can Demographic Factors Improve Text Classification? Revisiting Demographic Adaptation in the Age of Transformers
Hung, Chia-Chien, Lauscher, Anne, Hovy, Dirk, Ponzetto, Simone Paolo, Glavaš, Goran
Demographic factors (e.g., gender or age) shape our language. Previous work showed that incorporating demographic factors can consistently improve performance for various NLP tasks with traditional NLP models. In this work, we investigate whether these previous findings still hold with state-of-the-art pretrained Transformer-based language models (PLMs). We use three common specialization methods proven effective for incorporating external knowledge into pretrained Transformers (e.g., domain-specific or geographic knowledge). We adapt the language representations for the demographic dimensions of gender and age, using continuous language modeling and dynamic multi-task learning for adaptation, where we couple language modeling objectives with the prediction of demographic classes. Our results, when employing a multilingual PLM, show substantial gains in task performance across four languages (English, German, French, and Danish), which is consistent with the results of previous work. However, controlling for confounding factors (primarily the domain and language proficiency of Transformer-based PLMs) shows that downstream performance gains from our demographic adaptation do not actually stem from demographic knowledge. Our results indicate that demographic specialization of PLMs, while holding promise for positive societal impact, still represents an unsolved problem for (modern) NLP.
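The dynamic multi-task objective described in this abstract lends itself to a short sketch. The following minimal PyTorch example couples a masked language modeling (MLM) loss with a demographic-class prediction loss over a shared encoder; the model name, the mean pooling, the two-class setup, and the mixing weight `alpha` are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of dynamic multi-task demographic adaptation:
# a shared Transformer trained with an MLM head plus a demographic head.
import torch
import torch.nn as nn
from transformers import AutoModelForMaskedLM

class DemographicAdapter(nn.Module):
    def __init__(self, model_name="bert-base-multilingual-cased", num_classes=2):
        super().__init__()
        self.mlm = AutoModelForMaskedLM.from_pretrained(model_name)
        hidden = self.mlm.config.hidden_size
        self.demo_head = nn.Linear(hidden, num_classes)  # hypothetical head
        self.ce = nn.CrossEntropyLoss()

    def forward(self, input_ids, attention_mask, mlm_labels, demo_labels, alpha=0.5):
        out = self.mlm(input_ids=input_ids, attention_mask=attention_mask,
                       labels=mlm_labels, output_hidden_states=True)
        # Mean-pool the final hidden layer as the sequence representation.
        hidden = out.hidden_states[-1]
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (hidden * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
        demo_loss = self.ce(self.demo_head(pooled), demo_labels)
        # Couple the MLM loss with the demographic prediction loss via a
        # (possibly scheduled) mixing weight -- the "dynamic" part.
        return alpha * out.loss + (1 - alpha) * demo_loss
```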
On the Limitations of Sociodemographic Adaptation with Transformers
Hung, Chia-Chien, Lauscher, Anne, Hovy, Dirk, Ponzetto, Simone Paolo, Glavaš, Goran
Sociodemographic factors (e.g., gender or age) shape our language. Previous work showed that incorporating specific sociodemographic factors can consistently improve performance for various NLP tasks with traditional NLP models. We investigate whether these previous findings still hold with state-of-the-art pretrained Transformers. We use three common specialization methods proven effective for incorporating external knowledge into pretrained Transformers (e.g., domain-specific or geographic knowledge). We adapt the language representations for the sociodemographic dimensions of gender and age, using continuous language modeling and dynamic multi-task learning for adaptation, where we couple language modeling with the prediction of a sociodemographic class. Our results, when employing a multilingual model, show substantial performance gains across four languages (English, German, French, and Danish). These findings are in line with the results of previous work and hold promise for successful sociodemographic specialization. However, controlling for confounding factors like domain and language shows that, while sociodemographic adaptation does improve downstream performance, the gains do not always solely stem from sociodemographic knowledge. Our results indicate that sociodemographic specialization, while very important, is still an unresolved problem in NLP.
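For the simplest of the specialization methods mentioned above, continued language modeling on text written by a single sociodemographic group, a hedged sketch with the Hugging Face `transformers` Trainer might look as follows; the corpus file name and all hyperparameters are placeholders, not the authors' settings.

```python
# Continued MLM pretraining on text from one sociodemographic group
# (e.g., one age bracket); a sketch, not the authors' training setup.
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")

# Hypothetical corpus: one plain-text file per demographic group.
corpus = load_dataset("text", data_files={"train": "age_group_35plus.txt"})
tokenized = corpus["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mlm-age35plus", num_train_epochs=1,
                           per_device_train_batch_size=16),
    data_collator=DataCollatorForLanguageModeling(tokenizer,
                                                  mlm_probability=0.15),
    train_dataset=tokenized,
)
trainer.train()
```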
Large-scale Taxonomy Induction Using Entity and Word Embeddings
Ristoski, Petar, Faralli, Stefano, Ponzetto, Simone Paolo, Paulheim, Heiko
Taxonomies are an important ingredient of knowledge organization, and serve as a backbone for more sophisticated knowledge representations in intelligent systems, such as formal ontologies. However, building taxonomies manually is a costly endeavor, and hence, automatic methods for taxonomy induction are a good alternative for building large-scale taxonomies. In this paper, we propose TIEmb, an approach for automatic unsupervised class subsumption axiom extraction from knowledge bases using entity and text embeddings. We apply the approach to the WebIsA database, a database of subsumption relations extracted from a large portion of the World Wide Web, to extract class hierarchies in the Person and Place domains.
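As a rough illustration of embedding-based subsumption extraction (not the published TIEmb algorithm), the sketch below represents each candidate class by the centroid of its instances' entity vectors and proposes a class as a subclass of a nearby, more populous class; the distance threshold and data layout are invented.

```python
# Toy subsumption induction from entity embeddings.
import numpy as np

def induce_subsumptions(class_instances, entity_vecs, max_dist=0.5):
    """class_instances: dict class -> list of entity ids.
    entity_vecs: dict entity id -> np.ndarray embedding."""
    centroids = {c: np.mean([entity_vecs[e] for e in ents], axis=0)
                 for c, ents in class_instances.items()}
    axioms = []
    for sub, sub_ents in class_instances.items():
        best, best_d = None, max_dist
        for sup, sup_ents in class_instances.items():
            # Heuristic: a superclass should be the more populous class.
            if sup == sub or len(sup_ents) <= len(sub_ents):
                continue
            d = np.linalg.norm(centroids[sub] - centroids[sup])
            if d < best_d:
                best, best_d = sup, d
        if best is not None:
            axioms.append((sub, "subClassOf", best))
    return axioms
```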
A General Framework for Implicit and Explicit Debiasing of Distributional Word Vector Spaces
Lauscher, Anne, Glavaš, Goran, Ponzetto, Simone Paolo, Vulić, Ivan
Distributional word vectors have recently been shown to encode many human biases, most notably gender and racial biases, and models for attenuating such biases have consequently been proposed. However, existing models and studies (1) operate on under-specified and mutually differing bias definitions, (2) are tailored for a particular bias (e.g., gender bias) and (3) have been evaluated inconsistently and non-rigorously. In this work, we introduce a general framework for debiasing word embeddings. We operationalize the definition of a bias by discerning two types of bias specification: explicit and implicit. We then propose three debiasing models that operate on explicit or implicit bias specifications, and that can be composed towards more robust debiasing. Finally, we devise a full-fledged evaluation framework in which we couple existing bias metrics with newly proposed ones. Experimental findings across three embedding methods suggest that the proposed debiasing models are robust and widely applicable: they often completely remove the bias both implicitly and explicitly, without degradation of semantic information encoded in any of the input distributional spaces. Moreover, we successfully transfer debiasing models, by means of cross-lingual embedding spaces, and remove or attenuate biases in distributional word vector spaces of languages that lack readily available bias specifications.
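One family of debiasing models such a framework covers, projection removal from an explicit bias specification, can be sketched as follows; the equality pairs are illustrative, and the paper proposes and composes several models and evaluates them far more rigorously than this toy example suggests.

```python
# Projection-based debiasing from an explicit bias specification.
import numpy as np

def debias(vectors, vocab, equality_pairs):
    """vectors: (V, d) embedding matrix; vocab: word -> row index."""
    # Bias direction: first principal component of the pair differences.
    diffs = np.stack([vectors[vocab[a]] - vectors[vocab[b]]
                      for a, b in equality_pairs])
    _, _, vt = np.linalg.svd(diffs - diffs.mean(axis=0), full_matrices=False)
    direction = vt[0]  # unit-length right singular vector
    # Remove each vector's component along the bias direction.
    return vectors - np.outer(vectors @ direction, direction)

# Example explicit specification for gender (word lists are illustrative):
# debiased = debias(vecs, vocab, [("he", "she"), ("man", "woman")])
```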
Unsupervised Sense-Aware Hypernymy Extraction
Ustalov, Dmitry, Panchenko, Alexander, Biemann, Chris, Ponzetto, Simone Paolo
In this paper, we show how unsupervised sense representations can be used to improve hypernymy extraction. We present a method for extracting disambiguated hypernymy relationships that propagates hypernyms to sets of synonyms (synsets), constructs embeddings for these sets, and establishes sense-aware relationships between matching synsets. Evaluation on two gold standard datasets for English and Russian shows that the method successfully recognizes hypernymy relationships that cannot be found with standard Hearst patterns and Wiktionary datasets for the respective languages.
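A simplified sketch of the sense-aware attachment step described above: embed a synset as the mean of its members' word vectors, pool the noisy word-level hypernyms propagated over the synset, and pick the candidate synset closest to that pool. All inputs are toy placeholders, and the published method is more involved.

```python
# Toy sense-aware hypernym attachment over synsets.
import numpy as np

def embed_synset(synset, word_vecs):
    # Synset embedding: mean of the member words' vectors.
    vecs = [word_vecs[w] for w in synset if w in word_vecs]
    return np.mean(vecs, axis=0) if vecs else None

def attach_hypernym(synset, word_hypernyms, synsets, word_vecs):
    # Propagate noisy word-level hypernyms to the whole synset...
    candidates = {h for w in synset for h in word_hypernyms.get(w, ())}
    pool = [word_vecs[h] for h in candidates if h in word_vecs]
    if not pool:
        return None
    pooled = np.mean(pool, axis=0)
    # ...then disambiguate: choose the candidate synset whose embedding
    # is most similar (dot product) to the pooled hypernym vector.
    best, best_sim = None, -np.inf
    for s in synsets:
        if not candidates & set(s):
            continue
        emb = embed_synset(s, word_vecs)
        if emb is not None and float(pooled @ emb) > best_sim:
            best, best_sim = s, float(pooled @ emb)
    return best
```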
BabelRelate! A Joint Multilingual Approach to Computing Semantic Relatedness
Navigli, Roberto, Ponzetto, Simone Paolo
We present a knowledge-rich approach to computing semantic relatedness which exploits the joint contribution of different languages. Our approach is based on the lexicon and semantic knowledge of a wide-coverage multilingual knowledge base, which is used to compute semantic graphs in a variety of languages. Complementary information from these graphs is then combined to produce a 'core' graph where disambiguated translations are connected by means of strong semantic relations. We evaluate our approach on standard monolingual and bilingual datasets, and show that: i) we outperform a graph-based approach which does not use multilinguality in a joint way; ii) we achieve uniformly competitive results for both resource-rich and resource-poor languages.
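The 'core graph' construction lends itself to a toy sketch: merge the per-language semantic graphs and keep only edges supported by at least k languages, then score relatedness over the pruned graph. Graph construction from the multilingual knowledge base is elided here, and the neighborhood-overlap score is a stand-in for the paper's graph-based measure.

```python
# Toy 'core graph' pruning and a simple graph-based relatedness score.
from collections import Counter
import itertools

def core_graph(language_graphs, k=2):
    """language_graphs: one edge set per language; an edge is a
    frozenset of two sense ids shared across languages."""
    support = Counter(itertools.chain.from_iterable(language_graphs))
    # Keep only edges confirmed by at least k languages.
    return {edge for edge, votes in support.items() if votes >= k}

def relatedness(a, b, core):
    # Jaccard overlap of the two senses' neighborhoods in the core graph.
    def nbrs(x):
        return {next(iter(e - {x})) for e in core if x in e and len(e) == 2}
    na, nb = nbrs(a), nbrs(b)
    return len(na & nb) / len(na | nb) if na | nb else 0.0
```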