Prompting Encoder Models for Zero-Shot Classification: A Cross-Domain Study in Italian

Auriemma, Serena; Miliani, Martina; Madeddu, Mauro; Bondielli, Alessandro; Passaro, Lucia; Lenci, Alessandro

arXiv.org Artificial Intelligence 

Pre-trained language models (LMs) have had a significant impact on Natural Language Processing (NLP), with the "pre-train and fine-tune" paradigm rapidly becoming the predominant approach for applying effective models to a wide variety of downstream tasks [1-3, inter alia]. However, a main concern when working with LMs is the paucity of annotated data, especially for specific domains or low-resource languages, needed to fine-tune the additional classification layer placed on top of these models for downstream tasks such as classification. Recently, prompt-based tuning has emerged as a promising way to perform similar tasks while significantly reducing the need for annotated data. This approach has proven very effective with Large Language Models (LLMs) [4]. However, LLMs are often unavailable for low-resource languages, and their performance decreases drastically when they are challenged on specific domains. Moreover, in the era of Digital Transformation, businesses frequently need to integrate artificial intelligence systems into their application ecosystems. This requires them to use specialized, publicly available models, together with effective methods for leveraging these models when annotated language resources are unavailable, that is, in a zero-shot setting. Hence, we decided to evaluate two smaller domain-specific encoder models, BureauBERTo [5], an LM further pre-trained on Italian bureaucratic texts (i.e., administrative acts, banking and insurance documents), and Italian Legal BERT [6] (henceforth Ita-Legal-BERT), an LM adapted to the Italian legal domain, on various classification tasks over domain-specific data, using a prompt-based technique in a zero-shot scenario. Additionally, we compared the performance of both models with that of a generic Italian model, UmBERTo.
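To make the prompt-based zero-shot setting concrete, the following is a minimal sketch of one common way to classify with an encoder model: wrap the input in a cloze template, read the model's scores at the masked position, and map label words (verbalizers) back to classes. It is not the authors' implementation. The template, the label set, the verbalizer words, the `[MASK]` string, and the `toy_mask_scores` stand-in for a real masked LM are all assumptions made for illustration, and the probabilities are made up.

```python
from typing import Callable, Dict, List

# Hypothetical cloze template ("This document is about [MASK].") and
# hypothetical verbalizers: one or more label words per class.
TEMPLATE = "{text} Questo documento riguarda {mask}."
VERBALIZERS: Dict[str, List[str]] = {
    "insurance": ["assicurazione"],
    "banking": ["banca"],
    "administration": ["comune"],
}


def zero_shot_classify(
    text: str,
    template: str,
    verbalizers: Dict[str, List[str]],
    mask_scores: Callable[[str], Dict[str, float]],
    mask_token: str = "[MASK]",
) -> str:
    """Return the class whose label words score highest at the mask position."""
    prompt = template.format(text=text, mask=mask_token)
    # mask_scores maps a prompt to {candidate token: score at the masked slot}.
    scores = mask_scores(prompt)
    # Sum the scores of each class's label words; missing words count as 0.
    label_scores = {
        label: sum(scores.get(word, 0.0) for word in words)
        for label, words in verbalizers.items()
    }
    return max(label_scores, key=label_scores.get)


def toy_mask_scores(prompt: str) -> Dict[str, float]:
    """Stand-in for a masked LM: fixed, made-up scores, for illustration only."""
    if "polizza" in prompt:
        return {"assicurazione": 0.6, "banca": 0.1, "comune": 0.05}
    return {"comune": 0.5, "banca": 0.2, "assicurazione": 0.1}


if __name__ == "__main__":
    label = zero_shot_classify(
        "La polizza copre i danni.", TEMPLATE, VERBALIZERS, toy_mask_scores
    )
    print(label)
```

In practice, `toy_mask_scores` would be replaced by a function that queries a masked language model such as the encoders named above. No labeled examples are needed because the decision relies only on the model's pre-trained token predictions.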
