Improving Social Meaning Detection with Pragmatic Masking and Surrogate Fine-Tuning
Zhang, Chiyu, Abdul-Mageed, Muhammad, Elmadany, AbdelRahim, Nagoudi, El Moatez Billah
–arXiv.org Artificial Intelligence
Masked language models (MLMs) are pretrained with a denoising objective that, while useful, is in a mismatch with the objective of downstream fine-tuning. We propose pragmatic masking and surrogate fine-tuning as two strategies that exploit social cues to drive pre-trained representations toward a broad set of concepts useful for a wide class of social meaning tasks. To test our methods, we introduce a new benchmark of 15 different Twitter datasets for social meaning detection. Our methods achieve 2.34% F1 over a competitive baseline, while outperforming other transfer learning methods such as multi-task learning and domain-specific language models pretrained on large datasets. With only 5% of training data (severely few-shot), our methods enable an impressive 68.74% average F1, and we observe promising results in a zero-shot setting involving six datasets from three different languages.
arXiv.org Artificial Intelligence
Jul-31-2021
- Country:
- North America
- United States
- Washington > King County
- Seattle (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Colorado > Denver County
- Denver (0.04)
- California
- Los Angeles County > Long Beach (0.14)
- San Diego County > San Diego (0.04)
- Washington > King County
- Canada > British Columbia
- United States
- Europe
- Slovenia (0.04)
- Italy > Tuscany
- Florence (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.04)
- Denmark > Capital Region
- Copenhagen (0.04)
- Asia
- Africa > Ethiopia
- Addis Ababa > Addis Ababa (0.04)
- North America
- Genre:
- Research Report (1.00)
- Technology: