Few-shot learning for automated content analysis: Efficient coding of arguments and claims in the debate on arms deliveries to Ukraine
Rieger, Jonas; Yanchenko, Kostiantyn; Ruckdeschel, Mattes; von Nordheim, Gerret; Kleinen-von Königslöw, Katharina; Wiedemann, Gregor
Pre-trained language models (PLMs) based on transformer neural networks developed in the field of natural language processing (NLP) offer great opportunities to improve automatic content analysis in communication science, especially for the coding of complex semantic categories in large datasets via supervised machine learning. However, three characteristics have so far impeded the widespread adoption of these methods in the applying disciplines: the dominance of English-language models in NLP research, the computing resources required, and the effort needed to produce training data for fine-tuning PLMs. In this study, we address these challenges by using a multilingual transformer model in combination with the adapter extension to transformers and few-shot learning methods. We test our approach on a realistic use case from communication science: automatically detecting claims and arguments, together with their stance, in the German news debate on arms deliveries to Ukraine. In three experiments, we evaluate (1) data preprocessing strategies and model variants for this task, (2) the performance of different few-shot learning methods, and (3) how well the best setup performs on varying training set sizes in terms of validity, reliability, replicability, and reproducibility of the results. We find that our proposed combination of transformer adapters with pattern-exploiting training provides a parameter-efficient and easily shareable alternative to fully fine-tuning PLMs. It performs on par in terms of validity while, overall, providing better properties for application in communication studies. The results also show that pre-fine-tuning for a task on a near-domain dataset leads to substantial improvement, in particular in the few-shot setting. Further, the results indicate that it is useful to bias the dataset away from the viewpoints of specific prominent individuals.
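The pattern-exploiting training mentioned in the abstract reformulates classification as a cloze task that a masked language model can answer directly. A minimal sketch of the pattern/verbalizer idea follows; the template wording, labels, and function names here are illustrative assumptions, not the paper's actual prompts:

```python
# Sketch of the pattern/verbalizer mechanism behind pattern-exploiting
# training (PET): a classification example is wrapped in a cloze
# template, and each label is "verbalized" as a token the masked
# language model can predict. All names below are illustrative.

MASK = "[MASK]"

def pattern(sentence: str) -> str:
    """Wrap a sentence in a hypothetical cloze template for stance detection."""
    return (f'"{sentence}" Question: Does this statement support '
            f'arms deliveries? Answer: {MASK}.')

# Verbalizer: map each stance label to a single natural-language token
# that the model predicts at the masked position.
VERBALIZER = {
    "pro": "Yes",
    "contra": "No",
    "neutral": "Maybe",
}

def label_from_token(token: str) -> str:
    """Invert the verbalizer to recover the label from a predicted token."""
    inverse = {tok: label for label, tok in VERBALIZER.items()}
    return inverse[token]

# In a real setup, a masked LM scores the verbalizer tokens at the
# [MASK] position and the highest-scoring token's label is predicted;
# this is what makes the method usable with very few training examples.
example = pattern("Germany should send heavy weapons.")
```

The key design point is that no new classification head is trained from scratch: the model reuses its pre-trained masked-word prediction ability, which is why the approach works in few-shot settings.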
Machine Learning Meets the Maestros
Even if you can't name the tunes, you've probably heard them: from the iconic "dun-dun-dun-dunnnn" opening of Beethoven's Fifth Symphony to the melody of "Ode to Joy," the German composer's symphonies are some of the best known and most widely performed works in classical music. Just as enthusiasts can recognize stylistic differences between one orchestra's version of Beethoven's hits and another's, now machines can, too. A Duke University team has developed a machine learning algorithm that "listens" to multiple performances of the same piece and can tell the difference between, say, the Berlin Philharmonic and the London Symphony Orchestra, based on subtle differences in how they interpret a score. In a study published in a recent issue of the journal Annals of Applied Statistics, the team set the algorithm loose on all nine Beethoven symphonies as performed by 10 different orchestras over nearly eight decades, from a 1939 recording of the NBC Symphony Orchestra conducted by Arturo Toscanini to Simon Rattle's version with the Berlin Philharmonic in 2016. Although each follows the same fixed score -- the published reference left by Beethoven about how to play the notes -- every orchestra has a slightly different way of turning a score into sounds.