Using Language Models on Low-end Hardware

Fabian Ziegner, Janos Borst, Andreas Niekler, Martin Potthast

arXiv.org Artificial Intelligence 

The transition to neural networks as the primary machine learning paradigm in natural language processing (NLP), and especially the pre-training of language models, has become a major driver of NLP tasks within the Digital Humanities. Many applications in fields such as Library Science, Literature Studies, and Cultural Studies have improved dramatically, and the automation of text-based tasks is becoming widely feasible. Current state-of-the-art approaches use pre-trained neural language models that are fine-tuned to a given set of target variables (i.e., by training all parameters of the language model). Training neural networks requires computing a gradient for every layer and batch element, which can easily triple the required memory. Such complex, multi-step architectures therefore typically rely on specialized hardware, for example graphics processing units (GPUs), to be trained efficiently.
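To illustrate the memory argument, the following is a minimal sketch (not the authors' implementation) using the Hugging Face Transformers library: full fine-tuning keeps gradients and optimizer state for every parameter of the language model, whereas freezing the pre-trained encoder leaves only a small classification head trainable. The model name, label count, and freezing strategy are illustrative assumptions chosen for the example.

```python
from transformers import AutoModelForSequenceClassification

# Load a pre-trained language model with a classification head.
# Full fine-tuning would compute and store gradients (plus optimizer
# state) for all of these parameters, multiplying memory requirements.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # assumed model and label count
)

# One way to reduce memory on low-end hardware: freeze the pre-trained
# encoder so gradients are stored only for the classification head.
for param in model.base_model.parameters():
    param.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable:,} / {total:,}")
```

Whether such parameter freezing is acceptable depends on the target task; it trades some task-specific adaptation capacity for a much smaller gradient and optimizer footprint.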
