Using Language Models on Low-end Hardware
Fabian Ziegner, Janos Borst, Andreas Niekler, Martin Potthast
The transition to neural networks as the primary machine learning paradigm in natural language processing (NLP), and especially the pre-training of language models, has become a major driver of NLP tasks within the Digital Humanities. Many applications in fields such as Library Science, Literature Studies, and Cultural Studies have improved dramatically, and the automation of text-based tasks is becoming widely feasible. Current state-of-the-art approaches use pre-trained neural language models that are fine-tuned to a given set of target variables (i.e., by training all parameters of the language model). Training neural networks requires computing a gradient for every layer and batch element, which can easily triple the required memory. These complex, multi-step architectures therefore often rely on specialized hardware, such as graphics processing units (GPUs), to be trained efficiently.
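To make the memory overhead concrete, the following back-of-the-envelope sketch estimates the training footprint of a model at roughly BERT-base scale. The parameter count, fp32 precision, and the use of the Adam optimizer are illustrative assumptions, not figures from the paper; activation memory, which additionally grows with batch size and sequence length, is left out.

    # Rough fp32 training-memory estimate for a BERT-base-sized model.
    # All figures are illustrative assumptions, not measurements.
    PARAMS = 110_000_000   # assumed parameter count (BERT-base scale)
    BYTES_FP32 = 4         # bytes per fp32 value

    weights = PARAMS * BYTES_FP32          # model weights
    gradients = PARAMS * BYTES_FP32        # one gradient value per parameter
    adam_states = 2 * PARAMS * BYTES_FP32  # Adam keeps two moments per parameter

    total = weights + gradients + adam_states
    print(f"weights:   {weights / 1e9:.2f} GB")
    print(f"gradients: {gradients / 1e9:.2f} GB")
    print(f"optimizer: {adam_states / 1e9:.2f} GB")
    print(f"total:     {total / 1e9:.2f} GB (~{total / weights:.0f}x the weights alone)")

Under these assumptions, gradients alone double the footprint of the weights, and optimizer state pushes it further, which is why full fine-tuning is difficult on low-end hardware even for moderately sized models.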
arXiv.org Artificial Intelligence
May 8, 2023