Advancing Neural Encoding of Portuguese with Transformer Albertina PT-*
Rodrigues, João, Gomes, Luís, Silva, João, Branco, António, Santos, Rodrigo, Cardoso, Henrique Lopes, Osório, Tomás
arXiv.org Artificial Intelligence
In recent years, the field of Artificial Intelligence has come to successfully exploit the paradigm of deep learning, a machine learning approach based on large artificial neural networks [LeCun et al., 2015]. Applied to Natural Language Processing (NLP), deep learning gained outstanding traction with notable breakthroughs under the distributional semantics approach, namely with word embedding techniques [Mikolov et al., 2013] and the Transformer neural architecture [Vaswani et al., 2017]. These neural models acquire semantic representations from massive amounts of data in a self-supervised learning process that ultimately results in the so-called Foundation Models [Bommasani et al., 2021]. Self-supervision is accomplished in NLP through language modeling [Bengio et al., 2000] and was initially adopted in shallow neural network models such as Word2Vec [Mikolov et al., 2013] for the creation of word embeddings. Over time, this approach was scaled beyond the single-token level to sequence transduction with encoder-decoder models based on recurrent [Sutskever et al., 2014] or convolutional neural networks, occasionally supported by attention mechanisms [Bahdanau et al., 2015]. A particular neural network architecture, the Transformer, has stood out among all others, showing superior performance by a large margin, sometimes even surpassing human-level performance [Wang et al., 2018, Wang et al., 2019], and has become mainstream in virtually every NLP task and application [Bommasani et al., 2021]. Several variants have spun off from the base Transformer architecture (encoder-decoder), including the landmark encoder BERT [Devlin et al., 2019] and the outstanding decoder GPT [Brown et al., 2020], which have been most successfully adapted to downstream
Jun-20-2023
- Country:
  - South America
    - Brazil (0.05)
    - Colombia > Meta Department
      - Villavicencio (0.04)
  - Europe > Portugal
- Genre:
  - Research Report (0.64)
- Technology: