Release New tokenizer API, TensorFlow improvements, enhanced documentation & tutorials · huggingface/transformers
The tokenizers have evolved quickly in version 2, with the addition of Rust tokenizers. They now have a simpler and more flexible API that is aligned between the Python (slow) and Rust (fast) tokenizers. This new API lets you control truncation and padding in more detail, allowing things like dynamic padding or padding to a multiple of 8. MobileBERT, from "MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices" by Zhiqing Sun, Hongkun Yu, Xiaodan Song, Renjie Liu, Yiming Yang, and Denny Zhou, was added to the library for both PyTorch and TensorFlow. This model was first implemented in PyTorch by @lonePatient, ported to the library by @vshampor, then finalized and implemented in TensorFlow by @LysandreJik.
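To make the padding behavior mentioned above concrete, here is a minimal pure-Python sketch of dynamic padding to a multiple of 8. It is not the library's implementation: the helper `pad_batch`, its parameters, and the example token IDs are hypothetical and chosen only for illustration.

```python
from typing import List


def pad_batch(
    batch: List[List[int]],
    pad_id: int = 0,              # hypothetical padding token ID
    pad_to_multiple_of: int = 8,  # round the padded length up to this multiple
) -> List[List[int]]:
    """Pad each sequence to the longest length in the batch, rounded up to a multiple."""
    if not batch:
        return []
    # Dynamic padding: the target depends on this batch only, not on a global maximum.
    longest = max(len(seq) for seq in batch)
    # Ceiling division, then scale back up to the next multiple.
    target = -(-longest // pad_to_multiple_of) * pad_to_multiple_of
    return [seq + [pad_id] * (target - len(seq)) for seq in batch]


# Illustrative token IDs (made up for this example).
batch = [
    [101, 7592, 102],
    [101, 7592, 2088, 999, 1996, 2158, 2003, 2182, 102],
]
padded = pad_batch(batch)
lengths = [len(seq) for seq in padded]
print(lengths)
```

The longest sequence here has 9 tokens, so both sequences are padded to 16, the next multiple of 8. Lengths that are multiples of 8 tend to suit hardware such as NVIDIA Tensor Cores. In the library itself, the corresponding tokenizer call arguments appear to be `padding`, `truncation`, and `pad_to_multiple_of`; check the tokenizer documentation for the exact behavior.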
Jun-30-2020, 15:46:42 GMT