Back to Bytes: Revisiting Tokenization Through UTF-8
