How do different tokenizers perform on downstream tasks in scriptio continua languages?: A case study in Japanese
