Splintering Nonconcatenative Languages for Better Tokenization