Tokenization as Finite-State Transduction