Dynamic Word Tokenization with Regex Tokenizer