Rethinking Tokenization: Crafting Better Tokenizers for Large Language Models