Taming the Tail: Stable LLM Reinforcement Learning via Dynamic Vocabulary Pruning