SPINE: Token-Selective Test-Time Reinforcement Learning with Entropy-Band Regularization