Squeeze the Soaked Sponge: Efficient Off-policy Reinforcement Finetuning for Large Language Model