Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models