Enhancing Spoken Discourse Modeling in Language Models Using Gestural Cues