On the Efficacy of Text-Based Input Modalities for Action Anticipation