Language Alignment via Nash-learning and Adaptive feedback