Interaction Dynamics as a Reward Signal for LLMs