Beyond Single Stationary Policies: Meta-Task Players as Naturally Superior Collaborators

Neural Information Processing Systems 

In human-AI collaborative tasks, the distribution of human behavior, shaped by people's mental models, is non-stationary: it manifests as varying levels of initiative and different collaborative strategies. A significant challenge in human-AI collaboration is therefore how to collaborate effectively with humans whose behavior is non-stationary. Current collaborative agents are trained by first running self-play (SP) multiple times to build a policy pool and then training a final adaptive policy against this pool. The resulting agent is still a single policy network, which is insufficient for handling non-stationary human dynamics. We observe that, despite the inherent diversity of human behavior, the underlying meta-tasks within a specific collaborative context tend to be strikingly similar.
