A Comprehensive Survey of Cross-Domain Policy Transfer for Embodied Agents

Niu, Haoyi; Hu, Jianming; Zhou, Guyue; Zhan, Xianyuan

arXiv.org Artificial Intelligence 

[...] unbiased data from the target domain remains a challenge due to costly data collection processes and stringent safety requirements. Consequently, researchers often resort to data from easily accessible source domains, such as simulation and laboratory environments, for cost-effective data acquisition and rapid model iteration. Nevertheless, the environments and embodiments of these source domains can be quite different from their target domain counterparts, underscoring the need for effective cross-domain policy transfer approaches. In [...]

In some settings, although human demonstration videos can be easily recorded in a controllable manner in the target environment, the distinct embodiment from the target robot agents hinders their direct use in policy learning (Yu et al., 2018). Such intricate environment and embodiment discrepancies, also referred to as domain gaps, negatively impact policies trained on source domain data and inevitably lead to their deployment failures in the target domains. The data bottlenecks in real-world tasks and the wide existence of domain gaps naturally stimulated cross-domain policy transfer studies, aiming to fully exploit existing off-domain data to learn transferable policies.