Visualizing the Local Atomic Environment Features of Machine Learning Interatomic Potential

Shao, Xuqiang, Zhang, Yuqi, Zhang, Di, Gao, Tianxiang, Liu, Xinyuan, Gan, Zhiran, Meng, Fanshun, Li, Hao, Yang, Weijie

arXiv.org Artificial Intelligence 

This paper addresses the challenge of building efficient, high-quality datasets for machine learning potential functions. We present a novel approach, termed DV-LAE (Difference Vectors based on Local Atomic Environments), which uses the properties of local atomic environments and histogram statistics to generate difference vectors. This technique supports dataset screening and optimization, reducing redundancy while maintaining data diversity. We validated the optimized datasets on high-temperature, high-pressure hydrogen systems and on the α-Fe/H binary system, demonstrating a significant reduction in computational resource usage without compromising prediction accuracy. In addition, the method revealed new structures that emerge during simulations but were underrepresented in the initial training datasets. Both the redundancy in the datasets and the distribution of these new structures can be analyzed visually by visualizing the difference vectors. This approach improves our understanding of the characteristics of these newly formed structures and their impact on physical processes.
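The abstract does not spell out how the difference vectors are built, so the following is only a minimal sketch of the general idea, not the authors' DV-LAE implementation. It assumes a deliberately simple descriptor (a normalized histogram of interatomic distances below a cutoff, without periodic images), takes the difference vector as that histogram minus a reference histogram, and screens a dataset greedily by Euclidean distance between difference vectors. The function names, the cutoff, the bin count, and the tolerance are all hypothetical choices made for illustration.

```python
import math


def distance_histogram(positions, cutoff=3.0, n_bins=6):
    """Normalized histogram of pair distances below `cutoff`.

    Assumption: a stand-in descriptor for local atomic environments;
    periodic boundary conditions are ignored for brevity.
    """
    counts = [0] * n_bins
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            d = math.dist(positions[i], positions[j])
            if d < cutoff:
                # Clamp the index to guard against floating-point edge cases.
                counts[min(int(d / cutoff * n_bins), n_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


def difference_vector(hist, reference):
    """Element-wise difference between a histogram and a reference histogram."""
    return [h - r for h, r in zip(hist, reference)]


def screen(structures, reference, tol=0.1, cutoff=3.0, n_bins=6):
    """Greedy screening (assumed criterion): keep a structure only if its
    difference vector lies farther than `tol` from every vector already kept."""
    kept, kept_vecs = [], []
    for idx, positions in enumerate(structures):
        hist = distance_histogram(positions, cutoff, n_bins)
        vec = difference_vector(hist, reference)
        if all(math.dist(vec, k) > tol for k in kept_vecs):
            kept.append(idx)
            kept_vecs.append(vec)
    return kept


# Toy data: two identical dimers and one stretched dimer (arbitrary units).
dimer = [(0.0, 0.0, 0.0), (0.8, 0.0, 0.0)]
duplicate = [(0.0, 0.0, 0.0), (0.8, 0.0, 0.0)]
stretched = [(0.0, 0.0, 0.0), (2.6, 0.0, 0.0)]

reference = distance_histogram(dimer)
selected = screen([dimer, duplicate, stretched], reference)
print(selected)
```

In this toy case the duplicate dimer produces the same difference vector as the first structure and is dropped, while the stretched dimer falls in a different histogram bin and is retained. Plotting the kept and dropped difference vectors (for example after a 2D projection) would be one way to inspect redundancy visually, in the spirit of the analysis the abstract describes.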