Self-Supervised Learning for Ordered Three-Dimensional Structures

Spellings, Matthew, Martirossyan, Maya, Dshemuchadse, Julia

arXiv.org Artificial Intelligence 

Recent work on GPT [1], BERT [2], and related models has proven immensely successful, not only in direct language modeling tasks but also in other domains including translation, question answering, and even code [3] and music [4] generation. In addition to directly performing transfer learning, prompt engineering has emerged as a promising method to leverage the power of large language models trained on diverse types of text [5, 6]. The general strategy of pretraining large models on easily-gathered unlabeled data using self-supervised tasks, then fine-tuning on more relevant labeled data, is especially appealing for scientific domains where labeled data may be difficult to come by. In materials physics, it is well understood that structure plays a significant role in the electrical, thermal, and mechanical properties of a material, and scientists target particular structures as they design new materials for desired applications. For crystals, "structure" typically refers to the basic building unit that is repeated along a periodic lattice to create a bulk crystal, but, particularly for aperiodic or non-crystalline materials, it can also refer to any symmetry or non-random ordering present in the arrangements of particles or atoms. Assessing order and its evolution in three-dimensional structures is a challenging but critical step in understanding the self-assembly and growth of complex materials; as the scope and magnitude of experimental and simulation data analysis continue to expand, machine learning techniques able to leverage large amounts of unlabeled data will become ever more crucial. In this work, we use self-supervised learning (SSL) tasks that can broadly be used to train models for quantifying order and distinguishing assemblies in non-idealized material structures. 
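To make the general SSL recipe concrete, the sketch below builds a masked-reconstruction pretraining pair from a 3D point cloud: a fraction of the points is hidden, and a model would be trained to predict the hidden coordinates from the visible ones. This is a minimal illustration of the generic idea, not the specific tasks used in this work; all function names are our own, and the nearest-neighbor "model" stands in for a trained network.

```python
import numpy as np

def make_masked_reconstruction_task(points, mask_fraction=0.25, rng=None):
    """Split a 3D point cloud into visible inputs and masked targets.

    The SSL objective is to predict the masked coordinates from the
    visible ones; here we only construct the (inputs, targets) pair.
    """
    rng = np.random.default_rng(rng)
    n = len(points)
    n_masked = max(1, int(mask_fraction * n))
    masked_idx = rng.choice(n, size=n_masked, replace=False)
    mask = np.zeros(n, dtype=bool)
    mask[masked_idx] = True
    return points[~mask], points[mask]

def nearest_neighbor_baseline(visible, masked):
    """Trivial stand-in 'model': predict each masked point as its
    nearest visible point (a real model would use a neural network)."""
    dists = np.linalg.norm(masked[:, None, :] - visible[None, :, :], axis=-1)
    return visible[np.argmin(dists, axis=1)]

rng = np.random.default_rng(0)
cloud = rng.random((64, 3))                      # toy point cloud in the unit box
visible, masked = make_masked_reconstruction_task(cloud, 0.25, rng=1)
pred = nearest_neighbor_baseline(visible, masked)
loss = float(np.mean((pred - masked) ** 2))      # reconstruction MSE to minimize
```

In an actual pretraining loop, the reconstruction loss would be backpropagated through a point-cloud network rather than computed from a fixed baseline.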
The choice of SSL for this application was inspired by previous work that has developed SSL tasks for three-dimensional point clouds, which are a natural choice for representing three-dimensional positional data. Thabet et al. [7] formulated self-supervised tasks in terms of a space-filling curve; Sharma and Kaul [8] trained deep networks to model data based on a three-dimensional cover tree; the method of Eckart et al. [9] models simple, soft "patches" of 3D point clouds to reconstruct its inputs; and Pang et al. [10] spatially mask