Fine-Tuned Language Models Generate Stable Inorganic Materials as Text

Nate Gruver, Anuroop Sriram, Andrea Madotto, Andrew Gordon Wilson, C. Lawrence Zitnick, Zachary Ulissi


We propose fine-tuning large language models for the generation of stable materials. While unorthodox, fine-tuning large language models on text-encoded atomistic data is simple to implement yet reliable, with around 90% of sampled structures obeying physical constraints on atom positions and charges. Using energy above hull calculations from both learned ML potentials and gold-standard DFT calculations, we show that our strongest model (fine-tuned LLaMA-2 70B) can generate materials predicted to be metastable at about twice the rate (49% vs. 28%) of CDVAE, a competing diffusion model. Because of the inherent flexibility of text prompting, our models can simultaneously be used for unconditional generation of stable materials, infilling of partial structures, and text-conditional generation. Finally, we show that language models' ability to capture key symmetries of crystal structures improves with model scale, suggesting that the biases of pretrained LLMs are surprisingly well-suited for atomistic data.

Large language models (LLMs) are trained to compress large text datasets but can also act as strong foundations for non-text data (Delétang et al., 2023). As compressors, LLMs extract common patterns and find simple programs that can produce them (Goldblum et al., 2023; Sutskever, 2023), regardless of the data's origin. Alongside generality, LLM pre-training also gives rise to sample efficiency, as in-context learning and fine-tuning require far fewer training examples to identify salient patterns than training a model from scratch (Brown et al., 2020).

The generality and sample efficiency of LLMs make them particularly promising for scientific problems, where data are often limited, collected from diverse sources, or challenging for non-experts to interpret. In materials science, for example, the number of known stable materials is relatively small, and the data describing each material are diverse, including composition, structure, and complex properties. LLMs can learn generalizable rules from a small number of examples (Zhu et al., 2023), combine modalities into a single model (Moon et al., 2023), and provide users with a text-based interface. A text interface, in particular, has the potential to improve access to scientific discovery (White, 2023): LLMs can use text to describe new observations or serve as an interaction mechanism in design applications. In this work, we show that fine-tuned LLMs can generate the three-dimensional structure of stable crystals as text (Figure 1).
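To make the text encoding concrete, the sketch below shows one way a crystal could be serialized for fine-tuning, following the CIF-like layout in Figure 1: lattice lengths on one line, lattice angles on the next, then alternating element symbols and fractional coordinates. The exact precision and formatting here are illustrative assumptions, not a specification of the paper's tokenization.

    def encode_crystal(lengths, angles, species, frac_coords):
        """Serialize a crystal as plain text: lattice lengths, lattice
        angles, then alternating element symbols and fractional coords.
        Decimal precision is an assumption for illustration."""
        lines = [
            " ".join(f"{x:.1f}" for x in lengths),
            " ".join(f"{a:.0f}" for a in angles),
        ]
        for sym, (x, y, z) in zip(species, frac_coords):
            lines.append(sym)
            lines.append(f"{x:.2f} {y:.2f} {z:.2f}")
        return "\n".join(lines)

    # Rock-salt NaCl in a conventional-style cell (toy example):
    print(encode_crystal((3.5, 3.5, 3.5), (90, 90, 90),
                         ["Na", "Cl"],
                         [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]))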
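The reported ~90% validity rate refers to physical constraints on atom positions and charges. The following is a minimal sketch of such checks, assuming a minimum interatomic distance threshold and charge neutrality under common oxidation states; the threshold value and the oxidation-state table are illustrative, not the paper's exact criteria.

    import itertools
    import numpy as np

    # Truncated, illustrative table of common oxidation states.
    COMMON_OXIDATION_STATES = {"Na": [1], "Cl": [-1], "Ti": [2, 3, 4], "O": [-2]}

    def min_interatomic_distance(frac_coords, lattice):
        """Smallest pairwise distance in angstroms, including periodic images."""
        lattice = np.asarray(lattice)
        cart = np.asarray(frac_coords) @ lattice
        shifts = np.array(list(itertools.product([-1, 0, 1], repeat=3))) @ lattice
        dmin = np.inf
        for i, j in itertools.combinations(range(len(cart)), 2):
            dmin = min(dmin, np.min(np.linalg.norm(cart[i] - cart[j] + shifts, axis=1)))
        return dmin

    def is_charge_balanced(species):
        """True if some assignment of common oxidation states sums to zero."""
        choices = [COMMON_OXIDATION_STATES.get(s, [0]) for s in species]
        return any(sum(c) == 0 for c in itertools.product(*choices))

    def passes_basic_checks(species, frac_coords, lattice, d_tol=0.5):
        return (min_interatomic_distance(frac_coords, lattice) > d_tol
                and is_charge_balanced(species))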
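Energy above hull measures a candidate's decomposition energy relative to the convex hull of competing phases in its chemical system, with metastability typically taken as lying within ~0.1 eV/atom of the hull. Below is a toy computation with pymatgen's PhaseDiagram; the energies are placeholder values for illustration, whereas real use would draw on DFT-computed entries (e.g. from the Materials Project).

    from pymatgen.core import Composition
    from pymatgen.analysis.phase_diagram import PhaseDiagram, PDEntry

    # Placeholder total energies (eV), NOT real DFT data.
    entries = [
        PDEntry(Composition("Na"), 0.0),
        PDEntry(Composition("Cl2"), 0.0),
        PDEntry(Composition("NaCl"), -4.0),
    ]
    hull = PhaseDiagram(entries)

    # A hypothetical generated structure with composition Na2Cl2.
    candidate = PDEntry(Composition("Na2Cl2"), -7.8)
    print(hull.get_e_above_hull(candidate))  # eV/atom above the hull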
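The three usage modes follow from simple variations of the prompt wrapped around the encoded structure. The templates below are paraphrased illustrations of how unconditional, composition-conditional, and infilling prompts might look; they are not the paper's exact wording, and the [MASK] convention is an assumption.

    # Illustrative prompt templates (paraphrased, not the paper's exact text).
    UNCONDITIONAL = (
        "Below is a description of a bulk material.\n"
        "Generate the lattice lengths, lattice angles, and the element type "
        "and fractional coordinates of each atom:\n"
    )
    CONDITIONAL = (
        "Below is a description of a bulk material with chemical formula "
        "{formula}.\n"
        "Generate the lattice lengths, lattice angles, and the element type "
        "and fractional coordinates of each atom:\n"
    )
    INFILL = (
        "Below is a partial description of a bulk material in which one "
        "element has been replaced with [MASK]:\n{masked_structure}\n"
        "Generate an element to replace [MASK]:\n"
    )

    prompt = CONDITIONAL.format(formula="NaCl")  # then sample from the LLM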