Preference Learning from Physics-Based Feedback: Tuning Language Models to Design BCC/B2 Superalloys
Ghosh, Satanu; Holgate, Collin; Brodnik, Neal R.; Downey, Doug; Daly, Samantha; Pollock, Tresa M.; Carton, Samuel
arXiv.org Artificial Intelligence
We apply preference learning to the task of language model-guided design of novel structural alloys. In contrast to prior work that focuses on generating stable inorganic crystals, our approach targets the synthesizability of a specific structural class: BCC/B2 superalloys, an underexplored family of materials with potential applications in extreme environments. Using three open-weight models (LLaMA-3.1, Gemma-2, and OLMo-2), we demonstrate that language models can be optimized for multiple design objectives using a single, unified reward signal through Direct Preference Optimization (DPO). Unlike prior approaches that rely on heuristic or costly human-in-the-loop feedback, our reward signal is derived from thermodynamic phase calculations, offering a scientifically grounded criterion for model tuning. To our knowledge, this is the first demonstration of preference-tuning a language model using physics-grounded feedback for structural alloy design. The resulting framework is general and extensible, providing a path forward for intelligent design-space exploration across a range of physical science domains.
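As a rough illustration of how a single scalar reward can drive DPO, the sketch below ranks candidate generations by a score, turns the ranking into (chosen, rejected) pairs, and evaluates the standard per-pair DPO loss. It is a minimal sketch under stated assumptions, not the authors' pipeline: `build_preference_pairs`, `dpo_loss`, the `margin` and `beta` values, the candidate labels, and the toy scores are all hypothetical, and the score lookup merely stands in for a thermodynamic phase calculation.

```python
import math
from itertools import combinations


def build_preference_pairs(candidates, score_fn, margin=0.0):
    """Turn scalar scores into (chosen, rejected) pairs.

    score_fn is a placeholder for a physics-based reward; here it is
    any callable that maps a candidate to a float (higher is better).
    """
    scored = [(c, score_fn(c)) for c in candidates]
    pairs = []
    for (a, score_a), (b, score_b) in combinations(scored, 2):
        if abs(score_a - score_b) <= margin:
            continue  # skip ties and near-ties: no clear preference
        pairs.append((a, b) if score_a > score_b else (b, a))
    return pairs


def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    loss = -log sigmoid(beta * ((policy margin) - (reference margin)))
    Inputs are sequence log-probabilities under the policy and the
    frozen reference model.
    """
    x = beta * ((logp_chosen - logp_rejected)
                - (ref_logp_chosen - ref_logp_rejected))
    # -log(sigmoid(x)) == softplus(-x), written in a numerically stable form
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # Hypothetical candidates with made-up scores (not real calculations).
    toy_scores = {"candidate_a": 0.9, "candidate_b": 0.2, "candidate_c": 0.5}
    for chosen, rejected in build_preference_pairs(toy_scores, toy_scores.get):
        print(f"prefer {chosen} over {rejected}")
    print(dpo_loss(-1.0, -5.0, -3.0, -3.0))
```

In practice the log-probabilities would come from the tuned model and a frozen reference copy, and the pairs would be fed to a DPO trainer rather than scored one at a time.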
Nov-18-2025
- Country:
  - Asia > Middle East
    - Jordan (0.04)
    - Republic of Türkiye > Karaman Province
      - Karaman (0.04)
  - Europe > Austria
    - Vienna (0.04)
  - North America > United States
    - California > Santa Barbara County
      - Santa Barbara (0.04)
    - New Hampshire (0.04)
    - Tennessee > Anderson County
      - Oak Ridge (0.04)
- Genre:
  - Research Report > New Finding (0.46)
- Technology: