polymorph
OXtal: An All-Atom Diffusion Model for Organic Crystal Structure Prediction
Jin, Emily, Nica, Andrei Cristian, Galkin, Mikhail, Rector-Brooks, Jarrid, Lee, Kin Long Kelvin, Miret, Santiago, Arnold, Frances H., Bronstein, Michael, Bose, Avishek Joey, Tong, Alexander, Liu, Cheng-Hao
Accurately predicting experimentally-realizable 3D molecular crystal structures from their 2D chemical graphs is a long-standing open challenge in computational chemistry called crystal structure prediction (CSP). Efficiently solving this problem has implications ranging from pharmaceuticals to organic semiconductors, as crystal packing directly governs the physical and chemical properties of organic solids. In this paper, we introduce OXtal, a large-scale 100M parameter all-atom diffusion model that directly learns the conditional joint distribution over intramolecular conformations and periodic packing. To efficiently scale OXtal, we abandon explicit equivariant architectures imposing inductive bias arising from crystal symmetries in favor of data augmentation strategies. We further propose a novel crystallization-inspired lattice-free training scheme, Stoichiometric Stochastic Shell Sampling ($S^4$), that efficiently captures long-range interactions while sidestepping explicit lattice parametrization -- thus enabling more scalable architectural choices at all-atom resolution. By leveraging a large dataset of 600K experimentally validated crystal structures (including rigid and flexible molecules, co-crystals, and solvates), OXtal achieves orders-of-magnitude improvements over prior ab initio machine learning CSP methods, while remaining orders of magnitude cheaper than traditional quantum-chemical approaches. Specifically, OXtal recovers experimental structures with conformer $\text{RMSD}_1<0.5$ Å and attains over 80\% packing similarity rate, demonstrating its ability to model both thermodynamic and kinetic regularities of molecular crystallization.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > Canada > Quebec (0.04)
- Europe > United Kingdom > Wales (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.45)
All that structure matches does not glitter
Martirossyan, Maya M., Egg, Thomas, Hoellmer, Philipp, Karypis, George, Transtrum, Mark, Roitberg, Adrian, Liu, Mingjie, Hennig, Richard G., Tadmor, Ellad B., Martiniani, Stefano
Generative models for materials, especially inorganic crystals, hold potential to transform the theoretical prediction of novel compounds and structures. Advancement in this field depends on robust benchmarks and minimal, information-rich datasets that enable meaningful model evaluation. This paper critically examines common datasets and reported metrics for a crystal structure prediction task$\unicode{x2014}$generating the most likely structures given the chemical composition of a material. We focus on three key issues: First, materials datasets should contain unique crystal structures; for example, we show that the widely-utilized carbon-24 dataset only contains $\approx$40% unique structures. Second, materials datasets should not be split randomly if polymorphs of many different compositions are numerous, which we find to be the case for the perov-5 and MP-20 datasets. Third, benchmarks can mislead if used uncritically, e.g., reporting a match rate metric without considering the structural variety exhibited by identical building blocks. To address these oft-overlooked issues, we introduce several fixes. We provide revised versions of the carbon-24 dataset: one with duplicates removed, one deduplicated and split by number of atoms $N$, one with enantiomorphs, and two containing only identical structures but with different unit cells. We also propose new splits for datasets with polymorphs, ensuring that polymorphs are grouped within each split subset, setting a more sensible standard for benchmarking model performance. Finally, we present METRe and cRMSE, new model evaluation metrics that can correct existing issues with the match rate metric.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
- North America > United States > Florida > Alachua County > Gainesville (0.14)
- North America > United States > Utah > Utah County > Provo (0.04)
- (2 more...)
- Health & Medicine (0.67)
- Materials > Chemicals (0.67)
FastCSP: Accelerated Molecular Crystal Structure Prediction with Universal Model for Atoms
Gharakhanyan, Vahe, Yang, Yi, Barroso-Luque, Luis, Shuaibi, Muhammed, Levine, Daniel S., Michel, Kyle, Bernat, Viachaslau, Dzamba, Misko, Fu, Xiang, Gao, Meng, Liu, Xingyu, Noori, Keian, Purvis, Lafe J., Rao, Tingling, Wood, Brandon M., Rizvi, Ammar, Uyttendaele, Matt, Ouderkirk, Andrew J., Daraio, Chiara, Zitnick, C. Lawrence, Boromand, Arman, Marom, Noa, Ulissi, Zachary W., Sriram, Anuroop
Crystal Structure Prediction (CSP) of molecular crystals plays a central role in applications, such as pharmaceuticals and organic electronics. CSP is challenging and computationally expensive due to the need to explore a large search space with sufficient accuracy to capture energy differences of a few kJ/mol between polymorphs. Dispersion-inclusive density functional theory (DFT) provides the required accuracy but its computational cost is impractical for a large number of putative structures. We introduce FastCSP, an open-source, high-throughput CSP workflow based on machine learning interatomic potentials (MLIPs). FastCSP combines random structure generation using Genarris 3.0 with geometry relaxation and free energy calculations powered entirely by the Universal Model for Atoms (UMA) MLIP. We benchmark FastCSP on a curated set of 28 mostly rigid molecules, demonstrating that our workflow consistently generates known experimental structures and ranks them within 5 kJ/mol per molecule of the global minimum. Our results demonstrate that universal MLIPs can be used across diverse compounds without requiring system-specific tuning. Moreover, the speed and accuracy afforded by UMA eliminate the need for classical force fields in the early stages of CSP and for final re-ranking with DFT. The open-source release of the entire FastCSP workflow significantly lowers the barrier to accessing CSP. CSP results for a single system can be obtained within hours on tens of modern GPUs, making high-throughput crystal structure prediction feasible for a broad range of scientific applications.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- Africa > Togo (0.04)
- North America > United States > Texas (0.04)
- (3 more...)
Toward Routine CSP of Pharmaceuticals: A Fully Automated Protocol Using Neural Network Potentials
Glick, Zachary L., Metcalf, Derek P., Swarthout, Scott F.
Crystal structure prediction (CSP) is a useful tool in pharmaceutical development for identifying and assessing risks associated with polymorphism, yet widespread adoption has been hindered by high computational costs and the need for both manual specification and expert knowledge to achieve useful results. Here, we introduce a fully automated, high-throughput CSP protocol designed to overcome these barriers. The protocol's efficiency is driven by Lavo-NN, a novel neural network potential (NNP) architected and trained specifically for pharmaceutical crystal structure generation and ranking. This NNP-driven crystal generation phase is integrated into a scalable cloud-based workflow. We validate this CSP protocol on an extensive retrospective benchmark of 49 unique molecules, almost all of which are drug-like, successfully generating structures that match all 110 $Z' = 1$ experimental polymorphs. The average CSP in this benchmark is performed with approximately 8.4k CPU hours, which is a significant reduction compared to other protocols. The practical utility of the protocol is further demonstrated through case studies that resolve ambiguities in experimental data and a semi-blinded challenge that successfully identifies and ranks polymorphs of three modern drugs from powder X-ray diffraction patterns alone. By significantly reducing the required time and cost, the protocol enables CSP to be routinely deployed earlier in the drug discovery pipeline, such as during lead optimization. Rapid turnaround times and high throughput also enable CSP that can be run in parallel with experimental screening, providing chemists with real-time insights to guide their work in the lab.
- North America > United States (1.00)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Weinheim (0.04)
Polymorphism Crystal Structure Prediction with Adaptive Space Group Diversity Control
Omee, Sadman Sadeed, Wei, Lai, Dey, Sourin, Hu, Jianjun
Crystalline materials can form different structural arrangements (i.e. polymorphs) with the same chemical composition, exhibiting distinct physical properties depending on how they were synthesized or the conditions under which they operate. For example, carbon can exist as graphite (soft, conductive) or diamond (hard, insulating). Computational methods that can predict these polymorphs are vital in materials science, which help understand stability relationships, guide synthesis efforts, and discover new materials with desired properties without extensive trial-and-error experimentation. However, effective crystal structure prediction (CSP) algorithms for inorganic polymorph structures remain limited. We propose ParetoCSP2, a multi-objective genetic algorithm for polymorphism CSP that incorporates an adaptive space group diversity control technique, preventing over-representation of any single space group in the population guided by a neural network interatomic potential. Using an improved population initialization method and performing iterative structure relaxation, ParetoCSP2 not only alleviates premature convergence but also achieves improved convergence speed. Our results show that ParetoCSP2 achieves excellent performance in polymorphism prediction, including a nearly perfect space group and structural similarity accuracy for formulas with two polymorphs but with the same number of unit cell atoms. Evaluated on a benchmark dataset, it outperforms baseline algorithms by factors of 2.46-8.62 for these accuracies and improves by 44.8\%-87.04\% across key performance metrics for regular CSP. Our source code is freely available at https://github.com/usccolumbia/ParetoCSP2.
AI-Driven Robotic Crystal Explorer for Rapid Polymorph Identification
Lee, Edward C, Salley, Daniel, Sharma, Abhishek, Cronin, Leroy
Crystallisation is an important phenomenon which facilitates the purification as well as structural and bulk phase material characterisation using crystallographic methods. However, different conditions can lead to a vast set of different crystal structure polymorphs and these often exhibit different physical properties, allowing materials to be tailored to specific purposes. This means the high dimensionality that can result from variations in the conditions which affect crystallisation, and the interaction between them, means that exhaustive exploration is difficult, time-consuming, and costly to explore. Herein we present a robotic crystal search engine for the automated and efficient high-throughput approach to the exploration of crystallisation conditions. The system comprises a closed-loop computer crystal-vision system that uses machine learning to both identify crystals and classify their identity in a multiplexed robotic platform. By exploring the formation of a well-known polymorph, we were able to show how a robotic system could be used to efficiently search experimental space as a function of relative polymorph amount and efficiently create a high dimensionality phase diagram with minimal experimental budget and without expensive analytical techniques such as crystallography. In this way, we identify the set of polymorphs possible within a set of experimental conditions, as well as the optimal values of these conditions to grow each polymorph.
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Portugal > Braga > Braga (0.04)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Modular, Multi-Robot Integration of Laboratories: An Autonomous Solid-State Workflow for Powder X-Ray Diffraction
Lunt, Amy. M., Fakhruldeen, Hatem, Pizzuto, Gabriella, Longley, Louis, White, Alexander, Rankin, Nicola, Clowes, Rob, Alston, Ben, Gigli, Lucia, Day, Graeme M., Cooper, Andrew I., Chong, Sam. Y.
Automation can transform productivity in research activities that use liquid handling, such as organic synthesis, but it has made less impact in materials laboratories, which require sample preparation steps and a range of solid-state characterization techniques. For example, powder X-ray diffraction (PXRD) is a key method in materials and pharmaceutical chemistry, but its end-to-end automation is challenging because it involves solid powder handling and sample processing. Here we present a fully autonomous solid-state workflow for PXRD experiments that can match or even surpass manual data quality. The workflow involves 12 steps performed by a team of three multipurpose robots, illustrating the power of flexible, modular automation to integrate complex, multitask laboratories.
- Europe > United Kingdom (0.05)
- North America > United States (0.04)
Predicting emergence of crystals from amorphous matter with deep learning
Aykol, Muratahan, Merchant, Amil, Batzner, Simon, Wei, Jennifer N., Cubuk, Ekin Dogus
Crystallization of the amorphous phases into metastable crystals plays a fundamental role in the formation of new matter, from geological to biological processes in nature to synthesis and development of new materials in the laboratory. Predicting the outcome of such phase transitions reliably would enable new research directions in these areas, but has remained beyond reach with molecular modeling or ab-initio methods. Here, we show that crystallization products of amorphous phases can be predicted in any inorganic chemistry by sampling the crystallization pathways of their local structural motifs at the atomistic level using universal deep learning potentials. We show that this approach identifies the crystal structures of polymorphs that initially nucleate from amorphous precursors with high accuracy across a diverse set of material systems, including polymorphic oxides, nitrides, carbides, fluorides, chlorides, chalcogenides, and metal alloys. Our results demonstrate that Ostwald's rule of stages can be exploited mechanistically at the molecular level to predictably access new metastable crystals from the amorphous phase in material synthesis.
A Father's Quest for an Accessible Game Controller
In June, 8BitDo, known for creating third-party controllers and adapters, announced their latest controller for the Nintendo Switch and Android devices. The Lite SE, created through collaborative efforts with father and son team Andreas and Oskar Karlsson, is designed specifically for physically disabled players with limited strength and mobility. The launch of this controller not only marks the culmination of years of hard work by Andreas to search for an affordable and accessible controller for his son, but it also expands the market of accessible gaming tech. At a young age Oskar was diagnosed with spinal muscular atrophy type II, a neuromuscular disorder that progressively weakens muscles over time. Despite playing games throughout his life, his father regularly adapted standard controllers to meet his son's needs.
- Leisure & Entertainment > Games > Computer Games (1.00)
- Health & Medicine (1.00)
- Information Technology > Communications > Mobile (0.56)
- Information Technology > Artificial Intelligence (0.37)
Polymorph: A Model for Dynamic Level Generation
Jennings-Teats, Martin (University of California, Santa Cruz) | Smith, Gillian (University of California, Santa Cruz) | Wardrip-Fruin, Noah (University of California, Santa Cruz)
Players begin games at different skill levels and develop their skill at different rates—so that even the best-designed games are uninterestingly easy for some players and frustratingly difficult for others. A proposed answer to this challenge is Dynamic Difficulty Adjustment (DDA), a general category of approaches that alter games during play, in response to player performance. However, nearly all these techniques are focused on basic parameter tweaking, while the difficulty of many games is connected to aspects that are more challenging to adjust dynamically, such as level design. Further, most DDA techniques are based on designer intuition, which may not reflect actual play patterns. Responding to these challenges, we have created Polymorph, which employs techniques from level generation and machine learning to understand level difficulty and player skill, dynamically constructing levels for a 2D platformer game with continually-appropriate challenge. We present the results of the user study on which Polymorph's model of level difficulty is based, as well as a discussion of the unique features of the model. We believe Polymorph creates a play experience that is unique because the changes are both personalized and structural, while also providing an example of a new application of machine learning to aid game design.
- North America > United States > California > Santa Cruz County > Santa Cruz (0.14)
- North America > United States > New York > New York County > New York City (0.04)