AITopics | crystallography

Collaborating Authors

crystallography

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Completion of partial structures using Patterson maps with the CrysFormer machine learning model

Pan, Tom, Dramko, Evan, Miller, Mitchell D., Kyrillidis, Anastasios, Phillips, George N. Jr

arXiv.org Artificial IntelligenceNov-14-2025

Protein structure determination has long been one of the primary challenges of structural biology, to which deep machine learning (ML)-based approaches have increasingly been applied. However, these ML models generally do not incorporate the experimental measurements directly, such as X-ray crystallographic diffraction data. To this end, we explore an approach that more tightly couples these traditional crystallographic and recent ML-based methods, by training a hybrid 3-d vision transformer and convolutional network on inputs from both domains. We make use of two distinct input constructs / Patterson maps, which are directly obtainable from crystallographic data, and "partial structure" template maps derived from predicted structures deposited in the AlphaFold Protein Structure Database with subsequently omitted residues. With these, we predict electron density maps that are then post-processed into atomic models through standard crystallographic refinement processes. Introducing an initial dataset of small protein fragments taken from Protein Data Bank entries and placing them in hypothetical crystal settings, we demonstrate that our method is effective at both improving the phases of the crystallographic structure factors and completing the regions missing from partial structure templates, as well as improving the agreement of the electron density maps with the ground truth atomic structures. This work has been accepted in Acta Crystallographic section D.

bioinformatics, machine learning, prediction, (18 more...)

arXiv.org Artificial Intelligence

2511.1044

Country: North America > United States (0.29)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Rethinking Crystal Symmetry Prediction: A Decoupled Perspective

Yu, Liheng, Zhao, Zhe, Wang, Xucong, Wu, Di, Wang, Pengkun

arXiv.org Artificial IntelligenceNov-11-2025

Efficiently and accurately determining the symmetry is a crucial step in the structural analysis of crystalline materials. Existing methods usually mindlessly apply deep learning models while ignoring the underlying chemical rules. More importantly, experiments show that they face a serious sub-property confusion (SPC) problem. To address the above challenges, from a decoupled perspective, we introduce the XRDecoupler framework, a problem-solving arsenal specifically designed to tackle the SPC problem. Imitating the thinking process of chemists, we innovatively incorporate multidimensional crystal symmetry information as superclass guidance to ensure that the model's prediction process aligns with chemical intuition. We further design a hierarchical PXRD pattern learning model and a multi-objective optimization approach to achieve high-quality representation and balanced optimization. Comprehensive evaluations on three mainstream databases (e.g., CCDC, CoREMOF, and InorganicData) demonstrate that XRDecoupler excels in performance, interpretability, and generalization.

artificial intelligence, machine learning, space group, (21 more...)

arXiv.org Artificial Intelligence

2511.06976

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

OPENXRD: A Comprehensive Benchmark and Enhancement Framework for LLM/MLLM XRD Question Answering

Vosoughi, Ali, Shahnazari, Ayoub, Xi, Yufeng, Zhang, Zeliang, Hess, Griffin, Xu, Chenliang, Abdolrahim, Niaz

arXiv.org Artificial IntelligenceJul-15-2025

This work presents OPENXRD, an open-book pipeline designed for crystallography question answering, which integrates textual prompts with concise supporting content generated by GPT-4.5. Instead of using scanned textbooks, which may lead to copyright issues, OPENXRD generates compact, domain-specific references that help smaller models understand key concepts in X-ray diffraction (XRD). We evaluate OPENXRD on a well-defined set of 217 expert-level XRD questions by comparing different vision-language models, including GPT-4 and LLaVA-based frameworks such as Mistral, LLaMA, and QWEN, under both closed-book (without supporting material) and open-book (with supporting material) conditions. Our experimental results show significant accuracy improvements in models that use the GPT-4.5-generated summaries, particularly those with limited prior training in crystallography. OPENXRD uses knowledge from larger models to fill knowledge gaps in crystallography and shows that AI-generated texts can help smaller models reason more effectively in scientific tasks. While the current version of OPENXRD focuses on text-based inputs, we also explore future extensions such as adding real crystal diagrams or diffraction patterns to improve interpretation in specialized materials science contexts. Overall, OPENXRD shows that specialized open-book systems can be useful in materials science and provides a foundation for broader natural language processing (NLP) tools in critical scientific fields.

crystallography, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2507.09155

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine (0.68)
Government (0.46)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Towards End-to-End Structure Solutions from Information-Compromised Diffraction Data via Generative Deep Learning

Guo, Gabe, Goldfeder, Judah, Lan, Ling, Ray, Aniv, Yang, Albert Hanming, Chen, Boyuan, Billinge, Simon JL, Lipson, Hod

arXiv.org Artificial IntelligenceDec-22-2023

The revolution in materials in the past century was built on a knowledge of the atomic arrangements and the structure-property relationship. The sine qua non for obtaining quantitative structural information is single crystal crystallography. However, increasingly we need to solve structures in cases where the information content in our input signal is significantly degraded, for example, due to orientational averaging of grains, finite size effects due to nanostructure, and mixed signals due to sample heterogeneity. Understanding the structure property relationships in such situations is, if anything, more important and insightful, yet we do not have robust approaches for accomplishing it. In principle, machine learning (ML) and deep learning (DL) are promising approaches since they augment information in the degraded input signal with prior knowledge learned from large databases of already known structures. Here we present a novel ML approach, a variational query-based multi-branch deep neural network that has the promise to be a robust but general tool to address this problem end-to-end. We demonstrate the approach on computed powder x-ray diffraction (PXRD), along with partial chemical composition information, as input. We choose as a structural representation a modified electron density we call the Cartesian mapped electron density (CMED), that straightforwardly allows our ML model to learn material structures across different chemistries, symmetries and crystal systems. When evaluated on theoretically simulated data for the cubic and trigonal crystal systems, the system achieves up to $93.4\%$ average similarity with the ground truth on unseen materials, both with known and partially-known chemical composition information, showing great promise for successful structure solution even from degraded and incomplete input data.

diffraction pattern, information, prediction, (17 more...)

arXiv.org Artificial Intelligence

2312.15136

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Government > Regional Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Crystal-GFN: sampling crystals with desirable properties and constraints

AI4Science, Mila, Hernandez-Garcia, Alex, Duval, Alexandre, Volokhova, Alexandra, Bengio, Yoshua, Sharma, Divya, Carrier, Pierre Luc, Benabed, Yasmine, Koziarski, Michał, Schmidt, Victor

arXiv.org Artificial IntelligenceDec-13-2023

Accelerating material discovery holds the potential to greatly help mitigate the climate crisis. Discovering new solid-state materials such as electrocatalysts, super-ionic conductors or photovoltaic materials can have a crucial impact, for instance, in improving the efficiency of renewable energy production and storage. In this paper, we introduce Crystal-GFN, a generative model of crystal structures that sequentially samples structural properties of crystalline materials, namely the space group, composition and lattice parameters. This domain-inspired approach enables the flexible incorporation of physical and structural hard constraints, as well as the use of any available predictive model of a desired physicochemical property as an objective function. To design stable materials, one must target the candidates with the lowest formation energy. Here, we use as objective the formation energy per atom of a crystal structure predicted by a new proxy machine learning model trained on MatBench. The results demonstrate that Crystal-GFN is able to sample highly diverse crystals with low (median -3.1 eV/atom) predicted formation energy.

crystal-gfn, formation energy, space group, (16 more...)

arXiv.org Artificial Intelligence

2310.04925

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Energy > Renewable (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.88)

Add feedback

Weakly supervised learning for pattern classification in serial femtosecond crystallography

Xie, Jianan, Liu, Ji, Zhang, Chi, Chen, Xihui, Huai, Ping, Zheng, Jie, Zhang, Xiaofeng

arXiv.org Artificial IntelligenceSep-21-2023

Serial femtosecond crystallography at X-ray free electron laser facilities opens a new era for the determination of crystal structure. However, the data processing of those experiments is facing unprecedented challenge, because the total number of diffraction patterns needed to determinate a high-resolution structure is huge. Machine learning methods are very likely to play important roles in dealing with such a large volume of data. Convolutional neural networks have made a great success in the field of pattern classification, however, training of the networks need very large datasets with labels. Th is heavy dependence on labeled datasets will seriously restrict the application of networks, because it is very costly to annotate a large number of diffraction patterns. In this article we present our job on the classification of diffraction pattern by weakly supervised algorithms, with the aim of reducing as much as possible the size of the labeled dataset required for training. Our result shows that weakly supervised methods can significantly reduce the need for the number of labeled patterns while achieving comparable accuracy to fully supervised methods.

convolutional base, dataset, diffraction pattern, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1364/OE.492311

2309.04474

Country:

Asia > China > Shanghai > Shanghai (0.06)
North America > United States (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

PeakNet: An Autonomous Bragg Peak Finder with Deep Neural Networks

Wang, Cong, Li, Po-Nan, Thayer, Jana, Yoon, Chun Hong

arXiv.org Artificial IntelligenceJun-29-2023

Serial crystallography at X-ray free electron laser (XFEL) and synchrotron facilities has experienced tremendous progress in recent times enabling novel scientific investigations into macromolecular structures and molecular processes. However, these experiments generate a significant amount of data posing computational challenges in data reduction and real-time feedback. Bragg peak finding algorithm is used to identify useful images and also provide real-time feedback about hit-rate and resolution. Shot-to-shot intensity fluctuations and strong background scattering from buffer solution, injection nozzle and other shielding materials make this a time-consuming optimization problem. Here, we present PeakNet, an autonomous Bragg peak finder that utilizes deep neural networks. The development of this system 1) eliminates the need for manual algorithm parameter tuning, 2) reduces false-positive peaks by adjusting to shot-to-shot variations in strong background scattering in real-time, 3) eliminates the laborious task of manually creating bad pixel masks and the need to store these masks per event since these can be regenerated on demand. PeakNet also exhibits exceptional runtime efficiency, processing a 1920-by-1920 pixel image around 90 ms on an NVIDIA 1080 Ti GPU, with the potential for further enhancements through parallelized analysis or GPU stream processing. PeakNet is well-suited for expert-level real-time serial crystallography data analysis at high data rates.

artificial intelligence, machine learning, peaknet, (17 more...)

arXiv.org Artificial Intelligence

2303.15301

Country:

Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
(4 more...)

Genre:

Research Report (0.64)
Workflow (0.46)

Industry:

Energy (0.68)
Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

AlphaFold2 @ CASP14: "It feels like one's child has left home."

#artificialintelligenceDec-13-2020, 02:46:12 GMT

The past week was a momentous occasion for protein structure prediction, structural biology at large, and in due time, may prove to be so for the whole of life sciences. CASP14, the conference for the biennial competition for the prediction of protein structure from sequence, took place virtually over multiple remote working platforms. DeepMind, Google's premier AI research group, entered the competition as they did the previous time, when they upended expectations of what an industrial research lab can do. The outcome this time was very, very different however. At CASP13 DeepMind made an impressive showing with AlphaFold but was ultimately within the bounds of the usual expectations of academic progress, albeit at an accelerated rate. At CASP14 DeepMind produced an advance so thorough it compelled CASP organizers to declare the protein structure prediction problem for single protein chains to be solved. In my read of most CASP14 attendees (virtual as it was), I sense that this was ...

deepmind, prediction, protein, (15 more...)

#artificialintelligence

Genre: Personal (0.93)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.79)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

Add feedback

DeepFreak: Learning Crystallography Diffraction Patterns with Automated Machine Learning

Souza, Artur, Oliveira, Leonardo B., Hollatz, Sabine, Feldman, Matt, Olukotun, Kunle, Holton, James M., Cohen, Aina E., Nardi, Luigi

arXiv.org Machine LearningApr-26-2019

Serial crystallography is the field of science that studies the structure and properties of crystals via diffraction patterns. In this paper, we introduce a new serial crystallography dataset comprised of real and synthetic images; the synthetic images are generated through the use of a simulator that is both scalable and accurate. The resulting dataset is called DiffraNet, and it is composed of 25,457 512x512 grayscale labeled images. We explore several computer vision approaches for classification on DiffraNet such as standard feature extraction algorithms associated with Random Forests and Support Vector Machines but also an end-to-end CNN topology dubbed DeepFreak tailored to work on this new dataset. All implementations are publicly available and have been fine-tuned using off-the-shelf AutoML optimization tools for a fair comparison. Our best model achieves 98.5% accuracy on synthetic images and 94.51% accuracy on real images. We believe that the DiffraNet dataset and its classification methods will have in the long term a positive impact in accelerating discoveries in many disciplines, including chemistry, geology, biology, materials science, metallurgy, and physics.

artificial intelligence, dataset, machine learning, (18 more...)

arXiv.org Machine Learning

1904.11834

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)

Add feedback

Crystallographic Protein Model Building Using AI and Pattern Recognition

AI MagazineJan-4-2018, 12:38:47 GMT

TEXTAL is a computer program that automaticallyinterprets electron density maps to determine the atomic structures of proteins through X-ray crystallography. Electron density maps are traditionally interpreted by visually fitting atoms into density patterns. This manual process can be time-consuming and error prone, even for expert crystallographers. To automate the process, TEXTAL employs a variety of AI and pattern-recognition techniques that emulate the decision-making processes of domain experts. In this article, we discuss the various ways AI technology is used in TEXTAL, including neural networks, case-based reasoning, nearest neighbor learning and linear discriminant analysis.

artificial intelligence, health & medicine, textal, (19 more...)

AI Magazine

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback