AITopics | reference structure

Collaborating Authors

reference structure

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

All that structure matches does not glitter

Neural Information Processing SystemsJun-16-2026, 09:57:45 GMT

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine (0.93)
Materials > Chemicals (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

Unlocking hidden biomolecular conformational landscapes in diffusion models at inference time

Richman, Daniel D., Karaguesian, Jessica, Suomivuori, Carl-Mikael, Dror, Ron O.

arXiv.org Artificial IntelligenceDec-4-2025

The function of biomolecules such as proteins depends on their ability to interconvert between a wide range of structures or "conformations." Researchers have endeavored for decades to develop computational methods to predict the distribution of conformations, which is far harder to determine experimentally than a static folded structure. We present ConforMix, an inference-time algorithm that enhances sampling of conformational distributions using a combination of classifier guidance, filtering, and free energy estimation. Our approach upgrades diffusion models -- whether trained for static structure prediction or conformational generation -- to enable more efficient discovery of conformational variability without requiring prior knowledge of major degrees of freedom. ConforMix is orthogonal to improvements in model pretraining and would benefit even a hypothetical model that perfectly reproduced the Boltzmann distribution. Remarkably, when applied to a diffusion model trained for static structure prediction, ConforMix captures structural changes including domain motion, cryptic pocket flexibility, and transporter cycling, while avoiding unphysical states. Case studies of biologically critical proteins demonstrate the scalability, accuracy, and utility of this method.

artificial intelligence, machine learning, protein, (16 more...)

arXiv.org Artificial Intelligence

2512.03312

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

All that structure matches does not glitter

Martirossyan, Maya M., Egg, Thomas, Hoellmer, Philipp, Karypis, George, Transtrum, Mark, Roitberg, Adrian, Liu, Mingjie, Hennig, Richard G., Tadmor, Ellad B., Martiniani, Stefano

arXiv.org Artificial IntelligenceDec-4-2025

Generative models for materials, especially inorganic crystals, hold potential to transform the theoretical prediction of novel compounds and structures. Advancement in this field depends on robust benchmarks and minimal, information-rich datasets that enable meaningful model evaluation. This paper critically examines common datasets and reported metrics for a crystal structure prediction task$\unicode{x2014}$generating the most likely structures given the chemical composition of a material. We focus on three key issues: First, materials datasets should contain unique crystal structures; for example, we show that the widely-utilized carbon-24 dataset only contains $\approx$40% unique structures. Second, materials datasets should not be split randomly if polymorphs of many different compositions are numerous, which we find to be the case for the perov-5 and MP-20 datasets. Third, benchmarks can mislead if used uncritically, e.g., reporting a match rate metric without considering the structural variety exhibited by identical building blocks. To address these oft-overlooked issues, we introduce several fixes. We provide revised versions of the carbon-24 dataset: one with duplicates removed, one deduplicated and split by number of atoms $N$, one with enantiomorphs, and two containing only identical structures but with different unit cells. We also propose new splits for datasets with polymorphs, ensuring that polymorphs are grouped within each split subset, setting a more sensible standard for benchmarking model performance. Finally, we present METRe and cRMSE, new model evaluation metrics that can correct existing issues with the match rate metric.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2509.12178

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.67)
Materials > Chemicals (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

XDXD: End-to-end crystal structure determination with low resolution X-ray diffraction

Zhao, Jiale, Liu, Cong, Zhang, Yuxuan, Gong, Chengyue, Zhang, Zhenyi, Jin, Shifeng, Liu, Zhenyu

arXiv.org Artificial IntelligenceOct-22-2025

Determining crystal structures from X-ray diffraction data is fundamental across diverse scientific fields, yet remains a significant challenge when data is limited to low resolution. While recent deep learning models have made breakthroughs in solving the crystallographic phase problem, the resulting low-resolution electron density maps are often ambiguous and difficult to interpret. To overcome this critical bottleneck, we introduce XDXD, to our knowledge, the first end-to-end deep learning framework to determine a complete atomic model directly from low-resolution single-crystal X-ray diffraction data. Our diffusion-based generative model bypasses the need for manual map interpretation, producing chemically plausible crystal structures conditioned on the diffraction pattern. We demonstrate that XDXD achieves a 70.4\% match rate for structures with data limited to 2.0~Å resolution, with a root-mean-square error (RMSE) below 0.05. Evaluated on a benchmark of 24,000 experimental structures, our model proves to be robust and accurate. Furthermore, a case study on small peptides highlights the model's potential for extension to more complex systems, paving the way for automated structure solution in previously intractable cases.

artificial intelligence, machine learning, structure determination, (17 more...)

arXiv.org Artificial Intelligence

2510.17936

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

EHVC: Efficient Hierarchical Reference and Quality Structure for Neural Video Coding

Liao, Junqi, Wu, Yaojun, Lin, Chaoyi, Deng, Zhipin, Li, Li, Liu, Dong, Sun, Xiaoyan

arXiv.org Artificial IntelligenceSep-5-2025

Neural video codecs (NVCs), leveraging the power of end-to-end learning, have demonstrated remarkable coding efficiency improvements over traditional video codecs. Recent research has begun to pay attention to the quality structures in NVCs, optimizing them by introducing explicit hierarchical designs. However, less attention has been paid to the reference structure design, which fundamentally should be aligned with the hierarchical quality structure. In addition, there is still significant room for further optimization of the hierarchical quality structure. To address these challenges in NVCs, we propose EHVC, an efficient hierarchical neural video codec featuring three key innovations: (1) a hierarchical multi-reference scheme that draws on traditional video codec design to align reference and quality structures, thereby addressing the reference-quality mismatch; (2) a lookahead strategy to utilize an encoder-side context from future frames to enhance the quality structure; (3) a layer-wise quality scale with random quality training strategy to stabilize quality structures during inference. With these improvements, EHVC achieves significantly superior performance to the state-of-the-art NVCs. Code will be released in: https://github.com/bytedance/NEVC.

artificial intelligence, quality structure, reference structure, (16 more...)

arXiv.org Artificial Intelligence

2509.04118

Country: Asia > China (0.48)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence (0.47)

Add feedback

Neural Graph Matching Improves Retrieval Augmented Generation in Molecular Machine Learning

Wang, Runzhong, Wang, Rui-Xi, Manjrekar, Mrunali, Coley, Connor W.

arXiv.org Artificial IntelligenceFeb-25-2025

Molecular machine learning has gained popularity with the advancements of geometric deep learning. In parallel, retrieval-augmented generation has become a principled approach commonly used with language models. However, the optimal integration of retrieval augmentation into molecular machine learning remains unclear. Graph neural networks stand to benefit from clever matching to understand the structural alignment of retrieved molecules to a query molecule. Neural graph matching offers a compelling solution by explicitly modeling node and edge affinities between two structural graphs while employing a noise-robust, end-to-end neural network to learn affinity metrics. We apply this approach to mass spectrum simulation and introduce MARASON, a novel model that incorporates neural graph matching to enhance a fragmentation-based neural network. Experimental results highlight the effectiveness of our design, with MARASON achieving 28% top-1 accuracy, a substantial improvement over the non-retrieval state-of-the-art accuracy of 19%. Moreover, MARASON outperforms both naive retrieval-augmented generation methods and traditional graph matching approaches.

fragment, graph, retrieval augmented generation, (16 more...)

arXiv.org Artificial Intelligence

2502.17874

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Singapore (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

LAGUNA: LAnguage Guided UNsupervised Adaptation with structured spaces

Diko, Anxhelo, Furnari, Antonino, Cinque, Luigi, Farinella, Giovanni Maria

arXiv.org Artificial IntelligenceNov-27-2024

Unsupervised domain adaptation remains a critical challenge in enabling the knowledge transfer of models across unseen domains. Existing methods struggle to balance the need for domain-invariant representations with preserving domain-specific features, which is often due to alignment approaches that impose the projection of samples with similar semantics close in the latent space despite their drastic domain differences. We introduce LAGUNA - LAnguage Guided UNsupervised Adaptation with structured spaces, a novel approach that shifts the focus from aligning representations in absolute coordinates to aligning the relative positioning of equivalent concepts in latent spaces. LAGUNA defines a domain-agnostic structure upon the semantic/geometric relationships between class labels in language space and guides adaptation, ensuring that the organization of samples in visual space reflects reference inter-class relationships while preserving domain-specific characteristics. We empirically demonstrate LAGUNA's superiority in domain adaptation tasks across four diverse images and video datasets. Remarkably, LAGUNA surpasses previous works in 18 different adaptation scenarios across four diverse image and video datasets with average accuracy improvements of +3.32% on DomainNet, +5.75% in GeoPlaces, +4.77% on GeoImnet, and +1.94% mean class accuracy improvement on EgoExo4D.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2411.15557

Country:

North America > United States (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

Improving Inverse Folding for Peptide Design with Diversity-regularized Direct Preference Optimization

Park, Ryan, Hsu, Darren J., Roland, C. Brian, Korshunova, Maria, Tessler, Chen, Mannor, Shie, Viessmann, Olivia, Trentini, Bruno

arXiv.org Artificial IntelligenceOct-25-2024

Inverse folding models play an important role in structure-based design by predicting amino acid sequences that fold into desired reference structures. Models like ProteinMPNN, a message-passing encoder-decoder model, are trained to reliably produce new sequences from a reference structure. However, when applied to peptides, these models are prone to generating repetitive sequences that do not fold into the reference structure. To address this, we fine-tune ProteinMPNN to produce diverse and structurally consistent peptide sequences via Direct Preference Optimization (DPO). We derive two enhancements to DPO: online diversity regularization and domain-specific priors. Additionally, we develop a new understanding on improving diversity in decoder models. When conditioned on Open-Fold generated structures, our fine-tuned models achieve state-of-the-art structural similarity scores, improving base ProteinMPNN by at least 8%. Compared to standard DPO, our regularized method achieves up to 20% higher sequence diversity with no loss in structural similarity score. Engineering biopolymers that fold into desired 3D structures, a computational challenge known as inverse protein folding problem, has broad applications in drug discovery and material science (Yang et al., 2023; Dill et al., 2008; Abascal & Regan, 2018). Several approaches for inverse folding have been adopted over the past decades, from molecular dynamics simulations to machine learning approaches (Dauparas et al., 2022b; Shanker et al., 2023; Hsu et al., 2022a; Yi et al., 2023; Correa, 1990).

artificial intelligence, machine learning, sequence, (17 more...)

arXiv.org Artificial Intelligence

2410.19471

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

AssemblyComplete: 3D Combinatorial Construction with Deep Reinforcement Learning

Chen, Alan, Liu, Changliu

arXiv.org Artificial IntelligenceOct-20-2024

A critical goal in robotics and autonomy is to teach robots to adapt to real-world collaborative tasks, particularly in automatic assembly. The ability of a robot to understand the original intent of an incomplete assembly and complete missing features without human instruction is valuable but challenging. This paper introduces 3D combinatorial assembly completion, which is demonstrated using combinatorial unit primitives (i.e., Lego bricks). Combinatorial assembly is challenging due to the possible assembly combinations and complex physical constraints (e.g., no brick collisions, structure stability, inventory constraints, etc.). To address these challenges, we propose a two-part deep reinforcement learning (DRL) framework that tackles teaching the robot to understand the objective of an incomplete assembly and learning a construction policy to complete the assembly. The robot queries a stable object library to facilitate assembly inference and guide learning. In addition to the robot policy, an action mask is developed to rule out invalid actions that violate physical constraints for object-oriented construction. We demonstrate the proposed framework's feasibility and robustness in a variety of assembly scenarios in which the robot satisfies real-life assembly with respect to both solution and runtime quality. Furthermore, results demonstrate that the proposed framework effectively infers and assembles incomplete structures for unseen and unique object types.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

arXiv.org Artificial Intelligence

2410.15469

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

On-the-fly machine learning for parametrization of the effective Hamiltonian

Ma, Xingyue, Bellaiche, L., Wu, Di, Yang, Yurong

arXiv.org Artificial IntelligenceJul-17-2023

The first-principles-based effective Hamiltonian is widely used to predict and simulate the properties of ferroelectrics and relaxor ferroelectrics. However, the parametrization method of the effective Hamiltonian is complicated and hardly can resolve the systems with complex interactions and/or complex components. Here, we developed an on-the-fly machine learning approach to parametrize the effective Hamiltonian based on Bayesian linear regression. The parametrization is completed in molecular dynamics simulations, with the energy, forces and stress predicted at each step along with their uncertainties. First-principles calculations are executed when the uncertainties are large to retrain the parameters. This approach provides a universal and automatic way to compute the effective Hamiltonian parameters for any considered systems including complex systems which previous methods can not handle. BaTiO3 and Pb(Sc,Ta)O3 are taken as examples to show the accurateness of this approach comparing with conventional first-principles parametrization method.

artificial intelligence, calculation, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2307.08929

Country:

North America > United States > Arkansas > Washington County > Fayetteville (0.14)
Asia > China > Jiangsu Province > Nanjing (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Add feedback