Grand canonical generative diffusion model for crystalline phases and grain boundaries
Lei, Bo, Chen, Enze, Kwon, Hyuna, Hsu, Tim, Sadigh, Babak, Lordi, Vincenzo, Frolov, Timofey, Zhou, Fei
The diffusion model has emerged as a powerful tool for generating atomic structures in materials science. This work calls attention to a deficiency of current particle-based diffusion models, which represent atoms as a point cloud, in generating even the simplest ordered crystalline structures. The problem is attributed to particles becoming trapped in local minima during the score-driven simulated annealing of the diffusion process, much as in physical force-driven simulated annealing. We develop a solution, the grand canonical diffusion model, which adopts an alternative voxel-based representation with a continuous rather than a fixed number of particles. The method is applied to the generation of several common crystalline phases as well as to the technologically important and challenging problem of grain boundary structures.
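The abstract's contrast between a point cloud and a voxel-based density can be illustrated generically. The sketch below is my own toy construction, not the paper's implementation (all names are hypothetical): point particles are smeared into a Gaussian density on a grid, so the particle count becomes the integral of the density and is therefore continuous, i.e. "grand canonical".

```python
import numpy as np

def voxelize(positions, box=1.0, n=16, sigma=0.05):
    """Smear point particles onto an n^3 density grid with Gaussians.

    The integral of the density equals the particle count, so this
    representation supports a continuous number of particles.
    Toy sketch, not the paper's implementation.
    """
    axis = (np.arange(n) + 0.5) * box / n          # voxel centers
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    grid = np.zeros((n, n, n))
    norm = (2 * np.pi * sigma**2) ** -1.5          # 3D Gaussian normalization
    for p in positions:
        r2 = (gx - p[0])**2 + (gy - p[1])**2 + (gz - p[2])**2
        grid += norm * np.exp(-r2 / (2 * sigma**2))
    return grid

atoms = np.array([[0.25, 0.25, 0.25], [0.75, 0.75, 0.75]])
density = voxelize(atoms)
voxel_volume = (1.0 / 16) ** 3
print(density.sum() * voxel_volume)  # ~2.0, the (continuous) particle count
```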
Learning from Protein Structure with Geometric Vector Perceptrons
Jing, Bowen, Eismann, Stephan, Suriana, Patricia, Townshend, Raphael J. L., Dror, Ron
Learning on 3D structures of large biomolecules is emerging as a distinct area in machine learning, but there has yet to emerge a unifying network architecture that simultaneously leverages the graph-structured and geometric aspects of the problem domain. To address this gap, we introduce geometric vector perceptrons, which extend standard dense layers to operate on collections of Euclidean vectors. Graph neural networks equipped with such layers are able to perform both geometric and relational reasoning on efficient and natural representations of macromolecular structure. We demonstrate our approach on two important problems in learning from protein structure: model quality assessment and computational protein design. Our approach improves over existing classes of architectures, including state-of-the-art graph-based and voxel-based methods.
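The geometric vector perceptron has a published design: scalar channels see only rotation-invariant norms of the vector features, and vector channels are mixed only by matrix multiplication across channels, which preserves equivariance. A simplified NumPy sketch loosely following that design (dimensions and nonlinearities are my choices), with a check that the vector output rotates with the input:

```python
import numpy as np

rng = np.random.default_rng(0)

def gvp(s, V, Wh, Wmu, Wm, b):
    """Simplified geometric vector perceptron (after Jing et al.).

    s : (ns,) scalar features; V : (nv, 3) vector features.
    Scalars see only rotation-invariant norms, and vectors are mixed
    only across channels, so the vector output is equivariant.
    """
    Vh = Wh @ V                               # (h, 3) mix vector channels
    norms = np.linalg.norm(Vh, axis=1)        # (h,) invariant summaries
    s_out = np.tanh(Wm @ np.concatenate([s, norms]) + b)
    Vmu = Wmu @ Vh                            # (mv, 3)
    gate = 1 / (1 + np.exp(-np.linalg.norm(Vmu, axis=1)))
    V_out = gate[:, None] * Vmu               # norm-gated, still equivariant
    return s_out, V_out

ns, nv, h, mv, ms = 4, 3, 5, 3, 4
Wh, Wmu = rng.normal(size=(h, nv)), rng.normal(size=(mv, h))
Wm, b = rng.normal(size=(ms, ns + h)), rng.normal(size=ms)

s, V = rng.normal(size=ns), rng.normal(size=(nv, 3))
th = 0.7                                      # rotation about the z axis
R = np.array([[np.cos(th), -np.sin(th), 0],
              [np.sin(th),  np.cos(th), 0],
              [0, 0, 1]])
s1, V1 = gvp(s, V @ R.T, Wh, Wmu, Wm, b)      # rotate, then apply layer
s2, V2 = gvp(s, V, Wh, Wmu, Wm, b)            # apply layer, then rotate
print(np.allclose(s1, s2), np.allclose(V1, V2 @ R.T))  # True True
```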
Geometric Prediction: Moving Beyond Scalars
Townshend, Raphael J. L., Townshend, Brent, Eismann, Stephan, Dror, Ron O.
Many quantities we are interested in predicting are geometric tensors; we refer to this class of problems as geometric prediction. Attempts to perform geometric prediction in real-world scenarios have been limited to approximating them through scalar predictions, leading to losses in data efficiency. In this work, we demonstrate that equivariant networks have the capability to predict real-world geometric tensors without the need for such approximations. We show the applicability of this method to the prediction of force fields and then propose a novel formulation of an important task, biomolecular structure refinement, as a geometric prediction problem, improving state-of-the-art structural candidates. In both settings, we find that our equivariant network is able to generalize to unseen systems, despite having been trained on small sets of examples. This novel and data-efficient ability to predict real-world geometric tensors opens the door to addressing many problems through the lens of geometric prediction, in areas such as 3D vision, robotics, and molecular and structural biology.
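The claim that forces are geometric tensors rather than scalars can be made concrete with a toy equivariant model. The sketch below uses a hypothetical pairwise force law, not the paper's network: force vectors are built from difference vectors scaled by distance-dependent (hence rotation-invariant) magnitudes, so predictions satisfy f(Rx) = R f(x) by construction.

```python
import numpy as np

def forces(x):
    """Toy equivariant force model: each particle feels pairwise radial
    forces whose magnitude depends only on interparticle distance.
    Since the output combines difference vectors with invariant
    magnitudes, rotating the input rotates the predicted forces."""
    f = np.zeros_like(x)
    for i in range(len(x)):
        for j in range(len(x)):
            if i != j:
                d = x[j] - x[i]
                r = np.linalg.norm(d)
                f[i] += np.exp(-r) * d / r   # invariant magnitude, equivariant direction
    return f

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 3))
th = 1.1
R = np.array([[np.cos(th), -np.sin(th), 0],
              [np.sin(th),  np.cos(th), 0],
              [0, 0, 1]])
print(np.allclose(forces(x @ R.T), forces(x) @ R.T))  # True
```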
Fully Automated Computational NMR Interpretation – Straight From Spectrometer to Structure
Every organic chemist has had to solve problems of structure elucidation, such as determining the structure of a biologically active natural product or understanding the products of a reaction. These problems are often difficult and can become a bottleneck in chemical discovery; structural misassignment wastes time and resources. In the last two decades, computational tools have become increasingly useful in tackling these problems, with the DP4 probability developed by the Goodman Lab being a key contribution to this toolkit (https://doi.org/10.1021/ja105035r). By comparing experimental NMR spectra with those computed for candidate structures, DP4 quantifies confidence in a structural assignment, enabling chemists to use their resources more effectively.
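The DP4 idea of turning spectrum-versus-computation agreement into a probability over candidates can be sketched under simplifying assumptions. The real DP4 uses a Student-t error model with fitted parameters; the toy version below assumes Gaussian errors with an arbitrary width, and all shift values are hypothetical:

```python
import math

def dp4_like(exp_shifts, candidates, sigma=2.0):
    """Rank candidate structures by agreement between experimental and
    computed NMR shifts. Simplified: a Gaussian error model stands in
    for the Student-t distribution used by the actual DP4 method;
    sigma is an assumed error scale in ppm. Returns probabilities
    normalized over the candidate set."""
    likes = []
    for calc in candidates:
        logp = sum(-((c - e) ** 2) / (2 * sigma ** 2)
                   for c, e in zip(calc, exp_shifts))
        likes.append(math.exp(logp))
    total = sum(likes)
    return [p / total for p in likes]

experimental = [128.4, 77.1, 35.6, 21.0]   # hypothetical 13C shifts (ppm)
candidate_a = [128.9, 76.5, 36.0, 21.4]    # close match
candidate_b = [131.2, 80.3, 31.9, 25.5]    # poor match
probs = dp4_like(experimental, [candidate_a, candidate_b])
print(probs)  # candidate_a receives nearly all of the probability
```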
Chemical Structure Elucidation from Mass Spectrometry by Matching Substructures
Lim, Jing, Wong, Joshua, Wong, Minn Xuan, Tan, Lee Han Eric, Chieu, Hai Leong, Choo, Davin, Neo, Neng Kai Nigel
Chemical structure elucidation is a serious bottleneck in analytical chemistry today. We address the problem of identifying an unknown chemical threat given its mass spectrum and chemical formula, a task that might take well-trained chemists several days to complete. Given a chemical formula, there can be over a million possible candidate structures. We take a data-driven approach to ranking these structures: neural networks predict the presence of substructures from the mass spectrum, and these substructures are matched against the candidate structures. Empirically, we evaluate our approach on a data set of chemical agents built for unknown chemical threat identification. We show that our substructure classifiers attain over 90% micro-F1 score, and that the correct structure appears among the top 20 candidates in 88% and 71% of test cases for two compound classes.
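The ranking step described above, matching predicted substructures against candidates, can be sketched generically. The scoring function below (a Bernoulli log-likelihood over hypothetical substructure fingerprints) is my own stand-in, not necessarily the paper's:

```python
import math

def rank_candidates(predicted_probs, candidates):
    """Rank candidate structures by how well each one's substructure
    fingerprint agrees with the classifier's predicted substructure
    probabilities. Generic Bernoulli log-likelihood score; the paper's
    exact matching function may differ."""
    eps = 1e-6
    scores = {}
    for name, fingerprint in candidates.items():
        s = 0.0
        for p, present in zip(predicted_probs, fingerprint):
            p = min(max(p, eps), 1 - eps)     # clamp away from 0 and 1
            s += math.log(p) if present else math.log(1 - p)
        scores[name] = s
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical: classifier output for 4 substructures, two candidates.
pred = [0.95, 0.10, 0.80, 0.05]
cands = {"candidate_A": [1, 0, 1, 0],   # matches the predictions
         "candidate_B": [0, 1, 1, 1]}   # contradicts most of them
print(rank_candidates(pred, cands))  # candidate_A ranked first
```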
Learning the Dimensionality of Hidden Variables
A serious problem in learning probabilistic models is the presence of hidden variables. These variables are not observed, yet interact with several of the observed variables. Detecting hidden variables poses two problems: determining the relations to other variables in the model and determining the number of states of the hidden variable. In this paper, we address the latter problem in the context of Bayesian networks. We describe an approach that utilizes score-based agglomerative state-clustering. As we show, this approach allows us to efficiently evaluate models with a range of cardinalities for the hidden variable. We then show how to extend the procedure to deal with multiple interacting hidden variables. We demonstrate the effectiveness of this approach on synthetic and real-life data, showing that it learns models with hidden variables that generalize better and have better structure than previous approaches.
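Score-based agglomerative state-clustering can be sketched on a toy contingency table: start with one state per nominal value and greedily merge the pair of states whose merge most improves a BIC-style score, stopping when no merge helps. This is a generic illustration of the idea, not the paper's exact algorithm; the counts are invented.

```python
import math

def bic_score(table, n_obs):
    """Multinomial log-likelihood of a (state x child-value) count table
    minus a BIC penalty on the number of free parameters."""
    ll = 0.0
    for row in table:
        total = sum(row)
        for c in row:
            if c:
                ll += c * math.log(c / total)
    params = len(table) * (len(table[0]) - 1)
    return ll - 0.5 * params * math.log(n_obs)

def merge_states(table):
    """Greedily merge the pair of states that most improves the BIC
    score; fewer states mean fewer parameters, so near-identical rows
    collapse. Toy sketch of score-based agglomerative state-clustering."""
    n_obs = sum(map(sum, table))
    while len(table) > 1:
        best, best_pair = bic_score(table, n_obs), None
        for i in range(len(table)):
            for j in range(i + 1, len(table)):
                merged = [r for k, r in enumerate(table) if k not in (i, j)]
                merged.append([a + b for a, b in zip(table[i], table[j])])
                s = bic_score(merged, n_obs)
                if s > best:
                    best, best_pair = s, (i, j)
        if best_pair is None:
            break                                  # no merge improves the score
        i, j = best_pair
        row = [a + b for a, b in zip(table[i], table[j])]
        table = [r for k, r in enumerate(table) if k not in (i, j)] + [row]
    return table

# Hypothetical: 4 nominal states whose child distributions suggest only 2.
counts = [[40, 10], [38, 12], [5, 45], [7, 43]]
print(len(merge_states(counts)))  # 2: the redundant states collapse
```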
Discovering Hidden Variables: A Structure-Based Approach
Elidan, Gal, Lotner, Noam, Friedman, Nir, Koller, Daphne
A serious problem in learning probabilistic models is the presence of hidden variables. These variables are not observed, yet interact with several of the observed variables. As such, they induce seemingly complex dependencies among the latter. In recent years, much attention has been devoted to the development of algorithms for learning parameters, and in some cases structure, in the presence of hidden variables. In this paper, we address the related problem of detecting hidden variables that interact with the observed variables. This problem is of interest both for improving our understanding of the domain and as a preliminary step that guides the learning procedure towards promising models.
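The structural intuition, that a hidden common cause leaves a densely connected footprint among its observed children, can be illustrated with a brute-force search for "semi-cliques": node subsets in which each node is adjacent to at least half of the others. This is a toy sketch of that intuition, not the paper's search procedure, and the graph is invented.

```python
from itertools import combinations

def semi_cliques(edges, nodes, size=4):
    """Find node subsets of the given size where every node is adjacent
    to at least half of the other nodes in the subset. In a
    structure-based search, such dense patches are candidate footprints
    of a hidden variable, since a hidden common cause induces many
    dependencies among its observed children. Brute-force toy sketch."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    found = []
    for subset in combinations(nodes, size):
        s = set(subset)
        if all(len(adj[n] & (s - {n})) >= (size - 1) / 2 for n in subset):
            found.append(subset)
    return found

# Hypothetical: X1..X4 densely interconnected (as if sharing a hidden
# cause); X5 attached only loosely.
nodes = ["X1", "X2", "X3", "X4", "X5"]
edges = [("X1", "X2"), ("X1", "X3"), ("X1", "X4"),
         ("X2", "X3"), ("X2", "X4"), ("X3", "X4"), ("X4", "X5")]
print(semi_cliques(edges, nodes))  # [('X1', 'X2', 'X3', 'X4')]
```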