AITopics | Miller, Thomas F. III

Miller, Thomas F. III

NeuralPLexer3: Accurate Biomolecular Complex Structure Prediction with Flow Models

Qiao, Zhuoran, Ding, Feizhi, Dresselhaus, Thomas, Rosenfeld, Mia A., Han, Xiaotian, Howell, Owen, Iyengar, Aniketh, Opalenski, Stephen, Christensen, Anders S., Sirumalla, Sai Krishna, Manby, Frederick R., Miller, Thomas F. III, Welborn, Matthew

arXiv.org Artificial Intelligence, Dec-18-2024

Structure determination is essential to a mechanistic understanding of diseases and the development of novel therapeutics. Machine-learning-based structure prediction methods have made significant advancements by computationally predicting protein and bioassembly structures from sequences and molecular topology alone. Despite substantial progress in the field, challenges remain in delivering structure prediction models to real-world drug discovery. Here, we present NeuralPLexer3, a physics-inspired flow-based generative model that achieves state-of-the-art prediction accuracy on key biomolecular interaction types and improves training and sampling efficiency compared to its predecessors and alternative methodologies. Examined through newly developed benchmarking strategies, NeuralPLexer3 excels in areas that are crucial to structure-based drug design, such as physical validity and ligand-induced conformational changes.

bioinformatics, machine learning, prediction

2412.10743

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

State-specific protein-ligand complex structure prediction with a multi-scale deep generative model

Qiao, Zhuoran, Nie, Weili, Vahdat, Arash, Miller, Thomas F. III, Anandkumar, Anima

arXiv.org Artificial Intelligence, Apr-19-2023

The binding complexes formed by proteins and small molecule ligands are ubiquitous and critical to life. Despite recent advancements in protein structure prediction, existing algorithms are so far unable to systematically predict the binding ligand structures along with their regulatory effects on protein folding. To address this discrepancy, we present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures solely using protein sequence and ligand molecular graph inputs. NeuralPLexer adopts a deep generative model to sample the 3D structures of the binding complex and their conformational changes at an atomistic resolution. The model is based on a diffusion process that incorporates essential biophysical constraints and a multi-scale geometric deep learning system to iteratively sample residue-level contact maps and all heavy-atom coordinates in a hierarchical manner. NeuralPLexer achieves state-of-the-art performance compared to all existing methods on benchmarks for both protein-ligand blind docking and flexible binding site structure recovery. Moreover, owing to its specificity in sampling both ligand-free-state and ligand-bound-state ensembles, NeuralPLexer consistently outperforms AlphaFold2 in terms of global protein structure accuracy on both representative structure pairs with large conformational changes (average TM-score=0.93) and recently determined ligand-binding proteins (average TM-score=0.89). Case studies reveal that the predicted conformational variations are consistent with structure determination experiments for important targets, including human KRAS G12C, ketol-acid reductoisomerase, and purine GPCRs. Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
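For readers unfamiliar with the TM-score values quoted in the abstract, the following minimal Python sketch evaluates the standard TM-score formula for one fixed superposition. It assumes the distances between aligned residue pairs are already known; the full metric takes the maximum over superpositions, which this sketch does not attempt. The function names and the restriction to chains longer than 21 residues are assumptions made for illustration and are not taken from the paper.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) used by the TM-score."""
    if target_length <= 21:
        # Assumption of this sketch: very short chains are not handled.
        raise ValueError("this sketch only supports chains longer than 21 residues")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length):
    """TM-score for one fixed superposition.

    distances: distances (angstroms) between aligned residue pairs.
    target_length: number of residues in the reference structure.
    """
    d0 = tm_d0(target_length)
    # Each aligned pair contributes a value in (0, 1]; unaligned residues contribute 0.
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length
```

Because the sum is divided by the reference length rather than by the number of aligned pairs, a prediction that covers only half of the reference perfectly scores 0.5, and a score near 1 requires both full coverage and small deviations.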

artificial intelligence, machine learning, prediction

2209.15171

Country: North America > United States > California > Los Angeles County (0.14)

Genre:

Research Report > Experimental Study (0.48)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.60)

Accurate Molecular-Orbital-Based Machine Learning Energies via Unsupervised Clustering of Chemical Space

Cheng, Lixue, Sun, Jiace, Miller, Thomas F. III

arXiv.org Artificial Intelligence, Apr-20-2022

We introduce an unsupervised clustering algorithm to improve training efficiency and accuracy in predicting energies using molecular-orbital-based machine learning (MOB-ML). This work determines clusters via the Gaussian mixture model (GMM) in an entirely automatic manner and simplifies an earlier supervised clustering approach [J. Chem. Theory Comput., 15, 6668 (2019)] by eliminating both the necessity for user-specified parameters and the training of an additional classifier. Unsupervised clustering results from GMM have the advantage of accurately reproducing chemically intuitive groupings of frontier molecular orbitals and having improved performance with an increasing number of training examples. The resulting clusters from supervised or unsupervised clustering are further combined with scalable Gaussian process regression (GPR) or linear regression (LR) to learn molecular energies accurately by generating a local regression model in each cluster. Among all four combinations of regressors and clustering methods, GMM combined with scalable exact Gaussian process regression (GMM/GPR) is the most efficient training protocol for MOB-ML. The numerical tests of molecular energy learning on thermalized datasets of drug-like molecules demonstrate the improved accuracy, transferability, and learning efficiency of GMM/GPR over other training protocols for MOB-ML, i.e., supervised regression-clustering combined with GPR (RC/GPR) and GPR without clustering. GMM/GPR also provides the best molecular energy predictions compared with those from the literature on the same benchmark datasets. With a lower scaling, GMM/GPR has a 10.4-fold speedup in wall-clock training time compared with scalable exact GPR with a training size of 6500 QM7b-T molecules.
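The cluster-then-regress workflow described in the abstract can be illustrated with a toy, one-dimensional Python sketch. It is written under loud assumptions: the Gaussian mixture is fitted with a hand-written EM loop on a single scalar feature, ordinary linear regression stands in for the local regressor, and neither the MOB features nor the scalable GPR of the paper are reproduced. All function names and the toy data are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, init_means, n_iter=100, min_var=1e-6):
    """Fit a 1-D Gaussian mixture by expectation-maximization (EM)."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of every component for every point.
        resp = []
        for x in xs:
            dens = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and variances from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, min_var)  # floor keeps the density finite
    return weights, means, variances


def assign_clusters(xs, weights, means, variances):
    """Hard-assign each point to its most probable mixture component."""
    return [
        max(range(len(means)), key=lambda j: weights[j] * gaussian_pdf(x, means[j], variances[j]))
        for x in xs
    ]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, mean_y - slope * mean_x


def fit_local_models(xs, ys, labels, n_clusters):
    """Fit one independent linear model per cluster (None for an empty cluster)."""
    models = []
    for k in range(n_clusters):
        cx = [x for x, l in zip(xs, labels) if l == k]
        cy = [y for y, l in zip(ys, labels) if l == k]
        models.append(fit_line(cx, cy) if cx else None)
    return models


# Toy data: two well-separated feature groups, each following its own linear law.
xs = [-0.2, -0.1, 0.0, 0.1, 0.2, 4.8, 4.9, 5.0, 5.1, 5.2]
ys = [2 * x + 1 if x < 1 else -3 * x + 5 for x in xs]
weights, means, variances = fit_gmm_1d(xs, init_means=[1.0, 4.0])
labels = assign_clusters(xs, weights, means, variances)
models = fit_local_models(xs, ys, labels, n_clusters=2)
```

Because the mixture model itself assigns new points to clusters, no separate classifier has to be trained, which is the simplification over the supervised approach that the abstract highlights.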

artificial intelligence, machine learning, molecule

doi: 10.1021/acs.jctc.2c00396

2204.09831

Country: North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Government > Regional Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Informing Geometric Deep Learning with Electronic Interactions to Accelerate Quantum Chemistry

Qiao, Zhuoran, Christensen, Anders S., Welborn, Matthew, Manby, Frederick R., Anandkumar, Anima, Miller, Thomas F. III

arXiv.org Artificial Intelligence, Apr-1-2022

Predicting electronic energies, densities, and related chemical properties can facilitate the discovery of novel catalysts, medicines, and battery materials. By developing a physics-inspired equivariant neural network, we introduce a method to learn molecular representations based on the electronic interactions among atomic orbitals. Our method, OrbNet-Equi, leverages efficient tight-binding simulations and learned mappings to recover high-fidelity quantum chemical properties. OrbNet-Equi models a wide spectrum of target properties with an accuracy consistently better than standard machine learning methods and a speed orders of magnitude greater than density functional theory. Despite only using training samples collected from readily available small-molecule libraries, OrbNet-Equi outperforms traditional methods on comprehensive downstream benchmarks that encompass diverse main-group chemical processes. Our method also describes interactions in challenging charge-transfer complexes and open-shell systems. We anticipate that the strategy presented here will help to expand opportunities for studies in chemistry and materials science, where the acquisition of experimental or reference training data is costly.

artificial intelligence, informing geometric deep learning, machine learning

doi: 10.1073/pnas.2205221119

2105.14655

Country:

North America > United States (0.14)
Europe (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Energy > Oil & Gas (1.00)
Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Molecular Energy Learning Using Alternative Blackbox Matrix-Matrix Multiplication Algorithm for Exact Gaussian Process

Sun, Jiace, Cheng, Lixue, Miller, Thomas F. III

arXiv.org Artificial Intelligence, Sep-20-2021

We present an application of the blackbox matrix-matrix multiplication (BBMM) algorithm to scale up the Gaussian Process (GP) training of molecular energies in the molecular-orbital-based machine learning (MOB-ML) framework. An alternative implementation of BBMM (AltBBMM) is also proposed to train more efficiently (over four-fold speedup) with the same accuracy and transferability as the original BBMM implementation. The training of MOB-ML was previously limited to 220 molecules, and BBMM and AltBBMM scale the training of MOB-ML up by over 30 times to 6500 molecules (more than a million pair energies). The accuracy and transferability of both algorithms are examined on the benchmark datasets of organic molecules with 7 and 13 heavy atoms. These lower-scaling implementations of the GP preserve the state-of-the-art learning efficiency in the low-data regime while extending it to the large-data regime with better accuracy than other available machine learning works on molecular energies.
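The GP training step that BBMM targets amounts to solving a linear system with the kernel matrix. The following is a minimal sketch of the underlying idea, assuming the usual conjugate-gradient formulation in which the solver touches the kernel matrix only through a product routine (the "blackbox"). It is not AltBBMM and not the authors' implementation: it handles a single right-hand side, so the product is matrix-vector, whereas batching several right-hand sides turns it into a matrix-matrix product. The kernel, noise level, data, and all function names are illustrative assumptions.

```python
import math


def rbf_kernel(xs, length_scale=1.0):
    """Dense squared-exponential kernel matrix for 1-D inputs."""
    return [[math.exp(-0.5 * ((a - b) / length_scale) ** 2) for b in xs] for a in xs]


def make_matvec(kernel, noise_var):
    """Return v -> (K + noise_var * I) v, the only access the solver gets to K."""
    def matvec(v):
        return [
            sum(k_ij * v_j for k_ij, v_j in zip(row, v)) + noise_var * v_i
            for row, v_i in zip(kernel, v)
        ]
    return matvec


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A using only matvec(v) = A v."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the initial guess x = 0
    p = list(r)  # search direction
    rs_old = sum(v * v for v in r)
    for _ in range(max_iter):
        if math.sqrt(rs_old) < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / sum(p_i * ap_i for p_i, ap_i in zip(p, Ap))
        x = [x_i + alpha * p_i for x_i, p_i in zip(x, p)]
        r = [r_i - alpha * ap_i for r_i, ap_i in zip(r, Ap)]
        rs_new = sum(v * v for v in r)
        p = [r_i + (rs_new / rs_old) * p_i for r_i, p_i in zip(r, p)]
        rs_old = rs_new
    return x


# Toy regression problem: the GP weights are alpha = (K + noise_var * I)^-1 y.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.sin(x) for x in xs]
matvec = make_matvec(rbf_kernel(xs), noise_var=0.1)
alpha = conjugate_gradient(matvec, ys)
```

The design point is that the solver never factorizes or even inspects the kernel matrix, so the product routine can be swapped for a batched or accelerator-friendly implementation without changing the solver.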

artificial intelligence, machine learning, molecule

2109.09817

Country: North America > United States > California (0.15)

Genre: Research Report (0.40)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Regression-clustering for Improved Accuracy and Training Cost with Molecular-Orbital-Based Machine Learning

Cheng, Lixue, Kovachki, Nikola B., Welborn, Matthew, Miller, Thomas F. III

arXiv.org Artificial Intelligence, Oct-23-2019

Machine learning (ML) in the representation of molecular-orbital-based (MOB) features has been shown to be an accurate and transferable approach to the prediction of post-Hartree-Fock correlation energies. Previous applications of MOB-ML employed Gaussian Process Regression (GPR), which provides good prediction accuracy with small training sets; however, the cost of GPR training scales cubically with the amount of data and becomes a computational bottleneck for large training sets. In the current work, we address this problem by introducing a clustering/regression/classification implementation of MOB-ML. In a first step, regression clustering (RC) is used to partition the training data to best fit an ensemble of linear regression (LR) models; in a second step, each cluster is regressed independently, using either LR or GPR; and in a third step, a random forest classifier (RFC) is trained for the prediction of cluster assignments based on MOB feature values. Upon inspection, RC is found to recapitulate chemically intuitive groupings of the frontier molecular orbitals, and the combined RC/LR/RFC and RC/GPR/RFC implementations of MOB-ML are found to provide good prediction accuracy with greatly reduced wall-clock training times. For a dataset of thermalized geometries of 7211 organic molecules of up to seven heavy atoms, both implementations reach chemical accuracy (1 kcal/mol error) with only 300 training molecules, while providing 35000-fold and 4500-fold reductions in the wall-clock training time, respectively, compared to MOB-ML without clustering. The resulting models are also demonstrated to retain transferability for the prediction of large-molecule energies with only small-molecule training data. Finally, it is shown that capping the number of training datapoints per cluster leads to further improvements in prediction accuracy with negligible increases in wall-clock training time.
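The first two steps described in the abstract, partitioning the data to best fit a set of linear models and then regressing each cluster independently, can be sketched in plain Python. This is a toy illustration under stated assumptions: a single scalar feature, an alternating refit/reassign loop as the regression-clustering procedure, and linear regression as the per-cluster model. The third step (training a classifier to predict cluster membership for new points) is omitted. The function names, the initial labels, and the toy data are hypothetical and not taken from the paper.

```python
def fit_line(points):
    """Ordinary least squares for y = a * x + b on a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx if sxx > 0 else 0.0
    return a, mean_y - a * mean_x


def regression_clustering(points, init_labels, n_clusters, max_iter=100):
    """Alternate per-cluster line fits with reassignment to the best-fitting line.

    Returns the final labels and one (slope, intercept) pair per cluster.
    """
    labels = list(init_labels)
    models = [(0.0, 0.0)] * n_clusters
    for _ in range(max_iter):
        # Refit each cluster's line on its current members.
        for k in range(n_clusters):
            members = [p for p, l in zip(points, labels) if l == k]
            if len(members) >= 2:
                models[k] = fit_line(members)
        # Move every point to the line with the smallest squared residual.
        new_labels = [
            min(range(n_clusters),
                key=lambda k: (y - (models[k][0] * x + models[k][1])) ** 2)
            for x, y in points
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels, models


# Toy data drawn from two different linear laws, with a deliberately imperfect start.
points = [(i / 10, 2 * (i / 10) + 1) for i in range(7)]
points += [(i / 10, -3 * (i / 10) + 5) for i in range(7)]
initial = [0] * 6 + [1] + [0] + [1] * 6
labels, models = regression_clustering(points, initial, n_clusters=2)
```

Note that points are grouped by which model explains them best, not by proximity in feature space, which is why a separate classifier is needed to place unseen points into clusters. Like other alternating schemes, the loop can settle in a local optimum that depends on the initial labels.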

artificial intelligence, machine learning, molecule

doi: 10.1021/acs.jctc.9b00884

1909.02041

Country: North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry: Government > Military (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)
