AITopics | Greenside, Peyton

Collaborating Authors

Greenside, Peyton

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences

Amin, Alan Nawzad, Gruver, Nate, Kuang, Yilun, Li, Lily, Elliott, Hunter, McCarter, Calvin, Raghu, Aniruddh, Greenside, Peyton, Wilson, Andrew Gordon

arXiv.org Machine LearningDec-10-2024

To build effective therapeutics, biologists iteratively mutate antibody sequences to improve binding and stability. Proposed mutations can be informed by previous measurements or by learning from large antibody databases to predict only typical antibodies. Unfortunately, the space of typical antibodies is enormous to search, and experiments often fail to find suitable antibodies on a budget. We introduce Clone-informed Bayesian Optimization (CloneBO), a Bayesian optimization procedure that efficiently optimizes antibodies in the lab by teaching a generative model how our immune system optimizes antibodies. Our immune system makes antibodies by iteratively evolving specific portions of their sequences to bind their target strongly and stably, resulting in a set of related, evolving sequences known as a clonal family. We train a large language model, CloneLM, on hundreds of thousands of clonal families and use it to design sequences with mutations that are most likely to optimize an antibody within the human immune system. We propose to guide our designs to fit previous measurements with a twisted sequential Monte Carlo procedure. We show that CloneBO optimizes antibodies substantially more efficiently than previous methods in realistic in silico experiments and designs stronger and more stable binders in in vitro wet lab experiments.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2412.07763

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.61)

Add feedback

Generative Humanization for Therapeutic Antibodies

Gordon, Cade, Raghu, Aniruddh, Greenside, Peyton, Elliott, Hunter

arXiv.org Artificial IntelligenceDec-8-2024

Antibody therapies have been employed to address some of today's most challenging diseases, but must meet many criteria during drug development before reaching a patient. Humanization is a sequence optimization strategy that addresses one critical risk called immunogenicity -- a patient's immune response to the drug -- by making an antibody more'human-like' in the absence of a predictive lab-based test for immunogenicity. However, existing humanization strategies generally yield very few humanized candidates, which may have degraded biophysical properties or decreased drug efficacy. Here, we re-frame humanization as a conditional generative modeling task, where humanizing mutations are sampled from a language model trained on human antibody data. We describe a sampling process that incorporates models of therapeutic attributes, such as antigen binding affinity, to obtain candidate sequences that have both reduced immunogenicity risk and maintained or improved therapeutic properties, allowing this algorithm to be readily embedded into an iterative antibody optimization campaign. We demonstrate in silico and in lab validation that in real therapeutic programs our generative humanization method produces diverse sets of antibodies that are both (1) highly-human and (2) have favorable therapeutic properties, such as improved binding to target antigens. Antibodies are the fastest growing drug class, with approved molecules treating a breadth of disorders ranging from cancer to autoimmune disease to infectious disease (Carter & Lazar, 2018). Many candidate therapeutic antibodies are derived from non-human e.g., murine or camelid sources, and modern antibody formats such as multi-specifics or antibody-drug conjugates can require heavy sequence engineering after discovery. This increases the risk of immunogenicity, where Anti-Drug Antibodies (ADAs) result in either fast clearance of the drug or adverse events (Hwang & Foote, 2005). While antibody sequence humanness is only roughly correlated with immunogenicity, humanization is widely employed to decrease immunogenicity risk (Prihoda et al., 2022).

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2412.04737

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

Inverse Protein Folding Using Deep Bayesian Optimization

Maus, Natalie, Zeng, Yimeng, Anderson, Daniel Allen, Maffettone, Phillip, Solomon, Aaron, Greenside, Peyton, Bastani, Osbert, Gardner, Jacob R.

arXiv.org Artificial IntelligenceMay-24-2023

Inverse protein folding -- the task of predicting a protein sequence from its backbone atom coordinates -- has surfaced as an important problem in the "top down", de novo design of proteins. Contemporary approaches have cast this problem as a conditional generative modelling problem, where a large generative model over protein sequences is conditioned on the backbone. While these generative models very rapidly produce promising sequences, independent draws from generative models may fail to produce sequences that reliably fold to the correct backbone. Furthermore, it is challenging to adapt pure generative approaches to other settings, e.g., when constraints exist. In this paper, we cast the problem of improving generated inverse folds as an optimization problem that we solve using recent advances in "deep" or "latent space" Bayesian optimization. Our approach consistently produces protein sequences with greatly reduced structural error to the target backbone structure as measured by TM score and RMSD while using fewer computational resources. Additionally, we demonstrate other advantages of an optimization-based approach to the problem, such as the ability to handle constraints.

artificial intelligence, machine learning, sequence, (15 more...)

arXiv.org Artificial Intelligence

2305.18089

Country:

North America > United States (0.14)
Europe (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Accelerating Bayesian Optimization for Biological Sequence Design with Denoising Autoencoders

Stanton, Samuel, Maddox, Wesley, Gruver, Nate, Maffettone, Phillip, Delaney, Emily, Greenside, Peyton, Wilson, Andrew Gordon

arXiv.org Machine LearningJul-12-2022

Bayesian optimization (BayesOpt) is a gold standard for query-efficient continuous optimization. However, its adoption for drug design has been hindered by the discrete, high-dimensional nature of the decision variables. We develop a new approach (LaMBO) which jointly trains a denoising autoencoder with a discriminative multi-task Gaussian process head, allowing gradient-based optimization of multi-objective acquisition functions in the latent space of the autoencoder. These acquisition functions allow LaMBO to balance the explore-exploit tradeoff over multiple design rounds, and to balance objective tradeoffs by optimizing sequences at many different points on the Pareto frontier. We evaluate LaMBO on two small-molecule design tasks, and introduce new tasks optimizing \emph{in silico} and \emph{in vitro} properties of large-molecule fluorescent proteins. In our experiments LaMBO outperforms genetic optimizers and does not require a large pretraining corpus, demonstrating that BayesOpt is practical and effective for biological sequence design.

artificial intelligence, machine learning, sequence, (14 more...)

arXiv.org Machine Learning

2203.12742

Country: North America > United States (0.93)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

A Hierarchical Approach to Scaling Batch Active Search Over Structured Data

Myers, Vivek, Greenside, Peyton

arXiv.org Machine LearningJul-20-2020

Active search is the process of identifying high-value data points in a large and often high-dimensional parameter space that can be expensive to evaluate. Traditional active search techniques like Bayesian optimization trade off exploration and exploitation over consecutive evaluations, and have historically focused on single or small (<5) numbers of examples evaluated per round. As modern data sets grow, so does the need to scale active search to large data sets and batch sizes. In this paper, we present a general hierarchical framework based on bandit algorithms to scale active search to large batch sizes by maximizing information derived from the unique structure of each dataset. Our hierarchical framework, Hierarchical Batch Bandit Search (HBBS), strategically distributes batch selection across a learned embedding space by facilitating wide exploration of different structural elements within a dataset. We focus our application of HBBS on modern biology, where large batch experimentation is often fundamental to the research process, and demonstrate batch design of biological sequences (protein and DNA). We also present a new Gym environment to easily simulate diverse biological sequences and to enable more comprehensive evaluation of active search methods across heterogeneous data sets. The HBBS framework improves upon standard performance, wall-clock, and scalability benchmarks for batch search by using a broad exploration strategy across coarse partitions and fine-grained exploitation within each partition of structured data.

deep learning, sequence, upstream oil & gas, (18 more...)

arXiv.org Machine Learning

2007.10263

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.71)
Energy > Oil & Gas > Upstream (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)

Add feedback