Goto

Collaborating Authors

 copolymer


Inverse Design of Copolymers Including Stoichiometry and Chain Architecture

arXiv.org Artificial Intelligence

The demand for innovative synthetic polymers with improved properties is high, but their structural complexity and vast design space hinder rapid discovery. Machine learning-guided molecular design is a promising approach to accelerate polymer discovery. However, the scarcity of labeled polymer data and the complex hierarchical structure of synthetic polymers make generative design particularly challenging. We advance the current state-of-the-art approaches to generate not only repeating units, but monomer ensembles including their stoichiometry and chain architecture. We build upon a recent polymer representation that includes stoichiometries and chain architectures of monomer ensembles and develop a novel variational autoencoder (VAE) architecture encoding a graph and decoding a string. Using a semi-supervised setup, we enable the handling of partly labelled datasets which can be benefitial for domains with a small corpus of labelled data. Our model learns a continuous, well organized latent space (LS) that enables de-novo generation of copolymer structures including different monomer stoichiometries and chain architectures. In an inverse design case study, we demonstrate our model for in-silico discovery of novel conjugated copolymer photocatalysts for hydrogen production using optimization of the polymer's electron affinity and ionization potential in the latent space.


Extrapolative ML Models for Copolymers

arXiv.org Artificial Intelligence

Machine learning models have been progressively used for predicting materials properties. These models can be built using pre-existing data and are useful for rapidly screening the physicochemical space of a material, which is astronomically large. However, ML models are inherently interpolative, and their efficacy for searching candidates outside a material's known range of property is unresolved. Moreover, the performance of an ML model is intricately connected to its learning strategy and the volume of training data. Here, we determine the relationship between the extrapolation ability of an ML model, the size and range of its training dataset, and its learning approach. We focus on a canonical problem of predicting the properties of a copolymer as a function of the sequence of its monomers. Tree search algorithms, which learn the similarity between polymer structures, are found to be inefficient for extrapolation. Conversely, the extrapolation capability of neural networks and XGBoost models, which attempt to learn the underlying functional correlation between the structure and property of polymers, show strong correlations with the volume and range of training data. These findings have important implications on ML-based new material development.


Machine Learning Approach to Polymerization Reaction Engineering: Determining Monomers Reactivity Ratios

arXiv.org Artificial Intelligence

The atom & bond features are ranked based on an average score of their influences to the reactivity ratio prediction of trained samples. Each dot represents the impact of the corresponding pair of monomers and copolymer in the training set. Red and blue color indicates the high and low impact on the reactivity ratios, respectively. A positive SHAP value indicates a positive influence on the prediction, and a negative SHAP value indicates a negative influence on the prediction.


When does deep learning fail and how to tackle it? A critical analysis on polymer sequence-property surrogate models

arXiv.org Artificial Intelligence

Deep learning models are gaining popularity and potency in predicting polymer properties. These models can be built using pre-existing data and are useful for the rapid prediction of polymer properties. However, the performance of a deep learning model is intricately connected to its topology and the volume of training data. There is no facile protocol available to select a deep learning architecture, and there is a lack of a large volume of homogeneous sequence-property data of polymers. These two factors are the primary bottleneck for the efficient development of deep learning models. Here we assess the severity of these factors and propose new algorithms to address them. We show that a linear layer-by-layer expansion of a neural network can help in identifying the best neural network topology for a given problem. Moreover, we map the discrete sequence space of a polymer to a continuous one-dimensional latent space using a machine learning pipeline to identify minimal data points for building a universal deep learning model. We implement these approaches for three representative cases of building sequence-property surrogate models, viz., the single-molecule radius of gyration of a copolymer, adhesive free energy of a copolymer, and copolymer compatibilizer, demonstrating the generality of the proposed strategies. This work establishes efficient methods for building universal deep learning models with minimal data and hyperparameters for predicting sequence-defined properties of polymers.


A multi-category inverse design neural network and its application to diblock copolymers

arXiv.org Artificial Intelligence

In this work, we design a multi-category inverse design neural network to map ordered periodic structure to physical parameters. The neural network model consists of two parts, a classifier and Structure-Parameter-Mapping (SPM) subnets. The classifier is used to identify structure, and the SPM subnets are used to predict physical parameters for desired structures. We also present an extensible reciprocal-space data augmentation method to guarantee the rotation and translation invariant of periodic structures. We apply the proposed network model and data augmentation method to two-dimensional diblock copolymers based on the Landau-Brazovskii model. Results show that the multi-category inverse design neural network is high accuracy in predicting physical parameters for desired structures. Moreover, the idea of multi-categorization can also be extended to other inverse design problems.


A graph representation of molecular ensembles for polymer property prediction

arXiv.org Artificial Intelligence

Synthetic polymers are versatile and widely used materials. Similar to small organic molecules, a large chemical space of such materials is hypothetically accessible. Computational property prediction and virtual screening can accelerate polymer design by prioritizing candidates expected to have favorable properties. However, in contrast to organic molecules, polymers are often not well-defined single structures but an ensemble of similar molecules, which poses unique challenges to traditional chemical representations and machine learning approaches. Here, we introduce a graph representation of molecular ensembles and an associated graph neural network architecture that is tailored to polymer property prediction. We demonstrate that this approach captures critical features of polymeric materials, like chain architecture, monomer stoichiometry, and degree of polymerization, and achieves superior accuracy to off-the-shelf cheminformatics methodologies. While doing so, we built a dataset of simulated electron affinity and ionization potential values for >40k polymers with varying monomer composition, stoichiometry, and chain architecture, which may be used in the development of other tailored machine learning approaches. The dataset and machine learning models presented in this work pave the path toward new classes of algorithms for polymer informatics and, more broadly, introduce a framework for the modeling of molecular ensembles.


Bioplastic Design using Multitask Deep Neural Networks

arXiv.org Artificial Intelligence

Non-degradable plastic waste stays for decades on land and in water, jeopardizing our environment; yet our modern lifestyle and current technologies are impossible to sustain without plastics. Bio-synthesized and biodegradable alternatives such as the polymer family of polyhydroxyalkanoates (PHAs) have the potential to replace large portions of the world's plastic supply with cradle-to-cradle materials, but their chemical complexity and diversity limit traditional resource-intensive experimentation. In this work, we develop multitask deep neural network property predictors using available experimental data for a diverse set of nearly 23000 homo- and copolymer chemistries. Using the predictors, we identify 14 PHA-based bioplastics from a search space of almost 1.4 million candidates which could serve as potential replacements for seven petroleum-based commodity plastics that account for 75% of the world's yearly plastic production. We discuss possible synthesis routes for these identified promising materials. The developed multitask polymer property predictors are made available as a part of the Polymer Genome project at https://PolymerGenome.org.


Harnessing AI and Robotics to Treat Spinal Cord Injuries

#artificialintelligence

By employing artificial intelligence (AI) and robotics to formulate therapeutic proteins, a team led by Rutgers researchers has successfully stabilized an enzyme able to degrade scar tissue resulting from spinal cord injuries and promote tissue regeneration. The study, recently published in Advanced Healthcare Materials, details the team's ground-breaking stabilization of the enzyme Chondroitinase ABC, (ChABC) offering new hope for patients coping with spinal cord injuries. "This study represents one of the first times artificial intelligence and robotics have been used to formulate highly sensitive therapeutic proteins and extend their activity by such a large amount. It's a major scientific achievement," says Adam Gormley, the project's principal investigator and an assistant professor of biomedical engineering at Rutgers School of Engineering (SOE) at Rutgers University-New Brunswick. Gormley expresses that his research is also motivated, in part, by a personal connection to spinal cord injury.