Collaborating Authors

 Kukharenko, Oleksandra


Data-driven identification and analysis of the glass transition in polymer melts

arXiv.org Artificial Intelligence

Understanding the nature of the glass transition and precisely estimating the glass transition temperature of polymeric materials remain open questions in both experimental and theoretical polymer science. We propose a data-driven approach that exploits the high-resolution detail accessible through molecular dynamics simulations and considers the structural information of individual chains. It clearly identifies the glass transition temperature of polymer melts of weakly semiflexible chains. By combining principal component analysis and clustering, we identify the glass transition temperature in the asymptotic limit even from relatively short-time trajectories, which just reach into the Rouse-like monomer displacement regime. We demonstrate that the fluctuations captured by the principal component analysis reflect the change in a chain's behaviour: from conformational rearrangement above the glass transition temperature to small rearrangements below it. Our approach is straightforward to apply and should be applicable to other polymeric glass-forming liquids.
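The combination of principal component analysis and clustering described in this abstract can be illustrated with a toy sketch. This is not the authors' pipeline: the per-chain descriptor matrix, the use of the leading-component variance as the fluctuation measure, the two-cluster 1D k-means, and the midpoint rule for the transition temperature are all assumptions made for illustration, as are the function names `leading_pc_variance`, `two_means_1d`, and `estimate_transition`.

```python
import numpy as np


def leading_pc_variance(frames):
    """Variance along the first principal component.

    frames: (n_frames, n_features) array of a hypothetical per-chain
    structural descriptor (for example, intra-chain distances).
    """
    centered = frames - frames.mean(axis=0)
    # PCA variances are the squared singular values of the centered data
    # divided by (n_frames - 1).
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return singular_values[0] ** 2 / (len(frames) - 1)


def two_means_1d(values, n_iter=100):
    """Plain 1D k-means with k=2; True marks the upper cluster.

    Assumes the values are not all identical.
    """
    values = np.asarray(values, dtype=float)
    low, high = values.min(), values.max()
    for _ in range(n_iter):
        upper = np.abs(values - high) < np.abs(values - low)
        new_low, new_high = values[~upper].mean(), values[upper].mean()
        if new_low == low and new_high == high:
            break
        low, high = new_low, new_high
    return upper


def estimate_transition(temperatures, trajectories):
    """Midpoint between the 'small fluctuation' and 'large fluctuation' groups.

    Assumption: the large-fluctuation cluster lies at higher temperatures.
    """
    temperatures = np.asarray(temperatures, dtype=float)
    log_var = np.log([leading_pc_variance(t) for t in trajectories])
    upper = two_means_1d(log_var)
    return 0.5 * (temperatures[~upper].max() + temperatures[upper].min())
```

Here `trajectories` holds one descriptor matrix per temperature. Clustering the logarithm of the variance rather than the variance itself is a design choice of this sketch, made so that a jump of several orders of magnitude is not dominated by the spread within the high-temperature group.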


Fast conformational clustering of extensive molecular dynamics simulation data

arXiv.org Artificial Intelligence

We present an unsupervised data processing workflow that is specifically designed to obtain a fast conformational clustering of long molecular dynamics simulation trajectories. In this approach we combine two dimensionality reduction algorithms (cc_analysis and encodermap) with a density-based spatial clustering algorithm (HDBSCAN). The proposed scheme benefits from the strengths of the three algorithms while avoiding most of the drawbacks of the individual methods. Here the cc_analysis algorithm is applied to molecular simulation data for the first time. Encodermap complements cc_analysis by providing an efficient way to process large amounts of data and assign them to clusters. The main goal of the procedure is to maximize the number of assigned frames of a given trajectory while keeping a clear conformational identity of the clusters that are found. In practice we achieve this by using an iterative clustering approach and a tunable root-mean-square-deviation-based criterion in the final cluster assignment. This makes it possible to find clusters of different densities as well as different degrees of structural identity. We illustrate the capability and performance of this clustering workflow with four test systems: the wild type and a thermostable mutant of the Trp-cage protein (TC5b and TC10b), NTL9, and Protein B. Each of these systems poses individual challenges to the scheme, and together they give an overview of the advantages of the proposed method as well as the potential difficulties that can arise when using it.
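The RMSD-based criterion in the final cluster assignment can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes each cluster is represented by a single reference structure, that RMSD is computed after optimal superposition with the Kabsch algorithm, and that frames farther than a tunable `cutoff` from every reference stay unassigned (label `-1`). The names `kabsch_rmsd` and `assign_by_rmsd` are hypothetical.

```python
import numpy as np


def kabsch_rmsd(a, b):
    """RMSD between two (n_atoms, 3) structures after optimal superposition."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    # Kabsch algorithm: the optimal rotation follows from the SVD of the
    # covariance matrix; only the singular values are needed for the RMSD.
    u, s, vt = np.linalg.svd(a.T @ b)
    # Flip the smallest singular value if the best fit would be a reflection.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]
    msd = (np.sum(a ** 2) + np.sum(b ** 2) - 2.0 * s.sum()) / len(a)
    return np.sqrt(max(msd, 0.0))  # guard against tiny negative round-off


def assign_by_rmsd(frames, references, cutoff):
    """Assign each frame to the nearest reference if within `cutoff`.

    Returns an integer label per frame; -1 means 'left unassigned'.
    """
    labels = np.full(len(frames), -1, dtype=int)
    for i, frame in enumerate(frames):
        distances = [kabsch_rmsd(frame, ref) for ref in references]
        best = int(np.argmin(distances))
        if distances[best] <= cutoff:
            labels[i] = best
    return labels
```

Raising `cutoff` assigns more frames at the cost of looser structural identity within each cluster, which is the trade-off the abstract describes as tunable.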