Goto

Collaborating Authors

 acceptor


Automated Model-Free Sorting of Single-Molecule Fluorescence Events Using a Deep Learning Based Hidden-State Model

Zeng, Wenqi, Zhou, Shuqi, Yao, Yuan, Chen, Chunlai

arXiv.org Artificial Intelligence

Single-molecule fluorescence assays enable high-resolution analysis of biomolecular dynamics, but traditional analysis pipelines are labor-intensive and rely on users' experience, limiting scalability and reproducibility. Recent deep learning models have automated aspects of data processing, yet many still require manual thresholds, complex architectures, or extensive labeled data. Therefore, we present DASH, a fully streamlined architecture for trace classification, state assignment, and automatic sorting that requires no user input. DASH demonstrates robust performance across users and experimental conditions both in equilibrium and non-equilibrium systems such as Cas12a-mediated DNA cleavage. This paper proposes a novel strategy for the automatic and detailed sorting of single-molecule fluorescence events. The dynamic cleavage process of Cas12a is used as an example to provide a comprehensive analysis. This approach is crucial for studying biokinetic structural changes at the single-molecule level.


Accelerating materials discovery for polymer solar cells: Data-driven insights enabled by natural language processing

Shetty, Pranav, Adeboye, Aishat, Gupta, Sonakshi, Zhang, Chao, Ramprasad, Rampi

arXiv.org Artificial Intelligence

We present a simulation of various active learning strategies for the discovery of polymer solar cell donor/acceptor pairs using data extracted from the literature spanning $\sim$20 years by a natural language processing pipeline. While data-driven methods have been well established to discover novel materials faster than Edisonian trial-and-error approaches, their benefits have not been quantified for material discovery problems that can take decades. Our approach demonstrates a potential reduction in discovery time by approximately 75 %, equivalent to a 15 year acceleration in material innovation. Our pipeline enables us to extract data from greater than 3300 papers which is $\sim$5 times larger and therefore more diverse than similar data sets reported by others. We also trained machine learning models to predict the power conversion efficiency and used our model to identify promising donor-acceptor combinations that are as yet unreported. We thus demonstrate a pipeline that goes from published literature to extracted material property data which in turn is used to obtain data-driven insights. Our insights include active learning strategies that can be used to train strong predictive models of material properties or be robust to the initial material system used. This work provides a valuable framework for data-driven research in materials science.


The Krohn-Rhodes Logics

Ronca, Alessandro

arXiv.org Artificial Intelligence

We present a new family of modal temporal logics of the past, obtained by extending Past LTL with a rich set of temporal operators based on the theory by Krohn and Rhodes for automata cascades. The theory says that every automaton can be expressed as a cascade of some basic automata called prime automata. They are the building blocks of all automata, analogously to prime numbers being the building blocks of all natural numbers. We show that Past LTL corresponds to cascades of one kind of prime automata called flip-flops. In particular, the temporal operators of Past LTL are captured by flip-flops, and they cannot capture any other prime automaton, confining the expressivity within the star-free regular languages. We propose novel temporal operators that can capture other prime automata, and hence extend the expressivity of Past LTL. Such operators are infinitely-many, and they yield an infinite number of logics capturing an infinite number of distinct fragments of the regular languages. The result is a yet unexplored landscape of extensions of Past LTL, that we call Krohn-Rhodes Logics, each of them with the potential of matching the expressivity required by specific applications.



Detecting lateral genetic material transfer

Calderón, C., Delaye, L., Mireles, V., Miramontes, P.

arXiv.org Artificial Intelligence

The bioinformatical methods to detect lateral gene transfer events are mainly based on functional coding DNA characteristics. In this paper, we propose the use of DNA traits not depending on protein coding requirements. We introduce several semilocal variables that depend on DNA primary sequence and that reflect thermodynamic as well as physico-chemical magnitudes that are able to tell apart the genome of different organisms. After combining these variables in a neural classificator, we obtain results whose power of resolution go as far as to detect the exchange of genomic material between bacteria that are phylogenetically close.