 Ticino


A Model-Centric Review of Deep Learning for Protein Design

arXiv.org Artificial Intelligence

Deep learning has transformed protein design, enabling accurate structure prediction, sequence optimization, and de novo protein generation. Advances in single-chain protein structure prediction via AlphaFold2, RoseTTAFold, ESMFold, and others have achieved near-experimental accuracy, inspiring successive work extended to biomolecular complexes via AlphaFold-Multimer, RoseTTAFold All-Atom, AlphaFold 3, Chai-1, Boltz-1, and others. Generative models such as ProtGPT2, ProteinMPNN, and RFdiffusion have enabled sequence and backbone design beyond natural evolution-based limitations. More recently, joint sequence-structure co-design models, including ESM3, have integrated both modalities into a unified framework, resulting in improved designability. Despite these advances, challenges still exist pertaining to modeling sequence-structure-function relationships and ensuring robust generalization beyond the regions of protein space spanned by the training data. Future advances will likely focus on joint sequence-structure-function co-design frameworks that are able to model the fitness landscape more effectively than models that treat these modalities independently. Current capabilities, coupled with the dizzying rate of progress, suggest that the field will soon enable rapid, rational design of proteins with tailored structures and functions that transcend the limitations imposed by natural evolution. In this review, we discuss the current capabilities of deep learning methods for protein design, focusing on some of the most revolutionary and capable models with respect to their functionality and the applications that they enable, leading up to the current challenges of the field and the optimal path forward.


An Artificial Intelligence-based model for cell killing prediction: development, validation and explainability analysis of the ANAKIN model

arXiv.org Artificial Intelligence

The present work develops ANAKIN: an Artificial iNtelligence bAsed model for (radiation induced) cell KIlliNg prediction. ANAKIN is trained and tested on 513 cell survival experiments with different types of radiation contained in the publicly available PIDE database. We show that ANAKIN accurately predicts several relevant biological endpoints over a wide range of ion beams and for a large number of cell lines. We compare the predictions of ANAKIN to those of the only two radiobiological models for RBE prediction used in clinics, namely the Microdosimetric Kinetic Model (MKM) and the Local Effect Model (LEM version III), showing that ANAKIN has higher accuracy across all considered biological endpoints. Finally, via modern techniques of Explainable Artificial Intelligence (XAI), we show how ANAKIN predictions can be understood and explained, highlighting that ANAKIN is in fact able to reproduce relevant well-known biological patterns, such as the overkilling effect.


A shortest-path based clustering algorithm for joint human-machine analysis of complex datasets

arXiv.org Artificial Intelligence

Clustering is a technique for the analysis of datasets obtained by empirical studies in several disciplines, with a major application in biomedical research. Essentially, clustering algorithms are executed by machines with the aim of finding groups of related points in a dataset. However, the result of grouping depends on both the metrics for point-to-point similarity and the rules for point-to-group association. Indeed, inappropriate metrics and rules can lead to undesirable clustering artifacts. This is especially relevant for datasets where groups with heterogeneous structures co-exist. In this work, we propose an algorithm that achieves clustering by exploring the paths between points. This allows both evaluating the properties of a path (such as gaps and density variations) and expressing a preference for certain paths. Moreover, our algorithm supports the integration of existing knowledge about admissible and non-admissible clusters by training a path classifier. We demonstrate the accuracy of the proposed method on challenging datasets, including points from synthetic shapes in publicly available benchmarks and microscopy data.
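
To make the idea of judging a path by one of its properties concrete, the sketch below groups points that are connected by some path whose largest step (its "gap") stays under a threshold. This is not the algorithm from the paper, which is not specified in the abstract: it is a simple stand-in, equivalent to single-linkage clustering with a distance cutoff, and it omits the path classifier entirely. The function name `cluster_by_path_gap` and the parameter `max_gap` are hypothetical names chosen for illustration.

```python
from itertools import combinations
from math import dist


def cluster_by_path_gap(points, max_gap):
    """Label points so that two points share a label when some path
    between them has no single step longer than max_gap.

    Illustrative stand-in only (assumption): the one path property
    considered here is the largest gap along the path.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Join every pair of points that is close enough to be one path step.
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= max_gap:
            parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels, seen = [], {}
    for i in range(len(points)):
        root = find(i)
        labels.append(seen.setdefault(root, len(seen)))
    return labels


# Two chains of points separated by a large gap.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
labels = cluster_by_path_gap(points, max_gap=1.5)
print(labels)
```

With `max_gap=1.5`, the first three points end up in one group even though the two end points are 2.0 apart, because the path through the middle point has no step longer than 1.0. Evaluating richer path properties, such as density variations, or learning which paths are admissible would replace the single threshold test.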