 topological index


Fast, Accurate and Interpretable Graph Classification with Topological Kernels

Wesołowski, Adam, Wu, Ronin, Essafi, Karim

arXiv.org Artificial Intelligence

We introduce a novel class of explicit feature maps based on topological indices that represent each graph by a compact feature vector, enabling fast and interpretable graph classification. Using radial basis function (RBF) kernels on these compact vectors, we define a measure of similarity between graphs. We evaluate the approach on standard molecular datasets and observe that classification accuracies based on single topological-index feature vectors fall below those of state-of-the-art substructure-based kernels. However, Gram matrix evaluation is significantly faster, up to 20× faster than with the Weisfeiler-Lehman subtree kernel. To enhance performance, we propose two extensions: 1) the Extended Feature Vector (EFV), which concatenates multiple topological indices, and 2) the Linear Combination of Topological Kernels (LCTK), which linearly combines RBF kernels computed on the feature vectors of individual topological graph indices. These extensions deliver accuracy gains of up to 12% across all the molecular datasets. A complexity analysis highlights the potential for exponential quantum speedup for some of the vector components. Our results indicate that LCTK and EFV offer a favourable trade-off between accuracy and efficiency, making them strong candidates for practical graph learning applications.
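The two extensions described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of indices (Wiener and first Zagreb), the one-number-per-index feature map, the value of `gamma`, the kernel weights, and all function names are assumptions made for illustration. Graphs are assumed to be connected and given as adjacency dictionaries.

```python
import math
from collections import deque


def wiener_index(adj):
    """Sum of shortest-path distances over all unordered vertex pairs."""
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # every unordered pair was counted twice


def first_zagreb_index(adj):
    """Sum of squared vertex degrees."""
    return sum(len(neighbours) ** 2 for neighbours in adj.values())


# Assumption: these two indices stand in for whichever indices are actually used.
INDICES = (wiener_index, first_zagreb_index)


def rbf_kernel(x, y, gamma=0.1):
    """RBF kernel exp(-gamma * ||x - y||^2) on two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def extended_feature_vector(adj):
    """EFV-style map: concatenate several index values into one vector."""
    return [float(index(adj)) for index in INDICES]


def lctk(adj_a, adj_b, weights, gamma=0.1):
    """LCTK-style kernel: weighted sum of one RBF kernel per index."""
    return sum(
        w * rbf_kernel([float(index(adj_a))], [float(index(adj_b))], gamma)
        for w, index in zip(weights, INDICES)
    )


# Toy graphs: a path on three vertices and a triangle.
PATH = {0: [1], 1: [0, 2], 2: [1]}
TRIANGLE = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

efv_similarity = rbf_kernel(
    extended_feature_vector(PATH), extended_feature_vector(TRIANGLE)
)
lctk_similarity = lctk(PATH, TRIANGLE, weights=[0.5, 0.5])
print(efv_similarity, lctk_similarity)
```

In this sketch the EFV variant applies a single kernel to the concatenated vector, while the LCTK variant keeps one kernel per index, so the weights show how much each index contributes to the similarity.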


Linear to Neural Networks Regression: QSPR of Drugs via Degree-Distance Indices

Arani, M. J. Nadjafi, Sorgun, S., Mirzargar, M.

arXiv.org Artificial Intelligence

This study conducts a Quantitative Structure Property Relationship (QSPR) analysis to explore the correlation between the physical properties of drug molecules and their topological indices using machine learning techniques. While prior studies in drug design have focused on degree-based topological indices, this work analyzes a dataset of 166 drug molecules by computing degree-distance-based topological indices, incorporating vertex-edge weightings with respect to six different atomic properties (atomic number, atomic radius, atomic mass, density, electronegativity, ionization). Both linear models (Linear Regression, Lasso, and Ridge Regression) and nonlinear approaches (Random Forest, XGBoost, and Neural Networks) were employed to predict molecular properties. The results demonstrate the effectiveness of these indices in predicting specific physicochemical properties and underscore the practical relevance of computational methods in molecular property estimation. The study provides an innovative perspective on integrating topological indices with machine learning to enhance predictive accuracy, highlighting their potential application in drug discovery and development processes. This predictive approach also suggests that establishing a reliable relationship between topological indices and physical properties enables chemists to gain preliminary insights into molecular behavior before conducting experimental analyses, thereby optimizing resource utilization in cheminformatics research.
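As a rough illustration of a degree-distance-based index, the sketch below computes the classic unweighted degree distance index, DD(G) = Σ over unordered pairs {u, v} of (deg(u) + deg(v)) · d(u, v), for a connected graph. The atomic-property weightings used in the study are not specified in this abstract and are not reproduced here. The function names and the toy graphs are assumptions for illustration only.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path distances (in edges) from `source` to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree_distance_index(adj):
    """Unweighted degree distance index of a connected graph."""
    dist = {u: bfs_distances(adj, u) for u in adj}
    return sum(
        (len(adj[u]) + len(adj[v])) * dist[u][v]
        for u, v in combinations(adj, 2)  # each unordered vertex pair once
    )


# Toy graphs (hypothetical, not molecules from the study's dataset).
PATH_3 = {0: [1], 1: [0, 2], 2: [1]}
STAR_4 = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

print(degree_distance_index(PATH_3), degree_distance_index(STAR_4))
```

An index value of this kind would serve as one input feature per molecule for the regression models listed in the abstract.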


Classification of signaling proteins based on molecular star graph descriptors using Machine Learning models

Fernandez-Lozano, Carlos, Cuinas, Ruben F., Seoane, Jose A., Fernandez-Blanco, Enrique, Dorado, Julian, Munteanu, Cristian R.

arXiv.org Machine Learning

Signaling proteins are an important topic in drug development because of the growing need for fast, accurate and cheap methods to evaluate new molecular targets involved in specific diseases. The complexity of protein structure hinders the direct association of signaling activity with molecular structure. The proposed solution therefore uses protein star graphs to encode peptide sequence information into specific topological indices calculated with the S2SNet tool. The Quantitative Structure-Activity Relationship classification model obtained with Machine Learning techniques is able to predict new signaling peptides. The best classification model, which is the first signaling prediction model, is based on eleven descriptors; it was obtained using the Support Vector Machines - Recursive Feature Elimination (SVM-RFE) technique with the Laplacian kernel (RFE-LAP) and achieves an AUROC of 0.961. The prediction performance of the model was assessed by testing a set of 3114 proteins of unknown function from the PDB database. Important signaling pathways are presented for three UniprotIDs (34 PDBs) with a signaling prediction greater than 98.0%.


Generative Modeling of Hidden Functional Brain Networks

Nandy, Shaurabh, Golden, Richard M.

arXiv.org Machine Learning

Resting-state functional connectivity fMRI data is a derivative of the unobservable neuronal functional network structure of the human brain. This data is subject to multiple sources of noise such as thermal noise, system noise, and physiological noise. Commonly used methods to infer the latent network structure, such as thresholding methods, make the implicit assumptions that weak links are not as important as strong links and that links are conditionally independent. However, such assumptions provide an incomplete description of the biology. Additionally, despite a core set of observations about functional networks, such as small-worldness, modularity, exponentially truncated degree distributions, and the presence of various types of hubs, very little is known about the computational principles that can give rise to these observations. This paper presents a Hidden Markov Random Field framework for representing, estimating, and evaluating latent neuronal functional relationships using fMRI data. The main theoretical contributions of this paper are summarized as follows.