fingerprint
KANEL: Kolmogorov-Arnold Network Ensemble Learning Enables Early Hit Enrichment in High-Throughput Virtual Screening
Koptev, Pavel, Krainov, Nikita, Malkov, Konstantin, Tropsha, Alexander
Machine learning models of chemical bioactivity are increasingly used to prioritize a small number of compounds from virtual screening libraries for experimental follow-up. In these applications, assessing model accuracy by early hit enrichment metrics, such as Positive Predictive Value (PPV) calculated for the top N hits (PPV@N), is more appropriate and actionable than relying on traditional global metrics such as AUC. We present KANEL, an ensemble workflow that combines interpretable Kolmogorov-Arnold Networks (KANs) with XGBoost, random forest, and multilayer perceptron models trained on complementary molecular representations (LillyMol descriptors, RDKit-derived descriptors, and Morgan fingerprints). Across five public PubChem BioAssay datasets (AIDs 485314, 485341, 504466, 624202, and 651820), Optuna-optimized weighted ensembles consistently outperformed the best single model in PPV@128 by 0.06-0.12
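To make the PPV@N metric concrete, here is a minimal sketch in plain Python, not the authors' implementation. The function names `ppv_at_n` and `weighted_ensemble`, the toy labels, and the fixed weights are all hypothetical; the abstract says the ensemble weights are tuned with Optuna, and that tuning step is not shown here.

```python
def ppv_at_n(y_true, scores, n):
    """Fraction of true actives among the n highest-scoring compounds."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Rank compound indices by predicted score, best first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = ranked[:n]
    return sum(y_true[i] for i in top) / len(top)


def weighted_ensemble(model_scores, weights):
    """Weighted average of per-model scores (weights normalized to sum to 1)."""
    total = sum(weights)
    n_compounds = len(model_scores[0])
    return [
        sum(w * s[i] for w, s in zip(weights, model_scores)) / total
        for i in range(n_compounds)
    ]


# Toy example (hypothetical data): 1 = active, 0 = inactive.
y = [1, 0, 1, 0, 0, 1, 0, 0]
model_a = [0.9, 0.8, 0.7, 0.1, 0.2, 0.3, 0.4, 0.5]
model_b = [0.2, 0.1, 0.9, 0.3, 0.0, 0.8, 0.4, 0.5]

# Equal weights are an assumption made for illustration only.
ensemble = weighted_ensemble([model_a, model_b], [0.5, 0.5])

ppv_a = ppv_at_n(y, model_a, 3)
ppv_b = ppv_at_n(y, model_b, 3)
ppv_ens = ppv_at_n(y, ensemble, 3)
```

In this toy case each single model places two actives in its top three, while the averaged scores place three, which is the kind of early-enrichment gain that PPV@N is meant to capture. A global metric such as AUC would also weigh the ordering of compounds far down the list, which never reach experimental follow-up.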
Understanding the Limitations of Deep Models for Molecular Property Prediction: Insights and Solutions
Molecular Property Prediction (MPP) is a critical task in computational drug discovery, aimed at identifying molecules with desirable pharmacological and ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties. Machine learning models have been widely used in this fast-growing field, with two types of models being commonly employed: traditional non-deep models and deep models.