Data Fusion of Deep Learned Molecular Embeddings for Property Prediction

Appleton, Robert J, Barnes, Brian C, Strachan, Alejandro

Oct-29-2025–arXiv.org Artificial Intelligence

Data - driven approaches such as deep learning can result in predictive models for material properties with exceptional accuracy and efficiency. However, in many applications, data is sparse, severely limiting their accuracy and applicability . To improve predictions, techniques such as transfer learning and multi - task learning have been used. T he performance of multi - task learning models depend s on the strength of the underlying correlations between tasks and the completeness of the dataset . S tandard multi - task models tend to underperform when trained on sparse datasets with weakly correlated properties. To address this gap, we fuse deep - learned embeddings generated by independent pre - trained single - task models, resulting in a multi - task model that inherit s rich, property - specific representations. By re - using (rather than re - training) these embeddings, the resulting fused model outperforms standard multi - task models and can be extended with fewer trainable parameters . We demonstrate this technique on a widely used benchmark dataset of quantum chemistry data for small molecules as well as a newly compiled sparse dataset of experimental data collected from literature and our own quant um chemistry and thermochemical calculations.

artificial intelligence, distribution statement, machine learning, (17 more...)

arXiv.org Artificial Intelligence

Oct-29-2025

arXiv.org PDF

Add feedback

Country:
- North America > United States > Indiana (0.28)

Genre:
- Research Report > New Finding (0.67)

Industry:
- Government
  - Military (0.68)
  - Regional Government (0.68)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found