Simulation-Based Pretraining and Domain Adaptation for Astronomical Time Series with Minimal Labeled Data
Gupta, Rithwik, Muthukrishna, Daniel, Audenaert, Jeroen
arXiv.org Artificial Intelligence
Astronomical time-series analysis faces a critical limitation: the scarcity of labeled observational data. We present a pre-training approach that leverages simulations, significantly reducing the need for labeled examples from real observations. Our models, trained on simulated data from multiple astronomical surveys (ZTF and LSST), learn generalizable representations that transfer effectively to downstream tasks. Using classifier-based architectures enhanced with contrastive and adversarial objectives, we create domain-agnostic models that demonstrate substantial performance improvements over baseline methods in classification, redshift estimation, and anomaly detection when fine-tuned with minimal real data. Remarkably, our models exhibit effective zero-shot transfer capabilities, achieving comparable performance on future telescope (LSST) simulations when trained solely on existing telescope (ZTF) data. Furthermore, they generalize to very different astronomical phenomena (namely variable stars from NASA's Kepler telescope) despite being trained on transient events, demonstrating cross-domain capabilities. Our approach provides a practical solution for building general models when labeled data is scarce but domain knowledge can be encoded in simulations.
Oct-16-2025
- Country:
  - Asia > Middle East
    - Oman (0.04)
  - North America
    - Canada (0.04)
    - United States
      - California > Alameda County
        - Fremont (0.04)
      - Massachusetts > Middlesex County
        - Cambridge (0.14)
- Genre:
  - Research Report (0.50)