Zhao, Yinan
Improving Earth-like planet detection in radial velocity using deep learning
Zhao, Yinan, Dumusque, Xavier, Cretignier, Michael, Collier Cameron, Andrew, Latham, David W., López-Morales, Mercedes, Mayor, Michel, Sozzetti, Alessandro, Cosentino, Rosario, Gómez-Vargas, Isidro, Pepe, Francesco, Udry, Stéphane
Stellar activity in radial velocity (RV) measurements is currently the major limitation to exoplanet detection, and many novel methods have been proposed to mitigate it. Unlike traditional methods that model stellar activity in the RV domain, more recent approaches aim to disentangle stellar activity directly at the spectral level. The goal of this paper is to present a novel convolutional neural network (CNN)-based algorithm that efficiently models stellar activity signals at the spectral level, enhancing the detection of Earth-like planets. We trained a convolutional neural network to learn the correlation between changes in the spectral line profile and the corresponding RV, full width at half maximum (FWHM), and bisector span (BIS) values derived from the classical cross-correlation function (CCF). The algorithm was tested on three intensively observed stars: Alpha Centauri B (HD128621), Tau Ceti (HD10700), and the Sun. By injecting simulated planetary signals at the spectral level, we demonstrate that, for HD128621 and HD10700, our machine learning algorithm achieves a detection threshold of 0.5 m/s in semi-amplitude for planets with periods ranging from 10 to 300 days. This threshold would correspond to the detection of a $\sim$4 $\mathrm{M}_{\oplus}$ planet in the habitable zone of those stars. On the HARPS-N solar dataset, our algorithm is even more efficient at mitigating stellar activity signals and reaches a threshold of 0.2 m/s, which would correspond to a 2.2 $\mathrm{M}_{\oplus}$ planet on an Earth-like orbit. To the best of our knowledge, this is the first time such low detection thresholds have been reported for the Sun, and for other stars, which highlights the efficiency of our CNN-based algorithm at mitigating stellar activity in RV measurements.
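A minimal sketch of the core idea described in this abstract, assuming a PyTorch setup: a 1-D CNN mapping CCF line-profile changes to the activity-induced RV, FWHM, and BIS values. The layer sizes, the 161-point CCF grid, and the training step below are illustrative assumptions, not the paper's actual architecture.

```python
# Sketch (not the authors' released code): a 1-D CNN that maps changes in the
# cross-correlation function (CCF) line profile to activity-induced RV, FWHM,
# and BIS values. All sizes are illustrative assumptions.
import torch
import torch.nn as nn

class ActivityCNN(nn.Module):
    def __init__(self, n_ccf_points: int = 161):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(8),
        )
        self.head = nn.Linear(32 * 8, 3)  # predicts (RV, FWHM, BIS)

    def forward(self, ccf_residual: torch.Tensor) -> torch.Tensor:
        # ccf_residual: (batch, 1, n_ccf_points), each CCF minus a master CCF
        x = self.features(ccf_residual)
        return self.head(x.flatten(1))

model = ActivityCNN()
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One illustrative training step on synthetic stand-in data:
ccf = torch.randn(64, 1, 161)    # stand-in for CCF residual profiles
targets = torch.randn(64, 3)     # stand-in for measured RV/FWHM/BIS
optimizer.zero_grad()
loss = loss_fn(model(ccf), targets)
loss.backward()
optimizer.step()
```

Once trained, the network's predicted activity-induced RV can be subtracted from the observed RV time series before searching for planetary signals.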
The GPU Phase Folding and Deep Learning Method for Detecting Exoplanet Transits
Wang, Kaitlyn, Ge, Jian, Willis, Kevin, Wang, Kevin, Zhao, Yinan
This paper presents GPFC, a novel Graphics Processing Unit (GPU) Phase Folding and Convolutional Neural Network (CNN) system to detect exoplanets using the transit method. We devise a fast folding algorithm parallelized on a GPU to amplify low signal-to-noise-ratio transit signals, allowing a search at high precision and speed. A CNN trained on two million synthetic light curves reports a score indicating the likelihood of a planetary signal at each period. While the GPFC method has broad applicability across period ranges, this work focuses on detecting ultra-short-period planets with orbital periods below one day. GPFC is three orders of magnitude faster than the predominant Box-fitting Least Squares (BLS) method. Our simulation results show that GPFC achieves $97\%$ training accuracy, a higher true-positive rate at the same false-positive rate, and higher precision at the same recall when compared to BLS. GPFC recovers $100\%$ of known ultra-short-period planets in $\textit{Kepler}$ light curves in a blind search. These results highlight the promise of GPFC as an alternative to the traditional BLS algorithm for finding new transiting exoplanets in data from $\textit{Kepler}$ and other space transit missions such as K2, TESS, and the future PLATO and Earth 2.0.
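A minimal NumPy sketch of the phase-folding step that GPFC parallelizes on a GPU (swapping numpy for cupy would give a GPU version). The bin count and the simple deepest-bin statistic are illustrative assumptions, not the paper's exact kernel; in GPFC the folded profile itself is what the CNN scores.

```python
# Sketch of phase folding for transit detection: fold at a trial period, bin
# in phase, and measure how deep the deepest phase bin dips below the median.
import numpy as np

def phase_fold_snr(time, flux, period, n_bins=256):
    """Fold a light curve at a trial period and return (crude SNR, profile)."""
    phase = (time / period) % 1.0
    bins = np.minimum((phase * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(bins, weights=flux, minlength=n_bins)
    counts = np.bincount(bins, minlength=n_bins).clip(min=1)
    profile = sums / counts                        # mean flux per phase bin
    depth = np.median(profile) - profile.min()     # deepest dip below median
    return depth / (profile.std() + 1e-12), profile

# Illustration: inject a 0.7-day box transit into white noise and recover it.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 90, 60_000))            # ~90 days of observations
f = 1 + 1e-4 * rng.standard_normal(t.size)
f[((t / 0.7) % 1.0) < 0.02] -= 3e-4                # shallow 0.7-day transit
snr, profile = phase_fold_snr(t, f, period=0.7)
```

Folding at the true period stacks the shallow transits coherently, which is how low signal-to-noise signals get amplified before the CNN scoring stage.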
Discovery of Small Ultra-short-period Planets Orbiting KG Dwarfs in Kepler Survey Using GPU Phase Folding and Deep Learning Detection System
Wang, Kaitlyn, Ge, Jian, Willis, Kevin, Wang, Kevin, Zhao, Yinan
Since the discovery of the first hot Jupiter orbiting a solar-type star, 51 Peg, in 1995, more than 4000 exoplanets have been identified using various observational techniques. Among them, ultra-short-period (USP) planets with sub-Earth radii remain rare. The formation process of these sub-Earths remains elusive, and acquiring additional samples is essential for investigating this unique population. In our study, we employ a novel GPU Phase Folding algorithm combined with a Convolutional Neural Network, termed the GPFC method, on Kepler photometry data. This method significantly enhances the transit search speed over the traditional Box-fitting Least Squares method, allowing a complete search of the known KOI photometry data within hours on a commercial GPU card. To date, we have identified five promising sub-Earth short-period candidates: K00446.c, K01821.b, K01522.c, K03404.b, and K04978.b. A closer analysis reveals the following characteristics: K00446.c orbits a K dwarf on a 0.645091-day period; with a radius of $0.461 R_\oplus$, it ranks as the second smallest USP discovered to date. K01821.b is a sub-Earth with a radius of $0.648 R_\oplus$ orbiting a G dwarf on a 0.91978-day period; it is the second smallest USP among all confirmed USPs orbiting G dwarfs in the NASA Archive. K01522.c has a radius of $0.704 R_\oplus$ and completes an orbit around a Sun-like G dwarf in 0.64672 days; K03404.b, with a radius of $0.738 R_\oplus$, orbits a G dwarf on a 0.68074-day period; and K04978.b, with a radius of $0.912 R_\oplus$, orbits a G dwarf every 0.94197 days. Three of our finds, K01821.b, K01522.c, and K03404.b, rank among the smallest planets of all confirmed USPs orbiting G dwarfs in the Kepler dataset. The discovery of these small exoplanets underscores the promising capability of the GPFC method for finding small, new transiting exoplanets in photometry data from Kepler, TESS, and future space transit missions.
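For illustration, the blind search over KOI photometry can be sketched as a scan over a dense grid of sub-day trial periods, reusing the phase_fold_snr helper from the sketch above. The structure is assumed, not the paper's code; the grid bounds and the score_fn callable (a stand-in for the trained CNN) are hypothetical.

```python
# Sketch of a GPFC-style blind period search: fold at each trial period and
# keep the period whose folded profile scores highest. score_fn stands in for
# the trained CNN; in GPFC, the folding loop runs in parallel on the GPU.
import numpy as np

def blind_search(time, flux, score_fn, p_min=0.2, p_max=1.0, n_periods=100_000):
    """Return the (period, score) pair with the highest planet-likeness score."""
    best_period, best_score = None, -np.inf
    for period in np.linspace(p_min, p_max, n_periods):
        _, profile = phase_fold_snr(time, flux, period)  # from the sketch above
        score = score_fn(profile)
        if score > best_score:
            best_period, best_score = period, score
    return best_period, best_score
```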
Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
Liang, Feng, Wu, Bichen, Dai, Xiaoliang, Li, Kunpeng, Zhao, Yinan, Zhang, Hang, Zhang, Peizhao, Vajda, Peter, Marculescu, Diana
Open-vocabulary semantic segmentation aims to segment an image into semantic regions according to text descriptions, which may not have been seen during training. Recent two-stage methods first generate class-agnostic mask proposals and then leverage pre-trained vision-language models, e.g., CLIP, to classify masked regions. We identify the performance bottleneck of this paradigm to be the pre-trained CLIP model, since it does not perform well on masked images. To address this, we propose to finetune CLIP on a collection of masked image regions and their corresponding text descriptions. We collect training data by mining an existing image-caption dataset (e.g., COCO Captions), using CLIP to match masked image regions to nouns in the image captions. Compared with the more precise and manually annotated segmentation labels with fixed classes (e.g., COCO-Stuff), we find our noisy but diverse dataset better retains CLIP's generalization ability. Along with finetuning the entire model, we exploit the "blank" areas in masked images using a method we dub mask prompt tuning. Experiments demonstrate that mask prompt tuning brings significant improvement without modifying any weights of CLIP, and it can further improve a fully finetuned model. In particular, when trained on COCO and evaluated on ADE20K-150, our best model achieves 29.6% mIoU, which is +8.5% higher than the previous state of the art. For the first time, open-vocabulary generalist models match the performance of 2017 supervised specialist models without dataset-specific adaptations.
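A hedged sketch of the classification stage of the two-stage paradigm this abstract describes (not the OVSeg release): blank out everything outside a class-agnostic mask proposal, then let a pre-trained CLIP model score the masked region against open-vocabulary class names. The mask source and class list here are hypothetical stand-ins.

```python
# Sketch: classify one masked region with off-the-shelf CLIP. This is the step
# the paper identifies as the bottleneck, since CLIP is not trained on images
# whose background has been blanked out.
import numpy as np
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git
from PIL import Image

model, preprocess = clip.load("ViT-B/32", device="cpu")
class_names = ["dog", "grass", "sky"]  # any open vocabulary, not a fixed set
text = clip.tokenize([f"a photo of a {c}" for c in class_names])

def classify_masked_region(image: Image.Image, mask: np.ndarray) -> str:
    """mask: boolean HxW array from a class-agnostic proposal generator."""
    arr = np.array(image)
    arr[~mask] = 0                              # the "blank" areas that mask
    masked = preprocess(Image.fromarray(arr))   # prompt tuning later exploits
    with torch.no_grad():
        img_emb = model.encode_image(masked.unsqueeze(0))
        txt_emb = model.encode_text(text)
        img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
        txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
        sims = (img_emb @ txt_emb.T).squeeze(0)  # cosine similarity per class
    return class_names[int(sims.argmax())]
```

The paper's contribution is to finetune CLIP on mined masked-region/caption-noun pairs and to replace the zeroed-out pixels with learnable mask prompt tokens, rather than to change this overall region-classification structure.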