kepler
Ultra-rare first edition book from Galileo heading to auction
A small library's worth of rare medieval and Renaissance books is heading to auction on July 9. The expansive lot includes a portable Magna Carta, an early scientific encyclopedia, a surgical codex, and one of the oldest surviving Sephardic Torah scrolls. But according to Christie's auction house, one manuscript is the first of its kind to go up for sale in over a century: a copy of the first pseudonymous astronomical text co-written by Galileo Galilei. The evening of October 9, 1604, offered an unexpected and ultimately revolutionary moment for astronomy.
Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification
Li, Yu-Yang, Bai, Yu, Wang, Cunshi, Qu, Mengwei, Lu, Ziteng, Soria, Roberto, Liu, Jifeng
Light curves serve as a valuable source of information on stellar formation and evolution. With the rapid advancement of machine learning techniques, they can be effectively processed to extract astronomical patterns and information. In this study, we present a comprehensive evaluation of deep-learning and large language model (LLM) based models for the automatic classification of variable star light curves, based on large datasets from the Kepler and K2 missions. Special emphasis is placed on Cepheids, RR Lyrae, and eclipsing binaries, examining the influence of observational cadence and phase distribution on classification precision. Employing AutoDL optimization, we achieve striking performance with the 1D-Convolution+BiLSTM architecture and the Swin Transformer, reaching accuracies of 94% and 99%, respectively, with the latter demonstrating a notable 83% accuracy in discerning the elusive Type II Cepheids, which comprise merely 0.02% of the total dataset. We unveil StarWhisper LightCurve (LC), an innovative series comprising three LLM-based models: LLM, multimodal large language model (MLLM), and Large Audio Language Model (LALM). Each model is fine-tuned with strategic prompt engineering and customized training methods to explore the emergent abilities of these models for astronomical data. Remarkably, the StarWhisper LC series exhibits high accuracies around 90%, significantly reducing the need for explicit feature engineering, thereby paving the way for streamlined parallel data processing and the progression of multifaceted multimodal models in astronomical applications. The study furnishes two detailed catalogs illustrating the impacts of phase and sampling intervals on deep learning classification accuracy, showing that a substantial decrease of up to 14% in observation duration and 21% in sampling points can be realized without compromising accuracy by more than 10%.
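The gap between the 99% overall accuracy and the 83% figure for Type II Cepheids is what one would expect under heavy class imbalance: a class making up 0.02% of the data barely moves the overall number. A minimal sketch of per-class accuracy follows. The labels are made up, and `per_class_accuracy` is a hypothetical helper, not the paper's evaluation code.

```python
from collections import Counter

def per_class_accuracy(y_true, y_pred):
    """Return {label: fraction of that label's samples predicted correctly}."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / n for label, n in total.items()}

# Toy labels (illustrative only): 9,998 common-class stars and 2 rare ones,
# so the rare class is 0.02% of the sample.
y_true = ["EB"] * 9998 + ["T2CEP"] * 2
# A classifier that gets every common star right but only one of the rare stars.
y_pred = ["EB"] * 9998 + ["T2CEP", "EB"]

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
by_class = per_class_accuracy(y_true, y_pred)
print(f"overall accuracy:    {overall:.4f}")
print(f"rare-class accuracy: {by_class['T2CEP']:.2f}")
```

In this toy case the overall accuracy stays near 1 even though half of the rare class is missed, which is why a separate rare-class figure is informative.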
- Asia > China (0.29)
- Oceania > Australia (0.14)
- North America > United States > New York (0.14)
- Energy > Oil & Gas (0.46)
- Information Technology (0.34)
ExoMiner++ on TESS with Transfer Learning from Kepler: Transit Classification and Vetting Catalog for 2-min Data
Valizadegan, Hamed, Martinho, Miguel J. S., Jenkins, Jon M., Twicken, Joseph D., Caldwell, Douglas A., Maynard, Patrick, Wei, Hongbo, Zhong, William, Yates, Charles, Donald, Sam, Collins, Karen A., Latham, David, Barkaoui, Khalid, Berlind, Perry, Calkins, Michael L., Carden, Kylee, Chazov, Nikita, Esquerdo, Gilbert A., Guillot, Tristan, Krushinsky, Vadim, Nowak, Grzegorz, Rackham, Benjamin V., Triaud, Amaury, Schwarz, Richard P., Stephens, Denise, Stockdale, Chris, Wang, Jiaqi, Watkins, Cristilyn N., Wilkin, Francis P.
We present ExoMiner++, an enhanced deep learning model that builds on the success of ExoMiner to improve transit signal classification in 2-minute TESS data. ExoMiner++ incorporates additional diagnostic inputs, including periodogram, flux trend, difference image, unfolded flux, and spacecraft attitude control data, all of which are crucial for effectively distinguishing transit signals from more challenging sources of false positives. To further enhance performance, we leverage transfer learning from high-quality labeled data from the Kepler space telescope, mitigating the impact of TESS's noisier and more ambiguous labels. ExoMiner++ achieves high accuracy across various classification and ranking metrics, significantly narrowing the search space for follow-up investigations to confirm new planets. To serve the exoplanet community, we introduce new TESS catalogs containing ExoMiner++ classifications and confidence scores for each transit signal. Among the 147,568 unlabeled TCEs, ExoMiner++ identifies 7,330 as planet candidates, with the remainder classified as false positives. These 7,330 planet candidates correspond to 1,868 existing TESS Objects of Interest (TOIs), 69 Community TESS Objects of Interest (CTOIs), and 50 newly introduced CTOIs. 1,797 out of the 2,506 TOIs previously labeled as planet candidates in ExoFOP are classified as planet candidates by ExoMiner++. This reduction in plausible candidates combined with the excellent ranking quality of ExoMiner++ allows the follow-up efforts to be focused on the most likely candidates, increasing the overall planet yield.
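The counts quoted in this abstract imply a few ratios that are easy to check. The short sketch below only redoes that arithmetic; the variable names are illustrative and do not come from the ExoMiner++ code.

```python
# Counts quoted in the abstract above.
unlabeled_tces = 147_568      # unlabeled threshold crossing events
predicted_pcs = 7_330         # TCEs classified as planet candidates
tois_labeled_pc = 2_506       # TOIs previously labeled planet candidates in ExoFOP
tois_kept_pc = 1_797          # of those, classified as planet candidates

pc_fraction = predicted_pcs / unlabeled_tces
toi_agreement = tois_kept_pc / tois_labeled_pc
objects_covered = 1_868 + 69 + 50  # existing TOIs + existing CTOIs + new CTOIs

print(f"share of TCEs kept as candidates: {pc_fraction:.1%}")
print(f"agreement with ExoFOP PC labels:  {toi_agreement:.1%}")
print(f"objects behind the 7,330 TCEs:    {objects_covered}")
```

Roughly one TCE in twenty survives as a planet candidate, which is the sense in which the follow-up search space narrows.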
- North America > United States > California > Alameda County > Berkeley (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Oceania > Australia (0.04)
- Research Report > New Finding (0.45)
- Research Report > Experimental Study (0.45)
- Government > Space Agency (0.68)
- Government > Regional Government > North America Government > United States Government (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
The Detection of KIC 1718360, A Rotating Variable with a Possible Companion, Using Machine Learning
This paper presents the detection of a periodic dimming event in the lightcurve of the G1.5IV-V type star KIC 1718360, based on visible-light observations conducted by both the TESS and Kepler space telescopes. Analysis of the data seems to point toward a high rotation rate in the star, with a rotational period of 2.938 days. The high variability seen within the star's lightcurve points toward classification as a rotating variable. The initial observation was made in Kepler Quarter 16 data using the One-Class SVM machine learning method. Subsequent observations by the TESS space telescope corroborated these findings. KIC 1718360 appears to be a nearby rotating variable that is listed as such in few, if any, major catalogs. A secondary periodic dip is also present, indicating a possible exoplanetary companion.
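One common way to inspect a periodic signal like this is to phase-fold the light curve at the candidate period and average the flux in phase bins. The sketch below does that for the 2.938-day period quoted above. It uses a synthetic sinusoidal light curve rather than real KIC 1718360 data, and the function names are hypothetical.

```python
import math

PERIOD_DAYS = 2.938  # rotational period reported in the abstract

def phase_fold(times, period, epoch=0.0):
    """Map each timestamp to a phase in [0, 1) for the given period."""
    return [((t - epoch) / period) % 1.0 for t in times]

def binned_mean(phases, fluxes, n_bins=10):
    """Average the flux inside equal-width phase bins."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for ph, fx in zip(phases, fluxes):
        b = min(int(ph * n_bins), n_bins - 1)
        sums[b] += fx
        counts[b] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]

# Synthetic light curve (illustrative only): 30 days sampled every 0.02 d,
# with a 1% sinusoidal modulation at the quoted period.
times = [0.02 * i for i in range(1500)]
fluxes = [1.0 + 0.01 * math.sin(2 * math.pi * t / PERIOD_DAYS) for t in times]

phases = phase_fold(times, PERIOD_DAYS)
profile = binned_mean(phases, fluxes)
dimmest_bin = min(range(len(profile)), key=lambda i: profile[i])
print("dimmest phase bin:", dimmest_bin)
```

Folding at the correct period stacks every cycle on top of the others, so a repeating dip lands in the same phase bin each time.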
LCEN: A Novel Feature Selection Algorithm for Nonlinear, Interpretable Machine Learning Models
Seber, Pedro, Braatz, Richard D.
Interpretable architectures can have advantages over black-box architectures, and interpretability is essential for the application of machine learning in critical settings, such as aviation or medicine. However, the simplest, most commonly used interpretable architectures, such as LASSO or elastic net (EN), are limited to linear predictions and have poor feature selection capabilities. In this work, we introduce the LASSO-Clip-EN (LCEN) algorithm for the creation of nonlinear, interpretable machine learning models. LCEN is tested on a wide variety of artificial and empirical datasets, frequently creating more accurate, sparser models than other architectures, including those for building sparse, nonlinear models. LCEN is robust against many issues typically present in datasets and modeling, including noise, multicollinearity, data scarcity, and hyperparameter variance. LCEN is also able to rediscover multiple physical laws from empirical data and, for processes with no known physical laws, LCEN achieves better results than many other dense and sparse methods -- including using 10.8-fold fewer features than dense methods and 8.1-fold fewer features than EN on one dataset, and is comparable to or better than ANNs on multiple datasets.
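The abstract names the stages (LASSO, clip, elastic net) without detailing them. As a rough illustration only, and assuming the clip stage zeroes coefficients whose magnitude falls below a cutoff after a LASSO fit, the sketch below runs a coordinate-descent LASSO on a toy dataset and then clips. The data, `alpha`, and `cutoff` are invented for the example, and the final elastic-net refit is omitted. This is not the authors' implementation.

```python
def soft_threshold(rho, alpha):
    """Proximal operator of the L1 penalty."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0

def lasso_cd(X, y, alpha, n_iter=50):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0
            z = 0.0
            for i in range(n):
                # Prediction using every feature except j.
                pred_wo_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_wo_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n)
    return w

def clip(weights, cutoff):
    """Zero out coefficients smaller in magnitude than the cutoff."""
    return [wj if abs(wj) >= cutoff else 0.0 for wj in weights]

# Toy data with orthogonal columns: y = 3*x0 + 0.05*x1, x2 is irrelevant.
X = [[1, 1, 1], [-1, 1, -1], [1, -1, -1], [-1, -1, 1]]
y = [3 * r[0] + 0.05 * r[1] for r in X]

w = lasso_cd(X, y, alpha=0.01)
w_clipped = clip(w, cutoff=0.1)
selected = [j for j, wj in enumerate(w_clipped) if wj != 0.0]
print("LASSO weights:", [round(v, 4) for v in w])
print("selected features after clipping:", selected)
```

Here LASSO alone leaves a small nonzero weight on the weak feature, and the clip step removes it, leaving a sparser model.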
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > New York (0.04)
- Oceania > Australia > Tasmania (0.04)
- North America > United States > Michigan > Macomb County > Warren (0.04)
- Energy (0.93)
- Transportation > Air (0.54)
Single Transit Detection In Kepler With Machine Learning And Onboard Spacecraft Diagnostics
Hansen, Matthew T., Dittmann, Jason A.
ABSTRACT Exoplanet discovery at long orbital periods requires reliably detecting individual transits without additional information about the system. Techniques like phase-folding of light curves and periodogram analysis of radial velocity data are more sensitive to planets with shorter orbital periods, leaving a dearth of planet discoveries at long periods. We present a novel technique using an ensemble of Convolutional Neural Networks incorporating the onboard spacecraft diagnostics of Kepler to classify transits within a light curve. We create a pipeline to recover the location of individual transits, and the period of the orbiting planet, which maintains > 80% transit recovery sensitivity out to an 800-day orbital period. Our neural network pipeline has the potential to discover additional planets in the Kepler dataset, and crucially, within the η-Earth regime. We report our first candidate from this pipeline, KOI 1271.02. KOI 1271.01 is known to exhibit strong Transit Timing Variations (TTVs), and so we jointly model the TTVs and transits of both transiting planets to constrain the orbital configuration and planetary parameters and conclude with a series of potential parameters for KOI 1271.02, as there is not enough data currently to uniquely constrain the system. We conclude that KOI 1271.02 has a radius of 5.32 ± 0.20 R [...]
INTRODUCTION Since the discovery of the first exoplanets, there has been a rapid increase in the number of exoplanets discovered (Wolszczan & Frail 1992; Mayor & Queloz 1995; Charbonneau et al. 2000). With the discovery of more exoplanets, it became possible to perform demographic studies of exoplanets and dissect the population along other axes (such as stellar metallicity, for example). Of particular interest is the occurrence rate of Earth-like planets around Sun-like stars (i.e., η-Earth) (Fressin et al. 2013; Catanzarite & Shao 2011; Petigura et al. 2013; Foreman-Mackey et al. 2014; Farr et al. 2014; Silburt et al. 2015; Burke et al. 2015; Traub 2015; Garrett et al. 2018; Mulders et al. 2018; Hsu et al. [...]
[...] studies to measure masses and potentially detect their atmospheric composition. Thousands of confirmed planets and thousands of more planet candidate signals have been found within the Kepler field of view (Borucki et al. 2011; Batalha et al. 2013; Thompson et al. 2018; Morton et al. 2016) as well as within the current TESS sample Guerrero et al. (2021). These discoveries have enabled statistical [...] observed roughly 150,000 stars photometrically during its main mission Borucki et al. (2010). Kepler continued to observe the sky after two of its reaction wheels broke as the K2 mission Howell et al. (2014). Kepler was a statistical mission aimed at finding the frequency of Earth-like planets around Sun-like stars, η-Earth.
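The reason phase-folding favors short periods can be seen by counting how many transits fit inside the observing baseline. The sketch below assumes a continuous baseline of roughly four years for Kepler's main mission and ignores data gaps. Both the baseline value and the `expected_transits` helper are assumptions made for illustration.

```python
import math

def expected_transits(period_days, baseline_days):
    """Range of transit counts inside a continuous baseline.

    A planet transits once per orbit, so a baseline of length T holds either
    floor(T / P) or floor(T / P) + 1 transits, depending on the orbital phase
    at the start of observations.
    """
    n = math.floor(baseline_days / period_days)
    return n, n + 1

BASELINE_DAYS = 4 * 365.25  # assumption: about four years of continuous data

for period in (10, 100, 800):
    low, high = expected_transits(period, BASELINE_DAYS)
    print(f"P = {period:>3} d: {low} to {high} transits")
```

With only one or two events available at an 800-day period, there is little to fold, which is why a detector that works on individual transits matters in this regime.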
- Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
- North America > United States > Florida > Alachua County > Gainesville (0.04)
- Asia > Singapore (0.05)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Massachusetts > Middlesex County > Reading (0.04)
Kepler: Robust Learning for Faster Parametric Query Optimization
Doshi, Lyric, Zhuang, Vincent, Jain, Gaurav, Marcus, Ryan, Huang, Haoyu, Altinbüken, Deniz, Brevdo, Eugene, Fraser, Campbell
Most existing parametric query optimization (PQO) techniques rely on traditional query optimizer cost models, which are often inaccurate and result in suboptimal query performance. We propose Kepler, an end-to-end learning-based approach to PQO that demonstrates significant speedups in query latency over a traditional query optimizer. Central to our method is Row Count Evolution (RCE), a novel plan generation algorithm based on perturbations in the sub-plan cardinality space. While previous approaches require accurate cost models, we bypass this requirement by evaluating candidate plans via actual execution data and training an ML model to predict the fastest plan given parameter binding values. Our models leverage recent advances in neural network uncertainty in order to robustly predict faster plans while avoiding regressions in query performance. Experimentally, we show that Kepler achieves significant improvements in query runtime on multiple datasets on PostgreSQL.
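The idea of labeling plans from execution data and guarding against regressions can be sketched in a few lines. Everything below is hypothetical: the plan names, latencies, confidence threshold, and the rule of falling back to the default plan when the model is unsure are assumptions for illustration, not Kepler's actual interface or policy.

```python
def label_fastest_plans(executions):
    """executions: {parameter binding: {plan id: measured latency in ms}}.

    Returns the plan with the lowest measured latency for each binding,
    which can serve as a training label for a plan-prediction model.
    """
    return {
        binding: min(latencies, key=latencies.get)
        for binding, latencies in executions.items()
    }

def choose_plan(predicted_plan, confidence, default_plan, threshold=0.9):
    """Use the predicted plan only when the model is confident enough."""
    return predicted_plan if confidence >= threshold else default_plan

# Made-up execution data for two parameter bindings of one query template.
executions = {
    ("US", 2020): {"default": 120.0, "plan_a": 35.0, "plan_b": 300.0},
    ("FR", 2021): {"default": 40.0, "plan_a": 95.0, "plan_b": 42.0},
}

labels = label_fastest_plans(executions)
print(labels)
print(choose_plan("plan_a", confidence=0.95, default_plan="default"))
print(choose_plan("plan_a", confidence=0.50, default_plan="default"))
```

Because the labels come from measured latencies rather than estimated costs, an inaccurate cost model never enters the training signal.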
AI Hilbert: A New Paradigm for Scientific Discovery by Unifying Data and Background Knowledge
Cory-Wright, Ryan, Khadir, Bachir El, Cornelio, Cristina, Dash, Sanjeeb, Horesh, Lior
The discovery of scientific formulae that parsimoniously explain natural phenomena and align with existing background theory is a key goal in science. Historically, scientists have derived natural laws by manipulating equations based on existing knowledge, forming new equations, and verifying them experimentally. In recent years, data-driven scientific discovery has emerged as a viable competitor in settings with large amounts of experimental data. Unfortunately, data-driven methods often fail to discover valid laws when data is noisy or scarce. Accordingly, recent works combine regression and reasoning to eliminate formulae inconsistent with background theory. However, the problem of searching over the space of formulae consistent with background theory to find one that fits the data best is not well-solved. We propose a solution to this problem when all axioms and scientific laws are expressible via polynomial equalities and inequalities and argue that our approach is widely applicable. We further model notions of minimal complexity using binary variables and logical constraints, solve polynomial optimization problems via mixed-integer linear or semidefinite optimization, and prove the validity of our scientific discoveries in a principled manner using Positivestellensatz certificates. Remarkably, the optimization techniques leveraged in this paper allow our approach to run in polynomial time with fully correct background theory, or non-deterministic polynomial (NP) time with partially correct background theory. We demonstrate that some famous scientific laws, including Kepler's Third Law of Planetary Motion, the Hagen-Poiseuille Equation, and the Radiated Gravitational Wave Power equation, can be derived in a principled manner from background axioms and experimental data.
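For orientation, Kepler's Third Law in its simplest form says that the square of the orbital period scales with the cube of the semi-major axis, so a log-log fit of period against axis should have a slope of 3/2. The sketch below checks this with an ordinary least-squares fit on rounded textbook values for five planets. It is a plain data check, not the polynomial-optimization method described above, and it uses the simplified form that neglects the planet's mass.

```python
import math

# (semi-major axis in AU, orbital period in years); rounded textbook values.
PLANETS = {
    "Mercury": (0.387, 0.2408),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
}

def loglog_slope(pairs):
    """Least-squares slope of log(period) against log(semi-major axis)."""
    pairs = list(pairs)
    xs = [math.log(a) for a, _ in pairs]
    ys = [math.log(p) for _, p in pairs]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

slope = loglog_slope(PLANETS.values())
print(f"fitted exponent: {slope:.3f} (Kepler's Third Law predicts 1.5)")
```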
- North America > United States (0.46)
- Oceania (0.14)
- Europe > United Kingdom > England (0.14)
- Europe > Germany > Berlin (0.14)
- Government (0.93)
- Law (0.68)
- Energy > Oil & Gas (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Identification and Classification of Exoplanets Using Machine Learning Techniques
NASA's Kepler Space Telescope has been instrumental in finding exoplanets in our galaxy. This search has been supported by computational data analysis to identify exoplanets from the signals received by the Kepler telescope. In this paper, we build upon existing work on exoplanet identification using residual networks for data from the Kepler space telescope and its extended mission K2. This paper aims to explore how deep learning algorithms can help in classifying the presence of exoplanets with a smaller amount of data in one case and a more extensive variety of data in another. In addition to the standard CNN-based method, we propose a Siamese architecture that is particularly useful in addressing classification in a low-data scenario. The CNN and ResNet algorithms achieved an average accuracy of 68% for three classes and 86% for two-class classification. However, for both the three and two classes, the Siamese algorithm achieved 99% accuracy.
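A Siamese setup passes two inputs through the same embedding network and compares the results, which lets a new sample be classified against a handful of labeled examples. The abstract does not say how the comparison is done, so the sketch below is an assumption-laden illustration: `embed` is an identity placeholder standing in for a trained twin network, the feature vectors and labels are invented, and classification is by nearest labeled example.

```python
import math

def embed(x):
    """Placeholder for the shared (twin) network; identity here, by assumption."""
    return x

def distance(a, b):
    """Euclidean distance between the two embeddings."""
    return math.dist(embed(a), embed(b))

def classify(query, support):
    """Return the label of the labeled example closest to the query.

    support: list of (feature vector, label) pairs, a few per class.
    """
    return min(support, key=lambda pair: distance(query, pair[0]))[1]

# Invented two-dimensional features for a handful of labeled light curves.
support = [
    ((0.9, 0.1), "planet"),
    ((0.8, 0.2), "planet"),
    ((0.1, 0.9), "false positive"),
]

print(classify((0.85, 0.15), support))
print(classify((0.2, 0.7), support))
```

Because the comparison only needs a few reference examples per class, this style of classifier suits the low-data scenario the abstract describes.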