Moldovan, Dan
Identifying Exoplanets with Deep Learning. V. Improved Light Curve Classification for TESS Full Frame Image Observations
Tey, Evan, Moldovan, Dan, Kunimoto, Michelle, Huang, Chelsea X., Shporer, Avi, Daylan, Tansu, Muthukrishna, Daniel, Vanderburg, Andrew, Dattilo, Anne, Ricker, George R., Seager, S.
The TESS mission produces a large amount of time series data, only a small fraction of which contains detectable exoplanetary transit signals. Deep learning techniques such as neural networks have proved effective at differentiating promising astrophysical eclipsing candidates from other phenomena such as stellar variability and systematic instrumental effects in an efficient, unbiased and sustainable manner. This paper presents a high-quality dataset containing light curves from the Primary Mission and 1st Extended Mission full frame images and periodic signals detected via Box Least Squares (Kovács et al. 2002; Hartman 2012). The dataset was curated using a thorough manual review process and then used to train a neural network called Astronet-Triage-v2. On our test set, for transiting/eclipsing events we achieve a 99.6% recall (true positives over all data with positive labels) at a precision of 75.7% (true positives over all predicted positives). Since 90% of our training data is from the Primary Mission, we also test our ability to generalize on held-out 1st Extended Mission data. Here, we find an area under the precision-recall curve of 0.965, a 4% improvement over Astronet-Triage (Yu et al. 2019). On the TESS Object of Interest (TOI) Catalog through April 2022, a shortlist of planets and planet candidates, Astronet-Triage-v2 recovers 3577 out of 4140 TOIs, while Astronet-Triage recovers only 3349 targets at an equal level of precision. In other words, upgrading to Astronet-Triage-v2 saves at least 200 planet candidates from being lost. The new model is currently used for planet candidate triage in the Quick-Look Pipeline (Huang et al. 2020a,b; Kunimoto et al. 2021).
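As a minimal sketch of the evaluation described above, the precision-recall quantities can be computed with scikit-learn; the labels and scores below are synthetic stand-ins, and the 75.7% operating point is used only to illustrate the procedure, not to reproduce the paper's results.

```python
# Hypothetical evaluation sketch: y_true / y_score stand in for test-set
# labels and classifier scores; they are not the paper's data.
import numpy as np
from sklearn.metrics import auc, precision_recall_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)  # 1 = transiting/eclipsing event
y_score = np.clip(0.7 * y_true + rng.normal(0.3, 0.2, size=1000), 0.0, 1.0)

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
print("area under the precision-recall curve:", auc(recall, precision))

# Best recall among operating points whose precision is at least 75.7%:
mask = precision[:-1] >= 0.757
if mask.any():
    print("recall at >= 75.7% precision:", recall[:-1][mask].max())
```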
Underspecification Presents Challenges for Credibility in Modern Machine Learning
D'Amour, Alexander, Heller, Katherine, Moldovan, Dan, Adlam, Ben, Alipanahi, Babak, Beutel, Alex, Chen, Christina, Deaton, Jonathan, Eisenstein, Jacob, Hoffman, Matthew D., Hormozdiari, Farhad, Houlsby, Neil, Hou, Shaobo, Jerfel, Ghassen, Karthikesalingam, Alan, Lucic, Mario, Ma, Yian, McLean, Cory, Mincu, Diana, Mitani, Akinori, Montanari, Andrea, Nado, Zachary, Natarajan, Vivek, Nielson, Christopher, Osborne, Thomas F., Raman, Rajiv, Ramasamy, Kim, Sayres, Rory, Schrouff, Jessica, Seneviratne, Martin, Sequeira, Shannon, Suresh, Harini, Veitch, Victor, Vladymyrov, Max, Wang, Xuezhi, Webster, Kellie, Yadlowsky, Steve, Yun, Taedong, Zhai, Xiaohua, Sculley, D.
ML models often exhibit unexpectedly poor behavior when they are deployed in real-world domains. We identify underspecification as a key reason for these failures. An ML pipeline is underspecified when it can return many predictors with equivalently strong held-out performance in the training domain. Underspecification is common in modern ML pipelines, such as those based on deep learning. Predictors returned by underspecified pipelines are often treated as equivalent based on their training domain performance, but we show here that such predictors can behave very differently in deployment domains. This ambiguity can lead to instability and poor model behavior in practice, and is a distinct failure mode from previously identified issues arising from structural mismatch between training and deployment domains. We show that this problem appears in a wide variety of practical ML pipelines, using examples from computer vision, medical imaging, natural language processing, clinical risk prediction based on electronic health records, and medical genomics. Our results show the need to explicitly account for underspecification in modeling pipelines that are intended for real-world deployment in any domain.
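The core failure mode lends itself to a small demonstration. The sketch below is a synthetic setup, not one of the paper's case studies: several identically specified pipelines, differing only in random seed, reach near-identical held-out accuracy in the training domain yet disagree noticeably once the inputs shift.

```python
# Synthetic illustration of underspecification: equal validation accuracy,
# divergent behavior under a crude "deployment" shift.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)
X_shifted = X_val + np.random.default_rng(1).normal(0, 1.5, X_val.shape)

preds = []
for seed in range(5):
    model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000,
                          random_state=seed)
    model.fit(X_train, y_train)
    print(f"seed {seed}: validation accuracy = {model.score(X_val, y_val):.3f}")
    preds.append(model.predict(X_shifted))

# Fraction of shifted examples on which the seeds do not all agree:
preds = np.array(preds)
print("disagreement rate under shift:",
      (preds != preds[0]).any(axis=0).mean())
```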
Classification of Semantic Relations between Pairs of Nominals Using Transfer Learning
Zhang, Linrui (University of Texas at Dallas) | Moldovan, Dan (University of Texas at Dallas)
The representation of the semantic meaning of sentences using neural networks has recently gained popularity, since there is no need to explicitly extract lexical, syntactic, and semantic features. A major problem with this approach is that it requires large human-annotated corpora. To reduce the annotation effort, researchers have in recent years made several attempts to find universal sentence representation methods, aiming to obtain general-purpose sentence embeddings that can be applied to a wide range of NLP tasks without training directly on task-specific datasets. InferSent, a supervised universal sentence representation model proposed by Facebook Research, implements 8 popular neural network sentence encoding structures trained on natural language inference datasets and applies them to 12 different NLP tasks. However, relation classification was not among them. In this paper, we re-train these 8 sentence encoding structures and use them as starting points for the relation classification task. Experiments on the SemEval-2010 datasets show that our models achieve results comparable to state-of-the-art relation classification systems.
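A minimal sketch of this transfer-learning recipe in PyTorch: `MeanPoolEncoder` is a trivial stand-in for a pretrained NLI sentence encoder (e.g., one of the InferSent encoding structures), only the linear head is specific to relation classification, and the 19-way output is chosen to match the SemEval-2010 Task 8 label set. Dimensions are illustrative.

```python
import torch
import torch.nn as nn

class MeanPoolEncoder(nn.Module):
    """Trivial stand-in for a sentence encoder pretrained on NLI data."""
    def forward(self, token_embeddings):       # (batch, seq_len, emb_dim)
        return token_embeddings.mean(dim=1)    # (batch, emb_dim)

class RelationClassifier(nn.Module):
    def __init__(self, encoder, encoder_dim, n_relations):
        super().__init__()
        self.encoder = encoder                 # weights come from NLI pre-training
        self.head = nn.Linear(encoder_dim, n_relations)

    def forward(self, token_embeddings):
        return self.head(self.encoder(token_embeddings))  # relation logits

model = RelationClassifier(MeanPoolEncoder(), encoder_dim=300, n_relations=19)
logits = model(torch.randn(8, 12, 300))        # 8 sentences, 12 tokens each
print(logits.shape)                            # torch.Size([8, 19])
```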
Tangent: Automatic differentiation using source-code transformation for dynamically typed array programming
van Merriënboer, Bart, Moldovan, Dan, Wiltschko, Alexander
The need to efficiently calculate first- and higher-order derivatives of increasingly complex models expressed in Python has stressed or exceeded the capabilities of available tools. In this work, we explore techniques from the field of automatic differentiation (AD) that can give researchers expressive power, performance and strong usability. These include source-code transformation (SCT), flexible gradient surgery, efficient in-place array operations, and higher-order derivatives. We implement and demonstrate these ideas in the Tangent software library for Python, the first AD framework for a dynamic language that uses SCT.
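For reference, Tangent's interface is a single call, roughly as shown in the library's README: `tangent.grad` takes a plain Python/NumPy function and returns a new Python function, generated by source-code transformation, that computes its derivative.

```python
import tangent

def f(x):
    return x * x

df = tangent.grad(f)  # SCT generates a new Python function for df/dx
print(df(3.0))        # 6.0
```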
AutoGraph: Imperative-style Coding with Graph-based Performance
Moldovan, Dan, Decker, James M, Wang, Fei, Johnson, Andrew A, Lee, Brian K, Nado, Zachary, Sculley, D, Rompf, Tiark, Wiltschko, Alexander B
There is a perceived trade-off between machine learning code that is easy to write, and machine learning code that is scalable or fast to execute. In machine learning, imperative style libraries like Autograd and PyTorch are easy to write, but suffer from high interpretive overhead and are not easily deployable in production or mobile settings. Graph-based libraries like TensorFlow and Theano benefit from whole-program optimization and can be deployed broadly, but make expressing complex models more cumbersome. We describe how the use of staged programming in Python, via source code transformation, offers a midpoint between these two library design patterns, capturing the benefits of both. A key insight is to delay all type-dependent decisions until runtime, via dynamic dispatch. We instantiate these principles in AutoGraph, a software system that improves the programming experience of the TensorFlow library, and demonstrate usability improvements with no loss in performance compared to native TensorFlow graphs. We also show that our system is backend agnostic, and demonstrate targeting an alternate IR with characteristics not found in TensorFlow graphs.
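As a sketch of the idea using the AutoGraph functionality shipped in TensorFlow 2 (the example function is ours, not drawn from the paper): imperative Python with data-dependent control flow is rewritten, at the source level, into code that stages graph operations.

```python
import tensorflow as tf

def collatz_steps(x):
    steps = tf.constant(0)
    while x > 1:          # data-dependent loop: becomes tf.while_loop in graph mode
        if x % 2 == 0:    # data-dependent branch: becomes tf.cond
            x = x // 2
        else:
            x = 3 * x + 1
        steps += 1
    return steps

graph_fn = tf.function(collatz_steps)        # AutoGraph rewrites the control flow
print(graph_fn(tf.constant(27)))             # tf.Tensor(111, ...)
print(tf.autograph.to_code(collatz_steps))   # inspect the generated source
```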
Tangent: Automatic differentiation using source-code transformation for dynamically typed array programming
van Merriënboer, Bart, Moldovan, Dan, Wiltschko, Alexander B
The need to efficiently calculate first- and higher-order derivatives of increasingly complex models expressed in Python has stressed or exceeded the capabilities of available tools. In this work, we explore techniques from the field of automatic differentiation (AD) that can give researchers expressive power, performance and strong usability. These include source-code transformation (SCT), flexible gradient surgery, efficient in-place array operations, higher-order derivatives as well as mixing of forward and reverse mode AD. We implement and demonstrate these ideas in the Tangent software library for Python, the first AD framework for a dynamic language that uses SCT.
Chinese Relation Classification via Convolutional Neural Networks
Zhang, Linrui (The University of Texas at Dallas) | Moldovan, Dan (The University of Texas at Dallas)
Relation classification is an important task in natural language processing. Traditional relation classification techniques rely heavily on hand-crafted linguistic features and external toolkits. In recent years, deep learning models that automatically learn features from text have played an increasingly essential role in this area. In this paper, we present a novel convolutional neural network (CNN) approach over shortest dependency paths (SDP) for Chinese relation classification. We first propose a baseline end-to-end model that takes only sentence-level features, and then improve its performance through the joint use of pre-extracted linguistic features. The system is evaluated on the Chinese portion of the ACE 2005 Multilingual Training Corpus. The baseline model achieved a 74.93% F-score on the six general-type relations and a 66.29% F-score on the eighteen subtype relations, and performance improved by 10.71% and 13.60%, respectively, when linguistic features were incorporated into the baseline system.
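A minimal sketch, in PyTorch, of a CNN over a shortest dependency path: the tokens on the SDP between the two entity mentions are embedded, a 1-D convolution slides over them, and max-pooling feeds a relation classifier. The dimensions, kernel size, and the six-class output (mirroring the six general-type relations) are illustrative, not the paper's actual hyperparameters.

```python
import torch
import torch.nn as nn

class SDPConvClassifier(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, n_filters=128, n_relations=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1)
        self.out = nn.Linear(n_filters, n_relations)

    def forward(self, sdp_token_ids):                    # (batch, path_len)
        x = self.embed(sdp_token_ids).transpose(1, 2)    # (batch, emb_dim, path_len)
        x = torch.relu(self.conv(x)).max(dim=2).values   # max-pool over the path
        return self.out(x)                               # relation logits

model = SDPConvClassifier(vocab_size=10000)
logits = model(torch.randint(0, 10000, (4, 7)))          # 4 paths of 7 tokens
print(logits.shape)                                      # torch.Size([4, 6])
```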
Tangent: Automatic Differentiation Using Source Code Transformation in Python
van Merriënboer, Bart, Wiltschko, Alexander B., Moldovan, Dan
Automatic differentiation (AD) is an essential primitive for machine learning programming systems. Tangent is a new library that performs AD using source code transformation (SCT) in Python. It takes numeric functions written in a syntactic subset of Python and NumPy as input, and generates new Python functions which calculate a derivative. This approach to automatic differentiation is different from existing packages popular in machine learning, such as TensorFlow and Autograd. Advantages are that Tangent generates gradient code in Python which is readable by the user, easy to understand and debug, and has no runtime overhead. Tangent also introduces abstractions for easily injecting logic into the generated gradient code, further improving usability.
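The readability claim is easy to inspect in practice: per the library's README, passing `verbose=1` to `tangent.grad` prints the source of the generated gradient function, which is ordinary Python.

```python
import tangent

def f(x):
    return x * x

df = tangent.grad(f, verbose=1)  # prints the generated gradient source
print(df(3.0))                   # 6.0
```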
A Logic Prover Approach to Predicting Textual Similarity
Blanco, Eduardo (Lymba Corporation) | Moldovan, Dan (Lymba Corporation)
This paper presents a logic prover approach to predicting textual similarity. Sentences are represented using three logic forms capturing different levels of knowledge, from only content words to semantic representations extracted with an existing semantic parser. A logic prover is used to find proofs and derive semantic features that are combined in a machine learning framework. Experimental results show that incorporating the semantic structure of sentences yields better results than simpler pairwise word similarity measures.
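A minimal sketch of the overall framework, under loudly hypothetical assumptions: `prover_features` is a placeholder for the prover-derived scores (one per logic form), and none of the function names, feature values, or the similarity scale below come from the paper. The point is only the shape of the pipeline: prover outputs become features for a standard learner.

```python
# Hypothetical end-to-end shape of the framework; prover_features is a stub.
from sklearn.linear_model import LinearRegression

def prover_features(sent_a, sent_b):
    """Placeholder: one proof-based score per logic form (content words,
    intermediate form, full semantic representation)."""
    return [0.8, 0.6, 0.5]   # illustrative scores in [0, 1]

pairs = [("A man plays guitar.", "Someone makes music."),
         ("A dog runs.", "The stock market fell.")]
gold_similarity = [4.2, 0.3]  # e.g., human ratings on a 0-5 scale

X = [prover_features(a, b) for a, b in pairs]
model = LinearRegression().fit(X, gold_similarity)
print(model.predict(X))
```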