Accuracy
Half-sibling regression meets exoplanet imaging: PSF modeling and subtraction using a flexible, domain knowledge-driven, causal framework
Gebhard, Timothy D., Bonse, Markus J., Quanz, Sascha P., Schölkopf, Bernhard
High-contrast imaging of exoplanets hinges on powerful post-processing methods to denoise the data and separate the signal of a companion from its host star, which is typically orders of magnitude brighter. Existing post-processing algorithms do not use all prior domain knowledge that is available about the problem. We propose a new method that builds on our understanding of the systematic noise and the causal structure of the data-generating process. Our algorithm is based on a modified version of half-sibling regression (HSR), a flexible denoising framework that combines ideas from the fields of machine learning and causality. We adapt the method to address the specific requirements of high-contrast exoplanet imaging data obtained in pupil tracking mode. The key idea is to estimate the systematic noise in a pixel by regressing the time series of this pixel onto a set of causally independent, signal-free predictor pixels. We use regularized linear models in this work; however, other (non-linear) models are also possible. In a second step, we demonstrate how the HSR framework allows us to incorporate observing conditions such as wind speed or air temperature as additional predictors. When we apply our method to four data sets from the VLT/NACO instrument, our algorithm provides a better false-positive fraction than PCA-based PSF subtraction, a popular baseline method in the field. Additionally, we find that the HSR-based method provides direct and accurate estimates for the contrast of the exoplanets without the need to insert artificial companions for calibration in the data sets. Finally, we present first evidence that using the observing conditions as additional predictors can improve the results. Our HSR-based method provides an alternative, flexible and promising approach to the challenge of modeling and subtracting the stellar PSF and systematic noise in exoplanet imaging data.
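The key idea in the abstract above (regress one pixel's time series onto signal-free predictor pixels, then subtract the prediction) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `hsr_residual`, the ridge strength `alpha`, and the synthetic data are assumptions made for illustration, and the predictor pixels are assumed to have been selected already.

```python
import numpy as np

def hsr_residual(target, predictors, alpha=1.0):
    """Ridge-regress a target time series onto predictor time series.

    Returns the (mean-centered) target minus the fitted systematic-noise
    estimate. `alpha` is a hypothetical regularization strength.
    """
    X = predictors - predictors.mean(axis=0)   # center each predictor pixel
    y = target - target.mean()                 # center the target pixel
    n_features = X.shape[1]
    # Closed-form ridge solution: (X^T X + alpha * I) w = X^T y
    weights = np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)
    noise_estimate = X @ weights
    return y - noise_estimate

# Synthetic example: three shared noise sources drive all pixels.
rng = np.random.default_rng(0)
n_frames = 200
systematic = rng.normal(size=(n_frames, 3))
predictors = systematic @ rng.normal(size=(3, 10)) + 0.01 * rng.normal(size=(n_frames, 10))

# The target pixel sees the same noise plus a short-lived signal.
signal = np.zeros(n_frames)
signal[90:110] = 1.0
target = systematic @ np.array([1.0, -2.0, 0.5]) + signal

residual = hsr_residual(target, predictors)
```

Because the predictors carry the shared noise but not the signal, the residual should be dominated by the signal. In real data the quality of the result depends on how the predictor pixels are chosen.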
Artificial Intelligence in Fracture Detection: A Systematic Review and Meta-Analysis
Patients with fractures are a common emergency presentation and may be misdiagnosed at radiologic imaging. An increasing number of studies apply artificial intelligence (AI) techniques to fracture detection as an adjunct to clinician diagnosis. The purpose of this study was to perform a systematic review and meta-analysis comparing the diagnostic performance in fracture detection between AI and clinicians in peer-reviewed publications and the gray literature (ie, articles published on preprint repositories). A search of multiple electronic databases between January 2018 and July 2020 (updated June 2021) was performed. It included any primary research studies that developed and/or validated AI for the purposes of fracture detection at any imaging modality and excluded studies that evaluated image segmentation algorithms. A hierarchical meta-analysis model was used to calculate pooled sensitivity and specificity.
Understanding Type-I and Type-II errors in hypothesis testing
We can all relate to wondering whether route A will take less time than route B, whether the average return on investment X is higher than on investment Y, or whether movie ABC is better than movie XYZ. In all these cases, we are testing hypotheses we have in our minds. Setting up hypotheses, proving or disproving them using data, and helping businesses make decisions is bread and butter for Data Scientists. Data Scientists often rely on probabilities to understand the likelihood of observing data by chance and use that to draw conclusions about a hypothesis. Hence, there is always a chance of making errors when drawing conclusions about our assumed hypothesis. The post below is written to provide an intuitive yet detailed explanation of the Type-I and Type-II errors that happen during statistical hypothesis testing.
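To make the two error types concrete, here is a small simulation sketch using only the Python standard library. It assumes a two-sided z-test with known standard deviation; the effect size, sample size, and number of trials are illustrative choices, not values from the post.

```python
import random
from statistics import NormalDist, fmean

def reject_null(sample, sigma=1.0, alpha=0.05):
    """Two-sided z-test of H0: mean == 0, assuming a known sigma."""
    z = fmean(sample) / (sigma / len(sample) ** 0.5)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical

def error_rates(true_effect=0.5, n=25, trials=2000, seed=0):
    """Estimate Type-I and Type-II error rates by simulation."""
    rng = random.Random(seed)
    # Type-I error: H0 is true (mean 0) but the test rejects it.
    false_rejections = sum(
        reject_null([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(trials)
    )
    # Type-II error: H0 is false (mean = true_effect) but the test keeps it.
    missed_effects = sum(
        not reject_null([rng.gauss(true_effect, 1.0) for _ in range(n)])
        for _ in range(trials)
    )
    return false_rejections / trials, missed_effects / trials

type1_rate, type2_rate = error_rates()
```

The Type-I rate should land near the chosen significance level of 0.05, while the Type-II rate depends on the effect size and sample size: a larger sample or a larger true effect lowers it.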
A deep learning framework for the detection and quantification of drusen and reticular pseudodrusen on optical coherence tomography
Schwartz, Roy, Khalid, Hagar, Liakopoulos, Sandra, Ouyang, Yanling, de Vente, Coen, González-Gonzalo, Cristina, Lee, Aaron Y., Guymer, Robyn, Chew, Emily Y., Egan, Catherine, Wu, Zhichao, Kumar, Himeesh, Farrington, Joseph, Sánchez, Clara I., Tufail, Adnan
Purpose - To develop and validate a deep learning (DL) framework for the detection and quantification of drusen and reticular pseudodrusen (RPD) on optical coherence tomography scans. Design - Development and validation of deep learning models for classification and feature segmentation. Methods - A DL framework was developed consisting of a classification model and an out-of-distribution (OOD) detection model for the identification of ungradable scans; a classification model to identify scans with drusen or RPD; and an image segmentation model to independently segment lesions as RPD or drusen. Data were obtained from 1284 participants in the UK Biobank (UKBB) with a self-reported diagnosis of age-related macular degeneration (AMD) and 250 UKBB controls. Drusen and RPD were manually delineated by five retina specialists. The main outcome measures were sensitivity, specificity, area under the ROC curve (AUC), kappa, accuracy and intraclass correlation coefficient (ICC). Results - The classification models performed strongly at their respective tasks (0.95, 0.93, and 0.99 AUC, respectively, for the ungradable scans classifier, the OOD model, and the drusen and RPD classification model). The mean ICC for drusen and RPD area vs. graders was 0.74 and 0.61, respectively, compared with 0.69 and 0.68 for intergrader agreement. FROC curves showed that the model's sensitivity was close to human performance. Conclusions - The models achieved high classification and segmentation performance, similar to human performance. Application of this robust framework will further our understanding of RPD as a separate entity from drusen in both research and clinical settings.
FaceSigns: Semi-Fragile Neural Watermarks for Media Authentication and Countering Deepfakes
Neekhara, Paarth, Hussain, Shehzeen, Zhang, Xinqiao, Huang, Ke, McAuley, Julian, Koushanfar, Farinaz
Deepfakes and manipulated media are becoming a prominent threat due to the recent advances in realistic image and video synthesis techniques. There have been several attempts at combating Deepfakes using machine learning classifiers. However, such classifiers do not generalize well to black-box image synthesis techniques and have been shown to be vulnerable to adversarial examples. To address these challenges, we introduce a deep learning based semi-fragile watermarking technique that allows media authentication by verifying an invisible secret message embedded in the image pixels. Instead of identifying and detecting fake media using visual artifacts, we propose to proactively embed a semi-fragile watermark into a real image so that we can prove its authenticity when needed. Our watermarking framework is designed to be fragile to facial manipulations or tampering while being robust to benign image-processing operations such as image compression, scaling, saturation, contrast adjustments etc. This allows images shared over the internet to retain the verifiable watermark as long as face-swapping or any other Deepfake modification technique is not applied. We demonstrate that FaceSigns can embed a 128 bit secret as an imperceptible image watermark that can be recovered with a high bit recovery accuracy at several compression levels, while being non-recoverable when unseen Deepfake manipulations are applied. For a set of unseen benign and Deepfake manipulations studied in our work, FaceSigns can reliably detect manipulated content with an AUC score of 0.996 which is significantly higher than prior image watermarking and steganography techniques.
What is ROC curve and when to avoid it?
A logistic regression model can output the probability that a new sample belongs to the positive class (class 1); if that probability is above the threshold, you predict the positive class, otherwise the negative class. For example, if the threshold is set as low as 20%, more samples will be classified as positive, which increases recall and decreases precision. One trick that I started using is changing all the datatypes appropriately to reduce memory; this not only allows your computer to handle larger dataframes, but having the right datatypes also makes computation more efficient. To plot the ROC curve, we need to measure TPR and FPR at each threshold. To make predictions at different thresholds, instead of predicting labels we need the probabilities of new data belonging to each class, which can be obtained with predict_proba().
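The thresholding step can be sketched in plain Python. The labels and scores below are made up for illustration; in practice the scores would be the positive-class column returned by a classifier's predict_proba(). The helper name `roc_points` is hypothetical.

```python
def roc_points(y_true, scores, thresholds):
    """Return (threshold, FPR, TPR) for each threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = []
    for t in thresholds:
        # Predict the positive class when the score reaches the threshold.
        preds = [1 if s >= t else 0 for s in scores]
        tp = sum(1 for p, y in zip(preds, y_true) if p == 1 and y == 1)
        fp = sum(1 for p, y in zip(preds, y_true) if p == 1 and y == 0)
        points.append((t, fp / negatives, tp / positives))
    return points

# Illustrative data: 3 negatives and 3 positives.
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.25, 0.6]

points = roc_points(y_true, scores, thresholds=[0.2, 0.5, 0.8])
```

Lowering the threshold moves the point up and to the right on the ROC plot: more true positives are caught, at the cost of more false positives.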
Predicting Rain with Machine Learning
Let's read in all the data we have. Since all the region data share the same primary key, date, we can combine them with concat() in pandas and set the keys to the region names. I don't want the regions as the index, so we can reset the index and then rename some columns to get the data in the right shape. Let's first visualize our target class. It appears we have imbalanced classes on our hands, as the N label dominates the rest of the classes.
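The combine-and-reshape steps could look like the following sketch. The region names, the `RainTomorrow` column, and the tiny frames are hypothetical stand-ins for the real data, which the excerpt does not show.

```python
import pandas as pd

# Hypothetical per-region frames that share a "date" index.
dates = pd.date_range("2020-01-01", periods=2, name="date")
north = pd.DataFrame({"RainTomorrow": ["N", "Y"]}, index=dates)
south = pd.DataFrame({"RainTomorrow": ["N", "N"]}, index=dates)

# Stack the frames; `keys` adds the region name as an outer index level.
combined = pd.concat([north, south], keys=["north", "south"])

# Move both index levels into columns, then give them readable names.
combined = combined.reset_index().rename(
    columns={"level_0": "region", "RainTomorrow": "rain_tomorrow"}
)

# Count the labels to check for class imbalance.
class_counts = combined["rain_tomorrow"].value_counts()
```

Here `class_counts` shows how strongly one label dominates, which is the check to run before plotting the target class.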
Artificial Intelligence's Promise and Peril
John Quackenbush was frustrated with Google. It was January 2020, and a team led by researchers from Google Health had just published a study in Nature about an artificial intelligence (AI) system they had developed to analyze mammograms for signs of breast cancer. The system didn't just work, according to the study, it worked exceptionally well. When the team fed it two large sets of images to analyze--one from the UK and one from the U.S.--it reduced false positives by 1.2 and 5.7 percent and false negatives by 2.7 and 9.4 percent compared with the original determinations made by medical professionals. In a separate test that pitted the AI system against six board-certified radiologists in analyzing nearly 500 mammograms, the algorithm outperformed each of the specialists. The authors concluded that the system was "capable of surpassing human experts in breast cancer prediction" and ready for clinical trials. An avalanche of buzzy headlines soon followed. "Google AI system can beat doctors at detecting breast cancer," a CNN story declared.
Application of Machine Learning Algorithms to Predict AKI
Qiuchong Chen,1,* Yixue Zhang,1,* Mengjun Zhang,1 Ziying Li,1 Jindong Liu1,2 1Department of Anesthesiology, The Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, People's Republic of China; 2Jiangsu Province Key Laboratory of Anesthesiology, Xuzhou Medical University, Xuzhou, Jiangsu, People's Republic of China *These authors contributed equally to this work Correspondence: Jindong Liu, Department of Anesthesiology, The Affiliated Hospital of Xuzhou Medical University, 99 Huaihai Road West, Quanshan District, Xuzhou, Jiangsu, 221000, People's Republic of China, Email [email protected]
Objective: There has been a worldwide increase in the incidence of acute kidney injury (AKI) among elderly orthopedic surgery patients. AKI prediction models make it possible to detect patients at risk of AKI early; however, most AKI prediction models were derived from cardiothoracic surgery. The purpose of this study is to predict the risk of AKI in elderly patients after orthopedic surgery using machine learning models.
Methods: We conducted a retrospective study comprising 1000 patients with postoperative AKI undergoing orthopedic surgery from September 2016 to June 2021. They were divided into training (80%; n = 799) and test (20%; n = 201) sets. We applied nine machine learning (ML) algorithms to intraoperative information and preoperative clinical features to build models that predict AKI. Model performance was evaluated by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy. The optimal model was selected and a nomogram was established to visualize the prediction model. The concordance statistic (C-statistic) and a calibration curve were used to assess the discrimination and calibration of the nomogram, respectively.
Results: In predicting AKI, the nine ML algorithms posted AUCs of 0.656–1.000 in the training cohort, with the random forest standing out, and AUCs of 0.674–0.821 in the test cohort, with the logistic regression model standing out.
Machine Learning Approaches for Non-Intrusive Home Absence Detection Based on Appliance Electrical Use
Lentzas, Athanasios, Vrakas, Dimitris
Home absence detection is an emerging field in smart home installations. Identifying whether or not the residents of the house are present is important in numerous scenarios. Possible scenarios include, but are not limited to, elderly people living alone, people suffering from dementia, and home quarantine. The majority of published papers focus on either pressure or door sensors, or on cameras, to detect outing events. Although these approaches provide solid results, they are intrusive and require modifications for sensor placement. In our work, appliance electrical use is investigated as a means of detecting the presence or absence of residents. The energy use is the result of power disaggregation, a non-intrusive, non-invasive sensing method. Since a dataset providing energy data and ground truth for home absence is not available, artificial outing events were introduced into the UK-DALE dataset, a well-known dataset for Non-Intrusive Load Monitoring (NILM). Several machine learning algorithms were evaluated using the generated dataset. Benchmark results have shown that home absence detection using appliance power consumption is feasible.