Accuracy
Particle Transformer for Jet Tagging
Qu, Huilin, Li, Congqiao, Qian, Sitian
Jet tagging is a critical yet challenging classification task in particle physics. While deep learning has transformed jet tagging and significantly improved performance, the lack of a large-scale public dataset impedes further enhancement. In this work, we present JetClass, a new comprehensive dataset for jet tagging. The JetClass dataset consists of 100 M jets, about two orders of magnitude larger than existing public datasets. A total of 10 types of jets are simulated, including several types unexplored for tagging so far. Based on the large dataset, we propose a new Transformer-based architecture for jet tagging, called Particle Transformer (ParT). By incorporating pairwise particle interactions in the attention mechanism, ParT achieves higher tagging performance than a plain Transformer and surpasses the previous state-of-the-art, ParticleNet, by a large margin. The pre-trained ParT models, once fine-tuned, also substantially enhance the performance on two widely adopted jet tagging benchmarks. The dataset, code and models are publicly available at https://github.com/jet-universe/particle_transformer.
From Modeling to Scoring: Correcting Predicted Class Probabilities in Imbalanced Datasets
Model evaluation is an important part of a data science project: it is exactly this part that quantifies how good your model is, how much it has improved from the previous version, how much better it is than your colleague's model, and how much room for improvement there still is. It is not unusual in machine learning applications to deal with imbalanced datasets, in areas such as fraud detection, computer network intrusion, medical diagnostics, and many more. Data imbalance refers to an unequal distribution of classes within a dataset, namely that there are far fewer events in one class than in the others. If, for example, we have a credit card fraud detection dataset, most of the transactions are not fraudulent and very few can be classed as fraudulent. This underrepresented class is called the minority class and, by convention, the positive class.
Fair Generalized Linear Models with a Convex Penalty
Do, Hyungrok, Putzel, Preston, Martin, Axel, Smyth, Padhraic, Zhong, Judy
Despite recent advances in algorithmic fairness, methodologies for achieving fairness with generalized linear models (GLMs) have yet to be explored in general, despite GLMs being widely used in practice. In this paper we introduce two fairness criteria for GLMs based on equalizing expected outcomes or log-likelihoods. We prove that for GLMs both criteria can be achieved via a convex penalty term based solely on the linear components of the GLM, thus permitting efficient ... To address these issues there has recently been a significant body of work in the machine learning community on algorithmic fairness in the context of predictive modeling, including (i) data preprocessing methods that try to reduce disparities, (ii) in-process approaches which enforce fairness during model training, and (iii) post-process approaches which adjust a model's predictions to achieve fairness after training is completed. However, the majority of this work has focused on classification problems with binary outcome variables, and to a lesser extent on regression.
A machine-generated catalogue of Charon's craters and implications for the Kuiper belt
In this paper we investigate Charon's crater size distribution using a deep learning model. This is motivated by the recent results of Singer et al. (2019) who, using manual cataloging, found a change in the size distribution slope of craters smaller than 12 km in diameter, translating into a paucity of small Kuiper Belt objects. These results were corroborated by Robbins and Singer (2021), but opposed by Morbidelli et al. (2021), necessitating an independent review. Our MaskRCNN-based ensemble of models was trained on Lunar, Mercurian, and Martian crater catalogues and on both optical and digital elevation images. We use a robust image augmentation scheme to force the model to generalize and transfer-learn to icy objects. With no prior bias or exposure to Charon, our model finds best-fit slopes of q = -1.47 ± 0.33 for craters smaller than 10 km, and q = -2.91 ± 0.51 for craters larger than 15 km. These values indicate a clear change in slope around 15 km as suggested by Singer et al. (2019) and thus independently confirm their conclusions. Our slopes, however, are both slightly flatter than those found more recently by Robbins and Singer (2021). Our trained models and relevant code are available online at github.com/malidib/ACID.
The Mystery of ADASYN is Revealed
This research assumes that you are familiar with class imbalance and the ADASYN algorithm. We strongly encourage our readers to review the conference article that launched ADASYN (just type that into Google Scholar or see the References section of this document), and then read any number of articles in Towards Data Science that discuss class imbalance and ADASYN. This is neither a guide nor an overview; it is a voyage into uncharted waters with startling discoveries. The answers are 1) surprising, 2) fascinating, and 3) extraordinary, in that order. All models in this research were built using the RandomForest and LogisticRegression algorithms in the scikit-learn library to gain information about both tree and linear structures, respectively. All predictive models were 10-fold cross-validated with stratified sampling using "stratify=y" in train_test_split and "cv=10" in GridSearchCV.
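To make the idea of stratified sampling concrete, here is a minimal pure-Python sketch of what a stratified split does: it splits each class separately so that the minority class keeps roughly the same share in both parts. This is an illustration only, not scikit-learn's implementation; the function name `stratified_split`, the 90/10 toy labels, and the 20% test fraction are all hypothetical choices made for the example.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting each class separately
    so class proportions are roughly preserved in both parts."""
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical imbalanced labels: 90 negatives, 10 positives.
labels = [0] * 90 + [1] * 10
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)

train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print("train:", dict(train_counts))
print("test:", dict(test_counts))
```

With a plain random split, a small test set could easily end up with zero or very few minority samples; splitting per class avoids that, which is the reason stratification matters for imbalanced data.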
Detection of magnetohydrodynamic waves by using machine learning
Nonlinear wave interactions, such as shock refraction at an inclined density interface, in magnetohydrodynamics (MHD) lead to a plethora of wave patterns with myriad wave types. Identification of different types of MHD waves is an important and challenging task in such complex wave patterns. Moreover, owing to the multiplicity of solutions and their admissibility for different systems, especially for intermediate-type MHD shock waves, the identification of MHD wave types is complicated if one relies solely on the Rankine-Hugoniot jump conditions. MHD wave detection is further complicated by the unphysical smearing of discontinuous shock waves in numerical simulations. We present two MHD wave detection methods based on a convolutional neural network (CNN), which enables the classification of waves and identification of their locations. The first method separates the output into a regression (location prediction) problem and a classification problem, assuming the number of waves for each training sample is fixed. In the second method, the number of waves is not specified a priori and the algorithm, using only regression, predicts the waves' locations and classifies their types. The first, fixed-output model efficiently provides high precision and recall; the accuracy achieved by the entire neural network is up to 0.99, and the classification accuracy of some waves approaches unity. The second detection model has relatively lower performance, with more sensitivity to the setting of parameters such as the number of grid cells N_{grid} and the thresholds of confidence score and class probability. The two proposed methods demonstrate very strong potential to be applied for MHD wave detection in some complex wave structures and interactions.
A Novel Implementation of Machine Learning for the Efficient, Explainable Diagnosis of COVID-19 from Chest CT
In a worldwide health crisis as exigent as COVID-19, there is a pressing need for rapid, reliable diagnostics. Currently, popular testing methods such as reverse transcription polymerase chain reaction (RT-PCR) can have high false negative rates. Consequently, COVID-19 patients are neither accurately identified nor treated quickly enough to prevent transmission of the virus. However, the recent rise of medical CT data has presented promising avenues, since CT manifestations contain key characteristics indicative of COVID-19. This study aimed to take a novel approach in the machine learning-based detection of COVID-19 from chest CT scans. First, the dataset utilized in this study was derived from three major sources, comprising a total of 17,698 chest CT slices across 923 patient cases. Image preprocessing algorithms were then developed to reduce noise by excluding irrelevant features. Transfer learning was also implemented with the EfficientNetB7 pre-trained model to provide a backbone architecture and save computational resources. Lastly, several explainability techniques were leveraged to qualitatively validate model performance by localizing infected regions and highlighting fine-grained pixel details. The proposed model attained an overall accuracy of 0.927 and a sensitivity of 0.958. Explainability measures showed that the model correctly distinguished between relevant, critical features pertaining to COVID-19 chest CT images and normal controls. Deep learning frameworks provide efficient, human-interpretable COVID-19 diagnostics that could complement radiologist decisions or serve as an alternative screening tool. Future endeavors may provide insight into infection severity, patient risk stratification, and prognosis.
Python: Confusion Matrix
A confusion matrix is a supervised machine learning evaluation tool that provides more insight into the overall effectiveness of a machine learning classifier. Unlike a simple accuracy metric, which is calculated by dividing the number of correctly predicted records by the total number of records, confusion matrices return 4 unique metrics for you to work with. While I am not saying accuracy is always misleading, there are times, especially when working with examples of imbalanced data, that accuracy can be all but useless. Let's consider credit card fraud. It is not uncommon, given a list of credit card transactions, for fraud events to make up as little as 1 in 10,000 records.
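The fraud scenario above can be worked through in a few lines of plain Python. This is a minimal sketch, not taken from the article: the helper `confusion_counts`, the toy data (10,000 transactions with exactly one fraud), and the "never flags fraud" model are all hypothetical, chosen to match the 1-in-10,000 rate mentioned in the text.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count the four cells of a binary confusion matrix."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # fraud correctly flagged
        elif pred == positive:
            fp += 1  # legitimate transaction flagged as fraud
        elif truth == positive:
            fn += 1  # fraud that was missed
        else:
            tn += 1  # legitimate transaction correctly passed
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


# Hypothetical data: 10,000 transactions, exactly one of them fraudulent.
y_true = [0] * 9999 + [1]
# A useless model that never predicts fraud.
y_pred = [0] * 10000

counts = confusion_counts(y_true, y_pred)
accuracy = (counts["tp"] + counts["tn"]) / len(y_true)
recall = counts["tp"] / (counts["tp"] + counts["fn"])

print(counts)
print(f"accuracy = {accuracy:.4f}, recall = {recall:.4f}")
```

The model scores 99.99% accuracy while catching none of the fraud, which only becomes visible once the false negative count is separated out from the correct predictions.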
Cascade Watchdog: A Multi-tiered Adversarial Guard for Outlier Detection
Amigo, Glauco, Bui, Justin M., Baylis, Charles, Marks, Robert J.
The identification of out-of-distribution content is critical to the successful implementation of neural networks. Watchdog techniques have been developed to support the detection of these inputs, but their performance can be limited by the amount of available data. Generative adversarial networks have displayed numerous capabilities, including the ability to generate facsimiles with excellent accuracy. This paper presents and empirically evaluates a multi-tiered watchdog, developed using GAN-generated data, for improved out-of-distribution detection. The cascade watchdog uses adversarial training to increase the amount of available data similar to the out-of-distribution elements that are more difficult to detect. Then, a specialized second guard is added in sequential order. The results show a solid and significant improvement in the detection of the most challenging out-of-distribution inputs while preserving an extremely low false positive rate.
Crust Macrofracturing as the Evidence of the Last Deglaciation
Aleshin, Igor, Kholodkov, Kirill, Kozlovskaya, Elena, Malygin, Ivan
Machine learning methods were applied to reconsider the results of several passive seismic experiments in Finland. We created datasets from different stages of the receiver function technique and processed them with one of the basic machine learning algorithms. All the results were obtained uniformly with the $k$-nearest neighbors algorithm. The first result is the Moho depth map of the region. Another result is the delineation of the near-surface low $S$-wave velocity layer. There are three such areas, in the Northern, Southern, and central parts of the region. The low $S$-wave velocity in the Northern and Southern areas can be linked to the geological structure. However, we attribute the central low $S$-wave velocity area to a large number of water-saturated cracks in the upper 1-5 km. Analysis of the structure of this area leads us to the conclusion that macrofracturing was caused by the last deglaciation.