Accuracy
Graph Contrastive Topic Model
Luo, Zheheng, Liu, Lei, Xie, Qianqian, Ananiadou, Sophia
Existing NTMs with contrastive learning suffer from the sample bias problem owing to the word frequency-based sampling strategy, which may result in false negative samples with similar semantics to the prototypes. In this paper, we aim to explore the efficient sampling strategy and contrastive learning in NTMs to address the aforementioned issue. We propose a new sampling assumption that negative samples should contain words that are semantically irrelevant to the prototype. Based on it, we propose the graph contrastive topic model (GCTM), which conducts graph contrastive learning (GCL) using informative positive and negative samples that are generated by the graph-based sampling strategy leveraging in-depth correlation and irrelevance among documents and words. In GCTM, we first model the input document as the document word bipartite graph (DWBG), and construct positive and negative word co-occurrence graphs (WCGs), encoded by graph neural networks, to express in-depth semantic correlation and irrelevance among words. Based on the DWBG and WCGs, we design the document-word information propagation (DWIP) process to perform the edge perturbation of DWBG, based on multi-hop correlations/irrelevance among documents and words. This yields the desired negative and positive samples, which will be utilized for GCL together with the prototypes to improve learning document topic representations and latent topics. We further show that GCL can be interpreted as the structured variational graph auto-encoder which maximizes the mutual information of latent topic representations of different perspectives on DWBG. Experiments on several benchmark datasets demonstrate the effectiveness of our method for topic coherence and document representation learning compared with existing SOTA methods.
Fast and Multi-aspect Mining of Complex Time-stamped Event Streams
Nakamura, Kota, Matsubara, Yasuko, Kawabata, Koki, Umeda, Yuhei, Wada, Yuichiro, Sakurai, Yasushi
Given a huge, online stream of time-evolving events with multiple attributes, such as online shopping logs: (item, price, brand, time), and local mobility activities: (pick-up and drop-off locations, time), how can we summarize large, dynamic high-order tensor streams? How can we see any hidden patterns, rules, and anomalies? Our answer is to focus on two types of patterns, i.e., ''regimes'' and ''components'', for which we present CubeScope, an efficient and effective method over high-order tensor streams. Specifically, it identifies any sudden discontinuity and recognizes distinct dynamical patterns, ''regimes'' (e.g., weekday/weekend/holiday patterns). In each regime, it also performs multi-way summarization for all attributes (e.g., item, price, brand, and time) and discovers hidden ''components'' representing latent groups (e.g., item/brand groups) and their relationship. Thanks to its concise but effective summarization, CubeScope can also detect the sudden appearance of anomalies and identify the types of anomalies that occur in practice. Our proposed method has the following properties: (a) Effective: it captures dynamical multi-aspect patterns, i.e., regimes and components, and statistically summarizes all the events; (b) General: it is practical for successful application to data compression, pattern discovery, and anomaly detection on various types of tensor streams; (c) Scalable: our algorithm does not depend on the length of the data stream and its dimensionality. Extensive experiments on real datasets demonstrate that CubeScope finds meaningful patterns and anomalies correctly, and consistently outperforms the state-of-the-art methods as regards accuracy and execution speed.
Concept-Based Explanations to Test for False Causal Relationships Learned by Abusive Language Classifiers
Nejadgholi, Isar, Kiritchenko, Svetlana, Fraser, Kathleen C., Balkır, Esma
Classifiers tend to learn a false causal relationship between an over-represented concept and a label, which can result in over-reliance on the concept and compromised classification accuracy. It is imperative to have methods in place that can compare different models and identify over-reliances on specific concepts. We consider three well-known abusive language classifiers trained on large English datasets and focus on the concept of negative emotions, which is an important signal but should not be learned as a sufficient feature for the label of abuse. Motivated by the definition of global sufficiency, we first examine the unwanted dependencies learned by the classifiers by assessing their accuracy on a challenge set across all decision thresholds. Further, recognizing that a challenge set might not always be available, we introduce concept-based explanation metrics to assess the influence of the concept on the labels. These explanations allow us to compare classifiers regarding the degree of false global sufficiency they have learned between a concept and a label.
A hybrid machine learning framework for clad characteristics prediction in metal additive manufacturing
During the past decade, metal additive manufacturing (MAM) has experienced significant developments and gained much attention due to its ability to fabricate complex parts, manufacture products with functionally graded materials, minimize waste, and enable low-cost customization. Despite these advantages, predicting the impact of processing parameters on the characteristics of an MAM printed clad is challenging due to the complex nature of MAM processes. Machine learning (ML) techniques can help connect the physics underlying the process and processing parameters to the clad characteristics. In this study, we introduce a hybrid approach which involves utilizing the data provided by a calibrated multi-physics computational fluid dynamic (CFD) model and experimental research for preparing the essential big dataset, and then uses a comprehensive framework consisting of various ML models to predict and understand clad characteristics. We first compile an extensive dataset by fusing experimental data into the data generated using the developed CFD model for this study. This dataset comprises critical clad characteristics, including geometrical features such as width, height, and depth, labels identifying clad quality, and processing parameters. Second, we use two sets of processing parameters for training the ML models: machine setting parameters and physics-aware parameters, along with versatile ML models and reliable evaluation metrics to create a comprehensive and scalable learning framework for predicting clad geometry and quality. This framework can serve as a basis for clad characteristics control and process optimization. The framework resolves many challenges of conventional modeling methods in MAM by solving t the issue of data scarcity using a hybrid approach and introducing an efficient, accurate, and scalable platform for clad characteristics prediction and optimization.
Overconfidence is a Dangerous Thing: Mitigating Membership Inference Attacks by Enforcing Less Confident Prediction
Chen, Zitao, Pattabiraman, Karthik
Machine learning (ML) models are vulnerable to membership inference attacks (MIAs), which determine whether a given input is used for training the target model. While there have been many efforts to mitigate MIAs, they often suffer from limited privacy protection, large accuracy drop, and/or requiring additional data that may be difficult to acquire. This work proposes a defense technique, HAMP that can achieve both strong membership privacy and high accuracy, without requiring extra data. To mitigate MIAs in different forms, we observe that they can be unified as they all exploit the ML model's overconfidence in predicting training samples through different proxies. This motivates our design to enforce less confident prediction by the model, hence forcing the model to behave similarly on the training and testing samples. HAMP consists of a novel training framework with high-entropy soft labels and an entropy-based regularizer to constrain the model's prediction while still achieving high accuracy. To further reduce privacy risk, HAMP uniformly modifies all the prediction outputs to become low-confidence outputs while preserving the accuracy, which effectively obscures the differences between the prediction on members and non-members. We conduct extensive evaluation on five benchmark datasets, and show that HAMP provides consistently high accuracy and strong membership privacy. Our comparison with seven state-of-the-art defenses shows that HAMP achieves a superior privacy-utility trade off than those techniques.
Depth Functions for Partial Orders with a Descriptive Analysis of Machine Learning Algorithms
Blocher, Hannah, Schollmeyer, Georg, Jansen, Christoph, Nalenz, Malte
We propose a framework for descriptively analyzing sets of partial orders based on the concept of depth functions. Despite intensive studies of depth functions in linear and metric spaces, there is very little discussion on depth functions for non-standard data types such as partial orders. We introduce an adaptation of the well-known simplicial depth to the set of all partial orders, the union-free generic (ufg) depth. Moreover, we utilize our ufg depth for a comparison of machine learning algorithms based on multidimensional performance measures. Concretely, we analyze the distribution of different classifier performances over a sample of standard benchmark data sets. Our results promisingly demonstrate that our approach differs substantially from existing benchmarking approaches and, therefore, adds a new perspective to the vivid debate on the comparison of classifiers.
Robust Uncertainty Estimation for Classification of Maritime Objects
Becktor, Jonathan, Scholler, Frederik, Boukas, Evangelos, Nalpantidis, Lazaros
We explore the use of uncertainty estimation in the maritime domain, showing the efficacy on toy datasets (CIFAR10) and proving it on an in-house dataset, SHIPS. We present a method joining the intra-class uncertainty achieved using Monte Carlo Dropout, with recent discoveries in the field of outlier detection, to gain more holistic uncertainty measures. We explore the relationship between the introduced uncertainty measures and examine how well they work on CIFAR10 and in a real-life setting. Our work improves the FPR95 by 8% compared to the current highest-performing work when the models are trained without out-of-distribution data. We increase the performance by 77% compared to a vanilla implementation of the Wide ResNet. We release the SHIPS dataset and show the effectiveness of our method by improving the FPR95 by 44.2% with respect to the baseline. Our approach is model agnostic, easy to implement, and often does not require model retraining.
The ROAD to discovery: machine learning-driven anomaly detection in radio astronomy spectrograms
Mesarcik, Michael, Boonstra, Albert-Jan, Iacobelli, Marco, Ranguelova, Elena, de Laat, Cees, van Nieuwpoort, Rob
As radio telescopes increase in sensitivity and flexibility, so do their complexity and data-rates. For this reason automated system health management approaches are becoming increasingly critical to ensure nominal telescope operations. We propose a new machine learning anomaly detection framework for classifying both commonly occurring anomalies in radio telescopes as well as detecting unknown rare anomalies that the system has potentially not yet seen. To evaluate our method, we present a dataset consisting of 7050 autocorrelation-based spectrograms from the Low Frequency Array (LOFAR) telescope and assign 10 different labels relating to the system-wide anomalies from the perspective of telescope operators. This includes electronic failures, miscalibration, solar storms, network and compute hardware errors among many more. We demonstrate how a novel Self Supervised Learning (SSL) paradigm, that utilises both context prediction and reconstruction losses, is effective in learning normal behaviour of the LOFAR telescope. We present the Radio Observatory Anomaly Detector (ROAD), a framework that combines both SSL-based anomaly detection and a supervised classification, thereby enabling both classification of both commonly occurring anomalies and detection of unseen anomalies. We demonstrate that our system is real-time in the context of the LOFAR data processing pipeline, requiring <1ms to process a single spectrogram. Furthermore, ROAD obtains an anomaly detection F-2 score of 0.92 while maintaining a false positive rate of ~2\%, as well as a mean per-class classification F-2 score 0.89, outperforming other related works.
Automated identification and quantification of myocardial inflammatory infiltration in digital histological images to diagnose myocarditis
Liu, Yanyun, Hua, Xiumeng, Zhu, Shouping, Wang, Congrui, Chen, Xiao, Shi, Yu, Song, Jiangping, Zhou, Weihua
This study aims to develop a new computational pathology approach that automates the identification and quantification of myocardial inflammatory infiltration in digital HE-stained images to provide a quantitative histological diagnosis of myocarditis.898 HE-stained whole slide images (WSIs) of myocardium from 154 heart transplant patients diagnosed with myocarditis or dilated cardiomyopathy (DCM) were included in this study. An automated DL-based computational pathology approach was developed to identify nuclei and detect myocardial inflammatory infiltration, enabling the quantification of the lymphocyte nuclear density (LND) on myocardial WSIs. A cutoff value based on the quantification of LND was proposed to determine if the myocardial inflammatory infiltration was present. The performance of our approach was evaluated with a five-fold cross-validation experiment, tested with an internal test set from the myocarditis group, and confirmed by an external test from a double-blind trial group. An LND of 1.02/mm2 could distinguish WSIs with myocarditis from those without. The accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) in the five-fold cross-validation experiment were 0.899 plus or minus 0.035, 0.971 plus or minus 0.017, 0.728 plus or minus 0.073 and 0.849 plus or minus 0.044, respectively. For the internal test set, the accuracy, sensitivity, specificity, and AUC were 0.887, 0.971, 0.737, and 0.854, respectively. The accuracy, sensitivity, specificity, and AUC for the external test set reached 0.853, 0.846, 0.858, and 0.852, respectively. Our new approach provides accurate and reliable quantification of the LND of myocardial WSIs, facilitating automated quantitative diagnosis of myocarditis with HE-stained images.
Multi-Task Learning Improves Performance In Deep Argument Mining Models
Farzam, Amirhossein, Shekhar, Shashank, Mehlhaff, Isaac, Morucci, Marco
The successful analysis of argumentative techniques from user-generated text is central to many downstream tasks such as political and market analysis. Recent argument mining tools use state-of-the-art deep learning methods to extract and annotate argumentative techniques from various online text corpora, however each task is treated as separate and different bespoke models are fine-tuned for each dataset. We show that different argument mining tasks share common semantic and logical structure by implementing a multi-task approach to argument mining that achieves better performance than state-of-the-art methods for the same problems. Our model builds a shared representation of the input text that is common to all tasks and exploits similarities between tasks in order to further boost performance via parameter-sharing. Our results are important for argument mining as they show that different tasks share substantial similarities and suggest a holistic approach to the extraction of argumentative techniques from text.