Accuracy
Efficient Bayes Inference in Neural Networks through Adaptive Importance Sampling
Huang, Yunshi, Chouzenoux, Emilie, Elvira, Victor, Pesquet, Jean-Christophe
Bayesian neural networks (BNNs) have received increasing interest in recent years. In BNNs, a complete posterior distribution of the unknown weight and bias parameters of the network is produced during the training stage. This probabilistic estimation offers several advantages over point-wise estimates, in particular the ability to provide uncertainty quantification when predicting new data. This feature, inherent to the Bayesian paradigm, is useful in countless machine learning applications. It is particularly appealing in areas where decision-making has a crucial impact, such as healthcare or autonomous driving. The main challenge of BNNs is the computational cost of the training procedure, since Bayesian techniques often face a severe curse of dimensionality. Adaptive importance sampling (AIS) is one of the most prominent Monte Carlo methodologies, benefiting from sound convergence guarantees and ease of adaptation. This work aims to show that AIS constitutes a successful approach for designing BNNs. More precisely, we propose a novel algorithm, PMCnet, that includes an efficient adaptation mechanism exploiting geometric information on the complex (often multimodal) posterior distribution. Numerical results illustrate the excellent performance and the improved exploration capabilities of the proposed method for both shallow and deep neural networks.
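To make the idea of adaptive importance sampling concrete, here is a minimal sketch on a one-dimensional toy problem. It is not PMCnet: the target density, the single Gaussian proposal, and the moment-matching adaptation rule are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def log_target(x):
    # Hypothetical unnormalized log-density: a Gaussian with mean 3 and unit variance.
    return -0.5 * (x - 3.0) ** 2

def adaptive_importance_sampling(n_iters=10, n_samples=2000, seed=0):
    rng = np.random.default_rng(seed)
    mu, sigma = 0.0, 3.0  # initial proposal N(mu, sigma^2), deliberately off-target
    for _ in range(n_iters):
        # 1. Sample from the current proposal.
        x = rng.normal(mu, sigma, size=n_samples)
        # 2. Compute self-normalized importance weights in log space.
        #    Additive constants cancel after normalization.
        log_q = -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma)
        log_w = log_target(x) - log_q
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # 3. Adapt the proposal by matching the weighted mean and variance.
        mu = np.sum(w * x)
        sigma = max(np.sqrt(np.sum(w * (x - mu) ** 2)), 1e-3)
    return mu, sigma

posterior_mean, posterior_std = adaptive_importance_sampling()
print(posterior_mean, posterior_std)
```

After a few iterations the proposal should settle close to the target, so the weighted estimates approach a mean of 3 and a standard deviation of 1. A single Gaussian proposal is a poor fit for the multimodal posteriors the abstract mentions; it is used here only to keep the sketch short.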
Heterogeneous Oblique Double Random Forest
Ganaie, M. A., Tanveer, M., Beheshti, I., Ahmad, N., Suganthan, P. N.
Decision tree ensembles use a single data feature at each node for splitting the data. However, splitting in this manner may fail to capture the geometric properties of the data. Oblique decision trees instead generate an oblique hyperplane for splitting the data at each non-leaf node. Oblique decision trees capture the geometric properties of the data and hence show better generalization. The performance of oblique decision trees depends on the way the oblique hyperplanes are generated and on the data used to generate them. Recently, multiple classifiers have been used in a heterogeneous random forest (RaF) classifier; however, it fails to generate trees of proper depth. Moreover, recent double RaF studies highlighted that larger trees can be generated by bootstrapping the data at each non-leaf node and splitting the original data instead of the bootstrapped data. The heterogeneous RaF thus lacks the generation of larger trees, whereas the double RaF-based model fails to capture the geometric characteristics of the data. To address these shortcomings, we propose the heterogeneous oblique double RaF. The proposed model employs several linear classifiers at each non-leaf node on the bootstrapped data and splits the original data based on the optimal linear classifier. The optimal hyperplane corresponds to the model that optimizes the impurity criterion. The experimental analysis indicates that the performance of the introduced heterogeneous double random forest is comparatively better than that of the baseline models. To demonstrate the effectiveness of the proposed heterogeneous double random forest, we used it for the diagnosis of schizophrenia. The proposed model predicted the disease more accurately than the baseline models.
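The node-level procedure described above (fit several linear classifiers on a bootstrap sample, then split the original data with the best one) can be sketched roughly as follows. The two candidate classifiers, the Gini criterion, and the choice to score candidates on the original data are assumptions made for this illustration; the abstract does not specify them, and all function names are hypothetical.

```python
import numpy as np

def gini(y):
    # Gini impurity of a binary label vector with values in {0, 1}.
    if len(y) == 0:
        return 0.0
    p = y.mean()
    return 2.0 * p * (1.0 - p)

def split_impurity(X, y, w, b):
    # Size-weighted Gini impurity of the two children of the hyperplane w.x + b = 0.
    left = (X @ w + b) <= 0
    n_left, n_right = left.sum(), (~left).sum()
    return (n_left * gini(y[left]) + n_right * gini(y[~left])) / len(y)

def fit_centroid(X, y):
    # Hyperplane that bisects the segment between the two class means.
    m0, m1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    w = m1 - m0
    return w, -w @ (m0 + m1) / 2.0

def fit_least_squares(X, y):
    # Linear least-squares fit to +/-1 targets, with a bias column appended.
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, 2.0 * y - 1.0, rcond=None)
    return coef[:-1], coef[-1]

def oblique_node_split(X, y, rng):
    # Fit each candidate on a bootstrap sample (assumes both classes are drawn),
    # then score the resulting split on the original data.
    idx = rng.integers(0, len(y), size=len(y))
    Xb, yb = X[idx], y[idx]
    best = None
    for fit in (fit_centroid, fit_least_squares):
        w, b = fit(Xb, yb)
        score = split_impurity(X, y, w, b)
        if best is None or score < best[0]:
            best = (score, w, b)
    return best

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(3.0, 1.0, (100, 2))])
y = np.repeat([0, 1], 100)
parent_impurity = gini(y)
best_impurity, w, b = oblique_node_split(X, y, rng)
print(parent_impurity, best_impurity)
```

On this toy data the two classes are separated along a diagonal direction, which a single-feature split captures less directly than an oblique hyperplane.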
AUC Maximization for Low-Resource Named Entity Recognition
Nguyen, Ngoc Dang, Tan, Wei, Buntine, Wray, Beare, Richard, Chen, Changyou, Du, Lan
Current work in named entity recognition (NER) uses either cross entropy (CE) or conditional random fields (CRF) as the objective/loss functions to optimize the underlying NER model. Both of these traditional objective functions for the NER problem generally produce adequate performance when the data distribution is balanced and there are sufficient annotated training examples. But since NER is inherently an imbalanced tagging problem, model performance in low-resource settings can suffer under these standard objective functions. Based on recent advances in area under the ROC curve (AUC) maximization, we propose to optimize the NER model by maximizing the AUC score. We give evidence that by simply combining two binary classifiers that maximize the AUC score, significant performance improvement over traditional loss functions is achieved under low-resource NER settings. We also conduct extensive experiments to demonstrate the advantages of our method under low-resource and highly imbalanced data distribution settings. To the best of our knowledge, this is the first work that brings AUC maximization to the NER setting. Furthermore, we show that our method is agnostic to different types of NER embeddings, models and domains. The code to replicate this work will be provided upon request.
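For readers unfamiliar with AUC as a training objective, the sketch below computes the AUC as the fraction of correctly ordered positive/negative pairs, alongside a generic pairwise squared-hinge surrogate that could be minimized in its place. The surrogate is a common textbook choice and an assumption here; the abstract does not state which AUC surrogate the authors use, and the function names are hypothetical.

```python
import numpy as np

def auc_score(scores, labels):
    # AUC = probability that a random positive is scored above a random negative
    # (ties count as half).
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

def pairwise_squared_hinge(scores, labels, margin=1.0):
    # Differentiable surrogate: penalize positive/negative pairs whose score gap
    # is smaller than the margin.
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return np.mean(np.maximum(0.0, margin - diff) ** 2)

# Imbalanced toy example: 2 positives, 4 negatives.
scores = np.array([0.9, 0.8, 0.3, 0.6, 0.2, 0.1])
labels = np.array([1, 0, 1, 0, 0, 0])
auc = auc_score(scores, labels)
loss = pairwise_squared_hinge(scores, labels)
print(auc, loss)
```

Because the objective is defined over positive/negative pairs rather than individual examples, it does not depend on the class ratio in the way that per-token cross entropy does.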
Collaborative Training of Medical Artificial Intelligence Models with non-uniform Labels
Arasteh, Soroosh Tayebi, Isfort, Peter, Saehn, Marwin, Mueller-Franzes, Gustav, Khader, Firas, Kather, Jakob Nikolas, Kuhl, Christiane, Nebelung, Sven, Truhn, Daniel
Due to the rapid advancements in recent years, medical image analysis is largely dominated by deep learning (DL). However, building powerful and robust DL models requires training with large multi-party datasets. While multiple stakeholders have provided publicly available datasets, the ways in which these data are labeled vary widely. For instance, an institution might provide a dataset of chest radiographs containing labels denoting the presence of pneumonia, while another institution might focus on determining the presence of metastases in the lung. Training a single AI model utilizing all these data is not feasible with conventional federated learning (FL). This prompts us to propose an extension to the widespread FL process, namely flexible federated learning (FFL) for collaborative training on such data. Using 695,000 chest radiographs from five institutions from across the globe - each with differing labels - we demonstrate that, with heterogeneously labeled datasets, FFL-based training leads to a significant performance increase compared to conventional FL training, where only the uniformly annotated images are utilized. We believe that our proposed algorithm could accelerate the process of bringing collaborative training methods from the research and simulation phase to real-world applications in healthcare.
Systemic Fairness
Ray, Arindam, Padmanabhan, Balaji, Bouayad, Lina
Machine learning algorithms are increasingly used to make or support decisions in a wide range of settings. With such expansive use there is also growing concern about the fairness of such methods. Prior literature on algorithmic fairness has extensively addressed risks and in many cases presented approaches to manage some of them. However, most studies have focused on fairness issues that arise from actions taken by a (single) focal decision-maker or agent. In contrast, most real-world systems have many agents that work collectively as part of a larger ecosystem. For example, in a lending scenario, there are multiple lenders who evaluate loans for applicants, along with policymakers and other institutions whose decisions also affect outcomes. Thus, the broader impact of any lending decision of a single decision maker will likely depend on the actions of multiple different agents in the ecosystem. This paper develops formalisms for firm versus systemic fairness, and calls for a greater focus in the algorithmic fairness literature on ecosystem-wide fairness - or more simply systemic fairness - in real-world contexts.
A Blessing of Dimensionality in Membership Inference through Regularization
Tan, Jasper, LeJeune, Daniel, Mason, Blake, Javadi, Hamid, Baraniuk, Richard G.
Is overparameterization a privacy liability? In this work, we study the effect that the number of parameters has on a classifier's vulnerability to membership inference attacks. We first demonstrate how the number of parameters of a model can induce a privacy-utility trade-off: increasing the number of parameters generally improves generalization performance at the expense of lower privacy. However, remarkably, we then show that if coupled with proper regularization, increasing the number of parameters of a model can actually simultaneously increase both its privacy and performance, thereby eliminating the privacy-utility trade-off. Theoretically, we demonstrate this curious phenomenon for logistic regression with ridge regularization in a bi-level feature ensemble setting. Pursuant to our theoretical exploration, we develop a novel leave-one-out analysis tool to precisely characterize the vulnerability of a linear classifier to the optimal membership inference attack. We empirically exhibit this "blessing of dimensionality" for neural networks on a variety of tasks using early stopping as the regularizer.
Probing the Purview of Neural Networks via Gradient Analysis
Lee, Jinsol, Lehman, Charlie, Prabhushankar, Mohit, AlRegib, Ghassan
We analyze the data-dependent capacity of neural networks and assess anomalies in inputs from the perspective of networks during inference. The notion of data-dependent capacity allows for analyzing the knowledge base of a model populated by learned features from training data. We define purview as the additional capacity necessary to characterize inference samples that differ from the training data. To probe the purview of a network, we utilize gradients to measure the amount of change required for the model to characterize the given inputs more accurately. To eliminate the dependency on ground-truth labels in generating gradients, we introduce confounding labels that are formulated by combining multiple categorical labels. We demonstrate that our gradient-based approach can effectively differentiate inputs that cannot be accurately represented with learned features. We utilize our approach in applications of detecting anomalous inputs, including out-of-distribution, adversarial, and corrupted samples. Our approach requires no hyperparameter tuning or additional data processing and outperforms state-of-the-art methods by up to 2.7%, 19.8%, and 35.6% in AUROC, respectively.
Maximal Fairness
Defrance, MaryBeth, De Bie, Tijl
Fairness in AI has garnered considerable attention in research, and increasingly also in society. The so-called "Impossibility Theorem" has been one of the more striking research results with both theoretical and practical consequences, as it states that satisfying a certain combination of fairness measures is impossible. To date, this negative result has not yet been complemented with a positive one: a characterization of which combinations of fairness notions are possible. This work aims to fill this gap by identifying maximal sets of commonly used fairness measures that can be simultaneously satisfied. The fairness measures used are demographic parity, equal opportunity, false positive parity, predictive parity, predictive equality, overall accuracy equality and treatment equality. We conclude that in total 12 maximal sets of these fairness measures are possible, among which seven combinations of two measures, and five combinations of three measures. Our work raises interesting questions regarding the practical relevance of each of these 12 maximal fairness notions in various scenarios.
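As a rough illustration of what it means to satisfy such measures, the sketch below computes per-group quantities from a confusion matrix and compares them across two groups. The mapping of each measure to a rate follows common textbook definitions and is an assumption; the paper's exact definitions may differ, and "false positive parity" is left out because the abstract does not define it. The data and function names are hypothetical.

```python
import numpy as np

def group_rates(y_true, y_pred):
    # Confusion-matrix counts for one group (assumes all denominators are non-zero).
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    n = tp + fp + fn + tn
    return {
        "positive_rate": float((tp + fp) / n),  # demographic parity
        "tpr": float(tp / (tp + fn)),           # equal opportunity
        "fpr": float(fp / (fp + tn)),           # predictive equality
        "ppv": float(tp / (tp + fp)),           # predictive parity
        "accuracy": float((tp + tn) / n),       # overall accuracy equality
        "fn_fp_ratio": float(fn / fp),          # treatment equality
    }

def fairness_gaps(rates_a, rates_b):
    # A measure is satisfied when its gap between the two groups is zero.
    return {name: abs(rates_a[name] - rates_b[name]) for name in rates_a}

# Hypothetical labels and predictions for two groups.
rates_a = group_rates(np.array([1, 1, 0, 0]), np.array([1, 0, 1, 0]))
rates_b = group_rates(np.array([1, 1, 0, 0]), np.array([1, 1, 1, 0]))
gaps = fairness_gaps(rates_a, rates_b)
print(gaps)
```

In this toy case the two groups have the same false positive rate but different true positive rates, so predictive equality holds while equal opportunity does not.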
PALF: Pre-Annotation and Camera-LiDAR Late Fusion for the Easy Annotation of Point Clouds
Zhang, Yucheng, Fukuda, Masaki, Ishii, Yasunori, Ohshima, Kyoko, Yamashita, Takayoshi
3D object detection has become indispensable in the field of autonomous driving. To date, gratifying breakthroughs have been recorded in 3D object detection research, attributed to deep learning. However, deep learning algorithms are data-driven and require large amounts of annotated point cloud data for training and evaluation. Unlike labeling 2D images, annotating point cloud data is difficult because of its sparsity, irregularity, and low resolution; it requires more manual work, and the annotation efficiency is much lower than for 2D images. Therefore, we propose an annotation algorithm for point cloud data that combines pre-annotation with camera-LiDAR late fusion to make annotation easy and accurate. The contributions of this study are as follows. We propose (1) a pre-annotation algorithm that employs 3D object detection and auto fitting for the easy annotation of point clouds, (2) a camera-LiDAR late fusion algorithm using 2D and 3D results for easy error checking, which helps annotators easily identify missing objects, and (3) a point cloud annotation evaluation pipeline to evaluate our experiments. The experimental results show that the proposed algorithm improves the annotating speed by 6.5 times and the annotation quality in terms of the 3D Intersection over Union and precision by 8.2 points and 5.6 points, respectively; additionally, the miss rate is reduced by 31.9 points.
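The 3D Intersection over Union used as a quality metric above can be illustrated with a simplified sketch. It assumes axis-aligned boxes given as (x_min, y_min, z_min, x_max, y_max, z_max); annotation tools and detection benchmarks typically use rotated boxes, which need a more involved overlap computation. The function name and box layout are hypothetical.

```python
def iou_3d_axis_aligned(a, b):
    # Boxes are (x_min, y_min, z_min, x_max, y_max, z_max); rotation is ignored.
    intersection = 1.0
    for i in range(3):
        lo = max(a[i], b[i])
        hi = min(a[i + 3], b[i + 3])
        if hi <= lo:
            return 0.0  # no overlap along this axis
        intersection *= hi - lo

    def volume(box):
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

    union = volume(a) + volume(b) - intersection
    return intersection / union

annotated = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
reference = (1.0, 1.0, 1.0, 3.0, 3.0, 3.0)
iou = iou_3d_axis_aligned(annotated, reference)
print(iou)  # intersection volume 1, union volume 15
```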
Auditing ICU Readmission Rates in a Clinical Database: An Analysis of Risk Factors and Clinical Outcomes
This study presents a machine learning (ML) pipeline for clinical data classification in the context of a 30-day readmission problem, along with a fairness audit on subgroups based on sensitive attributes. A range of ML models are used for classification and the fairness audit is conducted on the model predictions. The fairness audit uncovers disparities in equal opportunity, predictive parity, false positive rate parity, and false negative rate parity criteria on the MIMIC III dataset based on attributes such as gender, ethnicity, language, and insurance group. The results identify disparities in the model's performance across different groups and highlight the need for better fairness and bias mitigation strategies. The study suggests the need for collaborative efforts among researchers, policymakers, and practitioners to address bias and fairness in artificial intelligence (AI) systems.