Accuracy
A Parameter-Free Two-Bit Covariance Estimator with Improved Operator Norm Error Rate
A covariance matrix estimator using two bits per entry was recently developed by Dirksen, Maly and Rauhut [Annals of Statistics, 50(6), pp. 3538-3562]. The estimator achieves a near-minimax rate for general sub-Gaussian distributions, but suffers from two downsides: theoretically, there is an essential gap in operator norm error between their estimator and the sample covariance when the diagonal of the covariance matrix is dominated by only a few entries; practically, its performance relies heavily on the dithering scale, which needs to be tuned according to some unknown parameters. In this work, we propose a new 2-bit covariance matrix estimator that simultaneously addresses both issues. Unlike the sign quantizer associated with uniform dither in Dirksen et al., we adopt a triangular dither prior to a 2-bit quantizer inspired by the multi-bit uniform quantizer. By employing dithering scales that vary across entries, our estimator enjoys an improved operator norm error rate that depends on the effective rank of the underlying covariance matrix rather than the ambient dimension, thus closing the theoretical gap. Moreover, our proposed method eliminates the need for any tuning parameter, as the dithering scales are entirely determined by the data. Experimental results under Gaussian samples are provided to showcase the numerical performance of our estimator. Remarkably, by halving the dithering scales, our estimator often achieves operator norm errors less than twice the errors of the sample covariance.
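As a rough illustration of the dithering idea described in this abstract, the sketch below adds a triangular dither before a four-level (2-bit) uniform quantizer and checks empirically that the quantized value stays unbiased. It is a generic illustration, not the estimator from the paper: the function names, the mid-rise form of the quantizer, and the fixed scale `delta` are assumptions made here (the paper determines its scales from the data).

```python
import math
import random


def triangular_dither(delta, rng):
    """Triangular dither: sum of two independent uniforms on [-delta/2, delta/2]."""
    return rng.uniform(-delta / 2, delta / 2) + rng.uniform(-delta / 2, delta / 2)


def two_bit_quantize(x, delta):
    """Mid-rise uniform quantizer with step delta, clipped to four levels.

    The four output levels are -1.5*delta, -0.5*delta, 0.5*delta, 1.5*delta,
    so each output can be stored in two bits.
    """
    level = delta * (math.floor(x / delta) + 0.5)
    return max(-1.5 * delta, min(1.5 * delta, level))


def dithered_sample(x, delta, rng):
    """Quantize x after adding a fresh triangular dither."""
    return two_bit_quantize(x + triangular_dither(delta, rng), delta)


# Empirical check with a fixed seed (hypothetical values, chosen so that
# |x| < delta and the dithered input never reaches the clipping range).
rng = random.Random(0)
x, delta = 0.3, 1.0
draws = [dithered_sample(x, delta, rng) for _ in range(100_000)]
mean_q = sum(draws) / len(draws)
mse = sum((q - x) ** 2 for q in draws) / len(draws)
print(f"mean of quantized values: {mean_q:.3f} (input {x})")
print(f"mean squared error: {mse:.3f} (delta^2 / 4 = {delta ** 2 / 4})")
```

For inputs with magnitude below `delta`, the average of the quantized values tracks the input and the mean squared error stays close to `delta**2 / 4` regardless of the input, which is the property that makes a known correction term possible in second-moment estimates.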
Assessing Cyclostationary Malware Detection via Feature Selection and Classification
Cyclostationarity involves periodic statistical variations in signals and processes, commonly used in signal analysis and network security. In the context of attacks, cyclostationarity helps detect malicious behaviors within network traffic, such as traffic patterns in Distributed Denial of Service (DDoS) attacks or hidden communication channels in malware. This approach enhances security by identifying abnormal patterns and informing Network Intrusion Detection Systems (NIDSs) to recognize potential attacks, enhancing protection against both known and novel threats. This research focuses on identifying cyclostationary malware behavior and its detection. The main goal is to pinpoint essential cyclostationary features used in NIDSs. These features are extracted using algorithms such as Boruta and Principal Component Analysis (PCA), and then categorized to find the most significant cyclostationary patterns. The aim of this article is to reveal periodically changing malware behaviors through cyclostationarity. The study highlights the importance of spotting cyclostationary malware in NIDSs by using established datasets like KDD99, NSL-KDD, and the UGRansome dataset. The UGRansome dataset is designed for anomaly detection research and includes both normal and abnormal network threat categories of zero-day attacks. A comparison is made using the Random Forest (RF) and Support Vector Machine (SVM) algorithms, while also evaluating the effectiveness of Boruta and PCA. The findings show that PCA is more promising than using Boruta alone for extracting cyclostationary network feature patterns. Additionally, the analysis identifies the internet protocol as the most noticeable cyclostationary feature pattern used by malware. Notably, the UGRansome dataset outperforms the KDD99 and NSL-KDD, achieving 99% accuracy in signature malware detection using the RF algorithm and 98% with the SVM.
Uncertainty-inspired Open Set Learning for Retinal Anomaly Identification
Wang, Meng, Lin, Tian, Wang, Lianyu, Lin, Aidi, Zou, Ke, Xu, Xinxing, Zhou, Yi, Peng, Yuanyuan, Meng, Qingquan, Qian, Yiming, Deng, Guoyao, Wu, Zhiqun, Chen, Junhong, Lin, Jianhong, Zhang, Mingzhi, Zhu, Weifang, Zhang, Changqing, Zhang, Daoqiang, Goh, Rick Siow Mong, Liu, Yong, Pang, Chi Pui, Chen, Xinjian, Chen, Haoyu, Fu, Huazhu
Failure to recognize samples from classes unseen during training is a major limitation of artificial intelligence in real-world recognition and classification of retinal anomalies. We established an uncertainty-inspired open-set (UIOS) model, which was trained with fundus images of 9 retinal conditions. Besides assessing the probability of each category, UIOS also calculated an uncertainty score to express its confidence. Our UIOS model with a thresholding strategy achieved F1 scores of 99.55%, 97.01% and 91.91% on the internal testing set, the external target categories (TC)-JSIEC dataset and the TC-unseen testing set, respectively, compared to F1 scores of 92.20%, 80.69% and 64.74% for the standard AI model. Furthermore, UIOS correctly predicted high uncertainty scores, which would prompt a manual check, on the datasets of non-target-category retinal diseases, low-quality fundus images, and non-fundus images. UIOS provides a robust method for real-world screening of retinal anomalies.
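The thresholding strategy mentioned in this abstract can be pictured with a minimal sketch: a prediction is accepted only when its uncertainty score is at or below a threshold, and is otherwise referred for a manual check. The function name, the label strings, and the threshold value are hypothetical; the abstract does not specify how UIOS computes or thresholds its uncertainty score.

```python
def triage(probabilities, uncertainty, threshold=0.5):
    """Return the most probable class, or defer when uncertainty is high.

    probabilities: dict mapping class name -> predicted probability.
    uncertainty:   scalar score, higher meaning less confident (assumed scale 0..1).
    threshold:     hypothetical cut-off above which a human should review.
    """
    if uncertainty > threshold:
        return "manual_check"
    # max over the dict keys, ranked by their probability
    return max(probabilities, key=probabilities.get)


# Hypothetical model outputs for two images.
confident = triage({"normal": 0.05, "glaucoma": 0.90, "cataract": 0.05}, uncertainty=0.1)
doubtful = triage({"normal": 0.40, "glaucoma": 0.35, "cataract": 0.25}, uncertainty=0.8)
print(confident)  # glaucoma
print(doubtful)   # manual_check
```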
Identifying Unique Causal Network from Nonstationary Time Series
Kang, Mingyu, Chen, Duxin, Meng, Ning, Yan, Gang, Yu, Wenwu
Identifying causality is a challenging task in many data-intensive scenarios. Many algorithms have been proposed for this critical task. However, most of them focus on learning the directed acyclic graph (DAG) of a Bayesian network (BN). These BN-based models have only limited causal explainability because of the Markov equivalence class problem. Moreover, they depend on the assumption of stationarity, whereas many time series sampled from complex systems are nonstationary. Nonstationary time series introduce a dataset shift problem, which leads to unsatisfactory performance of these algorithms. To fill these gaps, a novel causation model named Unique Causal Network (UCN) is proposed in this paper. Unlike previous BN-based models, UCN considers the influence of time delay and proves the uniqueness of the obtained network structure, which addresses the Markov equivalence class problem. Furthermore, based on the decomposability property of UCN, a higher-order causal entropy (HCE) algorithm is designed to identify the structure of UCN in a distributed way. The HCE algorithm measures the strength of causality using a nearest-neighbors entropy estimator, which works well on nonstationary time series. Finally, extensive experiments validate that the HCE algorithm achieves state-of-the-art accuracy on nonstationary time series, compared to the other baseline algorithms.
NBIAS: A Natural Language Processing Framework for Bias Identification in Text
Raza, Shaina, Garg, Muskan, Reji, Deepak John, Bashir, Syed Raza, Ding, Chen
Bias in textual data can lead to skewed interpretations and outcomes when the data is used. These biases could perpetuate stereotypes, discrimination, or other forms of unfair treatment. An algorithm trained on biased data may end up making decisions that disproportionately impact a certain group of people. Therefore, it is crucial to detect and remove these biases to ensure the fair and ethical use of data. To this end, we develop a comprehensive and robust framework, NBIAS, that consists of four main layers: data, corpus construction, model development and evaluation. The dataset is constructed by collecting diverse data from various domains, including social media, healthcare, and job hiring portals. We then apply a transformer-based token classification model that identifies biased words/phrases through a unique named entity, BIAS. In the evaluation procedure, we incorporate a blend of quantitative and qualitative measures to gauge the effectiveness of our models. We achieve accuracy improvements ranging from 1% to 8% compared to baselines. We are also able to generate a robust understanding of the model functioning. The proposed approach is applicable to a variety of biases and contributes to the fair and ethical use of textual data.
EpiDeNet: An Energy-Efficient Approach to Seizure Detection for Embedded Systems
Ingolfsson, Thorir Mar, Chakraborty, Upasana, Wang, Xiaying, Beniczky, Sandor, Ducouret, Pauline, Benatti, Simone, Ryvlin, Philippe, Cossettini, Andrea, Benini, Luca
Epilepsy is a prevalent neurological disorder that affects millions of individuals globally, and continuous monitoring coupled with automated seizure detection appears as a necessity for effective patient treatment. To enable long-term care in daily-life conditions, comfortable and smart wearable devices with long battery life are required, which in turn set the demand for resource-constrained and energy-efficient computing solutions. In this context, the development of machine learning algorithms for seizure detection faces the challenge of heavily imbalanced datasets. This paper introduces EpiDeNet, a new lightweight seizure detection network, and Sensitivity-Specificity Weighted Cross-Entropy (SSWCE), a new loss function that incorporates sensitivity and specificity, to address the challenge of heavily unbalanced datasets. The proposed EpiDeNet-SSWCE approach demonstrates the successful detection of 91.16% and 92.00% seizure events on two different datasets (CHB-MIT and PEDESITE, respectively), with only four EEG channels. A three-window majority voting-based smoothing scheme combined with the SSWCE loss achieves 3x reduction of false positives to 1.18 FP/h. EpiDeNet is well suited for implementation on low-power embedded platforms, and we evaluate its performance on two ARM Cortex-based platforms (M4F/M7) and two parallel ultra-low power (PULP) systems (GAP8, GAP9). The most efficient implementation (GAP9) achieves an energy efficiency of 40 GMAC/s/W, with an energy consumption per inference of only 0.051 mJ at high performance (726.46 MMAC/s), outperforming the best ARM Cortex-based solutions by approximately 160x in energy efficiency. The EpiDeNet-SSWCE method demonstrates effective and accurate seizure detection performance on heavily imbalanced datasets, while being suited for implementation on energy-constrained platforms.
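The three-window majority voting described in this abstract can be sketched as a simple post-processing step over per-window predictions. This is only an illustration under stated assumptions: the function name is hypothetical, and the choice of a causal window (the current prediction plus the two before it, with missing history counted as non-seizure) is made here, since the abstract does not say how the windows are aligned.

```python
def majority_smooth(labels, window=3):
    """Smooth binary per-window predictions with a causal majority vote.

    labels: list of 0/1 predictions, one per EEG window (1 = seizure).
    window: number of most recent predictions that vote (assumed to be 3).
    A window is marked positive only if more than half of the last
    `window` predictions are positive; missing history counts as 0.
    """
    smoothed = []
    for i in range(len(labels)):
        recent = labels[max(0, i - window + 1): i + 1]
        smoothed.append(1 if sum(recent) > window // 2 else 0)
    return smoothed


# Hypothetical raw predictions: one isolated false alarm, then a real event.
raw = [0, 1, 0, 0, 1, 1, 1, 0, 0]
print(majority_smooth(raw))  # [0, 0, 0, 0, 0, 1, 1, 1, 0]
```

The isolated positive at index 1 is suppressed, while the sustained run of positives is kept with a delay of one window, which is the trade-off that drives the reduction in false positives per hour.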
Fix Fairness, Don't Ruin Accuracy: Performance Aware Fairness Repair using AutoML
Nguyen, Giang, Biswas, Sumon, Rajan, Hridesh
Machine learning (ML) is increasingly being used in critical decision-making software, but incidents have raised questions about the fairness of ML predictions. To address this issue, new tools and methods are needed to mitigate bias in ML-based software. Previous studies have proposed bias mitigation algorithms that only work in specific situations and often result in a loss of accuracy. Our proposed solution is a novel approach that utilizes automated machine learning (AutoML) techniques to mitigate bias. Our approach includes two key innovations: a novel optimization function and a fairness-aware search space. By improving the default optimization function of AutoML and incorporating fairness objectives, we are able to mitigate bias with little to no loss of accuracy. Additionally, we propose a fairness-aware search space pruning method for AutoML to reduce computational cost and repair time. Our approach, built on the state-of-the-art Auto-Sklearn tool, is designed to reduce bias in real-world scenarios. In order to demonstrate the effectiveness of our approach, we evaluated our approach on four fairness problems and 16 different ML models, and our results show a significant improvement over the baseline and existing bias mitigation techniques. Our approach, Fair-AutoML, successfully repaired 60 out of 64 buggy cases, while existing bias mitigation techniques only repaired up to 44 out of 64 cases.
When to Show a Suggestion? Integrating Human Feedback in AI-Assisted Programming
Mozannar, Hussein, Bansal, Gagan, Fourney, Adam, Horvitz, Eric
AI-powered code-recommendation systems, such as Copilot and CodeWhisperer, provide code suggestions inside a programmer's environment (e.g., an IDE) with the aim of improving their productivity. Since programmers accept and reject suggestions in these scenarios, such a system should ideally use this feedback in furtherance of this goal. In this work, we leverage prior data of programmers interacting with GitHub Copilot, a system used by millions of programmers, to develop interventions that can save programmer time. We propose a utility theory framework, which models this interaction with programmers and decides which suggestions to display. Our framework, Conditional suggestion Display from Human Feedback (CDHF), relies on a cascade of models that predict suggestion acceptance to selectively hide suggestions, reducing both latency and programmer verification time. Using data from 535 programmers, we perform a retrospective evaluation of CDHF and show that we can avoid displaying a significant fraction of suggestions that would have been rejected, and do so without total knowledge of the suggestions themselves. We further demonstrate the importance of incorporating the programmer's latent unobserved state in deciding when to display suggestions through ablations on user study data. Finally, we show that using suggestion acceptance as a reward signal to decide which suggestions to display leads to reduced-quality suggestions, indicating an unexpected pitfall.
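One way to picture a cascade that hides suggestions is the two-stage sketch below: a first predictor looks only at the context and can hide a suggestion before it is even generated, and a second predictor also sees the suggestion. This is not the CDHF implementation; the function names, the two-stage split, and the single threshold `hide_below` are assumptions chosen to illustrate the idea of deciding without full knowledge of the suggestion.

```python
def decide_display(context, generate, p_accept_context, p_accept_full, hide_below=0.2):
    """Return the suggestion to show, or None to hide it.

    generate:          callable producing a suggestion from the context (costly).
    p_accept_context:  hypothetical predictor of acceptance using context only.
    p_accept_full:     hypothetical predictor using context and suggestion.
    hide_below:        hypothetical threshold on predicted acceptance probability.
    """
    # Stage 1: cheap check before generating anything, which saves latency.
    if p_accept_context(context) < hide_below:
        return None
    suggestion = generate(context)
    # Stage 2: check again now that the suggestion itself is available.
    if p_accept_full(context, suggestion) < hide_below:
        return None
    return suggestion


# Toy stand-ins for the models, used only to exercise the control flow.
calls = []

def toy_generate(context):
    calls.append(context)
    return context + " -> completion"

shown = decide_display("def add(a, b):", toy_generate, lambda c: 0.9, lambda c, s: 0.7)
hidden_early = decide_display("# todo", toy_generate, lambda c: 0.05, lambda c, s: 0.9)
hidden_late = decide_display("x = ", toy_generate, lambda c: 0.6, lambda c, s: 0.1)
print(shown, hidden_early, hidden_late, len(calls))
```

In the early-hide case the generator is never called, which is where the latency saving would come from; in the late-hide case only the programmer's verification time is saved.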
Counterpart Fairness -- Addressing Systematic between-group Differences in Fairness Evaluation
Wang, Yifei, Zhou, Zhengyang, Wang, Liqin, Laurentiev, John, Hou, Peter, Zhou, Li, Hong, Pengyu
When using machine learning (ML) to aid decision-making, it is critical to ensure that an algorithmic decision is fair, i.e., it does not discriminate against specific individuals/groups, particularly those from underprivileged populations. Existing group fairness methods require equal group-wise measures, a requirement that fails to consider systematic between-group differences. The confounding factors, which are non-sensitive variables but manifest systematic differences, can significantly affect fairness evaluation. To tackle this problem, we believe that a fairness measurement should be based on the comparison between counterparts (i.e., individuals who are similar to each other with respect to the task of interest) from different groups, whose group identities cannot be distinguished algorithmically by exploring confounding factors. We have developed a propensity-score-based method for identifying counterparts, which prevents fairness evaluation from comparing "oranges" with "apples". In addition, we propose a counterpart-based statistical fairness index, termed Counterpart-Fairness (CFair), to assess the fairness of ML models. Various empirical studies were conducted to validate the effectiveness of CFair. We publish our code at https://github.com/zhengyjo/CFair.
Closing the Gap in High-Risk Pregnancy Care Using Machine Learning and Human-AI Collaboration
Mozannar, Hussein, Utsumi, Yuria, Chen, Irene Y., Gervasi, Stephanie S., Ewing, Michele, Smith-McLallen, Aaron, Sontag, David
High-risk pregnancy (HRP) is a pregnancy complicated by factors that can adversely affect outcomes of the mother or the infant. Health insurers use algorithms to identify members who would benefit from additional clinical support. We aimed to build machine learning algorithms to identify pregnant patients and triage them by risk of complication to assist care management. In this retrospective study, we trained a hybrid Lasso regularized classifier to predict whether a patient is currently pregnant using claims data from 36,735 insured members of Independence Blue Cross (IBC), a health insurer in Philadelphia. We then trained a linear classifier on a subset of 12,243 members to predict whether a patient will develop gestational diabetes or gestational hypertension. These algorithms were developed in cooperation with the care management team at IBC and integrated into the dashboard. In small user studies with the nurses, we evaluated the impact of integrating our algorithms into their workflow. We find that the proposed model predicts an earlier pregnancy start date for 3.54% (95% CI 3.05-4.00) of patients with complications compared to only using a set of pre-defined codes that indicate the start of pregnancy, and never a later one, at the expense of a 5.58% (95% CI 4.05-6.40) false positive rate. The classifier for predicting complications has an AUC of 0.754 (95% CI 0.764-0.788) using data up to the patient's first trimester. Nurses from the care management program expressed a preference for the proposed models over existing approaches. The proposed model outperformed commonly used claim codes for the identification of pregnant patients at the expense of a manageable false positive rate. Our risk complication classifier shows that we can accurately triage patients by risk of complication.