Accuracy
Joint Detection and Localization of Stealth False Data Injection Attacks in Smart Grids using Graph Neural Networks
Boyaci, Osman, Narimani, Mohammad Rasoul, Davis, Katherine, Ismail, Muhammad, Overbye, Thomas J, Serpedin, Erchin
False data injection attacks (FDIA) are a main category of cyber-attacks threatening the security of power systems. Contrary to the detection of these attacks, less attention has been paid to identifying the attacked units of the grid. To this end, this work jointly studies detecting and localizing the stealth FDIA in power grids. Exploiting the inherent graph topology of power systems as well as the spatial correlations of measurement data, this paper proposes an approach based on the graph neural network (GNN) to identify the presence and location of the FDIA. The proposed approach leverages the auto-regressive moving average (ARMA) type graph filters (GFs) which can better adapt to sharp changes in the spectral domain due to their rational type filter composition compared to the polynomial type GFs such as Chebyshev. To the best of our knowledge, this is the first work based on GNN that automatically detects and localizes FDIA in power systems. Extensive simulations and visualizations show that the proposed approach outperforms the available methods in both detection and localization of FDIA for different IEEE test systems. Thus, the targeted areas can be identified and preventive actions can be taken before the attack impacts the grid.
Focus Your Distribution: Coarse-to-Fine Non-Contrastive Learning for Anomaly Detection and Localization
Zheng, Ye, Wang, Xiang, Deng, Rui, Bao, Tianpeng, Zhao, Rui, Wu, Liwei
The essence of unsupervised anomaly detection is to learn the compact distribution of normal samples and detect outliers as anomalies in testing. Meanwhile, the anomalies in real-world are usually subtle and fine-grained in a high-resolution image especially for industrial applications. Towards this end, we propose a novel framework for unsupervised anomaly detection and localization. Our method aims at learning dense and compact distribution from normal images with a coarse-to-fine alignment process. The coarse alignment stage standardizes the pixel-wise position of objects in both image and feature levels. The fine alignment stage then densely maximizes the similarity of features among all corresponding locations in a batch. To facilitate the learning with only normal images, we propose a new pretext task called non-contrastive learning for the fine alignment stage. Non-contrastive learning extracts robust and discriminating normal image representations without making assumptions on abnormal samples, and it thus empowers our model to generalize to various anomalous scenarios. Extensive experiments on two typical industrial datasets of MVTec AD and BenTech AD demonstrate that our framework is effective in detecting various real-world defects and achieves a new state-of-the-art in industrial unsupervised anomaly detection.
Accountability in AI: From Principles to Industry-specific Accreditation
Percy, Chris, Dragicevic, Simo, Sarkar, Sanjoy, Garcez, Artur S. d'Avila
Recent AI-related scandals have shed a spotlight on accountability in AI, with increasing public interest and concern. This paper draws on literature from public policy and governance to make two contributions. First, we propose an AI accountability ecosystem as a useful lens on the system, with different stakeholders requiring and contributing to specific accountability mechanisms. We argue that the present ecosystem is unbalanced, with a need for improved transparency via AI explainability and adequate documentation and process formalisation to support internal audit, leading up eventually to external accreditation processes. Second, we use a case study in the gambling sector to illustrate in a subset of the overall ecosystem the need for industry-specific accountability principles and processes. We define and evaluate critically the implementation of key accountability principles in the gambling industry, namely addressing algorithmic bias and model explainability, before concluding and discussing directions for future work based on our findings. Keywords: Accountability, Explainable AI, Algorithmic Bias, Regulation.
Measure Twice, Cut Once: Quantifying Bias and Fairness in Deep Neural Networks
Blakeney, Cody, Atkinson, Gentry, Huish, Nathaniel, Yan, Yan, Metris, Vangelis, Zong, Ziliang
Algorithmic bias is of increasing concern, both to the research community, and society at large. Bias in AI is more abstract and unintuitive than traditional forms of discrimination and can be more difficult to detect and mitigate. A clear gap exists in the current literature on evaluating the relative bias in the performance of multi-class classifiers. In this work, we propose two simple yet effective metrics, Combined Error Variance (CEV) and Symmetric Distance Error (SDE), to quantitatively evaluate the class-wise bias of two models in comparison to one another. By evaluating the performance of these new metrics and by demonstrating their practical application, we show that they can be used to measure fairness as well as bias. These demonstrations show that our metrics can address specific needs for measuring bias in multi-class classification. Broad acceptance of the large-scale deployment of AI and neural networks depends on the models' perceived trustworthiness and fairness. However, research on evaluating and mitigating bias for neural networks in general and compressed neural networks in particular is still in its infancy. Because deep neural networks (DNNs) are "black box" learners, it can be difficult to understand what correlations they have learned from their training data, and how that affects the downstream decisions that are made in the real world. Two models may appear to have very similar performance when only measured in terms of accuracy, precision, etc. but deeper analysis can show uneven performance across many classes.
Medical Dead-ends and Learning to Identify High-risk States and Treatments
Fatemi, Mehdi, Killian, Taylor W., Subramanian, Jayakumar, Ghassemi, Marzyeh
Machine learning has successfully framed many sequential decision making problems as either supervised prediction, or optimal decision-making policy identification via reinforcement learning. In data-constrained offline settings, both approaches may fail as they assume fully optimal behavior or rely on exploring alternatives that may not exist. We introduce an inherently different approach that identifies possible ``dead-ends'' of a state space. We focus on the condition of patients in the intensive care unit, where a ``medical dead-end'' indicates that a patient will expire, regardless of all potential future treatment sequences. We postulate ``treatment security'' as avoiding treatments with probability proportional to their chance of leading to dead-ends, present a formal proof, and frame discovery as an RL problem. We then train three independent deep neural models for automated state construction, dead-end discovery and confirmation. Our empirical results discover that dead-ends exist in real clinical data among septic patients, and further reveal gaps between secure treatments and those that were administered.
Protecting Retail Investors from Order Book Spoofing using a GRU-based Detection Model
Tuccella, Jean-Noël, Nadler, Philip, Şerban, Ovidiu
Market manipulation is tackled through regulation in traditional markets because of its detrimental effect on market efficiency and many participating financial actors. The recent increase of private retail investors due to new low-fee platforms and new asset classes such as decentralised digital currencies has increased the number of vulnerable actors due to lack of institutional sophistication and strong regulation. This paper proposes a method to detect illicit activity and inform investors on spoofing attempts, a well-known market manipulation technique. Our framework is based on a highly extendable Gated Recurrent Unit (GRU) model and allows the inclusion of market variables that can explain spoofing and potentially other illicit activities. The model is tested on granular order book data, in one of the most unregulated markets prone to spoofing with a large number of non-institutional traders. The results show that the model is performing well in an early detection context, allowing the identification of spoofing attempts soon enough to allow investors to react. This is the first step to a fully comprehensive model that will protect investors in various unregulated trading environments and regulators to identify illicit activity.
Fingerprinting Multi-exit Deep Neural Network Models via Inference Time
Dong, Tian, Qiu, Han, Zhang, Tianwei, Li, Jiwei, Li, Hewu, Lu, Jialiang
Transforming large deep neural network (DNN) models into the multi-exit architectures can overcome the overthinking issue and distribute a large DNN model on resource-constrained scenarios (e.g. IoT frontend devices and backend servers) for inference and transmission efficiency. Nevertheless, intellectual property (IP) protection for the multi-exit models in the wild is still an unsolved challenge. Previous efforts to verify DNN model ownership mainly rely on querying the model with specific samples and checking the responses, e.g., DNN watermarking and fingerprinting. However, they are vulnerable to adversarial settings such as adversarial training and are not suitable for the IP verification for multi-exit DNN models. In this paper, we propose a novel approach to fingerprint multi-exit models via inference time rather than inference predictions. Specifically, we design an effective method to generate a set of fingerprint samples to craft the inference process with a unique and robust inference time cost as the evidence for model ownership. We conduct extensive experiments to prove the uniqueness and robustness of our method on three structures (ResNet-56, VGG-16, and MobileNet) and three datasets (CIFAR-10, CIFAR-100, and Tiny-ImageNet) under comprehensive adversarial settings.
Detecting adversaries in Crowdsourcing
Traganitis, Panagiotis A., Giannakis, Georgios B.
Despite its successes in various machine learning and data science tasks, crowdsourcing can be susceptible to attacks from dedicated adversaries. This work investigates the effects of adversaries on crowdsourced classification, under the popular Dawid and Skene model. The adversaries are allowed to deviate arbitrarily from the considered crowdsourcing model, and may potentially cooperate. To address this scenario, we develop an approach that leverages the structure of second-order moments of annotator responses, to identify large numbers of adversaries, and mitigate their impact on the crowdsourcing task. The potential of the proposed approach is empirically demonstrated on synthetic and real crowdsourcing datasets.
Sparse MoEs meet Efficient Ensembles
Allingham, James Urquhart, Wenzel, Florian, Mariet, Zelda E, Mustafa, Basil, Puigcerver, Joan, Houlsby, Neil, Jerfel, Ghassen, Fortuin, Vincent, Lakshminarayanan, Balaji, Snoek, Jasper, Tran, Dustin, Ruiz, Carlos Riquelme, Jenatton, Rodolphe
Machine learning models based on the aggregated outputs of submodels, either at the activation or prediction levels, lead to strong performance. We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixture of experts (sparse MoEs). First, we show that these two approaches have complementary features whose combination is beneficial. Then, we present partitioned batch ensembles, an efficient ensemble of sparse MoEs that takes the best of both classes of models. Extensive experiments on fine-tuned vision transformers demonstrate the accuracy, log-likelihood, few-shot learning, robustness, and uncertainty calibration improvements of our approach over several challenging baselines. Partitioned batch ensembles not only scale to models with up to 2.7B parameters, but also provide larger performance gains for larger models.
AgFlow: Fast Model Selection of Penalized PCA via Implicit Regularization Effects of Gradient Flow
Jiang, Haiyan, Xiong, Haoyi, Wu, Dongrui, Liu, Ji, Dou, Dejing
Principal component analysis (PCA) has been widely used as an effective technique for feature extraction and dimension reduction. In the High Dimension Low Sample Size (HDLSS) setting, one may prefer modified principal components, with penalized loadings, and automated penalty selection by implementing model selection among these different models with varying penalties. The earlier work [1, 2] has proposed penalized PCA, indicating the feasibility of model selection in $L_2$- penalized PCA through the solution path of Ridge regression, however, it is extremely time-consuming because of the intensive calculation of matrix inverse. In this paper, we propose a fast model selection method for penalized PCA, named Approximated Gradient Flow (AgFlow), which lowers the computation complexity through incorporating the implicit regularization effect introduced by (stochastic) gradient flow [3, 4] and obtains the complete solution path of $L_2$-penalized PCA under varying $L_2$-regularization. We perform extensive experiments on real-world datasets. AgFlow outperforms existing methods (Oja [5], Power [6], and Shamir [7] and the vanilla Ridge estimators) in terms of computation costs.