 Performance Analysis


AnomMAN: Detect Anomaly on Multi-view Attributed Networks

arXiv.org Artificial Intelligence

Anomaly detection on attributed networks is widely used in web shopping, financial transactions, communication networks, and so on. However, most existing work detects anomalies on attributed networks by considering only a single kind of interaction action, and therefore cannot exploit the rich variety of interaction actions in multi-view attributed networks. In fact, it remains a challenging task to consider all kinds of interaction actions uniformly and detect anomalous instances in multi-view attributed networks. In this paper, we propose a Graph Convolution based framework, AnomMAN, to detect Anomaly on Multi-view Attributed Networks. To consider the attributes and all interaction actions jointly, we use the attention mechanism to define the importance of all views in networks. Besides, the Graph Convolution operation cannot be simply applied in anomaly detection tasks on account of its low-pass characteristic. Therefore, AnomMAN uses a graph auto-encoder module to overcome this shortcoming and turn it into a strength. According to experiments on real-world datasets, AnomMAN outperforms state-of-the-art models and two variants of our proposed model. Besides, the Accuracy@50 indicator of AnomMAN reaches 1.000 on the dataset, which shows that the top 50 instances ranked as anomalous by AnomMAN are all truly anomalous.
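The Accuracy@50 figure quoted above can be read as the fraction of the K highest-ranked instances that are truly anomalous. The following is a minimal sketch of that reading, not the authors' evaluation code; the function name `accuracy_at_k` and the toy scores and labels are assumptions made for illustration.

```python
def accuracy_at_k(scores, labels, k):
    """Fraction of the k highest-scoring instances that are true anomalies.

    scores: anomaly scores, higher means more anomalous (hypothetical values).
    labels: 1 for a true anomaly, 0 for a normal instance.
    """
    if k <= 0 or k > len(scores):
        raise ValueError("k must be between 1 and the number of instances")
    # Rank instance indices by descending anomaly score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_k = ranked[:k]
    return sum(labels[i] for i in top_k) / k


# Toy data: instances 0 and 2 are the true anomalies.
scores = [0.91, 0.12, 0.85, 0.40, 0.77, 0.05]
labels = [1, 0, 1, 0, 0, 0]

print(accuracy_at_k(scores, labels, 2))  # both top-2 instances are anomalies
print(accuracy_at_k(scores, labels, 3))  # the third-ranked instance is normal
```

Under this reading, a value of 1.000 at K = 50 means every one of the 50 top-ranked instances carries an anomaly label.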


Experts warn prenatal screening tests can lead to false positive results in some cases

FOX News

Non-invasive prenatal testing (NIPT) on pregnant women to detect the risk of a fetus having rare genetic abnormalities may often be wrong, according to recent reports. These tests, according to multiple health experts, can actually give false positives, which can create significant angst in expecting parents. Health experts explained to Fox News that NIPT works by taking blood samples from the pregnant mother and then analyzing fragments of free-floating cell-free DNA (cfDNA).


Applications of Signature Methods to Market Anomaly Detection

arXiv.org Machine Learning

While these instances are called outliers (anomalies), the normal instances are called inliers. Anomaly detection is a fundamental research problem that has been investigated by researchers from diverse research fields and application areas. Anomaly detection can be done manually by searching through whole data clouds to diagnose the problem, but clearly this is a long and labour-intensive process. Anomaly detection often appears in the context of uncertainty, i.e. absence, principal or not, of knowledge on the data generating process. Hence, over time, a plethora of anomaly detection techniques ranging from simple statistical techniques to complex machine learning algorithms has been developed for certain application areas such as fraud detection in financial transactions (West and Bhattacharya (2016)), fault detection in production (Miljković (2011)), intrusion detection in a computer network (Sabahi and Movaghar (2008)), etc. Some of the well-known statistical methods such as the z-score, the Tukey method (Interquartile Range) or Gaussian Mixture models can be useful for the initial screening of outliers. Although these statistical or econometric anomaly detection methods are well rooted in the literature (we refer the reader to Chandola et al. (2009) for an extensive review), dating back to Edgeworth (1887), many of them have failed to provide sufficient performance and accuracy in the last decade. This is mainly due to the big data collected from various sources such as financial transactions, health records, and surveillance logs. Nowadays high-volume, high-velocity, and high-variety data sets demand cost-effective novel data analytics for decision-making and to infer useful insights
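As an illustration of the initial screening methods named above, here is a minimal sketch of z-score and Tukey (interquartile range) outlier flagging using only the Python standard library. The function names, thresholds, and sample data are assumptions chosen for the example and are not taken from the paper.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, nothing to flag
    return [v for v in values if abs(v - mean) / std > threshold]


def tukey_outliers(values, k=1.5):
    """Return values outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements with one obvious outlier.
data = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0]

# A threshold of 2.0 is used here because the sample is tiny: the outlier
# inflates the standard deviation, so the usual cut-off of 3 flags nothing.
print(zscore_outliers(data, threshold=2.0))
print(tukey_outliers(data))
```

The example also shows why the interquartile-range rule is often preferred for small samples: the quartiles are barely affected by the extreme value, whereas the mean and standard deviation used by the z-score are pulled towards it.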


United adversarial learning for liver tumor segmentation and detection of multi-modality non-contrast MRI

arXiv.org Artificial Intelligence

Simultaneous segmentation and detection of liver tumors (hemangioma and hepatocellular carcinoma (HCC)) using multi-modality non-contrast magnetic resonance imaging (NCMRI) are crucial for clinical diagnosis. However, this is still a challenging task because: (1) the HCC information on NCMRI is invisible or insufficient, which makes extraction of liver tumor features difficult; (2) diverse imaging characteristics in multi-modality NCMRI make feature fusion and selection difficult; (3) the lack of specific information distinguishing hemangioma from HCC on NCMRI makes liver tumor detection difficult. In this study, we propose a united adversarial learning framework (UAL) for simultaneous liver tumor segmentation and detection using multi-modality NCMRI. The UAL first utilizes a multi-view aware encoder to extract multi-modality NCMRI information for liver tumor segmentation and detection. In this encoder, a novel edge dissimilarity feature pyramid module is designed to facilitate the complementary multi-modality feature extraction. Second, the newly designed fusion and selection channel is used to fuse the multi-modality features and decide which features to select. Then, the proposed mechanism of coordinate sharing with padding integrates the segmentation and detection tasks so that both can perform united adversarial learning in one discriminator. Lastly, an innovative multi-phase radiomics guided discriminator exploits the clear and specific tumor information to improve the multi-task performance via the adversarial learning strategy. The UAL is validated on corresponding multi-modality NCMRI (i.e. T1FS pre-contrast MRI, T2FS MRI, and DWI) and three-phase contrast-enhanced MRI of 255 clinical subjects. The experiments show that UAL has great potential in the clinical diagnosis of liver tumors.


Detection of extragalactic Ultra-Compact Dwarfs and Globular Clusters using Explainable AI techniques

arXiv.org Artificial Intelligence

Compact stellar systems such as Ultra-compact dwarfs (UCDs) and Globular Clusters (GCs) around galaxies are known to be tracers of the merger events that have been forming these galaxies. Therefore, identifying such systems allows the study of galaxy mass assembly, formation and evolution. However, in the absence of spectroscopic information, detecting UCDs/GCs using imaging data is very uncertain. Here, we aim to train a machine learning model to separate these objects from the foreground stars and background galaxies using the multi-wavelength imaging data of the Fornax galaxy cluster in 6 filters, namely u, g, r, i, J and Ks. The classes of objects are highly imbalanced, which is problematic for many automatic classification techniques. Hence, we employ Synthetic Minority Over-sampling to handle the imbalance of the training data. Then, we compare two classifiers, namely Localized Generalized Matrix Learning Vector Quantization (LGMLVQ) and Random Forest (RF). Both methods are able to identify UCDs/GCs with a precision and a recall of >93 percent and provide relevances that reflect the importance of each feature dimension for the classification. Both methods detect angular sizes as important markers for this classification problem. While the astronomical expectation is that the color indices u-i and i-Ks are the most important colors, our analysis shows that colors such as g-r are more informative, potentially because of their higher signal-to-noise ratio. Besides its excellent performance, the LGMLVQ method allows further interpretability by providing the feature importance for each individual class, class-wise representative samples and the possibility of non-linear visualization of the data, as demonstrated in this contribution. We conclude that employing machine learning techniques to identify UCDs/GCs can lead to promising results.
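The core idea behind Synthetic Minority Over-sampling is to create new minority-class points by interpolating between an existing minority sample and one of its nearest minority neighbours. The sketch below is a simplified, standard-library illustration of that idea, not the authors' pipeline and not the API of any particular library; `smote_like`, its parameters, and the toy feature vectors are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_like(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` among the other minority samples.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature minority class (for example, a color and a size).
minority = [(0.9, 1.2), (1.0, 1.0), (1.1, 1.3), (0.8, 1.1)]
new_points = smote_like(minority, n_new=6)
print(len(new_points))
```

Because each synthetic point lies on a segment between two real minority samples, the new points stay inside the region already occupied by the minority class rather than being exact duplicates of it.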


Towards Understanding and Harnessing the Effect of Image Transformation in Adversarial Detection

arXiv.org Artificial Intelligence

Deep neural networks (DNNs) are threatened by adversarial examples. Adversarial detection, which distinguishes adversarial images from benign images, is fundamental for robust DNN-based services. Image transformation is one of the most effective approaches to detect adversarial examples. During the last few years, a variety of image transformations have been studied and discussed to design reliable adversarial detectors. In this paper, we systematically synthesize the recent progress on adversarial detection via image transformations with a novel classification method. Then, we conduct extensive experiments to test the detection performance of image transformations against state-of-the-art adversarial attacks. Furthermore, we reveal that each individual transformation is not capable of detecting adversarial examples in a robust way, and propose a DNN-based approach referred to as AdvJudge, which combines scores of 9 image transformations. Without knowing which individual scores are misleading or not misleading, AdvJudge can make the right judgment, and achieve a significant improvement in detection accuracy. We claim that AdvJudge is a more effective adversarial detector than those based on an individual image transformation.


Why AI is now table-stakes in cybersecurity

#artificialintelligence

When we stride down the aisles at our local grocer, shelves are full of products vying for our attention. To make their way into our shopping carts, some tout their superior performance on their packaging, and some even try to back their claims up with some magical ingredient. Yet when the rubber meets the road, few of us expect a laundry detergent empowered by such a magical compound to truly get rid of all traces of stains from holiday cooking. While the stakes may be high if our favorite pair of trousers is involved, they are surely higher when picking a security solution. In cybersecurity, most offerings tout some level of AI.


QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits

arXiv.org Artificial Intelligence

Quantum noise is the key challenge in Noisy Intermediate-Scale Quantum (NISQ) computers. Previous work for mitigating noise has primarily focused on gate-level or pulse-level noise-adaptive compilation. However, limited research efforts have explored a higher level of optimization by making the quantum circuits themselves resilient to noise. We propose QuantumNAS, a comprehensive framework for noise-adaptive co-search of the variational circuit and qubit mapping. Variational quantum circuits are a promising approach for constructing QML and quantum simulation. However, finding the best variational circuit and its optimal parameters is challenging due to the large design space and parameter training cost. We propose to decouple the circuit search and parameter training by introducing a novel SuperCircuit. The SuperCircuit is constructed with multiple layers of pre-defined parameterized gates and trained by iteratively sampling and updating the parameter subsets (SubCircuits) of it. It provides an accurate estimation of SubCircuits performance trained from scratch. Then we perform an evolutionary co-search of SubCircuit and its qubit mapping. The SubCircuit performance is estimated with parameters inherited from SuperCircuit and simulated with real device noise models. Finally, we perform iterative gate pruning and finetuning to remove redundant gates. Extensively evaluated with 12 QML and VQE benchmarks on 14 quantum computers, QuantumNAS significantly outperforms baselines. For QML, QuantumNAS is the first to demonstrate over 95% 2-class, 85% 4-class, and 32% 10-class classification accuracy on real QC. It also achieves the lowest eigenvalue for VQE tasks on H2, H2O, LiH, CH4, BeH2 compared with UCCSD. We also open-source TorchQuantum (https://github.com/mit-han-lab/torchquantum) for fast training of parameterized quantum circuits to facilitate future research.


New Hard-thresholding Rules based on Data Splitting in High-dimensional Imbalanced Classification

arXiv.org Machine Learning

In binary classification, imbalance refers to situations in which one class is heavily under-represented. This issue arises either from the data collection process or because one class is indeed rare in the population. Imbalanced classification frequently arises in applications such as biology, medicine, engineering, and social sciences. In this paper, for the first time, we theoretically study the impact of imbalanced class sizes on linear discriminant analysis (LDA) in high dimensions. We show that, due to data scarcity in one class, referred to as the minority class, and the high dimensionality of the feature space, the LDA ignores the minority class, yielding a maximum misclassification rate. We then propose a new construction of hard-thresholding rules based on a data splitting technique that reduces the large difference between the misclassification rates. We show that the proposed method is asymptotically optimal. We further study two well-known sparse versions of the LDA in imbalanced cases. We evaluate the finite-sample performance of different methods using simulations and by analyzing two real data sets. The results show that our method either outperforms its competitors or has comparable performance based on a much smaller subset of selected features, while being computationally more efficient.


Building Interpretable Models on Imbalanced Data

#artificialintelligence

I've always believed that to truly learn data science you need to practice data science, and I wanted to do this project to practice working with imbalanced classes in classification problems. This was also a perfect opportunity to start working with mlflow to help track my machine learning experiments: it allows me to track the different models I have used, the parameters I've trained with, and the metrics I've recorded. This project was aimed at predicting customer churn using the telecommunications data found on Kaggle [1] (which is a publicly available synthetic dataset). That is, we want to be able to predict if a given customer is going to leave the telecom provider based on the information we have on that customer. Now, why is this useful? Well, if we can predict which customers we think are going to leave before they leave then we can try to do something about it! For example, we could target them with specific offers, and maybe we could even use the model to provide us insight into what to offer them because we will know, or at least have an idea, as to why they are leaving.