 Accuracy


Graph Domain Adaptation with Localized Graph Signal Representations

arXiv.org Machine Learning

Yusuf Yiğit Pilavcı, Eylem Tuğçe Güneyi, Cemil Cengiz and Elif Vural

Abstract: In this paper we propose a domain adaptation algorithm designed for graph domains. Given a source graph with many labeled nodes and a target graph with few or no labeled nodes, we aim to estimate the target labels by making use of the similarity between the characteristics of the variation of the label functions on the two graphs. Our assumption about the source and the target domains is that the local behaviour of the label function, such as its spread and speed of variation on the graph, bears resemblance between the two graphs. We estimate the unknown target labels by solving an optimization problem where the label information is transferred from the source graph to the target graph based on the prior that the projections of the label functions onto localized graph bases are similar between the source and the target graphs. In order to efficiently capture the local variation of the label functions on the graphs, spectral graph wavelets are used as the graph bases. Experimentation on various data sets shows that the proposed method yields quite satisfactory classification accuracy compared to reference domain adaptation methods.

Keywords: Domain adaptation, spectral graph theory, graph signal processing, spectral graph wavelets, graph Laplacian

1 Introduction

A common assumption in machine learning is that the training and the test data are sampled from the same distribution. Domain adaptation methods aim to provide solutions to machine learning problems where this assumption does not hold, by dealing with the discrepancy between the two distributions. In domain adaptation, a source domain and a target domain are considered, where the label information is mostly available for the data samples in the source domain, and few or none of the class labels are known in the target domain. The purpose is then to improve the learning performance in the target domain by making use of the information available in the source domain.

A variety of approaches have been proposed so far for the domain adaptation problem. Some methods are based on reweighing the samples for removing the sample selection bias [1, 2]. Another common solution is to align the source and the target domains through feature space mappings.

Y. Y. Pilavcı is with the GIPSA Lab at Université Grenoble Alpes, Grenoble. C. Cengiz is with the Dept. of Computer Science and Engineering at Koç University, Istanbul. Most of this work was performed while the authors were at METU. arXiv:1911.02883v1
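For readers unfamiliar with the graph bases named in the abstract, the following is a sketch of the usual textbook construction of spectral graph wavelets, not a formula taken from the paper itself. The symbols are assumptions introduced here: L is the graph Laplacian of a graph with N nodes, U and Λ hold its eigenvectors u_ℓ and eigenvalues λ_ℓ, g is a band-pass kernel, s is a scale, δ_n is the indicator vector of node n, and f is a signal on the nodes (for example a label function).

```latex
\begin{align}
L &= U \Lambda U^{\top}, \qquad \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_N), \\
\psi_{s,n} &= U \, g(s\Lambda) \, U^{\top} \delta_n, \\
W_f(s,n) &= \langle \psi_{s,n}, f \rangle
          = \sum_{\ell=1}^{N} g(s\lambda_\ell) \, \hat{f}(\ell) \, u_\ell(n),
\qquad \hat{f} = U^{\top} f .
\end{align}
```

Under this construction the coefficient W_f(s,n) measures how f varies around node n at scale s, which is the kind of localized projection the abstract describes comparing between the source and target graphs.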


Fair Meta-Learning: Learning How to Learn Fairly

arXiv.org Machine Learning

Data sets for fairness-relevant tasks can lack examples or be biased with respect to a specific label of a sensitive attribute. We demonstrate the usefulness of weight-based meta-learning approaches in such situations. For models that can be trained through gradient descent, we demonstrate that there are parameter configurations from which models that are both fair and accurate can be obtained in a few gradient steps and with minimal data. To learn such weight sets, we adapt the popular MAML algorithm to Fair-MAML by including a fairness regularization term. In practice, Fair-MAML allows practitioners to train fair machine learning models from only a few examples when data from related tasks is available. We empirically exhibit the value of this technique by comparing to relevant baselines.
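As a rough illustration of what "a fairness regularization term" can look like, the sketch below adds a demographic-parity gap to the cross-entropy loss of a logistic model. It is a minimal sketch under stated assumptions: the regularizer, the names `regularized_loss` and `gamma`, and the toy data are all hypothetical, and the abstract does not say which regularizer Fair-MAML uses. Only the regularized objective is shown, not the meta-learning loop.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability of the positive class under a logistic model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def regularized_loss(weights, bias, X, y, group, gamma=1.0):
    """Cross-entropy plus gamma times a demographic-parity gap.

    The gap is the absolute difference between the mean predicted
    probability in group 0 and in group 1 (an assumed regularizer).
    Returns (total, cross_entropy, gap).
    """
    probs = [predict(weights, bias, x) for x in X]
    eps = 1e-12  # guards against log(0)
    bce = -sum(
        t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
        for p, t in zip(probs, y)
    ) / len(y)
    g0 = [p for p, g in zip(probs, group) if g == 0]
    g1 = [p for p, g in zip(probs, group) if g == 1]
    gap = abs(sum(g0) / len(g0) - sum(g1) / len(g1))
    return bce + gamma * gap, bce, gap


# Toy data: the second feature is the sensitive attribute itself.
X = [[1.0, 0.0], [0.5, 0.0], [1.0, 1.0], [0.5, 1.0]]
y = [1, 0, 1, 0]
group = [0, 0, 1, 1]

# A model that leans on the sensitive feature is penalized by the gap term.
total, bce, gap = regularized_loss([0.0, 2.0], 0.0, X, y, group, gamma=1.0)
print(total, bce, gap)
```

In a MAML-style setup, a loss of this shape would take the place of the plain task loss in both the inner adaptation step and the outer meta-update; `gamma` trades accuracy against the fairness penalty.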


Researchers develop machine learning-based detector that stops lateral phishing attacks - Help Net Security

#artificialintelligence

Lateral phishing attacks, scams targeting users from compromised email accounts within an organization, are becoming an increasing concern in the U.S. Whereas in the past attackers would send phishing scams from email accounts external to an organization, recently there's been an explosion of email-borne scams in which attackers compromise email accounts within organizations and then use those accounts to launch internal phishing emails to fellow employees, the kind of attacks known as lateral phishing. And when a phishing email comes from an internal account, the vast majority of email security systems can't stop it. Existing security systems largely detect cyber attacks that come from the outside, relying on signals like IP and domain reputation, which are ineffective when the email comes from an internal source. Lateral phishing attacks are also costly. FBI data shows that these cyberattacks caused more than $12 billion in losses between 2013 and 2018.


How to make algorithms fairer

#artificialintelligence

Fixing algorithms may not be the best response to bias. Ethicist Tom Douglas offers a more radical approach to creating fairness, one that aims for 'substantive' rather than 'procedural' fairness outside of design. Our lives are increasingly affected by algorithms. People may be denied loans, jobs, insurance policies, or even parole on the basis of risk scores that they produce. Yet algorithms are notoriously prone to biases.


Detecting Point Outliers Using Prune-based Outlier Factor (PLOF)

arXiv.org Machine Learning

Outlier detection (also known as anomaly detection or deviation detection) is the process of detecting data points whose patterns deviate significantly from others. Outliers are common in industry applications and can be generated by different causes such as human error, fraudulent activities, or system failure. Recently, density-based methods have shown promising results, among which Local Outlier Factor (LOF) is arguably dominant. However, one of the major drawbacks of LOF is that it is computationally expensive. Motivated by this problem, this research presents a novel pruning-based procedure in which the execution time of LOF is reduced while the performance is maintained. A novel Prune-based Local Outlier Factor (PLOF) approach is proposed, in which the outlierness of each data instance is measured prior to employing LOF. Next, based on a threshold, data instances that require further investigation are separated, and the LOF score is computed only for these points. Extensive experiments have been conducted and the results are promising. Comparison experiments with the original LOF and two state-of-the-art variants of LOF have shown that PLOF produces higher accuracy and precision while reducing execution time.
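The prune-then-score idea described above can be sketched as follows. This is an illustration under stated assumptions, not the PLOF algorithm itself: the abstract does not say how outlierness is pre-screened, so the sketch uses distance from the data centroid with a mean-plus-z-standard-deviations threshold as a stand-in. The names `prune_then_lof`, `lof_scores`, and the parameter `z` are hypothetical. The LOF part follows the standard definition (k-distance, reachability distance, local reachability density).

```python
import math


def knn(points, i, k):
    """Indices of the k nearest neighbours of point i (ties broken by index)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: (math.dist(points[i], points[j]), j))
    return others[:k]


def lof_scores(points, k, candidates):
    """Standard LOF, evaluated only for the indices in `candidates`.

    Neighbour lists and densities are cached and computed on demand, so
    work is only done for candidates and the points they depend on.
    Assumes no duplicate points (which would give zero reachability sums).
    """
    nbrs, kdist, lrd = {}, {}, {}

    def neighbours(i):
        if i not in nbrs:
            nbrs[i] = knn(points, i, k)
            kdist[i] = math.dist(points[i], points[nbrs[i][-1]])
        return nbrs[i]

    def local_reach_density(i):
        if i not in lrd:
            reach = []
            for o in neighbours(i):
                neighbours(o)  # make sure kdist[o] is cached
                reach.append(max(kdist[o], math.dist(points[i], points[o])))
            lrd[i] = len(reach) / sum(reach)
        return lrd[i]

    return {
        i: sum(local_reach_density(o) for o in neighbours(i))
        / (k * local_reach_density(i))
        for i in candidates
    }


def prune_then_lof(points, k=3, z=1.0):
    """Cheap pre-screen (assumed: centroid distance), then LOF on the rest."""
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    d = [math.dist(p, centroid) for p in points]
    mean = sum(d) / len(d)
    std = math.sqrt(sum((x - mean) ** 2 for x in d) / len(d))
    candidates = [i for i, x in enumerate(d) if x > mean + z * std]
    scores = {i: 1.0 for i in range(len(points))}  # pruned points: treated as inliers
    scores.update(lof_scores(points, k, candidates))
    return scores, candidates


# A 3x3 grid of inliers plus one distant point.
points = [(float(x), float(y)) for x in range(3) for y in range(3)] + [(10.0, 10.0)]
scores, candidates = prune_then_lof(points, k=3, z=1.0)
print(candidates, scores[candidates[0]])
```

The design choice to note is that pruned points receive a neutral score of 1.0 without any neighbour search, so the saving depends on how few points survive the pre-screen.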


An "augmentation-free" rotation invariant classification scheme on point-cloud and its application to neuroimaging

arXiv.org Machine Learning

Recent years have witnessed the emergence and increasing popularity of 3D medical imaging techniques with the development of 3D sensors and technology. However, achieving geometric invariance in the processing of 3D medical images is computationally expensive but nonetheless essential due to the presence of possible errors caused by rigid registration techniques. An alternative way to analyze medical images is to understand the 3D shapes represented as point clouds. Though 3D point-cloud processing is not a "go-to" choice in the medical imaging community, it is a canonical way to preserve rotation invariance. Unfortunately, due to the discrete topology, one cannot use the standard convolution operator on point clouds. To the best of our knowledge, the existing ways to do "convolution" cannot preserve rotation invariance without explicit data augmentation. Therefore, we propose a rotation invariant convolution operator by inducing topology from the hypersphere. Experimental validation has been performed on the publicly available OASIS dataset in terms of classification accuracy between subjects with and without dementia, demonstrating the usefulness of our proposed method in terms of model complexity, classification accuracy, and, last but most important, invariance to rotations.


Practical Compositional Fairness: Understanding Fairness in Multi-Task ML Systems

arXiv.org Machine Learning

Most literature in fairness has focused on improving fairness with respect to one single model or one single objective. However, real-world machine learning systems are usually composed of many different components. Unfortunately, recent research has shown that even if each component is "fair", the overall system can still be "unfair". In this paper, we focus on how well fairness composes over multiple components in real systems. We consider two recently proposed fairness metrics for rankings: exposure and pairwise ranking accuracy gap. We provide theory that demonstrates a set of conditions under which fairness of individual models does compose. We then present an analytical framework for both understanding whether a system's signals can achieve compositional fairness, and diagnosing which of these signals lowers the overall system's end-to-end fairness the most. Despite previously bleak theoretical results, on multiple data-sets -- including a large-scale real-world recommender system -- we find that the overall system's end-to-end fairness is largely achievable by improving fairness in individual components.
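To make the second metric concrete, here is a minimal sketch of a pairwise ranking accuracy gap between two groups. It uses one common formulation, which is an assumption here since the abstract does not define the metric: for a group, pairwise accuracy is the fraction of (positive item from that group, negative item) pairs in which the positive item receives the higher score. The function names and toy data are hypothetical.

```python
def pairwise_accuracy(items, group):
    """Fraction of (positive in `group`, negative) pairs ranked correctly.

    `items` is a list of (score, label, group) tuples with label in {0, 1}.
    Negatives are drawn from all groups (an assumed convention).
    """
    pos = [s for s, y, g in items if y == 1 and g == group]
    neg = [s for s, y, _ in items if y == 0]
    pairs = [(p, n) for p in pos for n in neg]
    if not pairs:
        raise ValueError("no positive/negative pairs for this group")
    return sum(p > n for p, n in pairs) / len(pairs)


def pairwise_accuracy_gap(items, group_a, group_b):
    """Difference in pairwise accuracy between two groups."""
    return pairwise_accuracy(items, group_a) - pairwise_accuracy(items, group_b)


# Toy ranking: (score, label, group)
items = [
    (0.9, 1, "a"),
    (0.8, 0, "b"),
    (0.7, 1, "b"),
    (0.4, 0, "a"),
    (0.3, 1, "b"),
]
print(pairwise_accuracy_gap(items, "a", "b"))
```

A gap near zero means positives from both groups are ranked above negatives about equally often; a large gap means one group's relevant items are systematically ranked lower.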


More than a million payment card frauds thwarted in past year

#artificialintelligence

The National Cyber Security Centre (NCSC) revealed it has thwarted over a million cases of suspected payment card fraud in the past year, according to the organisation's third Annual Review. The NCSC's latest figures also revealed it dealt with 658 UK cyber-attacks during the past twelve months, taking the total to more than 1,800 over three years. Payment fraud, which historically has been driven by card cloning, has since migrated towards transactions where the card does not need to be present, such as purchases online. Head of artificial intelligence at SAS UK & Ireland, Caroline Hermon, commented: "The rapid expansion of payment services over the last few years has led to consumer demands for convenience and flexibility with new payment methods. Banks and other financial institutions are aware that they must meet these demands, but they are also aware that these new payment systems leave them open to new forms of fraud."


Long-range Event-level Prediction and Response Simulation for Urban Crime and Global Terrorism with Granger Networks

arXiv.org Machine Learning

Large-scale trends in urban crime and global terrorism are well-predicted by socio-economic drivers, but focused, event-level predictions have had limited success. Standard machine learning approaches are promising, but lack interpretability, are generally interpolative, and are ineffective for precise future interventions because of costly and wasteful false positives. Here, we introduce Granger Network inference as a new forecasting approach for individual infractions, with demonstrated performance far surpassing past results, yet transparent enough to validate and extend social theory. Considering the problem of predicting crime in the City of Chicago, we achieve an average AUC of ~90% for events predicted a week in advance within spatial tiles approximately 1000 ft across. Instead of pre-supposing that crimes unfold across contiguous spaces akin to diffusive systems, we learn the local transport rules from data. As our key insights, we uncover indications of suburban bias -- how law-enforcement response is modulated by socio-economic contexts with disproportionately negative impacts in the inner city -- and how the dynamics of violent and property crimes co-evolve and constrain each other -- lending quantitative support to controversial pro-active policing policies. To demonstrate broad applicability to spatio-temporal phenomena, we analyze terror attacks in the Middle East in the recent past, and achieve an AUC of ~80% for predictions made a week in advance, within spatial tiles measuring approximately 120 miles across. We conclude that while crime operates near an equilibrium, quickly dissipating perturbations, terrorism does not. Indeed, terrorism aims to destabilize social order, as shown by its dynamics being susceptible to run-away increases in event rates under small perturbations.


The Tale of Evil Twins: Adversarial Inputs versus Backdoored Models

arXiv.org Machine Learning

Despite their tremendous success in a wide range of applications, deep neural network (DNN) models are inherently vulnerable to two types of malicious manipulations: adversarial inputs, which are crafted samples that deceive target DNNs, and backdoored models, which are forged DNNs that misbehave on trigger-embedded inputs. While prior work has intensively studied the two attack vectors in parallel, there is still a lack of understanding about their fundamental connection, which is critical for assessing the holistic vulnerability of DNNs deployed in realistic settings. In this paper, we bridge this gap by conducting the first systematic study of the two attack vectors within a unified framework. More specifically, (i) we develop a new attack model that integrates both adversarial inputs and backdoored models; (ii) with both analytical and empirical evidence, we reveal that there exists an intricate "mutual reinforcement" effect between the two attack vectors; (iii) we demonstrate that this effect enables a large spectrum for the adversary to optimize the attack strategies, such as maximizing attack evasiveness with respect to various defenses and designing trigger patterns satisfying multiple desiderata; (iv) finally, we discuss potential countermeasures against this unified attack and their technical challenges, which lead to several promising research directions.