 Xu, Xiwei


IronForge: An Open, Secure, Fair, Decentralized Federated Learning

arXiv.org Artificial Intelligence

Federated learning (FL) provides an effective machine learning (ML) architecture to protect data privacy in a distributed manner. However, the inevitable network asynchrony, the over-dependence on a central coordinator, and the lack of an open and fair incentive mechanism collectively hinder its further development. We propose IronForge, a new-generation FL framework that features a Directed Acyclic Graph (DAG)-based data structure and eliminates the need for central coordinators to achieve fully decentralized operation. IronForge runs in a public and open network and launches a fair incentive mechanism by enabling state consistency in the DAG, so that the system fits networks where training resources are unevenly distributed. In addition, dedicated defense strategies against prevalent FL attacks on incentive fairness and data privacy are presented to ensure the security of IronForge. Experimental results based on a newly developed testbed, FLSim, highlight the superiority of IronForge over existing prevalent FL frameworks under various specifications of performance, fairness, and security. To the best of our knowledge, IronForge is the first secure and fully decentralized FL framework that can be applied in open networks with realistic network and training settings.
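As a rough illustration of the DAG-based data structure described above, the following minimal Python sketch lets each vertex carry a model update that aggregates a set of unreferenced "tip" vertices. The names DagNode, select_tips, and publish_update are hypothetical, the tip selection is uniform rather than accuracy-weighted, and IronForge's verification, incentive, and consensus logic is omitted entirely.

```python
# Minimal sketch of a DAG of model updates, assuming FedAvg-style merging.
import hashlib
import numpy as np

class DagNode:
    def __init__(self, weights, parents):
        self.weights = weights   # flattened model parameters
        self.parents = parents   # hashes of parent vertices
        self.hash = hashlib.sha256(weights.tobytes()).hexdigest()

class Dag:
    def __init__(self, genesis_weights):
        genesis = DagNode(genesis_weights, parents=[])
        self.nodes = {genesis.hash: genesis}
        self.tips = {genesis.hash}   # vertices not yet referenced

    def select_tips(self, k=2):
        # A real system would weight tips by verified model quality;
        # here we sample uniformly from the unreferenced vertices.
        rng = np.random.default_rng()
        chosen = rng.choice(sorted(self.tips),
                            size=min(k, len(self.tips)), replace=False)
        return [self.nodes[h] for h in chosen]

    def publish_update(self, local_weights, alpha=0.5):
        parents = self.select_tips()
        parent_avg = np.mean([p.weights for p in parents], axis=0)
        merged = alpha * local_weights + (1 - alpha) * parent_avg
        node = DagNode(merged, parents=[p.hash for p in parents])
        self.nodes[node.hash] = node
        self.tips -= {p.hash for p in parents}
        self.tips.add(node.hash)
        return node

dag = Dag(genesis_weights=np.zeros(10))
dag.publish_update(local_weights=np.random.randn(10))
```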


Learning to Infer Counterfactuals: Meta-Learning for Estimating Multiple Imbalanced Treatment Effects

arXiv.org Artificial Intelligence

We regularly consider answering counterfactual questions in practice, such as "Would people with diabetes have taken a turn for the better had they chosen another medication?". Observational studies are growing in significance for answering such questions due to their widespread accumulation and comparatively easier acquisition than Randomized Controlled Trials (RCTs). Recently, some works have introduced representation learning and domain adaptation into counterfactual inference. However, most current works focus on the setting of binary treatments, and none of them considers that the sample sizes of different treatments can be imbalanced, with data examples in some treatment groups relatively limited due to inherent user preference. In this paper, we design a new algorithmic framework for counterfactual inference, Meta-learning for Estimating Individual Treatment Effects (MetaITE), which brings ideas from meta-learning to fill the above research gaps, especially the case of multiple imbalanced treatments. Specifically, we regard data episodes among treatment groups in counterfactual inference as meta-learning tasks. We train a meta-learner from a set of source treatment groups with sufficient samples and update the model by gradient descent with the limited samples in the target treatment. Moreover, we introduce two complementary losses: a supervised loss on the multiple source treatments, and a loss that aligns latent distributions among the treatment groups to reduce their discrepancy. We perform experiments on two real-world datasets to evaluate inference accuracy and generalization ability. Experimental results demonstrate that MetaITE matches or outperforms state-of-the-art methods.
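The two complementary losses lend themselves to a short sketch. The following hypothetical PyTorch snippet shows one meta-training episode: a supervised outcome loss on a large source treatment group plus a distribution-alignment penalty against a small target group. The encoder/head sizes and the linear-kernel MMD estimator are illustrative assumptions, not the paper's exact MetaITE procedure.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(25, 64), nn.ReLU(), nn.Linear(64, 32))
outcome_head = nn.Linear(32, 1)
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(outcome_head.parameters()), lr=1e-3)

def mmd(x, y):
    # Linear-kernel MMD between two latent batches (illustrative estimator).
    return (x.mean(0) - y.mean(0)).pow(2).sum()

def meta_step(source_x, source_y, target_x, lam=0.1):
    """One episode: supervised loss on a source treatment group plus a
    distribution-alignment loss against a smaller target group."""
    z_src, z_tgt = encoder(source_x), encoder(target_x)
    sup_loss = nn.functional.mse_loss(outcome_head(z_src), source_y)
    align_loss = mmd(z_src, z_tgt)
    loss = sup_loss + lam * align_loss
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Toy episode: 128 source samples vs. 16 target samples.
meta_step(torch.randn(128, 25), torch.randn(128, 1), torch.randn(16, 25))
```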


Software Engineering for Responsible AI: An Empirical Study and Operationalised Patterns

arXiv.org Artificial Intelligence

Although artificial intelligence (AI) is solving real-world challenges and transforming industries, there are serious concerns about its ability to behave and make decisions responsibly. Many AI ethics principles and guidelines for responsible AI have recently been issued by governments, organisations, and enterprises. However, these principles and guidelines are typically high-level and do not provide concrete guidance on how to design and develop responsible AI systems. To address this shortcoming, we first present an empirical study in which we interviewed 21 scientists and engineers to understand practitioners' perceptions of AI ethics principles and their implementation. We then propose a template that enables AI ethics principles to be operationalised in the form of concrete patterns, and suggest a list of patterns using the newly created template. These patterns provide concrete, operationalised guidance that facilitates the development of responsible AI systems.


Cycle-Balanced Representation Learning For Counterfactual Inference

arXiv.org Artificial Intelligence

With the widespread accumulation of observational data, researchers have obtained a new direction for learning counterfactual effects in many domains (e.g., health care and computational advertising) without Randomized Controlled Trials (RCTs). However, observational data suffer from inherently missing counterfactual outcomes and from distribution discrepancy between treatment and control groups due to behaviour preference. Motivated by recent advances in representation learning for domain adaptation, we propose a novel framework based on Cycle-Balanced REpresentation learning for counterfactual inference (CBRE) to solve the above problems. Specifically, we learn a robust balanced representation for the different groups using adversarial training and, meanwhile, construct an information loop that cyclically preserves the original data properties, reducing information loss when transforming data into the latent representation space. Experimental results on three real-world datasets demonstrate that CBRE matches or outperforms state-of-the-art methods and has great potential to be applied to counterfactual inference.
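The two ingredients named above, adversarial balancing and an information loop, can be sketched as loss terms. The PyTorch snippet below is an illustrative approximation, not the exact CBRE objective: the critic loss would be minimized on the critic side and maximized on the encoder side in an alternating scheme, while the reconstruction loss closes the loop.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(25, 32), nn.ReLU())
decoder = nn.Sequential(nn.Linear(32, 25))   # closes the information loop
critic = nn.Sequential(nn.Linear(32, 1))     # tries to tell treated from control

def cbre_losses(x, t):
    """x: covariates (batch, 25); t: binary treatment indicator (batch,)."""
    z = encoder(x)
    # Adversarial balance: train the critic to minimize this loss and the
    # encoder to maximize it, so treatment groups become indistinguishable.
    critic_loss = nn.functional.binary_cross_entropy_with_logits(
        critic(z).squeeze(-1), t)
    # Cycle consistency: reconstruct x from z to limit information loss.
    recon_loss = nn.functional.mse_loss(decoder(z), x)
    return critic_loss, recon_loss

x, t = torch.randn(64, 25), torch.randint(0, 2, (64,)).float()
critic_loss, recon_loss = cbre_losses(x, t)
```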


Blockchain-based Trustworthy Federated Learning Architecture

arXiv.org Artificial Intelligence

Federated learning is an emerging privacy-preserving AI technique where clients (i.e., organisations or devices) train models locally and formulate a global model based on the local model updates without transferring local data externally. However, federated learning systems struggle to achieve trustworthiness and embody responsible AI principles. In particular, federated learning systems face accountability and fairness challenges due to multi-stakeholder involvement and heterogeneity in client data distribution. To enhance the accountability and fairness of federated learning systems, we present a blockchain-based trustworthy federated learning architecture. We first design a smart contract-based data-model provenance registry to enable accountability. Additionally, we propose a weighted fair data sampler algorithm to enhance fairness in training data. We evaluate the proposed approach using a COVID-19 X-ray detection use case. The evaluation results show that the approach is feasible for enabling accountability and improving fairness, and that the proposed algorithm achieves better performance than the default federated learning setting in terms of the model's generalisation and accuracy.
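As a sketch of what a weighted fair data sampler might look like, the snippet below weights each example by the inverse frequency of its class, so that under-represented classes are drawn more often during local training. The inverse-frequency scheme is an illustrative assumption, not necessarily the paper's exact algorithm.

```python
import numpy as np
from torch.utils.data import WeightedRandomSampler

def fair_sampler(labels):
    """Return a sampler whose draw probability is inversely proportional
    to each example's class frequency."""
    labels = np.asarray(labels)
    counts = np.bincount(labels)
    weights = 1.0 / counts[labels]   # rare classes sampled more often
    return WeightedRandomSampler(weights.tolist(),
                                 num_samples=len(labels), replacement=True)

# Toy client whose local data is 90% class 0 and 10% class 1.
sampler = fair_sampler([0] * 90 + [1] * 10)
```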


AI and Ethics -- Operationalising Responsible AI

arXiv.org Artificial Intelligence

Operationalising responsible AI raises two distinct issues. One is how to design, develop, and validate AI technologies and systems responsibly (i.e., Responsible AI) so that ethical and legal concerns, especially those pertaining to human values, can be adequately addressed. The other is the use of AI itself as a means to achieve Responsible AI ends. In this chapter, we focus on the former issue. In the last few years, AI has continued to demonstrate its positive impact on society, though sometimes with ethically questionable consequences. Not doing AI responsibly is starting to have a devastating effect on humanity, not only on data protection, privacy, and bias but also on labour rights and climate justice [8]. Building and maintaining public trust in AI has been identified as key to successful and sustainable innovation [6].


Generative Inverse Deep Reinforcement Learning for Online Recommendation

arXiv.org Artificial Intelligence

Deep reinforcement learning enables an agent to capture a user's interest dynamically through interactions with the environment, and it has attracted great interest in recommendation research. Deep reinforcement learning uses a reward function to learn the user's interest and to control the learning process. However, most reward functions are manually designed; they are either unrealistic or too imprecise to reflect the high variety, dimensionality, and non-linearity of the recommendation problem. That makes it difficult for the agent to learn an optimal policy that generates the most satisfactory recommendations. To address this issue, we propose a novel generative inverse reinforcement learning approach, namely InvRec, which automatically extracts the reward function from the user's behaviours for online recommendation. We conduct experiments on an online platform, VirtualTB, and compare with several state-of-the-art methods to demonstrate the feasibility and effectiveness of the proposed approach.
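A common way to realize such automatic reward extraction is a GAIL-style discriminator that separates logged user behaviour from the agent's; the sketch below follows that pattern. The network sizes and the reward transform are illustrative assumptions, not InvRec's exact architecture.

```python
import torch
import torch.nn as nn

state_dim, action_dim = 16, 8
disc = nn.Sequential(nn.Linear(state_dim + action_dim, 64),
                     nn.ReLU(), nn.Linear(64, 1))
disc_opt = torch.optim.Adam(disc.parameters(), lr=1e-3)

def disc_step(expert_sa, agent_sa):
    """Train the discriminator on (state, action) pairs; its output then
    serves as the learned reward signal."""
    logits_e, logits_a = disc(expert_sa), disc(agent_sa)
    loss = (nn.functional.binary_cross_entropy_with_logits(
                logits_e, torch.ones_like(logits_e))
            + nn.functional.binary_cross_entropy_with_logits(
                logits_a, torch.zeros_like(logits_a)))
    disc_opt.zero_grad(); loss.backward(); disc_opt.step()

def learned_reward(sa):
    # Higher when the agent's behaviour looks like real user behaviour;
    # -logsigmoid(-logits) is the usual softplus reward transform.
    with torch.no_grad():
        return -nn.functional.logsigmoid(-disc(sa))

disc_step(torch.randn(32, 24), torch.randn(32, 24))
r = learned_reward(torch.randn(32, 24))
```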


MAMO: Memory-Augmented Meta-Optimization for Cold-start Recommendation

arXiv.org Machine Learning

A common challenge for most current recommender systems is the cold-start problem: due to the lack of user-item interactions, fine-tuned recommender systems are unable to handle new users or new items. Recently, some works have introduced the meta-optimization idea into recommendation scenarios, i.e., predicting a user's preference from only a few past interacted items. The core idea is to learn a globally shared initialization parameter for all users and then learn local parameters for each user separately. However, most meta-learning-based recommendation approaches adopt model-agnostic meta-learning for parameter initialization, where the globally shared parameter may lead the model into local optima for some users. In this paper, we design two memory matrices that store task-specific and feature-specific memories. Specifically, the feature-specific memories are used to guide the model with personalized parameter initialization, while the task-specific memories are used to guide the model in quickly predicting the user preference. We adopt a meta-optimization approach for optimizing the proposed method. We test the model on two widely used recommendation datasets and consider four cold-start situations. The experimental results show the effectiveness of the proposed methods.
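As a sketch of how a feature-specific memory could personalize initialization, the snippet below attends over memory slots with a user profile and shifts a globally shared initialization by the retrieved value. The attention scheme and shapes are illustrative assumptions, not MAMO's exact design.

```python
import torch

n_slots, feat_dim, param_dim = 8, 16, 32
memory_keys = torch.randn(n_slots, feat_dim)     # indexes user profiles
memory_values = torch.randn(n_slots, param_dim)  # stored parameter shifts
global_init = torch.zeros(param_dim)             # shared MAML-style init

def personalized_init(user_profile):
    """Attend over memory slots with the user profile, then shift the
    globally shared initialization by the retrieved memory."""
    attn = torch.softmax(memory_keys @ user_profile, dim=0)   # (n_slots,)
    personal_bias = attn @ memory_values                      # (param_dim,)
    return global_init + personal_bias

theta_0 = personalized_init(torch.randn(feat_dim))
```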