
Collaborating Authors

 Fernández-Veiga, Manuel


Byzantine-Robust Aggregation for Securing Decentralized Federated Learning

arXiv.org Artificial Intelligence

Federated Learning (FL) is a distributed machine learning approach that addresses privacy concerns by training AI models locally on devices. Decentralized Federated Learning (DFL) extends the FL paradigm by eliminating the central server, thereby enhancing scalability and robustness through the avoidance of a single point of failure. However, DFL faces significant security challenges, as most Byzantine-robust algorithms proposed in the literature are designed for centralized scenarios. In this paper, we present a novel Byzantine-robust aggregation algorithm, coined WFAgg, to enhance the security of Decentralized Federated Learning environments. This proposal simultaneously handles the adverse conditions of dynamic decentralized topologies and strengthens robustness by employing multiple filters to identify and mitigate Byzantine attacks. Experimental results demonstrate the effectiveness of the proposed algorithm in maintaining model accuracy and convergence under various Byzantine attack scenarios, outperforming state-of-the-art centralized Byzantine-robust aggregation schemes (such as Multi-Krum or Clustering). These algorithms are evaluated on an IID image classification problem in both centralized and decentralized scenarios.
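As a rough illustration of the distance-based filtering behind baselines such as Multi-Krum (one of the centralized schemes the paper compares against), a minimal Multi-Krum-style sketch follows. This is not the WFAgg algorithm itself; `f` (assumed number of Byzantine clients) and `m` (number of updates kept) are the usual parameters from the Multi-Krum literature.

```python
import numpy as np

def multi_krum(updates, f, m):
    """Multi-Krum-style selection: score each update by the sum of squared
    distances to its n - f - 2 nearest neighbours, then average the m
    lowest-scoring updates (those most surrounded by similar peers)."""
    n = len(updates)
    X = np.stack(updates)
    # pairwise squared Euclidean distances between all updates
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    scores = []
    for i in range(n):
        # distances from update i to everyone else, keep the closest n-f-2
        neigh = np.sort(np.delete(d2[i], i))[: n - f - 2]
        scores.append(neigh.sum())
    chosen = np.argsort(scores)[:m]   # indices of the m best-scored updates
    return X[chosen].mean(axis=0)
```

An outlier sent by a Byzantine node sits far from every honest update, receives a large score, and is excluded from the average.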


Decentralised and collaborative machine learning framework for IoT

arXiv.org Artificial Intelligence

Decentralised machine learning has recently been proposed as a potential solution to the security issues of the canonical federated learning approach. In this paper, we propose a decentralised and collaborative machine learning framework specifically oriented to resource-constrained devices, which are common in IoT deployments. To this end, we propose the following building blocks. First, an incremental learning algorithm based on prototypes, specifically implemented to work on low-performance computing elements. Second, two random-based protocols to exchange the local models among the computing elements in the network. This proposal was compared to a typical centralised incremental learning approach in terms of accuracy, training time and robustness, with very promising results. Decentralised machine learning must address how to use data and models from different sources to build models that gather the partial knowledge learned by each agent in the network and thereby create, in a collaborative way, a global vision or model of the whole network. This would allow processing the large amounts of data managed by different computing elements. However, this approach entails several issues that must be considered when proposing solutions for this kind of computing environment. One of the most worrying is how to provide secure and private solutions that protect personal data when building global models. Some approaches have already been proposed to decentralise machine learning algorithms so that a set of networked agents can participate in building a global model.
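The prototype-based incremental idea can be sketched as follows. This is a hedged illustration only, assuming one running-mean prototype per class and count-weighted merging when models are exchanged; it is not the paper's exact algorithm.

```python
import numpy as np

class PrototypeLearner:
    """Minimal prototype-based incremental learner: one running-mean
    prototype per class, updated one sample at a time, so it fits in
    constant memory on a resource-constrained device."""
    def __init__(self):
        self.protos = {}   # class label -> (mean vector, sample count)

    def learn(self, x, y):
        x = np.asarray(x, dtype=float)
        if y not in self.protos:
            self.protos[y] = (x.copy(), 1)
        else:
            mean, n = self.protos[y]
            self.protos[y] = (mean + (x - mean) / (n + 1), n + 1)

    def predict(self, x):
        x = np.asarray(x, dtype=float)
        # nearest-prototype classification
        return min(self.protos, key=lambda c: np.linalg.norm(x - self.protos[c][0]))

    def merge(self, other):
        """Gossip-style merge: fold in another node's prototypes by
        count-weighted averaging, class by class."""
        for c, (m2, n2) in other.protos.items():
            if c not in self.protos:
                self.protos[c] = (m2.copy(), n2)
            else:
                m1, n1 = self.protos[c]
                self.protos[c] = ((n1 * m1 + n2 * m2) / (n1 + n2), n1 + n2)
```

In a random exchange protocol, each node would periodically call `merge` with the model received from a randomly chosen peer, so partial per-node knowledge diffuses into a network-wide model.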


Scheduling and Communication Schemes for Decentralized Federated Learning

arXiv.org Artificial Intelligence

Federated learning (FL) is a distributed machine learning paradigm in which a large number of clients coordinate with a central server to learn a model without sharing their own training data. A single central server is not enough, due to connectivity problems with the clients. In this paper, a decentralized federated learning (DFL) model with the stochastic gradient descent (SGD) algorithm is introduced as a more scalable approach to improve the learning performance in a network of agents with arbitrary topology. Three scheduling policies for DFL are proposed for the communications between the clients and the parallel servers, and the convergence, accuracy, and loss are tested in a totally decentralized implementation of SGD. The experimental results show that the proposed scheduling policies have an impact both on the speed of convergence and on the final global model.
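A minimal sketch of decentralized SGD over an arbitrary topology helps fix ideas: each node takes a local gradient step and then averages its parameters with its neighbours through a mixing matrix. The doubly-stochastic matrix `W` and the toy quadratic losses are illustrative assumptions; the paper's scheduling policies are not reproduced here.

```python
import numpy as np

def decentralized_sgd(grads, W, x0, steps=100, lr=0.1):
    """Decentralized SGD sketch: per round, every node takes a local
    gradient step, then mixes its parameter with its neighbours' via
    the doubly-stochastic matrix W (row i = weights node i assigns
    to each node's model)."""
    x = np.array(x0, dtype=float)          # one scalar parameter per node
    for _ in range(steps):
        x = x - lr * np.array([g(xi) for g, xi in zip(grads, x)])
        x = W @ x                          # neighbour averaging step
    return x
```

With local losses (x - a_i)^2 the network should reach consensus near the mean of the a_i, the minimizer of the global objective.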


Using Decentralized Aggregation for Federated Learning with Differential Privacy

arXiv.org Artificial Intelligence

Eliminating the need for raw data sharing across data silos, Federated Learning (FL) has the ambition to protect data privacy through distributed learning methods that keep the data local. In simple terms, with FL it is not the data that moves to a model, but a model that moves to the data, which means that training happens from user interaction with end devices. Federated Learning's key motivation is to provide privacy protection, and there has recently been some research into combining the formal privacy notion of Differential Privacy (DP) with FL. On the other hand, although FL provides some level of privacy by retaining the data at the local node, which executes a local training to enrich a global model, this scenario is still susceptible to privacy breaches such as membership inference attacks. To provide a stronger level of privacy, this research deploys an experimental environment for FL with Differential Privacy (DP) using benchmark datasets. The obtained results show that the choice of DP parameters and techniques is central to the aforementioned trade-off between privacy and utility, demonstrated by means of a classification example.
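A typical way to add DP to FL updates, assumed here for illustration since the abstract does not specify the mechanism, is the Gaussian mechanism: clip each client update to a maximum L2 norm, then add noise scaled to that clipping bound. `clip_norm` and `noise_multiplier` stand in for the DP parameters whose choice drives the privacy/utility trade-off.

```python
import numpy as np

def privatize_update(update, clip_norm, noise_multiplier, rng):
    """Gaussian-mechanism sketch for FL with DP: bound each client's
    influence by clipping the update's L2 norm, then add Gaussian
    noise proportional to that bound."""
    update = np.asarray(update, dtype=float)
    norm = np.linalg.norm(update)
    # scale down only if the update exceeds the clipping bound
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise
```

Larger `noise_multiplier` values give stronger privacy but noisier aggregates, which is exactly the trade-off the experiments explore.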


A Blockchain Solution for Collaborative Machine Learning over IoT

arXiv.org Artificial Intelligence

The proliferation of Internet of Things (IoT) devices and applications has generated massive amounts of data that require advanced analytics and machine learning techniques for meaningful insights. However, traditional centralized machine learning models face challenges such as data privacy, security, and scalability. Federated learning (FL) [1] is an emerging technique that addresses these challenges by enabling decentralized model training on distributed data sources while preserving data privacy and security. Despite its promise, FL still faces several technical challenges such as non-IID data distribution, communication overhead, and straggler nodes [2]. In the traditional FL approach, multiple devices work together to train a machine learning model while retaining their data locally, without sharing it with other participating devices; thus, data resides on trusted nodes. This scenario is particularly convenient for IoT applications, where devices often generate sensitive data that must be protected from unauthorized access. Model updates are exchanged between these nodes for aggregation, helping to enrich the global model without exposing the raw data. Consequently, by retaining their data locally and collaborating on model training through the exchange of model updates, the devices can effectively contribute to the learning process while maintaining data privacy and security. However, this exchange of model updates introduces new security and privacy concerns, as it makes the models potentially vulnerable to various types of attacks. FL therefore encounters additional security-related challenges, including data poisoning attacks, in which malicious nodes inject corrupted or misleading data into the training process, compromising the accuracy of the global model. Model inversion attacks pose another threat, as adversaries aim to reconstruct individual data samples from aggregated model updates, potentially revealing sensitive information. Furthermore, Sybil attacks occur when malicious entities create multiple fake nodes to disproportionately influence the federated learning process, and collusion attacks involve a group of malicious nodes conspiring to manipulate the global model [3]. To address these challenges, recent research has proposed FL solutions that leverage blockchain technology for secure and efficient data sharing, model training, and prototype storage in a distributed environment. Blockchain technology [4], by providing a tamper-proof distributed ledger for storing and sharing data, models, and training results, enables collaboration among multiple parties without the need for a central authority, thereby significantly enhancing data privacy and security in the process.
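The tamper-evidence property that motivates the blockchain layer can be illustrated with a toy hash-chained ledger of model-update digests. This is a sketch of the general idea only, not the paper's actual blockchain design; the class and method names are illustrative.

```python
import hashlib
import json

class UpdateLedger:
    """Toy hash-chained ledger: each block stores the digest of a model
    update plus the hash of the previous block, so altering any recorded
    update breaks every later link in the chain."""
    def __init__(self):
        # genesis block anchors the chain
        self.chain = [{"prev": "0" * 64, "digest": None}]

    @staticmethod
    def _hash(block):
        return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

    def record(self, update_bytes):
        """Append a block committing to one serialized model update."""
        block = {"prev": self._hash(self.chain[-1]),
                 "digest": hashlib.sha256(update_bytes).hexdigest()}
        self.chain.append(block)

    def verify(self):
        """Check every block's back-pointer against the recomputed hash."""
        return all(self.chain[i]["prev"] == self._hash(self.chain[i - 1])
                   for i in range(1, len(self.chain)))
```

Any participant can recompute the hashes and detect after-the-fact modification of a recorded update, which is the property a real distributed ledger provides without relying on a central authority.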