Fernández-Vilas, Ana
Towards efficient compression and communication for prototype-based decentralized learning
Fernández-Piñeiro, Pablo, Ferández-Veiga, Manuel, Díaz-Redondo, Rebeca P., Fernández-Vilas, Ana, González-Soto, Martín
In prototype-based federated learning, the exchange of model parameters between clients and the master server is replaced by the transmission of prototypes or quantized versions of the data samples to the aggregation server. A fully decentralized deployment of prototype-based learning, without a central aggregator of prototypes, is more robust to network failures and reacts faster to changes in the statistical distribution of the data, suggesting potential advantages and quick adaptation in dynamic learning tasks, e.g., when the data sources are IoT devices or when data is non-iid. In this paper, we consider the problem of designing a communication-efficient decentralized learning system based on prototypes. We address the challenge of prototype redundancy by leveraging a twofold data compression technique, i.e., sending update messages only if the prototypes are information-theoretically useful (via the Jensen-Shannon distance), and using clustering on the prototypes to compress the update messages used in the gossip protocol. We also use parallel instead of sequential gossiping, and present an analysis of its age-of-information (AoI). Our experimental results show that, with these improvements, the communication load can be substantially reduced without decreasing the convergence rate of the learning algorithm. Federated Learning (FL) [1], [2], [3] and Decentralized Federated Learning (DFL) [4], [5] are well-established approaches to distributed machine learning systems whose main focus is the minimization of a global loss function using different versions of a model created by multiple clients. These approaches have been extensively studied in the literature and traditionally applied to processing private data in areas such as health and banking. In this paper, in contrast to these well-known approaches, we focus on the analysis and implementation of a decentralized machine learning system based on prototypes. On the one hand, our choice of prototype-based algorithms is motivated by the advantages of prototypes as compact representations of the data, capturing the essential features and patterns within the dataset.
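The following is a minimal, illustrative sketch (not the paper's implementation) of the twofold compression idea: an update is gossiped only if the Jensen-Shannon distance between the old and new prototype histograms exceeds a threshold, and the prototypes are compressed into cluster centroids before being sent. The threshold, dimensions, and helper names are assumptions for the example.

    import numpy as np
    from scipy.spatial.distance import jensenshannon
    from sklearn.cluster import KMeans

    def should_send(old_hist, new_hist, threshold=0.1):
        # Send an update only if the new prototype histogram is
        # information-theoretically different enough (hypothetical threshold).
        return jensenshannon(old_hist, new_hist) > threshold

    def compress_prototypes(prototypes, k=10):
        # Summarise the local prototypes by k cluster centroids before gossiping.
        k = min(k, len(prototypes))
        return KMeans(n_clusters=k, n_init=10).fit(prototypes).cluster_centers_

    # Example: 100 local prototypes in R^8, summarised by 10 centroids.
    rng = np.random.default_rng(0)
    protos = rng.normal(size=(100, 8))
    old_hist = np.array([0.25, 0.25, 0.25, 0.25])
    new_hist = np.array([0.40, 0.20, 0.20, 0.20])
    if should_send(old_hist, new_hist):
        payload = compress_prototypes(protos)  # the (much smaller) gossiped message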
Byzantine-Robust Aggregation for Securing Decentralized Federated Learning
Cajaraville-Aboy, Diego, Fernández-Vilas, Ana, Díaz-Redondo, Rebeca P., Fernández-Veiga, Manuel
Federated Learning (FL) emerges as a distributed machine learning approach that addresses privacy concerns by training AI models locally on devices. Decentralized Federated Learning (DFL) extends the FL paradigm by eliminating the central server, thereby enhancing scalability and robustness through the avoidance of a single point of failure. However, DFL faces significant challenges in optimizing security, as most Byzantine-robust algorithms proposed in the literature are designed for centralized scenarios. In this paper, we present a novel Byzantine-robust aggregation algorithm, coined WFAgg, to enhance the security of Decentralized Federated Learning environments. This proposal simultaneously handles the adverse conditions of dynamic decentralized topologies and strengthens robustness by employing multiple filters to identify and mitigate Byzantine attacks. Experimental results demonstrate the effectiveness of the proposed algorithm in maintaining model accuracy and convergence in the presence of various Byzantine attack scenarios, outperforming state-of-the-art centralized Byzantine-robust aggregation schemes (such as Multi-Krum or Clustering). These algorithms are evaluated on an IID image classification problem in both centralized and decentralized scenarios.
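As a point of reference for the baselines named above, the sketch below shows a plain Multi-Krum aggregator (one of the centralized schemes the paper compares against); it is not the proposed WFAgg algorithm, whose filters are not detailed in this abstract. Client counts and parameters are illustrative.

    import numpy as np

    def multi_krum(updates, f, m):
        # updates: (n, d) array of flattened client updates; at most f Byzantine.
        # Returns the average of the m updates with the lowest Krum scores.
        n = len(updates)
        dists = np.linalg.norm(updates[:, None, :] - updates[None, :, :], axis=-1) ** 2
        scores = []
        for i in range(n):
            # Sum of squared distances to the n - f - 2 closest other updates.
            closest = np.sort(np.delete(dists[i], i))[: n - f - 2]
            scores.append(closest.sum())
        selected = np.argsort(scores)[:m]
        return updates[selected].mean(axis=0)

    # Example: 10 clients, up to 2 Byzantine, keep the 4 best-scored updates.
    rng = np.random.default_rng(1)
    aggregated = multi_krum(rng.normal(size=(10, 5)), f=2, m=4)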
Decentralised and collaborative machine learning framework for IoT
González-Soto, Martín, Díaz-Redondo, Rebeca P., Fernández-Veiga, Manuel, Rodríguez-Castro, Bruno, Fernández-Vilas, Ana
Decentralised machine learning has recently been proposed as a potential solution to the security issues of the canonical federated learning approach. In this paper, we propose a decentralised and collaborative machine learning framework specially oriented to resource-constrained devices, which are usual in IoT deployments. With this aim, we propose the following building blocks. First, an incremental learning algorithm based on prototypes that was specifically implemented to work on low-performance computing elements. Second, two random-based protocols to exchange the local models among the computing elements in the network. This proposal was compared to a typical centralised incremental learning approach in terms of accuracy, training time and robustness, with very promising results. Decentralised machine learning addresses how to use data and models from different sources to build machine learning models, gathering the partial knowledge learned by each agent in the network to create, in a collaborative way, a global vision or model of the whole network. This would allow processing the large amounts of data managed by different computing elements. However, this approach entails several issues that must be considered when proposing solutions for this kind of computing environment. One of the most worrying is how to provide secure and private solutions that protect personal data when building global models. Some approaches have already been proposed to decentralise machine learning algorithms so that a set of networked agents can participate in building a global model.
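A minimal sketch, under our own assumptions, of one random pairwise gossip round in which agents exchange prototype sets and merge them by re-clustering; the merge rule and parameters are illustrative, not the paper's exact protocols.

    import random
    import numpy as np
    from sklearn.cluster import KMeans

    def merge_prototypes(local, received, k):
        # Merge own and received prototypes back into k representatives.
        pool = np.vstack([local, received])
        return KMeans(n_clusters=k, n_init=10).fit(pool).cluster_centers_

    def gossip_round(models, k=5):
        # Each agent picks a random peer; both merge each other's prototypes.
        agents = list(models.keys())
        for a in agents:
            b = random.choice([x for x in agents if x != a])
            models[a], models[b] = (merge_prototypes(models[a], models[b], k),
                                    merge_prototypes(models[b], models[a], k))

    # Example: 4 IoT agents, each holding 5 prototypes in R^3.
    rng = np.random.default_rng(2)
    models = {i: rng.normal(size=(5, 3)) for i in range(4)}
    gossip_round(models)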
Classification of retail products: From probabilistic ranking to neural networks
Hafez, Manar Mohamed, Redondo, Rebeca P. Díaz, Fernández-Vilas, Ana, Pazó, Héctor Olivera
Food retailing is now on an accelerated path to a successful penetration into the digital market through new ways of value creation at all stages of the consumer decision process. One of the most important imperatives on this path is the availability of quality data to feed all the processes in the digital transformation. But the quality of data is not so obvious if we consider the variety of products and suppliers in the grocery market. Within this context of digital transformation of the grocery industry, Midiadia is a Spanish data provider company that works on converting data from the retailers' products into knowledge, with attributes and insights from the product labels, that is, maintaining quality data in a dynamic market with a high dispersion of products. Currently, they manually categorize products (groceries) according to the information extracted directly (text processing) from the product labelling and packaging. This paper introduces a solution to automatically categorize the constantly changing product catalogue into a 3-level food taxonomy. Thus, we provide four different classifiers that support a more efficient and less error-prone maintenance of groceries catalogues, the main asset of the company. Finally, we have compared the performance of these alternatives, concluding that traditional machine learning algorithms perform better, closely followed by the score-based approach.
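A hedged sketch of the kind of text-classification pipeline the paper compares (the exact classifiers, features, and taxonomy labels are not reproduced here): a bag-of-words classifier trained per taxonomy level from product-label text. Data and names are purely illustrative.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Toy product descriptions and their level-1 taxonomy class (illustrative data).
    texts = ["semi skimmed milk 1l", "dark chocolate bar 70%", "whole wheat bread"]
    level1 = ["dairy", "sweets", "bakery"]

    # One classifier per taxonomy level; only level 1 is shown here.
    clf = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
    clf.fit(texts, level1)
    print(clf.predict(["milk chocolate bar"]))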
Multi-criteria recommendation systems to foster online grocery
Hafez, Manar Mohamed, Redondo, Rebeca P. Díaz, Fernández-Vilas, Ana, Pazó, Héctor Olivera
With the exponential increase in information, it has become imperative to design mechanisms that allow users to access what matters to them as quickly as possible. Recommendation systems (RS), intelligent systems that have developed alongside information technology, are the solution: various types of data can be collected on items of interest to users and presented as recommendations. RSs also play a very important role in e-commerce. The purpose of recommending a product is to identify the most appropriate alternatives for a specific product. The major challenges when recommending products are insufficient information about the products and the categories to which they belong. In this paper, we transform the product data using two methods of document representation: bag-of-words (BOW) and the neural network-based document representation known as Doc2Vec. We propose three-criteria recommendation systems (product, package, and health) for each document representation method to foster online grocery shopping; these criteria depend on product characteristics such as composition, packaging, nutrition table, and allergens. For our evaluation, we conducted a user and expert survey. Finally, we have compared the performance of these three criteria for each document representation method, discovering that the neural network-based representation (Doc2Vec) performs better and completely alters the results.
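An illustrative sketch of the two document representations named in the abstract (BOW and Doc2Vec) used to rank similar products; the criteria weighting (product, package, health) is not reproduced, and the catalogue, parameters, and the use of gensim for Doc2Vec are assumptions for the example.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.metrics.pairwise import cosine_similarity
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    catalog = ["gluten free oat cookies", "oat flakes no sugar", "milk chocolate cookies"]

    # Bag-of-words representation: similarity of item 0 to every catalogue item.
    bow = CountVectorizer().fit_transform(catalog)
    bow_sims = cosine_similarity(bow[0], bow)

    # Doc2Vec representation with toy parameters.
    tagged = [TaggedDocument(doc.split(), [i]) for i, doc in enumerate(catalog)]
    d2v = Doc2Vec(tagged, vector_size=16, min_count=1, epochs=40)
    query = d2v.infer_vector("oat cookies no sugar".split())
    d2v_sims = d2v.dv.most_similar([query])  # nearest catalogue items to the query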
KPIs-Based Clustering and Visualization of HPC jobs: a Feature Reduction Approach
Halawa, Mohamed Soliman, Díaz-Redondo, Rebeca P., Fernández-Vilas, Ana
High-Performance Computing (HPC) systems need to be constantly monitored to ensure their stability. The monitoring systems collect a tremendous amount of data about different parameters or Key Performance Indicators (KPIs), such as resource usage, IO waiting time, etc. A proper analysis of this data, usually stored as time series, can provide insight for choosing the right management strategies as well as for the early detection of issues. In this paper, we introduce a methodology to cluster HPC jobs according to their KPIs. Our approach reduces the inherent high dimensionality of the collected data by applying two techniques to the time series: literature-based and variance-based feature extraction. We also define a procedure to visualize the obtained clusters by combining the two previous approaches with Principal Component Analysis (PCA). Finally, we have validated our contributions on a real data set, concluding that the KPIs related to CPU usage provide the best cohesion and separation for clustering analysis, and confirming the good results of our visualization methodology.
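A sketch, with purely illustrative features and data, of the two-step idea described above: reduce each KPI time series to a small vector of statistical features, cluster the jobs, and project the result with PCA for visualization. The specific feature set and cluster count are assumptions.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA

    def ts_features(series):
        # Summary statistics standing in for the paper's feature-extraction step.
        return [series.mean(), series.std(), series.min(), series.max()]

    rng = np.random.default_rng(3)
    jobs = rng.normal(size=(50, 200))              # 50 jobs x 200 KPI samples each
    X = np.array([ts_features(j) for j in jobs])   # 50 x 4 feature matrix

    labels = KMeans(n_clusters=3, n_init=10).fit_predict(X)
    coords = PCA(n_components=2).fit_transform(X)  # 2-D view of the clustered jobs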
Unsupervised KPIs-Based Clustering of Jobs in HPC Data Centers
Halawa, Mohamed S., Díaz-Redondo, Rebeca P., Fernández-Vilas, Ana
Performance analysis is an essential task in High-Performance Computing (HPC) systems and it is applied for different purposes such as anomaly detection, optimal resource allocation, and budget planning. HPC monitoring tasks generate a huge number of Key Performance Indicators (KPIs) to supervise the status of the jobs running in these systems. KPIs give data about CPU usage, memory usage, network (interface) traffic, or other sensors that monitor the hardware. By analyzing this data, it is possible to obtain insightful information about running jobs, such as their characteristics, performance, and failures. The main contribution of this paper is to identify which metrics (KPIs) are the most appropriate for classifying different types of jobs according to their behavior in the HPC system. With this aim, we have applied different clustering techniques (partition and hierarchical clustering algorithms) using a real dataset from the Galician Computation Center (CESGA). We have concluded that (i) those metrics (KPIs) related to network (interface) traffic monitoring provide the best cohesion and separation to cluster HPC jobs, and (ii) hierarchical clustering algorithms are the most suitable for this task. Our approach was validated using a different real dataset from the same HPC center.
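The sketch below, with synthetic data standing in for the CESGA KPIs, illustrates the kind of comparison described: a partition algorithm and a hierarchical one are run on the same job feature vectors and scored with the silhouette coefficient as a cohesion/separation measure. Cluster counts and dimensions are assumptions.

    import numpy as np
    from sklearn.cluster import KMeans, AgglomerativeClustering
    from sklearn.metrics import silhouette_score

    rng = np.random.default_rng(4)
    X = rng.normal(size=(60, 6))  # 60 jobs x 6 KPI-derived features (synthetic)

    for name, algo in [("k-means", KMeans(n_clusters=4, n_init=10)),
                       ("hierarchical", AgglomerativeClustering(n_clusters=4))]:
        labels = algo.fit_predict(X)
        print(name, silhouette_score(X, labels))  # higher = better-separated clusters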
Scheduling and Communication Schemes for Decentralized Federated Learning
Abdelghany, Bahaa-Eldin Ali, Fernández-Vilas, Ana, Fernández-Veiga, Manuel, El-Bendary, Nashwa, Hassan, Ammar M., Abdelmoez, Walid M.
Federated learning (FL) is a distributed machine learning paradigm in which a large number of clients coordinate with a central server to learn a model without sharing their own training data. A single central server is often not enough, due to connectivity problems with the clients. In this paper, a decentralized federated learning (DFL) model with the stochastic gradient descent (SGD) algorithm has been introduced as a more scalable approach to improve the learning performance in a network of agents with arbitrary topology. Three scheduling policies for DFL have been proposed for the communications between the clients and the parallel servers, and the convergence, accuracy, and loss have been tested in a totally decentralized implementation of SGD. The experimental results show that the proposed scheduling policies have an impact both on the speed of convergence and on the final global model.
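A minimal sketch of one decentralized-SGD round in which a scheduling policy decides which neighbours each client averages with before its local gradient step. The random policy, topology, and stand-in gradients are assumptions; the paper's three policies are not reproduced here.

    import numpy as np

    def random_policy(client, neighbours, k=1, rng=np.random.default_rng(5)):
        # Hypothetical policy: schedule k randomly chosen neighbours this round.
        return rng.choice(neighbours, size=min(k, len(neighbours)), replace=False)

    def dfl_round(weights, topology, grads, lr=0.1, policy=random_policy):
        new_weights = {}
        for c, w in weights.items():
            peers = policy(c, topology[c])
            avg = np.mean([weights[p] for p in peers] + [w], axis=0)  # consensus step
            new_weights[c] = avg - lr * grads[c]                      # local SGD step
        return new_weights

    # Example: 3 clients on a fully connected topology, dummy gradients.
    topology = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    weights = {c: np.zeros(4) for c in topology}
    grads = {c: np.ones(4) * (c + 1) for c in topology}
    weights = dfl_round(weights, topology, grads)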
Using Decentralized Aggregation for Federated Learning with Differential Privacy
El-Kareem, Hadeel Abd, Saleh, Abd El-Moaty, Fernández-Vilas, Ana, Fernández-Veiga, Manuel, El-Sonbaty, Yasser
Federated Learning (FL) breaks data silos, eliminating the need for raw data sharing, as it has the ambition to protect data privacy through distributed learning methods that keep the data local. In simple terms, with FL it is not the data that moves to a model, but the model that moves to the data, which means that training happens from user interaction with end devices. Federated Learning's key motivation is to provide privacy protection, and there has recently been some research into combining the formal privacy notion of Differential Privacy (DP) with FL. On the other hand, although FL provides some level of privacy by retaining the data at the local node, which executes a local training to enrich a global model, this scenario is still susceptible to privacy breaches such as membership inference attacks. To provide a stronger level of privacy, this research deploys an experimental environment for FL with Differential Privacy (DP) using benchmark datasets. The obtained results show that the choice of DP parameters and techniques is central to the trade-off between privacy and utility, as illustrated by means of a classification example.
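A sketch of the style of DP mechanism typically combined with FL (the specific parameters and techniques evaluated in the paper are not reproduced): each client update is clipped in L2 norm and Gaussian noise is added to the aggregate. The clipping bound and noise multiplier below are illustrative.

    import numpy as np

    def dp_aggregate(updates, clip=1.0, noise_multiplier=1.1,
                     rng=np.random.default_rng(6)):
        # Clip each update to L2 norm `clip`, average, then add Gaussian noise.
        clipped = [u * min(1.0, clip / (np.linalg.norm(u) + 1e-12)) for u in updates]
        mean = np.mean(clipped, axis=0)
        sigma = noise_multiplier * clip / len(updates)
        return mean + rng.normal(scale=sigma, size=mean.shape)

    # Example: 5 clients' flattened model updates.
    rng = np.random.default_rng(7)
    noisy_global = dp_aggregate([rng.normal(size=10) for _ in range(5)])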
A Blockchain Solution for Collaborative Machine Learning over IoT
Beis-Penedo, Carlos, Troncoso-Pastoriza, Francisco, Díaz-Redondo, Rebeca P., Fernández-Vilas, Ana, Fernández-Veiga, Manuel, González-Soto, Martín
The proliferation of Internet of Things (IoT) devices and applications has generated massive amounts of data that require advanced analytics and machine learning techniques for meaningful insights. However, traditional centralized machine learning models face challenges such as data privacy, security, and scalability. Federated learning (FL) [1] is an emerging technique that addresses these challenges by enabling decentralized model training on distributed data sources while preserving data privacy and security. Despite its promise, FL still faces several technical challenges such as non-iid data distribution, communication overhead, and straggler nodes [2]. In the traditional FL approach, multiple devices work together to train a machine learning model while retaining their data locally, without sharing it with other participating devices; thus, data resides on trusted nodes. This scenario is particularly convenient for IoT applications, where devices often generate sensitive data that must be protected from unauthorized access. Model updates are exchanged between these nodes for aggregation, enriching the global model without exposing the raw data. Consequently, by retaining their data locally and collaborating on model training through the exchange of model updates, the devices can effectively contribute to the learning process while maintaining data privacy and security. However, this exchange of model updates introduces new security and privacy concerns, as it makes the models potentially vulnerable to various types of attacks. Therefore, FL encounters additional security-related challenges, including data poisoning attacks, where malicious nodes inject corrupted or misleading data into the training process, compromising the accuracy of the global model. Model inversion attacks pose another threat, as adversaries aim to reconstruct individual data samples from aggregated model updates, potentially revealing sensitive information. Furthermore, Sybil attacks occur when malicious entities create multiple fake nodes to disproportionately influence the federated learning process, and collusion attacks involve a group of malicious nodes conspiring to manipulate the global model [3]. To address these challenges, recent research has proposed FL solutions that leverage blockchain technology for secure and efficient data sharing, model training, and prototype storage in a distributed environment. Blockchain technology [4], by providing a tamper-proof distributed ledger for storing and sharing data, models, and training results, enables collaboration among multiple parties without the need for a central authority, thereby significantly enhancing data privacy and security in the process.
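A conceptual sketch only, not the system proposed in the paper: a hash-chained log of model-update digests illustrating why a blockchain-style, tamper-evident ledger makes shared updates and prototypes auditable. All field names are assumptions.

    import hashlib
    import json
    import time

    def new_block(prev_hash, update_digest, node_id):
        # A block records who submitted an update, its digest, and the previous
        # block's hash, so any later modification of the chain is detectable.
        block = {"ts": time.time(), "node": node_id,
                 "update_sha256": update_digest, "prev": prev_hash}
        block["hash"] = hashlib.sha256(
            json.dumps(block, sort_keys=True).encode()).hexdigest()
        return block

    chain = [new_block("0" * 64, hashlib.sha256(b"initial global model").hexdigest(), "genesis")]
    chain.append(new_block(chain[-1]["hash"],
                           hashlib.sha256(b"client-3 prototype update").hexdigest(),
                           "client-3"))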