Nowadays, devices are equipped with advanced sensors with higher processing/computing capabilities. Further, widespread Internet availability enables communication among sensing devices. As a result, vast amounts of data are generated on edge devices to drive Internet-of-Things (IoT), crowdsourcing, and other emerging technologies. The collected extensive data can be pre-processed, scaled, classified, and finally, used for predicting future events using machine learning (ML) methods. In traditional ML approaches, data is sent to and processed in a central server, which encounters communication overhead, processing delay, privacy leakage, and security issues. To overcome these challenges, each client can be trained locally based on its available data and by learning from the global model. This decentralized learning structure is referred to as Federated Learning (FL). However, in large-scale networks, there may be clients with varying computational resource capabilities. This may lead to implementation and scalability challenges for FL techniques. In this paper, we first introduce some recently implemented real-life applications of FL. We then emphasize on the core challenges of implementing the FL algorithms from the perspective of resource limitations (e.g., memory, bandwidth, and energy budget) of client clients. We finally discuss open issues associated with FL and highlight future directions in the FL area concerning resource-constrained devices.
In recent years, mobile devices have gained increasingly development with stronger computation capability and larger storage. Some of the computation-intensive machine learning and deep learning tasks can now be run on mobile devices. To take advantage of the resources available on mobile devices and preserve users' privacy, the idea of mobile distributed machine learning is proposed. It uses local hardware resources and local data to solve machine learning sub-problems on mobile devices, and only uploads computation results instead of original data to contribute to the optimization of the global model. This architecture can not only relieve computation and storage burden on servers, but also protect the users' sensitive information. Another benefit is the bandwidth reduction, as various kinds of local data can now participate in the training process without being uploaded to the server. In this paper, we provide a comprehensive survey on recent studies of mobile distributed machine learning. We survey a number of widely-used mobile distributed machine learning methods. We also present an in-depth discussion on the challenges and future directions in this area. We believe that this survey can demonstrate a clear overview of mobile distributed machine learning and provide guidelines on applying mobile distributed machine learning to real applications.
Most recently, with the proliferation of IoT devices, computational nodes in manufacturing systems IIoT(Industrial-Internet-of-things) and the lunch of 5G networks, there will be millions of connected devices generating a massive amount of data. In such an environment, the controlling systems need to be intelligent enough to deal with a vast amount of data to detect defects in a real-time process. Driven by such a need, artificial intelligence models such as deep learning have to be deployed into IIoT systems. However, learning and using deep learning models are computationally expensive, so an IoT device with limited computational power could not run such models. To tackle this issue, edge intelligence had emerged as a new paradigm towards running Artificial Intelligence models on edge devices. Although a considerable amount of studies have been proposed in this area, the research is still in the early stages. In this paper, we propose a novel edge-based multi-phase pruning pipelines to ensemble learning on IIoT devices. In the first phase, we generate a diverse ensemble of pruned models, then we apply integer quantisation, next we prune the generated ensemble using a clustering-based technique. Finally, we choose the best representative from each generated cluster to be deployed to a distributed IoT environment. On CIFAR-100 and CIFAR-10, our proposed approach was able to outperform the predictability levels of a baseline model (up to 7%), more importantly, the generated learners have small sizes (up to 90% reduction in the model size) that minimise the required computational capabilities to make an inference on the resource-constraint devices.
In recent years, mobile devices have gained increasingly development with stronger computation capability and larger storage. Some of the computation-intensive machine learning and deep learning tasks can now be run on mobile devices. To take advantage of the resources available on mobile devices and preserve users' privacy, the idea of mobile distributed machine learning is proposed. It uses local hardware resources and local data to solve machine learning sub-problems on mobile devices, and only uploads computation results instead of original data to contribute to the optimization of the global model. This architecture can not only relieve computation and storage burden on servers, but also protect the users' sensitive information.
Semi-supervised anomaly detection is referred as an approach to identify rare data instances (i.e, anomalies) on the assumption that all the available training data belong to the majority (i.e., the normal class). A typical strategy is to model distributions of normal data, then identify data samples far from the distributions as anomalies. Nowadays, backpropagation based neural networks (i.e., BP-NNs) have been drawing attention as well as in the field of semi-supervised anomaly detection because of their high generalization capability for real-world high dimensional data. As a typical application, such BP-NN based models are iteratively optimized in server machines with accumulated data gathered from edge devices. However, there are two issues in this framework: (1) BP-NNs' iterative optimization approach often takes too long time to follow changes of the distributions of normal data (i.e., concept drift), and (2) data transfers between servers and edge devices have a potential risk to cause data breaches. To address these underlying issues, we propose an ON-device sequential Learning semi-supervised Anomaly Detector called ONLAD. The aim of this work is to propose the algorithm, and also to implement it as an IP core called ONLAD Core so that various kinds of edge devices can adopt our approach at low power consumption. Experimental results using open datasets show that ONLAD has favorable anomaly detection capability especially in a testbed which simulates concept drift. Experimental results on hardware performance of the FPGA based ONLAD Core show that its training latency and prediction latency are x1.95 - x4.51 and x2.29 - x4.73 faster than those of BP-NN based software implementations. It is also confirmed that our on-board implementation of ONLAD Core actually works at x6.7 - x27.1 lower power consumption than the other software implementations at a high workload.