--Machine Learning has proven useful in the recent years as a way to achieve failure prediction for industrial systems. However, the high computational resources necessary to run learning algorithms are an obstacle to its widespread application. The sub-field of Distributed Learning offers a solution to this problem by enabling the use of remote resources but at the expense of introducing communication costs in the application that are not always acceptable. In this paper, we propose a distributed learning approach able to optimize the use of computational and communication resources to achieve excellent learning model performances through a centralized architecture. T o achieve this, we present a new centralized distributed learning algorithm that relies on the learning paradigms of Active Learning and Federated Learning to offer a communication-efficient method that offers guarantees of model precision on both the clients and the central server . We evaluate this method on a public benchmark and show that its performances in terms of precision are very close to state-of-the-art performance level of non-distributed learning despite additional constraints. A. General problem In the recent years, the efficiency of Machine Learning for automated processing of large volumes of data has been widely demonstrated. This has been of particular interest for industrial applications that commonly generate large datasets.
The use of an environment means that there is no fixed training dataset, rather a goal or set of goals that an agent is required to achieve, actions they may perform, and feedback about performance toward the goal. Some machine learning algorithms do not just experience a fixed dataset. For example, reinforcement learning algorithms interact with an environment, so there is a feedback loop between the learning system and its experiences.
I am Imtiaz Adam, and this article is an introduction to AI key terminologies and methodologies on behalf of myself and DLS (www.dls.ltd). This article has been updated in September 2020 to take into account advances in the field of AI with techniques such as NeuroSymbolic AI, Neuroevolution and Federated Learning. AI deals with the area of developing computing systems which are capable of performing tasks that humans are very good at, for example recognising objects, recognising and making sense of speech, and decision making in a constrained environment. Narrow AI: the field of AI where the machine is designed to perform a single task and the machine gets very good at performing that particular task. However, once the machine is trained, it does not generalise to unseen domains. This is the form of AI that we have today, for example Google Translate.
Machine learning is the subfield of computer science that gives computers the ability to learn without being explicitly programmed (Arthur Samuel, 1959). Evolved from the study of pattern recognition and computational learning theory in artificial intelligence, machine learning explores the study and construction of algorithms that can learn from and make predictions on data – such algorithms overcome following strictly static program instructions by making data driven predictions or decisions,:2 through building a model from sample inputs. Machine learning is employed in a range of computing tasks where designing and programming explicit algorithms is infeasible; example applications include spam filtering, detection of network intruders or malicious insiders working towards a data breach, optical character recognition (OCR), search engines and computer vision. Machine learning is closely related to (and often overlaps with) computational statistics, which also focuses in prediction-making through the use of computers. It has strong ties to mathematical optimization, which delivers methods, theory and application domains to the field. Machine learning is sometimes conflated with data mining, where the latter subfield focuses more on exploratory data analysis and is known as unsupervised learning.:vii
Over the past few years, providers such as Google, Microsoft, and Amazon have started to provide customers with access to software interfaces allowing them to easily embed machine learning tasks into their applications. Overall, organizations can now use Machine Learning as a Service (MLaaS) engines to outsource complex tasks, e.g., training classifiers, performing predictions, clustering, etc. They can also let others query models trained on their data. Naturally, this approach can also be used (and is often advocated) in other contexts, including government collaborations, citizen science projects, and business-to-business partnerships. However, if malicious users were able to recover data used to train these models, the resulting information leakage would create serious issues. Likewise, if the inner parameters of the model are considered proprietary information, then access to the model should not allow an adversary to learn such parameters. In this document, we set to review privacy challenges in this space, providing a systematic review of the relevant research literature, also exploring possible countermeasures. More specifically, we provide ample background information on relevant concepts around machine learning and privacy. Then, we discuss possible adversarial models and settings, cover a wide range of attacks that relate to private and/or sensitive information leakage, and review recent results attempting to defend against such attacks. Finally, we conclude with a list of open problems that require more work, including the need for better evaluations, more targeted defenses, and the study of the relation to policy and data protection efforts.