Federated learning systems enable the collaborative training of machine learning models among different organizations under the privacy restrictions. As researchers try to support more machine learning models with different privacy-preserving approaches, current federated learning systems face challenges from various issues such as unpractical system assumptions, scalability and efficiency. Inspired by federated systems in other fields such as databases and cloud computing, we investigate the characteristics of federated learning systems. We find that two important features for other federated systems, i.e., heterogeneity and autonomy, are rarely considered in the existing federated learning systems. Moreover, we provide a thorough categorization for federated learning systems according to four different aspects, including data partition, model, privacy level, and communication architecture. Lastly, we take a systematic comparison among the existing federated learning systems and present future research opportunities and directions.
Today's AI still faces two major challenges. One is that in most industries, data exists in the form of isolated islands. The other is the strengthening of data privacy and security. We propose a possible solution to these challenges: secure federated learning. Beyond the federated learning framework first proposed by Google in 2016, we introduce a comprehensive secure federated learning framework, which includes horizontal federated learning, vertical federated learning and federated transfer learning. We provide definitions, architectures and applications for the federated learning framework, and provide a comprehensive survey of existing works on this subject. In addition, we propose building data networks among organizations based on federated mechanisms as an effective solution to allow knowledge to be shared without compromising user privacy.
Torkzadehmahani, Reihaneh, Nasirigerdeh, Reza, Blumenthal, David B., Kacprowski, Tim, List, Markus, Matschinske, Julian, Späth, Julian, Wenke, Nina Kerstin, Bihari, Béla, Frisch, Tobias, Hartebrodt, Anne, Hausschild, Anne-Christin, Heider, Dominik, Holzinger, Andreas, Hötzendorfer, Walter, Kastelitz, Markus, Mayer, Rudolf, Nogales, Cristian, Pustozerova, Anastasia, Röttger, Richard, Schmidt, Harald H. H. W., Schwalber, Ameli, Tschohl, Christof, Wohner, Andrea, Baumbach, Jan
Artificial intelligence (AI) has been successfully applied in numerous scientific domains including biomedicine and healthcare. Here, it has led to several breakthroughs ranging from clinical decision support systems, image analysis to whole genome sequencing. However, training an AI model on sensitive data raises also concerns about the privacy of individual participants. Adversary AIs, for example, can abuse even summary statistics of a study to determine the presence or absence of an individual in a given dataset. This has resulted in increasing restrictions to access biomedical data, which in turn is detrimental for collaborative research and impedes scientific progress. Hence there has been an explosive growth in efforts to harness the power of AI for learning from sensitive data while protecting patients' privacy. This paper provides a structured overview of recent advances in privacy-preserving AI techniques in biomedicine. It places the most important state-of-the-art approaches within a unified taxonomy, and discusses their strengths, limitations, and open problems.
Matrix factorization is one of the most commonly used technologies in recommendation system. With the promotion of recommendation system in e-commerce shopping, online video and other aspects, distributed recommendation system has been widely promoted, and the privacy problem of multi-source data becomes more and more important. Based on Federated learning technology, this paper proposes a shared matrix factorization scheme called SharedMF. Firstly, a distributed recommendation system is built, and then secret sharing technology is used to protect the privacy of local data. Experimental results show that compared with the existing homomorphic encryption methods, our method can have faster execution speed without privacy disclosure, and can better adapt to recommendation scenarios with large amount of data.
Along with the rapid expansion of information technology and digitalization of health data, there is an increasing concern on maintaining data privacy while garnering the benefits in medical field. Two critical challenges are identified: Firstly, medical data is naturally distributed across multiple local sites, making it difficult to collectively train machine learning models without data leakage. Secondly, in medical applications, data are often collected from different sources and views, resulting in heterogeneity and complexity that requires reconciliation. This paper aims to provide a generic Federated Multi-View Learning (FedMV) framework for multi-view data leakage prevention, which is based on different types of local data availability and enables to accommodate two types of problems: Vertical Federated Multi-View Learning (V-FedMV) and Horizontal Federated Multi-View Learning (H-FedMV). We experimented with real-world keyboard data collected from BiAffect study. The results demonstrated that the proposed FedMV approach can make full use of multi-view data in a privacy-preserving way, and both V-FedMV and H-FedMV methods perform better than their single-view and pairwise counterparts. Besides, the proposed model can be easily adapted to deal with multi-view sequential data in a federated environment, which has been modeled and experimentally studied. To the best of our knowledge, this framework is the first to consider both vertical and horizontal diversification in the multi-view setting, as well as their sequential federated learning.