Free-riders in Federated Learning: Attacks and Defenses
Lin, Jierui, Du, Min, Liu, Jian
University of California, Berkeley
Abstract: Federated learning is a recently proposed paradigm that enables multiple clients to collaboratively train a joint model. It allows clients to train models locally and uses a parameter server to generate a global model by aggregating the locally submitted gradient updates at each round. Although the incentive model for federated learning has not been fully developed, participants are expected to receive rewards, or the privilege of using the final global model, as compensation for the effort of training it. Therefore, a client who does not have any local data has an incentive to fabricate local gradient updates in order to claim rewards. In this paper, we are the first to propose the notion of free rider attacks and to explore possible ways in which an attacker may construct gradient updates without any local training data. Furthermore, we explore possible defenses that could detect the proposed attacks, and propose a new high-dimensional detection method called STD-DAGMM, which works particularly well for anomaly detection on model parameters. We extend the attacks and defenses to consider more free riders as well as differential privacy, which sheds light on, and calls for, future research in this field.
Introduction
Federated learning [1], [2], [3] has been proposed to facilitate joint model training using data from multiple clients, where the training process is coordinated by a parameter server. Throughout the process, clients' data stay local, and only model parameters are communicated among clients through the parameter server. A typical training iteration works as follows. First, the parameter server sends the newest global model to each client. Then, each client locally updates the model using local data and reports updated gradients to the parameter server. Finally, the server performs model aggregation on all submitted local updates to form a new global model, which performs better than a model trained on any single client's data. Compared with the alternative of collecting all data from the clients and training a model on the pooled data, federated learning reduces communication overhead by transmitting only model parameters, and protects privacy since all data stay local.
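The training iteration described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the one-parameter linear model, the learning rate, the helper names `local_update`, `free_rider_update`, `aggregate`, and `run_round`, and the unweighted averaging rule are all chosen for the example. The free rider is modelled as a client that simply returns the global model it received, which is only one simple way to fabricate an update without data.

```python
# Sketch of federated rounds on a 1-D linear model y = w * x.
# All names and hyperparameters are illustrative assumptions.

def local_update(w_global, data, lr=0.1):
    """Honest client: one gradient step on mean squared error over local data."""
    grad = sum(2 * (w_global * x - y) * x for x, y in data) / len(data)
    return w_global - lr * grad

def free_rider_update(w_global):
    """Client without data (assumed behaviour): echo the global model back."""
    return w_global

def aggregate(local_models):
    """Server side: unweighted average of the submitted models."""
    return sum(local_models) / len(local_models)

def run_round(w_global, client_datasets, n_free_riders=0):
    """Broadcast the global model, collect submissions, and aggregate them."""
    submitted = [local_update(w_global, d) for d in client_datasets]
    submitted += [free_rider_update(w_global)] * n_free_riders
    return aggregate(submitted)

# Two honest clients whose (made-up) data follow y = 3x, plus one free rider.
clients = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0)]]
w = 0.0
for _ in range(50):
    w = run_round(w, clients, n_free_riders=1)
print(round(w, 3))
```

In this toy setting the free rider contributes nothing yet still receives the aggregated model each round; it only dilutes the progress made by the honest clients.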
Stability of the Decoupled Extended Kalman Filter Learning Algorithm in LSTM-Based Online Learning
Vural, N. Mert, Kozat, Suleyman S.
We investigate the convergence and stability properties of the decoupled extended Kalman filter learning algorithm (DEKF) within the long short-term memory network (LSTM) based online learning framework. For this purpose, we model DEKF as a perturbed extended Kalman filter and derive sufficient conditions for its stability during LSTM training. We show that if the perturbations introduced by decoupling stay bounded, DEKF learns LSTM parameters with convergence and stability properties similar to those of the global extended Kalman filter learning algorithm. We verify our results with several numerical simulations and compare DEKF with other LSTM training methods. In our simulations, we also observe that the well-known hyper-parameter selection approaches used for DEKF in the literature satisfy our conditions.
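To make the idea of decoupling concrete, here is a rough sketch of one DEKF step for a scalar-output model. It follows a commonly used form of the DEKF recursion; the formulation analysed in the paper may differ. A linear model stands in for the LSTM, and the function names, the noise levels `r` and `q`, and the grouping of weights are assumptions made for illustration. Each weight group keeps its own covariance block, and cross-group covariances are ignored.

```python
# Sketch of one decoupled EKF (DEKF) step, scalar output, pure Python.

def mat_vec(P, v):
    return [sum(p * x for p, x in zip(row, v)) for row in P]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dekf_step(weights, covs, grads, error, r=1.0, q=1e-4):
    """weights, grads: lists of per-group vectors; covs: per-group matrices.
    grads[i] is the derivative of the model output w.r.t. weight group i.
    error = target - prediction. r and q are assumed noise levels."""
    Ph = [mat_vec(P, h) for P, h in zip(covs, grads)]
    # Shared scalar factor: 1 / (r + sum_i h_i^T P_i h_i)
    a = 1.0 / (r + sum(dot(h, ph) for h, ph in zip(grads, Ph)))
    new_w, new_P = [], []
    for w, P, ph in zip(weights, covs, Ph):
        k = [a * x for x in ph]  # Kalman gain for this group
        new_w.append([wi + ki * error for wi, ki in zip(w, k)])
        # P <- P - k (h^T P) + q I; for symmetric P, h^T P equals (P h)^T.
        n = len(w)
        new_P.append([[P[i][j] - k[i] * ph[j] + (q if i == j else 0.0)
                       for j in range(n)] for i in range(n)])
    return new_w, new_P

# Linear stand-in model with two weight groups; its gradient is the input.
weights = [[0.0, 0.0], [0.0]]
covs = [[[1.0, 0.0], [0.0, 1.0]], [[1.0]]]
x = [[1.0, 2.0], [0.5]]
target = 2.0
pred = sum(dot(w, h) for w, h in zip(weights, x))
weights, covs = dekf_step(weights, covs, x, target - pred)
new_pred = sum(dot(w, h) for w, h in zip(weights, x))
print(pred, new_pred)
```

Because each group only stores its own covariance block, memory and computation grow with the sum of the squared group sizes rather than the square of the total number of weights.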
Lifelong Spectral Clustering
Sun, Gan, Cong, Yang, Wang, Qianqian, Li, Jun, Fu, Yun
In the past decades, spectral clustering (SC) has become one of the most effective clustering algorithms. However, most previous studies focus on spectral clustering with a fixed task set and cannot incorporate a new spectral clustering task without access to previously learned tasks. In this paper, we aim to explore the problem of spectral clustering in a lifelong machine learning framework, i.e., Lifelong Spectral Clustering (L2SC). Its goal is to efficiently learn a model for a new spectral clustering task by selectively transferring previously accumulated experience from a knowledge library. Specifically, the knowledge library of L2SC contains two components: 1) an orthogonal basis library, capturing latent cluster centers among the clusters in each pair of tasks; 2) a feature embedding library, embedding the feature manifold information shared among multiple related tasks. As a new spectral clustering task arrives, L2SC first transfers knowledge from both the basis library and the feature library to obtain an encoding matrix, and further redefines the library base over time to maximize performance across all the clustering tasks. Meanwhile, a general online update formulation is derived to alternately update the basis library and the feature library. Finally, empirical experiments on several real-world benchmark datasets demonstrate that our L2SC model can effectively improve clustering performance compared with other state-of-the-art spectral clustering algorithms.
Cryptocurrency Price Prediction and Trading Strategies Using Support Vector Machines
Zhao, David, Rinaldo, Alessandro, Brookins, Christopher
Few assets in financial history have been as notoriously volatile as cryptocurrencies. While the long-term outlook for this asset class remains unclear, we are successful in making short-term price predictions for several major crypto assets. Using historical data from July 2015 to November 2019, we develop a large number of technical indicators to capture patterns in the cryptocurrency market. We then test various classification methods to forecast short-term future price movements based on these indicators. On both positive predictive value (PPV) and negative predictive value (NPV) metrics, our classifiers do well in identifying up and down market moves over the next 1 hour. Beyond evaluating classification accuracy, we also develop a strategy for translating 1-hour-ahead class predictions into trading decisions, along with a backtester that simulates trading in a realistic environment. We find that support vector machines yield the most profitable trading strategies, which outperform the market on average for Bitcoin, Ethereum and Litecoin over the past 22 months, since January 2018.
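For readers unfamiliar with the two metrics, the short sketch below computes them from up/down predictions. PPV is the fraction of predicted up moves that were actually up, and NPV is the fraction of predicted down moves that were actually down. The function name `ppv_npv`, the 1/0 encoding of up and down moves, and the sample labels are assumptions for illustration and are not taken from the paper.

```python
def ppv_npv(y_true, y_pred):
    """y_true, y_pred: sequences of 1 (up move) and 0 (down move)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # predicted up, was up
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted up, was down
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # predicted down, was down
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # predicted down, was up
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    npv = tn / (tn + fn) if tn + fn else float("nan")
    return ppv, npv

# Made-up hourly labels, for illustration only.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
ppv, npv = ppv_npv(actual, predicted)
print(ppv, npv)
```

Reporting both values matters for trading, since a strategy that acts on both up and down signals depends on how reliable each kind of prediction is.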
Shifted Randomized Singular Value Decomposition
Among the typical applications of SVD are low-rank matrix approximation and principal component analysis (PCA) of data matrices (Jolliffe, 2002). To accurately estimate a low-rank factorization or the principal components of a data matrix using SVD, a mean-centering step should be carried out before performing SVD on the matrix. Despite its simplicity, mean-centering can be very costly if the data matrix is large and sparse. This cost arises because subtracting the mean from a sparse matrix turns it into a dense matrix, which requires a considerable amount of memory and CPU time to analyze. This motivates us to extend the randomized SVD algorithm introduced by Halko et al. (2011) to estimate the singular value decomposition of a mean-centered matrix without explicitly forming the matrix in memory. More generally, we introduce a shifted randomized SVD algorithm that provides for the SVD estimation of a data matrix shifted by any vector in the ...
ali.basirat@lingfil.uu.se, arXiv:1911.11772v2
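The reason a mean-centered matrix never has to be formed explicitly can be seen from the identity (A - mu 1^T) x = A x - mu (1^T x): a product with the shifted matrix needs only a sparse product with A plus a cheap correction. The sketch below shows this single building block, not the full shifted randomized SVD from the paper. The dict-of-rows sparse format and the function names are assumptions for illustration.

```python
def sparse_matvec(rows, x):
    """rows: list of {col_index: value} dicts, one per matrix row."""
    return [sum(v * x[j] for j, v in row.items()) for row in rows]

def row_means(rows, n_cols):
    """Mean of each row, counting the implicit zeros."""
    return [sum(row.values()) / n_cols for row in rows]

def shifted_matvec(rows, mu, x):
    """Compute (A - mu 1^T) x without building the dense shifted matrix."""
    ax = sparse_matvec(rows, x)
    s = sum(x)  # 1^T x
    return [a - m * s for a, m in zip(ax, mu)]

# A is 2 x 3 with three non-zeros; mu holds its row means.
A = [{0: 3.0}, {1: 2.0, 2: 4.0}]
mu = row_means(A, n_cols=3)
x = [1.0, 2.0, 3.0]
y = shifted_matvec(A, mu, x)

# Dense reference, built only to check the result on this tiny example.
dense = [[A[i].get(j, 0.0) - mu[i] for j in range(3)] for i in range(2)]
y_ref = [sum(d * xj for d, xj in zip(row, x)) for row in dense]
print(y, y_ref)
```

Randomized SVD methods access the matrix through products of this kind, so the correction term lets the sparse structure of A be kept throughout.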
A Unified Framework for Lifelong Learning in Deep Neural Networks
Ling, Charles X., Bohn, Tanner
Humans can learn a variety of concepts and skills incrementally over the course of their lives while exhibiting an array of desirable properties, such as non-forgetting, concept rehearsal, forward transfer and backward transfer of knowledge, few-shot learning, and selective forgetting. Previous approaches to lifelong machine learning can only demonstrate subsets of these properties, often by combining multiple complex mechanisms. In this Perspective, we propose a powerful unified framework that can demonstrate all of the properties by utilizing a small number of weight consolidation parameters in deep neural networks. In addition, we are able to draw many parallels between the behaviours and mechanisms of our proposed framework and those surrounding human learning, such as memory loss or sleep deprivation. This Perspective serves as a conduit for two-way inspiration to further understand lifelong learning in machines and humans.
Using LSTMs for climate change assessment studies on droughts and floods
Kratzert, Frederik, Klotz, Daniel, Brandstetter, Johannes, Hoedt, Pieter-Jan, Nearing, Grey, Hochreiter, Sepp
Climate change affects occurrences of floods and droughts worldwide. However, predicting climate impacts over individual watersheds is difficult, primarily because accurate hydrological forecasts require models that are calibrated to past data. In this work we present a large-scale LSTM-based modeling approach that -- by training on large data sets -- learns a diversity of hydrological behaviors. Previous work shows that this model is more accurate than current state-of-the-art models, even when the LSTM-based approach operates out-of-sample and the latter in-sample. In this work, we show how this model can assess the sensitivity of the underlying systems with regard to extreme (high and low) flows in individual watersheds over the continental US.
A Dynamic Modelling Framework for Human Hand Gesture Task Recognition
Masoud, Sara, Chowdhury, Bijoy, Son, Young-Jun, Kubota, Chieri, Tronstad, Russell
Gesture recognition and hand motion tracking are important tasks in advanced gesture-based interaction systems. In this paper, we propose to apply a sliding-window filtering approach to sample the incoming streams of data from data gloves, and a decision tree model to recognize the gestures in real time for a manual grafting operation in a vegetable seedling propagation facility. The sequence of these recognized gestures defines the tasks that are taking place, which helps to evaluate individuals' performance and to identify any bottlenecks in real time. In this work, two pairs of data gloves are utilized, which report the location of the fingers, hands, and wrists wirelessly (i.e., via Bluetooth). To evaluate the performance of the proposed framework, a preliminary experiment was conducted in multiple lab settings of tomato grafting operations, where multiple subjects wore the data gloves while performing different tasks. Our results show an average accuracy of 91% for real-time gesture recognition with the proposed framework.
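The pipeline described above, sliding windows over a sensor stream followed by a per-window classifier, can be sketched as follows. This is only an illustration under assumptions: the window size, the one-dimensional readings, the gesture labels, and the single-threshold rule (a depth-one stand-in for the decision tree) are hypothetical and do not come from the paper.

```python
from collections import deque

def sliding_windows(stream, size, step=1):
    """Yield the most recent `size` samples after every `step` new samples."""
    window = deque(maxlen=size)
    for i, sample in enumerate(stream, start=1):
        window.append(sample)
        if len(window) == size and (i - size) % step == 0:
            yield list(window)

def classify(window, threshold=0.5):
    """Depth-one decision rule on the window mean (hypothetical labels)."""
    mean = sum(window) / len(window)
    return "grip" if mean > threshold else "release"

# Made-up one-dimensional glove readings.
readings = [0.1, 0.2, 0.9, 1.0, 0.8, 0.1, 0.0]
labels = [classify(w) for w in sliding_windows(readings, size=3)]
print(labels)
```

Because the generator consumes the stream one sample at a time, the same code works on a live feed, and the resulting label sequence is what a task-level analysis would operate on.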
Learning stable and predictive structures in kinetic systems: Benefits of a causal approach
Pfister, Niklas, Bauer, Stefan, Peters, Jonas
Learning kinetic systems from data is one of the core challenges in many fields. Identifying stable models is essential for the generalization capabilities of data-driven inference. We introduce a computationally efficient framework, called CausalKinetiX, that identifies structure from discrete-time, noisy observations generated from heterogeneous experiments. The algorithm assumes the existence of an underlying, invariant kinetic model, a key criterion for reproducible research. Results on both simulated and real-world examples suggest that learning the structure of kinetic systems benefits from a causal perspective. The identified variables and models allow for a concise description of the dynamics across multiple experimental settings and can be used for prediction in unseen experiments. We observe significant improvements compared to well-established approaches focusing solely on predictive performance, especially for out-of-sample generalization.
Cutting-edge 'social' robot holds BINGO lessons for OAPs in a care home
A cutting-edge 'social' robot, designed to keep people company, is to hold bingo lessons for pensioners in a British care home as part of a study. Stevie the robot is so advanced he was recently named one of the best inventions of 2019 and featured on the cover of Time Magazine. There he will keep residents involved, entertained, and engaged - and rather bizarrely he will even be leading bingo sessions. Recently back from a visit to the States, Stevie has been placed into the care of experts from the University of Plymouth's Centre for Health Technology. Dr Conor McGinn, assistant professor at Trinity College Dublin, said: 'This pilot is the start of an exciting new relationship with the University of Plymouth.