 Cormode, Graham


Federated Analytics in Practice: Engineering for Privacy, Scalability and Practicality

arXiv.org Artificial Intelligence

Cross-device Federated Analytics (FA) is a distributed computation paradigm designed to answer analytics queries about and derive insights from data held locally on users' devices. On-device computations combined with other privacy and security measures ensure that only minimal data is transmitted off-device, achieving a high standard of data protection. Despite FA's broad relevance, the applicability of existing FA systems is limited by compromised accuracy; lack of flexibility for data analytics; and an inability to scale effectively. In this paper, we describe our approach to combine privacy, scalability, and practicality to build and deploy a system that overcomes these limitations. Our FA system leverages trusted execution environments (TEEs) and optimizes the use of on-device computing resources to facilitate federated data processing across large fleets of devices, while ensuring robust, defensible, and verifiable privacy safeguards. We focus on federated analytics (statistics and monitoring), in contrast to systems for federated learning (ML workloads), and we flag the key differences.
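The core on-device pattern — raw data stays local, only a small bounded aggregate leaves each device — can be illustrated with a minimal sketch. All names here are hypothetical, and this omits the TEE-backed aggregation and other privacy measures the system actually relies on:

```python
def device_contribution(local_values, clip=10.0):
    """Runs on each device: compute a bounded local sum.

    Only this single scalar is transmitted off-device; the raw
    local_values never leave. Clipping bounds any one device's
    influence on the aggregate.
    """
    return sum(max(-clip, min(clip, v)) for v in local_values)

def federated_mean(fleet):
    """Server-side aggregation: the server sees only per-device
    scalars and counts, never individual records."""
    total = sum(device_contribution(vals) for vals in fleet)
    count = sum(len(vals) for vals in fleet)
    return total / count

# Example: three values on one device, two on another.
# federated_mean([[1, 2, 3], [4, 5]]) averages all five values.
```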


Towards Robust Federated Analytics via Differentially Private Measurements of Statistical Heterogeneity

arXiv.org Artificial Intelligence

Statistical heterogeneity is a measure of how skewed the samples of a dataset are. A common problem in the study of differential privacy is that using a statistically heterogeneous dataset results in a significant loss of accuracy. In federated scenarios, statistical heterogeneity is more likely to arise, making this problem even more pressing. We explore the three most promising ways to measure statistical heterogeneity and give formulae for their accuracy, while simultaneously incorporating differential privacy. We find the optimal privacy parameters via an analytic mechanism, which incorporates root-finding methods. We validate the main theorems and related hypotheses experimentally, and test the robustness of the analytic mechanism to different heterogeneity levels. The analytic mechanism in a distributed setting delivers superior accuracy to all combinations involving the classic mechanism and/or the centralized setting. None of the measures of statistical heterogeneity loses significant accuracy when a heterogeneous sample is used.
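The classic (Laplace) mechanism that the paper's analytic mechanism is compared against can be sketched on a simple stand-in heterogeneity measure. The measure, the sensitivity bound, and the function names below are illustrative assumptions, not the paper's actual formulae:

```python
import math
import random

def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) via inverse-CDF sampling."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def dp_heterogeneity(counts, epsilon, rng=random):
    """Release a simple skew measure under the classic Laplace mechanism.

    Stand-in measure: max/mean ratio of per-client sample counts
    (1.0 = perfectly balanced). The sensitivity bound of 1.0 is a
    loose illustrative assumption, not a derived value.
    """
    n = len(counts)
    skew = max(counts) / (sum(counts) / n)
    sensitivity = 1.0  # assumed bound, for illustration only
    return skew + laplace_noise(sensitivity / epsilon, rng)
```

With a large epsilon the noise vanishes and the released value approaches the true skew; the paper's contribution is choosing the privacy parameters analytically rather than using this fixed classic mechanism.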


Distributed, communication-efficient, and differentially private estimation of KL divergence

arXiv.org Artificial Intelligence

Modern applications in data analysis and machine learning work with high-dimensional data to support inferences and provide recommendations [1, 2]. Increasingly, the data to support these tasks comes from individuals who hold their data on personal devices such as smartphones and wearables. In the federated model of computation [3, 4], this data remains on the users' devices, which collaborate and cooperate to build accurate models by performing computations and aggregations on their locally held information (e.g., training and fine-tuning small-scale models). A key primitive needed is the ability to compare the distribution of data held by these clients with a reference distribution. For instance, a platform or a service provider would like to know whether the overall behavior of the data is consistent over time, so as to deploy the best-fitting and most relevant model. In cases where the data distribution has changed, it may be necessary to trigger model rebuilding or fine-tuning, whereas if there is no change the current model can continue to be used.
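The comparison primitive — a client-side histogram summary plus a KL divergence against a reference distribution — can be sketched as follows. This is a minimal non-private version: the differentially private noise addition and the communication-efficient encodings that are the paper's focus are omitted:

```python
import math

def client_histogram(samples, k):
    """Runs on each client: summarize local data (integer bucket ids
    in [0, k)) as a normalized k-bucket histogram. Only this small
    vector needs to be communicated, not the raw samples."""
    counts = [0] * k
    for s in samples:
        counts[s] += 1
    n = len(samples)
    return [c / n for c in counts]

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions over the same support.
    Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

A server could average the client histograms and compare the result against last period's reference; a large divergence would trigger model rebuilding or fine-tuning.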


Pruning Compact ConvNets for Efficient Inference

arXiv.org Artificial Intelligence

Neural network pruning is frequently used to compress over-parameterized networks by large amounts, while incurring only marginal drops in generalization performance. However, the impact of pruning on networks that have been highly optimized for efficient inference has not received the same level of attention. In this paper, we analyze the effect of pruning for computer vision, and study state-of-the-art ConvNets, such as the FBNetV3 family of models. We show that model pruning approaches can be used to further optimize networks trained through NAS (Neural Architecture Search). The resulting family of pruned models can consistently obtain better performance than existing FBNetV3 models at the same level of computation, and thus provide state-of-the-art results when trading off between computational complexity and generalization performance on the ImageNet benchmark. In addition to better generalization performance, we also demonstrate that when limited computation resources are available, pruning FBNetV3 models incurs only a fraction of the GPU-hours involved in running a full-scale NAS. Neural networks frequently suffer from the problem of over-parameterization, such that the model can be compressed by a large factor to drastically reduce memory footprint, computation, and energy consumption while maintaining similar performance. This is especially pronounced for models for computer vision (Simonyan & Zisserman, 2014), speech recognition (Pratap et al., 2020) and large text understanding models such as BERT (Devlin et al., 2018).
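The basic operation underlying such pruning approaches — zeroing out the smallest-magnitude weights at a target sparsity — can be sketched generically. This is unstructured magnitude pruning on a flat weight list, a simplified assumption; the paper applies more sophisticated pruning schedules to NAS-optimized FBNetV3 models:

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of weights.

    weights:  flat list of floats
    sparsity: fraction in [0, 1] of weights to remove

    Weights tied at the threshold are all pruned, so the achieved
    sparsity can slightly exceed the target when ties occur.
    """
    k = int(sparsity * len(weights))
    if k == 0:
        return list(weights)
    # Magnitude of the k-th smallest weight becomes the prune threshold.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]
```

In practice the surviving weights are then fine-tuned to recover any lost accuracy, which is far cheaper than re-running a full-scale architecture search.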


Learning Graphical Models from a Distributed Stream

arXiv.org Machine Learning

A current challenge for data management systems is to support the construction and maintenance of machine learning models over data that is large, multi-dimensional, and evolving. While systems that could support these tasks are emerging, the need to scale to distributed, streaming data requires new models and algorithms. In this setting, in addition to computational scalability and model accuracy, we must also minimize the amount of communication between distributed processors, which is the chief component of latency. We study Bayesian networks, the workhorse of graphical models, and present a communication-efficient method for continuously learning and maintaining a Bayesian network model over data that is arriving as a distributed stream partitioned across multiple processors. We show a strategy for maintaining model parameters that leads to an exponential reduction in communication when compared with baseline approaches that maintain the exact MLE (maximum likelihood estimate). Meanwhile, our strategy provides similar prediction errors for the target distribution and for classification tasks.
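The kind of thresholded-update strategy that yields such communication savings can be sketched as follows: a processor re-transmits a model counter only when it has drifted by more than a multiplicative (1 + theta) factor since the last transmission, so over a stream of n updates the number of messages grows roughly logarithmically in n rather than linearly. The function name and the specific rule are a simplified illustration, not the paper's exact protocol:

```python
def should_send(last_sent, current, theta=0.1):
    """Decide whether a local counter must be re-sent to the coordinator.

    Re-send only when `current` has drifted outside the multiplicative
    band [last_sent / (1 + theta), last_sent * (1 + theta)]. Between
    transmissions the coordinator keeps using the stale value, which
    stays within a (1 + theta) factor of the truth.
    """
    if last_sent == 0:
        return current > 0
    return current > last_sent * (1 + theta) or current < last_sent / (1 + theta)

def simulate_stream(n, theta=0.1):
    """Count transmissions for a counter incremented n times."""
    sent, messages = 0, 0
    for t in range(1, n + 1):
        if should_send(sent, t, theta):
            sent = t
            messages += 1
    return messages
```

Because the stale counts are within a (1 + theta) factor of the exact ones, the conditional probability tables derived from them — and hence predictions — remain close to those of the exact MLE model.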