 Cormode, Graham


Federated Analytics in Practice: Engineering for Privacy, Scalability and Practicality

arXiv.org Artificial Intelligence

Cross-device Federated Analytics (FA) is a distributed computation paradigm designed to answer analytics queries about and derive insights from data held locally on users' devices. On-device computations combined with other privacy and security measures ensure that only minimal data is transmitted off-device, achieving a high standard of data protection. Despite FA's broad relevance, the applicability of existing FA systems is limited by compromised accuracy; lack of flexibility for data analytics; and an inability to scale effectively. In this paper, we describe our approach to combine privacy, scalability, and practicality to build and deploy a system that overcomes these limitations. Our FA system leverages trusted execution environments (TEEs) and optimizes the use of on-device computing resources to facilitate federated data processing across large fleets of devices, while ensuring robust, defensible, and verifiable privacy safeguards. We focus on federated analytics (statistics and monitoring), in contrast to systems for federated learning (ML workloads), and we flag the key differences.
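The core on-device pattern — raw data stays local, only a small bounded aggregate leaves each device — can be illustrated with a minimal sketch. All names here are hypothetical, and this omits the TEE-backed aggregation and other privacy measures the system actually relies on:

```python
def device_contribution(local_values, clip=10.0):
    """Runs on each device: compute a bounded local sum.

    Only this single scalar is transmitted off-device; the raw
    local_values never leave. Clipping bounds any one device's
    influence on the aggregate.
    """
    return sum(max(-clip, min(clip, v)) for v in local_values)

def federated_mean(fleet):
    """Server-side aggregation: the server sees only per-device
    scalars and counts, never individual records."""
    total = sum(device_contribution(vals) for vals in fleet)
    count = sum(len(vals) for vals in fleet)
    return total / count

# Example: three values on one device, two on another.
# federated_mean([[1, 2, 3], [4, 5]]) averages all five values.
```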


Towards Robust Federated Analytics via Differentially Private Measurements of Statistical Heterogeneity

arXiv.org Artificial Intelligence

Statistical heterogeneity is a measure of how skewed the samples of a dataset are. A common problem in the study of differential privacy is that using a statistically heterogeneous dataset results in a significant loss of accuracy. In federated scenarios, statistical heterogeneity is more likely to arise, making this problem even more pressing. We explore the three most promising ways to measure statistical heterogeneity and give formulae for their accuracy, while simultaneously incorporating differential privacy. We find the optimal privacy parameters via an analytic mechanism, which incorporates root-finding methods. We validate the main theorems and related hypotheses experimentally, and test the robustness of the analytic mechanism to different heterogeneity levels. The analytic mechanism in a distributed setting delivers superior accuracy to all combinations involving the classic mechanism and/or the centralized setting. None of the measures of statistical heterogeneity loses significant accuracy when a heterogeneous sample is used.
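The classic (Laplace) mechanism that the paper's analytic mechanism is compared against can be sketched on a simple stand-in heterogeneity measure. The measure, the sensitivity bound, and the function names below are illustrative assumptions, not the paper's actual formulae:

```python
import math
import random

def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) via inverse-CDF sampling."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def dp_heterogeneity(counts, epsilon, rng=random):
    """Release a simple skew measure under the classic Laplace mechanism.

    Stand-in measure: max/mean ratio of per-client sample counts
    (1.0 = perfectly balanced). The sensitivity bound of 1.0 is a
    loose illustrative assumption, not a derived value.
    """
    n = len(counts)
    skew = max(counts) / (sum(counts) / n)
    sensitivity = 1.0  # assumed bound, for illustration only
    return skew + laplace_noise(sensitivity / epsilon, rng)
```

With a large epsilon the noise vanishes and the released value approaches the true skew; the paper's contribution is choosing the privacy parameters analytically rather than using this fixed classic mechanism.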


Distributed, communication-efficient, and differentially private estimation of KL divergence

arXiv.org Artificial Intelligence

Modern applications in data analysis and machine learning work with high-dimensional data to support inferences and provide recommendations [1, 2]. Increasingly, the data to support these tasks comes from individuals who hold their data on personal devices such as smartphones and wearables. In the federated model of computation [3, 4], this data remains on the users' devices, which collaborate and cooperate to build accurate models by performing computations and aggregations on their locally held information (e.g., training and fine-tuning small-scale models). A key primitive needed is the ability to compare the distribution of data held by these clients with a reference distribution. For instance, a platform or a service provider would like to know whether the overall behavior of the data is consistent over time, so as to deploy the best-fitting and most relevant model. In cases where the data distribution has changed, it may be necessary to trigger model rebuilding or fine-tuning, whereas if there is no change the current model can continue to be used.
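The comparison primitive — a client-side histogram summary plus a KL divergence against a reference distribution — can be sketched as follows. This is a minimal non-private version: the differentially private noise addition and the communication-efficient encodings that are the paper's focus are omitted:

```python
import math

def client_histogram(samples, k):
    """Runs on each client: summarize local data (integer bucket ids
    in [0, k)) as a normalized k-bucket histogram. Only this small
    vector needs to be communicated, not the raw samples."""
    counts = [0] * k
    for s in samples:
        counts[s] += 1
    n = len(samples)
    return [c / n for c in counts]

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions over the same support.
    Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

A server could average the client histograms and compare the result against last period's reference; a large divergence would trigger model rebuilding or fine-tuning.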


Pruning Compact ConvNets for Efficient Inference

arXiv.org Artificial Intelligence

Neural network pruning is frequently used to compress over-parameterized networks by large amounts, while incurring only marginal drops in generalization performance. However, the impact of pruning on networks that have been highly optimized for efficient inference has not received the same level of attention. In this paper, we analyze the effect of pruning for computer vision, and study state-of-the-art ConvNets, such as the FBNetV3 family of models. We show that model pruning approaches can be used to further optimize networks trained through NAS (Neural Architecture Search). The resulting family of pruned models can consistently obtain better performance than existing FBNetV3 models at the same level of computation, and thus provide state-of-the-art results when trading off between computational complexity and generalization performance on the ImageNet benchmark. In addition to better generalization performance, we also demonstrate that when limited computation resources are available, pruning FBNetV3 models incurs only a fraction of the GPU-hours involved in running a full-scale NAS. Neural networks frequently suffer from the problem of over-parameterization, such that the model can be compressed by a large factor to drastically reduce memory footprint, computation, and energy consumption while maintaining similar performance. This is especially pronounced for models for computer vision (Simonyan & Zisserman, 2014), speech recognition (Pratap et al., 2020) and large text understanding models such as BERT (Devlin et al., 2018).
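The basic operation underlying such pruning approaches — zeroing out the smallest-magnitude weights at a target sparsity — can be sketched generically. This is unstructured magnitude pruning on a flat weight list, a simplified assumption; the paper applies more sophisticated pruning schedules to NAS-optimized FBNetV3 models:

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of weights.

    weights:  flat list of floats
    sparsity: fraction in [0, 1] of weights to remove

    Weights tied at the threshold are all pruned, so the achieved
    sparsity can slightly exceed the target when ties occur.
    """
    k = int(sparsity * len(weights))
    if k == 0:
        return list(weights)
    # Magnitude of the k-th smallest weight becomes the prune threshold.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]
```

In practice the surviving weights are then fine-tuned to recover any lost accuracy, which is far cheaper than re-running a full-scale architecture search.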


Learning Graphical Models from a Distributed Stream

arXiv.org Machine Learning

A current challenge for data management systems is to support the construction and maintenance of machine learning models over data that is large, multi-dimensional, and evolving. While systems that could support these tasks are emerging, the need to scale to distributed, streaming data requires new models and algorithms. In this setting, in addition to computational scalability and model accuracy, we must also minimize the amount of communication between distributed processors, which is the chief component of latency. We study Bayesian networks, the workhorse of graphical models, and present a communication-efficient method for continuously learning and maintaining a Bayesian network model over data that is arriving as a distributed stream partitioned across multiple processors. We show a strategy for maintaining model parameters that leads to an exponential reduction in communication when compared with baseline approaches that maintain the exact MLE (maximum likelihood estimate). Meanwhile, our strategy provides similar prediction errors for the target distribution and for classification tasks.
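The kind of thresholded-update strategy that yields such communication savings can be sketched as follows: a processor re-transmits a model counter only when it has drifted by more than a multiplicative (1 + theta) factor since the last transmission, so over a stream of n updates the number of messages grows roughly logarithmically in n rather than linearly. The function name and the specific rule are a simplified illustration, not the paper's exact protocol:

```python
def should_send(last_sent, current, theta=0.1):
    """Decide whether a local counter must be re-sent to the coordinator.

    Re-send only when `current` has drifted outside the multiplicative
    band [last_sent / (1 + theta), last_sent * (1 + theta)]. Between
    transmissions the coordinator keeps using the stale value, which
    stays within a (1 + theta) factor of the truth.
    """
    if last_sent == 0:
        return current > 0
    return current > last_sent * (1 + theta) or current < last_sent / (1 + theta)

def simulate_stream(n, theta=0.1):
    """Count transmissions for a counter incremented n times."""
    sent, messages = 0, 0
    for t in range(1, n + 1):
        if should_send(sent, t, theta):
            sent = t
            messages += 1
    return messages
```

Because the stale counts are within a (1 + theta) factor of the exact ones, the conditional probability tables derived from them — and hence predictions — remain close to those of the exact MLE model.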