Collaborating Authors: Scott, Mary


Towards Robust Federated Analytics via Differentially Private Measurements of Statistical Heterogeneity

arXiv.org Artificial Intelligence

Statistical heterogeneity measures how skewed the samples of a dataset are. A common problem in the study of differential privacy is that using a statistically heterogeneous dataset results in a significant loss of accuracy. In federated scenarios, statistical heterogeneity is more likely to arise, which makes this problem even more pressing. We explore the three most promising ways to measure statistical heterogeneity and give formulae for their accuracy, while simultaneously incorporating differential privacy. We find the optimal privacy parameters via an analytic mechanism that incorporates root-finding methods. We validate the main theorems and related hypotheses experimentally, and test the robustness of the analytic mechanism under different levels of heterogeneity. The analytic mechanism in a distributed setting delivers superior accuracy to all combinations involving the classic mechanism and/or the centralized setting. None of the measures of statistical heterogeneity loses significant accuracy when a heterogeneous sample is used.
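The abstract contrasts a "classic" mechanism with an "analytic" mechanism whose noise level is found by root finding. A minimal sketch of that idea, assuming the analytic mechanism is in the style of analytic Gaussian calibration (Balle and Wang, 2018), where bisection finds the smallest noise scale sigma whose exact privacy loss meets a (epsilon, delta) target; all function names here are illustrative, not the paper's API:

```python
import math
import random

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gaussian_delta(sigma, eps, sens):
    """Exact delta achieved by the Gaussian mechanism at scale sigma
    (the analytic expression, rather than the classic tail bound)."""
    a = sens / (2.0 * sigma)
    b = eps * sigma / sens
    return phi(a - b) - math.exp(eps) * phi(-a - b)

def calibrate_sigma(eps, delta, sens, lo=1e-6, hi=1e6, iters=100):
    """Bisection (a simple root-finding method) for the smallest sigma
    with gaussian_delta(sigma) <= delta; delta is decreasing in sigma."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gaussian_delta(mid, eps, sens) > delta:
            lo = mid  # too little noise: privacy loss still too large
        else:
            hi = mid
    return hi

def analytic_gaussian(value, eps, delta, sens, rng=random):
    """Release a statistic with analytically calibrated Gaussian noise."""
    return value + rng.gauss(0.0, calibrate_sigma(eps, delta, sens))
```

For epsilon = 1 and delta = 1e-5 the calibrated sigma comes out noticeably below the classic bound sens * sqrt(2 ln(1.25/delta)) / epsilon, which is one reason an analytic mechanism can deliver superior accuracy.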


Distributed, communication-efficient, and differentially private estimation of KL divergence

arXiv.org Artificial Intelligence

Modern applications in data analysis and machine learning work with high-dimensional data to support inferences and provide recommendations [1, 2]. Increasingly, the data to support these tasks comes from individuals who hold their data on personal devices such as smartphones and wearables. In the federated model of computation [3, 4], this data remains on the users' devices, which collaborate and cooperate to build accurate models by performing computations and aggregations on their locally held information (e.g., training and fine-tuning small-scale models). A key primitive needed is the ability to compare the distribution of data held by these clients with a reference distribution. For instance, a platform or a service provider would like to know whether the overall behavior of the data is consistent over time, in order to deploy the best-fitting and most relevant model. In cases where the data distribution has changed, it may be necessary to trigger model rebuilding or fine-tuning, whereas if there is no change the current model can continue to be used.
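The primitive described here, comparing client data against a reference distribution under differential privacy, can be sketched with a plug-in KL estimator over a Laplace-noised histogram. This is a toy single-client illustration under assumed design choices (discrete domain, L1 sensitivity 2 for one changed sample), not the paper's actual distributed protocol:

```python
import math
import random
from collections import Counter

def laplace(scale, rng=random):
    """Sample Laplace(0, scale) noise via inverse-CDF sampling."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_kl_estimate(samples, reference, eps, rng=random):
    """Plug-in estimate of KL(empirical || reference) from a noised histogram.

    `reference` maps each domain item to its probability. Changing one
    sample moves two counts by 1 each, so the L1 sensitivity is 2 and
    the Laplace scale is 2/eps. Noisy counts are floored at a small
    positive value so the log stays defined, then renormalized.
    """
    counts = Counter(samples)
    noisy = {x: max(counts.get(x, 0) + laplace(2.0 / eps, rng), 1e-6)
             for x in reference}
    total = sum(noisy.values())
    p = {x: noisy[x] / total for x in reference}
    return sum(p[x] * math.log(p[x] / reference[x]) for x in reference)
```

With a large privacy budget the noise is negligible and the estimate approaches the true divergence; shrinking eps trades accuracy for privacy, which is the tension a communication-efficient distributed estimator has to manage.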


A robust synthetic data generation framework for machine learning in High-Resolution Transmission Electron Microscopy (HRTEM)

arXiv.org Artificial Intelligence

Machine learning techniques are attractive options for developing highly accurate automated analysis tools for nanomaterials characterization, including high-resolution transmission electron microscopy (HRTEM). However, successfully implementing such machine learning tools can be difficult due to the challenges of procuring sufficiently large, high-quality training datasets from experiments. In this work, we introduce Construction Zone, a Python package for rapidly generating complex nanoscale atomic structures, and develop an end-to-end workflow for creating large simulated databases for training neural networks. Construction Zone enables fast, systematic sampling of realistic nanomaterial structures and can be used as a random structure generator for simulated databases, which is important for generating large, diverse synthetic datasets. Using HRTEM imaging as an example, we train a series of neural networks on various subsets of our simulated databases to segment nanoparticles, and we holistically study the data curation process to understand how various aspects of the curated simulated data -- including simulation fidelity, the distribution of atomic structures, and the distribution of imaging conditions -- affect model performance across several experimental benchmarks. Using these results, we achieve state-of-the-art segmentation performance on experimental HRTEM images of nanoparticles, and we discuss robust strategies for consistently achieving high performance with machine learning in experimental settings using purely synthetic data.