AITopics | cohort size

Collaborating Authors

cohort size

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Enabling Differentially Private Federated Learning for Speech Recognition: Benchmarks, Adaptive Optimizers and Gradient Clipping

Neural Information Processing SystemsJun-20-2026, 22:00:04 GMT

While federated learning (FL) and differential privacy (DP) have been extensively studied, their application to automatic speech recognition (ASR) remains largely unexplored due to the challenges in training large transformer models. Specifically, large models further exacerbate issues in FL as they are particularly susceptible to gradient heterogeneity across layers, unlike the relatively uniform gradient behavior observed in shallow models. As a result, prior works struggle to converge with standard optimization techniques, even in the absence of DP mechanisms. To the best of our knowledge, no existing work establishes a competitive, practical recipe for FL with DP in the context of ASR. To address this gap, we establish the first benchmark for FL with DP in end-to-end ASR. Our approach centers on per-layer clipping and layer-wise gradient normalization: theoretical analysis reveals that these techniques together mitigate clipping bias and gradient heterogeneity across layers in deeper models. Consistent with these theoretical insights, our empirical results show that FL with DP is viable under strong privacy guarantees, provided a population of at least several million users. Specifically, we achieve user-level (7.2, 10 9)-DP (resp.

artificial intelligence, machine learning, seed model, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.45)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ab9ebd57177b5106ad7879f0896685d4-Supplemental.pdf

Neural Information Processing SystemsFeb-10-2026, 14:22:52 GMT

accuracy, cohort size, communication round, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Virginia (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Add feedback

OnLarge-CohortTrainingforFederatedLearning

Neural Information Processing SystemsFeb-10-2026, 14:22:47 GMT

We give partial answers to these questions based on extensive empirical evaluation.

artificial intelligence, arxivpreprintarxiv, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Virginia (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

On Large-Cohort Training for Federated Learning

Neural Information Processing SystemsDec-24-2025, 17:11:19 GMT

Federated learning methods typically learn a model by iteratively sampling updates from a population of clients. In this work, we explore how the number of clients sampled at each round (the cohort size) impacts the quality of the learned model and the training dynamics of federated learning algorithms. Our work poses three fundamental questions. First, what challenges arise when trying to scale federated learning to larger cohorts? Second, what parallels exist between cohort sizes in federated learning, and batch sizes in centralized learning? Last, how can we design federated learning methods that effectively utilize larger cohort sizes? We give partial answers to these questions based on extensive empirical evaluation. Our work highlights a number of challenges stemming from the use of larger cohorts. While some of these (such as generalization issues and diminishing returns) are analogs of large-batch training challenges, others (including catastrophic training failures and fairness concerns) are unique to federated learning.

federated learning, large-cohort training, name change, (5 more...)

Neural Information Processing Systems

Country: North America > United States > Virginia (0.08)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

f64e55d03e2fe61aa4114e49cb654acb-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsAug-19-2025, 20:32:42 GMT

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Virginia (0.04)

Genre: Research Report (0.46)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback

ab9ebd57177b5106ad7879f0896685d4-Supplemental.pdf

Neural Information Processing SystemsAug-16-2025, 17:01:17 GMT

artificial intelligence, cohort size, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Virginia (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Add feedback

ab9ebd57177b5106ad7879f0896685d4-Paper.pdf

Neural Information Processing SystemsAug-16-2025, 17:01:13 GMT

artificial intelligence, cohort size, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Virginia (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.46)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

On Large-Cohort Training for Federated Learning

Neural Information Processing SystemsJan-18-2025, 14:41:19 GMT

cohort size, federated learning, large-cohort training, (1 more...)

Neural Information Processing Systems

Country: North America > United States > Virginia (0.10)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Federated Learning with Differential Privacy for End-to-End Speech Recognition

Pelikan, Martin, Azam, Sheikh Shams, Feldman, Vitaly, Silovsky, Jan "Honza", Talwar, Kunal, Likhomanenko, Tatiana

arXiv.org Machine LearningSep-29-2023

While federated learning (FL) has recently emerged as a promising approach to train machine learning models, it is limited to only preliminary explorations in the domain of automatic speech recognition (ASR). Moreover, FL does not inherently guarantee user privacy and requires the use of differential privacy (DP) for robust privacy guarantees. However, we are not aware of prior work on applying DP to FL for ASR. In this paper, we aim to bridge this research gap by formulating an ASR benchmark for FL with DP and establishing the first baselines. First, we extend the existing research on FL for ASR by exploring different aspects of recent $\textit{large end-to-end transformer models}$: architecture design, seed models, data heterogeneity, domain shift, and impact of cohort size. With a $\textit{practical}$ number of central aggregations we are able to train $\textbf{FL models}$ that are \textbf{nearly optimal} even with heterogeneous data, a seed model from another domain, or no pre-trained seed model. Second, we apply DP to FL for ASR, which is non-trivial since DP noise severely affects model training, especially for large transformer models, due to highly imbalanced gradients in the attention block. We counteract the adverse effect of DP noise by reviving per-layer clipping and explaining why its effect is more apparent in our case than in the prior work. Remarkably, we achieve user-level ($7.2$, $10^{-9}$)-$\textbf{DP}$ (resp. ($4.5$, $10^{-9}$)-$\textbf{DP}$) with a 1.3% (resp. 4.6%) absolute drop in the word error rate for extrapolation to high (resp. low) population scale for $\textbf{FL with DP in ASR}$.

artificial intelligence, machine learning, seed model, (18 more...)

arXiv.org Machine Learning

2310.00098

Country:

North America > United States > Virginia (0.04)
North America > Canada > Ontario > Toronto (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.90)

Add feedback

Importance of Smoothness Induced by Optimizers in FL4ASR: Towards Understanding Federated Learning for End-to-End ASR

Azam, Sheikh Shams, Likhomanenko, Tatiana, Pelikan, Martin, Silovsky, Jan "Honza"

arXiv.org Artificial IntelligenceSep-22-2023

In this paper, we start by training End-to-End Automatic Speech Recognition (ASR) models using Federated Learning (FL) and examining the fundamental considerations that can be pivotal in minimizing the performance gap in terms of word error rate between models trained using FL versus their centralized counterpart. Specifically, we study the effect of (i) adaptive optimizers, (ii) loss characteristics via altering Connectionist Temporal Classification (CTC) weight, (iii) model initialization through seed start, (iv) carrying over modeling setup from experiences in centralized training to FL, e.g., pre-layer or post-layer normalization, and (v) FL-specific hyperparameters, such as number of local epochs, client sampling size, and learning rate scheduler, specifically for ASR under heterogeneous data distribution. We shed light on how some optimizers work better than others via inducing smoothness. We also summarize the applicability of algorithms, trends, and propose best practices from prior works in FL (in general) toward End-to-End ASR models.

international conference, learning, optimizer, (14 more...)

arXiv.org Artificial Intelligence

2309.13102

Country: South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.69)

Add feedback