Towards Federated Foundation Models: Scalable Dataset Pipelines for Group-Structured Learning
Zachary Charles
Neural Information Processing Systems
We introduce Dataset Grouper, a library to create large-scale group-structured (e.g., federated) datasets, enabling federated learning simulation at the scale of foundation models. This library facilitates the creation of group-structured versions of existing datasets based on user-specified partitions, and directly leads to a variety of useful heterogeneous datasets that can be plugged into existing software frameworks. Dataset Grouper offers three key advantages. First, it scales to settings where even a single group's dataset is too large to fit in memory. Second, it provides flexibility, both in choosing the base (non-partitioned) dataset and in defining partitions.
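To make the idea of a user-specified partition concrete, here is a minimal sketch in plain Python. It is not the Dataset Grouper API: the function names `partition_to_files` and `iter_group`, the JSON-lines layout, and the toy `domain` field are all hypothetical and chosen only for illustration. It assumes group identifiers are safe to use as file names. The sketch streams a base dataset once, routes each example to a per-group file using a caller-supplied key function, and reads a group back lazily, so neither the base dataset nor any single group has to fit in memory.

```python
import json
import os
import tempfile
from typing import Callable, Dict, Iterable, Iterator


def partition_to_files(
    examples: Iterable[dict],
    get_group_id: Callable[[dict], str],
    out_dir: str,
) -> Dict[str, int]:
    """Stream examples into one JSON-lines file per group; return per-group counts."""
    counts: Dict[str, int] = {}
    for example in examples:
        group_id = get_group_id(example)  # user-specified partition function
        path = os.path.join(out_dir, f"{group_id}.jsonl")
        # Append mode: only the current example is held in memory.
        with open(path, "a", encoding="utf-8") as f:
            f.write(json.dumps(example) + "\n")
        counts[group_id] = counts.get(group_id, 0) + 1
    return counts


def iter_group(out_dir: str, group_id: str) -> Iterator[dict]:
    """Lazily yield one group's examples without loading the whole group."""
    path = os.path.join(out_dir, f"{group_id}.jsonl")
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield json.loads(line)


# Toy base (non-partitioned) dataset; the "domain" field is a hypothetical group key.
base = [
    {"domain": "a.org", "text": "hello"},
    {"domain": "b.org", "text": "world"},
    {"domain": "a.org", "text": "again"},
]
out_dir = tempfile.mkdtemp()
counts = partition_to_files(base, lambda ex: ex["domain"], out_dir)
group_a_texts = [ex["text"] for ex in iter_group(out_dir, "a.org")]
```

Swapping the lambda for a different key function (for example, one keyed on author or document source) yields a different partition of the same base dataset, which is the kind of flexibility the abstract describes. A production pipeline would more likely use a distributed group-by than per-example file appends.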
Feb-19-2026, 19:13:43 GMT
- Country:
  - Asia > Myanmar
    - Tanintharyi Region > Dawei (0.04)
  - North America > United States
    - Virginia (0.04)
  - South America > Chile
- Genre:
  - Research Report > New Finding (0.46)
- Industry:
  - Information Technology > Security & Privacy (0.67)
- Technology: