AITopics | dataloader


dataloader

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

6d538a6e667960b168d3d947eb6207a6-Paper-Conference.pdf

Neural Information Processing Systems | Feb-9-2026, 16:05:30 GMT

Prior work tries to improve sampling locality by forcing all training jobs to load the same dataset in the same order and at the same pace. However, such a solution is only efficient under strong constraints: all jobs are trained on the same dataset with the same starting moment and training speed. In this paper, we propose a new data loading method for efficiently training parallel DNNs under much more flexible constraints. Our method remains highly efficient when different training jobs use different but overlapping datasets and have different starting moments and training speeds.

artificial intelligence, deep learning, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Illinois (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)


stable-pretraining-v1: Foundation Model Research Made Simple

Balestriero, Randall, Van Assel, Hugues, BuGhanem, Sami, Maes, Lucas

arXiv.org Artificial Intelligence | Nov-26-2025

Foundation models and self-supervised learning (SSL) have become central to modern AI, yet research in this area remains hindered by complex codebases, redundant re-implementations, and the heavy engineering burden of scaling experiments. We present stable-pretraining, a modular, extensible, and performance-optimized library built on top of PyTorch, Lightning, Hugging Face, and TorchMetrics. Unlike prior toolkits focused narrowly on reproducing state-of-the-art results, stable-pretraining is designed for flexibility and iteration speed: it unifies essential SSL utilities--including probes, collapse detection metrics, augmentation pipelines, and extensible evaluation routines--within a coherent and reliable framework. A central design principle is logging everything, enabling fine-grained visibility into training dynamics that makes debugging, monitoring, and reproducibility seamless. We validate the library by demonstrating its ability to generate new research insights with minimal overhead, including depthwise representation probing and the analysis of CLIP degradation under synthetic data finetuning. By lowering barriers to entry while remaining scalable to large experiments, stable-pretraining aims to accelerate discovery and expand the possibilities of foundation model research.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2511.19484

Country: North America (0.14)

Genre: Research Report (0.51)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)


Preparation Meets Opportunity: Enhancing Data Preprocessing for ML Training With Seneca

Desai, Omkar, Jiao, Ziyang, Pei, Shuyi, Bhimani, Janki, Kim, Bryan S.

arXiv.org Artificial Intelligence | Nov-19-2025

Input data preprocessing is a common bottleneck when concurrently training multimedia machine learning (ML) models in modern systems. To alleviate these bottlenecks and reduce the training time for concurrent jobs, we present Seneca, a data loading system that optimizes cache partitioning and data sampling for the data storage and ingestion (DSI) pipeline. The design of Seneca contains two key techniques. First, Seneca uses a performance model for the data pipeline to optimally partition the cache for three different forms of data (encoded, decoded, and augmented). Second, Seneca opportunistically serves cached data over uncached ones during random batch sampling so that concurrent jobs benefit from each other. We implement Seneca by modifying PyTorch and demonstrate its effectiveness by comparing it against several state-of-the-art caching systems for DNN training. Seneca reduces the makespan by 45.23% compared to PyTorch and increases data processing throughput by up to 3.45x compared to the next best dataloader.
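
The second technique, preferring cached samples during random batch sampling, can be illustrated with a minimal sketch. This is not Seneca's implementation: the function name `sample_batch`, the in-memory `cached` set, and the "cached first, then fill with uncached" policy are all assumptions made for illustration. The sketch only shows that a sampler can favor cache hits while still visiting every sample exactly once per epoch.

```python
import random


def sample_batch(remaining, cached, batch_size, rng):
    """Draw one batch from `remaining` (ids not yet used this epoch).

    Hypothetical policy: cached ids are served first in random order,
    and uncached ids fill whatever room is left in the batch.
    Chosen ids are removed from `remaining` so no id repeats in an epoch.
    """
    # sorted() makes the result depend only on the rng, not on set ordering.
    hits = [i for i in sorted(remaining) if i in cached]
    misses = [i for i in sorted(remaining) if i not in cached]
    rng.shuffle(hits)
    rng.shuffle(misses)
    batch = (hits + misses)[:batch_size]
    for i in batch:
        remaining.remove(i)
    return batch


# Example: ten sample ids, three of which are already in the cache.
rng = random.Random(0)
remaining = set(range(10))
first_batch = sample_batch(remaining, cached={2, 5, 7}, batch_size=4, rng=rng)
```

In this toy setup the first batch contains all three cached ids plus one uncached id. A real system would also have to keep the cache contents changing as concurrent jobs load data, which the sketch leaves out.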

artificial intelligence, information management, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2511.13724

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


OVERLORD: Ultimate Scaling of DataLoader for Multi-Source Large Foundation Model Training

Zhao, Juntao, Lu, Qi, Jia, Wei, Wan, Borui, Zuo, Lei, Feng, Junda, Jiang, Jianyu, Chen, Yangrui, Cao, Shuaishuai, He, Jialing, Jiang, Kaihua, Hu, Yuanzhe, Nong, Shibiao, Peng, Yanghua, Lin, Haibin, Liu, Xin, Wu, Chuan

arXiv.org Artificial Intelligence | May-20-2025

Modern frameworks for training large foundation models (LFMs) employ dataloaders in a data-parallel manner, with each loader processing a disjoint subset of training data. Under multisource preprocessing, two fundamental challenges exist. First, due to the quadratic computational complexity of the attention operator, the non-uniform sample distribution over data-parallel ranks leads to significant workload imbalance among dataloaders, degrading the training efficiency. Second, supporting diverse data sources requires per-dataset file access states that are redundantly replicated across parallel loaders, consuming excessive memory. This also hinders dynamic data mixing (e.g., curriculum learning) and causes redundant access/memory overhead in hybrid parallelism. We present Omniload, an industrial-grade distributed data loading architecture for LFMs, with four innovations: (1) Disaggregated data preprocessing via role-specific actors (Source Loaders/Data Constructors) to eliminate redundant data access across sources and parallelism dimensions and to ensure multisource scalability. (2) Centralized and declarative data plane for elastic multisource orchestration, such as long-short context, multimodality, and curriculum learning. (3) Multi-level auto-partitioning and scaling mechanism for source loaders under heterogeneous preprocessing costs. (4) Shadow loaders with differential checkpointing for fault recovery without workflow interruption. Deployed on production clusters scaling to multi-thousand GPUs, Omniload achieves (1) a 4.5x end-to-end training throughput improvement and (2) a 13.5x reduction in CPU memory usage.
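
The first challenge, workload imbalance caused by quadratic attention cost, can be made concrete with a small sketch. It is not the paper's mechanism: the cost model (`seq_len` squared, ignoring linear terms) and the greedy longest-first assignment in the hypothetical `balance` function are assumptions chosen only to show why splitting samples evenly by count can leave ranks with very different amounts of work.

```python
import heapq


def attention_cost(seq_len):
    # Assumed cost model: attention work grows with the square of the length.
    return seq_len * seq_len


def balance(seq_lens, num_ranks):
    """Greedy assignment: longest sequence first, to the least-loaded rank."""
    heap = [(0, rank) for rank in range(num_ranks)]  # (current load, rank id)
    assignment = [[] for _ in range(num_ranks)]
    for n in sorted(seq_lens, reverse=True):
        load, rank = heapq.heappop(heap)
        assignment[rank].append(n)
        heapq.heappush(heap, (load + attention_cost(n), rank))
    return assignment


def loads(assignment):
    """Total modeled attention cost per rank."""
    return [sum(attention_cost(n) for n in part) for part in assignment]


# Example: eight sequences split across two data-parallel ranks.
lens = [8, 1, 1, 1, 6, 2, 2, 2]
round_robin = [lens[0::2], lens[1::2]]  # equal sample counts per rank
balanced = balance(lens, 2)             # equalizes modeled cost instead
```

Under this cost model the round-robin split gives both ranks four samples each but modeled loads of 105 and 10, while the greedy split yields 64 and 51. The slowest rank sets the step time in synchronous data parallelism, so the maximum load is the number that matters.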

data mining, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2504.09844

Country: North America > United States (0.28)

Genre:

Research Report (0.66)
Workflow (0.48)

Industry: Information Technology (0.47)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
(3 more...)


M^3Builder: A Multi-Agent System for Automated Machine Learning in Medical Imaging

Feng, Jinghao, Zheng, Qiaoyu, Wu, Chaoyi, Zhao, Ziheng, Zhang, Ya, Wang, Yanfeng, Xie, Weidi

arXiv.org Artificial Intelligence | Feb-27-2025

Agentic AI systems have gained significant attention for their ability to autonomously perform complex tasks. However, their reliance on well-prepared tools limits their applicability in the medical domain, which requires training specialized models. In this paper, we make three contributions: (i) We present M3Builder, a novel multi-agent system designed to automate machine learning (ML) in medical imaging. At its core, M3Builder employs four specialized agents that collaborate to tackle complex, multi-step medical ML workflows, from automated data processing and environment configuration to self-contained auto debugging and model training. These agents operate within a medical imaging ML workspace, a structured environment designed to provide agents with free-text descriptions of datasets, training codes, and interaction tools, enabling seamless communication and task execution. (ii) To evaluate progress in automated medical imaging ML, we propose M3Bench, a benchmark comprising four general tasks on 14 training datasets, across five anatomies and three imaging modalities, covering both 2D and 3D data. (iii) We experiment with seven state-of-the-art large language models serving as agent cores for our system, such as the Claude series, GPT-4o, and DeepSeek-V3. Compared to existing ML agentic designs, M3Builder shows superior performance in completing ML tasks in medical imaging, achieving a 94.29% success rate using Claude-3.7-Sonnet as the agent core, showing huge potential towards fully automated machine learning in medical imaging.

dataloader, dataset, json, (15 more...)

arXiv.org Artificial Intelligence

2502.20301

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation

Chen, Tianxing, Mu, Yao, Liang, Zhixuan, Chen, Zanxin, Peng, Shijia, Chen, Qiangyu, Xu, Mingkun, Hu, Ruizhen, Zhang, Hongyuan, Li, Xuelong, Luo, Ping

arXiv.org Artificial Intelligence | Nov-27-2024

Recent advances in imitation learning for 3D robotic manipulation have shown promising results with diffusion-based policies. However, achieving human-level dexterity requires seamless integration of geometric precision and semantic understanding. We present G3Flow, a novel framework that constructs real-time semantic flow, a dynamic, object-centric 3D semantic representation by leveraging foundation models. Our approach uniquely combines 3D generative models for digital twin creation, vision foundation models for semantic feature extraction, and robust pose tracking for continuous semantic flow updates. This integration enables complete semantic understanding even under occlusions while eliminating manual annotation requirements. By incorporating semantic flow into diffusion policies, we demonstrate significant improvements in both terminal-constrained manipulation and cross-object generalization. Extensive experiments across five simulation tasks show that G3Flow consistently outperforms existing approaches, achieving up to 68.3% and 50.1% average success rates on terminal-constrained manipulation and cross-object generalization tasks respectively. Our results demonstrate the effectiveness of G3Flow in enhancing real-time dynamic semantic feature understanding for robotic manipulation policies.

manipulation, representation, shoe place, (15 more...)

arXiv.org Artificial Intelligence

2411.18369

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.35)


ML Research Benchmark

Kenney, Matthew

arXiv.org Artificial Intelligence | Oct-29-2024

Artificial intelligence agents are increasingly capable of performing complex tasks across various domains. As these agents advance, there is a growing need to accurately measure and benchmark their capabilities, particularly in accelerating AI research and development. Current benchmarks focus on general machine learning tasks, but lack comprehensive evaluation methods for assessing AI agents' abilities in tackling research-level problems and competition-level challenges in the field of AI. We present the ML Research Benchmark (MLRB), comprising 7 competition-level tasks derived from recent machine learning conference tracks. These tasks span activities typically undertaken by AI researchers, including model training efficiency, pretraining on limited data, domain specific fine-tuning, and model compression. This paper introduces a novel benchmark and evaluates it using agent scaffolds powered by frontier models, including Claude-3 and GPT-4o. The results indicate that the Claude-3.5 Sonnet agent performs best across our benchmark, excelling in planning and developing machine learning models. However, both tested agents struggled to perform non-trivial research iterations. We observed significant performance variations across tasks, highlighting the complexity of AI development and the challenges in creating versatile agent scaffolds. While current AI agents can successfully navigate complex instructions and produce baseline results, they fall short of the capabilities required for advanced AI research. The ML Research Benchmark provides a valuable framework for assessing and comparing AI agents on tasks mirroring real-world AI research challenges.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2410.22553

Country:

Asia > Middle East > Jordan (0.04)
Europe > Monaco (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Relative Representations: Topological and Geometric Perspectives

García-Castellanos, Alejandro, Marchetti, Giovanni Luca, Kragic, Danica, Scolamiero, Martina

arXiv.org Artificial Intelligence | Sep-17-2024

Relative representations are an established approach to zero-shot model stitching, consisting of a non-trainable transformation of the latent space of a deep neural network. Based on insights of topological and geometric nature, we propose two improvements to relative representations. First, we introduce a normalization procedure in the relative transformation, resulting in invariance to non-isotropic rescalings and permutations. The latter coincides with the symmetries in parameter space induced by common activation functions. Second, we propose to deploy topological densification when fine-tuning relative representations, a topological regularization loss encouraging clustering within classes. We provide an empirical investigation on a natural language task, where both the proposed variations yield improved performance on zero-shot model stitching.
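
For readers unfamiliar with the baseline, a relative representation is commonly computed as the vector of cosine similarities between a sample's embedding and the embeddings of a fixed set of anchor samples. The sketch below shows that baseline only, not the normalization or topological densification proposed in this paper. The helper names and the 2D toy vectors are assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def relative_representation(x, anchors):
    """Describe embedding `x` by its cosine similarity to each anchor."""
    return [cosine(x, a) for a in anchors]


def rotate_scale(v, angle, scale):
    """Apply a 2D rotation followed by a uniform rescaling (toy latent map)."""
    c, s = math.cos(angle), math.sin(angle)
    return [scale * (c * v[0] - s * v[1]), scale * (s * v[0] + c * v[1])]


# Two "models" whose latent spaces differ by a rotation and a uniform scale.
x = [1.0, 2.0]
anchors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rel_a = relative_representation(x, anchors)
rel_b = relative_representation(
    rotate_scale(x, 0.7, 3.0),
    [rotate_scale(a, 0.7, 3.0) for a in anchors],
)
```

Here `rel_a` and `rel_b` agree up to floating-point error, because cosine similarity is unchanged by rotations and uniform rescalings. The baseline does not cover non-isotropic rescalings, which the abstract names as a target of the proposed normalization.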

relative transformation, representation, transformation, (12 more...)

arXiv.org Artificial Intelligence

2409.10967

Country:

Europe > Sweden (0.05)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia (0.04)

Genre: Research Report (0.51)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)


A Partial Replication of MaskFormer in TensorFlow on TPUs for the TensorFlow Model Garden

Purohit, Vishal, Jiang, Wenxin, Ravikiran, Akshath R., Davis, James C.

arXiv.org Artificial Intelligence | Apr-29-2024

This paper undertakes the task of replicating the MaskFormer model -- a universal image segmentation model -- originally developed using the PyTorch framework, within the TensorFlow ecosystem, specifically optimized for execution on Tensor Processing Units (TPUs). Our implementation exploits the modular constructs available within the TensorFlow Model Garden (TFMG), encompassing elements such as the data loader, training orchestrator, and various architectural components, tailored and adapted to meet the specifications of the MaskFormer model. We address key challenges encountered during the replication, including non-convergence issues, slow training, adaptation of loss functions, and the integration of TPU-specific functionalities. We verify our reproduced implementation and present qualitative results on the COCO dataset. Although our implementation meets some of the objectives for end-to-end reproducibility, we encountered challenges in replicating the PyTorch version of MaskFormer in TensorFlow. This replication process is not straightforward and requires substantial engineering effort.

architecture, implementation, maskformer, (15 more...)

arXiv.org Artificial Intelligence

2404.18801

Country: North America > United States (0.14)

Genre:

Workflow (1.00)
Research Report (0.82)

Industry: Information Technology > Services (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

