I/O in Machine Learning Applications on HPC Systems: A 360-degree Survey
Lewis, Noah, Bez, Jean Luca, Byna, Suren
–arXiv.org Artificial Intelligence
As Machine Learning (ML) workloads grow in popularity, demand is rising for I/O systems that can accommodate their distinct access patterns. Traditional workloads are commonly dominated by bursts of write operations, whereas ML workloads are usually read-intensive and use many small files [99]. Because there is no well-established consensus on the preferred I/O stack for ML workloads, many developers craft their own ad hoc algorithms and storage systems to meet the specific requirements of their applications [50]. This can under-utilize the storage system and lead to sub-optimal application performance, which motivates novel I/O optimization methods tailored to ML workloads. In Figure 1, we show the evolving I/O stack used for running ML workloads (on the right side) in comparison with the traditional HPC I/O stack (on the left side). The traditional HPC I/O stack has been developed to support massive parallelism.
Apr-16-2024
- Country:
- Asia
- India > Maharashtra
- Pune (0.04)
- Japan > Honshū
- Kansai > Kyoto Prefecture > Kyoto (0.04)
- Middle East > Iraq
- Erbil Governorate > Erbil (0.04)
- Europe
- Italy > Calabria
- Catanzaro Province > Catanzaro (0.04)
- Switzerland (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- North America
- Canada > British Columbia
- United States
- New York > New York County
- New York City (0.14)
- Missouri > St. Louis County
- St. Louis (0.04)
- Washington > King County
- Renton (0.04)
- Georgia > Fulton County
- Atlanta (0.04)
- Virginia (0.04)
- Ohio (0.04)
- Utah > Salt Lake County
- Salt Lake City (0.04)
- California > Santa Clara County
- Palo Alto (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Texas > Dallas County
- Dallas (0.04)
- Florida > Orange County
- Orlando (0.04)
- Genre:
- Research Report (0.50)
- Industry:
- Energy (0.93)
- Government > Regional Government
- Information Technology > Services (0.67)
- Technology: