Microsoft's 'Singularity' to Enable Global Accelerator Network for AI Training
In science fiction and future studies, the word "singularity" refers to a rapidly snowballing artificial intelligence that, by repeatedly iterating on itself, eclipses all human knowledge and ability. It is this word that Microsoft--perhaps ambitiously--has chosen for its new AI project, a "globally distributed scheduling service for highly efficient and reliable execution of deep learning training and inference workloads." Microsoft's Singularity is a response to the computational costs of deep learning workloads--costs that have quickly spiraled as those workloads have grown in size, complexity and number. It is also an attempt to maximize the use of idle time, which has become an increasing focus of discussions about how to minimize the costs and environmental footprints of high-performance computing systems and of AI model training on those systems. "Singularity is built with one key goal," explains the preprint paper, which was written by a team of more than two dozen Microsoft researchers and published on arXiv, "driving down the cost of AI by maximizing the aggregate useful throughput on a given fixed pool of capacity of accelerators on a planet scale, while providing stringent [service-level agreements] for multiple pricing tiers."
Feb-24-2022, 16:26:21 GMT