 batch ai


How to Do Distributed Deep Learning for Object Detection Using Horovod on Azure

#artificialintelligence

This post is co-authored by Mary Wahl, Data Scientist; Xiaoyong Zhu, Program Manager; Siyu Yang, Software Development Engineer; and Wee Hyong Tok, Principal Data Scientist Manager, at Microsoft. Object detection powers some of the most widely adopted computer vision applications, from people counting in crowd control to pedestrian detection in self-driving cars. Training an object detection model can take weeks on a single GPU, which is prohibitively long when experimenting with hyperparameters and model architectures. This post shows how to train an object detection model by distributing deep learning training across multiple GPUs, which can sit on a single machine or be spread over several machines.
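To make the idea of distributing training concrete, here is a minimal sketch of synchronous data-parallel training, the general scheme that Horovod implements by averaging gradients across workers with an allreduce. It is not the post's code and does not call Horovod: the workers are simulated in a single process, the model is a one-parameter linear fit, and the names `local_gradient`, `allreduce_mean`, and `train` are hypothetical helpers chosen for illustration.

```python
# Simulated synchronous data-parallel SGD. No GPUs or Horovod are used;
# each "worker" is just a slice of the data processed in the same process.
# Model (an assumption for illustration): y = w * x with mean squared error.

def local_gradient(w, shard):
    """Gradient of the mean squared error on one worker's data shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def allreduce_mean(values):
    """Stand-in for an allreduce: every worker receives the average."""
    return sum(values) / len(values)

def train(data, num_workers, lr=0.01, steps=200):
    # Give each simulated worker an equal-sized shard of the data.
    shards = [data[i::num_workers] for i in range(num_workers)]
    w = 0.0
    for _ in range(steps):
        # Each worker computes a gradient on its own shard...
        grads = [local_gradient(w, shard) for shard in shards]
        # ...then all workers apply the same averaged gradient.
        w -= lr * allreduce_mean(grads)
    return w

data = [(float(x), 3.0 * x) for x in range(1, 9)]  # samples of y = 3x
w_single = train(data, num_workers=1)
w_multi = train(data, num_workers=4)
print(w_single, w_multi)
```

Because the shards are equal in size, the averaged gradient matches the full-batch gradient, so four simulated workers reach the same weight as one. On real hardware the benefit is that each worker only processes its own shard per step, which is where the reduction in wall-clock time comes from.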


Scale up your deep learning with Batch AI preview | Blog | Microsoft Azure

#artificialintelligence

Imagine reducing your training time for an epoch from 30 minutes to 30 seconds, and testing many different hyperparameter values in parallel. Available now in public preview, Batch AI is a new service that helps you train and test deep learning and other AI or machine learning models with the same scale and flexibility used by Microsoft's data scientists. Managed clusters of GPUs let you design larger networks and run experiments in parallel and at scale, which reduces iteration time and makes development easier and more productive. Spin up a cluster when you need GPUs, then turn it off when you're done and stop the bill. Developing powerful AI involves combining large data sets for training with clusters of GPUs for experimenting with network design and hyperparameter optimization.