The Deep Learning Tool We Wish We Had In Grad School
Machine learning PhD students are in a unique position: they often need to run large-scale experiments to conduct state-of-the-art research, but they don't have the support of the platform teams that industrial ML engineers can rely on. As former PhD students ourselves, we recount our hands-on experience with these challenges and explain how open-source tools like Determined would have made grad school a lot less painful. When we started graduate school as PhD students at Carnegie Mellon University (CMU), we thought the challenge lay in having novel ideas, testing hypotheses, and presenting research. Instead, the most difficult part was building out the tooling and infrastructure needed to run deep learning experiments. While industry labs like Google Brain and FAIR have teams of engineers to provide this kind of support, independent researchers and graduate students are left to manage on their own.
How to speed up a Deep Learning Language model by almost 50X at half the cost - KDnuggets
One of the big headaches in deep learning is that models take forever to train. As an ML engineer, waiting hours or days for training to complete makes iteratively improving your model a slow and frustrating process. In this blog post, we show how to accelerate fine-tuning the ALBERT language model while also reducing costs by using Determined's built-in support for distributed training with AWS spot instances. Originally, ALBERT took over 36 hours to train on a single V100 GPU and cost $112 on AWS. With distributed training and spot instances, training the model using 64 V100 GPUs took only 48 minutes and cost only $47! That's both a 46x performance improvement and a 58% reduction in cost!
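To make the setup concrete, here is a minimal sketch of what a Determined experiment configuration for this kind of run might look like. The `slots_per_trial` and `max_restarts` fields exist in Determined's experiment config schema; the experiment name, metric name, and specific values are illustrative assumptions, not the actual configuration used for the ALBERT run.

```yaml
# Hypothetical experiment config sketch; values are illustrative.
name: albert-finetune-distributed
resources:
  slots_per_trial: 64        # distribute a single trial across 64 GPUs
searcher:
  name: single               # one trial, no hyperparameter search
  metric: validation_loss    # assumed metric name
  max_length:
    batches: 10000           # assumed training length
max_restarts: 5              # tolerate spot-instance preemptions;
                             # Determined resumes from checkpoints
```

Spot-instance provisioning itself is configured on the cluster side rather than per experiment; the key point is that because Determined checkpoints trials automatically, a preempted spot instance costs a restart rather than a lost run.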
Data Scientists Don't Care About Kubernetes
Kubernetes is one of the most important pieces of software produced in the last decade and one of the most influential open source projects ever. Kubernetes has completely revolutionized how applications are developed and how infrastructure is deployed and managed. With Kubernetes' explosive rise, more and more physical hardware is being managed by Kubernetes. This trend has coincided with an explosion in the popularity of deep learning, an extremely computation-demanding technology in which a single data scientist can occupy dozens of GPUs for weeks at a time. The upshot: data scientists need access to more hardware than ever.