But there are other tools that also claim to make machine learning easier and speed up model development, and I wondered how they compare. So, this week, I am taking a look at Amazon SageMaker (SageMaker) and how it compares to Studio. What I found is a significantly different approach to model building. Both vendors would claim to offer a fully managed service that covers the entire machine learning workflow to build, train, and deploy machine learning models quickly.
At re:Invent 2018, AWS added many capabilities to Amazon SageMaker, a machine learning platform as a service. SageMaker Neo was announced as an extension of SageMaker that optimizes fully trained ML models for various deployment targets. The Neo-AI project turns SageMaker Neo into an open source project, making it possible for hardware and software vendors to extend the platform. Machine learning models have two distinct phases: training and inference. Data scientists and developers select the algorithm that's most appropriate for the business problem.
Amazon SageMaker makes it easy to train (and deploy) Machine Learning models at scale. Thanks to its Python SDK, developers can first experiment with their data set and model using a notebook instance. Once they're happy with a model, it's quite likely that they will need to train it again and again with new data or new parameters. The SageMaker SDK is great when experimenting, but it's too large to fit in a Lambda package. No worries though: the SageMaker client in boto3 includes a CreateTrainingJob API that will serve our purpose just fine.
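As a rough illustration of that approach, here is a minimal sketch of a Lambda handler that starts a training job through the boto3 SageMaker client. The event keys (`job_name`, `image_uri`, `role_arn`, `train_s3_uri`, `output_s3_uri`, `hyperparameters`), the instance type, the volume size, and the one-hour runtime limit are assumptions made for the example, not values from the original post.

```python
def build_training_job_request(job_name, image_uri, role_arn, train_s3_uri,
                               output_s3_uri, instance_type="ml.m5.xlarge",
                               hyperparameters=None):
    """Assemble keyword arguments for the SageMaker CreateTrainingJob API."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,      # container image with the algorithm
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,                 # IAM role SageMaker assumes
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": train_s3_uri,
                    "S3DataDistributionType": "FullyReplicated",
                }
            },
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,   # assumed default, adjust as needed
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        # The API expects every hyperparameter value to be a string.
        "HyperParameters": {k: str(v) for k, v in (hyperparameters or {}).items()},
    }


def lambda_handler(event, context):
    """Hypothetical Lambda entry point: the event carries the job settings."""
    import boto3  # imported here so the request builder can be tested offline

    request = build_training_job_request(
        job_name=event["job_name"],
        image_uri=event["image_uri"],
        role_arn=event["role_arn"],
        train_s3_uri=event["train_s3_uri"],
        output_s3_uri=event["output_s3_uri"],
        hyperparameters=event.get("hyperparameters"),
    )
    response = boto3.client("sagemaker").create_training_job(**request)
    return {"TrainingJobArn": response["TrainingJobArn"]}
```

Keeping the request construction in its own function makes it possible to check the payload without AWS credentials. Retraining with new data or new parameters then amounts to invoking the function with a different event.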
All Amazon SageMaker API operations are now fully supported via AWS PrivateLink, which increases the security of data shared with cloud-based applications by reducing data exposure to the internet. In this blog, I show you how to set up a VPC endpoint to secure your Amazon SageMaker API calls using AWS PrivateLink. AWS PrivateLink traffic doesn't traverse the internet, which reduces the exposure to threats such as brute force and distributed denial of service attacks. Because all communication between your application and Amazon SageMaker API operations is inside your VPC, you don't need an internet gateway, a NAT device, a VPN connection, or AWS Direct Connect to communicate with Amazon SageMaker. Instead, AWS PrivateLink enables you to privately access all Amazon SageMaker API operations from your VPC in a scalable manner by using interface VPC endpoints.
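The same setup can also be scripted. The following is a minimal sketch, assuming boto3 and an existing VPC, that creates an interface VPC endpoint for the SageMaker API. The function names and the VPC, subnet, and security group IDs passed in are placeholders for the example. The service name follows the `com.amazonaws.<region>.sagemaker.api` pattern.

```python
def build_endpoint_request(region, vpc_id, subnet_ids, security_group_ids):
    """Assemble keyword arguments for the EC2 CreateVpcEndpoint API."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.sagemaker.api",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # With private DNS enabled, the default SageMaker API hostname
        # resolves to the endpoint's private addresses inside the VPC.
        "PrivateDnsEnabled": True,
    }


def create_sagemaker_api_endpoint(region, vpc_id, subnet_ids, security_group_ids):
    """Create the endpoint and return its ID (requires AWS credentials)."""
    import boto3  # imported here so the request builder can be tested offline

    ec2 = boto3.client("ec2", region_name=region)
    request = build_endpoint_request(region, vpc_id, subnet_ids, security_group_ids)
    response = ec2.create_vpc_endpoint(**request)
    return response["VpcEndpoint"]["VpcEndpointId"]
```

The security groups attached to the endpoint need to allow inbound HTTPS (port 443) from the resources that will call the SageMaker API.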
Not only does Amazon SageMaker provide easy scalability and distribution to train and host ML models, it is also modular, so the process of training a model is decoupled from deploying it. This means that models trained outside of Amazon SageMaker can be brought into SageMaker solely for deployment. This is very useful if you have models that are already trained and you want to use only the hosting part of SageMaker instead of the entire pipeline. It is also useful if you don't train your own models but buy pre-trained ones. This blog post explains how to deploy your own models on Amazon SageMaker that have been trained on TensorFlow or MXNet.
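To make the hosting-only path concrete, here is a minimal sketch using the boto3 SageMaker client rather than the SageMaker Python SDK. It assumes the trained model has already been packaged as a `.tar.gz` archive in S3 and that a suitable serving container image exists. The function names, the resource naming scheme, the `AllTraffic` variant name, and the instance type are assumptions for the example.

```python
def build_hosting_requests(name, image_uri, model_data_url, role_arn,
                           instance_type="ml.m5.large"):
    """Build the three requests needed to host an already-trained model."""
    model = {
        "ModelName": name,
        "PrimaryContainer": {
            "Image": image_uri,              # serving container (TensorFlow, MXNet, ...)
            "ModelDataUrl": model_data_url,  # S3 location of the model archive
        },
        "ExecutionRoleArn": role_arn,
    }
    endpoint_config = {
        "EndpointConfigName": f"{name}-config",
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,   # assumed default, adjust as needed
        }],
    }
    endpoint = {
        "EndpointName": f"{name}-endpoint",
        "EndpointConfigName": f"{name}-config",
    }
    return model, endpoint_config, endpoint


def deploy_pretrained_model(name, image_uri, model_data_url, role_arn):
    """Register the model, then create its endpoint (requires AWS credentials)."""
    import boto3  # imported here so the request builder can be tested offline

    model, endpoint_config, endpoint = build_hosting_requests(
        name, image_uri, model_data_url, role_arn
    )
    sm = boto3.client("sagemaker")
    sm.create_model(**model)
    sm.create_endpoint_config(**endpoint_config)
    return sm.create_endpoint(**endpoint)["EndpointArn"]
```

No training job appears anywhere in this flow: the model definition points only at an artifact and a container, which is what allows hosting to be used on its own.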