Explore Amazon SageMaker Serverless Inference for Deploying ML Models - The New Stack


Prisma Cloud from Palo Alto Networks is sponsoring our coverage of AWS re:Invent 2021.

Launched at Amazon Web Services' re:Invent 2021 user conference earlier this month, Amazon SageMaker Serverless Inference is a new inference option for deploying machine learning models without configuring or managing the underlying compute infrastructure. It brings attributes of serverless computing, such as scale-to-zero and consumption-based pricing, to model hosting. With serverless inference, SageMaker automatically launches additional instances based on request concurrency and the utilization of existing compute resources. The fundamental difference between SageMaker's other inference options and serverless inference is how the compute infrastructure is provisioned, scaled, and managed: you don't need to choose an instance type or define minimum and maximum capacity.
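To illustrate, here is a minimal sketch of creating a serverless endpoint with the boto3 SDK. The model name, endpoint names, and the memory and concurrency values are hypothetical placeholders; the point is that the serverless variant carries only a `ServerlessConfig` (memory size and maximum concurrency), with no instance type or capacity settings.

```python
def serverless_endpoint_config(config_name, model_name,
                               memory_mb=2048, max_concurrency=20):
    """Build an endpoint-config request for a serverless variant.

    Note what is absent: no InstanceType, no InitialInstanceCount,
    no min/max capacity -- SageMaker manages scaling itself.
    """
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,  # assumes the model is already registered
            "ServerlessConfig": {
                "MemorySizeInMB": memory_mb,
                "MaxConcurrency": max_concurrency,
            },
        }],
    }

if __name__ == "__main__":
    import boto3  # requires AWS credentials with SageMaker permissions

    sm = boto3.client("sagemaker")
    cfg = serverless_endpoint_config("demo-serverless-config", "my-model")
    sm.create_endpoint_config(**cfg)
    sm.create_endpoint(
        EndpointName="demo-serverless-endpoint",
        EndpointConfigName="demo-serverless-config",
    )
```

Once the endpoint is in service, it is invoked exactly like a real-time endpoint via `invoke_endpoint`; the serverless behavior is entirely a property of the endpoint configuration.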