Load Testing SageMaker Multi-Model Endpoints
Productionizing Machine Learning models is a complicated practice. There's a lot of iteration around different model parameters, hardware configurations, traffic patterns that you will have to test to try to finalize a production grade deployment. Load testing is an essential software engineering practice, but also crucial to apply in the MLOps space to see how performant your model is in a real-world setting. How can we load test? A simple yet highly effective framework is the Python package: Locust. Locust can be used in both a vanilla and distributed mode to simulate up to thousands of Transactions Per Second (TPS).
Feb-28-2023, 13:56:20 GMT
- Technology: