bentoml/BentoML: Model Serving Made Easy
BentoML is a flexible, high-performance framework for serving, managing, and deploying machine learning models. By providing a standard interface for describing a prediction service, BentoML abstracts away how to run model inference efficiently and how model serving workloads integrate with cloud infrastructure. Be sure to check out the deployment overview doc to understand which deployment option best suits your use case.

BentoML provides APIs for defining a prediction service: a servable unit that packages the trained ML model together with its pre-processing and post-processing code, input/output specifications, and dependencies. The generated BentoML bundle is a file directory containing all the code files, serialized models, and configs required to reproduce the prediction service for inference. BentoML automatically captures all Python dependency information and keeps everything versioned and managed together in one place.
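As a rough illustration of what such a "prediction service" bundles together, here is a minimal plain-Python sketch. It is not BentoML's actual API; all names (`PredictionService`, `preprocess`, `postprocess`, the toy model, and the pinned dependency) are hypothetical, chosen only to show how a model, its pre/post-processing code, and its dependency list travel as one unit.

```python
# Illustrative sketch only -- not BentoML's real API.
# It mimics the idea of a prediction service: trained model +
# pre-processing + post-processing + captured dependencies in one object.
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PredictionService:
    model: Callable[[List[float]], float]         # the trained model's inference fn
    preprocess: Callable[[Dict], List[float]]     # request payload -> model features
    postprocess: Callable[[float], Dict]          # raw score -> API response
    dependencies: List[str] = field(default_factory=list)  # captured pip deps

    def predict(self, request: Dict) -> Dict:
        features = self.preprocess(request)
        score = self.model(features)
        return self.postprocess(score)


# A toy "model" (sum of features) standing in for a real trained estimator.
svc = PredictionService(
    model=lambda xs: sum(xs),
    preprocess=lambda req: [float(v) for v in req["features"]],
    postprocess=lambda score: {"prediction": score},
    dependencies=["scikit-learn==0.24.2"],  # hypothetical pinned dependency
)

print(svc.predict({"features": [1, 2, 3]}))  # -> {'prediction': 6.0}
```

In BentoML itself, the analogous pieces are declared through its service-definition API and then saved as a versioned bundle; the point of the sketch is only the grouping, not the mechanics.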
Sep-8-2021, 07:00:25 GMT