New Major Release for Nebullvm Speeds Up AI Inference by 2-30x

#artificialintelligence

Nebuly is very excited to announce nebullvm 0.3.0, a new major release. Nebullvm is an open-source library that generates an optimized version of your deep learning model that runs 2 to 10 times faster in inference without loss in model accuracy, by leveraging multiple deep learning compilers (OpenVINO, TensorRT, ONNX Runtime, TVM, etc.). Additional acceleration is achieved by exploiting optimization techniques that slightly modify the model graph to make it lighter, such as quantization, half precision, distillation, and sparsity. Tutorials, examples, and installation instructions can be found in the main readme of the nebullvm library; it takes only a few lines of code to install the library and optimize your models. The library now works on most CPUs and GPUs and will soon support TPUs and other deep-learning-specific ASICs.
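To make one of the graph-lightening techniques mentioned above concrete, here is a minimal, stdlib-only sketch of affine post-training quantization, which maps float32 weights to 8-bit integers. This illustrates the general idea only; it is not nebullvm's actual implementation, and the function names are placeholders.

```python
def quantize(weights, num_bits=8):
    """Affine (asymmetric) quantization: map floats to ints in [0, 2**num_bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(weights), max(weights)
    # Scale chosen so the observed float range spans the full integer range.
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero scale for constant weights
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the quantized integers."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.2, 0.0, 0.5, 2.3]
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
# Each restored value differs from its original by at most one quantization step (scale).
```

The payoff is that int8 weights take a quarter of the memory of float32 and enable faster integer arithmetic on supporting hardware, at the cost of a small, bounded rounding error per weight.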


Almost no one knows how easily you can optimize your AI models

#artificialintelligence

The situation is fairly simple. Your model could run 10 times faster if you added a few lines to your code, but you may not be aware of it. Let me expand on that. This problem bothered me for a long time, so with a couple of buddies at Nebuly (all formerly at MIT, ETH and EPFL), I put a lot of energy into an open-source library called nebullvm, to make DL compiler technology accessible to any developer, even those who know nothing about hardware, as I once did. It speeds up your DL models by 5–20x by testing the best DL compilers out there and selecting the one that best couples your AI model with your machine (GPU, CPU, etc.).
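The "test the compilers and select the best one" step described above can be pictured as a benchmark-and-pick loop. The stdlib-only sketch below uses stand-in callables in place of real compiled model variants; the names and timings are illustrative and do not reflect nebullvm's actual API.

```python
import time

def benchmark(fn, runs=50):
    """Median wall-clock latency of fn() over several runs."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return sorted(samples)[len(samples) // 2]

def pick_fastest(candidates):
    """candidates: dict mapping backend name -> zero-arg callable.
    Returns (best_name, best_latency_seconds)."""
    timings = {name: benchmark(fn) for name, fn in candidates.items()}
    return min(timings.items(), key=lambda kv: kv[1])

# Stand-ins for the same model compiled by different backends.
candidates = {
    "baseline": lambda: sum(i * i for i in range(20000)),
    "compiled": lambda: sum(i * i for i in range(2000)),  # pretend ~10x faster
}
best, latency = pick_fastest(candidates)
```

Using the median rather than the mean makes the comparison robust to occasional slow runs caused by OS scheduling noise, which matters when latencies are in the microsecond range.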


Nebullvm, an open-source library to accelerate AI inference by 5–20x in a few lines of code

#artificialintelligence

It takes your AI model as input and outputs an optimized version that runs 5–20 times faster on your hardware. In other words, nebullvm tests multiple deep learning compilers to identify the best…