

Boosting machine learning workflows with GPU-accelerated libraries


Abstract: In this article, we demonstrate how to use RAPIDS libraries to accelerate machine learning workflows built on CPU-based libraries such as pandas, scikit-learn, and NetworkX. We use a recommendation case study, which ran 44x faster in the GPU-based library for the PageRank algorithm and 39x faster for Personalized PageRank. Scikit-learn and pandas are part of most data scientists' toolboxes because of their friendly APIs and wide range of useful resources, from model implementations to data transformation methods. However, many of these libraries still rely on CPU processing and, as far as this thread goes, libraries like scikit-learn do not intend to scale up to GPU processing or scale out to cluster processing. To overcome this drawback, RAPIDS offers a suite of open source Python libraries that take these widely used data science solutions and boost them with GPU-accelerated implementations while still providing a similar API.
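For readers unfamiliar with the two algorithms named above, the following is a minimal pure-Python sketch of what PageRank and Personalized PageRank compute. It is not the NetworkX or RAPIDS implementation, and it says nothing about GPU speedups; the function name `pagerank`, its parameters, and the toy edge list are all assumptions made for illustration. The only difference between the two variants is the teleport distribution: uniform over all nodes for plain PageRank, and concentrated on chosen nodes (for example, items a user already interacted with) for the personalized version.

```python
def pagerank(edges, damping=0.85, personalization=None, max_iter=100, tol=1e-10):
    """Power-iteration PageRank over a directed edge list (illustrative sketch).

    personalization: optional dict mapping node -> non-negative weight.
    When given, random jumps land on nodes in proportion to these weights
    instead of uniformly, which is what makes the ranking "personalized".
    """
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Teleport distribution: uniform by default, normalized weights otherwise.
    if personalization is None:
        teleport = {n: 1.0 / len(nodes) for n in nodes}
    else:
        total = sum(personalization.values())
        teleport = {n: personalization.get(n, 0.0) / total for n in nodes}

    rank = dict(teleport)
    for _ in range(max_iter):
        # Rank held by nodes without out-links is redistributed via teleport.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {
            n: (1.0 - damping) * teleport[n] + damping * dangling * teleport[n]
            for n in nodes
        }
        # Each node splits its damped rank evenly among its out-links.
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy graph: a -> b -> c -> a, plus d -> c.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "c")]

global_rank = pagerank(edges)
personal_rank = pagerank(edges, personalization={"a": 1.0})
```

In this toy graph, node `d` has no incoming links, so under personalization toward `a` it receives no rank at all, while `a` itself scores higher than it does in the global ranking. Libraries such as NetworkX expose the same idea through a personalization argument to their PageRank function.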