No matter the industry, data science has become a universal toolkit for businesses. Data analytics and machine learning give organizations insights and answers that shape their day-to-day actions and future plans, and being data-driven has become essential to leading any industry. But while the world's data doubles each year, CPU computing has hit a brick wall with the end of Moore's law. For this reason, scientific computing and deep learning have turned to NVIDIA GPU acceleration.
At a keynote at the GPU Technology Conference in Munich today, Nvidia, the video/graphics company turned Artificial Intelligence (AI) juggernaut, is taking another step in the AI direction. This time, though, Nvidia isn't announcing a new Graphics Processing Unit (GPU) platform or a new proprietary SDK for deep learning. It is instead announcing a set of new open source libraries for GPU-accelerated analytics and machine learning (ML).
Rapid AI movement
Dubbed RAPIDS, the new library set will offer Python interfaces similar to those provided by scikit-learn and Pandas, but will leverage the company's CUDA platform for acceleration across one or multiple GPUs. According to Nvidia CEO Jensen Huang, who briefed a number of technology journalists by phone on Tuesday, Nvidia has seen a 50x speedup in training times when using RAPIDS versus a CPU-only implementation.
Integrations and partners
RAPIDS apparently incorporates the in-memory columnar data technology Apache Arrow, and is designed to run on Apache Spark.
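To make the "columnar" idea concrete, here is a minimal sketch in plain Python. It does not use the Apache Arrow or RAPIDS APIs; the `to_columns` helper and the toy records are hypothetical and exist only for illustration. It shows the same small table stored row by row and then pivoted so that each field lives in its own list.

```python
# Toy, made-up records in a row-oriented layout: one dict per record.
rows = [
    {"city": "Munich", "temp_c": 11.0},
    {"city": "Berlin", "temp_c": 9.0},
    {"city": "Hamburg", "temp_c": 10.0},
]


def to_columns(records):
    """Pivot a list of same-keyed records into a dict of column lists.

    Hypothetical helper for illustration only; it is not part of Arrow.
    """
    columns = {key: [] for key in records[0]}
    for record in records:
        for key, value in record.items():
            columns[key].append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one field now reads a single list instead of
# visiting every record and picking the field out of each one.
mean_temp = sum(columns["temp_c"]) / len(columns["temp_c"])
print(columns)
print(mean_temp)
```

In a columnar layout, analytics that scan one field across many records touch only that field's values, which is broadly the access pattern that suits wide parallel hardware such as GPUs.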
Nvidia has been more than a hardware company for a long time. Because its GPUs are broadly used to run machine learning workloads, machine learning has become a key priority for Nvidia. At its GTC event this week, Nvidia made a number of related points, aiming to build on machine learning and extend into data science and analytics. Nvidia wants to "couple software and hardware to deliver the advances in computing power needed to transform data into insights and intelligence." Jensen Huang, Nvidia CEO, emphasized the collaboration among chip architecture, systems, algorithms and applications.
NVIDIA has launched an open source project called Real-time Acceleration Platform for Integrated Data Science (RAPIDS) that aims to deliver end-to-end data science infrastructure based on GPUs. GPU-backed machines play an essential role in generating machine learning models, because data scientists run computationally intensive training jobs on GPUs. Massive datasets are converted into complex matrices of numbers, which serve as the input for machine learning and deep learning models. During the training process, these matrices are repeatedly added to, multiplied by and subtracted from other matrices.
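The matrix arithmetic described above can be sketched in a few lines. The following is a minimal illustration, assuming a simple linear model trained by gradient descent on made-up data; it uses plain Python lists so that every multiply and subtract is visible, and all helper names (`matmul`, `transpose`, `subtract`, `scale`, `train_step`) are hypothetical. It is not how RAPIDS or any GPU library implements training.

```python
def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    """Swap rows and columns."""
    return [list(col) for col in zip(*a)]


def subtract(a, b):
    """Element-wise a - b for two matrices of the same shape."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a, s):
    """Multiply every element of a matrix by the scalar s."""
    return [[x * s for x in row] for row in a]


def train_step(X, y, W, lr):
    """One gradient-descent update: W <- W - lr * (2/n) * X^T (X W - y)."""
    n = len(X)
    error = subtract(matmul(X, W), y)  # predictions minus targets
    grad = scale(matmul(transpose(X), error), 2.0 / n)
    return subtract(W, scale(grad, lr))


# Toy data that follows y = 2 * x exactly (an assumption for illustration).
X = [[1.0], [2.0], [3.0]]
y = [[2.0], [4.0], [6.0]]
W = [[0.0]]

for _ in range(200):
    W = train_step(X, y, W, lr=0.05)

print(W)  # the single weight approaches 2.0
```

Each training step is nothing more than a handful of matrix products and differences. On real datasets those matrices have millions of entries, and because the arithmetic for each entry is largely independent, it can be spread across the many cores of a GPU.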
IBM has announced a new partnership with AI and GPU hardware giant Nvidia, bringing the latter's RAPIDS open source data science toolkit into IBM's data science platform for on-premises, hybrid, and multi-cloud environments. RAPIDS will bring GPU acceleration capabilities to IBM's offerings, taking advantage of an ecosystem that includes the Web-based big data platform, Anaconda (an open source distribution of the Python and R programming languages for data science and machine learning), Apache Arrow, Pandas, and scikit-learn. RAPIDS is also supported by open-source contributors, including BlazingDB, Graphistry, NERSC, PyData, INRIA, and Ursa Labs. IBM's POWER9 with PowerAI environment will be among those benefiting from the tie-up. It will use RAPIDS to expand the options available to data scientists with new open-source machine learning and analytics libraries.