On behalf of the entire community, we are proud to announce Kubeflow 1.0, our first major release. Kubeflow was open-sourced at KubeCon USA in December 2017, and during the last two years the Kubeflow Project has grown beyond our wildest expectations. There are now hundreds of contributors from over 30 participating organizations. Kubeflow's goal is to make it easy for machine learning (ML) engineers and data scientists to leverage cloud assets (public or on-premises) for ML workloads. You can use Kubeflow on any Kubernetes-conformant cluster.
It's a catalog of reusable models that can be quickly deployed to one of the execution environments of AI Platform. The catalog has a collection of models based on popular frameworks such as TensorFlow, PyTorch, Keras, XGBoost and scikit-learn. Each of the models is packaged in a format that can be deployed in Kubeflow, deep learning VMs backed by GPU or TPU, Jupyter Notebooks, or Google's own AI APIs. Each model is tagged with labels that make it easy to search and discover content based on a variety of attributes. AI Platform Deep Learning VM Image makes it easy and fast to instantiate a VM image containing the most popular deep learning and machine learning frameworks on a Google Compute Engine instance.
Kubernetes and Machine Learning

Kubernetes has quickly become the hybrid solution for deploying complicated workloads anywhere. While it started with just stateless services, customers have begun to move complex workloads to the platform, taking advantage of the rich APIs, reliability and performance provided by Kubernetes. One of the fastest-growing use cases is Kubernetes as the deployment platform of choice for machine learning. Building any production-ready machine learning system involves various components, often mixing vendors and hand-rolled solutions. Connecting and managing these services for even moderately sophisticated setups introduces huge barriers of complexity in adopting machine learning. Infrastructure engineers will often spend a significant amount of time manually tweaking deployments and hand-rolling solutions before a single model can be tested. Worse, these deployments are so tied to the clusters they have been deployed to that the stacks are immobile, meaning that moving a model from a laptop to a highly scalable cloud cluster is effectively impossible without significant re-architecture. All these differences add up to wasted effort and create opportunities to introduce bugs at each transition.
"Hybrid data environments are coming – quickly – and early adopters are benefiting from the change. Here's everything you need to know about how Kubernetes clusters can accelerate your big data development." In a few short years, containers have established themselves as indispensable tools for managing portable, stateless applications like web servers and microservices. But they have taken time to catch on in the world of data science, where they have been viewed as too lightweight to package and manage complex, stateful services dealing with big data. Users and vendors are now starting to embrace containers and Kubernetes, the most popular orchestration platform, as tools to facilitate deployments of big data systems and applications.