kubernete


How to setup TensorFlow on Ubuntu

#artificialintelligence

This tutorial will help you set up TensorFlow 1.12 on Ubuntu 16.04 with a GPU using Docker and nvidia-docker. TensorFlow is one of the most popular deep-learning libraries. It was created by Google and released as an open-source project in 2015. TensorFlow is used in both research and production environments. Installing TensorFlow can be cumbersome.


Telstra throws deep learning at its network challenges

#artificialintelligence

Telstra is running deep learning algorithms over its Networks data to predict equipment failures before they occur and to find ways to address voice and SMS scams. Data science (Networks) team manager Tim Osborne revealed the project, which is codenamed Telstra AI Lab or TAIL, in a presentation to IBM's Think 2020 conference overnight. TAIL is operating on a still-evolving applied data science platform pieced together with IBM's assistance. It uses a mix of existing Cisco UCS C240s and new IBM Power System AC922s for compute, and a Kubernetes-based stack on top, including Kubeflow, which is used to run machine learning algorithms on Kubernetes. Osborne said TAIL was supported by a team of 25 data scientists and data engineers, who worked "with network engineering folks end-to-end across the business, looking to solve some of their most challenging problems with data science."


NetApp working on Application-Integrated Data Management for Kubernetes - Express Computer

#artificialintelligence

NetApp, the leader in cloud data services, today introduced Project Astra, a vision for a software-defined platform that is currently in development with the Kubernetes community. Project Astra will deliver the industry's most robust, easy-to-consume, enterprise-class storage and data services platform for Kubernetes that enables both application and data portability for stateful applications. Although companies everywhere are rapidly adopting Kubernetes, many organizations lack reliable data and application services, and have difficulty making application data as portable as the applications themselves are in Kubernetes. Yet to meet the standards that CIOs expect, IT teams and site reliability engineers must find a way to store, govern, protect, and replicate the data for both stateless and stateful cloud-native applications with enterprise-class cloud storage and data services. Project Astra is being purpose-built for and in collaboration with Kubernetes developers and operations managers to help bridge the fundamental gap that exists between the popularity of containers today, the capabilities and user experience they require, and their ability to deliver true, comprehensive portability.


Run:AI Leverages Kubernetes to Virtualize GPUs - Container Journal

#artificialintelligence

Run:AI this week announced the general availability of a namesake platform based on Kubernetes that enables IT teams to virtualize graphics processing unit (GPU) resources. Company CEO Omri Geller says the goal is to enable IT teams to maximize investments in expensive GPUs by leveraging a single line of code to plug in its platform on top of Kubernetes. That would enable IT teams to take advantage of container orchestration to schedule artificial intelligence (AI) workloads across multiple GPUs, and allow certain AI workloads to be prioritized over others, he says. Geller notes that GPUs don't lend themselves well to traditional virtual machines. Kubernetes provides an alternative approach to virtualizing bare-metal GPU resources, which are among the most expensive IT infrastructure resources any IT organization can invoke in the cloud or deploy in on-premises IT environments.


Kubeflow 1.0 solves machine learning workflows with Kubernetes

#artificialintelligence

Kubeflow, Google's solution for deploying machine learning stacks on Kubernetes, is now available as an official 1.0 release. Kubeflow was built to address two major issues with machine learning projects: the need for integrated, end-to-end workflows, and the need to make deployments of machine learning systems simple, manageable, and scalable. Kubeflow allows data scientists to build machine learning workflows on Kubernetes and to deploy, manage, and scale machine learning models in production without learning the intricacies of Kubernetes or its components. Kubeflow is designed to manage every phase of a machine learning project: writing the code, building the containers, allocating the Kubernetes resources to run them, training the models, and serving predictions from those models. The Kubeflow 1.0 release provides tools, such as Jupyter notebooks for working with data experiments and a web-based dashboard UI for general oversight, to help with each phase.


What would machine learning look like if you mixed in DevOps? Wonder no more, we lift the lid on MLOps

#artificialintelligence

Achieving production-level governance with machine-learning projects currently presents unique challenges. A new space of tools and practices is emerging under the name MLOps. The space is analogous to DevOps but tailored to the practices and workflows of machine learning. Machine learning models make predictions for new data based on the data they have been trained on. Managing this data in a way that can be safely used in live environments is challenging, and one of the key reasons why 80 per cent of data science projects never make it to production – an estimate from Gartner.


Kubeflow 1.0 Brings a Production-Ready Machine Learning Toolset to Kubernetes - The New Stack

#artificialintelligence

For developers looking to more easily parallelize (and more) their machine learning (ML) workloads using Kubernetes, the open source project Kubeflow has reached version 1.0 this week. The now production-ready release offers "a core set of stable applications needed to develop, build, train, and deploy models on Kubernetes efficiently." The project was first open sourced in December 2017 at KubeCon + CloudNativeCon and has since grown to hundreds of contributors from more than 30 participating organizations such as Google, Cisco, IBM, Microsoft, Red Hat, Amazon Web Services and Alibaba. Alongside the blog post from the Kubeflow team itself, Google has offered a post on how Kubeflow works with Anthos, while IBM's Animesh Singh explores the "highlights of the work where we collaborated with the Kubeflow community leading toward an enterprise-grade Kubeflow 1.0." In an interview with The New Stack, Singh explained the origins of Kubeflow as one attempting to simply bring TensorFlow to Kubernetes.


Kubernetes Gets an Automated ML Workflow

#artificialintelligence

A stable version of an automation tool released this week aims to make life easier for machine learning developers training and scaling models, then deploying ML workloads atop Kubernetes clusters. Roughly two years after its open source release, Kubeflow 1.0 leverages the de facto standard cluster orchestrator to aid data scientists and ML developers in tapping cloud resources to run those workloads in production. Among the stable workflow applications released on Monday (March 2) are a central dashboard, Jupyter notebook controller and web application along with TensorFlow and PyTorch operators for distributed training. Contributors from Google, IBM, Cisco Systems, Microsoft and data management specialist Arrikto said Jupyter notebooks can be used to streamline model development. Other tools can then be used to build application containers and leverage Kubernetes resources to train models.


Ten strategies to implement AI on the Cloud and Edge

#artificialintelligence

The deployment of Machine Learning and Deep Learning algorithms on Edge devices is a complex undertaking. In this post, I list the strategies for deploying AI to Edge devices end-to-end, i.e. for the full pipeline covering machine learning (building modules) and deployment (DevOps). I welcome your comments on additional ideas that could be included. In subsequent posts, I will elaborate on these ideas in detail and, ultimately, this will become a free book on Data Science Central. I will take a use-case based approach, i.e. each section would start with a use case. Many IoT applications are simple telemetry applications, i.e. data is captured using a single sensor and action is undertaken based on the data.


GPU-as-a-Service on KubeFlow: Fast, Scalable and Efficient ML

#artificialintelligence

Machine Learning (ML) and Deep Learning (DL) involve compute- and data-intensive tasks. To maximize our model accuracy, we want to train on larger datasets, evaluate a variety of algorithms, and try out different parameters for each algorithm (hyper-parameter tuning). As our datasets and model complexity grow, so does the time we need to wait for our jobs to complete, leading to inefficient use of our time. We end up running fewer iterations and tests or working on smaller datasets as a result. NVIDIA GPUs are a great tool to accelerate our data science work.