Collaborating Authors

Scaling Apache Airflow for Machine Learning Workflows


Apache Airflow is a popular platform to create, schedule, and monitor workflows in Python. It has more than 15k stars on GitHub and is used by data engineers at companies of all sizes, including Twitter, Airbnb, and Spotify. If you're using Apache Airflow, your architecture has probably evolved with the number of tasks and their requirements. While working at Skillup, we first had a few hundred DAGs to execute all our data engineering tasks. Then we started doing machine learning.

DLHub: Model and Data Serving for Science Machine Learning

Abstract: While the Machine Learning (ML) landscape is evolving rapidly, there has been a relative lag in the development of the "learning systems" needed to enable broad adoption. Furthermore, few such systems are designed to support the specialized requirements of scientific ML. Here we present the Data and Learning Hub for science (DLHub), a multi-tenant system that provides both model repository and serving capabilities with a focus on science applications. First, its self-service model repository allows users to share, publish, verify, reproduce, and reuse models, and addresses concerns related to model reproducibility by packaging and distributing models and all constituent components. Second, it implements scalable and low-latency serving capabilities that can leverage parallel and distributed computing resources to democratize access to published models through a simple web interface. Unlike other model serving frameworks, DLHub can store and serve any Python 3-compatible model or processing function, plus multiple-function pipelines. We show that relative to other model serving systems including TensorFlow Serving, SageMaker, and Clipper, DLHub provides greater capabilities, comparable performance without memoization and batching, and significantly better performance when the latter two techniques can be employed. We also describe early uses of DLHub for scientific applications.

I. INTRODUCTION

Machine Learning (ML) is disrupting nearly every aspect of computing. Researchers now turn to ML methods to uncover patterns in vast data collections and to make decisions with little or no human input. As ML becomes increasingly pervasive, new systems are required to support the development, adoption, and application of ML. We refer to the broad class of systems designed to support ML as "learning systems."
Learning systems need to support the entire ML lifecycle (see Figure 1), including model development [1, 2]; scalable training across potentially tens of thousands of cores and GPUs [3]; model publication and sharing [4]; and low-latency and high-throughput inference [5]; all while encouraging best-practice software engineering when developing models [6].

How to run my Python script on Docker? - Geeky Humans


Writing a Python script seems easy enough. All you need to do is open up your favorite text editor, type in the code, and run the script. It sounds easy at first, but remember that your code needs to work on every PC you use. If you are working in an environment that doesn't have the right libraries to run the program, you are out of luck. With Docker, you can use containers to run your code.

Railyard: how we rapidly train machine learning models with Kubernetes


Stripe uses machine learning to respond to our users' complex, real-world problems. Machine learning powers Radar to block fraud, and Billing to retry failed charges on the network. Stripe serves millions of businesses around the world, and our machine learning infrastructure scores hundreds of millions of predictions across many machine learning models. These models are powered by billions of data points, with hundreds of new models being trained each day. Over time, the volume, quality of data, and number of signals have grown enormously as our models continuously improve in performance.

A First Course on Deploying Python Projects


After all the hard work of developing a project in Python, we want to share it with other people, such as friends or colleagues. Maybe they are not interested in your code, but they want to run it and make some real use of it. For example, suppose you created a regression model that predicts a value based on input features. Your friend wants to provide their own features and see what value your model predicts.
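
The scenario above can be sketched as a small command-line script. This is only an illustration under stated assumptions, not the article's own code: the training data, the single-feature linear model, and the names `fit_line` and `predict` are all hypothetical.

```python
import sys

# Hypothetical training data (an assumption for illustration):
# one input feature x and a target that follows y = 2x + 1.
TRAIN_X = [1.0, 2.0, 3.0, 4.0]
TRAIN_Y = [3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Return the model's prediction for a single feature value."""
    slope, intercept = model
    return slope * x + intercept


def main(argv):
    # Each command-line argument is treated as one feature value.
    model = fit_line(TRAIN_X, TRAIN_Y)
    for arg in argv:
        print(f"feature={arg} prediction={predict(model, float(arg)):.3f}")


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved under a hypothetical name such as `predict.py`, a friend could run `python predict.py 10` to get a prediction for their own feature value without reading the code.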