Collaborating Authors

End-to-End Machine Learning in JavaScript Using Danfo.js and TensorFlow.js (part 3)


This is the third and final part of a three-part series. I suggest you read parts 1 and 2 first for better understanding. In the first part of the series, we were introduced to danfo.js, a new JavaScript package that provides fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. The second part dealt extensively with data pre-processing for model building, training, and evaluation with TensorFlow.js and danfo.js in an Observable notebook. In end-to-end data science projects written in Python, notebooks are converted into scripts during deployment or package building.

End-to-End Machine Learning in JavaScript Using Danfo.js and TensorFlow.js (part 2)


This is the second part of a three-part series. In the first part of the series, we were introduced to danfo.js, a new JavaScript package that provides fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data easier and more intuitive. In part 1, we analyzed our data to spot trends that can be very useful for feature engineering, model building, and interpretation. If we check the dataset, we can see that the features have all been converted into numerical features, and there's little or no need for feature engineering without a base model. Therefore, we can start with the normalization of the dataset.
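
The excerpt does not say which normalization the series applies, so the following is only a minimal sketch that assumes min-max scaling to the [0, 1] range. It is written in plain JavaScript on an array of row objects rather than with the danfo.js API, and the function name `minMaxNormalize` and the sample rows are hypothetical.

```javascript
// Min-max scaling: rescale each numeric column to the [0, 1] range.
// Plain JavaScript on an array of row objects; this is NOT the danfo.js API.
function minMaxNormalize(rows, columns) {
  // Compute the minimum and maximum of every requested column.
  const stats = {};
  for (const col of columns) {
    const values = rows.map((row) => row[col]);
    stats[col] = { min: Math.min(...values), max: Math.max(...values) };
  }
  // Return new row objects so the input is left unchanged.
  return rows.map((row) => {
    const scaled = { ...row };
    for (const col of columns) {
      const { min, max } = stats[col];
      // A constant column has zero range; map it to 0 to avoid dividing by zero.
      scaled[col] = max === min ? 0 : (row[col] - min) / (max - min);
    }
    return scaled;
  });
}

// Hypothetical sample rows, for illustration only.
const sample = [
  { age: 20, fare: 10 },
  { age: 30, fare: 30 },
  { age: 40, fare: 50 },
];
const normalized = minMaxNormalize(sample, ["age", "fare"]);
console.log(normalized);
```

Scaling every feature to a common range keeps features with large raw values from dominating training. In a real pipeline the minimum and maximum should be computed on the training split only and then reused for the test split.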

Serverless Model Serving for Data Science

Machine learning (ML) is an important part of modern data science applications. Data scientists today have to manage the end-to-end ML life cycle, which includes both model training and model serving; the latter is essential because it makes their work available to end users. Systems for model serving require high performance, low cost, and ease of management. Cloud providers already offer model serving options, including managed services and self-rented servers. Recently, serverless computing, whose advantages include high elasticity and a fine-grained cost model, has brought another possibility for model serving. In this paper, we study the viability of serverless as a mainstream model serving platform for data science applications. We conduct a comprehensive evaluation of the performance and cost of serverless against other model serving systems on two clouds: Amazon Web Services (AWS) and Google Cloud Platform (GCP). We find that serverless outperforms many cloud-based alternatives with respect to cost and performance. More interestingly, under some circumstances, it can even outperform GPU-based systems in both average latency and cost. These results differ from claims in previous work that serverless is not suitable for model serving, and they run contrary to the conventional wisdom that GPU-based systems are better for ML workloads than CPU-based systems. Other findings include a large gap in cold start time between AWS and GCP serverless functions, and the low sensitivity of serverless to changes in workloads or models. Our evaluation results indicate that serverless is a viable option for model serving. Finally, we present several practical recommendations for data scientists on how to use serverless for scalable and cost-effective model serving.

Super Simple Scikit-Learn APIs in AWS


The cloud is an amazing place! You can start a company with world-class infrastructure for pennies and with very little code. It seems too good to be true, but this is the world of today. That being said, it's not all easy. I find that whenever I try to do things in the cloud, it always takes me WAY longer to figure it out than I expect.

Heartbeat Newsletter: Volume 11


This week, we want to do something a little different and highlight a couple of new and exciting developments from our contributor community. First, we want to congratulate Rising Odegua, Stephen Oni, and their colleagues on the release of danfo.js, a new library for processing structured data in JavaScript. This new library is, quite simply, a game changer for JavaScript developers who want to work more intensively with data, and especially for those who want to work with ML. It mirrors Pandas in functionality; if you've worked in the data science field, then chances are you're quite familiar with this widely used Python library for data analysis. Danfo.js was built to offer an equally rich data library in JavaScript, which, before now, wasn't available in a single library.