Towards Automated Machine Learning: Evaluation and Comparison of AutoML Approaches and Tools

arXiv.org Machine Learning

There has been considerable growth and interest in industrial applications of machine learning (ML) in recent years. ML engineers, as a consequence, are in high demand across the industry, yet improving the efficiency of ML engineers remains a fundamental challenge. Automated machine learning (AutoML) has emerged as a way to save time and effort on repetitive tasks in ML pipelines, such as data pre-processing, feature engineering, model selection, hyperparameter optimization, and prediction result analysis. In this paper, we investigate the current state of AutoML tools aiming to automate these tasks. We conduct various evaluations of the tools on many datasets, in different data segments, to examine their performance, and compare their advantages and disadvantages on different test cases. Automated Machine Learning (AutoML) promises major productivity boosts for data scientists, ML engineers and ML researchers by reducing repetitive tasks in machine learning pipelines. There are currently a number of different tools and platforms (both open-source and commercially available solutions) that try to automate these tasks. The goal of this paper is to address the following questions: (i) what are the available ML functionalities provided by the tools; (ii) how the tools perform when facing a wide spectrum of real-world datasets; (iii) what the tradeoff is between optimization speed and accuracy of the results; and (iv) the reproducibility of the results (a.k.a.


Dataiku review: Data science fit for the enterprise

#artificialintelligence

Dataiku Data Science Studio (DSS) is a platform that tries to span the needs of data scientists, data engineers, business analysts, and AI consumers. In addition, Dataiku DSS tries to span the machine learning process from end to end, i.e. from data preparation through MLOps and application support. The Dataiku DSS user interface is a combination of graphical elements, notebooks, and code, as we'll see later on in the review. As a user, you often have a choice of how you'd like to proceed, and you're usually not locked into your initial choice, given that graphical choices can generate editable notebooks and scripts. During my initial discussion with Dataiku, their senior product marketing manager asked me point blank whether I preferred a GUI or writing code for data science.


Is AutoML ready for Business?

#artificialintelligence

Do (will) we still need Data Scientists? AutoML tools have been gaining traction for the last couple of years, both due to technological advancements and their potential to be leveraged by 'Citizen Data Scientists'. Citizen Data Science is an interesting (often controversial) aspect of Data Science (DS) that aims to automate the design of Machine Learning (ML)/Deep Learning (DL) models, making it more accessible to people without the specialized skills of a Data Scientist. In this article, we will try to understand AutoML: its promise, what is possible today, where it fails today, and whether it is meant only for Citizen Data Scientists or holds some value for skilled Data Scientists as well. Let us start with a very high-level primer on Machine Learning (ML).


Is AutoML ready for Business?

#artificialintelligence

AutoML tools have been gaining traction for the last couple of years, both due to technological advancements and their potential to be leveraged by 'Citizen Data Scientists'. Citizen Data Science is an interesting (often controversial) aspect of Data Science (DS) that aims to automate the design of Machine Learning (ML)/Deep Learning (DL) models, making it more accessible to people without the specialized skills of a Data Scientist. In this article, we will try to understand AutoML: its promise, what is possible today, where it fails today, and whether it is meant only for Citizen Data Scientists or holds some value for skilled Data Scientists as well. Let us start with a very high-level primer on Machine Learning (ML). Most of today's ML models are supervised and applied to a prediction/classification task.


Amazon Gets Into the AutoML Race with AutoGluon: Some AutoML Architectures You Should Know About

#artificialintelligence

A few days ago, Amazon announced the release of AutoGluon, a new toolkit that simplifies the creation of deep learning models with just a few lines of code. The release marks Amazon's entrance into the ultra-competitive automated machine learning (AutoML) space, which is becoming one of the hottest trends for enterprise machine learning platforms. With so much news around the AutoML ecosystem, it sometimes becomes hard to differentiate signal from noise. Today, I would like to explore some of the most innovative AutoML stacks in the market that don't receive that much publicity. AutoML is becoming one of the most popular topics in modern data science applications.