AutoML tools have become essential for data scientists seeking to reduce their workloads in a world where data generation is growing exponentially. Readily available AutoML tools make the practitioner's work easier and cover the foundations needed to build automated machine learning modules. Given the surge in data and the potential it holds, data scientists stand to benefit substantially from AutoML capabilities. As we approach the midpoint of 2020, it is increasingly recognised that this year will see a rise in the adoption of AutoML. With the technology's potential about to be realised, both non-data-science professionals and data science practitioners will look for a more comprehensive view of it.
AI-FastStart was born as a direct response to a rapidly changing BI & Analytics world. AI/ML has become a critical technology investment, yet most organizations still struggle to scale their AI/ML practices. The program was designed around four core principles: the right platform, the right education, fast time-to-value, and ease of deployment and implementation. We provide the best software, host it on the best possible platform, bundle the right depth and amount of education for unlimited users, and tailor services to enable an operational first use case in as little time as feasible. Whether your BI team has no experience with AI/ML or is made up of experts, dotData AI-FastStart will help it become more proficient and more successful, and will ultimately provide an exceptional predictive analytics foundation for your organization for years to come.
Building a machine learning (ML) pipeline in an automated way is a crucial and complex task, as it is constrained by the available time budget and resources. This has encouraged the research community to introduce several solutions for making the best use of that time and those resources. Much work has been done on suggesting the most promising classifiers for a given dataset using a variety of techniques, including meta-learning-based ones. This gives the AutoML framework the chance to spend more time exploiting those classifiers and tuning their hyper-parameters. In this paper, we empirically study the hypothesis that pipeline performance can be improved by exploiting the most promising classifiers within a limited time budget. We also study the effect of increasing the time budget on pipeline performance. The empirical results across auto-sklearn, TPOT, and ATM show that exploiting the most promising classifiers does not achieve statistically better performance than exploring the entire search space. The same conclusion holds for longer time budgets.
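The exploit-versus-explore trade-off studied above can be illustrated with a minimal, self-contained sketch. The classifier names, score distributions, and the `evaluate`/`search` helpers below are all hypothetical stand-ins, not part of any of the cited frameworks: "explore" spreads a fixed evaluation budget over the full candidate set, while "exploit" concentrates the same budget on the families a meta-learner would rank highest.

```python
import random

random.seed(0)

# Hypothetical per-classifier score distributions, standing in for the
# performance a pipeline search would observe when evaluating a sampled
# hyper-parameter configuration of each classifier family.
TRUE_MEANS = {
    "random_forest": 0.86,
    "gradient_boosting": 0.88,
    "svm": 0.84,
    "knn": 0.78,
    "naive_bayes": 0.74,
}

def evaluate(clf_name):
    """Simulate one noisy pipeline evaluation for a classifier family."""
    return random.gauss(TRUE_MEANS[clf_name], 0.03)

def search(candidates, budget):
    """Spend `budget` evaluations round-robin over `candidates`;
    return the best score observed."""
    best = float("-inf")
    for i in range(budget):
        clf = candidates[i % len(candidates)]
        best = max(best, evaluate(clf))
    return best

# "Explore": spread the budget over the entire search space.
explore_best = search(list(TRUE_MEANS), budget=50)

# "Exploit": assume meta-learning has flagged the two most promising
# families, and concentrate the same budget on them.
exploit_best = search(["gradient_boosting", "random_forest"], budget=50)

print(f"explore best: {explore_best:.3f}")
print(f"exploit best: {exploit_best:.3f}")
```

In this toy setting, concentrating the budget yields many draws from the best families, but exploring still samples them often enough that its best observed score is comparable, which loosely mirrors the paper's finding that restricting the search space gives no statistically significant gain.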
In the last few years, Automated Machine Learning (AutoML) has gained much attention. This raises the question of whether AutoML can outperform results achieved by human data scientists. This paper compares four AutoML frameworks on 12 popular datasets from OpenML: six supervised classification tasks and six supervised regression tasks. Additionally, we consider a real-life dataset from one of our recent projects. The results show that the automated frameworks perform better than, or on par with, the machine learning community in 7 out of the 12 OpenML tasks.