Collaborating Authors

dotData's AI-FastStart Program Helps BI teams Adopt AI/ML with AutoML 2.0 dotData AutoML 2.0 Solutions for Enterprise


AI-FastStart was born as a direct response to a rapidly changing BI & Analytics world. AI/ML has become a critical technology investment, but most organizations still struggle to scale their AI/ML practices. The program was designed around four core principles: the right platform, education, fast time-to-value, and ease of deployment and implementation. We provide the best software, host it on the best possible platform, bundle the right depth and amount of education for unlimited users, and tailor services to enable an operational first use case in as little time as feasible. Whether your BI team has no experience with AI/ML or is made up of experts, dotData AI-FastStart will help its members become more proficient and more successful, and will ultimately provide an exceptional predictive analytics foundation for your organization for years to come.

AutoML: Exploration v.s. Exploitation Machine Learning

Building a machine learning (ML) pipeline in an automated way is a crucial and complex task, as it is constrained by the available time budget and resources. This has encouraged the research community to introduce several solutions that make better use of the available time and resources. Much work has been done to suggest the most promising classifiers for a given dataset using a variety of techniques, including meta-learning based techniques. This gives the AutoML framework the chance to spend more time exploiting those classifiers and tuning their hyper-parameters. In this paper, we empirically study the hypothesis that pipeline performance improves when the most promising classifiers are exploited within the limited time budget. We also study the effect of increasing the time budget on pipeline performance. The empirical results across autoSKLearn, TPOT and ATM show that exploiting the most promising classifiers does not achieve statistically better performance than exploring the entire search space. The same conclusion also holds for long time budgets.
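To make the two strategies in this abstract concrete, here is a minimal toy sketch, not the paper's experimental setup. It assumes the budget is counted in evaluations rather than wall-clock time, and that each classifier's score is drawn from an invented distribution. The classifier names, the numbers in SEARCH_SPACE, and the functions evaluate, budgeted_search, explore and exploit are all hypothetical placeholders. The ranking used by exploit stands in for the suggestions a meta-learning component might provide.

```python
import random

# Hypothetical search space: classifier name -> (mean, spread) of a toy score
# distribution. These numbers are placeholders, not measured results.
SEARCH_SPACE = {
    "random_forest": (0.80, 0.05),
    "gradient_boosting": (0.79, 0.06),
    "svm": (0.75, 0.10),
    "knn": (0.70, 0.08),
    "naive_bayes": (0.65, 0.03),
}


def evaluate(classifier, rng):
    """Stand-in for training one sampled configuration and scoring it."""
    mean, spread = SEARCH_SPACE[classifier]
    # Clamp the simulated score to the valid range [0, 1].
    return min(1.0, max(0.0, rng.gauss(mean, spread)))


def budgeted_search(candidates, budget, seed=0):
    """Random search over `candidates`, limited to `budget` evaluations."""
    rng = random.Random(seed)
    best_classifier, best_score = None, -1.0
    for _ in range(budget):
        classifier = rng.choice(candidates)
        score = evaluate(classifier, rng)
        if score > best_score:
            best_classifier, best_score = classifier, score
    return best_classifier, best_score


def explore(budget, seed=0):
    """Spend the whole budget across the entire search space."""
    return budgeted_search(list(SEARCH_SPACE), budget, seed)


def exploit(top_k, budget, seed=0):
    """Spend the whole budget on the top_k classifiers of an assumed ranking."""
    ranked = sorted(SEARCH_SPACE, key=lambda c: SEARCH_SPACE[c][0], reverse=True)
    return budgeted_search(ranked[:top_k], budget, seed)


if __name__ == "__main__":
    # Compare both strategies under a short and a long budget.
    for budget in (10, 100):
        print(budget, "explore:", explore(budget), "exploit:", exploit(2, budget))
```

Because the scores are simulated, the output says nothing about real frameworks. The sketch only shows the shape of the comparison: the same budget is spent either on a narrowed candidate set or on the full search space, and the best score found is compared across budgets.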

Can AutoML outperform humans? An evaluation on popular OpenML datasets using AutoML Benchmark Machine Learning

In the last few years, Automated Machine Learning (AutoML) has gained much attention. This raises the question of whether AutoML can outperform results achieved by human data scientists. This paper compares four AutoML frameworks on 12 different popular datasets from OpenML; six of them are supervised classification tasks and the other six are supervised regression tasks. Additionally, we consider a real-life dataset from one of our recent projects. The results show that the automated frameworks perform as well as or better than the machine learning community in 7 out of 12 OpenML tasks.

GAMA: a General Automated Machine learning Assistant Machine Learning

The General Automated Machine learning Assistant (GAMA) is a modular AutoML system developed to empower users to track and control how AutoML algorithms search for optimal machine learning pipelines, and to facilitate AutoML research itself. In contrast to current, often black-box systems, GAMA allows users to plug in different AutoML and post-processing techniques, logs and visualizes the search process, and supports easy benchmarking. It currently features three AutoML search algorithms and two model post-processing steps, and is designed to allow more components to be added.