Many business leaders say that a lack of qualified personnel with artificial intelligence (AI) skills is a major barrier to implementing AI across their businesses. As more companies adopt AI and its applications, such as machine learning and deep learning, this knowledge gap can prove costly. Data scientists build AI applications that mine, analyze, and filter raw data, and they select the algorithms used to train machine learning models, but their roles have diversified in recent years. Demand for data scientists is rising rapidly while talent remains acutely scarce, so companies risk falling behind in the race for digital transformation. Business leaders are therefore looking to automate some of a data scientist's roles and responsibilities.
When you're trying to train the best machine learning model for your data automatically, there's AutoML, or automated machine learning, and then there's Google Cloud AutoML. Google Cloud AutoML is a cut above. In the past I've reviewed H2O Driverless AI, Amazon SageMaker, and Azure Machine Learning AutoML. Driverless AI automatically performs feature engineering and hyperparameter tuning, and claims to perform as well as Kaggle masters. Azure Machine Learning AutoML automatically sweeps through features, algorithms, and hyperparameters for basic machine learning algorithms; a separate Azure Machine Learning hyperparameter tuning facility allows you to sweep specific hyperparameters for an existing experiment.
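To make the idea of sweeping through algorithms and hyperparameters concrete, here is a minimal sketch in plain Python. It is not the API of Azure Machine Learning, Driverless AI, or Google Cloud AutoML; the toy dataset, the three candidate "algorithms", and every name in it (`SEARCH_SPACE`, `sweep`, and so on) are assumptions made up for illustration. It simply fits each algorithm and hyperparameter combination on a training split and ranks them by validation error.

```python
import random

def make_data(n=60, seed=0):
    """Toy 1-D regression data (hypothetical): y = 2x + 1 plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]
    return xs, ys

def fit_mean(xs, ys):
    """Baseline: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_knn(xs, ys, k):
    """k-nearest-neighbour regressor; k is the hyperparameter."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def fit_ridge(xs, ys, alpha):
    """1-D ridge regression in closed form; alpha is the penalty strength."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

# Each entry: algorithm name -> (fit function, list of hyperparameter settings).
SEARCH_SPACE = {
    "mean": (fit_mean, [{}]),
    "knn": (fit_knn, [{"k": k} for k in (1, 3, 5, 9)]),
    "ridge": (fit_ridge, [{"alpha": a} for a in (0.0, 1.0, 10.0, 100.0)]),
}

def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def sweep(xs, ys, holdout=0.25):
    """Fit every (algorithm, hyperparameters) pair and rank by validation MSE."""
    split = int(len(xs) * (1 - holdout))
    train_x, train_y = xs[:split], ys[:split]
    valid_x, valid_y = xs[split:], ys[split:]
    results = []
    for name, (fit, grid) in SEARCH_SPACE.items():
        for params in grid:
            model = fit(train_x, train_y, **params)
            results.append((mse(model, valid_x, valid_y), name, params))
    results.sort(key=lambda r: r[0])  # lowest validation error first
    return results

xs, ys = make_data()
leaderboard = sweep(xs, ys)
best_score, best_name, best_params = leaderboard[0]
print(best_name, best_params, round(best_score, 3))
```

Real AutoML services add feature engineering, smarter search strategies, and cross-validation on top of this loop, but the leaderboard of scored configurations is the common core.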
AutoML tools are the need of the hour for data scientists who want to reduce their workloads in a world where the volume of generated data is growing exponentially. Readily available AutoML tools make the data science practitioner's work easier and cover the foundations needed to create automated machine learning modules. Given the surge in data and the potential that this data holds, data scientists stand to benefit even more from AutoML capabilities. As we approach the midpoint of 2020, it is slowly being recognised that this year will see an increase in the adoption of AutoML. With AutoML's potential about to be realised, both data science practitioners and professionals outside data science will want a more comprehensive view of the technology.
There's an irony around Artificial Intelligence (AI) work: it involves a lot of manual, trial-and-error effort to build predictive models with the highest accuracy. With a seemingly continuous emergence of machine learning and deep learning frameworks, and updates to them, as well as changes to tooling platforms, it's no wonder that so much AI work is so ad hoc. But still, why would a technology that's all about automation involve so much bespoke effort? A few companies, like DataRobot, specialize in automating that effort with AutoML. Other AI startups, like Dataiku, H2O, and RapidMiner, and established enterprise software companies like Tibco, have broad AI platforms that feature AutoML capabilities too.
The General Automated Machine learning Assistant (GAMA) is a modular AutoML system developed to let users track and control how AutoML algorithms search for optimal machine learning pipelines, and to facilitate AutoML research itself. In contrast to current, often black-box systems, GAMA allows users to plug in different AutoML and post-processing techniques, logs and visualizes the search process, and supports easy benchmarking. It currently features three AutoML search algorithms and two model post-processing steps, and it is designed to allow more components to be added.
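The modular design described above can be illustrated with a small, self-contained sketch. This is not GAMA's actual API: the class `MiniAutoML`, the two search functions, the two post-processing functions, and the toy loss are all hypothetical names invented here, and the "pipeline" is reduced to two numeric settings. The sketch only shows the shape of the idea: the search strategy and the post-processing step are interchangeable components, and every evaluation is written to a log that can be inspected afterwards.

```python
import random

def toy_loss(candidate):
    """Hypothetical stand-in for a pipeline's validation loss; minimum at x=3, y=-1."""
    return (candidate["x"] - 3.0) ** 2 + (candidate["y"] + 1.0) ** 2

# --- Pluggable search strategies: each appends (loss, candidate) to the log ---

def random_search(loss, budget, rng, log):
    """Sample candidates uniformly at random from the search space."""
    for _ in range(budget):
        cand = {"x": rng.uniform(-5, 5), "y": rng.uniform(-5, 5)}
        log.append((loss(cand), cand))

def hill_climb(loss, budget, rng, log):
    """Perturb the current candidate and keep the change only if it improves."""
    current = {"x": 0.0, "y": 0.0}
    current_loss = loss(current)
    log.append((current_loss, current))
    for _ in range(budget - 1):
        cand = {k: v + rng.gauss(0, 0.5) for k, v in current.items()}
        cand_loss = loss(cand)
        log.append((cand_loss, cand))
        if cand_loss < current_loss:
            current, current_loss = cand, cand_loss

# --- Pluggable post-processing: turn the search log into a final answer ---

def best_fit(log):
    """Return the single best candidate seen during the search."""
    return min(log, key=lambda entry: entry[0])[1]

def average_top_k(log, k=5):
    """Average the k best candidates (a crude analogue of ensembling)."""
    top = sorted(log, key=lambda entry: entry[0])[:k]
    keys = top[0][1].keys()
    return {key: sum(c[key] for _, c in top) / len(top) for key in keys}

class MiniAutoML:
    """Ties one search strategy and one post-processing step together."""

    def __init__(self, search, post_processing, seed=0):
        self.search = search
        self.post_processing = post_processing
        self.seed = seed
        self.log = []  # every evaluated candidate, for later inspection

    def fit(self, loss, budget=200):
        self.log.clear()
        self.search(loss, budget, random.Random(self.seed), self.log)
        return self.post_processing(self.log)

# Benchmark every combination of search strategy and post-processing step.
results = {}
for search in (random_search, hill_climb):
    for post in (best_fit, average_top_k):
        automl = MiniAutoML(search, post)
        final = automl.fit(toy_loss)
        results[(search.__name__, post.__name__)] = toy_loss(final)
        print(search.__name__, post.__name__, round(results[(search.__name__, post.__name__)], 4))
```

Because the components share one small interface, adding a third search strategy means writing one more function with the same signature, and the retained log is what makes the search process open to visualization and benchmarking rather than a black box.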