Collaborating Authors

The Death of Data Scientists – will AutoML replace them? - KDnuggets

#artificialintelligence

One cannot introduce AutoML without mentioning the machine learning project's life cycle, which includes data cleaning, feature selection/engineering, model selection, parameter optimization, and finally, model validation. As advanced as technology has become, the traditional data science project still incorporates a lot of manual processes and remains time-consuming and repetitive. AutoML came into the picture to automate the entire process from data cleaning to parameter optimization. It provides tremendous value for machine learning projects in terms of both time savings and performance. Launched in 2018, Google Cloud AutoML quickly gained popularity with its user-friendly interface and high performance.
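As a rough illustration of the model selection and parameter optimization steps that AutoML tools automate, here is a minimal sketch using only the Python standard library. It is not how Google Cloud AutoML or any particular product works. The toy dataset, the candidate models (a majority-class baseline and a k-nearest-neighbours classifier), and every function name are assumptions made for illustration.

```python
import random


def make_toy_data(n=60, seed=0):
    """Hypothetical 1-D dataset: class 0 centred at 0.0, class 1 centred at 3.0."""
    rng = random.Random(seed)
    return [(rng.gauss(3.0 * (i % 2), 1.0), i % 2) for i in range(n)]


def make_majority():
    """Baseline candidate: always predict the most common training label."""
    def fit(train):
        labels = [label for _, label in train]
        majority = max(set(labels), key=labels.count)
        return lambda x: majority
    return fit


def make_knn(k):
    """Candidate: k-nearest-neighbours with majority vote (k is the hyperparameter)."""
    def fit(train):
        def predict(x):
            nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
            votes = [label for _, label in nearest]
            return max(set(votes), key=votes.count)
        return predict
    return fit


def cross_val_accuracy(fit, data, folds=5):
    """Mean accuracy over `folds` interleaved train/test splits."""
    scores = []
    for i in range(folds):
        test = data[i::folds]
        train = [p for j, p in enumerate(data) if j % folds != i]
        predict = fit(train)
        correct = sum(1 for x, y in test if predict(x) == y)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


def build_candidates():
    """Search space: one baseline plus k-NN for several odd values of k."""
    candidates = {"majority": make_majority()}
    for k in (1, 3, 5, 7):
        candidates[f"knn(k={k})"] = make_knn(k)
    return candidates


def auto_select(data, candidates):
    """Score every candidate by cross-validation and return the best one."""
    scores = {name: cross_val_accuracy(fit, data) for name, fit in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    best, scores = auto_select(make_toy_data(), build_candidates())
    for name, score in scores.items():
        print(f"{name}: {score:.2f}")
    print("selected:", best)
```

Real AutoML systems search far larger spaces and also automate data cleaning and feature engineering. The sketch only shows the loop at the core of the idea: enumerate candidates, validate each one, keep the best.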


Data-driven Advice for Applying Machine Learning to Bioinformatics Problems

arXiv.org Machine Learning

As the bioinformatics field grows, it must keep pace not only with new data but with new algorithms. Here we contribute a thorough analysis of 13 state-of-the-art, commonly used machine learning algorithms on a set of 165 publicly available classification problems in order to provide data-driven algorithm recommendations to current researchers. We present a number of statistical and visual comparisons of algorithm performance and quantify the effect of model selection and algorithm tuning for each algorithm and dataset. The analysis culminates in the recommendation of five algorithms with hyperparameters that maximize classifier performance across the tested problems, as well as general guidelines for applying machine learning to supervised classification problems.
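One common way to compare several algorithms across many datasets is to rank them within each dataset and then average the ranks. The abstract does not say which statistics the paper uses, so the sketch below is an assumption rather than the authors' method. The dataset names, algorithm names, and scores are hypothetical placeholders, not results from the paper.

```python
def rank_within_dataset(results):
    """Rank algorithms on one dataset (1 = best); tied scores share the average rank."""
    ordered = sorted(results.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of algorithms tied with position i.
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        shared_rank = ((i + 1) + (j + 1)) / 2
        for name, _ in ordered[i:j + 1]:
            ranks[name] = shared_rank
        i = j + 1
    return ranks


def mean_ranks(scores):
    """scores: {dataset: {algorithm: score}} -> {algorithm: mean rank over datasets}."""
    totals = {}
    for results in scores.values():
        for name, rank in rank_within_dataset(results).items():
            totals.setdefault(name, []).append(rank)
    return {name: sum(r) / len(r) for name, r in totals.items()}


# Hypothetical scores for illustration only.
toy_scores = {
    "dataset_a": {"algo_x": 0.90, "algo_y": 0.85, "algo_z": 0.85},
    "dataset_b": {"algo_x": 0.70, "algo_y": 0.80, "algo_z": 0.60},
}

if __name__ == "__main__":
    for name, rank in sorted(mean_ranks(toy_scores).items(), key=lambda kv: kv[1]):
        print(f"{name}: mean rank {rank:.2f}")
```

A lower mean rank indicates an algorithm that tends to place near the top across datasets, regardless of how hard each individual dataset is.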


The 'Big Bang' of Data Science and ML Tools

#artificialintelligence

The tools used for data science are changing rapidly, according to Gartner, which said we're in the midst of a "big bang" in its latest report on data science and machine learning platforms. "The data science and ML market is healthy and vibrant, with a broad mix of vendors offering a range of capabilities," Gartner says in its Magic Quadrant for Data Science and Machine Learning Platforms published January 28. "The market is experiencing a 'big bang' that is redefining not only who does data science and ML, but how it is done." The analyst group defines a data science platform as an integrated place where data scientists, citizen data scientists, and developers can get all of the core capabilities they need not only to build data science applications, but also to embed them into existing business processes and manage and maintain them over time. Integration and cohesion are key, in Gartner's view, and applications that simply bundle various packages and libraries, especially open source offerings, are not considered true platforms.


Most popular programming language frameworks and tools for machine learning - TechRepublic

#artificialintelligence

If you're wondering which of the growing suite of programming language libraries and tools are a good choice for implementing machine-learning models, then help is at hand. More than 1,300 people, mainly working in the tech, finance and healthcare sectors, revealed which machine-learning technologies they use at their firms in a new O'Reilly survey. The list is a mix of software frameworks and libraries for data science favorite Python, big data platforms, and cloud-based services that handle each stage of the machine-learning pipeline. Most firms are still at the evaluation stage when it comes to using machine learning, or AI as the report refers to it, and the most common tools being implemented were those for 'model visualization' and 'automated model search and hyperparameter tuning'. Unsurprisingly, the most common form of ML being used was supervised learning, where a machine-learning model is trained using large amounts of labelled data.

