Exploring the ML Tooling Landscape (Part 2 of 3)

#artificialintelligence

In the previous blog post in this series, we examined overall machine learning (ML) maturity in industry with a specific focus on machine learning operations (MLOps). The two main takeaways were the striking lack of ML maturity in industry as a whole, as well as the complexities involved in fully embracing MLOps, which can be taken to represent the apogee of ML maturation. In this blog post, we will consider the implications for tooling adoption in industry and the wider ML tooling market. This blog post is concerned with the second question. As with the previous post, the same disclaimer applies: This series of blog posts is by no means meant to be exhaustive -- or necessarily even correct in places! I wrote this to try to organise my thinking on the reading I've done in recent weeks, and I want this to become a jumping-off point for further discussion.


State of the Art in Automated Machine Learning

#artificialintelligence

In recent years, machine learning has been very successful in solving a wide range of problems. In particular, neural networks have reached human, and sometimes super-human, levels of ability in tasks such as language translation, object recognition, game playing, and even driving cars. With this growth in capability has come a growth in complexity. Data scientists and machine learning engineers must perform feature engineering, design model architectures, and optimize hyperparameters. Since the purpose of machine learning is to automate a task normally done by humans, naturally the next step is to automate the tasks of data scientists and engineers. This area of research is called automated machine learning, or AutoML. There have been many exciting developments in AutoML recently, and it's important to take a look at the current state of the art and learn about what's happening now and what's coming up in the future. InfoQ reached out to the following subject matter experts in the industry to discuss the current state and future trends in the AutoML space. InfoQ: What is AutoML and why is it important? Francesca Lazzeri: AutoML is the process of automating the time-consuming, iterative tasks of machine learning model development, including model selection and hyperparameter tuning.


Is Data-First AI the Next Big Thing?

#artificialintelligence

We are roughly a decade removed from the beginnings of the modern machine learning (ML) platform, inspired largely by the growing ecosystem of open-source Python-based technologies for data scientists. It's a good time for us to reflect on the progress that has been made, highlight the major problems enterprises have with existing ML platforms, and discuss what the next generation of platforms will be like. As we'll discuss, we believe the next disruption in the ML platform market will be the growth of data-first AI platforms. It is sometimes easy to forget now (or, tragically, maybe it's all too real for some), but there was once a time when building machine learning models required a substantial amount of work. In days not too far gone, this would involve implementing your own algorithms, writing tons of code in the process, and hoping you made no crucial errors in translating academic work into a functional library.


How Data-Centric Platforms Solve the Biggest Challenges for MLOps

#artificialintelligence

Recently, I learned that the failure rate for machine learning projects is still astonishingly high. Studies suggest that between 85% and 96% of projects never make it to production. These numbers are even more remarkable given the growth of machine learning (ML) and data science in the past five years. For businesses to be successful with ML initiatives, they need a comprehensive understanding of the risks and how to address them. In this post, we attempt to shed light on how to achieve this by moving away from a model-centric view of ML systems towards a data-centric view. Of course, everyone knows that data is the most important component of ML. Nearly every data scientist has heard "garbage in, garbage out" and "80% of a data scientist's time is spent cleaning data".