Automate feature engineering pipelines with Amazon SageMaker

#artificialintelligence 

The process of extracting, cleaning, manipulating, and encoding data from raw sources and preparing it to be consumed by machine learning (ML) algorithms is an important, expensive, and time-consuming part of data science. Managing these data pipelines for either training or inference is a challenge for data science teams, however, and can take valuable time away that could be better used towards experimenting with new features or optimizing model performance with different algorithms or hyperparameter tuning. Many ML use cases such as churn prediction, fraud detection, or predictive maintenance rely on models trained from historical datasets that build up over time. The set of feature engineering steps a data scientist defined and performed on historical data for one time period needs to be applied towards any new data after that period, as models trained from historic features need to make predictions on features derived from the new data. Instead of manually performing these feature transformations on new data as it arrives, data scientists can create a data preprocessing pipeline to perform the desired set of feature engineering steps that runs automatically whenever new raw data is available.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found