databrew
Preparing data for ML models using AWS Glue DataBrew in a Jupyter notebook
AWS Glue DataBrew is a new visual data preparation tool that makes it easy for data analysts and data scientists to clean and normalize data to prepare it for analytics and machine learning (ML). In this post, we examine a sample ML use case and show how to use DataBrew and a Jupyter notebook to upload a dataset, clean and normalize the data, and train and publish an ML model. We look for anomalies by applying the Amazon SageMaker Random Cut Forest (RCF) anomaly detection algorithm on a public dataset that records power consumption for more than 300 random households. To make it easier for you to get started, we created an AWS CloudFormation template that automatically configures a Jupyter notebook instance with the required libraries and installs the plugin. We used Amazon Deep Learning AMI to configure the out-of-the-box Jupyter server.
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.72)
- Information Technology > Data Science > Data Mining > Anomaly Detection (0.62)