Review: AWS AI and Machine Learning stacks up


Amazon Web Services claims to have the broadest and most complete set of machine learning capabilities. I honestly don't know how the company can claim those superlatives with a straight face: Yes, the AWS machine learning offerings are broad, fairly complete, and rather impressive, but so are those of Google Cloud and Microsoft Azure. Amazon SageMaker Clarify is the new add-on to the Amazon SageMaker machine learning ecosystem for Responsible AI. SageMaker Clarify integrates with SageMaker at three points: in the new Data Wrangler, to detect data biases at import time, such as imbalanced classes in the training set; in the Experiments tab of SageMaker Studio, to detect biases in the model after training and to explain the importance of features; and in SageMaker Model Monitor, to detect bias shifts in a deployed model over time. Historically, AWS has presented its services as cloud-only.
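To make "imbalanced classes in the training set" concrete, here is a minimal sketch of the kind of check involved. It is plain Python, not the SageMaker Clarify API; the function name `class_imbalance` and the sample labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def class_imbalance(labels):
    """Return each class's share of the data and the majority/minority ratio."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = sum(counts.values())
    # Fraction of the dataset that belongs to each class.
    shares = {cls: n / total for cls, n in counts.items()}
    # How many majority-class examples exist per minority-class example.
    ratio = max(counts.values()) / min(counts.values())
    return shares, ratio


# Hypothetical training labels: 90 negatives and 10 positives.
labels = [0] * 90 + [1] * 10
shares, ratio = class_imbalance(labels)
print(shares)
print(ratio)
```

A ratio far above 1 signals that one class dominates the training set, which is the sort of condition a bias check at import time is meant to surface.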

Amazon SageMaker JumpStart Simplifies Access to Pre-built Models and Machine Learning Solutions


Today, I'm extremely happy to announce the availability of Amazon SageMaker JumpStart, a capability of Amazon SageMaker that accelerates your machine learning workflows with one-click access to popular model collections (also known as "model zoos"), and to end-to-end solutions that solve common use cases. In recent years, machine learning (ML) has proven to be a valuable technique in improving and automating business processes. Indeed, models trained on historical data can accurately predict outcomes across a wide range of industry segments: financial services, retail, manufacturing, telecom, life sciences, and so on. Yet, working with these models requires skills and experience that only a subset of scientists and developers have: preparing a dataset, selecting an algorithm, training a model, optimizing its accuracy, deploying it in production, and monitoring its performance over time. In order to simplify the model building process, the ML community has created model zoos, that is to say, collections of models built with popular open source libraries, and often pretrained on reference datasets.

Perform interactive data processing using Spark in Amazon SageMaker Studio Notebooks


Amazon SageMaker Studio is the first fully integrated development environment (IDE) for machine learning (ML). With a single click, data scientists and developers can quickly spin up Studio notebooks to explore datasets and build models. You can now use Studio notebooks to securely connect to Amazon EMR clusters and prepare vast amounts of data for analysis and reporting, model training, or inference. You can apply this new capability in several ways. For example, data analysts may want to answer a business question by exploring and querying their data in Amazon EMR, viewing the results, and then either altering the initial query or drilling deeper into the results.

Get started with machine learning in this Amazon SageMaker tutorial


Amazon SageMaker makes machine learning accessible. Developers and data scientists can use it to build and deploy machine learning models on AWS without additional infrastructure management tasks. Amazon SageMaker provides pre-built algorithms and support for open source Jupyter notebook instances to make it easier to get a machine learning model running in applications. In this Amazon SageMaker tutorial, we'll break down how to get a notebook instance up and running and how to train and validate your machine learning model. To get started, set up the necessary AWS Identity and Access Management (IAM) roles and permissions and then create a Jupyter notebook that will run Python code.

Preparing data for ML models using AWS Glue DataBrew in a Jupyter notebook


AWS Glue DataBrew is a new visual data preparation tool that makes it easy for data analysts and data scientists to clean and normalize data to prepare it for analytics and machine learning (ML). In this post, we examine a sample ML use case and show how to use DataBrew and a Jupyter notebook to upload a dataset, clean and normalize the data, and train and publish an ML model. We look for anomalies by applying the Amazon SageMaker Random Cut Forest (RCF) anomaly detection algorithm on a public dataset that records power consumption for more than 300 random households. To make it easier for you to get started, we created an AWS CloudFormation template that automatically configures a Jupyter notebook instance with the required libraries and installs the plugin. We used Amazon Deep Learning AMI to configure the out-of-the-box Jupyter server.
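As a rough illustration of what "clean and normalize" can mean before anomaly detection, the sketch below drops missing readings and min-max scales the rest. It is plain Python and does not use DataBrew or the SageMaker RCF algorithm; the function name `clean_and_normalize` and the sample readings are hypothetical.

```python
def clean_and_normalize(readings):
    """Drop missing readings, then min-max scale the rest to [0, 1]."""
    # Cleaning step: discard entries recorded as missing (None).
    cleaned = [r for r in readings if r is not None]
    if not cleaned:
        return []
    lo, hi = min(cleaned), max(cleaned)
    if hi == lo:
        # All values are identical, so there is no range to scale by.
        return [0.0 for _ in cleaned]
    # Normalization step: map the smallest value to 0 and the largest to 1.
    return [(r - lo) / (hi - lo) for r in cleaned]


# Hypothetical power readings for one household, with two gaps.
readings = [2.0, None, 4.0, 6.0, None, 10.0]
print(clean_and_normalize(readings))  # [0.0, 0.25, 0.5, 1.0]
```

Scaling every household to the same range keeps large consumers from dominating the comparison, which is one common reason to normalize before training.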