AWS Machine Learning Blog

Category: Amazon SageMaker Autopilot

Automate a shared bikes and scooters classification model with Amazon SageMaker Autopilot

February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. Amazon SageMaker Autopilot makes it possible for organizations to quickly build and deploy an end-to-end machine learning (ML) model and inference pipeline with just a few lines of code or even without […]

Run AutoML experiments with large parquet datasets using Amazon SageMaker Autopilot

Starting today, you can use Amazon SageMaker Autopilot to tackle regression and classification tasks on large datasets up to 100 GB. Additionally, you can now provide your datasets in either CSV or Apache Parquet content types. Businesses are generating more data than ever, and demand is growing correspondingly for generating insights from these large datasets […]

Add AutoML functionality with Amazon SageMaker Autopilot across accounts

AutoML is a powerful capability, provided by Amazon SageMaker Autopilot, that allows non-experts to create machine learning (ML) models to invoke in their applications. The problem that we want to solve arises when, due to governance constraints, Amazon SageMaker resources can’t be deployed in the same AWS account where they are used. Examples of such […]

Use integrated explainability tools and improve model quality using Amazon SageMaker Autopilot

Whether you are developing a machine learning (ML) model for reducing operating cost, improving efficiency, or improving customer satisfaction, there are no perfect solutions when it comes to producing an effective model. From an ML development perspective, data scientists typically go through stages of data exploration, feature engineering, model development, and model training and tuning […]

Develop and deploy ML models using Amazon SageMaker Data Wrangler and Amazon SageMaker Autopilot

Data generates new value for businesses through insights and predictive models. However, although data is plentiful, available data scientists are few and far between. Despite our attempts in recent years to produce data scientists from academia and elsewhere, we still see a huge shortage that will continue into the near future. To accelerate model building, […]

Creating high-quality machine learning models for financial services using Amazon SageMaker Autopilot

Machine learning (ML) is used throughout the financial services industry to perform a wide variety of tasks, such as fraud detection, market surveillance, portfolio optimization, loan solvency prediction, direct marketing, and many others. This breadth of use cases has created a need for lines of business to quickly generate high-quality and performant models that can […]

Customizing and reusing models generated by Amazon SageMaker Autopilot

Amazon SageMaker Autopilot automatically trains and tunes the best machine learning (ML) models for classification or regression problems while allowing you to maintain full control and visibility. This not only allows data analysts, developers, and data scientists to train, tune, and deploy models with little to no code, but you can also review a generated […]

Explaining Amazon SageMaker Autopilot models with SHAP

Machine learning (ML) models have long been considered black boxes because predictions from these models are hard to interpret. Recently, however, several frameworks that aim to explain ML models have been proposed. Model interpretation can be divided into local and global explanations. A local explanation considers a single sample and answers questions like “Why does the model […]

Deploying your own data processing code in an Amazon SageMaker Autopilot inference pipeline

The machine learning (ML) model-building process requires data scientists to manually prepare data features, select an appropriate algorithm, and optimize its model parameters. It involves a lot of effort and expertise. Amazon SageMaker Autopilot removes the heavy lifting required by this ML process. It inspects your dataset, generates several ML pipelines, and compares their performance […]