AWS Machine Learning Blog

Category: Analytics

For an existing data lake registered with Lake Formation, the following diagram illustrates the proposed implementation.

Control and audit data exploration activities with Amazon SageMaker Studio and AWS Lake Formation

May 2024: This post was reviewed and updated to use a new dataset, reflect the updated Studio experience and AWS IAM Identity Center. Certain industries are required to audit all access to their data. This includes auditing exploratory activities performed by data scientists, who usually query data from within machine learning (ML) notebooks. This post […]

Using streaming ingestion with Amazon SageMaker Feature Store to make ML-backed decisions in near-real time

August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. Businesses are increasingly using machine learning (ML) to make near-real time decisions, such as placing an ad, assigning a driver, recommending a product, or even dynamically pricing […]

Automated model refresh with streaming data

In today’s world, being able to quickly bring on-premises machine learning (ML) models to the cloud is an integral part of any cloud migration journey. This post provides a step-by-step guide for launching a solution that facilitates the migration journey for large-scale ML workflows. This solution was developed by the Amazon ML Solutions Lab for […]

Real-time anomaly detection for Amazon Connect call quality using Amazon OpenSearch

September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details. If your contact center is serving calls over the internet, network metrics like packet loss, jitter, and round-trip time are key to understanding call quality. In the post Easily monitor call quality with Amazon Connect, we introduced a solution that […]

Video streaming and deep learning: Using Amazon Kinesis Video Streams with Deep Java Library

Amazon Kinesis Video Streams allows you to easily ingest video data from connected devices for processing. One of the most effective ways to process this video data is using the power of deep learning. You can create an efficient service infrastructure to run these computations with a Java server, but Java support for deep learning […]

Building a medical image search platform on AWS

Improving radiologist efficiency and preventing burnout is a primary goal for healthcare providers. A nationwide study published in Mayo Clinic Proceedings in 2015 showed radiologist burnout percentage at a concerning 61% [1]. In additon, the report concludes that “burnout and satisfaction with work-life balance in US physicians worsened from 2011 to 2014. More than half […]

Moving from notebooks to automated ML pipelines using Amazon SageMaker and AWS Glue

A typical machine learning (ML) workflow involves processes such as data extraction, data preprocessing, feature engineering, model training and evaluation, and model deployment. As data changes over time, when you deploy models to production, you want your model to learn continually from the stream of data. This means supporting the model’s ability to autonomously learn […]

Data visualization and anomaly detection using Amazon Athena and Pandas from Amazon SageMaker

Many organizations use Amazon SageMaker for their machine learning (ML) requirements and source data from a data lake stored on Amazon Simple Storage Service (Amazon S3). The petabyte scale source data on Amazon S3 may not always be clean because data lakes ingest data from several source systems, such as like flat files, external feeds, […]

Automating the analysis of multi-speaker audio files using Amazon Transcribe and Amazon Athena

In an effort to drive customer service improvements, many companies record the phone conversations between their customers and call center representatives. These call recordings are typically stored as audio files and processed to uncover insights such as customer sentiment, product or service issues, and agent effectiveness. To provide an accurate analysis of these audio files, […]

Accessing data sources from Amazon SageMaker R kernels

Amazon SageMaker notebooks now support R out-of-the-box, without needing you to manually install R kernels on the instances. Also, the notebooks come pre-installed with the reticulate library, which offers an R interface for the Amazon SageMaker Python SDK and enables you to invoke Python modules from within an R script. You can easily run machine […]