AWS Machine Learning Blog

Category: Amazon SageMaker Ground Truth

Create a data labeling project with Amazon SageMaker Ground Truth Plus

Amazon SageMaker Ground Truth is a powerful data labeling service offered by AWS that provides a comprehensive and scalable platform for labeling various types of data, including text, images, videos, and 3D point clouds, using a diverse workforce of human annotators. In addition to traditional custom-tailored deep learning models, SageMaker Ground Truth also supports generative […]

How Northpower used computer vision with AWS to automate safety inspection risk assessments

How Northpower used computer vision with AWS to automate safety inspection risk assessments

In this post, we share how Northpower has worked with their technology partner Sculpt to reduce the effort and carbon required to identify and remediate public safety risks. Specifically, we cover the computer vision and artificial intelligence (AI) techniques used to combine datasets into a list of prioritized tasks for field teams to investigate and mitigate.

Align Meta Llama 3 to human preferences with DPO, Amazon SageMaker Studio, and Amazon SageMaker Ground Truth

Align Meta Llama 3 to human preferences with DPO, Amazon SageMaker Studio, and Amazon SageMaker Ground Truth

In this post, we show you how to enhance the performance of Meta Llama 3 8B Instruct by fine-tuning it using direct preference optimization (DPO) on data collected with SageMaker Ground Truth.

Use LangChain with PySpark to process documents at massive scale with Amazon SageMaker Studio and Amazon EMR Serverless

Use LangChain with PySpark to process documents at massive scale with Amazon SageMaker Studio and Amazon EMR Serverless

In this post, we explore how to build a scalable and efficient Retrieval Augmented Generation (RAG) system using the new EMR Serverless integration, Spark’s distributed processing, and an Amazon OpenSearch Service vector database powered by the LangChain orchestration framework. This solution enables you to process massive volumes of textual data, generate relevant embeddings, and store them in a powerful vector database for seamless retrieval and generation.

Use IP-restricted presigned URLs to enhance security in Amazon SageMaker Ground Truth

While presigned URLs offer a convenient way to grant temporary access to S3 objects, sharing these URLs with people outside of the workteam can lead to unintended access of those objects. To mitigate this risk and enhance the security of SageMaker Ground Truth labeling tasks, we have introduced a new feature that adds an additional layer of security by restricting access to the presigned URLs to the worker’s IP address or virtual private cloud (VPC) endpoint from which they access the labeling task. In this blog post, we show you how to enable this feature, allowing you to enhance your data security as needed, and outline the success criteria for this feature, including the scenarios where it will be most beneficial.

How Krikey AI harnessed the power of Amazon SageMaker Ground Truth to accelerate generative AI development

This post is co-written with Jhanvi Shriram and Ketaki Shriram from Krikey. Krikey AI is revolutionizing the world of 3D animation with their innovative platform that allows anyone to generate high-quality 3D animations using just text or video inputs, without needing any prior animation experience. At the core of Krikey AI’s offering is their powerful […]

Incorporate offline and online human – machine workflows into your generative AI applications on AWS

Recent advances in artificial intelligence have led to the emergence of generative AI that can produce human-like novel content such as images, text, and audio. These models are pre-trained on massive datasets and, to sometimes fine-tuned with smaller sets of more task specific data. An important aspect of developing effective generative AI application is Reinforcement […]

Auto labeling workflow

Build an active learning pipeline for automatic annotation of images with AWS services

This blog post is co-written with Caroline Chung from Veoneer. Veoneer is a global automotive electronics company and a world leader in automotive electronic safety systems. They offer best-in-class restraint control systems and have delivered over 1 billion electronic control units and crash sensors to car manufacturers globally. The company continues to build on a […]

Skeleton-based pose annotation labeling using Amazon SageMaker Ground Truth

Pose estimation is a computer vision technique that detects a set of points on objects (such as people or vehicles) within images or videos. Pose estimation has real-world applications in sports, robotics, security, augmented reality, media and entertainment, medical applications, and more. Pose estimation models are trained on images or videos that are annotated with […]

How AWS Prototyping enabled ICL-Group to build computer vision models on Amazon SageMaker

This is a customer post jointly authored by ICL and AWS employees. ICL is a multi-national manufacturing and mining corporation based in Israel that manufactures products based on unique minerals and fulfills humanity’s essential needs, primarily in three markets: agriculture, food, and engineered materials. Their mining sites use industrial equipment that has to be monitored […]