Generative AI | Containers

Accelerate container troubleshooting with the fully managed Amazon ECS MCP server (preview)

Amazon ECS today launched a fully managed, remote Model Context Protocol (MCP) server in preview, enabling AI agents to provide deep contextual knowledge of ECS workflows, APIs, and best practices for more accurate guidance throughout your application lifecycle. In this post, we walk through how to streamline your container troubleshooting using the Amazon ECS MCP server, which offers intelligent AI-assisted inspection and diagnostics through natural language queries in CLI tools like Kiro, IDEs like Cline and Cursor, and directly within the Amazon ECS console through Amazon Q.

Featured image: Amazon EKS 100K nodes per cluster

Amazon EKS enables ultra scale AI/ML workloads with support for 100K nodes per cluster

We’re excited to announce that Amazon Elastic Kubernetes Service (Amazon EKS) now supports up to 100,000 worker nodes in a single cluster, enabling customers to scale up to 1.6 million AWS Trainium accelerators or 800K NVIDIA GPUs to train and run the largest AI/ML models. This capability empowers customers to pursue their most ambitious AI […]

How Vannevar Labs cut ML inference costs by 45% using Ray on Amazon EKS

This blog is authored by Colin Putney (ML Engineer at Vannevar Labs), Shivam Dubey (Specialist SA Containers at AWS), Apoorva Kulkarni (Sr.Specialist SA, Containers at AWS), and Rama Ponnuswami (Principal Container Specialist at AWS). Vannevar Labs is a defense tech startup, successfully cut machine learning (ML) inference costs by 45% using Ray and Karpenter on Amazon Elastic Kubernetes Service (Amazon EKS). […]

Amazon EKS optimized Amazon Linux 2023 accelerated AMIs now available

Introduction Earlier this year we announced support for Amazon EKS optimized AL2023 AMIs that provided many enhancements in terms of security and performance. Amazon Linux 2023 (AL2023) is the next generation of Amazon Linux from Amazon Web Services (AWS) and is designed to provide a secure, stable, and high-performance environment to develop and run your […]

Applying Generative AI to CVE remediation – early vulnerability patching in Continuous Integration Pipelines

Cloud technologies are a rapidly evolving landscape. Securing cloud applications is everyone’s responsibility, meaning application development teams are needed to follow strict security guidelines from the earliest development stages, and to make sure of continuous security scans throughout the whole application lifecycle. The rise of generative AI enables new innovative approaches for addressing longstanding challenges with […]

Train Llama2 with AWS Trainium on Amazon EKS

Introduction Generative AI is not only transforming the way businesses function but also accelerating the pace of innovation within the broader AI field. This transformative force is redefining how businesses use technology, equipping them with capabilities to create human-like text, images, code, and audio, which were once considered beyond reach. Generative AI offers a range […]

Build Generative AI apps on Amazon ECS for SageMaker JumpStart

Introduction The rise in popularity of Generative AI (GenAI) reflects a broader shift toward intelligent automation in the business landscape, which enables enterprises to innovate at an unprecedented scale, while adhering to dynamic market demands. While the promise of GenAI is exciting, the initial steps toward its adoption can be overwhelming. This post aims to […]

Building multi-tenant JupyterHub Platforms on Amazon EKS

Introduction In recent years, there’s been a remarkable surge in the adoption of Kubernetes for data analytics and machine learning (ML) workloads in the tech industry. This increase is underpinned by a growing recognition that Kubernetes offers a reliable and scalable infrastructure to handle these demanding computational workloads. Furthermore, a recent wave of Generative AI […]

Build a multi-tenant chatbot with RAG using Amazon Bedrock and Amazon EKS

Introduction With the availability of Generative AI models, many customers are exploring ways to build chatbot applications that can cater to a wide range of their end-customers, with each instance of chatbot specializing on a specific tenant’s contextual information, and run such multi-tenant applications at scale with a cost-efficient infrastructure familiar to their development teams. […]

Deploy Generative AI Models on Amazon EKS

Introduction Generative Artificial Intelligence (Gen AI) is transforming the way businesses function and is accelerating the pace of innovation. In general, the AI field is changing the way businesses utilize technology. Generative AI technology involves tuning and deploying Large Language Models (LLM), and gives developers access to those models to execute prompts and conversations. Platform […]

Containers

Category: Generative AI