AWS Cloud Operations Blog

Category: Management Tools

Use AWS Config inventory and compliance dashboards for a unified view of resource inventory and compliance

We recently announced AWS Config compliance and inventory dashboards, a new AWS Config feature that provides unified dashboards for AWS resource configurations and compliance across AWS accounts, AWS Regions, or an AWS Organization. In this blog post, I will walk you through the dashboards and widgets included in this launch. […]

Analyzing Amazon Lex conversation log data with Amazon Managed Grafana

To support business and internal processes, organizations are increasing their use of conversational interfaces. They offer opportunities for more availability, improved service levels, and reduced costs. As these conversational services become more important, so does the need to monitor the performance and effectiveness of these interfaces with analytics and dashboards. This analysis is used to drive […]

Know Before You Go – AWS re:Invent 2023 | AWS Management Console

New this year, the AWS Customer Experience team has tips to help you enhance your re:Invent experience and learn about various improvements that make AWS even easier to use. Meet us at our kiosks in the AWS Village and be sure to check out the sessions below. Our sessions will cover best practices for managing […]

Build a Cloud Automation Practice for Operational Excellence: Best Practices from AWS Managed Services

In today’s fast-paced business environment, organizations are actively pursuing operational excellence to maintain a competitive edge. Automation is a critical foundation for achieving better efficiency, reliability, and scalability in operations. However, integrating automation into a cloud practice entails more than simply implementing software or tools. Building a cloud automation practice requires a transformative journey that […]

Creating a correction of errors document

This blog post will walk you through an example of creating a Correction of Errors (COE) document. At Amazon, operational excellence is in our DNA. One best practice that we have learned at Amazon is to have a standard mechanism for post-incident analysis. The COE process facilitates learning from an event to avoid reoccurrences in […]

Monitoring GPU workloads on Amazon EKS using AWS managed open-source services

As machine learning (ML) workloads continue to grow in popularity, many customers are looking to run them on Kubernetes with graphics processing unit (GPU) support. Amazon Elastic Compute Cloud (Amazon EC2) instances powered by NVIDIA GPUs deliver the scalable performance needed for fast ML training and cost-effective ML inference. Monitoring GPU utilization gives valuable information for researchers working […]

Announcing Amazon CloudWatch Container Insights with Enhanced Observability for Amazon EKS on EC2

Amazon CloudWatch Container Insights is a fully managed monitoring and observability service that provides DevOps engineers, developers, SREs, and IT managers with out-of-the-box visibility into their containerized applications and microservice environments. With Amazon CloudWatch Container Insights, you can monitor, isolate, and diagnose issues in your Kubernetes clusters with minimal effort. It delivers infrastructure telemetry like […]

How to email your Amazon CloudWatch dashboard

Amazon CloudWatch enables customers to collect monitoring and operational data in the form of logs, metrics, alarms, and events, thereby allowing easy workload visualization and notifications. Many customers use Amazon CloudWatch dashboards to monitor application and infrastructure insights in a single, unified view. Traditionally, operational health data was only viewable for […]

Automating Amazon EC2 Auto Scaling with Amazon CloudWatch custom metrics and AWS CDK

As customers migrate legacy workloads to the AWS Cloud, they may need to rehost or replatform applications to Amazon EC2 servers. To benefit from the scalability of the cloud, customers need to be able to scale these EC2 servers up or down, on demand and on schedule. Amazon EC2 Auto Scaling groups provide the on-demand scaling […]

Amazon Connect real-time monitoring using Amazon Managed Grafana and Amazon Timestream

Amazon Connect is an easy-to-use cloud contact center solution that helps companies of any size deliver superior customer service at a lower cost. Connect has many real-time monitoring capabilities. For requirements that go beyond those supported out of the box, Amazon Connect also provides you with data and APIs you can use to implement your […]