AWS Cloud Operations Blog

Use Amazon CloudWatch Contributor Insights for general analysis of Apache logs

Customers build, deploy, and maintain millions of web applications on AWS, and many deploy these applications using the Apache web application server. Web application performance is a key metric in modern enterprise applications. On AWS, customers leverage Amazon CloudWatch to monitor response times and uptime, and to provide SLAs. Engineering teams that run large-scale applications […]

Gain operational insights for NVIDIA GPU workloads using Amazon CloudWatch Container Insights

As machine learning models grow more advanced, they require extensive computing power to train efficiently. Many organizations are turning to GPU-accelerated Kubernetes clusters for both model training and online inference. However, properly monitoring GPU usage is critical for machine learning engineers and cluster administrators to understand model performance and to optimize infrastructure utilization. Without visibility […]

Get Disk Utilization of Your Fleet Using AWS Systems Manager Custom Inventory Types

Some of my customers need assistance while operating their Amazon Elastic Compute Cloud (Amazon EC2) infrastructure. They need to: Review the disk usage of various volumes/disks within an EC2 instance. To do it in a scalable way, one does not need to access the instance either through a Remote Desktop Session (RDP) or use […]

Resiliency Journey: exploring how AWS Resilience Hub and Migration Acceleration Program come together

In today’s rapidly evolving digital landscape, the cloud has become the backbone of innovation, scalability, and efficiency for businesses worldwide. As customers embark on their cloud migration journeys, whether the migration has been motivated by the intention of accelerating innovation, reducing operational and infrastructure costs, or exiting an on-premises data center, migrating to the cloud presents […]

Accelerate VMware Migrations to AWS using AWS Migration Hub Journeys

In January 2024, we introduced Migration Hub Journeys to guide and accelerate the migration and modernization of applications. Journeys help optimize planning, execution, and tracking through task-based templates with expert guidance, specialized tools, and cross-team collaboration, enabling you to migrate and modernize applications seamlessly. Today, we’re excited to publish new migration journey templates for AWS […]

Automate CloudWatch Dashboard creation for your AWS Elemental MediaPackage and AWS Elemental MediaLive

Monitoring the health and performance of your media services is critical to ensuring a seamless viewing experience for your customers. Amazon CloudWatch provides powerful monitoring capabilities for Amazon Web Services (AWS) resources. Setting up comprehensive dashboards can be a time-consuming process, especially for organizations managing a large number of resources across multiple regions. The Automatic CloudWatch […]

Ten Ways to Improve Your AWS Operations

When I take my car in for service for a simple oil change, the technician often reads off a litany of other services my car needs that I had put off since the previous service (and maybe the service before that, too). I tend to wait for the “check engine” light to come on […]

Project management in a cloud first world

In this blog, you will learn how to choose the right project management methodology to accelerate cloud transformations. According to the Harvard Business Review, over 70% of digital transformations fail. One of the reasons is the lack of proper governance leading to poor cross-functional alignment. To avoid this common pitfall, organizations must choose a […]

Testing Amazon Cognito backed APIs using Amazon CloudWatch Synthetics

Customers who develop APIs can control access to them using Amazon Cognito user pools as an authorizer. Testing these APIs should take into account the additional security controls in place to effectively validate that the APIs are working, and Amazon CloudWatch Synthetics enables proactive testing of these APIs. If you are using Amazon Cognito User […]

Improve application reliability with effective SLOs

At AWS, we consider reliability as a capability of services to withstand major disruptions within acceptable degradation parameters and to recover within an acceptable timeframe. Service reliability goes beyond traditional disciplines, such as availability and performance, to achieve its goal. Components of a system or application will eventually fail over time. Like our CTO Werner Vogels […]