AWS Cloud Operations Blog
Category: Management Tools
Lowering MTTR with Amazon CloudWatch and AWS X-Ray
Customers running microservice-based workloads in a serverless environment often struggle to troubleshoot incidents because the data they need can be distributed across hundreds or thousands of components. In this blog post, I will demonstrate how you can reduce the mean time to resolution (MTTR, or the average time it takes to repair or mitigate […]
Unlocking the power: The keys to delivering successful Cloud Migrations
Despite the many benefits of moving to the Cloud, large enterprises frequently struggle to deliver migrations (and the related business transformation) in the planned timeframe. Why? What are the key factors that ensure a successful migration that becomes an oft-quoted industry benchmark for a Cloud-driven transformation, rather than a moribund initiative where a number […]
Self-service Account Provisioning Using AWS Service Management Connector for ServiceNow
Many customers are looking to adopt a multi-account strategy within their AWS environment. This allows customers to isolate their workloads into different environments including test, dev, and production in addition to separating workloads based on regulatory requirements. As customers scale their multi-account environments, one strategy to increase agility is to offer business units their own […]
Best practices for managing AWS account meta-data at scale
As we all know, using multiple accounts in your AWS environment is one of the recommended best practices when organizing your workloads and your environment. Using multiple accounts brings multiple benefits, allowing you to better leverage AWS services. However, AWS accounts are additional resources that you need to manage. In this blog post, you will […]
Observe dynamic sites with Amazon CloudWatch Synthetics and AWS Systems Manager Parameter Store
Maintaining and improving end user experience is key, and as your business grows, the number of endpoints you need to observe can grow quickly. It can become more challenging and time-consuming to build multiple canaries to observe them. This solution is designed to show how you can use a consistent and automated approach […]
Centralize image administration for virtual machines and containers using EC2 Image Builder
Customers may have different processes for image building across virtual machines, containers, or both. This variation in processes introduces operational overhead in managing images, including the initial configuration and the ongoing updates. From the AWS Well-Architected Operational Excellence Pillar, section “Document and share lessons learned”, these images should be standardized, configured with the latest patches, […]
Observability using native Amazon CloudWatch and AWS X-Ray for serverless modern applications
In this blog post, we will share how you can use AWS-native observability tools to measure the current state of your modern serverless applications and how to get started with minimal effort. We will review tools like Amazon CloudWatch and AWS X-Ray and explore how these services can help you instrument your application […]
Estimating Total Cost of Ownership (TCO) for modernizing workloads on AWS using Containerization – Part 2
Part one of this series described the methodology used to calculate the TCO for containerization and covered the first scenario of estimating TCO with server inventory information. In the second part, we focus on the second scenario, where we will estimate TCO with application-level information. Scenario 2: Estimating TCO with only application level […]
Monitor IoT device health at scale with Amazon Managed Grafana
Businesses today employ IoT devices to monitor the health of their equipment, ranging from machines on a factory floor to inventory tracking sensor locations. Insights from these IoT device fleets make them part of critical business infrastructure; however, deriving meaningful insights from these fleets at scale is a common challenge customers face. IT […]
Monitor Amazon EKS Control Plane metrics using AWS Open Source monitoring services
Have you encountered situations where your Kubernetes API calls are constantly throttled by the control plane? Did you see the 429 HTTP response code “Too many requests” all over the place and have no clue about what’s wrong with your cluster? In this blog post, we will talk about monitoring some of the key metrics […]