AWS Cloud Operations Blog
Category: Monitoring and observability
Lowering MTTR with Amazon CloudWatch and AWS X-Ray
Customers running microservice-based workloads in a serverless environment frequently have issues with troubleshooting incidents as the data they need can be distributed across hundreds or thousands of components. In this blog post, I will demonstrate how you can reduce the mean time to resolution (MTTR, or the average time it takes to repair or mitigate […]
Using Tag-Based Filtering to Manage AWS Health Monitoring and Alerting at Scale
AWS provides customers regular updates of service notifications and planned activities via e-mail to the root account owners or the operational, security and billing contacts. AWS also provides granular notifications to customers via AWS Health allowing them to fine-tune their alerts on issues relating directly to them. Alongside Health Dashboard’s monitoring capabilities, customers can also […]
Choice Hotels adopts Amazon Managed Service for Prometheus for operational excellence and cost efficiency
This post was co-written with Stephen Cihak, Senior Director , Abhiram Madadi, Principal Engineer and Gopi Akula, Senior Manager at Choice Hotels Who is Choice Hotels? Choice Hotels International is one of the largest lodging franchisors in the world. A challenger in the upscale segment and a leader in midscale and extended stay, Choice has […]
Best practices: Implementing observability with AWS
As customers deploy cloud-based solutions, they need to be able to ensure that systems are running smoothly, and that they can quickly remediate issues when they arise. Deploying observability at scale can be challenging for customers, especially when it involves tens and hundreds of services across their enterprise. Customers want best practice recommendations, guidance in […]
Monitor your Databricks Clusters with AWS managed open-source Services
Organizations rely heavily on cloud-based data processing and analytics platforms in today’s data-driven world to unlock valuable insights and make informed decisions. Databricks, a unified analytics platform, has emerged as a popular choice due to its seamless integration with Apache Spark, and its ability to efficiently handle large-scale data processing tasks. Many customers have implemented […]
Getting Started with CloudWatch agent and collectd
Observability helps you understand the health, usage, performance, and customer experience for your workloads. Observability can support many use cases, from detecting incidents and supporting incident resolution, to understanding the impact of new features on your users and workflow. Establishing the right solution depends on being able to gather the right data for your situation. […]
Monitor hybrid and multicloud environments using AWS Systems Manager and Amazon CloudWatch
As customers accelerate their migrations to the cloud and transform their businesses, some find themselves in situations where they have to manage IT operations in a hybrid or multicloud environment. These customers are faced with additional complexity when it comes to operating their applications and infrastructure. They often must use solutions from multiple providers to […]
Announcing AWS CDK Observability Accelerator for Amazon EKS
Today we are happy to announce the all-new AWS CDK Observability Accelerator – a set of opinionated modules to help you set up observability for your AWS environments with AWS Native services and AWS-managed observability services such as Amazon Managed Service for Prometheus, Amazon Managed Grafana, AWS Distro for OpenTelemetry (ADOT) and Amazon CloudWatch. AWS […]
Using Curated Packages and AWS managed Open Source services to observe your On Premises Kubernetes environment
Customers who run containerized workloads on Kubernetes clusters on their hardware use Amazon EKS Anywhere (Amazon EKS-A). Customers look for prescriptive guidance for the observability of their modern applications running on EKS-A. Using AWS-managed open-source services such as AWS Distro for OpenTelemetry (ADOT), Amazon Managed Service for Prometheus, and Amazon Managed Grafana helps customers to offload […]
Announcing Live Tail feature for Amazon CloudWatch Logs
Learn with Shree and Jim about the newly released Amazon CloudWatch Logs Live Tail.