AWS Cloud Operations Blog

Category: Intermediate (200)

Announcing AWS CloudTrail Lake – a managed audit and security Lake

Organizations managing cloud infrastructure in AWS need effective mechanisms to audit operations in their AWS accounts for security and compliance. In November 2013, we announced AWS CloudTrail as the auditing platform for AWS. Since then, millions of customers have adopted this service. We believe CloudTrail is so important to AWS customers’ success that every new […]

Managing configuration compliance across your organization with AWS Systems Manager Quick Setup

When you run applications on AWS, the number of resources you use increases as demand for those applications grows. Eventually, keeping track of your AWS resources and the relationships between them becomes challenging from a governance perspective. AWS Config lets you more easily assess, audit, and evaluate the configurations of your AWS resources. […]

Monitoring AWS Lambda errors using Amazon CloudWatch

When we troubleshoot our Lambda functions, we must often pick out the failed invocations from among all invocations, identify the root cause, and reduce mean time to resolution (MTTR). In this post, we will demonstrate how to use Amazon CloudWatch to identify failed AWS Lambda invocations. Likewise, we will show how […]

Visualize Amazon EC2 based VPN metrics with Amazon CloudWatch Logs

Organizations have many options for connecting to on-premises networks or third parties, including AWS Site-to-Site VPN. However, some organizations still need to use an Amazon Elastic Compute Cloud (Amazon EC2) instance running VPN software, such as strongSwan. Gaining insight into Amazon EC2-based VPN metrics can be challenging when compared to AWS native VPN services that […]

Create metrics and alarms for specific web pages with Amazon CloudWatch RUM

Amazon CloudWatch RUM makes it easy for AWS customers to access real-world performance metrics from web applications, thereby giving insights into the end-user experience. These user experiences are quantified into discrete metrics that you can then create alarms for. But what if you must have different load time alarms for certain pages? Or you’re testing […]

How to fix SSH issues on EC2 Linux instances using AWS Systems Manager

In a previous blog post, we provided a walkthrough of how to fix unreachable Amazon EC2 Windows instances using the EC2Rescue for Windows tool. In this blog post, I will walk you through how to use EC2Rescue for Linux to fix unreachable Linux instances. This Knowledge Center article describes how EC2Rescue for Linux can be used to […]

Identify operational issues quickly by using Grafana and Amazon CloudWatch Metrics Insights (Preview)

Amazon CloudWatch has recently launched Metrics Insights (Preview) – a fast, flexible, SQL-based query engine that enables you to identify trends and patterns across millions of operational metrics in real time. With Metrics Insights, you can easily query and analyze your metrics to gain better visibility into the health and performance of your infrastructure and large scale […]

Monitoring Service Level Objectives (“SLOs”) Made Easier with Nobl9 and Amazon CloudWatch Metrics Insights

The updated version (June 2022) that follows is based on working backward from a customer need to understand Service Level Objectives (“SLOs”) and the benefits from monitoring SLOs. This post was originally written in Nov 2021 by Natalia Sikora-Zimna, Product Owner at Nobl9. A service can be provided by infrastructure, a platform, software, or people. […]

Share your Amazon CloudWatch Dashboards with anyone using AWS Single Sign-On

Amazon CloudWatch enables customers to collect monitoring and operational data in the form of logs, metrics, alarms, and events, thereby allowing easy workload visualization and notifications. Traditionally, operational health data was viewable only by technical support staff, which made operational health opaque to a wider business audience. However, actionable and valuable business insights can […]

How Projects Can be Tracked on AWS to Increase Accountability and Reduce Cost

This post was co-authored by Amy McVey and Jarrod Lewis from AER. As AWS usage within a business increases over time, it can become difficult to track the AWS resources that have been created (e.g., EC2 instances, S3 buckets) and who is responsible for them. This can lead to unnecessary costs from resources that are […]