AWS Cloud Operations Blog
Category: Intermediate (200)
Manage Amazon EC2 instance clock accuracy using Amazon Time Sync Service and Amazon CloudWatch – Part 2
In part 1 of this series, I cover important concepts about measuring the accuracy of time on Amazon EC2 instances . I discussed calculating ClockErrorBound (?) and using its value as a range between which system time is accurate. In this part, I walk through the process of using Amazon CloudWatch to measure and monitor […]
Manage Amazon EC2 instance clock accuracy using Amazon Time Sync Service and Amazon CloudWatch – Part 1
This two-part series discusses the measurement and management of time accuracy on Amazon EC2 instances. Part 1 covers the important concepts related to system and reference time. Part 2 covers the mechanism of measure, monitor, and maintain accurate system time on EC2 instances. A large and diverse set of customer workloads depends on the observed […]
Use AWS Lambda and Amazon QuickSight to Build a Dashboard for AWS Health Events in Organizational View
Centralized DevOps teams responsible for the operation of Amazon Web Services (AWS) resources across an organization want to have a consistent approach for receiving and visualizing notifications for AWS Health events. It’s challenging and time-consuming to collect this data from individual accounts through email notifications, by managing separate event data, or even by manually clicking […]
Use Amazon Athena and Amazon QuickSight to build custom reports of AWS Well-Architected Reviews
AWS Well-Architected helps cloud architects build secure, high-performing, resilient, and efficient infrastructure for their applications and workloads. Based on five pillars — operational excellence, security, reliability, performance efficiency, and cost optimization — AWS Well-Architected provides a consistent approach for customers and partners to evaluate architectures, and implement designs that can scale over time. You can […]
Orchestrating multi-step, custom patch processes using AWS Systems Manager Patch Manager
The ongoing management of operating system and application-level patching is critical for ensuring that your organization’s software is up to date and meets compliance policies. Patching is not always a straightforward process. You often need to orchestrate custom procedures, workflows, and scripts to ensure that applications can be safely stopped, started, and verified during the […]
Automate AWS Backups with AWS Service Catalog
If you’re an organization with multiple AWS accounts and independent teams, cloud governance can seem a daunting task. The complexities of balancing developer velocity with centralized governance risks can slow down the innovation you’re trying to speed up. Fortunately, AWS Service Catalog, and AWS Backup help to implement a well-architected approach to self-service while meeting […]
View AWS Config rules across multiple accounts and Regions using AWS Systems Manager Explorer
AWS Systems Manager Explorer is a customizable operations dashboard that displays an aggregated view of operations data from across your AWS accounts and AWS Regions. Explorer provides context into how operational issues are distributed, trend over time, and vary by category. In this blog post, I explain how Explorer gathers the compliance status of AWS […]
How BT uses Amazon CloudWatch to monitor millions of devices
In this guest post, Ciaran Kearney, Data Engineer at multinational telecommunications company BT discusses how BT built a monitoring solution using Amazon CloudWatch dashboards, composite alarms, and embedded metric format to support the monitoring of millions of devices. Customers with high-cardinality monitoring use cases often face challenges when it comes to implementing observability. Monitoring high-cardinality workloads […]
Create immutable servers using EC2 Image Builder and AWS CodePipeline
When you run an application on multiple Amazon Elastic Compute Cloud (Amazon EC2) instances, you want to avoid differences between the instances because they can cause unpredictable behavior and make it hard to troubleshoot and solve issues. The best way to prevent differences is to replace your instances whenever you want to make a change—to […]
Viewing permission issues with service-linked roles
Each AWS service requires explicit access to resources, endpoints, and objects that reside in the domain of another service. This is referred to as the permission boundary. Services like AWS Config, Amazon Macie, and AWS GuardDuty require an AWS Identity and Access Management (IAM) role that grants access to resources outside of its control. Understanding […]