AWS Cloud Operations Blog

Tag: Cloud Operations

Know Before You Go – AWS re:Invent 2023 | AWS Management Console

New this year, the AWS Customer Experience team has tips to help you enhance your re:Invent experience and learn about various improvements that make AWS even easier to use. Meet us at our kiosks in the AWS Village and be sure to check out the sessions below. Our sessions will cover best practices for managing […]

Build a Cloud Automation Practice for Operational Excellence: Best Practices from AWS Managed Services

Introduction In today’s fast-paced business environment, organizations are actively pursuing operational excellence to maintain a competitive edge. Automation is a critical foundation for achieving better efficiency, reliability, and scalability in operations. However, integrating automation into cloud practice entails more than simply implementing software or tools. Building a cloud automation practice requires a transformative journey that […]

Creating a correction of errors document

This blog post will walk you through an example of creating a Correction of Errors (COE) document. At Amazon, operational excellence is in our DNA. One best practice that we have learned at Amazon is to have a standard mechanism for post-incident analysis. The COE process facilitates learning from an event to avoid reoccurrences in […]

Self-service Account Provisioning Using AWS Service Management Connector for ServiceNow

Many customers are looking to adopt a multi-account strategy within their AWS environment. This allows customers to isolate their workloads into different environments including test, dev, and production in addition to separating workloads based on regulatory requirements. As customers scale their multi-account environments, one strategy to increase agility is to offer business units their own […]

Using AWS AppConfig to Manage Multi-Tenant SaaS Configurations

As a Software as a Service (SaaS) provider, you can benefit from a SaaS operating model in a number of ways. One of the most impactful benefits you can realize is improvements to your operational efficiency, and one of the fundamental techniques you can leverage is to maintain a single software version for all your […]

Automate insights for your EC2 fleets across AWS accounts and regions

Introduction Gaining insights into and managing a large Amazon Elastic Compute Cloud (Amazon EC2) fleet that is spread across multiple accounts and regions can be a challenging task. It’s crucial to have a quick and efficient method to identify which instances are managed by AWS Systems Manager (SSM) and gather detailed information about the instances that are […]

Centralize AWS Cost Anomaly Detection using Amazon Managed Grafana

AWS Cost Anomaly Detection uses advanced machine learning to identify anomalous spend and its root causes, empowering customers to take action quickly. Currently, viewing AWS cost anomalies in AWS Cost Explorer requires the user to have IAM user access privileges on the AWS Management Console. The ability to centrally monitor and […]

Centralized Dashboard for AWS Config and AWS Security Hub

Back in July 2022, we announced AWS Config compliance scores for conformance packs, which help you quantify your compliance posture as an Amazon CloudWatch metric. The score is a quantitative measure of compliance status. Customers can have hundreds of AWS accounts where AWS Config is enabled, and each account and each AWS Region has a different compliance score. While […]

Using the Fault Tolerance Analyser Tool to Identify Potential Issues

Introduction Ensuring resilience, the ability for a system to recover from a failure induced by load, attacks, and other issues, is a shared responsibility that underpins the reliability of your workloads. While AWS provides the resilient underlying cloud infrastructure, customers are tasked with maintaining the resilience of their applications. In this landscape of joint responsibility, […]

Provision products and raise patch change requests in AWS via ServiceNow

ServiceNow is a popular cloud-based IT Service Management (ITSM) platform. Organizations use ServiceNow to manage incidents, track scheduled and planned infrastructure changes, manage new service requests, and track configuration items across IT systems. Common questions I’ve had from customers include how they can use ServiceNow to provision new instances, or how to use ServiceNow to […]