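1. Solution overview
In the AWS cloud-native environment, every service offers an intuitive console and a flexible API for creating monitoring alarms, which suits users who need to integrate alarms into their own platforms. However, there is no built-in entry point for enabling alarms for a whole service in one step, so users who expect things to work out of the box face some configuration cost. Typical scenarios are:
- By default, AWS provides CloudWatch monitoring graphs for the instances of many services but configures no alarms, and customers cannot configure an alarm on a given metric for all of those instances in one step;
- After alarms have been configured instance by instance, deleting an instance does not cascade-delete its alarms;
- Each new instance needs its own alarms configured again, and in scenarios such as Auto Scaling this cannot depend on manual creation.
This solution therefore aims to provide automatic alarm creation and customized alarm messages, so that enterprises can conveniently route native AWS CloudWatch alarms to their internal alerting platforms. It provides the following features: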
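- Configure alarms uniformly on the metrics of every cluster or instance of a service, for example an EC2 alarm that fires when CPU utilization stays above 90% for 5 minutes. The only requirement is that the service's instances support CloudWatch monitoring;
- Customize alarm messages, for example displaying them in Chinese or adding urgency information, and push them to the enterprise's internal alerting platform, instant messaging tools (such as DingTalk in China, or Slack and Chime elsewhere), email, SMS, phone calls, and so on;
- Retrieve attributes of the affected instance, such as its tag groups, so that the enterprise can rank the urgency of alarms raised by different resources;
- When an instance of the service is created, create the corresponding alarm automatically;
- When an instance of the service is deleted, cascade-delete the corresponding alarm configuration automatically;
- If an instance of the service already has a corresponding alarm configured, add that instance to a whitelist: the solution neither creates an extra alarm nor overwrites the existing settings, so it complements what is already in place;
- The author has written Lambda code templates that create alarms automatically for major AWS services such as EBS, RDS, ALB, NLB, ElastiCache, ElasticSearch, and EMR;
- Deployment through the AWS CLI and through the AWS console is supported, and highly variable settings such as the alarm metric name are consolidated in one place.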
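To make the EC2 example in the list above concrete, the following is a minimal sketch of the arguments such an alarm could pass to the CloudWatch `put_metric_alarm` API. The function name, the alarm name prefix, and the sample ARN are assumptions made for illustration; they are not taken from the author's templates.

```python
from typing import Any, Dict


def build_cpu_alarm_kwargs(instance_id: str, sns_topic_arn: str) -> Dict[str, Any]:
    """Return keyword arguments for CloudWatch put_metric_alarm.

    The alarm fires when average CPU utilization is above 90% over one
    5-minute period. The alarm name prefix is a hypothetical convention.
    """
    return {
        "AlarmName": f"AutoAlarm-EC2-CPUUtilization-{instance_id}",
        "AlarmDescription": "Auto-created CloudWatch alarm for EC2 CPU utilization",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,  # seconds, i.e. 5 minutes
        "EvaluationPeriods": 1,
        "Threshold": 90.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


if __name__ == "__main__":
    kwargs = build_cpu_alarm_kwargs(
        "i-0123456789abcdef0",  # hypothetical instance ID
        "arn:aws:sns:us-west-2:111122223333:example-topic",  # hypothetical ARN
    )
    print(kwargs["AlarmName"])
    # With credentials configured, the alarm could then be created with:
    #   import boto3
    #   boto3.client("cloudwatch").put_metric_alarm(**kwargs)
```

Keeping the argument construction separate from the API call makes it possible to review thresholds without touching an AWS account.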
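2. Results
3. Overall architecture
4. Implementation approach
- AWS managed service resources that support CloudWatch monitoring are created or deleted;
- A CloudWatch scheduled rule automatically invokes Lambda 1, which fetches the list of all matching resources and the list of resources that already have CloudWatch alarms, compares the two, and thereby creates alarms and cascade-deletes them automatically;
- When a CloudWatch alarm fires, it notifies SNS 1, which passes the notification to Lambda 2 for customization. The customized message is then sent through SNS 2 to the relevant team members or to the enterprise's internal monitoring and alerting platform.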
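The comparison step performed by Lambda 1 can be sketched as a pure function. This is an illustration under stated assumptions, not the author's template: it assumes auto-created alarms are named `<prefix>-<resource id>`, and the names `plan_alarm_changes` and `AlarmPlan` are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class AlarmPlan(NamedTuple):
    to_create: List[str]  # resource IDs that need a new alarm
    to_delete: List[str]  # managed alarm names whose resource is gone


def alarm_name_for(prefix: str, resource_id: str) -> str:
    """Hypothetical naming convention for auto-created alarms."""
    return f"{prefix}-{resource_id}"


def plan_alarm_changes(
    resource_ids: Iterable[str],
    existing_alarm_names: Iterable[str],
    whitelisted_ids: Iterable[str],
    prefix: str,
) -> AlarmPlan:
    """Compare live resources with existing alarms.

    Whitelisted resources (those that already have a manually configured
    alarm) are never touched, and alarms without the managed prefix are
    never scheduled for deletion.
    """
    existing = set(existing_alarm_names)
    skip = set(whitelisted_ids)
    expected = {alarm_name_for(prefix, r): r for r in set(resource_ids)}

    to_create = sorted(
        r for name, r in expected.items() if name not in existing and r not in skip
    )
    to_delete = sorted(
        name
        for name in existing
        if name.startswith(prefix + "-") and name not in expected
    )
    return AlarmPlan(to_create, to_delete)
```

Restricting deletions to names that carry the managed prefix is one way to honour the rule that existing, manually configured alarms are left alone.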
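The deployment has 2 parts:
- Customized alarm messages
- SNS 1 → Lambda 2 → SNS 2, deployed from back to front
- Automatic alarm creation
- CloudWatch scheduled rule → Lambda 1 → CloudWatch alarm; deploy Lambda 1 first, then set up the CloudWatch scheduled rule
This article uses the RDS alarm metric DatabaseConnections as the example for the deployment guide. Because cloud developers may work on terminals on different platforms such as Windows and Mac, the solution uses the AWS console or the AWS CLI together with PowerShell scripts, to stay as platform-independent as possible.
Readers are advised to follow the deployment guide below and use the AWS console for the first deployment. To enable CloudWatch alarms in bulk for other AWS services afterwards, refer to "Auto Create Customized CloudWatch Alarms – AWS CLI deployment", which greatly shortens the time needed for deployment.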
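5. Deployment guide
5.1 Create the push endpoint for customized alarm messages – SNS 2
SNS 2 sends the customized alarm notifications to the relevant teams.
Create the SNS topic <To-DBA_team> and select the Standard type.
Subscribe to the SNS topic. Notifications can be sent by email, or delivered to an HTTP(S) endpoint through a webhook API.
It is a good idea to subscribe a personal mailbox so that you can confirm how the customized alarm message finally looks.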
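For readers who prefer scripting this step, the sketch below shows one possible boto3 equivalent of the console actions above. The helper name and the email address are assumptions; `create_topic` and `subscribe` are the standard SNS client calls, and an email subscription still has to be confirmed from the mailbox.

```python
from typing import Any


def create_topic_with_email(sns_client: Any, topic_name: str, email: str) -> str:
    """Create a standard SNS topic (or reuse it if it already exists)
    and subscribe an email address to it. Returns the topic ARN."""
    topic_arn = sns_client.create_topic(Name=topic_name)["TopicArn"]
    # The recipient must confirm the subscription before messages arrive.
    sns_client.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint=email)
    return topic_arn


# Example usage (requires AWS credentials; the address is hypothetical):
#   import boto3
#   sns = boto3.client("sns", region_name="us-west-2")
#   print(create_topic_with_email(sns, "To-DBA_team", "dba@example.com"))
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before it is pointed at a real account.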
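5.2 Create the processing script for customized alarm messages – Lambda 2
Lambda 2 customizes the alarm messages of the automatically created alarms.
5.2.1 Create the IAM role that executes Lambda
Create an IAM role and, when the console asks for a use case, select Lambda.
5.2.2 Attach permission policies: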
- Describe / Create / Drop CloudWatch alarms:
arn:aws:iam::aws:policy/CloudWatchFullAccess
arn:aws:iam::aws:policy/service-role/AWSLambdaRole
arn:aws:iam::aws:policy/AmazonSNSFullAccess
arn:aws:iam::aws:policy/AWSLambdaExecute
- Describe / Alter RDS Attributes, Add Tags to RDS:
arn:aws:iam::aws:policy/aws-service-role/AWSApplicationAutoscalingRDSClusterPolicy
Role name: lambdaExecRole-autoCreateCxCwAlarms_RDS
Role description: Lambda execution role for Auto create customized CloudWatch alarms for RDS.
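The role can also be created programmatically. The following is a sketch, assuming a boto3 IAM client is passed in; the helper names are hypothetical, and the list of policy ARNs to attach is left to the caller so that it can mirror whichever managed policies are chosen above.

```python
import json
from typing import Any, Dict, Iterable


def lambda_trust_policy() -> Dict[str, Any]:
    """Trust policy that lets the Lambda service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_lambda_role(
    iam_client: Any, role_name: str, description: str, policy_arns: Iterable[str]
) -> str:
    """Create the execution role, attach managed policies, return the role ARN."""
    role = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(lambda_trust_policy()),
        Description=description,
    )
    for policy_arn in policy_arns:
        iam_client.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return role["Role"]["Arn"]


# Example usage (requires AWS credentials):
#   import boto3
#   create_lambda_role(
#       boto3.client("iam"),
#       "lambdaExecRole-autoCreateCxCwAlarms_RDS",
#       "Lambda execution role for Auto create customized CloudWatch alarms for RDS.",
#       ["arn:aws:iam::aws:policy/CloudWatchFullAccess",
#        "arn:aws:iam::aws:policy/AmazonSNSFullAccess"],
#   )
```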
5.2.3 Create the Lambda layer
Install the pytz library, which is used to localize time zones.
anqdian@3c22fb7680e6 autoCreateCxCw % mkdir python
anqdian@3c22fb7680e6 autoCreateCxCw % ls
python
anqdian@3c22fb7680e6 autoCreateCxCw % /usr/bin/pip3 install -t ./python pytz
Collecting pytz
Using cached pytz-2021.1-py2.py3-none-any.whl (510 kB)
Installing collected packages: pytz
Successfully installed pytz-2021.1
In the python directory just created, create the file changeAlarmToLocalTimeZone.py with the following content, then package it:
import datetime
import json
import re
import urllib.parse

import pytz


def searchAvailableTimezones(zone):
    """Print every pytz time zone whose name matches the given pattern."""
    for s in pytz.all_timezones:
        if re.search(zone, s, re.IGNORECASE):
            print('Matched Zone: {}'.format(s))


def getAllAvailableTimezones():
    """Print every time zone name known to pytz."""
    for tz in pytz.all_timezones:
        print(tz)


def changeAlarmToLocalTimeZone(event, timezoneCode, localTimezoneInitial, platform_endpoint):
    tz = pytz.timezone(timezoneCode)

    # Extract the alarm event from the SNS record
    AlarmEvent = json.loads(event['Records'][0]['Sns']['Message'])

    # Extract event data such as alarm name, region, state and timestamp
    alarmName = AlarmEvent['AlarmName']
    # AlarmDescription may be missing or null
    description = AlarmEvent.get('AlarmDescription')
    reason = AlarmEvent['NewStateReason']
    region = AlarmEvent['Region']
    state = AlarmEvent['NewStateValue']
    previousState = AlarmEvent['OldStateValue']
    timestamp = AlarmEvent['StateChangeTime']
    Subject = event['Records'][0]['Sns']['Subject']
    alarmARN = AlarmEvent['AlarmArn']
    RegionID = alarmARN.split(":")[3]
    AccountID = AlarmEvent['AWSAccountId']

    # Get the datapoints substring
    pattern = re.compile(r'\[(.*?)\]')
    # Test whether the pattern matches, i.e. the reason contains datapoints
    if pattern.search(reason):
        Tempstr = pattern.findall(reason)[0]
        # Convert every datapoint timestamp in the message to the local
        # time zone, keeping the same format
        pattern = re.compile(r'\(.*?\)')
        m = pattern.finditer(Tempstr)
        for match in m:
            Tempstr = match.group()
            tempStamp = datetime.datetime.strptime(Tempstr, "(%d/%m/%y %H:%M:%S)")
            # CloudWatch timestamps are in UTC; mark them as such before converting
            tempStamp = pytz.utc.localize(tempStamp).astimezone(tz)
            tempStamp = tempStamp.strftime('%d/%m/%y %H:%M:%S')
            reason = reason.replace(Tempstr, '(' + tempStamp + ')')

    # Convert the state change timestamp to the local time zone
    timestamp = timestamp.split(".")[0]
    timestamp = datetime.datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S")
    localTimeStamp = pytz.utc.localize(timestamp).astimezone(tz)
    # Include the day of the month, which the original format string omitted
    localTimeStamp = localTimeStamp.strftime("%A %B %d, %Y %H:%M:%S")

    # Create the custom message with localized timestamps
    customMessage = 'You are receiving this email because your Amazon CloudWatch Alarm "' + alarmName + '" in the ' + region + ' region has entered the ' + state + ' state, because "' + reason + '" at "' + localTimeStamp + ' ' + localTimezoneInitial + '".'
    # Add the console link
    customMessage = customMessage + '\n\n View this alarm in the AWS Management Console: \n' + 'https://' + RegionID + '.console.aws.amazon.com/cloudwatch/home?region=' + RegionID + '#s=Alarms&alarm=' + urllib.parse.quote(alarmName)
    # Add the alarm name
    customMessage = customMessage + '\n\n Alarm Details:\n- Name:\t\t\t\t\t\t' + alarmName
    # Add the alarm description if it exists
    if description:
        customMessage = customMessage + '\n- Description:\t\t\t\t\t' + description
    customMessage = customMessage + '\n- State Change:\t\t\t\t' + previousState + ' -> ' + state
    # Add the reason for the state change
    customMessage = customMessage + '\n- Reason for State Change:\t\t' + reason
    # Add the alarm evaluation timestamp
    customMessage = customMessage + '\n- Timestamp:\t\t\t\t\t' + localTimeStamp + ' ' + localTimezoneInitial
    # Add the account ID
    customMessage = customMessage + '\n- AWS Account: \t\t\t\t' + AccountID
    # Add the alarm ARN
    customMessage = customMessage + '\n- Alarm Arn:\t\t\t\t\t' + alarmARN

    # Push the message to the SNS topic
    response = platform_endpoint.publish(
        Message=customMessage,
        Subject=Subject,
        MessageStructure='string'
    )
anqdian@3c22fb7680e6 autoCreateCxCw % zip -r SNSSubscribtion-pytzLayer.zip ./python/*
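The core of the module above is converting the UTC timestamps in a CloudWatch notification to local time. As a small standalone illustration, the sketch below performs the same conversion with only the Python standard library and a fixed UTC offset instead of pytz. This is a simplification chosen for the example: a fixed offset does not follow daylight saving time, and the function name `to_local` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_local(state_change_time: str, utc_offset_hours: int) -> str:
    """Convert a CloudWatch StateChangeTime string to a fixed-offset local time."""
    # Example input: "2020-12-04T03:57:01.659+0000"
    parsed = datetime.strptime(state_change_time, "%Y-%m-%dT%H:%M:%S.%f%z")
    local = parsed.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return local.strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print(to_local("2020-12-04T03:57:01.659+0000", 8))  # 2020-12-04 11:57:01
```

For zones with daylight saving time, a named zone such as the `Asia/Hong_Kong` value used with pytz in this article is the safer choice.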
Create the Lambda layer
Name: customizedAlarms-RDS_DatabaseConnections
Description: Customize CloudWatch alarms for RDS – DatabaseConnections.
Runtime: Python 3.8
Upload the .zip file: SNSSubscribtion-pytzLayer.zip
5.2.4 PowerShell on Mac
Download PowerShell and choose the package for macOS 10.13+:
https://github.com/PowerShell/PowerShell
When installing PowerShell on Mac, allow the installation under System Preferences → Security & Privacy.
Install the AWS Tools modules and the AWS CLI, and upgrade urllib3:
https://docs.aws.amazon.com/zh_cn/powershell/latest/userguide/pstools-getting-set-up-linux-mac.html
# Windows:
# [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
# Start PowerShell:
pwsh
Install-Module -Name AWS.Tools.Installer -Force
Install-Module -Name AWS.Tools.Common
Install-Module -Name AWS.Tools.Lambda,AWS.Tools.SecurityToken
Install-Module AWSPowerShell
Install-Module AWSLambdaPSCore
pip install --upgrade "urllib3==1.26" awscli
5.2.5 Deploy Lambda 2
Prepare the following 4 files: Deploy.ps1, index.py, requirements.txt, and setup.cfg, and place them in a dedicated folder named autoCreateCxCw_RDS-Lambda2.
Run Deploy.ps1 in PowerShell to deploy the Lambda function.
Deploy.ps1:
Set-DefaultAWSRegion -Region <us-west-2>
Set-Location -Path $PSScriptRoot

$ZipFileName = 'lambda2-autoCreateCxCw.zip'

Write-Host -Object 'Restoring dependencies ...'
pip3 install -r $PSScriptRoot/requirements.txt -t $PSScriptRoot/

Write-Host -Object 'Compressing files ...'
Get-ChildItem -Recurse | ForEach-Object -Process {
    $NewPath = $PSItem.FullName.Substring($PSScriptRoot.Length + 1)
    zip -u "$PSScriptRoot/$ZipFileName" $NewPath
    # Windows:
    # Compress-Archive -Path $NewPath -Update -DestinationPath "$PSScriptRoot\$ZipFileName"
}

Write-Host -Object 'Deploying Lambda function'
$Function = @{
    FunctionName = 'CustomizeCloudWatchAlarmsNotifications-RDS_DatabaseConnections'
    Runtime = 'python3.8'
    Description = 'Customize CloudWatch alarms notification for RDS - DatabaseConnections. '
    ZipFilename = $ZipFileName
    Handler = 'index.lambda_handler'
    Role = '<arn:aws:iam::532134256174:role/lambdaExecRole-autoCreateCxCwAlarms_RDS>'
    Environment_Variable = @{
        NotificationSNSTopic = '<arn:aws:sns:us-west-2:532134256174:To-DBA_team>'
        TimeZoneCode = 'Asia/Hong_Kong'
        TimezoneInitial = 'UTC+8'
        # CHIME_WEBHOOK = 'https://hooks.chime.aws/incomingwebhooks/3c8fd66f-6e40-4375-9fe8-0ba6a57cb375?token=aWVuczdtTUd8MXxCZC05SmNIZ3RqUFMydXpydllNTUx2em15WU5YZVNrX0ZodWc3THljdFg0'
    }
    MemorySize = 512
    Timeout = 60
    Layer = "<arn:aws:lambda:us-west-2:532134256174:layer:customizedAlarms-RDS_DatabaseConnections:1>"
}

Remove-LMFunction -FunctionName $Function.FunctionName -Force
Publish-LMFunction @Function
Write-Host -Object 'Deployment completed' -ForegroundColor Green
index.py:
import os

import boto3

from changeAlarmToLocalTimeZone import *

# Get the SNS topic ARN from the environment variables
NotificationSNSTopic = os.environ['NotificationSNSTopic']
# Get the time zone code for your local time zone from the environment variables
timezoneCode = os.environ['TimeZoneCode']
# Get your local time zone initials (e.g. UTC+2, IST, AEST) from the environment variables
localTimezoneInitial = os.environ['TimezoneInitial']

# Get the SNS resource using boto3
SNS = boto3.resource('sns')
# Specify, by ARN, the SNS topic to push messages to
platform_endpoint = SNS.PlatformEndpoint(NotificationSNSTopic)


def lambda_handler(event, context):
    # Call the main function
    changeAlarmToLocalTimeZone(event, timezoneCode, localTimezoneInitial, platform_endpoint)
    # Print all available time zones
    # getAllAvailableTimezones()
    # Search whether a time zone or country exists
    # searchAvailableTimezones('sy')
requirements.txt:
requests
setup.cfg:
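5.2.6 Tune the Lambda 2 settings
Runtime settings: the file prepared for deployment is named index.py and lambda_handler is the Lambda entry point, so make sure the handler under "Runtime settings" in Lambda is index.lambda_handler.
The following JSON can be used as a test event. Because the notifications that SNS actually sends differ from this JSON, it is not suitable for testing connectivity between Lambda 2 and the upstream SNS 1.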
{
"Records": [
{
"EventSource": "aws:sns",
"EventVersion": "1.0",
"EventSubscriptionArn": "arn:aws:lambda:us-west-2:532134256174:function:CustomizeCloudWatchAlarmsNotifications-RDS_DatabaseConnections",
"Sns": {
"Type": "Notification",
"MessageId": "f9f5ed56-3d38-57c8-b4ea-b51588f5f871",
"TopicArn": "arn:aws:sns:us-west-2:532134256174:customizedAlarmAction-RDS_DatabaseConnections",
"Subject": "ALARM: \"Test LocalTime\" in China, Asia (Hong Kong)",
"Message": "{\"AlarmName\":\"RDS_DatabaseConnections\",\"AlarmDescription\":\"Auto-created customized CloudWatch Alarm <RDS_DatabaseConnections>\",\"AWSAccountId\":\"532134256174\",\"NewStateValue\":\"ALARM\",\"NewStateReason\":\"Threshold Crossed: 1 out of the last 1 datapoints [0.0 (04/12/20 03:56:00)] was greater than or equal to the threshold (0.0) (minimum 1 datapoint for OK -> ALARM transition).\",\"StateChangeTime\":\"2020-12-04T03:57:01.659+0000\",\"Region\":\"US West (Oregon)\",\"AlarmArn\":\"arn:aws:cloudwatch:us-west:532134256174:alarm:RDS_DatabaseConnections LocalTime\",\"OldStateValue\":\"OK\",\"Trigger\":{\"Period\":60,\"EvaluationPeriods\":1,\"ComparisonOperator\":\"GreaterThanOrEqualToThreshold\",\"Threshold\":0.0,\"TreatMissingData\":\"- TreatMissingData: missing\",\"EvaluateLowSampleCountPercentile\":\"\",\"Metrics\":[{\"Expression\":\"FILL(m1, 0)\",\"Id\":\"e1\",\"Label\":\"Expression1\",\"ReturnData\":true},{\"Id\":\"m1\",\"MetricStat\":{\"Metric\":{\"Dimensions\":[{\"value\":\"API\",\"name\":\"Type\"},{\"value\":\"DescribeAlarms\",\"name\":\"Resource\"},{\"value\":\"CloudWatch\",\"name\":\"Service\"},{\"value\":\"None\",\"name\":\"Class\"}],\"MetricName\":\"CallCount\",\"Namespace\":\"AWS/Usage\"},\"Period\":60,\"Stat\":\"Average\"},\"ReturnData\":false}]}}",
"Timestamp": "2020-12-04T03:57:01.702Z",
"SignatureVersion": "1",
"Signature": "WcgVMPrlQsJY3yqbds968tqKPC6KKDWHSjIwEmzKVHZYg6foN9F5sm2Tp5IWPgaM9wMmYg8dpQjkxSm4q9V9iP1PbLp81RgJS2NghdeHNVnyxyzywXFMDztYZpgB2pjzfT101RVGpUwVPntOpBeBq2KAs/NrFX1nS2aTK/OX+gyOxwYZxRftzd+ttHA+PCh0kKlym7nnxaWuO9hgSrnupH2YttuvsdTSAOZ4MGhBON/sMmmlcxzfiFD+jJaqlHFmQ0DncjSe1NNwceOpwNsue6//sMYU1QzV6bO34I343KmQdXYw/KISDz7qH70Odm7nRLN3ExSOhtC/FS0/dXGl4Q==",
"SigningCertUrl": "https://sns.us-west-2.amazonaws.com/SimpleNotificationService-010a507c1833636cd94bdb98bd93083a.pem",
"UnsubscribeUrl": "https://sns.us-west-2.amazonaws.com/?Action=Unsubscribe&SubscriptionArn=arn:aws:sns:us-west-2:532134256174:customizedAlarmAction-RDS_DatabaseConnections",
"MessageAttributes": {}
}
}
]
}
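Note that in the test event above the `Message` field is a JSON-encoded string rather than a nested object, which is why Lambda 2 calls `json.loads` on it. A small sketch of building such an event programmatically follows; the helper name and the sample values are assumptions, and only the fields that Lambda 2 reads are included.

```python
import json
from typing import Any, Dict


def build_sns_test_event(alarm: Dict[str, Any], subject: str, topic_arn: str) -> Dict[str, Any]:
    """Wrap a CloudWatch alarm payload in a minimal SNS-to-Lambda event.

    The alarm payload is serialized to a string, matching how SNS
    delivers the Message field to a Lambda subscriber.
    """
    return {
        "Records": [
            {
                "EventSource": "aws:sns",
                "EventVersion": "1.0",
                "Sns": {
                    "Type": "Notification",
                    "TopicArn": topic_arn,
                    "Subject": subject,
                    "Message": json.dumps(alarm),
                },
            }
        ]
    }


if __name__ == "__main__":
    event = build_sns_test_event(
        {"AlarmName": "RDS_DatabaseConnections", "NewStateValue": "ALARM"},
        'ALARM: "Test LocalTime"',
        "arn:aws:sns:us-west-2:111122223333:example-topic",  # hypothetical ARN
    )
    # The handler recovers the alarm with json.loads, as Lambda 2 does.
    print(json.loads(event["Records"][0]["Sns"]["Message"])["AlarmName"])
```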
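5.3 Customized alarm messages – SNS 1
SNS 1 receives alarm notifications and forwards them to Lambda 2, which customizes them.
Create the SNS topic <customizedAlarmAction-RDS_DatabaseConnections> and select the Standard type.
Create a subscription on the SNS topic, using the ARN of the Lambda function <CustomizeCloudWatchAlarmsNotifications-RDS_DatabaseConnections> as the endpoint.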
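When this subscription is scripted instead of created in the console, the Lambda function also needs a resource-based permission that allows SNS to invoke it. The sketch below shows one way to do both with boto3 clients passed in as parameters; the helper name and the default statement ID are assumptions.

```python
from typing import Any


def subscribe_lambda_to_topic(
    sns_client: Any,
    lambda_client: Any,
    topic_arn: str,
    function_arn: str,
    statement_id: str = "AllowInvokeFromSNS",  # hypothetical statement ID
) -> str:
    """Allow the topic to invoke the function, then subscribe the function."""
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=statement_id,
        Action="lambda:InvokeFunction",
        Principal="sns.amazonaws.com",
        SourceArn=topic_arn,  # restrict invocation to this topic only
    )
    response = sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol="lambda",
        Endpoint=function_arn,
        ReturnSubscriptionArn=True,
    )
    return response["SubscriptionArn"]


# Example usage (requires AWS credentials; ARNs are placeholders):
#   import boto3
#   subscribe_lambda_to_topic(
#       boto3.client("sns"), boto3.client("lambda"),
#       "<SNS 1 topic ARN>", "<Lambda 2 function ARN>",
#   )
```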
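This completes the first part of the solution, customized alarm messages.
5.4 Automatic alarm creation – Lambda 1
Lambda 1 automatically creates the specified alarms for all instances of the chosen AWS managed service.
5.4.1 Deploy Lambda 1
Prepare the following 4 files: Deploy.ps1, index.py, requirements.txt, and setup.cfg, and place them in a dedicated folder named autoCreateCxCw_RDS-Lambda1.
Run Deploy.ps1 in PowerShell to deploy the Lambda function.
Deploy.ps1: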
Set-DefaultAWSRegion -Region <us-west-2>
Set-Location -Path $PSScriptRoot

$ZipFileName = 'lambda1-autoCreateCxCw.zip'

Write-Host -Object 'Restoring dependencies ...'
pip3 install -r $PSScriptRoot/requirements.txt -t $PSScriptRoot/

Write-Host -Object 'Compressing files ...'
Get-ChildItem -Recurse | ForEach-Object -Process {
    $NewPath = $PSItem.FullName.Substring($PSScriptRoot.Length + 1)
    zip -u "$PSScriptRoot/$ZipFileName" $NewPath
    # Windows:
    # Compress-Archive -Path $NewPath -Update -DestinationPath "$PSScriptRoot\$ZipFileName"
}

Write-Host -Object 'Deploying Lambda function'
$Function = @{
    FunctionName = 'AutoCreateCloudWatchAlarms-RDS_DatabaseConnections'
    Runtime = 'python3.8'
    Description = 'Auto create customized CloudWatch alarms for RDS - DatabaseConnections. '
    ZipFilename = $ZipFileName
    Handler = 'index.handler'
    Role = '<arn:aws:iam::532134256174:role/lambdaExecRole-autoCreateCxCwAlarms_RDS>'
    Environment_Variable = @{
        MetricName = 'DatabaseConnections'
        MaxItems = '3'
        SNS_topic_suffix = 'RDS_DatabaseConnections'
        # CHIME_WEBHOOK = 'https://hooks.chime.aws/incomingwebhooks/3c8fd66f-6e40-4375-9fe8-0ba6a57cb375?token=aWVuczdtTUd8MXxCZC05SmNIZ3RqUFMydXpydllNTUx2em15WU5YZVNrX0ZodWc3THljdFg0'
    }
    MemorySize = 512
    Timeout = 60
}

Remove-LMFunction -FunctionName $Function.FunctionName -Force
Publish-LMFunction @Function
Write-Host -Object 'Deployment completed' -ForegroundColor Green
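We recommend first setting MaxItems, the maximum number of RDS CloudWatch alarms to create, to 3, so that the effect can be tested before the solution is rolled out fully.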
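As an illustration of how a handler might consume the three environment variables set in Deploy.ps1, here is a minimal sketch. The `AlarmConfig` class and both helper functions are hypothetical, and deriving the SNS 1 topic name as `customizedAlarmAction-` plus `SNS_topic_suffix` is an assumption inferred from the topic name used in this article, not something confirmed by the author's templates.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Sequence


@dataclass(frozen=True)
class AlarmConfig:
    metric_name: str
    max_items: int
    sns_topic_name: str


def load_config(env: Mapping[str, str] = os.environ) -> AlarmConfig:
    """Read the settings that Deploy.ps1 passes as environment variables."""
    max_items = int(env.get("MaxItems", "3"))
    if max_items < 1:
        raise ValueError("MaxItems must be a positive integer")
    return AlarmConfig(
        metric_name=env["MetricName"],
        max_items=max_items,
        # Assumed naming convention for the SNS 1 topic
        sns_topic_name="customizedAlarmAction-" + env["SNS_topic_suffix"],
    )


def limit_targets(resource_ids: Sequence[str], config: AlarmConfig) -> List[str]:
    """Cap the number of resources that receive alarms during a trial run."""
    return list(resource_ids)[: config.max_items]
```

Capping the target list this way keeps a first trial small; raising MaxItems later widens the rollout without code changes.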
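index.py:
The author has adapted the solution to commonly used AWS services such as RDS, ElasticSearch, ElastiCache, EMR, ELB, and EBS. For example, to create automatic alarms on the CPU utilization of RDS instances, complete the following two steps:
- Use the "RDS – CPUUtilization" template as the content of index.py for Lambda 1, and check and adjust the alarm threshold it specifies;
- In the Environment_Variable block of Deploy.ps1 for Lambda 1, set the MetricName environment variable to the corresponding CloudWatch metric name (MetricName = 'CPUUtilization').
GitHub links to the Lambda code templates that create alarms automatically for some commonly used AWS services: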
RDS – DatabaseConnections:
RDS DatabaseConnections – vProd
RDS Truncate – vProd
RDS – CPUUtilization:
RDS CPUUtilization – vProd
ElasticSearch – JVMMemoryPressure:
ES JVMMemoryPressure – vProd
ElastiCache – DatabaseMemoryUsagePercentage:
EC DatabaseMemoryUsagePercentage – vProd
EMR – HDFSUtilization:
EMR HDFSUtilization – vProd
ALB – HTTPCode_Target_5XX_Count:
ALB HTTPCode_Target_5XX_Count – vProd SourceCode
ALB Truncate – vProd
NLB – ActiveFlowCount:
NLB ActiveFlowCount – vProd SourceCode
EBS – BurstBalance:
EBS BurstBalance – vProd
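requirements.txt:
requests
setup.cfg:
5.4.2 Tune the Lambda 1 settings
- Runtime settings: the file prepared for deployment is named index.py and handler is the Lambda entry point, so make sure the handler under "Runtime settings" in Lambda is index.handler.
Test the Lambda function
Configure a test event in Lambda and choose sns-notification as the event template. You can also paste the following real alarm notification into the test event template.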
{
"Records": [
{
"EventSource": "aws:sns",
"EventVersion": "1.0",
"EventSubscriptionArn": "arn:aws:sns:us-west-2:532134256174:chimewebhook:9809d03a-21a0-4aba-8a2f-d2554cdeac34",
"Sns": {
"Type": "Notification",
"MessageId": "c07ee68e-9dfb-5b65-924e-becec206c0f1",
"TopicArn": "arn:aws:sns:us-west-2:532134256174:chimewebhook",
"Subject": "ALARM: 'CW-chime' in US West (Oregon)",
"Message": {
"AlarmName":"CW-chime",
"AlarmDescription":null,
"AWSAccountId":"532134256174",
"NewStateValue":"ALARM",
"NewStateReason":"Threshold Crossed: 1 out of the last 1 datapoints [1.0 (01/12/20 15:12:00)] was greater than or equal to the threshold (1.0) (minimum 1 datapoint for OK -> ALARM transition).",
"StateChangeTime":"2020-12-01T15:14:05.915+0000",
"Region":"US West (Oregon)",
"AlarmArn":"arn:aws:cloudwatch:us-west-2:532134256174:alarm:CW-chime",
"OldStateValue":"INSUFFICIENT_DATA",
"Trigger":{
"MetricName":"HTTPCode_ELB_5XX_Count",
"Namespace":"AWS/ApplicationELB",
"StatisticType":"Statistic",
"Statistic":"AVERAGE",
"Unit":null,
"Dimensions":[{
"value":"app/ELB-CW-SearchTest/860a113ea68c543f",
"name":"LoadBalancer"
}],
"Period":60,
"EvaluationPeriods":1,
"ComparisonOperator":"GreaterThanOrEqualToThreshold",
"Threshold":1.0,
"TreatMissingData":"- TreatMissingData: missing",
"EvaluateLowSampleCountPercentile":""
}
},
"Timestamp": "2020-12-01T15:14:05.969Z",
"SignatureVersion": "1",
"Signature": "jD0bB3UVT7Boy/SEyVkDy0JCNynjkeMBb4WlqG7Vm3+HDatnXDQrBHAayQ8VQmDgyA9pbdESKeJufdhE77R/73dQ+XX27CnsMQore46J+dNTqEeIKwThT8lmZtUWGypu1fPxpVFZl8eKcZhqjN5pK+OC8u+KdglnJGPkFok/UHZLwMe321oVVvxQznEF/zJGRC+tEUd+3aN/IlaNNZHjFduFnOt0WZDvAK42/3jnsfEk2DzpE7hsRd2+eUJfRIZbCnpxdsFZnsh42fzt44mjNXAoqk9TjbddWtaS5ERkS5vuJHTNDSqes1xCJIgpljkaCO4xQbRg/ZH0+4dGyNDSXw==",
"SigningCertUrl": "https://sns.us-west-2.amazonaws.com/SimpleNotificationService-010a507c1833636cd94bdb98bd93083a.pem",
"UnsubscribeUrl": "https://sns.us-west-2.amazonaws.com/?Action=Unsubscribe&SubscriptionArn=arn:aws:sns:us-west-2:532134256174:chimewebhook:9809d03a-21a0-4aba-8a2f-d2554cdeac34",
"MessageAttributes": {}
}
}
]
}
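5.5 Automatic alarm creation – CloudWatch scheduled rule
Create a CloudWatch scheduled rule that periodically invokes Lambda 1 to create alarms.
Create the CloudWatch Events rule:
Event source: Schedule
Fixed rate: 1 minute
Target: Lambda function
Name: AutoCreateCloudWatchAlarms
Description: Scheduler to run Lambda function <AutoCreateCloudWatchAlarms> every 1 min.
Note that once the rule is created, the Lambda function <AutoCreateCloudWatchAlarms-RDS_DatabaseConnections> runs once every minute, so decide deliberately whether to enable the rule. In addition, to test the effect before rolling the solution out fully, refer to the section "Automatic alarm creation – Lambda 1" and, in the <Prepare target RDS list> code segment, set the maximum number of RDS CloudWatch alarms to create to 2.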
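The same rule can be created through the API. The following sketch assumes boto3 `events` and `lambda` clients are passed in and creates the rule disabled by default, so that enabling it stays a deliberate choice; the helper name and the statement ID format are hypothetical.

```python
from typing import Any


def schedule_lambda(
    events_client: Any,
    lambda_client: Any,
    rule_name: str,
    function_arn: str,
    enabled: bool = False,
) -> str:
    """Create a 1-minute scheduled rule that targets the Lambda function."""
    rule_arn = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression="rate(1 minute)",
        State="ENABLED" if enabled else "DISABLED",
        Description="Scheduler to run the alarm-creation Lambda every 1 min.",
    )["RuleArn"]
    # Allow the rule to invoke the function.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
    return rule_arn


# Example usage (requires AWS credentials; the ARN is a placeholder):
#   import boto3
#   schedule_lambda(boto3.client("events"), boto3.client("lambda"),
#                   "AutoCreateCloudWatchAlarms", "<Lambda 1 function ARN>")
```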
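With the steps above complete, the functionality this article set out to build is in place. Later, depending on the enterprise's needs, the function code in index.py can be modified directly in Lambda so that it deploys more customized alarms automatically and better fits the enterprise's own alerting methods.
The boto3 developer documentation is linked below; it contains ample guidance and typical examples for reference.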
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html
About the author