AWS for M&E Blog
Source monitoring for AWS Elemental MediaLive via Amazon Rekognition
Introduction
AWS Elemental MediaLive is a broadcast-grade video encoder with a comprehensive suite of metrics that can be monitored for stream quality and health, along with a full set of alerts that can be acted upon to ensure the best viewer experience. There are scenarios, however, where a source is present on a MediaLive input but the video is impaired in a way that does not trigger an alert. For example, an upstream source may be routed incorrectly to an unused input on a router, or a playout server may freeze, producing black video, color bars, or frozen frames. These incidents can go undetected unless an operator is watching the channel. The following solution can help.
The solution
By default, MediaLive generates input source thumbnails for all encode pipelines, and it is these thumbnails that are retrieved and passed to Amazon Rekognition for analysis. No model training is required to use the image properties analysis feature. The following is an example of the first few lines of a typical response to an "IMAGE_PROPERTIES" request:
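(The snippet below is an illustrative sketch trimmed to the quality section; the exact values vary from image to image.)

```json
{
    "ImageProperties": {
        "Quality": {
            "Brightness": 43.36,
            "Sharpness": 9.33,
            "Contrast": 79.67
        }
    }
}
```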
A vast amount of information can be returned regarding the color make-up of the supplied image. We are only interested in the quality section at the top, as these constantly changing values provide enough information for detection. The values can be extracted, summed, and pushed to Amazon CloudWatch to produce a graph over time. If a black frame is received, the values all fall to near zero and remain at that level while the input stays black. A similar principle catches an input that has frozen, as the values remain constant at a non-zero level.
With the addition of some CloudWatch math, we can generate a difference graph of the values and trigger an alarm when the calculated metric fails to change for a number of consecutive data points. This can be used to alert operators to a problem or to take automatic action, such as switching to another input source. Two consecutive data points have been chosen in this example rather than three, as the first of a group of three will generally be the spike produced as the image changes to the black or frozen frame.
Following is an example of a simple version of the solution, using Amazon EventBridge Scheduler to trigger an AWS Lambda function at regular intervals:
EventBridge Scheduler can be configured to invoke the function at most once every minute, meaning it takes over 3 minutes to obtain the data points for a confident alarm trigger. This is useful for a proof of concept or for low-tier channels; a more advanced solution with greatly increased detection speed is detailed later.
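As a sketch of the scheduling piece, assuming EventBridge Scheduler is used to invoke the function (the schedule name, function ARN, and role ARN below are placeholders for your own resources):

```python
import boto3

scheduler = boto3.client("scheduler")

# Invoke the monitoring Lambda function once per minute (the minimum rate).
# The ARNs are placeholders; the role must allow EventBridge Scheduler to
# invoke the function.
scheduler.create_schedule(
    Name="medialive-thumbnail-monitor",
    ScheduleExpression="rate(1 minute)",
    FlexibleTimeWindow={"Mode": "OFF"},
    Target={
        "Arn": "arn:aws:lambda:us-east-1:111122223333:function:thumbnail-monitor",
        "RoleArn": "arn:aws:iam::111122223333:role/scheduler-invoke-lambda",
    },
)
```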
In the previous CloudWatch example, the ImageProperties metric is the sum of Brightness, Contrast, and Sharpness, as combined in the Lambda script. The Difference metric is generated using the following CloudWatch math expression: ABS(DIFF(m1)), where m1 is ImageProperties. This expression calculates the absolute difference from the previous sample.
The same expression is used when configuring a CloudWatch alarm. Setting the alarm to trigger when the Difference metric falls below a threshold of approximately 10 to 15 is a good starting point.
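A sketch of that alarm in boto3 follows; the namespace, metric dimensions, and channel ID are assumptions chosen to match the Lambda sketch below, and the threshold should be tuned to your content:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when ABS(DIFF(m1)) stays at or below the threshold for two
# consecutive data points, suggesting a black or frozen input.
cloudwatch.put_metric_alarm(
    AlarmName="medialive-static-input-pipeline-0",
    Metrics=[
        {
            "Id": "m1",
            "MetricStat": {
                "Metric": {
                    "Namespace": "MediaLiveSourceMonitor",  # assumed namespace
                    "MetricName": "ImageProperties",
                    "Dimensions": [
                        {"Name": "ChannelId", "Value": "1234567"},  # placeholder
                        {"Name": "Pipeline", "Value": "0"},
                    ],
                },
                "Period": 60,
                "Stat": "Average",
            },
            "ReturnData": False,
        },
        {
            "Id": "e1",
            "Expression": "ABS(DIFF(m1))",
            "Label": "Difference",
            "ReturnData": True,
        },
    ],
    ComparisonOperator="LessThanOrEqualToThreshold",
    Threshold=15,
    EvaluationPeriods=2,
    DatapointsToAlarm=2,
    TreatMissingData="breaching",
)
```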
The basic Python Lambda code to demonstrate this is as follows:
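The listing below is an illustrative sketch rather than a definitive implementation: it assumes the MediaLive DescribeThumbnails API returns the thumbnail body base64 encoded, and the environment variable names and metric dimensions are placeholders.

```python
import base64
import os

import boto3

CHANNEL_ID = os.environ["CHANNEL_ID"]        # MediaLive channel ID
PIPELINE_ID = os.environ["PIPELINE_ID"]      # pipeline number, e.g. "0"
NAMESPACE = os.environ["METRIC_NAMESPACE"]   # CloudWatch namespace

medialive = boto3.client("medialive")
rekognition = boto3.client("rekognition")
cloudwatch = boto3.client("cloudwatch")


def lambda_handler(event, context):
    # Fetch the current input thumbnail for the chosen pipeline.
    thumbs = medialive.describe_thumbnails(
        ChannelId=CHANNEL_ID,
        PipelineId=PIPELINE_ID,
        ThumbnailType="CURRENT_ACTIVE",
    )
    body = thumbs["ThumbnailDetails"][0]["Thumbnails"][0]["Body"]

    # Ask Rekognition for the image properties of the thumbnail.
    response = rekognition.detect_labels(
        Image={"Bytes": base64.b64decode(body)},
        Features=["IMAGE_PROPERTIES"],
    )
    quality = response["ImageProperties"]["Quality"]

    # Sum the quality values and publish them as a single metric.
    value = quality["Brightness"] + quality["Sharpness"] + quality["Contrast"]
    cloudwatch.put_metric_data(
        Namespace=NAMESPACE,
        MetricData=[
            {
                "MetricName": "ImageProperties",
                "Dimensions": [
                    {"Name": "ChannelId", "Value": CHANNEL_ID},
                    {"Name": "Pipeline", "Value": PIPELINE_ID},
                ],
                "Value": value,
            }
        ],
    )
    return {"ImageProperties": value}
```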
Note that you will need to configure the environment variables for the MediaLive channel ID, pipeline number, and CloudWatch namespace before using the Lambda function.
Advanced solution
For lower-latency detection on high-tier content, and to handle a larger number of channels more effectively, we can build on the previous solution with AWS Fargate.
The updated solution can take more frequent samples, for example every 10 seconds, bringing the detection time down to just 30 seconds to trigger an alert. A separate Python thread for each channel is used in place of the EventBridge Scheduler / Lambda pair. As MediaLive channels are started and stopped, the events are passed via Amazon EventBridge to an Amazon SQS queue to automatically create or destroy the threads, ensuring channels are only monitored while they are running. High-resolution metrics are employed instead of the standard one-minute metrics.
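As a sketch of how those channel events might be routed (the rule name and queue ARN are placeholders, and the queue still needs a resource policy allowing EventBridge to send to it):

```python
import json

import boto3

events = boto3.client("events")

# Forward MediaLive channel RUNNING/STOPPED state changes to an SQS queue,
# where the Fargate task picks them up to create or destroy monitor threads.
events.put_rule(
    Name="medialive-channel-state-change",
    EventPattern=json.dumps(
        {
            "source": ["aws.medialive"],
            "detail-type": ["MediaLive Channel State Change"],
            "detail": {"state": ["RUNNING", "STOPPED"]},
        }
    ),
)
events.put_targets(
    Rule="medialive-channel-state-change",
    Targets=[
        {
            "Id": "monitor-queue",
            "Arn": "arn:aws:sqs:us-east-1:111122223333:medialive-monitor-queue",
        }
    ],
)
```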
The remainder of the solution remains as before, with alarms configured on the CloudWatch math metric, but there is no reason why the difference detection could not be done in the Python code itself, emitting a simple zero or one to trigger a CloudWatch alarm. A similar principle could be employed to monitor the audio levels emitted by MediaLive to CloudWatch for silence detection.
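A rough sketch of that variation, with an in-memory cache of the previous value per channel (the function name, threshold, and metric name are illustrative):

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

DIFF_THRESHOLD = 15   # tune per content, as with the CloudWatch alarm above
last_value = {}       # previous summed quality value per channel


def publish_static_flag(namespace, channel_id, value):
    """Emit 1 when the summed quality value has stopped changing, else 0."""
    previous = last_value.get(channel_id)
    static = int(previous is not None and abs(value - previous) <= DIFF_THRESHOLD)
    last_value[channel_id] = value

    cloudwatch.put_metric_data(
        Namespace=namespace,
        MetricData=[
            {
                "MetricName": "StaticInput",
                "Dimensions": [{"Name": "ChannelId", "Value": channel_id}],
                "Value": static,
                "StorageResolution": 1,  # high-resolution metric
            }
        ],
    )
```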
The advanced solution is available on GitHub.
Summary
The two solutions described in this blog add monitoring confidence on MediaLive channels, above and beyond that already provided by the service. They can also serve as building blocks for additional monitoring, tailored to specific customer requirements where appropriate.