The Internet of Things on AWS – Official Blog

How to build smart applications using Protocol Buffers with AWS IoT Core

Introduction to Protocol Buffers

Protocol Buffers, or Protobuf, provide a platform-neutral approach for serializing structured data. Protobuf plays a role similar to JSON, but its encoded messages are smaller and faster to parse, and its tooling can automatically generate bindings in your preferred programming language.

AWS IoT Core is a managed service that lets you connect billions of IoT devices and route trillions of messages to AWS services, enabling you to scale your application to millions of devices seamlessly. With AWS IoT Core and Protobuf integration, you can also benefit from Protobuf’s lean data serialization protocol and automated code binding generation.

Agility and security in IoT with Protobuf code generation

A key advantage comes from the ease and security of software development using Protobuf’s code generator. You can write a schema to describe messages exchanged between the components of your application. A code generator (protoc or others) interprets the schema and implements the encoding and decoding function in your programming language of choice. Protobuf’s code generators are well maintained and widely used, resulting in robust, battle-tested code.

Automated code generation frees developers from writing the encoding and decoding functions and ensures compatibility between programming languages. Combined with the recently launched AWS IoT Core Rules Engine support for the Protocol Buffers messaging format, you can have a producer application written in C running on your device and an AWS Lambda function consumer written in Python, both using generated bindings.

Other advantages of using Protocol Buffers over JSON with AWS IoT Core are:

  • Schema and validation: The schema is enforced by both the sender and the receiver, which keeps the two sides in agreement on the message structure. Because messages are encoded and decoded by auto-generated code, a whole class of hand-written serialization bugs is avoided.
  • Adaptability: The schema can evolve. You can change message content while maintaining backward and forward compatibility.
  • Bandwidth optimization: For the same content, message length is smaller using Protobuf, since field names are not sent with every message, only compact field numbers and data. Over time this provides better device autonomy and less bandwidth usage. Recent research on Messaging Protocols and Serialization Formats revealed that a Protobuf formatted message can be up to 10 times smaller than its equivalent JSON formatted message. This means fewer bytes go over the wire to transmit the same content.
  • Efficient decoding: Decoding Protobuf messages is more efficient than decoding JSON, which means recipient functions run in less time. A benchmark run by Auth0 revealed that Protobuf can be up to 6 times faster than JSON for equivalent message payloads.
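To make the adaptability point concrete, here is a minimal sketch of how a reader can stay forward compatible: it walks the Protobuf wire format and ignores field numbers it does not know. This is a simplified illustration written with only the Python standard library, not the code that protoc generates. The payload and its field numbers (1, 2, and a newer field 9) are hypothetical.

```python
from __future__ import annotations


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return result, pos
        shift += 7


def read_known_fields(buf: bytes, known: set[int]) -> dict[int, object]:
    """Return {field_number: raw_value} for known fields, skipping the rest."""
    fields: dict[int, object] = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited (strings, nested messages)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if number in known:  # unknown fields are parsed past and dropped
            fields[number] = value
    return fields


if __name__ == "__main__":
    # Hypothetical payload: field 1 = 1, field 2 = "T-1", newer field 9 = 42.
    payload = bytes.fromhex("0801" "1203542d31" "482a")
    print(read_known_fields(payload, {1, 2}))      # older reader
    print(read_known_fields(payload, {1, 2, 9}))   # newer reader
```

An older reader that only knows fields 1 and 2 still parses the message produced by a newer writer, which is the property that lets devices and backends be upgraded independently.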

This blog post will walk you through deploying a sample application that publishes messages to AWS IoT Core using Protobuf format. The messages are then selectively filtered by an AWS IoT Core Rules Engine rule.

Let’s review some of the basics of Protobuf.

Protocol Buffers in a nutshell

The message schema is a key element of Protobuf. A schema may look like this:

syntax = "proto3";

import "google/protobuf/timestamp.proto";

message Telemetry
{
  enum MsgType
  {
    MSGTYPE_NORMAL = 0;
    MSGTYPE_ALERT = 1;
  }
  MsgType msgType = 1;
  string instrumentTag = 2;
  google.protobuf.Timestamp timestamp = 3;
  double value = 4;
}

The first line of the schema defines the version of Protocol Buffers you are using. This post will use proto3 version syntax, but proto2 is also supported.

The import statement makes the well-known google.protobuf.Timestamp type available. The message keyword then starts the definition of a new message called Telemetry.

This message in particular has four distinct fields:

  • A msgType field, which is of type MsgType and can only take on enumerated values "MSGTYPE_NORMAL" or "MSGTYPE_ALERT"
  • An instrumentTag field, which is of type string and identifies the measuring instrument sending telemetry data
  • A timestamp field of type google.protobuf.Timestamp which indicates the time of the measurement
  • A value field of type double which contains the value measured

Please consult the complete documentation for all possible data types and additional information on the syntax.

A Telemetry message written in JSON looks like this:

{
  "msgType": "MSGTYPE_ALERT",
  "instrumentTag": "Temperature-001",
  "timestamp": 1676059669,
  "value": 72.5
}

The same message using Protocol Buffers (encoded as hexadecimal for display purposes) looks like this:

0801120F54656D70657261747572652D3030311A060895C89A9F06210000000000205240

Note that the JSON representation of the message is 115 bytes, versus the Protobuf one at only 36 bytes.
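To see where those 36 bytes come from, the following sketch hand-encodes the sample message using only the Python standard library. It is an illustration of the wire format under simplifying assumptions, not the code protoc generates: it always writes every field (real proto3 serializers omit default values) and it only writes the seconds part of the timestamp. The function names are made up for this example.

```python
import json
import struct


def varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def key(field_number: int, wire_type: int) -> bytes:
    """Field key: the field number and wire type packed into one varint."""
    return varint((field_number << 3) | wire_type)


def encode_telemetry(msg_type: int, tag: str, seconds: int, value: float) -> bytes:
    tag_bytes = tag.encode("utf-8")
    timestamp = key(1, 0) + varint(seconds)  # nested Timestamp, seconds only
    return (
        key(1, 0) + varint(msg_type)                          # msgType (enum)
        + key(2, 2) + varint(len(tag_bytes)) + tag_bytes      # instrumentTag
        + key(3, 2) + varint(len(timestamp)) + timestamp      # timestamp
        + key(4, 1) + struct.pack("<d", value)                # value (double)
    )


encoded = encode_telemetry(1, "Temperature-001", 1676059669, 72.5)
compact_json = json.dumps(
    {
        "msgType": "MSGTYPE_ALERT",
        "instrumentTag": "Temperature-001",
        "timestamp": 1676059669,
        "value": 72.5,
    },
    separators=(",", ":"),  # no whitespace, the smallest JSON form
)

print(encoded.hex().upper())
print(len(encoded), "bytes as Protobuf")
print(len(compact_json), "bytes as compact JSON")
```

Even against JSON with all whitespace removed, the Protobuf encoding is smaller, because each field is identified by a one-byte key instead of a quoted name.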

Once the schema is defined, protoc can be used to:

  1. Create bindings in your programming language of choice
  2. Create a FileDescriptorSet, which is used by AWS IoT Core to decode received messages

Using Protocol Buffers with AWS IoT Core

Protobuf can be used in multiple ways with AWS IoT Core. The simplest way is to publish the message as binary payload and have recipient applications decode it. This is already supported by AWS IoT Core Rules Engine and works for any binary payload, not just Protobuf.

However, you get the most value when you want to decode Protobuf messages for filtering and forwarding. Filtered messages can be forwarded as Protobuf, or even decoded to JSON for compatibility with applications that only understand this format.

The recently launched AWS IoT Rules Engine support for Protocol Buffer messaging format allows you to do just that with minimal effort, in a managed way. In the following sections we will guide you through deploying and running a sample application.

Prerequisites
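To run this sample application, you need the tools used in the steps below: an AWS account with the AWS CLI installed and configured, Python 3 with pip, and the Protocol Buffers compiler (protoc).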

Sample application: Filtering and forwarding Protobuf messages as JSON

To deploy and run the sample application, we will perform 7 simple steps:

  1. Download the sample code and install Python requirements
  2. Configure your IOT_ENDPOINT and AWS_REGION environment variables
  3. Use protoc to generate Python bindings and message descriptors
  4. Run a simulated device using Python and the Protobuf generated code bindings
  5. Create AWS Resources using AWS CloudFormation and upload the Protobuf file descriptor
  6. Inspect the AWS IoT Rule that matches, filters and republishes Protobuf messages as JSON
  7. Verify transformed messages are being republished

Step 1: Download the sample code and install Python requirements

To run the sample application, you need to download the code and install its dependencies:

  • Download the sample application from our AWS GitHub repository: https://github.com/aws-samples/aws-iotcore-protobuf-sample
  • If you downloaded it as a ZIP file, extract it
  • To install the necessary Python requirements, run the following command within the folder of the extracted sample application
pip install -r requirements.txt

The command above will install two required Python dependencies: boto3 (the AWS SDK for Python) and protobuf.

Step 2: Configure your IOT_ENDPOINT and AWS_REGION environment variables

Our simulated IoT device will connect to the AWS IoT Core endpoint to send Protobuf formatted messages.

If you are running Linux or macOS, run the following commands. Make sure to replace <AWS_REGION> with the AWS Region of your choice.

export AWS_REGION=<AWS_REGION>
export IOT_ENDPOINT=$(aws iot describe-endpoint --endpoint-type iot:Data-ATS --query endpointAddress --region $AWS_REGION --output text)
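A Python program can pick up these two variables and fail early with a clear message if either is missing. The sketch below shows one way to do that. The helper name load_iot_settings and the IotSettings type are hypothetical and are not taken from the sample repository.

```python
import os
from typing import Mapping, NamedTuple


class IotSettings(NamedTuple):
    region: str
    endpoint: str


def load_iot_settings(env: Mapping[str, str] = os.environ) -> IotSettings:
    """Read AWS_REGION and IOT_ENDPOINT, raising if either is unset or empty."""
    missing = [name for name in ("AWS_REGION", "IOT_ENDPOINT") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return IotSettings(region=env["AWS_REGION"], endpoint=env["IOT_ENDPOINT"])


if __name__ == "__main__":
    try:
        print(load_iot_settings())
    except RuntimeError as exc:
        print(exc)
```

Passing the environment as a parameter keeps the function easy to check with a plain dictionary, without touching the real shell environment.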

Step 3: Use protoc to generate Python bindings and message descriptor

The extracted sample application contains a file named msg.proto similar to the schema example we presented earlier.

Run the commands below to generate the code bindings your simulated device will use and the file descriptor.

protoc --python_out=. msg.proto
protoc --include_imports -o filedescriptor.desc msg.proto

After running these commands, you should see in your current folder two new files:

filedescriptor.desc msg_pb2.py

Step 4: Run the simulated device using Python and the Protobuf generated code bindings

The extracted sample application contains a file named simulate_device.py.

To start a simulated device, run the following command:

python3 simulate_device.py

Verify that messages are being sent to AWS IoT Core using the MQTT Test Client on the AWS console.

Subscribe to a topic

  1. Access the AWS IoT Core service console: https://console.aws.amazon.com/iot; make sure you are in the correct AWS Region.
  2. Under Test, select MQTT test client.
  3. Under the Topic filter, fill in test/telemetry_all
  4. Expand the Additional configuration section and under MQTT payload display select Display raw payloads.
  5. Click Subscribe and watch as Protobuf formatted messages arrive into the AWS IoT Core MQTT broker.

View subscriptions

Step 5: Create AWS Resources using AWS CloudFormation and upload the Protobuf file descriptor

The extracted sample application contains an AWS CloudFormation template named support-infrastructure-template.yaml.

This template defines an Amazon S3 Bucket, an AWS IAM Role and an AWS IoT Rule.

Run the following command to deploy the CloudFormation template to your AWS account. Make sure to replace <YOUR_BUCKET_NAME> and <AWS_REGION> with a unique name for your S3 Bucket and the AWS Region of your choice.

aws cloudformation create-stack --stack-name IotBlogPostSample \
--template-body file://support-infrastructure-template.yaml \
--capabilities CAPABILITY_NAMED_IAM \
--parameters ParameterKey=FileDescriptorBucketName,ParameterValue=<YOUR_BUCKET_NAME> \
--region=<AWS_REGION>

AWS IoT Core’s support for Protobuf formatted messages requires the file descriptor we generated with protoc. To make it available we will upload it to the created S3 bucket.

Run the following command to upload the file descriptor. Make sure to replace <YOUR_BUCKET_NAME> with the same name you chose when deploying the CloudFormation template.

aws s3 cp filedescriptor.desc s3://<YOUR_BUCKET_NAME>/msg/filedescriptor.desc

Step 6: Inspect the AWS IoT Rule that matches, filters, and republishes Protobuf messages as JSON

Let’s assume you want to filter messages that have a msgType of MSGTYPE_ALERT, because these indicate there might be dangerous operating conditions. The CloudFormation template creates an AWS IoT Rule that decodes the Protobuf formatted messages our simulated device sends to AWS IoT Core, selects those that are alerts, and republishes them in JSON format to another MQTT topic that responders can subscribe to.

To inspect the AWS IoT Rule, perform the following steps:

  1. Access the AWS IoT Core service console: https://console.aws.amazon.com/iot
  2. On the left-side menu, under Message Routing, click Rules
  3. The list will contain an AWS IoT Rule named ProtobufAlertRule; click it to view the details
  4. Under SQL statement, note the query shown; we will go over the meaning of each element shortly
  5. Under Actions, note the single action to Republish to AWS IoT topic
SELECT
  VALUE decode(encode(*, 'base64'), "proto", "<YOUR_BUCKET_NAME>", "msg/filedescriptor.desc", "msg", "Telemetry")
FROM
  'test/telemetry_all'
WHERE
  decode(encode(*, 'base64'), "proto", "<YOUR_BUCKET_NAME>", "msg/filedescriptor.desc", "msg", "Telemetry").msgType = 'MSGTYPE_ALERT'

This SQL statement does the following:

  • The SELECT VALUE decode(...) indicates that the entire decoded Protobuf payload will be republished to the destination AWS IoT topic as a JSON payload. If you wish to forward the message still in Protobuf format, you can replace this with a simple SELECT *
  • The WHERE decode(...).msgType = 'MSGTYPE_ALERT' will decode the incoming Protobuf formatted message and only messages containing field msgType with value MSGTYPE_ALERT will be forwarded
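The effect of this rule can be pictured with a small local emulation. The sketch below is not how the Rules Engine is implemented. It assumes the messages have already been decoded into Python dictionaries, and the function name alert_rule is made up for this illustration.

```python
import json
from typing import Iterable, Iterator


def alert_rule(decoded_messages: Iterable[dict]) -> Iterator[str]:
    """Yield a JSON payload for each decoded message whose msgType is an alert."""
    for message in decoded_messages:
        # Mirrors: WHERE decode(...).msgType = 'MSGTYPE_ALERT'
        if message.get("msgType") == "MSGTYPE_ALERT":
            # Mirrors: SELECT VALUE decode(...), the whole message as JSON
            yield json.dumps(message)


if __name__ == "__main__":
    incoming = [
        {"msgType": "MSGTYPE_NORMAL", "instrumentTag": "Temperature-001", "value": 70.1},
        {"msgType": "MSGTYPE_ALERT", "instrumentTag": "Temperature-001", "value": 72.5},
    ]
    for payload in alert_rule(incoming):
        print(payload)
```

Only the second message passes the filter, and it leaves as a JSON string, which is what a subscriber on the destination topic would receive.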

Step 7: Verify transformed messages are being republished

If you click on the single action present in this AWS IoT Rule, you will note that it republishes messages to the test/telemetry_alerts topic.

Republish to AWS IoT topic

The destination topic test/telemetry_alerts is part of the definition of the AWS IoT Rule action, available in the AWS CloudFormation template of the sample application.

To subscribe to the topic and see if JSON formatted messages are republished, follow these steps:

  1. Access the AWS IoT Core service console: https://console.aws.amazon.com/iot
  2. Under Test, select MQTT test client
  3. Under the Topic filter, fill in test/telemetry_alerts
  4. Expand the Additional configuration section and under MQTT payload display make sure Auto-format JSON payloads option is selected
  5. Click Subscribe and watch as JSON-converted messages with msgType MSGTYPE_ALERT arrive

If you inspect the code of the simulated device, you will notice approximately 20% of the simulated messages are of MSGTYPE_ALERT type and messages are sent every 5 seconds. You may have to wait to see an alert message arrive.
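As a rough illustration of that behavior, the sketch below generates messages with a 20% chance of being an alert and estimates how long you might wait for one. It is not the code from simulate_device.py. The function names, the instrument tag, and the value range are assumptions made for this example.

```python
import random
import time

ALERT_PROBABILITY = 0.2      # roughly 20% of messages are alerts
SEND_INTERVAL_SECONDS = 5    # one message every 5 seconds


def next_message(rng: random.Random, now: float) -> dict:
    """Build one simulated telemetry reading as a plain dictionary."""
    is_alert = rng.random() < ALERT_PROBABILITY
    return {
        "msgType": "MSGTYPE_ALERT" if is_alert else "MSGTYPE_NORMAL",
        "instrumentTag": "Temperature-001",
        "timestamp": int(now),
        "value": round(rng.uniform(60.0, 80.0), 1),  # hypothetical value range
    }


def average_seconds_between_alerts() -> float:
    """On average 1 / ALERT_PROBABILITY messages are needed per alert."""
    return SEND_INTERVAL_SECONDS / ALERT_PROBABILITY


if __name__ == "__main__":
    print(next_message(random.Random(), time.time()))
    print(average_seconds_between_alerts(), "seconds between alerts on average")
```

With these numbers an alert shows up about every 25 seconds on average, so a short wait in the MQTT test client is expected.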

View the decoded alerts

Clean Up

To clean up after running this sample, run the commands below:

# delete the file descriptor object from the Amazon S3 Bucket
aws s3 rm s3://<YOUR_BUCKET_NAME>/msg/filedescriptor.desc

# detach all policies from the IoT service role
aws iam detach-role-policy --role-name IoTCoreServiceSampleRole \
  --policy-arn $(aws iam list-attached-role-policies --role-name IoTCoreServiceSampleRole --query 'AttachedPolicies[0].PolicyArn' --output text)

# delete the AWS CloudFormation Stack
aws cloudformation delete-stack --stack-name IotBlogPostSample

Conclusion

As shown, working with Protobuf on AWS IoT Core is as simple as writing a SQL statement. Protobuf messages provide advantages over JSON both in terms of cost savings (reduced bandwidth usage, greater device autonomy) and ease of development in any of the protoc supported programming languages.

For additional details on decoding Protobuf formatted messages using AWS IoT Core Rules Engine, consult the AWS IoT Core documentation.

The example code can be found in the GitHub repository: https://github.com/aws-samples/aws-iotcore-protobuf-sample.

The decode function is particularly useful when forwarding data to Amazon Kinesis Data Firehose since it will accept JSON input without the need for you to write an AWS Lambda Function to perform the decoding.

For additional details on available service integrations for AWS IoT Rule actions, consult the AWS IoT Rule actions documentation.


About the authors




José Gardiazabal is a Prototyping Architect with the Prototyping And Cloud Engineering team at AWS where he helps customers realize their full potential by showing the art of the possible on AWS. He holds a BEng. degree in Electronics and a Doctoral degree in Computer Science. He has previously worked in the development of medical hardware and software.




Donato Azevedo is a Prototyping Architect with the Prototyping And Cloud Engineering team at AWS where he helps customers realize their full potential by showing the art of the possible on AWS. He holds a BEng. degree in Control Engineering and has previously worked with Industrial Automation for Oil & Gas and Metals & Mining companies.