
Analyze EKS Fargate costs using Amazon QuickSight

Introduction

AWS Fargate is a serverless compute engine for running Amazon Elastic Kubernetes Service (Amazon EKS) and Amazon Elastic Container Service (Amazon ECS) workloads without managing the underlying infrastructure. AWS Fargate makes it easy to provision and scale secure, isolated, and right-sized compute capacity for containerized applications. As a result, teams are increasingly choosing AWS Fargate to run workloads in their Kubernetes clusters.

It is a common practice for multiple teams to share a single Kubernetes cluster. In such cases, cluster administrators often need to allocate costs based on each team’s resource usage. Amazon EKS customers can deploy the Amazon EKS optimized bundle of Kubecost for cluster cost visibility when using Amazon EC2. In this post, however, we show you how to analyze the costs of running workloads on EKS Fargate using the data in the AWS Cost and Usage Report (CUR). Using Amazon QuickSight, you can visualize your AWS Fargate spend and allocate cost by cluster, namespace, and deployment.

Cost allocation in Amazon EKS on AWS Fargate

With AWS Fargate, there are no upfront costs and you pay only for the resources you use. AWS Fargate bills you for the amount of CPU, memory, and storage resources consumed by your applications. The cost and usage of your AWS Fargate resources are made available in the AWS Cost and Usage Report.

The CUR is your one-stop shop for accessing the most detailed information available about your AWS costs and usage. It can be generated at an hourly, daily, or monthly granularity. Each report contains line items for each unique combination of AWS product, usage type, and operation that you use in your account. You can use the CUR to break down your AWS bill by period, product or product resource, and custom tags.

In this post, we’ll use AWS Glue DataBrew to process the data in the CUR without writing any code. AWS Glue DataBrew is a visual data preparation tool that makes it easy to clean and normalize data. Then, we’ll use the processed data to visualize EKS Fargate costs using Amazon QuickSight, a serverless business intelligence (BI) service.

In this post, we include steps to create a dashboard that visualizes EKS Fargate costs by Amazon EKS cluster, Kubernetes namespace, or Kubernetes deployment during a specific timeframe.

Solution overview

Solution architecture: AWS Cost and Usage Report -> primary Amazon S3 bucket -> AWS Glue DataBrew -> secondary Amazon S3 bucket -> Amazon QuickSight visualization

The overarching steps to visualize EKS Fargate costs are as follows:

  1. Generate a billing report with AWS Cost and Usage Reports (CUR), stored in a primary Amazon S3 bucket.
  2. Modify the report for resource-level granularity using AWS Glue DataBrew and store the modified report in a secondary Amazon S3 bucket.
  3. Import the dataset into Amazon QuickSight.
  4. Create an Amazon QuickSight dashboard and visualizations.

Prerequisites

You’ll need the following before proceeding:

  • An AWS account
  • An Amazon EKS cluster with pods running on AWS Fargate.
  • CSV-formatted CUR reports in an Amazon Simple Storage Service (Amazon S3) bucket
    • The CUR should be configured with report data integration for Amazon QuickSight. This creates compressed csv.gz files that can be easily integrated during later steps.
    • NOTE: Skip to the Verify CUR Report step if you already have a CUR generated in your account in the recommended format.
  • A secondary Amazon S3 bucket to store modified CUR data
    • See SETUP 1 for an AWS Command Line Interface (AWS CLI) command to create an Amazon S3 bucket.
  • Permissions to set up AWS Glue DataBrew
  • An Amazon QuickSight Enterprise account. For instructions, see Setting Up Amazon QuickSight.

Walkthrough

Generate CUR Report

If you have not already generated a CUR, there are two options for configuring and creating the report. Make sure you have access to the AWS Billing console and to Cost and Usage Reports.

  • Option 1: Follow all steps in this documentation.
  • Option 2: Use AWS CLI commands. Follow SETUP 1 and SETUP 2 completely, as described.

SETUP 1: Creating an Amazon S3 bucket with AWS CLI and adding a bucket policy.

Let’s define the essential variables.

export ACCOUNT_ID=$(aws sts get-caller-identity | jq .Account | tr -d '"')
export BUCKET_NAME=<Your-unique-S3-bucket-name>

To create an Amazon S3 bucket, use the following AWS CLI command.

aws s3api create-bucket --bucket $BUCKET_NAME --region us-east-1
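The command above works as-is in us-east-1. If you prefer to create the bucket in a different Region, note that s3api requires a LocationConstraint outside us-east-1; a hedged variant (eu-west-1 here is only an illustrative example) looks like this:

aws s3api create-bucket --bucket $BUCKET_NAME --region eu-west-1 --create-bucket-configuration LocationConstraint=eu-west-1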

Let’s add the appropriate bucket permissions to allow AWS Billing service to send CUR reports to this Amazon S3 bucket. First, create a bucket policy document locally using the exported environment variables.

cat > policy.json <<EOF
{
	"Version": "2008-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Principal": {
				"Service": "billingreports.amazonaws.com"
			},
			"Action": [
				"s3:GetBucketAcl",
				"s3:GetBucketPolicy"
			],
			"Resource": "arn:aws:s3:::$BUCKET_NAME",
			"Condition": {
				"StringEquals": {
					"aws:SourceArn": "arn:aws:cur:us-east-1:$ACCOUNT_ID:definition/*",
					"aws:SourceAccount": "$ACCOUNT_ID"
				}
			}
		},
		{
			"Effect": "Allow",
			"Principal": {
				"Service": "billingreports.amazonaws.com"
			},
			"Action": [
				"s3:PutObject"
			],
			"Resource": "arn:aws:s3:::$BUCKET_NAME/*",
			"Condition": {
				"StringEquals": {
					"aws:SourceArn": "arn:aws:cur:us-east-1:$ACCOUNT_ID:definition/*",
					"aws:SourceAccount": "$ACCOUNT_ID"
				}
			}
		}
	]
}
EOF

To add the bucket policy to the Amazon S3 bucket, use the following AWS CLI command.

aws s3api put-bucket-policy --bucket $BUCKET_NAME --policy file://policy.json
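To confirm the policy was attached, you can read it back; this simply prints the policy document you just applied.

aws s3api get-bucket-policy --bucket $BUCKET_NAME --query Policy --output text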

In the next setup phase, we’ll create the CUR report.

SETUP 2: Creating CUR Reports with AWS CLI

First, create a report definition JSON file locally. Modify the following parameters: ReportName, S3Bucket, S3Prefix (e.g., fargate_report), and S3Region. Ensure that the Amazon S3 parameters match where you would like the CUR report delivered.

cat > report-definition.json <<EOF
{
	"ReportName": "eksfargatecurreport",
	"TimeUnit": "HOURLY",
	"Format": "textORcsv",
	"Compression": "GZIP",
	"AdditionalSchemaElements": [
		"RESOURCES"
	],
	"S3Bucket": "$BUCKET_NAME",
	"S3Prefix": "cur",
	"S3Region": "us-east-1",
	"AdditionalArtifacts": [
		"QUICKSIGHT"
	],
	"RefreshClosedReports": true,
	"ReportVersioning": "OVERWRITE_REPORT"
}
EOF

Use the following AWS CLI command to create your CUR report definition.

aws cur put-report-definition --region us-east-1 --report-definition file://report-definition.json

Note: If there is an error about bucket permissions, then follow the bucket policy steps under SETUP 1.
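To double-check from the command line, you can list your report definitions; the Cost and Usage Reports API is served from us-east-1.

aws cur describe-report-definitions --region us-east-1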

Verify CUR Report

Navigate to the Cost and Usage Reports tab under Billing. The following shows the console view of the CUR report after creation.

CUR report configuration details

Note: It can take up to 24 hours for AWS to start delivering reports to your Amazon S3 bucket. Wait until reports have populated before proceeding.
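Once delivery begins, one way to confirm that report objects have arrived is to list the prefix configured in SETUP 2 (here, the cur prefix and eksfargatecurreport report name from the example definition).

aws s3 ls s3://$BUCKET_NAME/cur/eksfargatecurreport/ --recursive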

Upload a Recipe to AWS Glue DataBrew

Let’s start by configuring AWS Glue DataBrew to process the CUR.

In the raw CUR, Amazon Resource Names (ARNs) are provided in this format:

arn:aws:eks:[region]:[accountID]:pod/[clusterName]/[namespace]/[deploymentId]/[podId]

The data needs to be modified before it can be used in filters and metrics on an Amazon QuickSight dashboard. We’ll use a recipe in AWS Glue DataBrew to create separate columns for clusterName, namespace, deploymentId, and podId. The recipe is compatible with .CSV files; if your files are in Parquet format, a few edits are needed to make it work.
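To illustrate what the recipe extracts, here is a rough shell sketch of how one of these ARNs breaks apart on the "/" separator. The ARN values are made up, and the actual splitting is performed by the DataBrew recipe, not by this command.

# Hypothetical EKS Fargate pod ARN, following the format shown above
RESOURCE_ID="arn:aws:eks:us-east-1:111122223333:pod/my-cluster/my-namespace/my-deployment-7c9f8/my-deployment-7c9f8-abcde"
# Split on "/" to show the columns the recipe produces
echo "$RESOURCE_ID" | awk -F'/' '{print "clusterName:  "$2"\nnamespace:    "$3"\ndeploymentId: "$4"\npodId:        "$5}'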

Download billing-recipe.json from GitHub. Navigate to the Recipes tab in the AWS Glue DataBrew console, then select the Upload recipe button. Enter a recipe name and description in the corresponding input boxes (an example is provided).

Upload billing recipe view on the AWS Glue DataBrew console.

Upload the JSON file, then create and publish the recipe. Once that completes, select the recipe to view the steps.

Note: If your CUR is saved in Parquet format, then use parquet-recipe.json. The rest of the process remains the same.

AWS Glue DataBrew recipe steps

The 12 steps standardize the date column formats for Amazon QuickSight, split the ResourceId column into subsections, rename the resulting columns accordingly, and remove any data unrelated to AWS Fargate usage.

Create a project to transform raw data

Select the Projects tab on the left. Choose Create Project and enter a project name, then select the Import steps from recipe checkbox. Select the uploaded recipe (billing-recipe.json). This will be applied to the chosen dataset.

AWS Glue DataBrew project details console view

Connect your CUR data to AWS Glue DataBrew

We next create an AWS Glue DataBrew dataset for Amazon S3 files to be picked up monthly.

Scroll down in the new Project wizard and select New dataset. Enter a dataset name and scroll down to Connect to new dataset. Select your source Amazon S3 bucket (i.e., where your CUR lands) and navigate to the folder that contains the CUR file. Then, select the .CSV CUR file.

The path will look similar to:

[s3bucket]/[prefix]/[reportname]/yearmonthdate-yearmonthdate/reportname.csv.gz

Console view of source dataset configuration for AWS Glue DataBrew

For our use case, we only need files from the last month. We need to parameterize the path by replacing the changing date and time portion with the corresponding parameter. We’ll create a monthly job later, and when it runs, it only runs on the file with the most recent month’s CUR data.

Highlight the date and time in the location field, and select Create custom parameter.

Console view of AWS Glue DataBrew to create a custom parameter.

Now you can define a parameter based on the file date. For Parameter name, type in file_creation_time. For Type, select Date.

AWS Glue DataBrew needs information on the date and time format within the Amazon S3 path. We are only using the year and month, so we need to define the right format. For Date format, select Custom. Enter yyyyMMdd-yyyyMMdd, then choose Create.

Console view of AWS Glue DataBrew for date and time format configuration.

Notice the new parameter reflected in the Amazon S3 location field instead of the original name, as well as all matching files containing the date and time in the Matching files section. We’ll further refine the parameter to only pull data from the last month.

Console view of AWS Glue DataBrew showing matching files based on the configured custom parameter.
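At this point, the Amazon S3 location will look roughly like the following; the concrete key and dates are illustrative, and {file_creation_time} stands for the parameter we just created.

cur/eksfargatecurreport/20230401-20230501/eksfargatecurreport.csv.gz

becomes, after parameterization, something along the lines of:

cur/eksfargatecurreport/{file_creation_time}/eksfargatecurreport.csv.gz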

Under the Time range drop-down list, select Past month, and then select Save.

Console view of AWS Glue DataBrew showing the time range value to edit.

After applying this filter, the dataset contains one file from the past month. Scroll further down the new Project wizard and open the Sampling tab. From the dropdown, select Random rows as the type and choose how many rows you would like to sample. You’ll have that number of rows to visualize how the recipe modified the dataset.

Console view of AWS Glue DataBrew showing sampling configuration.

Finally, at the bottom of the new Project wizard, add your permissions. You can either choose an existing AWS Identity and Access Management (AWS IAM) role or create a new one. For this post, we chose Create new IAM role.

Console view of AWS Glue DataBrew showing AWS IAM role permissions configuration

Select Create project.

Creating a monthly job to automate data augmentation

Once the project finishes initializing, we can create an AWS Glue DataBrew job that runs monthly. In the project, in the top right of the console, select Create job and enter a Job name. Under Job output, enter your secondary Amazon S3 bucket. Ensure the output file type is set to CSV, with the same delimiters as shown below.

Console view of AWS Glue DataBrew showing the monthly job creation step.

Under Associated schedules, expand the tab, select Create schedule, and enter a corresponding name. Under Run frequency, choose Enter CRON. A CRON expression defines a time-based schedule.

  • For this post, we configured the job to run on the first day of each month, using the CRON expression below (the fields break down as shown after this list).
  • 0 8 1 * ? *
  • Underneath, you can preview the next five occurrences on which the job will run.
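For reference, reading that expression field by field (minutes, hours, day of month, month, day of week, year), 0 8 1 * ? * schedules the job for 08:00 UTC on day 1 of every month, on any day of the week, every year.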

Console view of AWS Glue DataBrew showing monthly job schedule configuration.

Under Permissions, select Create new IAM role.

Console view of AWS Glue DataBrew showing AWS IAM role configuration for the monthly job

Select Create job.

NOTE: If you have existing CUR data and wish to test the job, select Create and run job.

When the job runs on schedule, it deposits the modified dataset into the secondary Amazon S3 bucket you defined.
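If you also want to check run history from the command line, the AWS CLI exposes the job runs; the job name below is a placeholder for the name you chose.

aws databrew list-job-runs --name <your-job-name>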

Import data into Amazon QuickSight

Next, we need to create an Amazon QuickSight dataset with the new formatted data in our secondary Amazon S3 bucket.

  1. First, create a manifest file locally (written in JSON) instructing Amazon QuickSight where to find the data. In our case, it specifies the Amazon S3 bucket with data that needs to be added to your Amazon QuickSight analysis. For more information on the manifest file, see the Amazon QuickSight Documentation.
cat > demo-manifest.json <<EOF
{ 
	"fileLocations": [
		{
			"URIPrefixes": [
 				"https://s3.us-east-1.amazonaws.com/your-bucket-name/"
 			]
		}
	],
	"globalUploadSettings": {
		"format": “CSV”
	}
}
EOF

  2. Once you have created the manifest file, upload it to your secondary Amazon S3 bucket (example CLI commands for the upload are shown after the animation below). After it has been uploaded, copy the object URL. Now, navigate to the Amazon QuickSight console, then select Datasets and New dataset. The GIF below shows how to create a new Amazon S3 dataset using the object URL. If you encounter AWS IAM permission errors or error codes saying that the manifest is malformed, then ensure Amazon QuickSight has been given permission to access the secondary Amazon S3 bucket on the Amazon QuickSight Security and Permissions settings page.

Animation: importing data into Amazon QuickSight
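For reference, the upload from step 2 can also be handled from the CLI; the bucket name below is a placeholder for your secondary bucket.

aws s3 cp demo-manifest.json s3://<your-secondary-bucket-name>/demo-manifest.json

If the bucket is in us-east-1, the object URL then follows the usual virtual-hosted pattern, for example https://<your-secondary-bucket-name>.s3.us-east-1.amazonaws.com/demo-manifest.json.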

Create Amazon QuickSight graphs

After creating the dataset, you’ll be directed to the Amazon QuickSight analysis page. In the following section, we show example visualizations you can create in the Amazon QuickSight console to explore the Cost and Usage Report data for EKS Fargate. Underneath each example, we show which fields are on each axis and which filters have been used.

Invoice spend by cluster name

Invoice spend by cluster name

Filters:

  • lineItem/Operation – include FargatePod
  • lineItem_ResourceId_clusterName – include all

Axes:

  • X axis: lineItem_ResourceId_clusterName
  • Value: lineItem/BlendedCost (Sum)

Invoice spend by cluster name over time

Invoice spend by cluster name over time

Filters:

  • lineItem_ResourceId_clusterName – include all
  • bill/BillingPeriodStartDate – between Apr 1, 2023 and Apr 30, 2023

Axes:

  • X axis: lineItem/UsageStartDate
  • Value: lineItem/BlendedCost (Sum)

Group/color:

  • lineItem_ResourceId_clusterName

AWS Fargate costs by cluster name sheet (example)

AWS Fargate costs by cluster name sheet example

The filters listed previously can be applied to more granular analyses, with additional filters built on top.


Fargate cost by deployment ID example

Cost

Primary cost drivers in this solution are AWS Glue DataBrew and Amazon QuickSight. With these services, you only pay for what you use and don’t have to worry about managing the underlying infrastructure to run transformations and generate visualizations.

In Amazon QuickSight, authors can create and share dashboards with other users in the account, and readers can explore interactive dashboards, receive email reports, and download data. With the Enterprise edition, authors are charged $24/month, and readers are charged $0.30/session up to a maximum of $5/month.

In AWS Glue DataBrew, one job runs per month. For the time that a job is running, you are charged an hourly rate of $0.48 per AWS Glue DataBrew node hour (in the Oregon Region, for example). By default, AWS Glue DataBrew allocates five nodes to each job, and a single node provides four vCPUs and 16 GB of memory. There are no resources to manage, no upfront costs, and you are not charged for startup or shutdown time.
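As a rough, illustrative estimate only: if the monthly job runs for 10 minutes on the default five nodes, that run costs approximately 5 nodes × (10/60) hour × $0.48 per node hour ≈ $0.40.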

Cleaning up

If you no longer need the resources you created in this post, delete the following to avoid incurring future costs:

  • AWS Glue DataBrew
    • Dataset
    • Project
    • Recipe
    • Job
  • Amazon QuickSight
    • Dashboards
    • Unsubscribe from Amazon QuickSight
  • The secondary Amazon S3 bucket used to store modified data

Conclusion

In this post, we showed you how to get visibility into your spend on EKS Fargate. We used the CUR to get cost and usage data, processed the raw data using AWS Glue DataBrew, and visualized it using Amazon QuickSight. We created Amazon QuickSight dashboards to break down AWS Fargate costs by cluster, namespace, deployment, and pod ID.

We hope this solution helps you understand and optimize compute costs in your Kubernetes environment.