Leverage Amazon CloudFront geolocation headers for state level geo-targeting
Introduction
When you provide content online, personalization helps you improve your customers’ experience, market effectively, and meet regulatory requirements. One common way to personalize web content is by the geographical location of your customers. Since 2014, Amazon CloudFront has supported country-level, location-based personalization with a feature called Geolocation Headers. Using the CloudFront-Viewer-Country header, you can identify the country a request originated from and customize the content it receives.
There are some cases where you will need more granular targeting. In the United States, for example, you may want to present different variants of the website to viewers from different states. Historically, you have been able to do this with third-party products like ipgeolocation.io or IP2LocationLite, which provide location data for IP addresses.
In July 2020, Amazon CloudFront announced support for additional geolocation headers, including state, city, and postal code, to support granular location-based web personalization. This blog post walks you through the implementation of these new headers, including caching and geo-targeting at a US state level with Amazon CloudFront, Lambda@Edge, and a static website hosted on Amazon S3.
Overview of solution
To demonstrate this functionality, imagine you have a static website hosted in an S3 bucket with an index.html file, different variants of the banner image for US states (banner_ca.png for the state of California, for example), and a default banner.png image as a fallback or for non-US viewers. When your user makes a request for the website, the following steps are performed:
- The user makes a request to the website domain, in our case the CloudFront distribution domain name.
- The HTTP request reaches the CloudFront CDN, which returns the cached version of the resource if it exists.
- In case of a cache miss, CloudFront sets the cache key and geolocation headers.
- If the request matches the cache behavior defined in the CloudFront distribution (in our example, all files matching the *.png pattern), a Lambda function checks the country of the viewer. If the viewer is in the US, it rewrites the URL by appending the state suffix. The request is then forwarded to the S3 bucket.
- The requested resource is returned from the S3 bucket.
Prerequisites
- An AWS account.
- Follow this video to set up Amazon CloudFront to serve a static website hosted on S3, using an S3 API endpoint as the origin within CloudFront. Don’t upload any files to the S3 bucket as part of the setup. After this setup, you should have a CloudFront distribution with an S3 bucket as its origin.
Note: The example in the video assumes you use a custom domain name and SSL certificate. This is not mandatory for the purpose of this blog post. If you don’t have a custom domain you can skip the CNAME and custom SSL certificate selection.
Walkthrough
S3 bucket setup
- Create the index.html file.
<!DOCTYPE html>
<html>
<body>
<h1>Join the 2020 AWS Summit</h1>
<img src="banner.png">
</body>
</html>
- For the default version of the banner, download this banner image from the Global AWS Summit and rename it to banner.png.
- We chose to demo this functionality for a viewer in California, so as a variant of the default banner, download this banner image from the San Francisco AWS Summit and rename it to banner_ca.png.
- Upload these three files to the S3 bucket.
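If you prefer to script the upload instead of using the console, the following is a minimal sketch using boto3. The bucket name is a placeholder, and the helper names (content_type_for, upload_site) are made up for this illustration. It sets the Content-Type explicitly so the browser renders index.html as a page and the banners as images.

```python
import mimetypes

# The three files created in the steps above.
FILES = ["index.html", "banner.png", "banner_ca.png"]


def content_type_for(file_name):
    """Guess the Content-Type from the file extension, with a generic fallback."""
    guessed, _ = mimetypes.guess_type(file_name)
    return guessed or "application/octet-stream"


def upload_site(bucket_name, files=FILES):
    """Upload each local file to the bucket under the same key."""
    import boto3  # imported here so the helper above works without AWS access

    s3 = boto3.client("s3")
    for file_name in files:
        s3.upload_file(
            file_name,
            bucket_name,
            file_name,
            ExtraArgs={"ContentType": content_type_for(file_name)},
        )


if __name__ == "__main__":
    # "my-geo-demo-bucket" is a placeholder; replace it with your bucket name.
    upload_site("my-geo-demo-bucket")
```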
CloudFront setup
- The first step is to define the path pattern of the files for which you support multiple variants and which you want cached based on a cache key of your choice. In this case, we are looking at variants of the image file, banner.png.
- Go to the CloudFront console page and select your distribution. Select the ‘Behaviors’ tab and create a new behavior.
- In our case, we choose the *.png path pattern.
- For ‘Viewer Protocol Policy’, select ‘Redirect HTTP to HTTPS’.
- For ‘Cache and origin request settings’, select ‘Use legacy cache settings’. We will change this later.
- Continue with the rest of the settings as default to create the cache behavior.
Lambda setup
- Create the Lambda function by going into the Lambda console page and selecting ‘Create function’ and ‘Author from scratch’. Name your function and select the Python runtime. Finally, create a new execution role and select the ‘Basic Lambda@Edge permissions’ policy which will allow CloudFront to trigger this function.
- Paste this code snippet into the editor.
import os
from urllib.parse import urlparse


def lambda_handler(event, context):
    # CloudFront passes the origin request inside the first record.
    request = event['Records'][0]['cf']['request']
    parsed_uri = urlparse(request['uri']).path
    root = build_root_path(parsed_uri)
    # Split "banner.png" into its name ("banner") and extension (".png").
    file_name, extension = os.path.splitext(os.path.basename(parsed_uri))
    suffix = build_suffix(request['headers'])
    # Example: /banner.png becomes /banner_ca.png for a viewer in California.
    request['uri'] = root + file_name + suffix + extension
    return request


def build_root_path(parsed_uri):
    # Return the directory part of the path with a trailing slash.
    root = os.path.split(parsed_uri)[0]
    return root if root == "/" else root + "/"


def build_suffix(headers):
    # Only US viewers get a state suffix; everyone else gets the default file.
    country = headers['cloudfront-viewer-country'][0]['value']
    if country == 'US':
        return '_' + headers['cloudfront-viewer-country-region'][0]['value'].lower()
    return ''
- Deploy the function to run at the edge when CloudFront invokes it. From the ‘Actions’ dropdown, select ‘Deploy to Lambda@Edge’.
- In the next step, select the distribution and the cache behavior you created. Select ‘Origin request’ as the CloudFront event. As described in the documentation page, the function will execute only when CloudFront forwards a request to your origin.
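Note that the function above appends a suffix for every US state, so it assumes the bucket holds a variant for each of them. The following is a minimal sketch of a variation that falls back to the default file for states without a variant. SUPPORTED_STATES and make_event are hypothetical names introduced for this illustration, and the hand-built event contains only the fields the handler reads (real Lambda@Edge events carry more).

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical allowlist: states for which a banner variant was uploaded.
SUPPORTED_STATES = {"ca"}


def build_suffix(headers):
    """Return '_<state>' for US viewers in a supported state, else ''."""
    country = headers.get("cloudfront-viewer-country", [{}])[0].get("value")
    region = headers.get("cloudfront-viewer-country-region", [{}])[0].get("value", "")
    state = region.lower()
    if country == "US" and state in SUPPORTED_STATES:
        return "_" + state
    return ""


def lambda_handler(event, context):
    request = event["Records"][0]["cf"]["request"]
    path = urlparse(request["uri"]).path
    directory, base = posixpath.split(path)
    name, extension = posixpath.splitext(base)
    prefix = directory if directory.endswith("/") else directory + "/"
    request["uri"] = prefix + name + build_suffix(request["headers"]) + extension
    return request


def make_event(uri, country, region=None):
    """Build a stripped-down origin-request event for local checks."""
    headers = {
        "cloudfront-viewer-country": [
            {"key": "CloudFront-Viewer-Country", "value": country}
        ]
    }
    if region is not None:
        headers["cloudfront-viewer-country-region"] = [
            {"key": "CloudFront-Viewer-Country-Region", "value": region}
        ]
    return {"Records": [{"cf": {"request": {"uri": uri, "headers": headers}}}]}


if __name__ == "__main__":
    for country, region in [("US", "CA"), ("US", "TX"), ("DE", "BE")]:
        result = lambda_handler(make_event("/banner.png", country, region), None)
        print(country, region, result["uri"])
```

Running the file locally prints the rewritten URI for each sample viewer, which is a quick way to check the path logic before deploying to the edge.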
Cache policy and cache behavior setup
- Go to the CloudFront console page and select your distribution. Select the ‘Behaviors’ tab and edit the created behavior.
- For the ‘Cache and origin request settings’, select ‘Use a cache policy and origin request policy’ and then ‘Create a new policy’.
- This will open a new page for creating the cache policy.
- Set the TTL settings.
- Select the contents of the cache key. In our case, we want to cache based on the country and the region of the viewer, so we whitelist two headers:
CloudFront-Viewer-Country
CloudFront-Viewer-Country-Region: for the US, this header contains a code (up to three characters) that represents the viewer’s region. The region is the most specific subdivision of the ISO 3166-2 code.
- On the cache behavior page, select the newly created cache policy and save the behavior.
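The same cache policy can also be created programmatically. The following is a minimal sketch using the boto3 create_cache_policy call. The policy name and the TTL values are assumptions chosen for illustration (this post does not prescribe specific TTLs), and the function names are hypothetical.

```python
def build_cache_policy_config(name, default_ttl=86400):
    """Return a CachePolicyConfig that keys the cache on viewer country and region."""
    headers = ["CloudFront-Viewer-Country", "CloudFront-Viewer-Country-Region"]
    return {
        "Name": name,
        "Comment": "Cache variants per viewer country and region",
        # TTLs in seconds; these values are illustrative assumptions.
        "MinTTL": 0,
        "DefaultTTL": default_ttl,
        "MaxTTL": 31536000,
        "ParametersInCacheKeyAndForwardedToOrigin": {
            "EnableAcceptEncodingGzip": True,
            "EnableAcceptEncodingBrotli": True,
            "HeadersConfig": {
                "HeaderBehavior": "whitelist",
                "Headers": {"Quantity": len(headers), "Items": headers},
            },
            "CookiesConfig": {"CookieBehavior": "none"},
            "QueryStringsConfig": {"QueryStringBehavior": "none"},
        },
    }


def create_cache_policy(name):
    """Create the policy in your account and return its ID."""
    import boto3  # imported here so the config builder works without AWS access

    client = boto3.client("cloudfront")
    response = client.create_cache_policy(
        CachePolicyConfig=build_cache_policy_config(name)
    )
    return response["CachePolicy"]["Id"]


if __name__ == "__main__":
    # "geo-state-banner-policy" is a placeholder name.
    print(create_cache_policy("geo-state-banner-policy"))
```

Because both headers are part of the cache key, CloudFront stores a separate cached object per country and region combination, so a viewer in California and a viewer elsewhere do not share the same cached banner.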
Testing the implementation
Now let’s test the two versions of the content by making a request from California and one from any other state in the US or country in the world. In the browser, paste either your custom domain name if you created one during the initial setup or the CloudFront distribution domain name.
Cleanup
Remove the CloudFront distribution, S3 bucket, and the Lambda function to avoid further costs.
Conclusion
In this post, you have seen how to leverage the geolocation headers available in Amazon CloudFront to cache and customize content based on the location of the viewer. You used the two headers CloudFront-Viewer-Country and CloudFront-Viewer-Country-Region to create a cache key in CloudFront and to build logic, executed at the edge of the network, that determines the correct path of the file to return to the requester. Beyond these two, other headers provide information about the viewer’s location and can be used to build personalized experiences: display content by city (with CloudFront-Viewer-City), show accessible attractions close by (with CloudFront-Viewer-Postal-Code), adjust times to the viewer’s time zone (with CloudFront-Viewer-Time-Zone), and identify the viewer’s location more precisely (with CloudFront-Viewer-Latitude and CloudFront-Viewer-Longitude).