AWS Storage Blog
Disabling Amazon S3 access control lists with S3 Inventory
Access control lists (ACLs) define user access and the operations users can take on specific resources. Amazon Simple Storage Service (Amazon S3) was launched in 2006 with ACLs as its first authorization mechanism. Since 2011, Amazon S3 has also supported AWS Identity and Access Management (IAM) policies for managing access to S3 buckets, and it recommends using policies instead of ACLs. Although ACLs continue to be supported in Amazon S3, most use cases no longer require them.
In 2021, we introduced Amazon S3 Object Ownership to help customers fully disable ACLs for their S3 buckets and rely entirely on policies for access control. This simplifies access management for data stored in Amazon S3, because you can easily review, manage, and modify access to your data using only policies. Additionally, as of April 2023, we now disable ACLs by default for all new buckets. However, for existing workloads, customers tell us that they need visibility into how their ACLs are being used before switching over to policies. They want to be sure that this change won’t remove needed permissions, and thereby disrupt their applications.
To help you disable ACLs, we’ve recently added information to Amazon S3 server access logs and AWS CloudTrail so that you have visibility into requests that were dependent on an ACL for access to objects in Amazon S3. Learn more about this in the AWS Storage Blog post, “Disabling ACLs for existing Amazon S3 workloads with information in S3 server access logs and AWS CloudTrail”. This is an essential step when auditing your Amazon S3 ACL usage, so you can understand exactly how and where your applications are making API calls to Amazon S3 that require object ACLs when migrating to IAM and S3 bucket policies.
In this post, we show how customers can now use Amazon S3 Inventory reports to easily review ACLs. This lets you utilize the increased scalability and flexibility that policies offer for managing access to resources. Consolidating object-specific permissions into one policy is easier to audit and update than multiple ACLs, especially at scale.
New – Amazon S3 Inventory reports now include S3 ACL metadata
The Amazon S3 Inventory report lets you generate a list of objects and metadata to manage and review your storage. Retrieving information on all your objects’ ACLs via Amazon S3 Inventory report is a more cost-effective option when compared to making an API call to each of your S3 objects to retrieve the same information. The S3 Inventory report is available daily or weekly and generates granular data at the object level. This can include fields such as object size, last modified date, and encryption status.
On July 14, 2023, we added object ACLs to the Amazon S3 Inventory report. This feature simplifies the auditing of access permissions in Amazon S3 by providing customers with a comprehensive listing of all ACLs associated with each object. It can also help save costs, because you no longer need to call Amazon S3 APIs to determine which ACLs have been set on your objects. The new object ACLs field includes an Owner element that identifies the object owner, and a Grant element that identifies the grantee and the permission granted. This enables customers to identify existing object ACLs, migrate them to IAM/bucket policies, and disable object ACLs once they have validated that the ACLs are no longer in use.
How ACLs are displayed in the S3 Inventory report
The object ACL field in the Amazon S3 Inventory report is defined in JSON format. The JSON data includes the following fields:
- version: The version of the object ACL field format in the Amazon S3 Inventory report. It’s in the date format yyyy-mm-dd.
- status: Possible values are AVAILABLE or UNAVAILABLE to indicate whether an object ACL is available for an object or not. You may see an UNAVAILABLE status in rare cases when the Amazon S3 Inventory report does not include recently added or deleted ACLs, which are likely to be included the next time an Amazon S3 Inventory report is generated.
- grants: Grantee-permission pairs that list the permission granted to each grantee by the object ACL. A grantee is either a canonical user or a group. For more information on grantees, see Grantees in ACL.
For a grantee with the Group type, a grantee-permission pair includes the following attributes:
- uri: A predefined Amazon S3 group.
- permission: The ACL permissions that are granted on the object.
- type: The type Group, which denotes that the grantee is a group.
For a grantee with the CanonicalUser type, a grantee-permission pair includes the following attributes:
- canonicalId: The canonical ID is an alphanumeric identifier used to identify AWS accounts when granting Amazon S3 access via bucket or object ACLs.
- permission: The ACL permissions that are granted on the object.
- type: The type CanonicalUser, which denotes that the grantee is an AWS account.
The following example shows values for the object ACL field in JSON. The example ACL grants read access to the Amazon S3 predefined all users group. This allows signed or unsigned requests from anyone in the world.
{
"version": "2022-11-10",
"status": "AVAILABLE",
"grants": [{
"uri": "http://acs.amazonaws.com/groups/global/AllUsers",
"permission": "READ",
"type": "Group"
}]
}
Figure 1: Example object ACL field JSON
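To make the field layout concrete, the following Python sketch parses a decoded object ACL value and reports how many grants it contains and whether any of them target a predefined Amazon S3 group that makes the object broadly accessible. The helper name summarize_acl and the BROAD_GROUPS set are hypothetical names introduced for illustration, and the sketch assumes that the value has already been decoded into JSON text.

```python
import json

# URIs of predefined Amazon S3 groups that make a grant broadly accessible.
BROAD_GROUPS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}


def summarize_acl(acl_json: str) -> dict:
    """Summarize a decoded object ACL field (hypothetical helper)."""
    acl = json.loads(acl_json)
    if acl.get("status") != "AVAILABLE":
        # The report has no ACL data for this object yet.
        return {"available": False, "grant_count": 0, "broad_grants": []}
    grants = acl.get("grants") or []
    # Keep only Group grants whose uri is one of the broad predefined groups.
    broad = [
        (g["uri"], g["permission"])
        for g in grants
        if g.get("type") == "Group" and g.get("uri") in BROAD_GROUPS
    ]
    return {"available": True, "grant_count": len(grants), "broad_grants": broad}


# The example ACL from Figure 1.
example = """{
  "version": "2022-11-10",
  "status": "AVAILABLE",
  "grants": [{
    "uri": "http://acs.amazonaws.com/groups/global/AllUsers",
    "permission": "READ",
    "type": "Group"
  }]
}"""
print(summarize_acl(example))
```

A summary like this can help you prioritize objects with public grants when you review a large report.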
Solution overview
In this post, we cover the following:
- Enabling Amazon S3 Inventory for your S3 bucket
- Setting up Amazon Athena and creating Athena tables from the Amazon S3 Inventory report
- Using Athena to query Amazon S3 Inventory and identify objects with Owner and/or Grant ACL elements
- Migrating Amazon S3 ACL permissions to an S3 bucket policy
- Disabling Amazon S3 ACLs
Enable Amazon S3 Inventory for your S3 bucket
You can create an inventory configuration by navigating to the S3 bucket Management -> Inventory configurations -> Create inventory configuration. From there, specify the name and scope of the Amazon S3 Inventory report, along with the destination bucket where you want reports to be saved. Next, set the frequency and output format of the report, selecting from daily or weekly frequency and from CSV, Apache ORC, and Apache Parquet as output formats. For this solution, we use the Amazon S3 Inventory report in the CSV format. Before enabling the report, select the optional metadata fields on which to report, since the Amazon S3 Inventory report includes the bucket, object key, version ID, latest version, and delete marker fields by default. For this solution example, note that we only enable the object ACL and object owner additional metadata fields, as shown in the following figure.
Figure 2: Example Amazon S3 Inventory report configuration
Note that the Amazon S3 Inventory report can be made to run daily or weekly and can take up to a day to complete, with the first Amazon S3 Inventory report potentially taking up to 48 hours to be generated. Once your Amazon S3 Inventory report has been created and is available in your destination S3 bucket, you are ready to work with this report using Athena.
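If you prefer to script this step instead of using the console, the following Python sketch shows one way the same inventory configuration might be expressed with the AWS SDK for Python (Boto3). The function names, the configuration ID, the bucket names, and the inventory prefix are hypothetical placeholders, and including all object versions is an assumption rather than a requirement of this solution. The destination bucket must separately allow Amazon S3 to deliver reports to it.

```python
def build_inventory_configuration(config_id, destination_bucket,
                                  prefix="inventory", frequency="Daily"):
    """Build an S3 Inventory configuration with the two ACL-related fields."""
    if frequency not in ("Daily", "Weekly"):
        raise ValueError("frequency must be 'Daily' or 'Weekly'")
    return {
        "Id": config_id,
        "IsEnabled": True,
        "IncludedObjectVersions": "All",  # assumption: report every version
        "Schedule": {"Frequency": frequency},
        # Only the object ACL and object owner optional fields are enabled.
        "OptionalFields": ["ObjectAccessControlList", "ObjectOwner"],
        "Destination": {
            "S3BucketDestination": {
                "Bucket": f"arn:aws:s3:::{destination_bucket}",
                "Format": "CSV",
                "Prefix": prefix,
            }
        },
    }


def apply_inventory_configuration(source_bucket, configuration):
    """Send the configuration to Amazon S3 (requires boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without the SDK

    s3 = boto3.client("s3")
    s3.put_bucket_inventory_configuration(
        Bucket=source_bucket,
        Id=configuration["Id"],
        InventoryConfiguration=configuration,
    )


# Hypothetical names, for illustration only.
config = build_inventory_configuration("acl-audit", "my-destination-bucket")
print(config["OptionalFields"])
```

Keeping the builder separate from the API call lets you review the configuration before applying it with apply_inventory_configuration.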
Set up Athena and create Athena tables from the Amazon S3 Inventory report
1. Browse to the S3 bucket containing your Amazon S3 Inventory report, and browse to the hive folder.
2. You should see folders named for the date of each inventory run in the format dt=YYYY-MM-DD-HH-MM, such as dt=2023-08-03-01-00/. Select the folder for the run that you want to query, and note its location as an Amazon S3 URL string for use when creating the Athena table.
3. In the AWS Management Console, navigate to Athena. If this is your first time using Athena in this AWS account, then you should see a Get Started page with “Query your data” already selected. Select Launch query editor at that page to continue to the Athena Query Editor, as shown in the following figure.
Figure 3: Location of Launch query editor on Athena Get started page
4. If this is your first time using Athena, then start by configuring your Athena settings. Select Edit settings, and enter the path to the S3 bucket where your query results are stored. Note that the S3 bucket selected must be located in the same AWS Region from which you are running Athena queries, as shown in the following figure.
Figure 4: Dialog box to enter location to store Athena query results
5. Once you’ve saved your Athena settings, we can create our table for Amazon S3 Inventory results. We enter this SQL statement to create a table named s3_inventory_csv in the default Athena database. Replace the LOCATION string in the following statement with the Amazon S3 Inventory location for your configuration.
CREATE EXTERNAL TABLE IF NOT EXISTS `default`.`s3_inventory_csv` (
`Bucket` string,
`Key` string,
`version_id` string,
`is_latest` boolean,
`is_delete_marker` boolean,
`ObjectAccessControlList` string,
`ObjectOwner` string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION 's3://yourS3Bucketname/inventory/hive/dt=2023-06-13-01-00/';
Figure 5: Example SQL to create Amazon S3 Inventory table in Athena
Note that for the location field, we use the Amazon S3 Inventory location noted previously in Step 2. Select Run to create the table.
6. Once the SQL has been run, you see s3_inventory_csv in your list of tables and can run Athena queries against your Amazon S3 Inventory, as shown in the following figure.
Figure 6: Location of s3_inventory_csv table in Athena query editor
Use Athena to query Amazon S3 Inventory and identify objects with Owner and/or Grant ACL elements
We cover a few example scenarios that let us identify objects in our example environment that have already had ACLs placed on them.
Note that the Amazon S3 Inventory report displays the value for the object ACL field as a base64-encoded string, regardless of which format you choose (CSV, Parquet, ORC). Therefore, it must be decoded during the Athena queries, for which we show examples in this post.
Example 1: Find the number of ACLs for all objects in the Amazon S3 Inventory report
This example query uses the from_base64 function to decode the objectaccesscontrollist column, which is base64-encoded, and the from_utf8 function to convert the resulting binary data into text. It then counts how many ACL grants are set on each S3 object, as shown in the following figure:
SELECT key,
from_utf8(from_base64(objectaccesscontrollist)),
json_array_length(
json_extract(
from_utf8(from_base64(objectaccesscontrollist)),
'$.grants'
)
) AS grants_count
FROM "default"."s3_inventory_csv";
Figure 7: Example SQL to find number of ACLs for all objects
For readability purposes, we have exported the query results to the CSV format and show an example of three objects where there are ACLs present, as shown in the following figure.
Figure 8: Finding number of ACLs for all objects – Athena query results
Example 2: Identify S3 objects that have ACLs with permission = Full Control
WITH grants AS
(SELECT key,
CAST(json_extract(from_utf8(from_base64(objectaccesscontrollist)),
'$.grants') AS ARRAY(MAP(VARCHAR, VARCHAR))) AS grants_array
FROM "default"."s3_inventory_csv" )
SELECT key,
grants_array,
grant
FROM grants, UNNEST(grants_array) AS t(grant)
WHERE element_at(grant, 'permission') = 'FULL_CONTROL'
Figure 9: Example SQL to identify object ACLs with Full Control
For readability purposes, we have exported the query results to the CSV format and show an example of an object named xaaaau for which a canonical user ID has been granted full control, as shown in the following figure.
Figure 10: Identify S3 objects with Full Control ACL Athena query results
Example 3: Find S3 objects that have cross-account ownership using ACLs
You must find the canonical user ID for your AWS account to place it in the following query. To find the canonical user ID for your AWS account, browse to an S3 bucket in your account and select the Permissions tab. Scroll down to the ACL section to find the canonical user ID for your AWS account, as shown in the following figure.
Figure 11: Amazon S3 console location of canonical user ID
Now that you have the canonical user ID, you can replace ‘CanonicalID’ with your canonical user ID in the following example SQL query, and run the query in Athena:
SELECT key,objectowner FROM "default"."s3_inventory_csv" WHERE objectowner <> 'CanonicalID';
Figure 12: Example SQL to identify ACLs with cross-account access
This query returns the S3 objects that are owned by other AWS accounts. Each owner is shown as the canonical user ID of the account that owns the S3 object.
Figure 13: Identify ACLs with cross-account access Athena query results
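For a small report, the same three checks can also be reproduced outside of Athena. The following Python sketch reads inventory rows in CSV form, decodes the base64-encoded object ACL field, and reports the grant count, whether any grant is FULL_CONTROL, and whether the object is owned by another account. It assumes that the column order matches the table definition in Figure 5 and that the report file has already been downloaded and decompressed. The function name audit_inventory and the sample values are hypothetical.

```python
import base64
import csv
import io
import json

# Assumption: column order matches the Athena table definition in Figure 5.
COLUMNS = ["bucket", "key", "version_id", "is_latest", "is_delete_marker",
           "objectaccesscontrollist", "objectowner"]


def audit_inventory(csv_text, my_canonical_id):
    """Return one finding per inventory row (hypothetical helper)."""
    findings = []
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=COLUMNS)
    for row in reader:
        grants = []
        encoded = row["objectaccesscontrollist"]
        if encoded:
            # The ACL field is base64-encoded JSON, as in the Athena queries.
            acl = json.loads(base64.b64decode(encoded).decode("utf-8"))
            grants = acl.get("grants") or []
        findings.append({
            "key": row["key"],
            "grants_count": len(grants),                      # Example 1
            "full_control": any(                              # Example 2
                g.get("permission") == "FULL_CONTROL" for g in grants
            ),
            "cross_account_owner":                            # Example 3
                row["objectowner"] != my_canonical_id,
        })
    return findings


# Build one sample row with a made-up canonical ID, for illustration only.
sample_acl = {
    "version": "2022-11-10",
    "status": "AVAILABLE",
    "grants": [{"canonicalId": "examplecanonicalid",
                "permission": "FULL_CONTROL",
                "type": "CanonicalUser"}],
}
encoded_acl = base64.b64encode(json.dumps(sample_acl).encode("utf-8")).decode("ascii")
sample_csv = f'"example-bucket","xaaaau","","true","false","{encoded_acl}","othercanonicalid"\n'
for finding in audit_inventory(sample_csv, "examplecanonicalid"):
    print(finding)
```

For reports with millions of rows, Athena remains the more practical option. A local script like this is mainly useful for spot checks.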
Migrate Amazon S3 ACL permissions to an S3 bucket policy with a cross-account IAM role
Now that we’ve identified a number of objects utilizing Amazon S3 ACLs in the preceding examples, we can show how to write an S3 bucket policy and IAM role that allow cross-account access to a specific S3 bucket, to bring it in line with our security best practices.
We have two entities in this example: Account 1, which contains the S3 bucket, and Account 2, which contains IAM resources that require access to the S3 bucket in Account 1. In this example, the AWS account ID of Account 1 is 111111111111 and the AWS account ID of Account 2 is 222222222222.
Here’s an architecture diagram of our example design:
Figure 14: Example cross-account IAM role access to S3 bucket
Create IAM role in account 2
In Account 2, we create an IAM role that contains a trust policy defining who or what can assume the role, as well as a listing of allowed actions. To do this, navigate to the IAM console in your AWS account, select Roles, and then select Create role, as shown in the following figure.
Figure 15: Location of Roles and Create role
Now, select AWS Service as a trusted entity and select EC2, which lets us assign the role to an Amazon Elastic Compute Cloud (Amazon EC2) instance. Then, select Next to proceed, as shown in the following figure.
Figure 16: Location of EC2 trusted entity fields
On the Add permissions page, select Create policy so we can provide the IAM policy JSON for this role directly, as shown in the following figure.
Figure 17: Location of Create policy
In the IAM policy editor, we select the JSON button and paste the following example policy for the IAM role to allow PutObject and GetObject permissions against three specific objects. Modify this as needed for your specific use case and select Next, as shown in the following figures.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject"
],
"Resource": [
"arn:aws:s3:::Account1BucketName/object1",
"arn:aws:s3:::Account1BucketName/object2",
"arn:aws:s3:::Account1BucketName/object3"
]
}
]
}
Figure 18: Example IAM policy
Figure 19: Example IAM permissions
Provide a name for your IAM policy and a description, and then select Create policy. Note that this only creates the IAM policy for the IAM role you are about to create, as shown in the following figure.
Figure 20: Name, review, and create IAM policy
Now that the IAM policy has been created, we can return to the IAM role creation tab, shown in the following figure, select the refresh button, and search for the IAM policy that we just created. Once you have found the policy, select it by checking the box next to the policy name, and select Next.
Figure 21: Add permissions to IAM role
On the next screen, shown in the following figure, name your IAM role, provide an optional description, and select Create role at the bottom of the page to create the IAM role.
Figure 22: Name, review, and create IAM role
Once your IAM role has been created, select View role, as shown in the following figure.
Figure 23: View IAM role
Now select the copy button next to the ARN to copy the IAM role ARN and save it in a scratchpad, since we eventually use it to create an S3 bucket policy to allow for access to the IAM role, as shown in the following figure:
Figure 24: Copy IAM role ARN
Create S3 bucket policy in account 1
Once that has been copied, switch to the Amazon S3 service by typing S3 in the console and selecting it to launch the Amazon S3 console. Then, select the S3 bucket to which you would like to apply the S3 bucket policy.
Select the Permissions tab, scroll down to Bucket policy, and select Edit to open the bucket policy editor, as shown in the following figure.
Figure 25: Edit S3 bucket policy
In the policy editor, we add the following policy JSON, which allows the IAM role that we just created to perform Get and Put operations on three specific S3 objects. Modify this policy as appropriate for your specific use case, and select Save changes, as shown in the following figure. For additional details and examples on S3 bucket policies, visit Bucket policy examples.
Figure 26: Example S3 bucket policy
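Because the bucket policy appears only in the figure, the following Python sketch generates a policy of the shape described above: it allows the IAM role in Account 2 to perform Get and Put operations on three specific objects. This is an illustration under stated assumptions and may differ from the policy shown in the figure. The role name ExampleCrossAccountRole, the Sid, and the function name build_bucket_policy are hypothetical, so substitute the role ARN that you copied earlier.

```python
import json


def build_bucket_policy(bucket_name, role_arn, object_keys):
    """Build a bucket policy granting a role Get/Put on specific objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCrossAccountRoleObjectAccess",  # hypothetical Sid
                "Effect": "Allow",
                # The principal is the IAM role created in Account 2.
                "Principal": {"AWS": role_arn},
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}/{key}" for key in object_keys
                ],
            }
        ],
    }


policy = build_bucket_policy(
    "Account1BucketName",
    "arn:aws:iam::222222222222:role/ExampleCrossAccountRole",  # hypothetical
    ["object1", "object2", "object3"],
)
# Paste the printed JSON into the bucket policy editor after reviewing it.
print(json.dumps(policy, indent=2))
```

The Resource list mirrors the three objects in the IAM policy from Figure 18, so both sides of the cross-account access refer to the same objects.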
For further information on cross-account access to Amazon S3 resources, review the documentation.
Testing migrating to IAM and bucket policy access to Amazon S3
Once you have created the IAM role and assigned the preceding S3 bucket policy, attach the IAM role to an EC2 instance to begin testing cross-account access using the IAM role and S3 bucket policy combination.
In addition to the preceding steps, we highly recommend that, as part of your migration from ACLs to IAM and bucket policies, you confirm whether your Amazon S3 API calls require ACLs for the requests to succeed. You can do this by using Amazon S3 server access logs or CloudTrail, as outlined in this AWS Storage Blog post.
We only covered a small number of objects in the preceding examples, but customers may have tens of millions of objects with ACLs, or more. Although every customer’s use case is different, we recommend beginning with small-scale testing in a non-production environment, and then moving on to identifying production objects with ACLs that are candidates for migrating to policy-based methods. Depending on your account/bucket/prefix/object key naming strategy, you can look at ways to perform these tasks in small manageable chunks aligned with your specific usage to simplify the migration of permissions.
Disable Amazon S3 ACLs
Once your testing and validation are complete, we recommend that you continue to monitor your Amazon S3 server access logs and CloudTrail data events for a period of time to make sure that there are no more Yes entries in the aclRequired field. This step helps provide a comprehensive audit of your usage of object ACLs. More information on the process to review Amazon S3 server access logs or CloudTrail for aclRequired requests can be found in this post.
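As a rough illustration of this monitoring step, the following Python sketch scans the contents of a CloudTrail log file for data events that were marked as requiring an ACL. It assumes that aclRequired appears under additionalEventData in each record and that the bucket and key appear under requestParameters. Verify these locations against your own logs and the post referenced above before relying on them. The function name find_acl_required_events and the sample record are hypothetical.

```python
import json


def find_acl_required_events(cloudtrail_log_text):
    """Return the events in a CloudTrail log file that depended on an ACL."""
    records = json.loads(cloudtrail_log_text).get("Records", [])
    hits = []
    for record in records:
        # Assumption: aclRequired is reported inside additionalEventData.
        extra = record.get("additionalEventData") or {}
        if extra.get("aclRequired") == "Yes":
            params = record.get("requestParameters") or {}
            hits.append({
                "eventTime": record.get("eventTime"),
                "eventName": record.get("eventName"),
                "bucket": params.get("bucketName"),
                "key": params.get("key"),
            })
    return hits


# A made-up log file with two records, for illustration only.
sample_log = json.dumps({
    "Records": [
        {
            "eventTime": "2023-08-03T01:00:00Z",
            "eventName": "GetObject",
            "requestParameters": {"bucketName": "Account1BucketName", "key": "object1"},
            "additionalEventData": {"aclRequired": "Yes"},
        },
        {
            "eventTime": "2023-08-03T01:05:00Z",
            "eventName": "GetObject",
            "requestParameters": {"bucketName": "Account1BucketName", "key": "object2"},
            "additionalEventData": {},
        },
    ]
})
for event in find_acl_required_events(sample_log):
    print(event)
```

An empty result over a representative period is one signal that no requests still depend on object ACLs.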
Once you are certain that there are no more requests that depend on object ACLs to succeed, you can disable Amazon S3 object ACLs on your bucket by navigating to the bucket, selecting the Permissions tab, and selecting Edit next to Object Ownership, where you can disable ACLs on your bucket and save the changes, as shown in the following figure. Once this is done, make sure to test your cross-account permissions to objects to make sure that access to the objects continues as before, and make changes to your bucket policy and/or IAM permissions accordingly. In the event that errors are seen at this stage, you can easily go back to the bucket and enable ACLs on the bucket if needed to reverse the change. If you would like to learn more about the Amazon S3 Ownership Setting to enable or disable S3 ACLs, then visit this relevant AWS News Blog post.
Figure 27: Edit Object Ownership example
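The same Object Ownership change can be scripted. The following Python sketch, assuming the AWS SDK for Python (Boto3) is installed and credentials are configured, builds the ownership controls payload that disables ACLs and shows how it might be applied. The function names are hypothetical. Passing a different setting, such as the one the bucket used before, is one way to reverse the change if errors appear.

```python
# The three Object Ownership settings accepted by Amazon S3.
VALID_SETTINGS = ("BucketOwnerEnforced", "BucketOwnerPreferred", "ObjectWriter")


def build_ownership_controls(object_ownership="BucketOwnerEnforced"):
    """Build the ownership controls payload; BucketOwnerEnforced disables ACLs."""
    if object_ownership not in VALID_SETTINGS:
        raise ValueError(f"object_ownership must be one of {VALID_SETTINGS}")
    return {"Rules": [{"ObjectOwnership": object_ownership}]}


def apply_ownership_controls(bucket, object_ownership="BucketOwnerEnforced"):
    """Apply the setting to a bucket (requires boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without the SDK

    s3 = boto3.client("s3")
    s3.put_bucket_ownership_controls(
        Bucket=bucket,
        OwnershipControls=build_ownership_controls(object_ownership),
    )


# Payload that disables ACLs on the bucket.
print(build_ownership_controls())
```

After applying the change, repeat your cross-account access tests before removing any remaining ACL-related tooling.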
Cleaning up
If you have created a new Amazon S3 Inventory report to test the object ACL and object owner fields and no longer want to be billed, you must remove the Amazon S3 Inventory report from your S3 bucket, delete the Amazon S3 Inventory configuration, and delete the Amazon S3 Inventory report Athena table. It is also recommended that you remove the IAM roles, IAM policies, and S3 bucket policy if you created these resources for testing purposes and no longer require them.
Conclusion
In this post, we described the two new Amazon S3 Inventory report fields for object ACLs and object owner in detail. We also demonstrated how you can use the new object ACL and object owner fields in Amazon S3 Inventory to help audit objects in Amazon S3 that have ACLs configured. Customers can now enjoy lower costs and better scale using Amazon S3 Inventory with object ACLs to audit ACLs when compared to using API-based methods. This is a key component of migrating from ACLs to policy-based access controls in conjunction with auditing ACL usage with Amazon S3 server access logs or AWS CloudTrail as detailed in this post. When paired together, Amazon S3 server access logs can identify active ACL usage. S3 Inventory reports can help identify S3 object ACLs even in inactive workloads. This provides customers with a comprehensive view of object ACLs in their S3 buckets.
For additional information on Amazon S3 Inventory with object ACLs, visit our documentation on working with the object ACL field.
For additional information on Amazon S3 Security best practices, visit our documentation on security best practices for Amazon S3.
Thank you for reading this post. If you have any comments or questions, feel free to post them in the comments section.