This blog post shows you how to share encrypted Amazon Simple Storage Service (Amazon S3) buckets across accounts on a multi-tenant data lake. Our objective is to show scalability over a larger volume of accounts that can access the data lake, in a scenario where there is one central account to share from. Most use cases involve multiple groups or customers that need to access data across multiple accounts, which makes data lake solutions inherently multi-tenant. Therefore, it becomes very important to associate data assets and set policies to manage these assets in a consistent way. The use of native AWS Key Management Service (AWS KMS) simplifies seamless integration with AWS services and offers improved data protection to ultimately enable data lake services (for example, Amazon EMR, AWS Glue, or Amazon Redshift).
Additionally, this blog post structures this approach and enables larger scale by applying an alternative to specifying the principal element. The policy size limitation for Amazon S3 and AWS KMS policies is overcome with the help of the aws:PrincipalOrgID condition key in the Condition element of the policy. The post focuses on services that are not integrated with AWS Lake Formation; for these services, direct access to S3 buckets is required for defining and enforcing access control policies. For AWS services that integrate with AWS Lake Formation and honor Lake Formation permissions, refer to the Lake Formation Developer Guide.
You will learn how to use attribute-based access control (ABAC) as an authorization strategy that defines permissions based on attributes. ABAC reduces the number of policies, because session tags are easier to manage and establish a differentiation in job policies. Using ABAC in conjunction with Amazon S3 policies, you can authorize users to read objects based on one or more tags that are applied to S3 objects and to the IAM role session of your users based on key-value pair attributes, named session tags. The session tags will be passed when you assume an IAM role or federate a user (through your identity provider) in AWS Security Token Service (AWS STS). This enables administrators to configure a SAML-based identity provider (IdP) to send specific employee attributes as tags in AWS. As a result, you can simplify the creation of fine-grained permissions for employees to get access only to the AWS resources with matching tags.
For simplicity, this solution uses an AWS Command Line Interface (AWS CLI) request for temporary security credentials when generating a session. You will apply the global aws:PrincipalOrgID condition key in your resource-based policies to restrict access to accounts in your AWS organization. You can apply the wildcard character * to the Principal element for broader applicability across your specific organization, and we will suggest additional controls where feasible.
From a high-level overview perspective, the following items are a starting point when enabling cross-account access. In order to grant cross-account access to AWS KMS-encrypted S3 objects in Account A to a user in Account B, you must have the following permissions in place (objective A):
- The bucket policy in Account A must grant access to Account B
- The AWS KMS key policy in Account A must grant access to the user in Account B
- The AWS Identity and Access Management (IAM) policy in Account B must grant the user access to both the bucket and key in Account A
By establishing these permissions, you will learn how to maintain entitlements (objective B) at the bucket or object level, explore cross-account bucket sharing at scale, and overcome limitations such as inline policy size or bucket policy file size (you can learn more details in the Policies overview section). As an extension, you:
- Enable granular permissions
- Grant access to groups of resources by tags
Not all AWS services currently support tag-based authorization, so check for service updates and design around the limitations. Configuration options can be challenging for cross-account access, especially when the objective is to scale across a large number of accounts to a multi-tenant data lake. This blog post orchestrates the various configuration options so that both objectives A and B are met and the challenges are addressed.
Our objective is to overcome challenges and design backwards with scalability in mind. The following table depicts the challenges and outlines recommendations for a better design.
|Challenge||Recommendation|
|Use employee attributes from your corporate directory||You can configure your SAML-based or web identity provider to pass session tags to AWS. When your employees federate into AWS, their attributes are applied to their resulting principal in AWS. You can then use ABAC to allow or deny permissions based on those attributes.|
|Enable granular permissions||ABAC requires fewer policies, because differentiation in job policies is given through session tags, which are easier to manage. Permissions are granted automatically based on attributes.|
|Grant access to resources by tags||When you use ABAC, you can allow actions on all resources, but only if the resource tag matches the principal’s tag and/or the organization’s tag. It’s best practice to grant least privilege.|
|Not all services support tag-based authorization||Check for service updates and design around the limitations.|
|Scale with innovation||ABAC permissions scale with innovation, because it’s not necessary for the administrator to update existing policies to allow access to new resources.|
The high-level architecture is designed to show you the advantages of scaling ABAC across accounts and to reflect on a common model that applies to most large organizations.
Here is a brief introduction to the basic components that make up the solution:
- Identity provider (IdP) – Can be either on-premises or in the cloud. Your administrators configure your SAML-based IdP to allow workforce users federated access to AWS resources using credentials from your corporate directory (backed by Active Directory). Now you can configure your IdP to pass in user attributes as session tags in federated AWS sessions. These transitive session tags propagate to all use cases (such as executor roles, instance roles, and service roles).
- Central sharing account – Helps to establish the Admin/Writer policy. It takes in the user’s information from the IdP and confirms authorization to access resources. The X-Acct Control Manager creates the SAML role and trust relationship via OIDC (OpenID Connect) to allow users to assume roles and pass session tags. This account’s primary purpose is to produce data into the lake. This account can also consume data from the lake if it has such a use case.
- Multiple accounts (MAs) to connect to a multi-tenant data lake – Mirrors a large organization with a multitude of accounts that would like cross-account read access. Each account uses a pre-pave data manager role that can tag or untag any IAM role in the account and that trusts an IAM role in a specific central account to assume it. All other roles should be prevented from tagging and untagging roles (this is known as a sentinel policy).
- Member Analytics organizational unit (OU) – Is where the EMR analytics clusters connect to the data in the consumption layer to visualize it through business intelligence (BI) tools such as Tableau. Access to the shared buckets is granted through bucket and IAM policies. MAs may also have access to buckets within their own accounts. Because this is a consumer account, it joins the lake only to consume data and does not contribute any data to the data lake.
- Central Data Lake OU – Is the account that has Amazon S3 data location storage. The objects are encrypted, which will require the IAM role to have permissions to the specified AWS KMS key in the key policy. AWS KMS supports the use of the aws:ResourceTag/tag-key global condition context key, which lets you control access to KMS keys based on the tags on the KMS key.
For using SAML session tags for ABAC, you need to have the following:
- Access to a SAML-based IdP where you can create test users with specific attributes.
- For simplicity, you will be using an AWS CLI request for temporary security credentials when generating a session.
- You’ll be passing session tags using AssumeRoleWithSAML.
- AWS accounts where users can sign in. There are five accounts for interaction defined with accommodating policies and permission, as outlined in Figure 1. The numerals listed here refer to the labels on the figure:
  - IdP account – IAM user with administrative permission (1)
  - Central sharing account – Admin/Writer ABAC policy (2)
  - Sample accounts to connect to multi-tenant data lake – Reader ABAC policy (3)
  - Member Analytics OU account – Assume roles in shared bucket (4)
  - Central Data Lake OU account – Pave or update virtual private cloud (VPC) condition and principals/PrincipalOrgId in a shared bucket (5)
- AWS resources, such as Amazon S3, AWS KMS, or Amazon EMR.
- Any third-party software or hardware, such as BI tools.
Our objective is to design an AWS perimeter where intended access is allowed only if necessary and sufficient conditions are met for getting inside the AWS perimeter. See the following table for more information.
|Boundary||Perimeter objective||AWS services used|
|Identity||Only My Resources; Only My Networks||Identity-based policies and SCPs|
|Resource||Only My IAM Principals; Only My Networks||Resource-based policies|
|Network||Only My IAM Principals; Only My Resources||VPC endpoint (VPCE) policies|
There are multiple design options, but this post will focus on option A, which is a logical AND (∧) conjunction of principal organization, resource, and network:
- Option A: (Only My aws:PrincipalOrgID) ∧ (Only My Resource) ∧ (Only My Network)
- Option B: (Only My IAM Principals) ∧ (Only My Resource) ∧ (Only My Network)
- Option C: (Only My aws:PrincipalOrgID) ∧ (Only My IAM Principals) ∧ (Only My Resource) ∧ (Only My Network)
In order to properly design and consider control points as well as scalability limitations, the following table shows an overview of the policies applied in this blog article. It outlines the design limitations and briefly discusses the proposed solutions, which are described in more detail in the Establish an ABAC policy section.
|Policy type||Sample policies||Limitations to overcome||Solutions outlined in this blog|
|IAM user policy|| ||Inline JSON policy document is limited to 2,048 bytes. For using KMS keys, accounts are limited to specific AWS Regions.||Enable session tags, which expire and require credentials. It is a best practice to grant least privilege permissions with AWS Identity and Access Management (IAM) policies.|
|S3 bucket policy|| ||The bucket policy has a 20 KB maximum file size. Data exfiltration or non-trusted writing to buckets can be restricted by a combination of policies and conditions.||Use aws:PrincipalOrgID to simplify specifying the Principal element in a resource-based policy. No manual updating of account IDs required, if they belong to the intended organization.|
|VPCE policy|| ||aws:PrincipalOrgID opens up broadly to accounts with single control, requiring additional controls. Make sure that principals aren’t inadvertently allowed or denied access.||Restrict access to VPC with endpoint policies and deny statements.|
|KMS key policy|| ||Listing all AWS account IDs in an organization. Maximum key policy document size is 32 KB, which applies to KMS key. Changing a tag or alias might allow or deny permission to a KMS key.||Specify the organization ID in the condition element, instead of listing all the AWS account IDs. AWS owned keys do not count against these quotas, so use these where possible. Assure visibility of API operations.|
By using a combination of these policies and conditions, you can mitigate accidental or intentional exfiltration of data from a non-trusted organization, IAM credentials, or VPC endpoints. You can also prevent writing to a bucket that is owned by a non-trusted account or principal by using even more restrictive policies for write operations. For example, use the s3:ResourceAccount condition key to limit access to trusted S3 buckets that belong only to specific AWS accounts.
Ultimately, scalability is limited by how many organization IDs fit within the maximum size of the S3 bucket policy (20 KB) and the KMS key policy (32 KB). Under the assumptions of the following policies and the maximum file size available for listing organizations, you can enable at most the following number of organizations for cross-account access:
- S3 bucket policy: approximately 84 organizations (maximum, fewer for more complex policies)
- KMS key policy: approximately 430 organizations (maximum, fewer for more complex policies)
Assuming that each organization has 20 or more accounts, more than 1,000 accounts could be enabled for cross-account access.
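As a rough check on these figures, the following Python sketch combines the two limits. It assumes, in line with the permissions listed earlier, that an organization must be listed in both the bucket policy and the key policy to gain access, so the smaller of the two bounds applies. The function names are illustrative, the compact-JSON size is only an approximation of how a service measures a policy, and the organization counts are the approximate figures quoted above.

```python
import json

S3_BUCKET_POLICY_LIMIT_BYTES = 20 * 1024  # 20 KB bucket policy limit cited in this post
KMS_KEY_POLICY_LIMIT_BYTES = 32 * 1024    # 32 KB key policy limit cited in this post


def policy_size_bytes(policy: dict) -> int:
    """Approximate size of a policy document serialized as compact JSON."""
    return len(json.dumps(policy, separators=(",", ":")).encode("utf-8"))


def fits(policy: dict, limit_bytes: int) -> bool:
    """Return True if the serialized policy stays within the given size limit."""
    return policy_size_bytes(policy) <= limit_bytes


def reachable_accounts(s3_orgs: int, kms_orgs: int, accounts_per_org: int) -> int:
    """Accounts that can be enabled when an organization must appear in both policies."""
    return min(s3_orgs, kms_orgs) * accounts_per_org


# Approximate figures from this post: 84 organizations (S3), 430 (KMS), 20 accounts each.
print(reachable_accounts(84, 430, 20))  # 1680
```

With these inputs the bucket policy is the binding constraint, and 84 organizations of 20 accounts each already exceed 1,000 accounts.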
Figure 2 demonstrates a sample scenario for the roles from a specific organization in AWS Organizations with defined tags accessing an S3 bucket that is owned by Central Data Lake OU.
Establish an ABAC policy
AWS has published extensive security guides for its products and individual services. The base set of security guides covers: AWS KMS, ABAC, Amazon S3, resource-based policies (PrincipalOrgId), cross-account privilege design escalation, scale authorization, data lake, Amazon EMR, and others.
To establish the ABAC policy
- Create an IAM role and IAM user policy by following the instructions in this blog to create the IAM role.
- Create the ABAC policy for the role.
The user can start and stop instances only if principal, PrincipalOrgID, and resource tags are matching. More detailed instructions to define permissions to access AWS resources based on tags are available in the IAM tutorial.
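The following Python sketch shows one way such an ABAC policy could be assembled. It is not the policy from the IAM tutorial: the tag key `project`, the statement ID, and the function name are assumptions chosen for illustration, and the organization ID is the example value used later in this post.

```python
import json


def abac_instance_policy(org_id: str, tag_key: str = "project") -> dict:
    """Identity-based policy: start/stop instances only when tags and organization match."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "StartStopWhenTagsMatch",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": ["ec2:StartInstances", "ec2:StopInstances"],
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        # The resource tag must equal the caller's principal (session) tag.
                        f"aws:ResourceTag/{tag_key}": f"${{aws:PrincipalTag/{tag_key}}}",
                        # The caller must belong to the expected organization.
                        "aws:PrincipalOrgID": org_id,
                    }
                },
            }
        ],
    }


print(json.dumps(abac_instance_policy("o-teh8ggy8o9"), indent=2))
```

Because both keys sit under one condition operator, they are evaluated together: the request is allowed only when the tag match and the organization match both hold.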
Policy sample to allow the session-tag user to assume the role
The AllowIamUserAssumeRole statement in the following sample policy allows the test-session-tags user to assume the role with the attached policy. When that user assumes the role, they must pass the required session tags and external ID.
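A minimal sketch of a role trust policy along these lines is shown below, built as a Python dictionary. Only the statement ID `AllowIamUserAssumeRole` and the user name `test-session-tags` come from this post; the account ID, external ID, tag keys, and the second statement (which permits tagging the session) are placeholders and assumptions.

```python
import json


def session_tag_trust_policy(user_arn, external_id,
                             tag_keys=("Project", "CostCenter", "Department"),
                             transitive_keys=("Project", "Department")):
    """Trust policy that requires session tags and an external ID on AssumeRole."""
    # StringLike with "*" requires each tag to be present in the request.
    tags_present = {f"aws:RequestTag/{key}": "*" for key in tag_keys}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowIamUserAssumeRole",
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Principal": {"AWS": user_arn},
                "Condition": {
                    "StringLike": tags_present,
                    "StringEquals": {"sts:ExternalId": external_id},
                },
            },
            {
                "Sid": "AllowPassSessionTagsAndTransitive",  # assumed statement ID
                "Effect": "Allow",
                "Action": "sts:TagSession",
                "Principal": {"AWS": user_arn},
                "Condition": {
                    "StringLike": dict(tags_present),
                    # Only these tag keys may be marked transitive.
                    "ForAllValues:StringEquals": {
                        "sts:TransitiveTagKeys": list(transitive_keys)
                    },
                },
            },
        ],
    }


# Placeholder account ID and external ID.
policy = session_tag_trust_policy(
    "arn:aws:iam::123456789012:user/test-session-tags", "Example987"
)
print(json.dumps(policy, indent=2))
```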
- Allow cross-account access to an AWS KMS key (KMS keys). This is an IAM policy to allow principals to call specific operations only on KMS keys in your account.
AWS KMS supports ABAC by allowing you to control access to your customer managed keys based on the tags and aliases associated with the KMS keys. This provides a powerful and flexible way to authorize principals to use KMS keys without editing policies or managing grants. But you should use these features with care so that principals aren’t inadvertently allowed or denied access.
Policy sample to allow a user in another account to use a KMS key
To grant another account access to a KMS key, create an IAM policy on the secondary account that grants access to use the KMS key. For instructions, see Allowing users in other accounts to use a KMS key.
The accounts are limited to a Region (such as us-west-2) and have a Project=Alpha tag. You might attach this policy to roles in the example Alpha project.
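The following Python sketch illustrates what such an IAM policy could look like. The account ID, statement ID, and the selection of KMS actions are assumptions made for illustration; the Region and the Project=Alpha tag follow the description above.

```python
import json


def kms_use_policy(key_account_id: str, region: str = "us-west-2",
                   project: str = "Alpha") -> dict:
    """IAM policy for the secondary account: use tagged KMS keys in one Region only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "UseTaggedKeysInOneRegion",  # hypothetical statement ID
                "Effect": "Allow",
                # Assumed minimal set of actions for reading and writing encrypted objects.
                "Action": ["kms:Decrypt", "kms:DescribeKey", "kms:GenerateDataKey"],
                # Any key in the key owner's account, restricted to the given Region.
                "Resource": f"arn:aws:kms:{region}:{key_account_id}:key/*",
                "Condition": {
                    # The key must carry the matching Project tag.
                    "StringEquals": {"aws:ResourceTag/Project": project}
                },
            }
        ],
    }


print(json.dumps(kms_use_policy("111122223333"), indent=2))
```

Keep in mind that this identity-based policy only takes effect if the key policy in the key owner's account also allows the access.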
- Configure the S3 bucket policy.
To grant cross-account permissions to other AWS accounts or to users in another account, you must use a bucket policy. A bucket policy is limited to a maximum size of 20 KB.
The idea of the S3 bucket policy is based on data classification, where the S3 bucket policy is used with deny statements that apply if the user doesn’t have the appropriate tags applied. You don’t need to explicitly deny all actions in the bucket policy, because a user must be authorized in both the IAM policy and the S3 bucket policy in a cross-account scenario. This can increase the complexity in your environment.
Policy sample for the S3 bucket
You can use the Amazon S3 console to add a new bucket policy or edit an existing bucket policy. Detailed instructions for creating or editing a bucket policy are in the Amazon S3 User Guide. The following sample policy restricts access only to principals from organizations as listed in the policy and a specific VPC endpoint.
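As an illustration of this restriction, the Python sketch below builds a bucket policy that allows read access only to principals from the listed organizations and only through one VPC endpoint. The bucket name, VPC endpoint ID, statement ID, and chosen actions are placeholders and assumptions, not the author's original sample.

```python
import json


def org_vpce_bucket_policy(bucket: str, org_ids, vpce_id: str) -> dict:
    """Bucket policy: read access for listed organizations through one VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadFromOrgsViaVpce",  # hypothetical statement ID
                "Effect": "Allow",
                # Wildcard principal; the conditions below narrow it down.
                "Principal": "*",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {
                    "StringEquals": {
                        # Values in a list are ORed: any listed organization matches.
                        "aws:PrincipalOrgID": list(org_ids),
                        # Keys under one operator are ANDed: the request must also
                        # arrive through this VPC endpoint.
                        "aws:SourceVpce": vpce_id,
                    }
                },
            }
        ],
    }


policy = org_vpce_bucket_policy(
    "example-data-lake-bucket",          # placeholder bucket name
    ["o-teh8ggy8o9", "o-qrvnjkbfd2"],    # example organization IDs from this post
    "vpce-1a2b3c4d",                     # placeholder VPC endpoint ID
)
print(json.dumps(policy, indent=2))
```

Adding another organization only appends one ID to the list, which is what keeps the policy small compared with listing individual account IDs.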
- Configure the VPCE policy.
You can use S3 bucket policies to control access to buckets from specific VPC endpoints, or specific VPCs.
In addition, the S3 bucket policy should be scoped down to explicitly deny the following:
- Non-approved IP addresses
- Incorrect encryption keys
- Non-SSL connections
- Un-encrypted object uploads
Policy sample for VPCE-bucket policy enhancements
The setup of VPC endpoints is described in the Amazon VPC User Guide.
To create an interface endpoint, you must specify the VPC in which to create the interface endpoint, and the service to which to establish the connection.
You can download this sample policy.
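The four deny conditions listed above can be sketched as bucket policy statements as follows. This is an illustrative Python sketch, not the downloadable sample: the statement IDs, bucket name, key ARN, CIDR range, and endpoint ID are placeholders, and the network statement assumes that requests through the approved VPC endpoint should be exempt from the IP check.

```python
import json


def deny_guardrails(bucket: str, kms_key_arn: str, approved_cidrs, vpce_id: str) -> list:
    """Deny statements for non-SSL access, bad encryption, and non-approved networks."""
    objects = f"arn:aws:s3:::{bucket}/*"
    bucket_and_objects = [f"arn:aws:s3:::{bucket}", objects]

    def deny(sid, action, resource, condition):
        return {"Sid": sid, "Effect": "Deny", "Principal": "*",
                "Action": action, "Resource": resource, "Condition": condition}

    return [
        # Non-SSL connections.
        deny("DenyNonSslConnections", "s3:*", bucket_and_objects,
             {"Bool": {"aws:SecureTransport": "false"}}),
        # Uploads that do not request SSE-KMS encryption.
        deny("DenyUnencryptedUploads", "s3:PutObject", objects,
             {"StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}}),
        # Uploads encrypted with a key other than the approved one.
        deny("DenyIncorrectEncryptionKey", "s3:PutObject", objects,
             {"StringNotEquals":
                 {"s3:x-amz-server-side-encryption-aws-kms-key-id": kms_key_arn}}),
        # Requests from neither an approved IP range nor the approved VPC endpoint.
        deny("DenyNonApprovedNetworks", "s3:*", bucket_and_objects,
             {"NotIpAddress": {"aws:SourceIp": list(approved_cidrs)},
              "StringNotEquals": {"aws:SourceVpce": vpce_id}}),
    ]


statements = deny_guardrails(
    "example-data-lake-bucket",
    "arn:aws:kms:us-west-2:111122223333:key/EXAMPLE-KEY-ID",  # placeholder key ARN
    ["203.0.113.0/24"],                                       # documentation CIDR range
    "vpce-1a2b3c4d",
)
print(json.dumps({"Version": "2012-10-17", "Statement": statements}, indent=2))
```

Broad deny statements like these can also lock out administrators or AWS services acting on your behalf, so it is worth testing them on a non-production bucket first.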
- Configure the KMS key policy.
To authorize principals’ access to your customer managed keys or AWS managed keys based on the tags and aliases associated with these keys, you can define the organizations in the policy. Enabling cross-account access requires permission in the key policy of the KMS key (in other words, who can have access) and in an IAM policy in the external user’s account (in other words, who does have access). Both policies need to be in place for sufficient access. Whenever possible, follow the principle of least privilege. Especially when you specify organizations in the policy, limit access to only the KMS keys that principals need and only the operations that they require.
The tags enable ABAC in AWS KMS and provide a powerful and flexible way to authorize principal organizations without editing policies or managing grants. The aws:PrincipalOrgID global condition key can be used with the Principal element in a resource-based policy with AWS KMS. Instead of listing all the AWS account IDs in an organization, you can specify the organization ID in the Condition element with the aws:PrincipalOrgID condition key. You can find detailed instructions for changing the key policy by using the AWS Management Console in the AWS KMS Developer Guide.
Policy sample to allow use of the AWS KMS key for specific organization(s)
Following is the sample KMS key policy statement to allow identities of AWS accounts that belong to specific organizations with an ID (for example, “o-teh8ggy8o9”, “o-qrvnjkbfd2”) to use the KMS Key. To get the Organization ID:
- Open the AWS Organizations Console.
- Choose Settings.
- In Organization details, copy the Organization ID.
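With the organization IDs in hand, a key policy statement of this kind could be sketched in Python as shown below. The statement ID and the list of KMS actions are assumptions for illustration; the organization IDs are the example values from this section.

```python
import json


def org_key_policy_statement(org_ids) -> dict:
    """Key policy statement: allow key use for identities in the listed organizations."""
    return {
        "Sid": "AllowUseOfKeyForOrganizations",  # hypothetical statement ID
        "Effect": "Allow",
        # Wildcard principal; the organization condition below restricts it.
        "Principal": {"AWS": "*"},
        # Assumed set of cryptographic operations; trim to what principals require.
        "Action": [
            "kms:Decrypt",
            "kms:DescribeKey",
            "kms:Encrypt",
            "kms:GenerateDataKey*",
            "kms:ReEncrypt*",
        ],
        # In a key policy, "*" refers to the KMS key the policy is attached to.
        "Resource": "*",
        "Condition": {
            "StringEquals": {"aws:PrincipalOrgID": list(org_ids)}
        },
    }


statement = org_key_policy_statement(["o-teh8ggy8o9", "o-qrvnjkbfd2"])
print(json.dumps(statement, indent=2))
```

This statement would sit alongside the existing statements of the key policy, such as the one that keeps administrative access for the key owner's account.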
- Configure the SAML IdP (if you have one) or enter the commands manually.
Typically, you configure your SAML IdP to pass in the project, cost-center, and department attributes as session tags. For more information, see Passing session tags using AssumeRoleWithSAML.
The following assume-role AWS CLI command helps you perform and test this request.
This example request assumes the s3Read role for the specified duration with the included session policy, session tags, external ID, and source identity. The resulting session is named my-session.
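The request described above can be sketched as follows. This Python snippet only assembles and prints the `aws sts assume-role` command line and does not call AWS. The role name `s3Read` and session name `my-session` come from this post; the account ID, duration, session policy, tag values, external ID, and source identity are placeholders.

```python
import json
import shlex


def assume_role_command(role_arn, session_name, duration_seconds, session_policy,
                        tags, transitive_keys, external_id, source_identity):
    """Build the argument list for an `aws sts assume-role` request with session tags."""
    cmd = [
        "aws", "sts", "assume-role",
        "--role-arn", role_arn,
        "--role-session-name", session_name,
        "--duration-seconds", str(duration_seconds),
        "--policy", json.dumps(session_policy),
        "--tags",
    ]
    # Each session tag is passed as Key=<key>,Value=<value>.
    cmd += [f"Key={key},Value={value}" for key, value in tags.items()]
    cmd += ["--transitive-tag-keys", *transitive_keys]
    cmd += ["--external-id", external_id, "--source-identity", source_identity]
    return cmd


# Placeholder session policy that further restricts the session to S3 reads.
session_policy = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "s3:GetObject", "Resource": "*"}],
}

command = assume_role_command(
    role_arn="arn:aws:iam::123456789012:role/s3Read",  # placeholder account ID
    session_name="my-session",
    duration_seconds=1800,
    session_policy=session_policy,
    tags={"Project": "Alpha", "CostCenter": "12345", "Department": "Engineering"},
    transitive_keys=["Project", "Department"],
    external_id="Example987",
    source_identity="example-user",
)
print(shlex.join(command))
```

Printing the command instead of running it makes it easy to review the tags and the external ID before you request credentials.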
To avoid unwanted charges to your AWS account, delete the AWS resources you created during this walkthrough.
This post explains how to share encrypted S3 buckets across accounts at scale by considering access across a large number of accounts to a multi-tenant data lake and selecting solutions for scalability. Following the scalability solution path, we selected several concepts for the design and applied them in combination with managing the KMS key policy, including the IAM user policy, S3 bucket policy, and VPCE policy. We added policies that enable additional controls, which are structured and orchestrated to optimize interaction. Out-of-bounds access, such as access by unintended principals outside the estate of the account holder, is also addressed in order to capture the risk that comes with applying the global aws:PrincipalOrgID condition key in your resource-based policies.
Our approach focused on the scalability design; you can generalize and repurpose the steps for different requirements and projects. As a result, a scalable solution for services not integrated with AWS Lake Formation is available for customization in many directions with AWS KMS. For services that honor Lake Formation permissions, you can use the Lake Formation Developer Guide to more easily set up integrated services to encrypt and decrypt data.
In summary, the design provided here is feasible for large projects, with appropriate controls to allow massive scalability (potentially across more than 1,000 accounts and many organizations).