In this blog post, you’ll learn how to detect when unintended access has been granted to sensitive data in Amazon Simple Storage Service (Amazon S3) buckets in your Amazon Web Services (AWS) accounts.

It’s critical for your enterprise to understand where sensitive data is stored in your organization and how and why it is shared. The ability to efficiently find data that is shared with entities outside your account and the contents of that data is paramount. You need a process to quickly detect and report which accounts have access to sensitive data. Amazon Macie is an AWS service that can detect many sensitive data types. Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover and help protect your sensitive data in AWS.

AWS Identity and Access Management (IAM) Access Analyzer helps to identify resources in your organization and accounts, such as S3 buckets or IAM roles, that are shared with an external entity. When you enable IAM Access Analyzer, you create an analyzer for your entire organization or your account. The organization or account you choose is known as the zone of trust for the analyzer. The analyzer monitors the supported resources within your zone of trust. This analyzer enables IAM Access Analyzer to detect each instance of a resource shared outside the zone of trust and generates a finding about the resource and the external principals that have access to it.

Currently, you can use IAM Access Analyzer and Macie to detect external access and discover sensitive data as separate processes. You can join the findings from both to best evaluate the risk. The solution in this post integrates IAM Access Analyzer, Macie, and AWS Security Hub to automate the process of correlating findings between the services and presenting them in Security Hub.

How does the solution work?

First, IAM Access Analyzer discovers S3 buckets that are shared outside the zone of trust. Next, the solution schedules a Macie sensitive data discovery job for each of these buckets to determine if the bucket contains sensitive data. Upon discovery of shared sensitive data in S3, a custom high severity finding is created in Security Hub for review and incident response.

Solution architecture

This solution is based on a serverless architecture, and uses the following services:

Figure 1: Architecture diagram

Figure 1: Architecture diagram

Figure 1 depicts the following process flow:

  1. IAM Access Analyzer detects shared S3 buckets outside of the zone of trust—the organization or account you choose is known as a zone of trust for the analyzer—and creates the event Access Analyzer Finding in EventBridge.
  2. EventBridge triggers the Lambda function sda-aa-save-findings.
  3. The sda-aa-save-findings function records each finding in DynamoDB.
  4. An EventBridge scheduled event periodically starts a new cycle of the Step Function state machine, which immediately runs the Lambda function sda-macie-submit-scan. The template sets a 15-minute interval, but this is configurable.
  5. The sda-macie-submit-scan function reads the IAM Access Analyzer findings that were created by sda-aa-save-findings from DynamoDB.
  6. sda-macie-submit-scan launches a Macie classification job for each distinct S3 bucket that is related to one or more recent IAM Access Analyzer findings.
  7. Macie performs a sensitive discovery scan on each requested S3 bucket.
  8. The sda-macie-submit-scan function initiates the Lambda function sda-macie-check-status.
  9. sda-macie-check-status periodically checks the status of each Macie classification job, waiting for all the Macie jobs initiated by this solution to complete.
  10. Upon completion of the sda-macie-check-status function, the step function runs the Lambda function sda-sh-create-findings.
  11. sda-sh-create-findings joins the resulting IAM Access Analyzer and Macie datasets for each S3 bucket.
  12. sda-sh-create-findings publishes a finding to Security Hub for each bucket that has both external access and sensitive data.

    Note: The Macie scan is skipped if the S3 bucket is tagged to be excluded or if it was recently scanned by Macie. See the Cost considerations section for more information on custom configurations.

  13. Information security can review and act on the findings shown in Security Hub.

Sample Security Hub output

Figure 2 shows the sample findings that Security Hub will present. Each finding includes:

  • Severity
  • Workflow status
  • Record state
  • Company
  • Product
  • Title
  • Resource
Figure 2: Sample Security Hub findings

Figure 2: Sample Security Hub findings

The output to Security Hub will display a severity of HIGH with workflow NEW, because this is the first time the event has been observed. The record state is ACTIVE because the workflow state is NEW. The title explains the reason for the event.

For example, if potentially sensitive data is discovered in a bucket that is shared outside a zone of trust, selecting an event will display the resources involved in the finding so you can investigate. For more information, see the Security Hub User Guide.

Notes:

  • Detection of public S3 buckets by IAM Access Analyzer will still occur through Security Hub and will be marked as critical severity. This solution does not add to or augment this finding in Security Hub.
  • If a finding in IAM Access Analyzer is archived, the solution does not update the related finding in Security Hub.

Prerequisites

To use this solution, you need the following:

  • Permission to run AWS CloudFormation
  • Permission to create Lambda functions
  • Permission to create DynamoDB tables
  • Permission to create Step Function state machines
  • Permission to create EventBridge event rules
  • Permission to enable IAM Access Analyzer on the account where sensitive discovery is required
  • Permission to enable Macie on the account
  • Permission to enable Security Hub on the account

Deploy the solution

The solution is deployed through AWS CloudFormation, and you can review the template for options to best suit your specific needs.

  1. Sign in to your AWS account located at https://aws.amazon.com/console/.
  2. In the AWS Management Console, navigate to the AWS CloudFormation service, and then choose Create stack.
  3. Under Prerequisite – Prepare template, choose Template is ready.
  4. Under Specify template, choose Amazon S3 URL and provide the following URL:
    https://awsiammedia.s3.amazonaws.com/public/sample/936-correlating-aa-findings-macie/sda-cfn.yml
  5. Choose Next.
  6. Enter the stack name.
  7. The Application code location, S3 Bucket and S3 Key fields will be pre-filled.
  8. Under Service Activations, modify the activations based on the services you presently have running in your account.
  9. Modify the Logging and Monitoring settings if required.
  10. (Optional) Set an alert email address for errors.
  11. Choose Next, then choose Next again.
  12. Under Capabilities, select the check box.
  13. Choose Create Stack. The solution will begin deploying; watch for the CREATE_COMPLETE message.
Figure 3: Sample CloudFormation deployment status

Figure 3: Sample CloudFormation deployment status

The solution is now deployed and will start monitoring for sensitive data that is being shared. It will send the findings to Security Hub for your teams to investigate.

Cost considerations

When you scan large S3 buckets with sensitive data, remember that Macie cost is based on the amount of data scanned. For more information on Macie costs, see Amazon Macie pricing.

This solution allows the following options, which you can use to help manage costs:

  • Use environment variables in Lambda to skip specific tagged buckets
  • Skip recently scanned S3 buckets and reuse prior findings
Figure 4: Screen shot of configurable environment variable

Figure 4: Screen shot of configurable environment variable

Conclusion

In this post, we discussed how the solution uses Lambda, Step Functions and EventBridge to integrate IAM Access Analyzer with Macie discovery jobs. We reviewed the components of the application, deployed it by using CloudFormation, and reviewed the output a security team would use to take the appropriate actions. We also provided two ways that you can manage the costs associated with the solution.

After you deploy this project, you can modify it to meet your organization’s needs. For example, you can modify the tags to skip specific S3 buckets your organization has already classified to hold sensitive data. Customers who use multiple AWS accounts can designate a centralized Security Hub administrator account to receive the solution alerts from each member account. For more information on this option, see Designating a Security Hub administrator account.

If you have feedback about this post, please submit it in the Comments section below. If you have questions about this post, please start a new thread on the AWS Identity and Access Management forum.

Other resources

For more information on correlating security findings with AWS Security Hub and Amazon EventBridge, refer to this blog post.

Want more AWS Security news? Follow us on Twitter.

Nihar Das

Nihar Das

Nihar has over 20 years of experience in various business domains including financial services. As an AWS Senior Solutions Architect, he is passionate about solving challenges in the cloud and helps financial services customers to migrate to AWS and support the continued innovation.

Joe Dunn

Joe Dunn

Joe is an AWS Senior Solutions Architect in Financial Services with over 20 years of experience in infrastructure architecture and migration of business-critical loads to AWS. He helps financial services customers to innovate on the AWS Cloud by providing solutions using AWS products and services.

Armand Aquino

Armand Aquino

Armand is a solutions architect helping financial services organizations design their critical workloads on AWS. In his spare time, he enjoys exploring outdoors and learning Korean.