Upstart

Thoughts on building software and startups.

Updating Container Secrets Using CloudWatch Events + Lambda

Using Amazon Elastic Container Service (ECS) secrets management integration, but afraid to rotate credentials because your app will break? Here's a technique for automatically updating your containers when secrets are changed.


In a previous post, I showed how Amazon Elastic Container Service (ECS) makes it easy to inject sensitive data stored as either AWS Secrets Manager secrets or AWS Systems Manager Parameter Store parameters into your containers.

However, one of the problems with this approach is that container startup is the only time when ECS will inject sensitive data into your container. This means that if the sensitive data is updated after the container is started, your container will not automatically receive any updates. It is up to you to ensure that the container is stopped and a new one created in order to read the updated value.

A best practice with secrets management is to periodically rotate credentials. But given that our containers won't receive these updates after the containers are started, how can we safely rotate these credentials without breaking the application?

What we need is a method to automatically update containers when secrets are updated. To accomplish that, we need to have two components in place. First, we need to receive a notification when a secret is updated. Then, we trigger an action to recycle the container(s). In this post, I will show you how to leverage CloudWatch Events and Lambda to perform both of these tasks to automatically update your container secrets.

Using CloudWatch Events to receive notifications when secrets are updated

To receive notifications when secrets are updated, you can leverage CloudWatch Events. CloudWatch Events is a service that delivers a near real-time stream of system events that describe changes in AWS resources. There are three primary components associated with CloudWatch Events: events, rules and targets.

Whenever an action is performed on a Secrets Manager secret or Systems Manager parameter, a CloudWatch event representing the action is emitted. For example, events are emitted whenever a value is created, updated or deleted.

To consume these events, you create a CloudWatch Events rule that filters for them. The rule can then invoke a target, such as a Lambda function, to trigger other actions whenever a matching event is received.

Determining the event structure

Events in Amazon CloudWatch Events are represented as JSON objects. All CloudWatch events have the same top-level fields, such as source and detail-type. Together, the source and detail-type fields identify the emitter of the event. All service-specific data is stored in the detail field of the event.

CloudWatch event emitted by Systems Manager Parameter Store

Keep in mind that the schema of the event depends on the source that emitted it. For example, the detail field structure for a Systems Manager Parameter Store event is different from the detail field structure of a Secrets Manager event.

Since we plan to use Lambda to process events, we would normally create the Lambda function first. But the handler code needs to know the schema of the events passed to it, so we have to discover that schema before writing the function.

For AWS resources that emit events directly to CloudWatch Events, you can view sample events when creating rules in the CloudWatch Events console. To view these sample events, just expand the "Show sample event(s)" dropdown under the event pattern textbox. But samples are not available for all types of resources, such as AWS Secrets Manager.

An alternative technique for discovering event schemas is to use CloudWatch Logs as a temporary target. You'll be able to see the exact structure of the events in the CloudWatch Logs, which can then serve as your "specification" when writing the Lambda handler code. Then, after coding the Lambda function, you can update the target to be the Lambda function instead of CloudWatch Logs. Note that this technique works for any AWS resource.

Configure CloudWatch Events for Systems Manager parameters

To consume the CloudWatch events emitted by Systems Manager Parameter Store, you create a CloudWatch Events rule that filters for these specific events. The rule can be created using the AWS Console, using the AWS Command Line Interface (CLI) or by making a direct API call.

Here's how to create a CloudWatch Events rule for Systems Manager parameters using the AWS Console.

  1. Open the CloudWatch console.
  2. From the left-hand navigation pane, choose Events->Rules, and then click the "Create rule" button.
  3. Under Event Source, verify that Event Pattern is selected.
  4. For "Service Name" dropdown, choose "EC2 Simple Systems Manager (SSM)".
  5. For "Event Type" dropdown, choose "Parameter Store".
  6. Enable the "Specific detail type(s)" radio button, and then choose "Parameter Store Change" from the dropdown.
  7. Under Targets, click the "Add target" button.
  8. In the Targets list, choose "CloudWatch log group" as the target type. Specify the name of the log group (e.g. /aws/events/ssm).
  9. Click the "Configure details" button to move to the next screen.
  10. Provide a name and (optional) description for the CloudWatch Events rule. Leave the Enabled box selected to make the rule active immediately.
  11. Finally, click the "Create rule" button.
Creating a CloudWatch Event rule for Parameter Store
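
The same rule can also be created through the API. Here is a minimal sketch using the AWS SDK for JavaScript (v2), the same SDK the Lambda code in this post uses. The function names and the rule description are placeholders of my own, and the events argument is assumed to be an AWS.CloudWatchEvents client.

```javascript
// Event pattern matching the console choices above: Systems Manager as the
// source and "Parameter Store Change" as the detail type.
const ssmEventPattern = {
    source: ['aws.ssm'],
    'detail-type': ['Parameter Store Change']
};

// Build the request for the CloudWatch Events PutRule API call.
const buildRuleParams = (name, pattern) => ({
    Name: name,
    Description: 'Fires when a Parameter Store value changes',
    EventPattern: JSON.stringify(pattern),   // PutRule expects a JSON string
    State: 'ENABLED'
});

// Create the rule and return its ARN. `events` is assumed to be an
// AWS.CloudWatchEvents client, e.g. new AWS.CloudWatchEvents().
const createSsmRule = async (events, name) => {
    const result = await events.putRule(buildRuleParams(name, ssmEventPattern)).promise();
    return result.RuleArn;
};
```

Note that PutRule only creates the rule. Targets are attached separately with a PutTargets call.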

Configure CloudWatch Events for AWS Secrets Manager secrets

Unlike Systems Manager Parameter Store, Secrets Manager does not directly emit events that can be detected by CloudWatch Events. However, you can use AWS CloudTrail to produce CloudWatch Events when secrets are modified within Secrets Manager.

AWS CloudTrail is a service that automatically records AWS API calls. Each time CloudTrail records a Secrets Manager API call, it will emit a CloudWatch Event. We can then create a CloudWatch Events rule to trigger on the information captured by CloudTrail.

Enable CloudTrail logging

In order to use CloudTrail to produce CloudWatch Events, you need to enable at least one trail for your account. There is no charge for creating a trail that delivers a single copy of management events (the default setting when creating a trail). You only pay for S3 charges associated with storing the CloudTrail logs.

Here's how to create a trail for your account using the AWS Console.

  1. Open the CloudTrail console.
  2. Click the "Create Trail" button.
  3. Specify a trail name.
  4. By default, management events will be enabled, and insights and data events will be disabled. These settings are sufficient for triggering CloudWatch events when secrets are updated in Secrets Manager.
  5. Under "Storage Location", specify the S3 bucket where the CloudTrail logs should be delivered.
  6. Click the "Create" button.
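
For those who prefer scripting, the steps above can be sketched with the AWS SDK for JavaScript (v2). This assumes the S3 bucket already exists and carries a bucket policy that lets CloudTrail write to it (the console sets this up for you when it creates the bucket). The function names are hypothetical, and cloudtrail is assumed to be an AWS.CloudTrail client.

```javascript
// Build the CreateTrail request. Management events are logged by default,
// which is all this technique needs.
const buildTrailParams = (trailName, bucketName) => ({
    Name: trailName,
    S3BucketName: bucketName
});

// `cloudtrail` is assumed to be an AWS.CloudTrail client, e.g. new AWS.CloudTrail().
const createAndStartTrail = async (cloudtrail, trailName, bucketName) => {
    await cloudtrail.createTrail(buildTrailParams(trailName, bucketName)).promise();
    // Unlike the console, a trail created through the API does not record
    // anything until logging is started explicitly.
    await cloudtrail.startLogging({ Name: trailName }).promise();
};
```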

Create a CloudWatch Events rule for Secrets Manager

Now that CloudTrail logging is enabled, you can create a CloudWatch Events rule that filters for events emitted by CloudTrail specific to Secrets Manager operations.

Here's how to create a CloudWatch Events rule for Secrets Manager secrets using the AWS Console.

  1. Open the CloudWatch console.
  2. From the left-hand navigation pane, choose Events->Rules, and then click the "Create rule" button.
  3. Under Event Source, verify that Event Pattern is selected.
  4. For "Service Name" dropdown, choose "Secrets Manager".
  5. For "Event Type" dropdown, choose "AWS API Call via CloudTrail".
  6. Leave the "Any operation" radio button selected.
  7. Under Targets, click the "Add target" button.
  8. In the Targets list, choose "CloudWatch log group" as the target type. Specify the name of the log group (e.g. /aws/events/secrets-mgr).
  9. Click the "Configure details" button to move to the next screen.
  10. Provide a name and (optional) description for the CloudWatch Events rule. Leave the "Enabled" checkbox selected to make the rule active immediately.
  11. Finally, click the "Create rule" button.
Creating a CloudWatch Event rule for Secrets Manager
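
Behind the console form, the rule is just a JSON event pattern. The sketch below builds that pattern in Node.js. With no arguments it corresponds to the "Any operation" choice above. Passing operation names narrows the rule to specific API calls, which is an optional refinement of mine rather than something the console steps require. The function name is hypothetical.

```javascript
// Build an event pattern for Secrets Manager API calls recorded by CloudTrail.
// With no operations given, the pattern matches any operation; otherwise it
// matches only events whose detail.eventName is one of the listed values.
const buildSecretsManagerPattern = (operations = []) => {
    const pattern = {
        source: ['aws.secretsmanager'],
        'detail-type': ['AWS API Call via CloudTrail'],
        detail: { eventSource: ['secretsmanager.amazonaws.com'] }
    };
    if (operations.length > 0) {
        pattern.detail.eventName = operations;
    }
    return pattern;
};

// Pattern limited to value updates, as a JSON string ready for PutRule.
const updateOnlyPattern = JSON.stringify(buildSecretsManagerPattern(['PutSecretValue']));
```

Filtering in the rule means the target is invoked less often, at the cost of having to revisit the rule if you later care about other operations.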

Testing the CloudWatch Events rule

Now that we have created rules that capture events emitted when values change in either Systems Manager Parameter Store or AWS Secrets Manager, we can test each rule by updating a secret value and observing the output sent to the CloudWatch Logs log group.

To do this, go to the AWS console for the secrets management service you are using (either Systems Manager Parameter Store or AWS Secrets Manager). From the listing of parameters/secrets, choose an existing item to update (if you don't have any yet, create one first). On the value detail page, select "Edit", provide an updated value and then save your change.
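
If you would rather trigger the test from code, the update can also be made with the AWS SDK for JavaScript (v2). This is a minimal sketch. The function names are hypothetical, and ssm and secretsManager are assumed to be AWS.SSM and AWS.SecretsManager clients.

```javascript
// Request for SSM PutParameter. Overwrite must be true to update an
// existing parameter.
const buildPutParameterParams = (name, value) => ({
    Name: name,
    Value: value,
    Type: 'SecureString',
    Overwrite: true
});

// Request for Secrets Manager PutSecretValue, which stores a new version
// of an existing secret.
const buildPutSecretValueParams = (secretId, value) => ({
    SecretId: secretId,
    SecretString: value
});

// `ssm` and `secretsManager` are assumed to be AWS.SSM and
// AWS.SecretsManager clients from the aws-sdk package.
const updateParameter = (ssm, name, value) =>
    ssm.putParameter(buildPutParameterParams(name, value)).promise();
const updateSecret = (secretsManager, secretId, value) =>
    secretsManager.putSecretValue(buildPutSecretValueParams(secretId, value)).promise();
```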

To view the event emitted when you updated the item, open the CloudWatch console, and select Logs->Log groups from the left-hand navigation pane. Choose the log group you specified when creating your rule to view the captured event. You should see an event similar to one of the following (depending on which service hosts the secret you updated):

Example of AWS Secrets Manager update event

{
    "version": "0",
    "id": "6e6b200b-f2b2-95c4-42ac-c26e912d2738",
    "detail-type": "AWS API Call via CloudTrail",
    "source": "aws.secretsmanager",
    "account": "1234567890",
    "time": "2020-02-05T19:15:10Z",
    "region": "us-west-2",
    "resources": [],
    "detail": {
        "eventVersion": "1.05",
        "eventName": "PutSecretValue",
        "requestParameters": {
            "secretId": "/development/credentials/test.json"
        }
    }
}

NOTE: For AWS Secrets Manager events, detail-type will be "AWS API Call via CloudTrail" and source will be "aws.secretsmanager". The operation that was performed can be found in detail.eventName.

TIP: The detail.requestParameters.secretId property can be in either short name format (e.g. /development/credentials/test.json) or a full ARN (e.g. arn:aws:secretsmanager:us-west-2:1234567890:secret:/development/credentials/test.json-fWJsLX). The particular format that will be used depends on how the request was made and by which client. For example, if you update the secret via the AWS Console, the short name format will be used. But if the update was done via the built-in credential rotation (Lambda function), the full ARN will be used. If you need to test against specific secret names, you should perform substring matching instead of exact matching.
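
The substring matching suggested in the tip can be as small as the following sketch. The isSecretMatch name is hypothetical.

```javascript
// Returns true when secretId refers to the named secret, whether it arrives
// as the short name or as a full ARN (which embeds the name plus a suffix).
const isSecretMatch = (secretId, secretName) =>
    typeof secretId === 'string' && secretId.includes(secretName);
```

One caveat: substring matching also matches any other secret whose name contains the one you are looking for (for example, a search for /dev/db would also match /dev/db-replica), so choose secret names with that in mind.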

Example of Systems Manager Parameter Store update event

{
    "version": "0",
    "id": "60794edf-9ea4-a349-1f9e-451156ae5a8c",
    "detail-type": "Parameter Store Change",
    "source": "aws.ssm",
    "account": "1234567890",
    "time": "2020-02-21T21:57:33Z",
    "region": "us-west-2",
    "resources": [
        "arn:aws:ssm:us-west-2:1234567890:parameter/development/credentials/test.json"
    ],
    "detail": {
        "name": "/development/credentials/test.json",
        "type": "SecureString",
        "operation": "Update"
    }
}

NOTE: For Systems Manager Parameter Store events, detail-type will be "Parameter Store Change" and source will be "aws.ssm". The operation that was performed can be found in detail.operation.
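
To see the two schemas side by side, here is a small sketch that reduces either sample event above to the same three fields. It is only meant to illustrate where each service puts the operation and the item name. The describeChange name and the returned shape are my own invention.

```javascript
// Reduce either event shape to { service, operation, name }, or null when
// the event comes from neither service.
const describeChange = event => {
    const detail = event.detail || {};
    if (event.source === 'aws.ssm' &&
            event['detail-type'] === 'Parameter Store Change') {
        return { service: 'ssm', operation: detail.operation, name: detail.name };
    }
    if (event.source === 'aws.secretsmanager' &&
            event['detail-type'] === 'AWS API Call via CloudTrail') {
        const params = detail.requestParameters || {};
        return { service: 'secretsmanager', operation: detail.eventName, name: params.secretId };
    }
    return null;
};
```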

Recycling containers in response to CloudWatch Event

Now that we know the format of the event, we can create a Lambda function to process the CloudWatch events.

The Lambda function will need to be able to process events from both Parameter Store and AWS Secrets Manager. It will look for changes made to a specific item that represents the database credentials, and when it detects a change to this item, it will then reboot the containers associated with the application service.

TIP: You can use the "Force new deployment" option for ECS services to recycle all containers without creating a new task definition file.

First, we start with the primary handler function. This entry point is essentially a router, sending events to the appropriate function based on whether this is a Parameter Store or AWS Secrets Manager update.

Lambda handler function (Node.js)

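const AWS = require('aws-sdk');
const ecs = new AWS.ECS({ apiVersion: '2014-11-13' });

// Settings referenced by the helper functions below. Reading them from
// environment variables is an assumption; substitute your own values.
const DB_CONFIG_SECRET_NAME = process.env.DB_CONFIG_SECRET_NAME;
const CLUSTER_NAME = process.env.CLUSTER_NAME;
const SERVICE_NAME = process.env.SERVICE_NAME;

// Route each event to the helper for the service that emitted it.
exports.handler = async function(event, context) {
    if ('aws.ssm' === event.source &&
            'Parameter Store Change' === event['detail-type']) {
        await handleSsmChange(event.detail);
    }
    else if ('aws.secretsmanager' === event.source &&
            'AWS API Call via CloudTrail' === event['detail-type']) {
        await handleSecretsManagerChange(event.detail);
    }
};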

Since the event schema used by each service is different, we break up processing into two helper functions, each specific to its respective secrets service. Each helper verifies that the event represents an "update" of the secret value that holds the database credentials. If so, it calls another helper function to reboot the containers.

Handling Parameter Store events

const handleSsmChange = async detail => {
    if ('Update' === detail.operation) {
        if (detail.name.includes(DB_CONFIG_SECRET_NAME)) {
            // DB credentials have been updated -
            // restart containers with ECS
            await updateEcsService(CLUSTER_NAME, SERVICE_NAME);
        }
    }
};

Handling Secrets Manager events

const handleSecretsManagerChange = async detail => {
    if (detail.errorCode && typeof detail.errorCode === 'string') {
        //  This is a failure event - we can ignore
        return;
    }

    if ('PutSecretValue' === detail.eventName) {
        const secretId = detail.requestParameters.secretId;
        if (secretId.includes(DB_CONFIG_SECRET_NAME)) {
            // DB credentials have been updated -
            // restart containers with ECS
            await updateEcsService(CLUSTER_NAME, SERVICE_NAME);
        }
    }
};

To reboot the containers, we add a helper function that calls the ECS "updateService" API with the forceNewDeployment flag set.

Reboot ECS containers

const updateEcsService = async (clusterName, serviceName) => {
    const params = {
        service: serviceName,
        cluster: clusterName,
        forceNewDeployment: true
    };

    try {
        await ecs.updateService(params).promise();
    } catch (error) {
        console.log(`ecs.UpdateService failed. Reason: ${error}`);
    }
};

REMEMBER: After you have created the Lambda function, make sure to update the CloudWatch Event rules to specify the Lambda function as the target (instead of the CloudWatch Logs group).
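
When you switch the target in the console, the permission that lets CloudWatch Events invoke your function is added for you. If you script the change instead, you have to grant it yourself. Here is a minimal sketch with the AWS SDK for JavaScript (v2). The function names and the statement id are hypothetical, and lambda and events are assumed to be AWS.Lambda and AWS.CloudWatchEvents clients.

```javascript
// Permission that lets CloudWatch Events invoke the function for one rule.
// The statement id is a hypothetical placeholder.
const buildInvokePermission = (functionName, ruleArn) => ({
    FunctionName: functionName,
    StatementId: 'allow-cloudwatch-events-invoke',
    Action: 'lambda:InvokeFunction',
    Principal: 'events.amazonaws.com',
    SourceArn: ruleArn
});

// Target entry pointing the rule at the function.
const buildLambdaTarget = (ruleName, targetId, functionArn) => ({
    Rule: ruleName,
    Targets: [{ Id: targetId, Arn: functionArn }]
});

// `lambda` and `events` are assumed to be AWS.Lambda and
// AWS.CloudWatchEvents clients from the aws-sdk package.
const pointRuleAtLambda = async (lambda, events, ruleName, ruleArn, targetId, functionArn) => {
    await lambda.addPermission(buildInvokePermission(functionArn, ruleArn)).promise();
    await events.putTargets(buildLambdaTarget(ruleName, targetId, functionArn)).promise();
};
```

Keep in mind that the original log group target stays attached to the rule until you remove it, for example with a removeTargets call.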

Wrapping it all up

Let's consider a real-world use case. Suppose we have a containerized application running on ECS. The application uses a MySQL RDS database to store state. The application retrieves database credentials from AWS Secrets Manager. Within Secrets Manager, automatic rotation has been configured for the MySQL RDS database credentials.

Now, with our system in place for detecting and responding to changes to secrets, we have the following automated workflow:

  1. Secrets Manager updates the secret (credential rotation).
  2. CloudWatch Event(s) are emitted to the system bus.
  3. The CloudWatch Events rule for AWS Secrets Manager fires on the event and invokes the Lambda handler.
  4. The Lambda function processes the event, detects that the database credentials have been updated, and then makes the ECS API call to force a new deployment.
  5. Containers associated with the ECS service are stopped and restarted per ECS service rules. As the containers restart, they receive the new database credentials.

TIP: By setting an appropriate desired task count, along with the minimum healthy percent and maximum percent in the service's deployment configuration, you can ensure zero downtime during the container reboot cycle.
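
As an illustration of the tip, the deployment settings can be passed in the same updateService call that forces the new deployment. The values below are one reasonable choice, not a recommendation for every service, and the function name is hypothetical.

```javascript
// UpdateService request that forces a new deployment while keeping the full
// desired count in service: ECS may run up to 200% of the desired count and
// must keep at least 100% healthy, so new tasks start before old ones stop.
const buildRollingRestartParams = (clusterName, serviceName) => ({
    cluster: clusterName,
    service: serviceName,
    forceNewDeployment: true,
    deploymentConfiguration: {
        minimumHealthyPercent: 100,
        maximumPercent: 200
    }
});
```

These settings only help if the cluster has room to run the extra tasks during the rollover.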

One final note

It may seem like overkill to force a new deployment when a secret is updated. However, when leveraging a "no-code" solution to secrets management (such as using ECS secrets injection via task definition files), this is likely one of the most appropriate techniques, especially considering that credential rotation happens infrequently (say, once per month) and redeployment can happen with zero downtime.

If, on the other hand, you have developed secrets management for your application by direct use of the APIs, then you can be much finer grained in your response to secrets being updated. For example, your application could have a listener for events when secrets are updated, and then simply update its connection string dynamically without any restart required.
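
As a rough sketch of that finer-grained approach, here is a small in-memory cache that re-reads the secret after a time limit or when told to. This is a simpler stand-in for a true event listener: whatever detects the change (a listener, or a failed database login) would call invalidate(). All names here are hypothetical, and the example fetcher assumes an AWS.SecretsManager client from the aws-sdk package.

```javascript
// Cache a secret in memory and re-read it once it is older than ttlMs.
// fetchSecret is any async function that returns the current secret value;
// now is injectable so the expiry logic can be exercised without waiting.
const createSecretCache = (fetchSecret, ttlMs, now = Date.now) => {
    let value;
    let fetchedAt = -Infinity;
    return {
        async get() {
            if (now() - fetchedAt >= ttlMs) {
                value = await fetchSecret();
                fetchedAt = now();
            }
            return value;
        },
        // Force the next get() to re-read, e.g. after an authentication failure.
        invalidate() {
            fetchedAt = -Infinity;
        }
    };
};

// Example fetcher, assuming an AWS.SecretsManager client.
const secretsManagerFetcher = (secretsManager, secretId) => async () => {
    const result = await secretsManager.getSecretValue({ SecretId: secretId }).promise();
    return result.SecretString;
};
```

The application would call get() each time it opens a database connection, so a rotated credential is picked up without restarting the container.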

Entrepreneur, technologist and startup junkie. Founded two companies, one with $24M in VC, the other bootstrapped with an SBA loan. Now building startup #3, looking to shake up the podcasting industry.
