Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add container images cleanup settings to API #3215

Merged

Conversation

arnaldo2792
Copy link
Contributor

Issue number:

Closes #3167

Description of changes:
This adds additional API settings for the ECS variants, to further tweak image clean up cycles. In the issue, only ECS_NUM_IMAGES_DELETE_PER_CYCLE and ECS_IMAGE_CLEANUP_INTERVAL were asked for. But while testing, I found that not all the images that I was expecting to be deleted were actually cleaned. This is because not all the images were old enough to be picked up by cleanup cycles. Thus, I decided to add two more settings that allow further tweaks to the cleaning cycles. The new configurations in this PR include:

settings.ecs.image-cleanup-wait: maps to ECS_IMAGE_CLEANUP_INTERVAL, which is the time to wait between image cleanup cycles.
settings.ecs.images-to-delete-per-cycle: maps to ECS_NUM_IMAGES_DELETE_PER_CYCLE, which is the number of images to be deleted per cleanup cycle
settings.ecs.image-cleanup-age: maps to ECS_IMAGE_MINIMUM_CLEANUP_AGE, which is the time that has to pass since the images were pulled to be considered for clean up
settings.ecs.image-cleanup-disabled: maps to ECS_DISABLE_IMAGE_CLEANUP, indicates whether image cleanup is disabled.

Testing done:

  • Created two separate tasks with two different container images
  • Configured an instance with these values:
[settings.ecs]
image-cleanup-wait = "10m"
images-to-delete-per-cycle = 1
image-cleanup-age = "1s"
image-cleanup-disabled = false
  • Confirmed that only one image was deleted per cleanup cycle:
Jun 19 22:41:23 <!> msg="Candidate image for deletion: [Image: [ImageID: sha256:fddade41cfbe8e039e9c5d5c5695fc3efd22764c6bca5cf0d9b4849dd22812f9; Names: docker.io/library/fedora:35] referenced by 0 containers; PulledAt: 2023-06-19 22:32:19.863928879 +0000 UTC m=+56.779801748; LastUsedAt: 2023-06-19 22:33:41.697169747 +0000 UTC m=+138.613042609; PullSucceeded: true]" module=docker_image_manager.go
Jun 19 22:41:23 <!> msg="Candidate image for deletion: [Image: [ImageID: sha256:1d81a0524f7d02b7673be0eb1201ab507468b736188d2678a488aeb3449d266e; Names: docker.io/library/fedora:36] referenced by 0 containers; PulledAt: 2023-06-19 22:32:42.295312319 +0000 UTC m=+79.211185190; LastUsedAt: 2023-06-19 22:33:43.190129074 +0000 UTC m=+140.106001934; PullSucceeded: true]" module=docker_image_manager.go
Jun 19 22:41:23 <!> msg="Found 2 eligible images for deletion" module=docker_image_manager.go
Jun 19 22:41:23 <!> msg="Removing Image: docker.io/library/fedora:35" module=docker_image_manager.go
Jun 19 22:41:23 <!> msg="Image removed: docker.io/library/fedora:35" module=docker_image_manager.go
Jun 19 22:41:23 <!> msg="Cleaning up all tracking information for image docker.io/library/fedora:35 as it has zero references" module=docker_image_manager.go
Jun 19 22:41:23 <!> msg="Removing Image State: [Image: [ImageID: sha256:fddade41cfbe8e039e9c5d5c5695fc3efd22764c6bca5cf0d9b4849dd22812f9; Names: ] referenced by 0 containers; PulledAt: 2023-06-19 22:32:19.863928879 +0000 UTC m=+56.779801748; LastUsedAt: 2023-06-19 22:33:41.697169747 +0000 UTC m=+138.613042609; PullSucceeded: true] from Image Manager" module=docker_image_manager.go
#
# 10 mins passed .....
#
Jun 19 22:51:23 <!> msg="Begin building map of eligible unused images for deletion" module=docker_image_manager.go
Jun 19 22:51:23 <!> msg="Candidate image for deletion: [Image: [ImageID: sha256:1d81a0524f7d02b7673be0eb1201ab507468b736188d2678a488aeb3449d266e; Names: docker.io/library/fedora:36] referenced by 0 containers; PulledAt: 2023-06-19 22:32:42.295312319 +0000 UTC m=+79.211185190; LastUsedAt: 2023-06-19 22:33:43.190129074 +0000 UTC m=+140.106001934; PullSucceeded: true]" module=docker_image_manager.go
Jun 19 22:51:23 <!> msg="Found 1 eligible images for deletion" module=docker_image_manager.go
Jun 19 22:51:23 <!> msg="Removing Image: docker.io/library/fedora:36" module=docker_image_manager.go
Jun 19 22:51:23 <!> msg="Image removed: docker.io/library/fedora:36" module=docker_image_manager.go
Jun 19 22:51:23 <!> msg="Cleaning up all tracking information for image docker.io/library/fedora:36 as it has zero references" module=docker_image_manager.go
  • Performed upgrade/downgrade testing
bash-5.1# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "<>",
    "pretty_name": "Bottlerocket OS 1.14.1 (aws-ecs-1)",
    "variant_id": "aws-ecs-1",
    "version_id": "1.14.1"
  }
}
bash-5.1# updog check-update
aws-ecs-1 1.14.2
bash-5.1# updog update -i 1.14.2 -n
Starting update to 1.14.2

Update applied: aws-ecs-1 1.14.2

### Reboot and downgrade

[ssm-user@control]$ apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "<>",
    "pretty_name": "Bottlerocket OS 1.14.2 (aws-ecs-1)",
    "variant_id": "aws-ecs-1",
    "version_id": "1.14.2"
  }
}

# Rollback commands and reboot

[ssm-user@control]$ apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "246e3c6f-dirty",
    "pretty_name": "Bottlerocket OS 1.14.1 (aws-ecs-1)",
    "variant_id": "aws-ecs-1",
    "version_id": "1.14.1"
  }
}

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

Copy link
Contributor

@yeazelm yeazelm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@arnaldo2792
Copy link
Contributor Author

(Rebase fixes for the rust version)

Copy link
Contributor

@stmcginnis stmcginnis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor things on the markdown to make it "more correct", but my main concern is on the template rendering. Looks like it's missing some of the new settings?

README.md Outdated
@@ -657,6 +657,14 @@ Valid time units include `s`, `m`, and `h`, e.g. `1h`, `1m1s`.
* `settings.ecs.reserved-memory`: The amount of memory, in MiB, reserved for critical system processes.
* `settings.ecs.task-cleanup-wait`: Time to wait before the task's containers are removed after they are stopped.
Valid time units are `s`, `m`, and `h`, e.g. `1h`, `1m1s`.
* `settings.ecs.image-cleanup-wait`: Time to wait between image cleanup cycles
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: missing period at the end of this sentence. With the next line being part of the same bullet, that will cause the two to run together.

Second lines should really be indented too, so they are clearly part of the same bullet item line. But at least for now, it looks like GitHub's markdown rendering handles this OK so the rendered output is correct.

README.md Show resolved Hide resolved
packages/ecs-agent/ecs.config Show resolved Hide resolved
@arnaldo2792
Copy link
Contributor Author

(Forced push to address nits on README.md)

README.md Outdated
Valid time units are `s`, `m`, and `h`, e.g. `1h`, `1m1s`.
* `settings.ecs.image-cleanup-wait`: Time to wait between image cleanup cycles.
Valid time units are `s`, `m`, and `h`, e.g. `1h`, `1m1s`.
* `settings.ecs.images-to-delete-per-cycle`: Number of images to delete in a single image cleanup cycle.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the switch away from the image- prefix here - maybe this?

Suggested change
* `settings.ecs.images-to-delete-per-cycle`: Number of images to delete in a single image cleanup cycle.
* `settings.ecs.image-cleanup-delete-per-cycle`: Number of images to delete in a single image cleanup cycle.

@arnaldo2792
Copy link
Contributor Author

(Forced push renames images-to-delete-per-cycle to image-cleanup-delete-per-cycle)

README.md Outdated
* `settings.ecs.image-cleanup-wait`: Time to wait between image cleanup cycles.
Valid time units are `s`, `m`, and `h`, e.g. `1h`, `1m1s`.
* `settings.ecs.image-cleanup-delete-per-cycle`: Number of images to delete in a single image cleanup cycle.
* `settings.ecs.image-cleanup-disabled`: Disable automatic images clean up after the tasks have been removed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've preferred the positive form of these booleans elsewhere in the API.

Suggested change
* `settings.ecs.image-cleanup-disabled`: Disable automatic images clean up after the tasks have been removed.
* `settings.ecs.image-cleanup-enabled`: Enable automatic images clean up after the tasks have been removed. Defaults to `true`.

@arnaldo2792
Copy link
Contributor Author

(Forced push renames image-cleanup-disabled to image-cleanup-enabled)

This adds the configurations exposed by the agent to clean up container
images after a period of time. As with previous changes, time-related
configurations are rendered at `/etc/ecs/ecs.config` instead of
`/etc/ecs/ecs.config.json`, since the JSON parser used by the ECS agent
fails to parse time-formatted strings.

Signed-off-by: Arnaldo Garcia Rincon <[email protected]>
@arnaldo2792
Copy link
Contributor Author

(Forced push removes unnecessary #[serde()])

@arnaldo2792 arnaldo2792 merged commit d5c1756 into bottlerocket-os:develop Jun 27, 2023
38 checks passed
@arnaldo2792 arnaldo2792 deleted the additional-ecs-settings branch July 20, 2023 22:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add additional settings for ECS image cleanup
5 participants