RFC: NodeOverlay #1305

ellistarn · 2024-06-06T22:18:36Z

Fixes:

Account for AWS Savings Plan when choosing to deploy on-demand or spot aws/karpenter-provider-aws#3860
feat: Adding capacityType discount as a percentage aws/karpenter-provider-aws#4697
Mega Issue: Karpenter doesnt support custom resources requests/limit #751
karpenter doesn't scaleout when using time-slicing for GPUs #729
Discover Instance Type Capacity Memory Overhead Instead of vmMemoryOverheadPercent aws/karpenter-provider-aws#5161

Description

---
# Reduce on-demand prices to 90%
# https://github.com/aws/karpenter-provider-aws/issues/3860
# https://github.com/aws/karpenter-provider-aws/pull/4697
kind: NodeOverlay
metadata:
  name: discount
spec:
  selector:
    matchLabels:
      karpenter.sh/capacity-type: on-demand
  pricePercent: 90
---
# Support for extended resource types (e.g. smarter-devices/fuse)
# https://github.com/kubernetes-sigs/karpenter/issues/751
# https://github.com/kubernetes-sigs/karpenter/issues/729
kind: NodeOverlay
metadata:
  name: discount
spec:
  selector:
    matchLabels:
      karpenter.sh/capacity-type: on-demand
    matchExpressions:
    - key: node.kubernetes.io/instance-type
      operator: In
      values:
      - m5.large
      - m5.2xlarge
      - m5.4xlarge
      - m5.8xlarge
      - m5.12xlarge
  capacity:
    smarter-devices/fuse: 1
---
# Add memory overhead of 10Mi to all instances with 2Gi memory
# https://github.com/aws/karpenter-provider-aws/issues/5161
kind: NodeOverlay
metadata:
  name: discount
spec:
  selector:
    matchLabels:
      karpenter.k8s.aws/instance-memory: 2048
  overhead:
    memory: 10Mi
---
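
The three manifests above can be read as "select some offerings by label, then adjust what Karpenter believes about them". The sketch below models that reading for `pricePercent` and `capacity` only. It is an illustration under stated assumptions, not the implementation in this PR: `Offering`, `Overlay`, `selects`, and `apply_overlay` are hypothetical names, only the `In` operator is modelled, and `overhead` is left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Offering:
    """Hypothetical stand-in for one instance offering (not a Karpenter type)."""
    labels: dict
    price: float
    capacity: dict = field(default_factory=dict)


@dataclass
class Overlay:
    """Hypothetical in-memory form of a NodeOverlay spec."""
    match_labels: dict = field(default_factory=dict)
    match_in: dict = field(default_factory=dict)  # label key -> allowed values ("In")
    price_percent: Optional[int] = None
    capacity: dict = field(default_factory=dict)


def selects(overlay: Overlay, offering: Offering) -> bool:
    """True when every matchLabels entry and every In expression is satisfied."""
    labels_ok = all(offering.labels.get(k) == v for k, v in overlay.match_labels.items())
    exprs_ok = all(offering.labels.get(k) in vals for k, vals in overlay.match_in.items())
    return labels_ok and exprs_ok


def apply_overlay(overlay: Overlay, offering: Offering) -> Offering:
    """Return an adjusted copy, or the original offering if the selector does not match."""
    if not selects(overlay, offering):
        return offering
    price = offering.price
    if overlay.price_percent is not None:
        # Assumption: pricePercent scales the provider-reported price.
        price = offering.price * overlay.price_percent / 100
    return Offering(
        labels=dict(offering.labels),
        price=price,
        # Assumption: overlay capacity is merged on top of existing capacity.
        capacity={**offering.capacity, **overlay.capacity},
    )
```

Under these assumptions, an overlay with `match_labels={"karpenter.sh/capacity-type": "on-demand"}` and `price_percent=90` leaves spot offerings untouched and scales every on-demand price to 90% of its original value.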

How was this change tested?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

k8s-ci-robot · 2024-06-06T22:18:42Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ellistarn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [ellistarn]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

sftim · 2024-06-10T22:56:15Z

(nit) In the PR description, the NodeOverlays all have the same .metadata.name

sftim · 2024-06-10T22:57:19Z

Is this an overlay for Nodes, or an overlay for NodePools (maybe more than one NodePool, if we support matching via a selector)?

jwcesign · 2024-06-11T02:55:38Z

In a real scenario with a Savings Plan, there is a commitment to a consistent amount of usage. Once the committed usage is exhausted, the price reverts to the on-demand rate. Does this PR take that into consideration?

What I expect is:

- When there is remaining committed usage, Karpenter should continue to select on-demand instances (of the instance types covered by the Savings Plan).
- When the committed usage is exhausted, Karpenter should choose spot instances.
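
The behaviour requested here can be sketched as a small decision function. This is only an illustration of the expectation in this comment, not something the PR implements: `choose_capacity_type` and its parameters are hypothetical, and the rule "the candidate node must fit entirely inside the remaining commitment" is an assumption, since the comment does not say how partial coverage should be handled. Note that `pricePercent`, as written in the PR description, is a fixed value in the manifest.

```python
def choose_capacity_type(committed_hourly: float,
                         used_hourly: float,
                         candidate_hourly_cost: float) -> str:
    """Pick a capacity type from the remaining Savings Plan commitment.

    All three arguments are hourly spend in the same currency unit.
    Returns "on-demand" while the candidate node still fits inside the
    unused commitment, and "spot" once the commitment is exhausted.
    """
    remaining = committed_hourly - used_hourly
    if remaining >= candidate_hourly_cost:
        return "on-demand"
    return "spot"
```

Something outside the overlay itself would have to supply `used_hourly`, because the commitment is consumed as nodes launch and released as they terminate.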

daimaxiaxie · 2024-06-12T03:50:15Z

@ellistarn We redefined the value of topology.kubernetes.io/zone. However, the current zone value is provided by the CloudProvider. Is it possible to support custom topology.kubernetes.io/zone and other WellKnownLabel values?

sftim · 2024-06-12T08:16:03Z

> @ellistarn We redefined the value of topology.kubernetes.io/zone. However, the current zone value is provided by the CloudProvider. Is it possible to support custom topology.kubernetes.io/zone and other WellKnownLabel values?

It sounds like you don't agree with the zone naming used in an existing integration. You can:

- use a different label (this is the easy and recommended option)
- replace the cloud provider integration code with your own (Karpenter, cloud controller manager, etc.)

I don't think this PR would need to change to accommodate the ask.

GnatorX · 2024-06-12T15:49:13Z

Thanks for trying to tackle this!

I wonder if there is a cleaner way to represent the overlays to reduce sprawl.

Thinking more generally, each change is either an addition (adding a new custom resource) or an adjustment (a price change or a resource modification), or both, assuming we aren't supporting deleting properties at this time.

Price is a slight outlier from resources, so we can separate it out. Would it make sense to have price and resources at the highest level, each with subsections for adjustments and additions? Adjustments would support +/- percentages or absolute values, and additions would just be a dict of new custom resources and their values.
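
One way to picture the layout suggested in this comment is the sketch below. Everything in it is hypothetical: the key names (`price`, `resources`, `adjustment`, `additions`), the string syntax for adjustments (`"-10%"`, `"+2"`), and the helper names are assumptions made for illustration, not part of the RFC.

```python
def adjust(value: float, adjustment: str) -> float:
    """Apply a signed adjustment such as "-10%", "+5%", "+2" or "-0.5"."""
    adjustment = adjustment.strip()
    if adjustment.endswith("%"):
        # Percentage adjustments are relative to the current value.
        return value * (1 + float(adjustment[:-1]) / 100)
    # Anything else is treated as an absolute delta.
    return value + float(adjustment)


def apply_overlay(instance: dict, overlay: dict) -> dict:
    """Return a new instance dict with price and resources adjusted or extended."""
    result = {"price": instance["price"], "resources": dict(instance["resources"])}

    price_adjustment = overlay.get("price", {}).get("adjustment")
    if price_adjustment is not None:
        result["price"] = adjust(result["price"], price_adjustment)

    resources = overlay.get("resources", {})
    # Adjustments modify resources the instance already reports.
    for name, adjustment in resources.get("adjustment", {}).items():
        result["resources"][name] = adjust(result["resources"][name], adjustment)
    # Additions introduce new custom resources.
    result["resources"].update(resources.get("additions", {}))
    return result
```

With this shape a single overlay could carry a price discount, a memory reduction, and a new extended resource together, which is the reduction in sprawl the comment asks about.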

daimaxiaxie · 2024-06-14T02:14:19Z

@sftim I mean NodeOverlay could provide more overlay options, not just resources. For example, labels.

tallaxes · 2024-07-02T01:06:18Z

designs/node-overlays.md

+### Price
+
+Each instance offering comes with a price, which is used as a minimization function in Karpenter's scheduling algorithm. However, Karpenter cannot perfectly understand all factors that influence price, such as an external vendor's fee, a pricing discount, or an external motivation such as carbon-offset.


It may be worth describing the potential interaction of client-side use (and customization) of pricing with provider-specific server-side price-sensitive allocation strategies (like those supported by EC2 Fleet).

rlindsberg · 2024-07-03T08:18:32Z

Hi all!
Do we have an estimate of when this feature will be publicly available? We have been waiting for it for over a year now.

Our use case is that we run CircleCI runner jobs on an EKS cluster, and we need this limit for buildah to work:

resources:
  limits:
    github.com/fuse: 2

But we got the following error:

{
    "level": "ERROR",
    "logger": "controller.provisioner",
    "message": "Could not schedule pod, incompatible with nodepool \"default\", daemonset overhead={\"cpu\":\"180m\",\"memory\":\"120Mi\",\"pods\":\"8\"}, no instance type satisfied resources {\"cpu\":\"2180m\",\"github.com/fuse\":\"1\",\"memory\":\"8312Mi\",\"pods\":\"9\"} and requirements karpenter.k8s.aws/instance-category In [m t], karpenter.k8s.aws/instance-generation Exists >2, karpenter.sh/capacity-type In [on-demand], karpenter.sh/nodepool In [default], kubernetes.io/arch In [amd64], kubernetes.io/os In [linux] (no instance type has enough resources); incompatible with nodepool \"arm64\", daemonset overhead={\"cpu\":\"180m\",\"memory\":\"120Mi\",\"pods\":\"8\"}, no instance type satisfied resources {\"cpu\":\"2180m\",\"github.com/fuse\":\"1\",\"memory\":\"8312Mi\",\"pods\":\"9\"} and requirements karpenter.k8s.aws/instance-category In [c m t], karpenter.k8s.aws/instance-generation Exists >2, karpenter.sh/capacity-type In [on-demand], karpenter.sh/nodepool In [arm64], kubernetes.io/arch In [arm64], kubernetes.io/os In [linux] (no instance type has enough resources)"
}
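
The "no instance type has enough resources" message is consistent with a fit check in which a resource that an instance type does not report counts as zero. The sketch below illustrates that idea only; `fits` is a hypothetical helper rather than Karpenter code, quantities are plain integers (CPU in millicores, memory in Mi), and the instance numbers in the usage note are made up.

```python
def fits(requested: dict, allocatable: dict) -> bool:
    """True when every requested resource is available in the requested amount.

    A resource missing from `allocatable` is treated as 0, so any positive
    request for an unknown extended resource can never be satisfied.
    """
    return all(allocatable.get(name, 0) >= amount for name, amount in requested.items())


def with_extra_capacity(allocatable: dict, extra: dict) -> dict:
    """Return a copy of `allocatable` with overlay-style capacity merged in."""
    return {**allocatable, **extra}
```

For example, a request of `{"cpu": 2180, "memory": 8312, "pods": 9, "github.com/fuse": 1}` does not fit an instance reporting only `{"cpu": 8000, "memory": 30000, "pods": 58}`, but it does fit once `{"github.com/fuse": 2}` is merged in, which is the effect the `capacity` field in this RFC is meant to have.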

wmgroot · 2024-07-08T16:06:17Z

designs/node-overlays.md

+      operator: In
+      values: ["m5.large", "m5.2xlarge", "m5.4xlarge", "m5.8xlarge", "m5.12xlarge"]
+  capacity:
+    smarter-devices/fuse: 1


To be clear, the "extended resource support" in this design would only support statically applied extended resources on selected nodes, correct? This would not cover cases where dynamic extended resources are needed, for example setting an extended resource for available network bandwidth which varies by EC2 instance type.

Additionally, karpenter still needs to have its provisioning and deprovisioning logic updated to account for arbitrary extended resources other than the gpu resource, right?

ellistarn · 2024-07-09

> This would not cover cases where dynamic extended resources are needed, for example setting an extended resource for available network bandwidth which varies by EC2 instance type.

You could achieve this through multiple NodeOverlays that match different selectors.

e.g. m5.large has 2, m5.2xlarge has 4.

wmgroot · 2024-07-09

I don't think defining a NodeOverlay for each network bandwidth amount is a reasonable solution for my use case. We use a wide variety of instance types and sizes.

Do you think it could be feasible to incorporate some level of templating for values that can be easily retrieved from the EC2 API?

ellistarn · 2024-07-09

For NetworkBandwidth, I'd love to just support this one first class.

We could always try to do something that attempts to model this like:

resourcesPerCPU:
  network: 1Gi

But it's hard to capture all potential relationships without devolving into something like https://github.com/google/cel-spec
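
The `resourcesPerCPU` idea floated in this thread amounts to multiplying a per-CPU rate by each instance type's CPU count, so that one overlay covers many sizes. A minimal sketch, assuming that linear relationship holds; `expand_per_cpu` is a hypothetical helper and quantities are plain numbers in whatever unit the resource uses (Gi for `network` in the example from the thread).

```python
def expand_per_cpu(resources_per_cpu: dict, cpu_count: int) -> dict:
    """Scale each per-CPU resource rate by the instance's CPU count."""
    return {name: rate * cpu_count for name, rate in resources_per_cpu.items()}
```

With `{"network": 1}` this yields `{"network": 2}` for a 2-CPU instance and `{"network": 8}` for an 8-CPU instance. Any resource that does not grow linearly with CPU count would still need either separate overlays or an expression language, which is the trade-off raised at the end of the thread.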

coveralls · 2024-07-10T23:28:51Z

Pull Request Test Coverage Report for Build 9882748974

Details

0 of 60 (0.0%) changed or added relevant lines in 1 file are covered.
No unchanged relevant lines lost coverage.
Overall coverage decreased (-0.4%) to 77.764%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
pkg/apis/v1alpha1/zz_generated.deepcopy.go	0	60	0.0%

Totals
Change from base Build 9880691959:	-0.4%
Covered Lines:	8750
Relevant Lines:	11252

💛 - Coveralls

daverin · 2024-07-18T20:47:12Z

Hi All

I'm a bit confused about the relationship between kubernetes-sigs/karpenter and aws/karpenter-provider-aws.

The lack of support for custom resource requests/limits is blocking me from autoscaling with Xilinx devices.

I want to understand how to deploy this branch to EKS.

Any help or understanding would be appreciated.

github-actions · 2024-08-02T12:01:52Z

This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity.

sftim · 2024-08-02T12:43:21Z

> I want to understand how to deploy this branch to EKS.

This branch adds a “request for comments”. @ellistarn should the RFC and implementation land as one commit? Seems unusual.

github-actions · 2024-08-17T12:01:33Z

This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity.

zaafar · 2024-09-01T23:33:46Z

I think this shouldn't be closed.

njtran · 2024-09-04T20:09:04Z

/lifecycle frozen

k8s-ci-robot · 2024-09-04T20:09:07Z

@njtran: The lifecycle/frozen label cannot be applied to Pull Requests.

In response to this:

> /lifecycle frozen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

daimaxiaxie · 2024-09-14T09:30:31Z

What is the progress now? Why does it seem not to have been updated?

github-actions · 2024-09-28T12:01:52Z

This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity.

ellistarn marked this pull request as draft June 6, 2024 22:18

k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Jun 6, 2024

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 6, 2024

k8s-ci-robot requested review from jackfrancis and jmdeal June 6, 2024 22:18

k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 6, 2024

ellistarn force-pushed the overlays branch from 9ef6d50 to af92b99 Compare June 6, 2024 22:19

k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 6, 2024

ellistarn changed the title ~~feat: Implemented NodeOverlay with support for custom pricing~~ feat: Implemented NodeOverlay with support for custom pricing, resources, and labels Jun 6, 2024

ellistarn changed the title ~~feat: Implemented NodeOverlay with support for custom pricing, resources, and labels~~ feat: Implemented NodeOverlay with support for custom pricing, capacity, overhead, and labels Jun 6, 2024

ellistarn changed the title ~~feat: Implemented NodeOverlay with support for custom pricing, capacity, overhead, and labels~~ feat: Implemented NodeOverlay with support for custom pricing, capacity, overhead, and requirements Jun 6, 2024

ellistarn force-pushed the overlays branch from af92b99 to 0033f1c Compare June 6, 2024 22:21

njtran mentioned this pull request Jun 7, 2024

feat: Add option to configure extra pod capacity for alternate cnis aws/karpenter-provider-aws#6042

Closed

3 tasks

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 11, 2024

jonathan-innis mentioned this pull request Jun 11, 2024

feat: support unknown resources #603

Closed

ellistarn mentioned this pull request Jun 12, 2024

Mega Issue: Karpenter doesnt support custom resources requests/limit #751

Open

ellistarn force-pushed the overlays branch from 0033f1c to 32b7ff4 Compare June 12, 2024 20:11

k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 12, 2024

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 18, 2024

ellistarn force-pushed the overlays branch from 32b7ff4 to 4573cb8 Compare June 20, 2024 22:50

tallaxes reviewed Jul 2, 2024

wmgroot reviewed Jul 8, 2024

docs: RFC for NodeOverlay

0f8c0e9

ellistarn force-pushed the overlays branch from f2c4ea6 to 0f8c0e9 Compare July 10, 2024 23:12

jmdeal mentioned this pull request Jul 15, 2024

Allocatable memory less than actual memory during scheduling (trn1.32x) aws/karpenter-provider-aws#6509

Open

njtran mentioned this pull request Jul 30, 2024

vmMemoryOverheadPercent is causing over provision. It should be considered only after launching machines aws/karpenter-provider-aws#6611

Closed

github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 2, 2024

github-actions bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 3, 2024

bnetzi mentioned this pull request Aug 7, 2024

vmMemoryOverheadPercent design is causing over provision aws/karpenter-provider-aws#6652

Open

github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 17, 2024

jonathan-innis mentioned this pull request Aug 26, 2024

Account for additional costs per node aws/karpenter-provider-aws#5033

Open

github-actions bot added the lifecycle/closed label Sep 1, 2024

github-actions bot closed this Sep 1, 2024

njtran reopened this Sep 4, 2024

njtran removed lifecycle/closed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 4, 2024

github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 28, 2024

jmdeal mentioned this pull request Oct 1, 2024

feat: Discover Instance Type Capacity Memory Overhead aws/karpenter-provider-aws#7004

Open

3 tasks

njtran removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 2, 2024
