[BUG] The parent queue has over the maximum #2117

Open
hiwangzhihui opened this issue Jun 26, 2024 · 1 comment
Labels
area/koord-scheduler kind/bug Create a report to help us improve

Comments

@hiwangzhihui

What happened:

  • The parent queue's used resources exceed its configured max.
    [3 screenshots]

  • Resources lent out by a child queue cannot be reclaimed.
    [1 screenshot]

What you expected to happen:

  • The parent queue's used resources never exceed its max.
  • Resources lent out by a child queue can be reclaimed.
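
To make the expectation concrete, here is the arithmetic for the quotas and pods in the reproduction steps of this report. It is a sketch under two assumptions that the text does not state explicitly: that a parent's used amount is the sum of its children's used amounts, and that all four 1-CPU pods end up admitted (the screenshots are not available here).

```latex
\begin{aligned}
\text{expected:}\quad & \mathrm{used}(a) + \mathrm{used}(b) \;\le\; \max(\mathrm{root}) = 2\ \text{CPU} \\
\text{if all four pods are admitted:}\quad & \underbrace{2 \times 1}_{\text{queue } a} + \underbrace{2 \times 1}_{\text{queue } b} = 4\ \text{CPU} \;>\; 2\ \text{CPU}
\end{aligned}
```

Each child stays within its own max of 2 CPU, so a check that only looks at the leaf quota would pass for every pod even though the parent is over its limit. The same arithmetic applies to memory (4Gi requested against a root max of 2Gi).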

How to reproduce it (as minimally and precisely as possible):

  1. Create the namespaces:
    kubectl create ns namespace1
    kubectl create ns namespace2

  2. Create the queues (parent quota root with child quotas a and b):
    `apiVersion: scheduling.sigs.k8s.io/v1alpha1
    kind: ElasticQuota
    metadata:
      name: root
      labels:
        quota.scheduling.koordinator.sh/is-parent: "true"
        quota.scheduling.koordinator.sh/allow-lent-resource: "false"
    spec:
      max:
        cpu: 2
        memory: 2Gi
      min:
        cpu: 2
        memory: 2Gi
    ---
    apiVersion: scheduling.sigs.k8s.io/v1alpha1
    kind: ElasticQuota
    metadata:
      name: a
      namespace: namespace1
      labels:
        quota.scheduling.koordinator.sh/parent: "root"
        quota.scheduling.koordinator.sh/is-parent: "false"
        quota.scheduling.koordinator.sh/allow-lent-resource: "true"
      annotations:
        quota.scheduling.koordinator.sh/shared-weight: '{"cpu":"1","memory":"1Gi"}'
    spec:
      max:
        cpu: 2
        memory: 2Gi
      min:
        cpu: 1
        memory: 1Gi
    ---
    apiVersion: scheduling.sigs.k8s.io/v1alpha1
    kind: ElasticQuota
    metadata:
      name: b
      namespace: namespace2
      labels:
        quota.scheduling.koordinator.sh/parent: "root"
        quota.scheduling.koordinator.sh/is-parent: "false"
        quota.scheduling.koordinator.sh/allow-lent-resource: "true"
      annotations:
        quota.scheduling.koordinator.sh/shared-weight: '{"cpu":"1","memory":"1Gi"}'
    spec:
      max:
        cpu: 2
        memory: 2Gi
      min:
        cpu: 1
        memory: 1Gi
    `

  3. Submit two pods to queue "a":
    `apiVersion: v1
    kind: Pod
    metadata:
      name: pod-a-1
      namespace: namespace1
      labels:
        quota.scheduling.koordinator.sh/name: "a"
        koordinator.sh/qosClass: BE
    spec:
      schedulerName: koord-scheduler
      priorityClassName: koord-batch
      containers:
        - command:
            - sleep
            - 365d
          image: nginx
          imagePullPolicy: IfNotPresent
          name: curlimage
          resources:
            limits:
              cpu: 1
              memory: 1Gi
            requests:
              cpu: 1
              memory: 1Gi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
      restartPolicy: Always
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-a-2
      namespace: namespace1
      labels:
        quota.scheduling.koordinator.sh/name: "a"
        koordinator.sh/qosClass: BE
    spec:
      schedulerName: koord-scheduler
      priorityClassName: koord-batch
      containers:
        - command:
            - sleep
            - 365d
          image: nginx
          imagePullPolicy: IfNotPresent
          name: curlimage
          resources:
            limits:
              cpu: 1
              memory: 1Gi
            requests:
              cpu: 1
              memory: 1Gi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
      restartPolicy: Always
    `

  4. Submit two pods to queue "b":
    `apiVersion: v1
    kind: Pod
    metadata:
      name: pod-b-1
      namespace: namespace2
      labels:
        quota.scheduling.koordinator.sh/name: "b"
        koordinator.sh/qosClass: LS
    spec:
      priorityClassName: koord-prod
      schedulerName: koord-scheduler
      containers:
        - command:
            - sleep
            - 365d
          image: nginx
          imagePullPolicy: IfNotPresent
          name: curlimage
          resources:
            limits:
              cpu: 1
              memory: 1Gi
            requests:
              cpu: 1
              memory: 1Gi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
      restartPolicy: Always
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-b-2
      namespace: namespace2
      labels:
        quota.scheduling.koordinator.sh/name: "b"
        koordinator.sh/qosClass: LS
    spec:
      priorityClassName: koord-prod
      schedulerName: koord-scheduler
      containers:
        - command:
            - sleep
            - 365d
          image: nginx
          imagePullPolicy: IfNotPresent
          name: curlimage
          resources:
            limits:
              cpu: 1
              memory: 1Gi
            requests:
              cpu: 1
              memory: 1Gi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
      restartPolicy: Always
    `

Anything else we need to know?:

Environment:

  • App version:
  • Kubernetes version (use kubectl version): v1.21.0
  • Install details (e.g. helm install args):
  • koordinator_1.4
  • Others:

hiwangzhihui added the kind/bug label on Jun 26, 2024

saintube (Member) commented on Jul 9, 2024

@hiwangzhihui To recursively check the parent tree, please set enableCheckParentQuota to true in the pluginArgs.
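
For readers applying this suggestion, the setting might look roughly like the following in the koord-scheduler configuration. This is a minimal sketch, not a confirmed configuration for koordinator 1.4: the `apiVersion` values are assumptions and should match whatever your existing scheduler config already uses, the rest of the profile (enabled plugins, other plugin args) is omitted, and only `enableCheckParentQuota` comes from the comment above.

```yaml
# Sketch only: the apiVersion values and the profile layout are assumptions.
# Keep whatever your existing koord-scheduler configuration already declares
# and add just the enableCheckParentQuota line.
apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: koord-scheduler
    pluginConfig:
      - name: ElasticQuota
        args:
          apiVersion: kubescheduler.config.k8s.io/v1beta2
          kind: ElasticQuotaArgs
          # Per the comment above: check the parent quota tree recursively
          # rather than only the quota the pod is submitted to.
          enableCheckParentQuota: true
```

In a Helm-based install this configuration typically lives in the scheduler's ConfigMap, and koord-scheduler usually has to be restarted before a change takes effect. After that, resubmitting the four pods from the reproduction steps should show whether the root quota's max is now enforced.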
