Completed build pods are not garbage collected (pre-4.3.x)
Build pods from completed Domino builds are not garbage collected and can remain in Kubernetes indefinitely. They are normally harmless, but once they number in the hundreds they can degrade overall Kubernetes API performance.
Finding completed build pods
Legacy Domino (K8s < 1.10 / kops / kubeadm)
kubectl get po --show-all=true --selector=buildId | egrep 'Error|Completed'
Domino 4.x < 4.3
kubectl --namespace <compute namespace> get po --selector=buildId --field-selector=status.phase!=Running
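To judge whether the number of leftover pods is approaching the hundreds, it can help to count them rather than read the list. The following is a minimal sketch for Domino 4.x < 4.3; the function name count_finished_build_pods is hypothetical, and it assumes kubectl is already configured for the cluster.

```shell
# Hypothetical helper: count build pods that are not in the Running phase.
# $1 = compute namespace
count_finished_build_pods() {
  kubectl --namespace "$1" get po \
    --selector=buildId \
    --field-selector=status.phase!=Running \
    --no-headers | wc -l | tr -d ' '
}
```

Usage: count_finished_build_pods <compute namespace>. Note that the field selector matches every phase other than Running, so pods that are still Pending are counted as well.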
Deleting completed build pods
Legacy Domino (K8s < 1.10 / kops / kubeadm)
Manual cleanup via kubectl:
kubectl -n default get po --show-all=true --selector=buildId | egrep 'Error|Completed' | awk '{print $1}' | xargs kubectl delete po -n default
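Before deleting, it may be worth previewing exactly which pod names the pipeline would pass to kubectl delete. The sketch below is one way to do that: it filters on the STATUS column (third column of the default kubectl get po output) instead of matching the whole line. The function name finished_pod_names is hypothetical.

```shell
# Hypothetical helper: read `kubectl get po` output on stdin and print the
# names of pods whose STATUS column is Error or Completed.
finished_pod_names() {
  awk 'NR > 1 && ($3 == "Error" || $3 == "Completed") { print $1 }'
}

# Example (requires cluster access):
#   kubectl -n default get po --show-all=true --selector=buildId | finished_pod_names
```

If the printed list looks right, the same names can be piped on to xargs kubectl delete po -n default as in the command above.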
Once these pods have been cleaned up, you can create a cron job on the deployment’s Salt Master to clean them up automatically going forward. After running crontab -e as root, add:
# Clean up exited Domino build pods
0 0 * * * kubectl -n default get po --show-all=true --selector=buildId | egrep 'Error|Completed' | awk '{print $1}' | xargs kubectl delete po -n default
Domino 4.x < 4.3
Manual cleanup via kubectl:
kubectl --namespace <compute namespace> delete po --selector=buildId --field-selector=status.phase!=Running
K8s CronJob for automatic cleanup:
Replace all occurrences of __FILL_ME_IN__ with the compute namespace of the stage and apply the manifest.
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: delete-completed-build-pods
  namespace: __FILL_ME_IN__
  labels:
    app.kubernetes.io/name: delete-completed-build-pods
    app.kubernetes.io/instance: delete-completed-build-pods
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: domino-delete-completed-build-pods
  labels:
    app.kubernetes.io/name: delete-completed-build-pods
    app.kubernetes.io/instance: delete-completed-build-pods
rules:
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - list
      - get
      - delete
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: domino-delete-completed-build-pods
  namespace: __FILL_ME_IN__
  labels:
    app.kubernetes.io/name: delete-completed-build-pods
    app.kubernetes.io/instance: delete-completed-build-pods
roleRef:
  kind: ClusterRole
  name: domino-delete-completed-build-pods
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: ServiceAccount
    name: delete-completed-build-pods
    namespace: __FILL_ME_IN__
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: domino-delete-completed-build-pods
  namespace: __FILL_ME_IN__
spec:
  schedule: "0 0 * * *"
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 3600
      template:
        spec:
          serviceAccountName: delete-completed-build-pods
          imagePullSecrets:
            - name: domino-quay-repos
          containers:
            - name: domino-delete-completed-build-pods
              image: quay.io/domino/bitnami.kubectl:1.18.15-20210205-1610
              imagePullPolicy: IfNotPresent
              args: ["delete", "pods", "--selector=buildId", "--field-selector=status.phase!=Running"]
          restartPolicy: OnFailure
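After filling in the namespace, the manifest can be applied and the CronJob exercised once instead of waiting for the midnight schedule. The following is a rough sketch, not an official procedure: the function name, the manifest file path, and the one-off job name manual-build-pod-cleanup are all hypothetical, and it assumes a kubectl version that supports create job --from=cronjob/....

```shell
# Hypothetical helper: apply the manifest, then trigger one cleanup run now.
# $1 = compute namespace
# $2 = path to the saved manifest (file name is an assumption)
apply_and_test_cleanup() {
  namespace="$1"
  manifest="$2"
  job_name="manual-build-pod-cleanup"   # hypothetical one-off job name

  # Create the ServiceAccount, ClusterRole, ClusterRoleBinding and CronJob.
  kubectl apply -f "$manifest" &&
  # Start a one-off Job from the CronJob's job template.
  kubectl --namespace "$namespace" create job "$job_name" \
    --from=cronjob/domino-delete-completed-build-pods &&
  # Show the Job so its completion status can be checked.
  kubectl --namespace "$namespace" get job "$job_name"
}
```

Usage: apply_and_test_cleanup <compute namespace> <manifest file>. If the one-off job completes, rerunning the kubectl get po command from the finding section should show that the finished build pods are gone.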