Version/Environment:
Domino 5.3 or later
Issue:
Creating an environment results in the build not starting or getting stuck in Queued status.
In the build logs, you see only the following lines, or no logs at all:
Mar 16 2023 12:05:11 -0400 Validating registry credentials
Mar 16 2023 12:05:12 -0400 Leasing buildkit worker
If you are a Domino Admin, you may also see hephaestus-buildkit pods in the domino-compute namespace that are not ready or are in an error status:
% kubectl get pods -n domino-compute | grep build
hephaestus-buildkit-0   0/1   Running   0   10m
hephaestus-buildkit-1   0/1   Running   0   10m
If you describe one of the pods, you also see these errors:
% kubectl describe pod -n domino-compute hephaestus-buildkit-0
...........
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 17m (x12 over 3h28m) kubelet Unable to attach or mount volumes: unmounted volumes=[cache], unattached volumes=[ca-bundle-vol cache config-vol mtls-vol]: timed out waiting for the condition
Warning FailedMount 8m34s (x18 over 4h4m) kubelet Unable to attach or mount volumes: unmounted volumes=[cache], unattached volumes=[cache config-vol mtls-vol ca-bundle-vol]: timed out waiting for the condition
Warning FailedMount 111s (x55 over 4h8m) kubelet Unable to attach or mount volumes: unmounted volumes=[cache], unattached volumes=[config-vol mtls-vol ca-bundle-vol cache]: timed out waiting for the condition
Warning FailedAttachVolume 3s (x122 over 4h6m) attachdetach-controller (combined from similar events): AttachVolume.Attach failed for volume "pvc-6d4d555c-66d1-4be2-8169-b4a264e6bdd7" : InvalidVolume.NotFound: The volume 'vol-05a6f85b95df5bdbe' does not exist.
status code: 400, request id: 72a91157-3e22-41fe-a97a-773b33c50be1
Root Cause:
The error InvalidVolume.NotFound: The volume 'vol-05a6f85b95df5bdbe' does not exist
occurs because the underlying storage backing the Persistent Volume that the PVC is bound to has been deleted or removed. In this example, that storage is an Amazon EBS volume. Because the backing volume no longer exists, the pod cannot mount the PVC.
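You can confirm this root cause by pulling the missing volume ID out of the FailedAttachVolume event and checking it against AWS. The sketch below is illustrative: the event message is copied from the output above, and the `aws ec2 describe-volumes` follow-up (shown as a comment) assumes the AWS CLI is configured for the cluster's account.

```shell
# Event message copied from the FailedAttachVolume warning above.
MSG="AttachVolume.Attach failed for volume \"pvc-6d4d555c-66d1-4be2-8169-b4a264e6bdd7\" : InvalidVolume.NotFound: The volume 'vol-05a6f85b95df5bdbe' does not exist."

# Extract the EBS volume ID ("vol-" followed by hex characters).
VOL=$(echo "$MSG" | grep -o "vol-[0-9a-f]*")
echo "$VOL"   # vol-05a6f85b95df5bdbe

# Against a real AWS account, you would then confirm the volume is gone:
#   aws ec2 describe-volumes --volume-ids "$VOL"
# An InvalidVolume.NotFound error from AWS confirms the backing storage
# was deleted outside of Kubernetes.
```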
Resolution:
NOTE: The commands outlined in this article delete the PVCs associated with environment builds, and a Domino Admin with kubectl access is required to perform these steps. If you are not comfortable deleting these PVCs, contact Domino Support for additional assistance.
1. Find the number of replicas hephaestus-buildkit has:
% kubectl get sts -n domino-compute hephaestus-buildkit -o yaml | grep replicas
  replicas: 2
  replicas: 2
Note the number of replicas, as you will need it in a later step of this article.
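The grep output prints `replicas:` twice because the field appears in both `.spec` and `.status` of the StatefulSet. If you prefer a single value, the sketch below shows one way to parse it; the first variable is a stub reproducing the grep output above so the parsing can be demonstrated end to end, and the jsonpath form in the comment is what you would run against the cluster.

```shell
# Stubbed copy of the two lines `grep replicas` prints above
# (.spec.replicas and .status.replicas).
GREP_OUT="  replicas: 2
  replicas: 2"

# Take the first match only; both lines agree when the StatefulSet is healthy.
REPLICAS=$(echo "$GREP_OUT" | awk 'NR==1 {print $2}')
echo "$REPLICAS"   # 2

# Against a live cluster, the same value comes from:
#   kubectl get sts -n domino-compute hephaestus-buildkit \
#     -o jsonpath='{.spec.replicas}'
```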
2. Scale down the hephaestus-buildkit replicas to 0:
% kubectl scale sts -n domino-compute hephaestus-buildkit --replicas=0
statefulset.apps/hephaestus-buildkit scaled
3. Find the cache-hephaestus PVCs for your deployment:
% kubectl get pvc -n domino-compute | grep cache-hephaestus
cache-hephaestus-buildkit-0   Bound   pvc-adac8d27-40c8-458c-8297-92f548a38e75   100Gi   RWO   dominodisk   109d
cache-hephaestus-buildkit-1   Bound   pvc-58ee9c93-580f-4e40-9463-5e15fa1dff22   100Gi   RWO   dominodisk   109d
The number of PVCs you will see from the output will vary between Domino deployments.
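Because the cache PVC names follow the StatefulSet ordinal pattern, you can also generate the list to delete from the replica count noted in step 1 rather than copying names by hand. This is a sketch; `REPLICAS=2` matches the example deployment above.

```shell
REPLICAS=2   # replica count from step 1; varies per deployment

# StatefulSet PVCs follow the pattern <volume>-<statefulset>-<ordinal>,
# so the full list can be generated from the replica count.
PVCS=$(for i in $(seq 0 $((REPLICAS - 1))); do
  echo "cache-hephaestus-buildkit-$i"
done)
echo "$PVCS"

# Unquoted, $PVCS expands to the whitespace-separated list expected by
# the delete command in the next step:
#   kubectl delete pvc $PVCS -n domino-compute --grace-period=0 --force
```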
4. Delete all the cache-hephaestus-buildkit PVCs listed in the output above:
% kubectl delete pvc cache-hephaestus-buildkit-0 cache-hephaestus-buildkit-1 -n domino-compute --grace-period=0 --force
5. Scale the hephaestus-buildkit replicas back to the original number noted in the first step:
% kubectl scale sts -n domino-compute hephaestus-buildkit --replicas=2
statefulset.apps/hephaestus-buildkit scaled
6. Restart the hephaestus-manager pod:
% kubectl get pods -n domino-platform | grep hephaestus
hephaestus-manager-6d79cf47d7-qndwt   2/2   Running   0   25m
% kubectl delete pod hephaestus-manager-6d79cf47d7-qndwt -n domino-platform
It takes a few minutes for the pod to come back up and for builds to resume. You may need to cancel and retry the builds for them to complete successfully.
Notes/Information:
Please contact Domino Support for further assistance if the above steps do not work.