Version:
Domino 4.x & 5.x
Issue:
When you check the status of your git-backup pods, they are shown in an Error state:
git-backup-1575137100-cblk 0/1 Error 0 07d
git-backup-1488142199-s5tg 0/1 Error 0 07d
Investigating the failure:
The first step is to inspect the logs of the failed pod to determine the root cause:
kubectl logs -n domino-platform <git-backup pod name>
ubuntu@testeval [/apps/domino] $ kubectl logs -n domino-platform git-backup-1575137100-cblk --timestamps
2023-03-07T00:00:11.377099591Z + GIT_PROJECTREPOS=projectrepos
2023-03-07T00:00:11.377162439Z + SCRATCH_PATH=/tmp/backups
2023-03-07T00:00:11.377173745Z + POSTGRESQL_HOST=postgresql-headless
2023-03-08T00:00:11.377182602Z + POSTGRESQL_USER=postgres
2023-03-08T00:00:11.377189966Z + [[ -n '' ]]
2023-03-08T00:00:11.377198897Z + [[ -n '' ]]
2023-03-08T00:00:11.377207555Z + [[ -n '' ]]
2023-03-08T00:00:11.377216431Z + [[ -n /var/opt/git ]]
2023-03-08T00:00:11.377225556Z + mkdir -p /tmp/backups/git
2023-03-08T00:00:11.382221347Z + cd /var/opt/git
2023-03-08T00:00:11.382949461Z ++ date +%Y%m%d-%H%M
2023-03-08T00:00:11.388608105Z + tar -czf /tmp/backups/git/20230308-0000.tar.gz projectrepos
2023-03-08T00:01:28.111132881Z tar: projectrepos/3d46/5f4406bb7a8bf3605d5f2995.git/objects: file changed as we read it
2023-03-08T00:01:28.111788397Z tar: projectrepos/3d46/5f4406bb7a8bf3605d5f2995.git: file changed as we read it
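The log shows that tar reported "file changed as we read it": a repository under projectrepos was modified while the backup archive was being created, and this caused the backup to fail. There is currently no locking mechanism to prevent changes during a backup, so this failure cannot be avoided if a repository is written to while the job is running.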
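To confirm the same cause on other failed pods, the log can be scanned for the tar warning. The following is a minimal sketch rather than part of Domino's tooling: the function name `count_tar_changed` is hypothetical, and the pod name in the usage comment is the one from the example log above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count tar "file changed as we read it" warnings
# in log text supplied on standard input. Prints 0 when none are found.
count_tar_changed() {
  grep -c 'file changed as we read it' || true
}

# Example usage (requires cluster access; pod name taken from the log above):
#   kubectl logs -n domino-platform git-backup-1575137100-cblk | count_tar_changed
```

A non-zero count points to the race described above. GNU tar documents exit status 1 as meaning that some files changed while being archived; that the backup script treats this status as fatal is an assumption based on the log ending at this point.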
Resolution (workaround):
1. Run the cron job manually to retry the failed backup:
# Get all cron jobs
kubectl get cronjob -A
# Describe the cron job
kubectl describe cronjob git-backup -n domino-platform
# Create a one-time manual run of the cron job and give it a unique name, e.g. 'git-backup-manual-test-001'
kubectl create job --from=cronjob/git-backup git-backup-manual-test-001 -n domino-platform
# Check whether it ran successfully
kubectl get jobs -n domino-platform
2. Alternatively, schedule the git-backup cron job to run during off-peak hours, when users are less likely to be using the system or changing files.
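For the second option, one way to move the backup to a quieter time is to patch the schedule of the git-backup cron job. This is a sketch under stated assumptions, not a Domino-documented procedure: the helper name `schedule_patch` is hypothetical, and 03:00 is only an assumed off-peak hour.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a JSON patch that sets a CronJob schedule.
# $1 is a standard five-field cron expression, e.g. "0 3 * * *".
schedule_patch() {
  printf '{"spec":{"schedule":"%s"}}' "$1"
}

# Example usage (requires cluster access):
#   # Show the current schedule
#   kubectl get cronjob git-backup -n domino-platform -o jsonpath='{.spec.schedule}'
#   # Move the backup to 03:00 (an assumed quiet hour)
#   kubectl patch cronjob git-backup -n domino-platform -p "$(schedule_patch '0 3 * * *')"
```

Unless a time zone is set on the cron job, Kubernetes interprets the schedule in the time zone of the kube-controller-manager, which is often UTC. A schedule changed this way may also be reverted if the cron job is managed by the Domino installer or a Helm release, so it is worth confirming with Domino support before relying on it.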
The right resolution ultimately depends on what the log shows, so the steps above may not apply to every failure. If you are unsure how to proceed, contact Domino support.