-
Notifications
You must be signed in to change notification settings - Fork 0
SILVA Backup Routine
- Backup scheduled to run everyday at 12:00 AM, through cronjob (0 0 * * *);
- Backup verification scheduled to run everyday at 12:30 AM, through cronjob (0 0 31 2 *);
- Each namespace (DEV, TEST and PROD) will have its own backup file;
- There will be a new backup before every deployment to TEST and PROD, since deployments triggers Flyway and apply migrations.
- Files are being saved inside the namespace on a VPC;
- In the near future, backup will be saved on S3;
- BACKUP_DIR has value /tmp/backup
- RESTORE_DIR has value /tmp/restore
- NUM_BACKUPS (Number of backup files to be retained) has value 5
- JOB_SERVICE_ACCOUNT has value default but it's required to be informed as a parameter!
- SUCCESS_JOBS_HISTORY_LIMIT (Number of successful jobs that will be retained) has value 5
- FAILED_JOBS_HISTORY_LIMIT (Number of failed jobs that will be retained) hsa value 2
- JOB_BACKOFF_LIMIT (Number of attempts to try for a successful job outcome) has value 0
- PVC_SIZE has value 256Mi
- Backup Schedule: 0 0 * * * (Every day at 12:00 AM)
- Restore Schedule: 0 0 31 2 * (Every day at 12:00 AM)
In a nutshell:
- Runs the Postgres pg_dump tool passing all the arguments in the command line
- Removes old files based in the amount of backups that should be retained
- Copy the latest saved backup SQL file for the restore dir, for easily access.
Example command:
pg_dump \ -U ${POSTGRESQL_USER} \ -h ${NAME}-${ZONE}-database \ -d ${POSTGRESQL_DATABASE} \ --data-only \ --schema=nr-spar \ --inserts \ --no-comments \ --on-conflict-do-nothing \ --no-sync \ --exclude-table=table1 \ --exclude-table=table2 \ --file=${BACKUP_DIR}/backup_$(date +%Y-%m-%d).sql \ && find "${BACKUP_DIR}" -type f -mtime +$NUM_BACKUPS -exec rm -f {} \; && cp -r ${BACKUP_DIR}/backup_$(date +%Y-%m-%d).sql ${RESTORE_DIR}/W0__restore.sql
Notes:
-
--data-only
: Dump only the data, not the schema (data definitions). Table data, large objects, and sequence values are dumped. -
--inserts
: we use insert because it's more readable and easy to understand. -
--no-comments
: discard comments. -
--on-conflict-do-nothing
: Add ON CONFLICT DO NOTHING option to INSERT. -
--no-sync
: makes the dump assync !? -
--exclude-table=table1
: exclude tables you don't need -
--file
: saves into a file named, for example, /tmp/backup/backup_2024-10-01.sql
If you need to copy the backup file for any reason, here's how you can do it.
echo "Creating backup folder" mkdir -p backups echo "Waiting for Pod to Start" oc wait --for=condition=Ready pod/nr-forest-client-782-database-pvc-backup --timeout=300s echo "Copying backup file" oc cp nr-forest-client-782-database-pvc-backup:/tmp/backup/backup_2024-02-05.sql ./backups/backup_2024-02-05.sql oc cp nr-forest-client-782-database-pvc-backup:/tmp/backup/backup_$(date +%Y-%m-%d).sql ./backups/backup_$(date +%Y-%m-%d).sql echo "Deleting backup pod" oc process -f backup.yml | oc delete -f - oc delete po --field-selector=status.phase==Succeeded
PG Dump documentation: https://www.postgresql.org/docs/current/app-pgdump.html SPAR OpenShift Backup yml file: https://github.com/bcgov/nr-spar/blob/main/common/openshift.backup.yml If you have questions, reach out to Paulo Cruz.