Backup & Recovery

Accurids stores all persisted data in Docker volumes (or Kubernetes volumes, respectively), so the data is retained automatically when the Accurids container is updated. To create a backup of the data or move it to another host, follow the steps below.

Kubernetes

We briefly describe how to make backups for two use cases:

  • Regular backups with Velero
  • Download the data to migrate it to another instance

Regular backups with Velero

For regular backups and disaster recovery, we suggest using a tool like Velero for generic Kubernetes backups.

  • The Velero homepage contains documentation on how to install the client (CLI) and the server in Kubernetes. Depending on your Kubernetes installation, you'll need to configure the storage provider (e.g. AWS S3 if you are using EKS on Amazon Web Services).
  • You can then create a backup of the namespace that contains Accurids. Also, you can specify schedules for regular backups (e.g. each night at 2am).
  • For recovery, you can re-create the complete Accurids namespace with the restore command.
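
On the command line, the three steps above might look like the following. This is a minimal sketch, assuming Velero is already installed in the cluster and a storage location is configured. The backup and schedule names, the default namespace value, the helper functions, and the DRY_RUN switch are hypothetical and only serve the illustration.

```shell
#!/usr/bin/env bash
# Sketch only: names and the default namespace are placeholders.
NAMESPACE="${NAMESPACE:-accurids}"   # assumption: namespace that contains Accurids

# Print the command when DRY_RUN=1 (arguments are not re-quoted), otherwise run it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# One-off backup of the whole namespace.
backup_now() {
  run velero backup create "$1" --include-namespaces "$NAMESPACE"
}

# Recurring backup; the schedule is a cron expression (02:00 every night).
backup_nightly() {
  run velero schedule create "$1" --schedule "0 2 * * *" --include-namespaces "$NAMESPACE"
}

# Re-create the namespace contents from an existing backup.
restore_from() {
  run velero restore create --from-backup "$1"
}
```

For example, backup_now accurids-manual creates a single backup and restore_from accurids-manual restores it later. Setting DRY_RUN=1 prints the commands instead of running them, which can help when checking the namespace and names first.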

Accurids can be configured to use an external PostgreSQL database. In that case, refer to PostgreSQL's backup documentation for backing up the database.

Copy data to a local machine

To clone all data stored within Accurids to your local machine, you will need to copy both the working directory of each Accurids node, as well as the content of the relational database. You will need to have kubectl set up and pointing to the appropriate cluster. This guide assumes that you perform the following steps in bash.

In outline: Accurids is stopped, temporary pods are used to access the volumes containing the data, and the data is copied to a local machine via those pods. Additionally, the database content is dumped.

  1. Note the current number of replicas: kubectl get sts/accurids -n <YOUR NAMESPACE> -o jsonpath='{.spec.replicas}'.
  2. Scale down the Accurids StatefulSet to 0 replicas in order to ensure data consistency: kubectl scale sts/accurids --replicas 0 -n <YOUR NAMESPACE>. If you wish to perform a backup without downtime, create a volume snapshot of each volume associated with running Accurids pods instead and provision new PersistentVolumeClaims from these snapshots as documented on the Kubernetes website. Then replace the volume claim names in step 3 with the PVCs you have created.
  3. Create temporary backup pods for each PVC. Run this for each ONGOING IDENTIFIER from 0 to your original replica count minus one:
cat <<EOF | kubectl apply -f -
kind: Pod
apiVersion: v1
metadata:
  name: accurids-backup-<ONGOING IDENTIFIER>
  namespace: <YOUR NAMESPACE>
spec:
  volumes:
    - name: working-dir
      persistentVolumeClaim:
        claimName: working-dir-accurids-<ONGOING IDENTIFIER>
  containers:
    - name: working-dir-backup
      image: debian
      command: ['sleep', '36000']
      volumeMounts:
        - mountPath: "/working-dir"
          name: working-dir
EOF
  4. Copy the working directory from those pods: kubectl cp accurids-backup-<ONGOING IDENTIFIER>:/working-dir ./accurids-backup-<ONGOING IDENTIFIER> -n <YOUR NAMESPACE>.
  5. Remove the pods: kubectl delete pod/accurids-backup-<ONGOING IDENTIFIER> -n <YOUR NAMESPACE>.
  6. Dump the database:
# Retrieve the username and password for the database user from the Kubernetes secret:
export DB_SRC_PASS=$(kubectl get secret --namespace <YOUR POSTGRES NAMESPACE> postgresql-ha-postgresql -o jsonpath="{.data.postgresql-password}" | base64 --decode)
export DB_SRC_USER=$(kubectl get secret --namespace <YOUR NAMESPACE> accurids-db -o jsonpath="{.data.username}" | base64 --decode)
# Also the database name:
export DB_SRC_DATABASE=$(kubectl get secret --namespace <YOUR NAMESPACE> accurids-db -o jsonpath="{.data.database}" | base64 --decode)
# Create a database dump on the database pod
kubectl exec postgresql-ha-postgresql-0 -c postgresql -n <YOUR POSTGRES NAMESPACE> -- bash -c "PGPASSWORD='$DB_SRC_PASS' pg_dump -U $DB_SRC_USER -F tar $DB_SRC_DATABASE > /opt/$DB_SRC_DATABASE.tar"
# Copy the dump to the local machine
kubectl cp -c postgresql -n <YOUR POSTGRES NAMESPACE> postgresql-ha-postgresql-0:/opt/$DB_SRC_DATABASE.tar ./$DB_SRC_DATABASE.tar
# Remove the dump file on the pod
kubectl exec postgresql-ha-postgresql-0 -c postgresql -n <YOUR POSTGRES NAMESPACE> -- rm /opt/$DB_SRC_DATABASE.tar
  7. Scale Accurids back up to the original set of replicas: kubectl scale sts/accurids --replicas <ORIGINAL REPLICA COUNT> -n <YOUR NAMESPACE>.

You should now have copies of the Accurids working directories, as well as of the relational database within your present working directory.
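
Creating the temporary pod, copying its working directory, and removing it again is repeated once per replica, so these steps can be wrapped in a loop. The following is a minimal sketch, assuming the PVCs follow the working-dir-accurids-<ONGOING IDENTIFIER> naming used above and that the identifiers run from 0 to the replica count minus one. The function names, the default namespace value, the DRY_RUN switch, and the 300-second wait timeout are hypothetical choices, not part of Accurids or kubectl.

```shell
#!/usr/bin/env bash
NAMESPACE="${NAMESPACE:-accurids}"   # assumption: replace with your namespace

# Print the command when DRY_RUN=1, otherwise run it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Emit the temporary pod manifest for one identifier.
backup_pod_manifest() {
  cat <<EOF
kind: Pod
apiVersion: v1
metadata:
  name: accurids-backup-$1
  namespace: $NAMESPACE
spec:
  volumes:
    - name: working-dir
      persistentVolumeClaim:
        claimName: working-dir-accurids-$1
  containers:
    - name: working-dir-backup
      image: debian
      command: ['sleep', '36000']
      volumeMounts:
        - mountPath: "/working-dir"
          name: working-dir
EOF
}

# Apply a manifest read from stdin (only announced in dry-run mode).
apply_manifest() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    cat > /dev/null
    echo "kubectl apply -f -"
  else
    kubectl apply -f -
  fi
}

# Create the pod, wait until it is ready, copy the data, and delete the pod.
backup_replica() {
  local pod="accurids-backup-$1"
  backup_pod_manifest "$1" | apply_manifest
  run kubectl wait --for=condition=Ready "pod/$pod" -n "$NAMESPACE" --timeout=300s
  run kubectl cp "$pod:/working-dir" "./$pod" -n "$NAMESPACE"
  run kubectl delete "pod/$pod" -n "$NAMESPACE"
}

# Process identifiers 0 .. replicas-1.
backup_all_replicas() {
  local replicas="$1" i=0
  while [ "$i" -lt "$replicas" ]; do
    backup_replica "$i"
    i=$((i + 1))
  done
}
```

With an original replica count of 3, backup_all_replicas 3 would process accurids-backup-0 through accurids-backup-2. Scaling the StatefulSet down beforehand and back up afterwards, as well as the database dump, remain the manual steps described above.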

Docker

Backup Accurids Data

The data is stored in two Docker volumes, working-dir and esdata. To back them up, start a temporary container that mounts the same volumes and writes their contents to tar archives.

First, stop the Accurids and Elastic containers if they are running:

docker-compose stop accurids
docker-compose stop elastic

Then navigate to the directory where you want to save the backups and run the following commands.

Save working-dir to .tar file:

docker run --rm --volumes-from accurids -v $(pwd):/backup ubuntu tar cvf /backup/working-dir.tar /working-dir

Save esdata to .tar file:

docker run --rm --volumes-from elastic -v $(pwd):/backup ubuntu tar cvf /backup/esdata.tar /usr/share/elasticsearch/data

The tar files can then be found in the present working directory.
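
Before relying on the archives, it can be worth checking that they are readable and not empty. The following is a minimal sketch that uses only tar; the helper name verify_archive is hypothetical.

```shell
#!/usr/bin/env bash
# Succeed if the archive can be read and lists at least one member.
verify_archive() {
  local archive="$1" list count
  [ -f "$archive" ] || { echo "missing: $archive" >&2; return 1; }
  # tar tf lists member names without extracting anything.
  list="$(tar tf "$archive")" || { echo "unreadable: $archive" >&2; return 1; }
  # Count the non-empty lines of the listing.
  count="$(printf '%s\n' "$list" | grep -c .)"
  [ "$count" -gt 0 ] || { echo "empty: $archive" >&2; return 1; }
  echo "$archive: $count entries"
}
```

A typical call would be verify_archive working-dir.tar && verify_archive esdata.tar, run in the directory that holds the backups.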

Restore Data from Backup

To restore the data, you have to extract the tar files into two new volumes before starting a new installation of Accurids.

Restore working-dir from .tar file:

docker volume create working-dir
docker run --rm -v working-dir:/restore -v $(pwd):/backup ubuntu bash -c "cd /restore && tar xvf /backup/working-dir.tar --strip-components=1"

Restore esdata from .tar File:

docker volume create esdata
docker run --rm -v esdata:/restore -v $(pwd):/backup ubuntu bash -c "cd /restore && tar xvf /backup/esdata.tar --strip-components=4"
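
Because the archives were created from absolute paths, tar stores member names such as working-dir/... and usr/share/elasticsearch/data/... inside them. The sketch below uses a throwaway directory instead of a Docker volume to illustrate how tar's --strip-components option drops such leading path components, so that the files land directly in the target directory. The function name and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Extract an archive into a target directory, dropping the first N path components.
extract_stripped() {
  local archive="$1" target="$2" strip="$3"
  mkdir -p "$target"
  # Members with fewer than N components (the parent directories) are skipped.
  tar xf "$archive" -C "$target" --strip-components="$strip"
}
```

Given the paths used when creating the backups, working-dir.tar would need 1 component stripped and esdata.tar would need 4, for example extract_stripped esdata.tar ./restore-test 4.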

After having restored the volumes, you can follow the installation as described in Installation with Docker. For further reference on managing docker volumes, see the Docker manual.