Backup configuration
Accurids can create a backup of the data stored in it. A backup makes it possible to restore the system's content to its state at the time of the backup after a catastrophic failure.
(Please note: This feature is available as of Accurids 1.4.0.)
To perform a regular full backup, a few configuration parameters have to be set (for a detailed description, see the sections below):
- The backup repository defines where the data is written to. Currently, Accurids supports Amazon S3 and a local file system.
- A backup schedule defines when the backup is performed.
- A recovery trigger defines under which circumstances a recovery is performed.
Backup Repository Settings
The backup repository defines where the data is written to. Currently, Accurids supports:
- S3: This is the recommended setting for production systems. In addition to Amazon S3 buckets, compatible services such as Minio are supported.
- File system: The backup data is written to a directory in the file system. Currently, this setting is intended for evaluation purposes and should not be used in production systems, especially not in cluster setups with multiple Accurids nodes.
For details, see the respective section about the backup repository below.
The configuration parameter accurids.backup.type must be used to choose the repository type (s3 or fs).
Please ensure that, at any given time, only one Accurids installation writes backups to a repository location, to avoid data corruption. Several installations can use the same repository for recoveries at the same time.
Amazon S3
- Preliminaries: To store the backup data, an S3 bucket must be created. Additionally, access to the bucket must be configured. The following AWS policy shows the recommended permissions, where arn:aws:s3:::my-accurids-backup is the ARN of the bucket:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "Statement1",
          "Effect": "Allow",
          "Action": [
            "s3:ListBucket",
            "s3:GetBucketLocation",
            "s3:ListBucketMultipartUploads",
            "s3:ListBucketVersions"
          ],
          "Resource": "arn:aws:s3:::my-accurids-backup"
        },
        {
          "Sid": "Statement2",
          "Effect": "Allow",
          "Action": [
            "s3:GetObject",
            "s3:PutObject",
            "s3:ListMultipartUploadParts",
            "s3:AbortMultipartUpload",
            "s3:GetObjectVersion",
            "s3:DeleteObject"
          ],
          "Resource": [
            "arn:aws:s3:::my-accurids-backup",
            "arn:aws:s3:::my-accurids-backup/*"
          ]
        }
      ]
    }
- Configuration settings:
Key | Description |
---|---|
accurids.backup.type | Must be set to s3 to use this repository type |
accurids.backup.s3.bucket | Mandatory, name of the S3 bucket (e.g. my-accurids-backup) |
accurids.backup.s3.basePath | Optional, the backup is placed under this path in the bucket (e.g. department1/backup) |
accurids.backup.s3.region | Optional, the region of the S3 bucket (e.g. us-east-1) |
accurids.backup.s3.endpoint | Optional, an alternative S3 endpoint, necessary when using an alternative S3-compatible service (e.g. https://s3-backup-server.example.com) |
accurids.backup.s3.accessKeyId | Optional, the access key ID of the AWS user to use (see below) |
accurids.backup.s3.secretAccessKey | Optional, the secret key for the AWS credentials |
accurids.backup.s3.readonly | Optional, if set, the backup repository can only be used for recoveries; no backups are written |
- Authentication:
If present, AWS credentials are taken from the configuration settings above (accurids.backup.s3.accessKeyId and accurids.backup.s3.secretAccessKey).
If they are not present, the AWS default mechanism applies. Among others, the following attempts to find credentials are made:
- The environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are evaluated.
- Web Identity Token credentials are retrieved from the environment or container. This is used, for example, when a container in a Kubernetes cluster is assigned a service account that is mapped to an AWS role.
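As a minimal sketch of the settings above, the following fragment points Accurids at an S3-compatible service and leaves out the access keys, so that credentials are resolved by the AWS default mechanism. It assumes the same nested YAML layout as the example configuration at the end of this page. The endpoint is the placeholder URL from the table, and the bucket name and base path are illustrative values.

```yaml
accurids:
  backup:
    type: s3
    s3:
      bucket: my-accurids-backup       # illustrative bucket name
      basePath: department1/backup     # optional path inside the bucket
      # Placeholder endpoint of an S3-compatible service; not a real server
      endpoint: https://s3-backup-server.example.com
      # accessKeyId and secretAccessKey are omitted on purpose: credentials
      # are then looked up via the AWS default mechanism, e.g. the environment
      # variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
```

Keeping the keys out of the configuration file avoids storing secrets next to other settings, at the cost of having to provide them through the environment of the Accurids process.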
File System
Not for production systems
Currently, this repository type (file system) is intended for evaluation purposes only and should not be used in production systems, especially not in cluster setups with multiple Accurids nodes.
- Configuration settings:
Key | Description |
---|---|
accurids.backup.type | Must be set to fs to use this repository type |
accurids.backup.fs.path | Mandatory, file system path to the backup directory (e.g. /opt/backup/accurids) |
accurids.backup.fs.readonly | Optional, if set, the backup repository can only be used for recoveries; no backups are written |
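For an evaluation setup, the two mandatory settings could look like the following sketch. It assumes the nested YAML layout used in the example configuration at the end of this page; the path is the example value from the table and must exist and be writable by Accurids.

```yaml
accurids:
  backup:
    type: fs
    fs:
      # Example directory from the table above; adjust to your environment
      path: /opt/backup/accurids
```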
Backup Schedule
A backup schedule defines when the backup is performed.
The parameter accurids.backup.schedule takes a "cron" expression which consists of six fields, separated by spaces.
A field contains either a number or an asterisk (*) for an arbitrary value.
Multiple values can be given by separating them with commas.
- Second (0-59)
- Minute (0-59)
- Hour (0-23)
- Day of the month (1-31)
- Month (1-12)
- Day of the week (0-6 for Sunday(0)-Saturday(6))
Example Schedules:
- Every day at 03:17:35 in the night:
35 17 3 * * *
- Every Sunday at 18:00:00:
0 0 18 * * 0
- Every first day of the month at 02:30:00:
0 30 2 1 * *
- Every first, tenth and twentieth day of the month at 02:30:00:
0 30 2 1,10,20 * *
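In a configuration file, a schedule built from these rules could be set as in the following sketch. The nested YAML layout is assumed from the example configuration at the end of this page, and the chosen times are only an illustration. The value is quoted here as a precaution, because an unquoted YAML value that starts with an asterisk would be read as an alias.

```yaml
accurids:
  backup:
    # Field order: second minute hour day-of-month month day-of-week
    # Twice a day, at 06:00:00 and at 18:00:00 (comma-separated hour values)
    schedule: "0 0 6,18 * * *"
```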
Index backup
Feature in BETA status
Please note: This feature (the index backup) is in BETA status. Do not use it in production systems yet.
By default, Accurids does not back up the search index. In case of a disaster recovery, the search index needs to be re-created from the recovered data. Depending on the number and size of the datasets hosted by Accurids, this process can take several hours.
Accurids uses Elasticsearch for the search index. To enable a backup of the index, Elasticsearch's snapshot feature has to be configured first. Currently, Accurids supports index backups only when using S3 repositories.
- Elasticsearch configuration: General information about Elasticsearch snapshots and about S3 repositories can be found in the Elasticsearch documentation.
In particular, you need to perform the following steps on the command line in the Elasticsearch directory:
- Install the S3 plugin:
    bin/elasticsearch-plugin install repository-s3
- Set the AWS access key (here the default client is used; you can choose any other identifier, see below):
    bin/elasticsearch-keystore add s3.client.default.access_key
- Set the AWS secret key (here the default client is used; you can choose any other identifier, see below):
    bin/elasticsearch-keystore add s3.client.default.secret_key
- Elasticsearch repository (optional): You can set up and configure your own Elasticsearch snapshot repository, which is then used by Accurids (see the Elasticsearch documentation). To use this feature, set the parameter accurids.backup.es.reponame to the name of your snapshot repository. Alternatively, you can let Accurids set up the configuration.
- Accurids backup configuration:
Key | Description |
---|---|
accurids.backup.es.reponame | Optional, specify to use your own Elasticsearch snapshot repository configuration |
accurids.backup.es.autoconfig | Optional, set to true to let Accurids create an Elasticsearch snapshot configuration. Ignored if reponame is specified. |
accurids.backup.es.settings.* | Optional, additional settings for the Elasticsearch snapshot repository configuration. These settings take precedence over any automatically generated ones. Ignored if autoconfig is not set. |
If you have added the S3 Elasticsearch plugin and added the AWS access and secret key to Elasticsearch as described above, it is sufficient to set accurids.backup.es.autoconfig to true.
If you have chosen a different client (other than default), you can specify this by overwriting the snapshot configuration accurids.backup.es.settings.client.
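The two cases above can be combined into one fragment, shown here as a sketch. It assumes that the dotted keys map to nested YAML keys as in the example configuration at the end of this page, including the keys below accurids.backup.es.settings. The client name backup is hypothetical and stands for whatever identifier was used instead of default when adding the keys to the Elasticsearch keystore.

```yaml
accurids:
  backup:
    type: s3                 # index backup is only supported with S3 repositories
    es:
      # Let Accurids create the Elasticsearch snapshot repository configuration
      autoconfig: true
      settings:
        # Hypothetical client name; only needed if the keystore entries were
        # added as s3.client.backup.* instead of s3.client.default.*
        client: backup
```

If the default client is used, the settings block can be left out entirely.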
Recovery
To use a backup, the recovery procedure must be triggered.
Currently, Accurids supports "Automatic recovery":
- Automatic recovery: By setting the parameter accurids.backup.autoRecovery to true, Accurids will start the recovery of the last performed backup when both of the following conditions hold:
  - Accurids is started for the first time (i.e. when the underlying database is empty).
  - A backup repository is configured which contains an existing backup.
An automatic recovery procedure makes it possible to set up a working system quickly after a failure.
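As a sketch of a recovery-only setup, the following fragment lets a fresh installation restore the last backup from an existing repository without ever writing to it. It assumes the nested YAML layout of the example configuration on this page and that the readonly setting accepts the value true; the bucket name and base path are illustrative.

```yaml
accurids:
  backup:
    type: s3
    s3:
      bucket: my-accurids-backup   # illustrative; repository written by another installation
      basePath: acc-backup
      # Assumption: a boolean value; the repository is then used for
      # recoveries only and no backups are written by this installation
      readonly: true
    # Recover the last backup on first start, when the database is still empty
    autoRecovery: true
```

This matches the rule that a repository may serve several installations for recoveries while only one installation writes backups to it.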
Example configuration
The following configuration performs a daily backup at 4:30 in the morning to an AWS S3 bucket (my-accurids-backup/acc-backup). The access key and secret key shown are randomly chosen example values.
If the system is started from scratch, the last performed backup will be recovered on startup (autoRecovery: true).
    accurids:
      backup:
        type: s3
        s3:
          bucket: my-accurids-backup
          basePath: acc-backup
          region: eu-central-1
          accessKeyId: AKIDFOO9RM3HPRJLBARO
          secretAccessKey: OaBCm+123456789LUabcdefghijk/BExyzrg
        schedule: 0 30 4 * * *
        autoRecovery: true