Restoring Neo4j Containers
This approach assumes you have credentials and wish to store your backups on Google Cloud Storage, AWS S3, or Azure Blob Storage. If this is not the case, you will need to adjust the restore script for your desired cloud storage method, but the approach will work for any backup location.
This approach works only for Neo4j 4.0+. The tools and the DBMS itself changed quite a lot between 3.5 and 4.0, and the approach here will likely not work for older databases without substantial modification.
Approach
The restore container is used as an initContainer in the main cluster. Before a node in the Neo4j cluster starts, the restore container copies down the backup set and restores it into place. When the initContainer terminates, the regular Neo4j Docker container starts and picks up where the backup left off.
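Once a pod is running, you can check what the restore initContainer actually did by reading its logs. A minimal sketch, assuming kubectl access to the cluster; the initContainer name is a placeholder and depends on how your deployment names it:

# Check the output of the restore initContainer on a running pod.
# <restore-init-container-name> is a placeholder; list the pod's initContainers
# first if you are unsure of the name.
kubectl logs <your neo4j pod name> -c <restore-init-container-name>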
This container is primarily tested against the backup .tar.gz archives produced by the backup container in this same code repository, and we recommend you use that approach. If you tar/gzip your own backups using a different approach, be careful to inspect the restore.sh script, because it makes certain assumptions about the directory structure of archived backups in order to restore properly.
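If you produced the archive yourself, a quick way to confirm that the layout matches what restore.sh expects is to list the archive contents before uploading it. A minimal sketch; the file name below is only illustrative:

# List the contents of a backup archive without extracting it.
# The "./" prefix stops GNU tar from treating the colons in the timestamp
# as a remote host specification.
tar -tzf ./neo4j-2020-06-16-12:32:57.tar.gz | head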
In order to read the backup files from cloud storage, the container needs to provide credentials or some other form of authentication. The mechanisms available depend on the cloud service provider.
Use a Service Account to access cloud storage (Google Cloud only)
GCP
Workload Identity is the recommended way to access Google Cloud services from applications running within GKE due to its improved security properties and manageability.
Follow the GCP instructions to:
- Enable Workload Identity on your GKE cluster
- Create a Google Cloud IAM service account that has read permissions for your backup location
- Bind the IAM service account to the Neo4j deployment's Kubernetes ServiceAccount*
[*] You can configure the name of the Kubernetes ServiceAccount that a Neo4j deployment uses by setting serviceAccountName in values.yaml. To check the name of the Kubernetes ServiceAccount that a Neo4j deployment is using, run kubectl get pods -o=jsonpath='{.spec.serviceAccountName}{"\n"}' <your neo4j pod name>
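For reference, the binding in the last step typically looks something like the following. This is only a sketch of the standard GKE Workload Identity commands; the project, namespace, and account names are placeholders you would replace with your own:

# Allow the Kubernetes ServiceAccount to impersonate the Google Cloud IAM
# service account (all names below are placeholders).
gcloud iam service-accounts add-iam-policy-binding \
  my-backup-reader@my-project.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:my-project.svc.id.goog[default/my-neo4j-sa]"

# Annotate the Kubernetes ServiceAccount used by the Neo4j deployment so GKE
# knows which IAM service account to map it to.
kubectl annotate serviceaccount my-neo4j-sa --namespace default \
  iam.gke.io/gcp-service-account=my-backup-reader@my-project.iam.gserviceaccount.com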
If you are unable to use Workload Identity with GKE then you can create a service key secret instead as described in the next section.
Create a service key secret to access cloud storage
First, create a Kubernetes secret that contains the content of your cloud credentials (service account key). These credentials must have permission to access the bucket and backup set that you're trying to restore.
AWS
- Create a credentials file that looks like this:
[default]
region=
aws_access_key_id=
aws_secret_access_key=
- Create a secret from this file:
kubectl create secret generic neo4j-aws-credentials \
--from-file=credentials=aws-credentials
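Optionally, before creating the secret you can sanity-check that these credentials can read the backup bucket. A sketch, assuming the AWS CLI is installed and using the example bucket name from later in this chapter:

# Point the AWS CLI at the credentials file you just wrote and list the bucket.
AWS_SHARED_CREDENTIALS_FILE=./aws-credentials aws s3 ls s3://test-neo4j/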
GCP
You do NOT need to follow the steps in this section if you are using Workload Identity for GCP.
- Create a credentials file (your service account key JSON) that looks like this:
{
"type": "",
"project_id": "",
"private_key_id": "",
"private_key": "",
"client_email": "",
"client_id": "",
"auth_uri": "",
"token_uri": "",
"auth_provider_x509_cert_url": "",
"client_x509_cert_url": ""
}
- Create a secret from this file:
kubectl create secret generic neo4j-gcp-credentials \
--from-file=credentials=gcp-credentials.json
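As with AWS, you can optionally verify the key before creating the secret. A sketch, assuming the gcloud SDK is installed and using the example bucket name from later in this chapter:

# Authenticate as the service account and confirm it can list the backup bucket.
gcloud auth activate-service-account --key-file=gcp-credentials.json
gsutil ls gs://test-neo4j/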
Azure
- Create a credentials file that looks like this:
export ACCOUNT_NAME=<NAME_STORAGE_ACCOUNT>
export ACCOUNT_KEY=<STORAGE_ACCOUNT_KEY>
- Create a secret from this file:
kubectl create secret generic neo4j-azure-credentials \
--from-file=credentials=azure-credentials.sh
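Whichever provider you use, it is worth confirming that the secret stored the file under the credentials key, since the restore container mounts the secret at /auth and expects to find the credentials file there. A sketch using the Azure secret; the same check works for the AWS and GCP secrets:

# Decode the "credentials" key of the secret and show the first few lines.
kubectl get secret neo4j-azure-credentials \
  -o jsonpath='{.data.credentials}' | base64 -d | head -n 3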
If this service key secret is not in place, the auth information cannot be mounted as a volume in the initContainer, and your pods may get stuck in the ContainerCreating phase.
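If that happens, the pod's events usually name the missing secret or the failed mount. A sketch of the usual checks:

# Inspect the pod's events for a failed secret mount.
kubectl describe pod <your neo4j pod name>

# Or scan recent cluster events for mount failures.
kubectl get events --sort-by=.metadata.creationTimestamp | grep -i mount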
Configure the initContainer for Core and Read Replica Nodes
Refer to the single instance restore deploy scenario to see how the initContainers are configured.
What you will need to customize and ensure:
- Ensure you have created the appropriate secret and set its name
- Ensure that the volume mount to /auth matches the secret name you created above
- Ensure that your BUCKET and credentials are set correctly given the way you created your secret
The example scenario above creates the initContainer just for core nodes. It is strongly recommended you do the same for readReplica.initContainers if you are using read replicas. If you restore only to core nodes and not to read replicas, the core nodes will replicate the data to the read replicas when they start. This works, but may result in longer startup times and much more bandwidth usage.
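After deploying, you can confirm that the restore initContainer was actually rendered onto both core and read replica pods. A small sketch using the pod-name placeholder from earlier:

# List the names of all initContainers on a pod; the restore container should
# appear here if the configuration took effect.
kubectl get pod <your neo4j pod name> \
  -o jsonpath='{range .spec.initContainers[*]}{.name}{"\n"}{end}'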
Restore Environment Variables for the Init Container
- To restore, add the necessary parameters to values.yaml; the relevant section should look like this:
...
core:
  ...
  restore:
    enabled: true
    secretName: (neo4j-gcp-credentials|neo4j-aws-credentials|neo4j-azure-credentials|NULL) #required. Set NULL if using Workload Identity in GKE.
    database: neo4j,system #required
    cloudProvider: (gcp|aws|azure) #required
    bucket: (gs://|s3://)test-neo4j #required
    timestamp: "latest" #optional #default:"latest"
    forceOverwrite: true #optional #default:true
    purgeOnComplete: true #optional #default:true
readReplica:
  ...
  restore:
    enabled: true
    secretName: (neo4j-gcp-credentials|neo4j-aws-credentials|neo4j-azure-credentials|NULL) #required. Set NULL if using Workload Identity in GKE.
    database: neo4j,system #required
    cloudProvider: (gcp|aws|azure) #required
    bucket: (gs://|s3://)test-neo4j #required
    timestamp: "2020-06-16-12:32:57" #optional #default:"latest"
    forceOverwrite: true #optional #default:true
    purgeOnComplete: true #optional #default:true
...
- Then run a standard Neo4j installation from the root of neo4j-helm (not tools/restore). This can be the same command you ran to create the cores/replicas originally:
helm install \
neo4j neo4j/neo4j \
-f values.yaml \
--set acceptLicenseAgreement=yes
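If you prefer not to edit values.yaml, the same restore parameters can be supplied as --set flags, since they are ordinary chart values. A sketch using the GCP secret and the example bucket; the quotes and backslash keep the comma in the database list from being treated as a value separator by helm:

helm install neo4j neo4j/neo4j \
  --set acceptLicenseAgreement=yes \
  --set core.restore.enabled=true \
  --set core.restore.secretName=neo4j-gcp-credentials \
  --set core.restore.database="neo4j\,system" \
  --set core.restore.cloudProvider=gcp \
  --set core.restore.bucket=gs://test-neo4j \
  --set core.restore.timestamp=latest \
  --set core.restore.forceOverwrite=true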
Warnings & Indications
A common way to deploy Neo4j is to restore from the last backup when a container initializes. This is good for a cluster, because it minimizes how much catch-up is needed when a node is launched. Any difference between the last backup and the rest of the cluster is provided via catch-up.
For single nodes, take extreme care here.
If a node crashes and you automatically restore from backup, force-overwriting what was previously on the disk, you will lose any data that the database captured between when the last backup was taken and when the crash happened. As a result, for single-node instances of Neo4j you should either perform restores manually when you need them, or keep a very regular backup schedule to minimize this data loss. If data loss is not acceptable under any circumstances, do not automate restores for single-node deploys.
Special notes for Azure Storage: the parameters require a "bucket", but for Azure Storage the naming is slightly different. There is no protocol scheme as there would be for AWS or Google. The bucket specified is the "blob container name", not the account name; it is where the files were placed by the backup, and is the same "blob container name" you used in the Backup chapter, assuming you followed the examples there. Relative paths are respected: if you set bucket to container/path/to/directory, the restore expects your backup files to be stored in container at the path /path/to/directory/db/db-TIMESTAMP.tar.gz, where "db" is the name of the database being backed up (i.e. neo4j and system).
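To confirm your files are where the restore expects them, you can list the container with the Azure CLI using the credentials file from earlier. A sketch; the container and path names are illustrative:

# Load ACCOUNT_NAME and ACCOUNT_KEY from the credentials file created earlier,
# then list the blobs under the expected backup path.
source azure-credentials.sh
az storage blob list \
  --container-name container \
  --account-name "$ACCOUNT_NAME" \
  --account-key "$ACCOUNT_KEY" \
  --prefix path/to/directory/ \
  --output table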