Volume mounts and persistent volumes
The Neo4j Helm chart uses volume mounts and persistent volumes to manage the storage of data and other Neo4j files.
Volume mounts
A volume mount is part of a Kubernetes Pod spec that describes how and where a volume is mounted within a container.
The Neo4j Helm chart creates the following volume mounts:

- `backups`, mounted at `/backups`
- `data`, mounted at `/data`
- `import`, mounted at `/import`
- `licenses`, mounted at `/licenses`
- `logs`, mounted at `/logs`
- `metrics`, mounted at `/metrics` (Neo4j Community Edition does not generate metrics.)
It is also possible to specify a `plugins` volume mount (mounted at `/plugins`), but it is not created by the default Helm chart.
For more information, see Add plugins using a plugins volume.
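As a minimal sketch, assuming the `share` mode described later in this section, a `plugins` volume could reuse the underlying `data` volume (illustrative values, not a definitive configuration):

```yaml
# Hypothetical values-file snippet: back the optional plugins volume
# with the same underlying volume as data, using share mode.
volumes:
  plugins:
    mode: "share"
    share:
      name: "data"
```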
Persistent volumes
A PersistentVolume (PV) is a storage resource in the Kubernetes cluster that has a lifecycle independent of any individual pod that uses it.
A PersistentVolumeClaim (PVC) is a request for a storage resource by a user; PVCs consume PV resources.
For more information about what PVs are and how they work, see the Kubernetes official documentation.
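For illustration, a minimal PVC (with a hypothetical name) that requests 10Gi of storage; Kubernetes binds it to a PV that satisfies the request:

```yaml
# A minimal PersistentVolumeClaim; the name is hypothetical.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim
spec:
  accessModes:
    - ReadWriteOnce   # mountable read-write by a single node
  resources:
    requests:
      storage: 10Gi   # the claim binds to a PV with at least this capacity
```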
The type of PV used and its configuration can have a significant effect on the performance of Neo4j, and some PV types are not suitable for use with Neo4j at all.
The volume type used for the `data` volume mount is particularly important.
Neo4j supports the following PV types for the `data` volume mount:

- `persistentVolumeClaim`
- `hostPath` (only when using Docker Desktop)

The Neo4j `data` volume mount does not support `azureFile` or `nfs`.
For volume mounts other than the `data` volume mount, generally all PV types are presumed to work.
It is also not recommended to use an HDD or cloud storage, such as AWS S3 mounted as a drive.
Mapping volume mounts to persistent volumes
By default, the Neo4j Helm chart uses a single PV, named `data`, to support all volume mounts.
The volume used for each volume mount can be changed by modifying the `volumes.<volume name>` object in the Helm chart values.
The Neo4j Helm chart `volumes` object supports different modes: `dynamic`, `share`, `defaultStorageClass`, `volume`, `selector`, and `volumeClaimTemplate`.
From Neo4j 5.10, you can also set a label on creation for volumes with mode `dynamic`, `defaultStorageClass`, `selector`, or `volumeClaimTemplate`; the label can be used to filter the PVs that are used for the volume mount.
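For example, assuming a volume was created with the label `data: "true"` (as in the examples below), the matching PVs can be listed with a label selector:

```bash
# List PVs carrying the label set via volumes.data.labels (illustrative).
kubectl get pv -l data=true
```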
mode: dynamic

Description:
Dynamic volumes are recommended for most production workloads due to ease of management. The volume mount is backed by a PV that Kubernetes dynamically provisions using a dedicated `StorageClass`, specified in the `storageClassName` field.

Example:
The `data` volume uses a dedicated storage class:

storage-class-values.yaml
```yaml
neo4j:
  name: standalone-with-storage-class
volumes:
  data:
    labels:
      data: "true"
    mode: dynamic
    dynamic:
      storageClassName: "neo4j-data"
      requests:
        storage: 10Gi
```

See Provision a PV using a dedicated `StorageClass` for more information.
mode: share

Description:
The volume mount shares the underlying volume from one of the other volume objects.

Example:
The `logs` volume mount uses the `data` volume (this is the default behavior):

```yaml
volumes:
  logs:
    mode: "share"
    share:
      name: "data"
```
mode: defaultStorageClass

Description:
The volume mount is backed by a PV that Kubernetes dynamically provisions using the default `StorageClass`.

Example:
A dynamically provisioned `data` volume with a size of `10Gi`:

```yaml
volumes:
  data:
    labels:
      data: "true"
    mode: "defaultStorageClass"
    defaultStorageClass:
      requests:
        storage: 10Gi
```

For the `data` volume, if `requests.storage` is not set, `defaultStorageClass` defaults to a `10Gi` volume. For all other volumes, `defaultStorageClass.requests.storage` must be set explicitly when using `defaultStorageClass` mode.
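For instance, a sketch of a `logs` volume in `defaultStorageClass` mode; because `logs` is not the `data` volume, the storage request must be explicit:

```yaml
volumes:
  logs:
    mode: "defaultStorageClass"
    defaultStorageClass:
      requests:
        storage: 5Gi   # required: no default size applies to non-data volumes
```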
mode: volume

Description:
A complete Kubernetes `volume` object can be specified for the volume mount. Generally, volumes specified in this way have to be manually provisioned. `volume` can be any valid Kubernetes volume type. This mode is typically used to mount a pre-existing PersistentVolumeClaim (PVC). For details on how to specify `volume` objects, see the Kubernetes documentation.

Set file permissions on mounted volumes:
The Neo4j Helm chart supports an additional field not present in normal Kubernetes `volume` objects: `setOwnerAndGroupWritableFilePermissions: true|false`. If set to `true`, an `initContainer` is run to modify the file permissions of the mounted volume so that the contents can be written and read by the Neo4j process. This helps with certain volume implementations that are not aware of the `SecurityContext` set on the pods using them.

Example: reference an existing PersistentVolumeClaim
The `backups` volume mount is backed by the specified PVC. When this method is used, the `persistentVolumeClaim` object must already exist:

```yaml
volumes:
  backups:
    mode: volume
    volume:
      persistentVolumeClaim:
        claimName: my-neo4j-pvc
```
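A sketch combining the `setOwnerAndGroupWritableFilePermissions` field described above with the same pre-existing claim (the claim name is reused from the example and must already exist):

```yaml
volumes:
  backups:
    mode: volume
    volume:
      # Run an initContainer to make the mounted files readable and
      # writable by the Neo4j process.
      setOwnerAndGroupWritableFilePermissions: true
      persistentVolumeClaim:
        claimName: my-neo4j-pvc
```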
mode: selector

Description:
The volume to use is chosen from the existing PVs based on the provided `selector` object and a dynamically generated PVC. If no matching PVs exist, the Neo4j pod is unable to start. To match, a PV must have the specified `StorageClass`, match the `selectorTemplate` labels, and have sufficient storage capacity to meet the requested storage amount.

Example:
The `import` volume is chosen from the available volumes with the `neo4j` storage class and the label `developer: alice`:

```yaml
volumes:
  import:
    labels:
      import: "true"
    mode: selector
    selector:
      storageClassName: "neo4j"
      requests:
        storage: 128Gi
      selectorTemplate:
        matchLabels:
          developer: "alice"
```
mode: volumeClaimTemplate

Description:
A complete Kubernetes `volumeClaimTemplate` object is specified for the volume mount. Volumes specified in this way are dynamically provisioned.

Example: provision Neo4j storage using a volume claim template
The `data` volume uses a dynamically provisioned PVC from the `default` storage class:

```yaml
volumes:
  data:
    labels:
      data: "true"
    mode: volumeClaimTemplate
    volumeClaimTemplate:
      storageClassName: "default"
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
```

In all cases, do not forget to set the `mode` field when customizing the `volumes` object. If it is not set, the default `mode` is used, regardless of the other properties set on the `volume` object.
Provision persistent volumes with the Neo4j Helm chart
Provision persistent volumes dynamically
With the Neo4j Helm chart, you can provision a PV dynamically using the default or a custom `StorageClass`.
To see a list of available storage classes in your Kubernetes cluster, run the following command:

```bash
kubectl get storageclass
```
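The output lists each class together with its provisioner and reclaim policy. For example, on a GKE cluster it may look similar to the following (illustrative):

```
NAME                     PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
standard-rwo (default)   pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   30d
```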
Provision a PV using a dedicated StorageClass
For production workloads, it is recommended to create a dedicated storage class for Neo4j that uses the `Retain` reclaim policy.
This avoids data loss when disks are deleted after the persistent volume resource is removed.
Example: Deploy Neo4j using a dedicated StorageClass

The following example shows how to deploy a Neo4j server with a dynamically provisioned PV that uses a dedicated `StorageClass`.

1. Create a dedicated storage class that uses the `Retain` reclaim policy.

   GKE: Create a storage class that uses the `Retain` reclaim policy and `pd-ssd` high-performance SSD disks:

   ```bash
   cat <<EOF | kubectl apply -f -
   apiVersion: storage.k8s.io/v1
   kind: StorageClass
   metadata:
     name: neo4j-data
   provisioner: pd.csi.storage.gke.io
   parameters:
     type: pd-ssd
   reclaimPolicy: Retain
   volumeBindingMode: WaitForFirstConsumer
   allowVolumeExpansion: true
   EOF
   ```

   Check that the storage class is created:

   ```bash
   kubectl get storageclass neo4j-data
   ```
   ```
   NAME         PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
   neo4j-data   pd.csi.storage.gke.io   Retain          WaitForFirstConsumer   true                   7s
   ```

   EKS: Create a storage class that uses the `Retain` reclaim policy and `gp3` high-performance SSD disks. The EBS CSI Driver addon is required to provision EBS disks in EKS clusters; see the AWS documentation for instructions on installing the driver.

   ```bash
   cat <<EOF | kubectl apply -f -
   kind: StorageClass
   apiVersion: storage.k8s.io/v1
   metadata:
     name: neo4j-data
   provisioner: ebs.csi.aws.com
   parameters:
     type: gp3
   reclaimPolicy: Retain
   allowVolumeExpansion: true
   volumeBindingMode: WaitForFirstConsumer
   EOF
   ```

   Check that the storage class is created:

   ```bash
   kubectl get storageclass neo4j-data
   ```
   ```
   NAME         PROVISIONER       RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
   neo4j-data   ebs.csi.aws.com   Retain          WaitForFirstConsumer   true                   2m41s
   ```

   AKS: Create a storage class that uses the `Retain` reclaim policy and `Premium_LRS` high-performance SSD disks:

   ```bash
   cat <<EOF | kubectl apply -f -
   apiVersion: storage.k8s.io/v1
   kind: StorageClass
   metadata:
     name: neo4j-data
   provisioner: disk.csi.azure.com
   parameters:
     skuName: Premium_LRS
   reclaimPolicy: Retain
   volumeBindingMode: WaitForFirstConsumer
   allowVolumeExpansion: true
   EOF
   ```

   Check that the storage class is created:

   ```bash
   kubectl get storageclass neo4j-data
   ```
   ```
   NAME         PROVISIONER          RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
   neo4j-data   disk.csi.azure.com   Retain          WaitForFirstConsumer   true                   7s
   ```
2. Install a Neo4j server with a data volume that uses the new storage class.

   Create a file storage-class-values.yaml that configures the data volume to use the new storage class:

   storage-class-values.yaml
   ```yaml
   neo4j:
     name: standalone-with-storage-class
   volumes:
     data:
       mode: dynamic
       dynamic:
         storageClassName: "neo4j-data"
         requests:
           storage: 10Gi
   ```

   Install a single Neo4j server:

   ```bash
   helm install standalone-with-storage-class neo4j -f storage-class-values.yaml
   ```

   When the installation completes, verify that a PVC has been created:

   ```bash
   kubectl get pvc
   ```
   ```
   NAME                                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
   data-standalone-with-storage-class-0   Bound    pvc-5d400f06-f99f-43ac-bf37-6079d692eaac   10Gi       RWO            neo4j-data     23m
   ```
3. Clean up the resources.

   The storage class uses the `Retain` reclaim policy, meaning the disk is not deleted after removing the PVC. To delete the disk, patch the PV to use the `Delete` reclaim policy and then delete the PVC:

   ```bash
   export pv_name=$(kubectl get pvc data-standalone-with-storage-class-0 -o jsonpath='{.spec.volumeName}')
   kubectl patch pv $pv_name -p '{"spec":{"persistentVolumeReclaimPolicy": "Delete"}}'
   kubectl delete pvc data-standalone-with-storage-class-0
   ```

For the `data` volume, if `requests.storage` is not set, `dynamic` defaults to a `100Gi` volume. For all other volumes, `dynamic.requests.storage` must be set explicitly when using `dynamic` mode.
Provision a PV using defaultStorageClass
Using the default `StorageClass` of the running Kubernetes cluster is the quickest way to spin up and run Neo4j for simple tests, handling small amounts of data.
However, it is not recommended for large amounts of data, as it may lead to performance issues.
Example: Deploy Neo4j using defaultStorageClass

The following example shows how to deploy a Neo4j server with a dynamically provisioned PV that uses the default `StorageClass`.

1. Create a file default-storage-class-values.yaml that configures the data volume to use the default `StorageClass` and a storage size of `100Gi`:

   default-storage-class-values.yaml
   ```yaml
   volumes:
     data:
       mode: "defaultStorageClass"
       defaultStorageClass:
         requests:
           storage: 100Gi
   ```

2. Install a single Neo4j server:

   ```bash
   helm install standalone-with-default-storage-class neo4j -f default-storage-class-values.yaml
   ```
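As with the dedicated `StorageClass` example, you can verify that a PVC was dynamically created from the default class. The output below is illustrative; the storage class name depends on your cluster:

```bash
kubectl get pvc
```
```
NAME                                           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-standalone-with-default-storage-class-0   Bound    pvc-1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d   100Gi      RWO            standard-rwo   1m
```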
Provision persistent volumes manually
Optionally, the Helm chart can use manually created disks for Neo4j storage. This installation option involves more steps than using dynamic volumes, but it provides more control over how disks are provisioned.
The instructions for the manual provisioning of PVs vary according to the type of PV being used and the underlying infrastructure. In general, there are two steps:
1. Create the disk/volume to be used for storage in the underlying infrastructure. For example:
   - If using a `csi` volume, create the persistent disk using the cloud provider CLI or console.
   - If using a `hostPath` volume, create the path (directory) on the host node.
2. Create a PV in Kubernetes that references the underlying resource created in step 1.
   - Ensure that the created PV's `app` label matches the name of the Neo4j Helm release.
   - Ensure that the created PV's `capacity.storage` matches the storage available on the underlying infrastructure.

If no suitable PV or PVC exists, the Neo4j pod will not start.
Provision a PV for Neo4j Storage using a PV selector
The Neo4j StatefulSet can select a persistent volume to use based on its labels. A Neo4j Helm release uses only manually provisioned PVs that have:

- A `storageClassName` that uses the provisioner `kubernetes.io/no-provisioner`.
- An `app` label set in their metadata that matches the `neo4j.name` value of the Helm installation.
- Sufficient storage capacity: the PV capacity must be greater than or equal to the value of `volumes.data.selector.requests.storage` set for the Neo4j Helm release (default is `100Gi`).
The neo4j/neo4j-persistent-volume Helm chart provides a convenient way to provision the persistent volume.
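For reference, a sketch of a manually provisioned PV (and its no-provisioner storage class) that satisfies the three requirements above; the names, CSI driver, and volume handle are hypothetical:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: manual
provisioner: kubernetes.io/no-provisioner   # no dynamic provisioning
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-neo4j-pv            # hypothetical name
  labels:
    app: my-neo4j              # must equal the neo4j.name of the Helm release
spec:
  storageClassName: manual
  capacity:
    storage: 100Gi             # >= volumes.data.selector.requests.storage
  accessModes:
    - ReadWriteOnce
  csi:
    driver: pd.csi.storage.gke.io   # hypothetical driver
    volumeHandle: projects/my-project/zones/us-central1-a/disks/my-disk
```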
Example: Deploy Neo4j using a selector volume

The following example shows how to deploy Neo4j using a selector volume.

GKE:

1. Create a file persistent-volume-selector.yaml that configures the data volume to use a selector:

   persistent-volume-selector.yaml
   ```yaml
   neo4j:
     name: volume-selector
   volumes:
     data:
       mode: selector
       selector:
         storageClassName: "manual"
         accessModes:
           - ReadWriteOnce
         requests:
           storage: 10Gi
   ```

2. Export environment variables to be used by the commands:

   ```bash
   export RELEASE_NAME=volume-selector
   export GCP_ZONE="$(gcloud config get compute/zone)"
   export GCP_PROJECT="$(gcloud config get project)"
   ```

3. Create the disk to be used by the persistent volume:

   ```bash
   gcloud compute disks create --size 10Gi --type pd-ssd "${RELEASE_NAME}"
   ```

4. Use the neo4j/neo4j-persistent-volume chart to configure the persistent volume. This command creates a persistent volume and a manual storage class that uses the `kubernetes.io/no-provisioner` provisioner:

   ```bash
   helm install "${RELEASE_NAME}"-disk neo4j/neo4j-persistent-volume \
     --set neo4j.name="${RELEASE_NAME}" \
     --set data.driver=pd.csi.storage.gke.io \
     --set data.storageClassName="manual" \
     --set data.reclaimPolicy="Delete" \
     --set data.createPvc=false \
     --set data.createStorageClass=true \
     --set data.volumeHandle="projects/${GCP_PROJECT}/zones/${GCP_ZONE}/disks/${RELEASE_NAME}" \
     --set data.capacity.storage=10Gi
   ```

5. Install Neo4j using the persistent-volume-selector.yaml created earlier:

   ```bash
   helm install "${RELEASE_NAME}" neo4j/neo4j -f persistent-volume-selector.yaml
   ```

6. Clean up the Helm installation and the disk created for the example:

   ```bash
   helm uninstall ${RELEASE_NAME} ${RELEASE_NAME}-disk
   kubectl delete pvc data-${RELEASE_NAME}-0
   gcloud compute disks delete ${RELEASE_NAME} --quiet
   ```
EKS:

The EBS CSI Driver addon is required to provision EBS disks in EKS clusters. You can check whether it is installed by running `kubectl get daemonset ebs-csi-node -n kube-system`. See the AWS documentation for instructions on installing the driver.

1. Create a file persistent-volume-selector.yaml that configures the data volume to use a selector:

   persistent-volume-selector.yaml
   ```yaml
   neo4j:
     name: volume-selector
   volumes:
     data:
       mode: selector
       selector:
         storageClassName: "manual"
         accessModes:
           - ReadWriteOnce
         requests:
           storage: 10Gi
   ```

2. Export environment variables to be used by the commands:

   ```bash
   readonly RELEASE_NAME=volume-selector
   readonly AWS_ZONE={availability zone of EKS cluster}
   ```

3. Create the disk to be used by the persistent volume:

   ```bash
   export volumeId=$(aws ec2 create-volume \
     --availability-zone="${AWS_ZONE}" \
     --size=10 \
     --volume-type=gp3 \
     --tag-specifications 'ResourceType=volume,Tags=[{Key=volume,Value='"${RELEASE_NAME}"'}]' \
     --no-cli-pager \
     --output text \
     --query VolumeId)
   ```

4. Use the neo4j/neo4j-persistent-volume chart to configure the persistent volume. This command creates a persistent volume and a manual storage class that uses the `kubernetes.io/no-provisioner` provisioner:

   ```bash
   helm install "${RELEASE_NAME}"-disk neo4j/neo4j-persistent-volume \
     --set neo4j.name="${RELEASE_NAME}" \
     --set data.driver=ebs.csi.aws.com \
     --set data.reclaimPolicy="Delete" \
     --set data.createPvc=false \
     --set data.createStorageClass=true \
     --set data.volumeHandle="${volumeId}" \
     --set data.capacity.storage=10Gi
   ```

5. Install Neo4j using the persistent-volume-selector.yaml created earlier:

   ```bash
   helm install "${RELEASE_NAME}" neo4j/neo4j -f persistent-volume-selector.yaml
   ```

6. Clean up the Helm installation and the disk created for the example:

   ```bash
   helm uninstall ${RELEASE_NAME} ${RELEASE_NAME}-disk
   kubectl delete pvc data-${RELEASE_NAME}-0
   aws ec2 delete-volume --volume-id ${volumeId}
   ```
AKS:

1. Create a file persistent-volume-selector.yaml that configures the data volume to use a selector:

   persistent-volume-selector.yaml
   ```yaml
   neo4j:
     name: volume-selector
   volumes:
     data:
       mode: selector
       selector:
         storageClassName: "manual"
         accessModes:
           - ReadWriteOnce
         requests:
           storage: 10Gi
   ```

2. Export environment variables to be used by the commands (`RELEASE_NAME` is also required by the commands below):

   ```bash
   readonly RELEASE_NAME=volume-selector
   readonly AKS_CLUSTER_NAME={AKS Cluster name}
   readonly AZ_RESOURCE_GROUP={Resource group of cluster}
   readonly AZ_LOCATION={Location of cluster}
   ```

3. Create the disk to be used by the persistent volume:

   ```bash
   export node_resource_group=$(az aks show --resource-group "${AZ_RESOURCE_GROUP}" \
     --name "${AKS_CLUSTER_NAME}" --query nodeResourceGroup -o tsv)
   export disk_id=$(az disk create --name "${RELEASE_NAME}" --size-gb "10" \
     --max-shares 1 --resource-group "${node_resource_group}" \
     --location ${AZ_LOCATION} --output tsv --query id)
   ```

4. Use the neo4j/neo4j-persistent-volume chart to configure the persistent volume. This command creates a persistent volume and a manual storage class that uses the `kubernetes.io/no-provisioner` provisioner:

   ```bash
   helm install "${RELEASE_NAME}"-disk neo4j/neo4j-persistent-volume \
     --set neo4j.name="${RELEASE_NAME}" \
     --set data.driver=disk.csi.azure.com \
     --set data.storageClassName="manual" \
     --set data.reclaimPolicy="Delete" \
     --set data.createPvc=false \
     --set data.createStorageClass=true \
     --set data.volumeHandle="${disk_id}" \
     --set data.capacity.storage=10Gi
   ```

5. Install Neo4j using the persistent-volume-selector.yaml created earlier:

   ```bash
   helm install "${RELEASE_NAME}" neo4j/neo4j -f persistent-volume-selector.yaml
   ```

6. Clean up the Helm installation and the disk created for the example:

   ```bash
   helm uninstall ${RELEASE_NAME} ${RELEASE_NAME}-disk
   kubectl delete pvc data-${RELEASE_NAME}-0
   az disk delete --name ${RELEASE_NAME} -y
   ```
Provision a PVC for Neo4j Storage
An alternative method for manual provisioning is to use a manually provisioned PVC.
This is supported by the Neo4j Helm chart using the `volume` mode.
The neo4j/neo4j-persistent-volume Helm chart can be used to create a PV and PVC for a manually provisioned disk.
A full example can be found in the Neo4j GitHub repository.
For example, to use a pre-existing PVC called `my-neo4j-pvc`, set these values:
```yaml
volumes:
  data:
    mode: "volume"
    volume:
      persistentVolumeClaim:
        claimName: my-neo4j-pvc
```
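For example, a sketch of creating both the PV and the PVC for a pre-provisioned GKE disk with the helper chart; the release name, project, zone, and disk are hypothetical, and the flags mirror the selector examples above with `data.createPvc=true`:

```bash
helm install my-neo4j-disk neo4j/neo4j-persistent-volume \
  --set neo4j.name="my-neo4j" \
  --set data.driver=pd.csi.storage.gke.io \
  --set data.storageClassName="manual" \
  --set data.createPvc=true \
  --set data.createStorageClass=true \
  --set data.volumeHandle="projects/my-project/zones/us-central1-a/disks/my-disk" \
  --set data.capacity.storage=10Gi
```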
Reuse a persistent volume
After uninstalling the Neo4j Helm chart, both the PVC and the PV remain and can be reused by a new install of the Helm chart.
If you delete the PVC, the PV moves into the `Released` status and is not reusable.
To be able to reuse the PV in a new install of the Neo4j Helm chart, remove its connection to the previous PVC:

1. Edit the PV by running the following command:

   ```bash
   kubectl edit pv <pv-name>
   ```

2. Remove the `spec.claimRef` section.

The PV goes back to the `Available` status and can be reused by a new install of the Neo4j Helm chart.
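Alternatively, the same change can be scripted with a JSON patch instead of an interactive edit (a sketch; substitute your PV name):

```bash
# Remove the claimRef so the PV returns to the Available status.
kubectl patch pv <pv-name> --type json -p '[{"op": "remove", "path": "/spec/claimRef"}]'
```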
The performance of Neo4j is very dependent on the latency, IOPS capacity, and throughput of the storage it is using. For the best performance of Neo4j, use the best available disks (e.g., SSD) and set IOPS throttling/quotas to high values. For some cloud providers, IOPS throttling is proportional to the size of the volume. In these cases, the best performance is achieved by setting the size of the volume based on the desired IOPS rather than on the amount required for data storage.