Tutorial: Back up and copy a single database in a running single instance
This tutorial provides a detailed example of how to back up a single database, in this example version 3.5, and use the neo4j-admin copy command to copy it into a running 4.x Neo4j standalone instance.
The neo4j-admin copy command can be used to clean up database inconsistencies, compact stores, and upgrade/migrate a database (from Community or Enterprise) to a later version of Neo4j Enterprise edition.
Since the neo4j-admin copy command does not copy the schema store, the intermediary steps of the sequential path are not needed.
If a schema is defined, you just have to recreate it by running the commands that the neo4j-admin copy operation outputs.
Keep in mind that if you want to preserve your relationship IDs, or to upgrade the whole DBMS, you should follow the sequential path.
It is important to note that neo4j-admin copy is an I/O-intensive operation. Estimations for how long the neo4j-admin copy command will take can be based on the maximum IOPS of your disc and on the size of the pages that Neo4j reads and writes.
For example, if your disc manufacturer has provided a maximum of 5000 IOPS, you can reasonably expect up to 5000 such page operations a second.
Therefore, the maximal theoretical throughput you can expect is 40 MB/s (or 144 GB/hour) on that disc.
You may then assume that the best-case scenario for running neo4j-admin copy on that disc is that it takes at least 1 hour to process a 144 GB database.
However, it is important to remember that the process must read 144 GB from the source database, and must also write to the target store (assuming the target store is of comparable size).
Additionally, there are internal processes during the copy that will read/modify/write the store multiple times.
Therefore, with an additional 144 GB of both read and write, the best-case scenario for running neo4j-admin copy on that disc is that it takes at least 3 hours to process a 144 GB database.
Finally, it is also important to consider that in almost all Cloud environments, the published IOPS value may not be the same as the actual value, and the disc may not be able to continuously maintain the maximum possible IOPS. The real processing time for this example could be well above that estimation of 3 hours.
This tutorial walks through the basics of checking your database store usage, in this example version 3.5, performing a backup, compacting the database backup (using neo4j-admin copy), and creating it in a running Neo4j 4.x standalone instance.
Check your 3.5 database store usage
Before you back up and copy your 3.5 database, let’s look at the database store usage and see how it changes when you load, delete, and then reload data.
1. Log in to Neo4j Browser of your running 3.5 Neo4j standalone instance and add 100k nodes to the graph.db database using the following command:
   FOREACH (x IN RANGE (1,100000) | CREATE (n:Person {name:x}))
2. Create an index on the name property of the Person node:
   CREATE INDEX ON :Person(name)
3. Use the dbms.checkpoint() procedure to flush all cached updates from the page cache to the store files.
   CALL dbms.checkpoint()
4. In your terminal, navigate to the graph.db database ($neo4j_home/data/databases/graph.db) and run the following command to check the store size of the loaded nodes and properties.
   ls -alh
   ...
   -rw-r--r--  1 username  staff  1.4M 26 Nov 15:51 neostore.nodestore.db
   -rw-r--r--  1 username  staff  3.9M 26 Nov 15:51 neostore.propertystore.db
   ...
   The output reports that the node store (neostore.nodestore.db) and the property store (neostore.propertystore.db) occupy 1.4M and 3.9M, respectively.
5. In Neo4j Browser, delete the nodes created above and run CALL dbms.checkpoint() again to force a checkpoint.
   MATCH (n) DETACH DELETE n
   CALL dbms.checkpoint()
6. Now, add just one node, force a checkpoint, and repeat step 4 to see if the store size has changed.
   CREATE (n:Person {name:"John"})
   CALL dbms.checkpoint()
   If you check the size of the node store and the property store now, they will still be 1.4M and 3.9M, even though the database only contains one node and one property. Neo4j does not shrink the store files on the hard drive.
In a production database, where numerous load/delete operations are performed, the result is a significant amount of unused space occupied by store files.
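The size check from step 4 can also be scripted, so that the same two store files are compared after each load or delete. The following is a minimal sketch and not part of the Neo4j tooling: store_sizes is a hypothetical helper name, and the folder passed to it is assumed to be the database folder used in this tutorial.

```shell
# Hypothetical helper (not a Neo4j command): print the size in bytes of the
# node store and the property store inside the given database folder.
store_sizes() {
  db_dir="$1"
  for f in neostore.nodestore.db neostore.propertystore.db; do
    if [ -f "$db_dir/$f" ]; then
      # wc -c reports the file size in bytes; tr strips any padding spaces
      printf '%s %s\n' "$(wc -c < "$db_dir/$f" | tr -d ' ')" "$f"
    else
      printf 'missing %s\n' "$f"
    fi
  done
}

# Example with the folder used in this tutorial:
# store_sizes "$neo4j_home/data/databases/graph.db"
```

Running the helper after step 4 and again after step 6 should report the same byte counts, which is the behavior described above.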
Back up your 3.5 database
Navigate to the /bin folder, and run the following command to back up your database in the targeted folder. If the folder where you want to place your backup does not exist, you have to create it. In this example, it is called /tmp/3.5.24.
./neo4j-admin backup --backup-dir=/tmp/3.5.24 --name=graphdbbackup
For details on performing a backup and the different command options, see Operations Manual → Perform a backup.
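Because the backup command expects the target folder to exist, it can help to check it first. This is a small sketch under the assumption that the backup folder is /tmp/3.5.24, as in this example; ensure_backup_dir is a hypothetical helper name and is not provided by Neo4j.

```shell
# Hypothetical helper: make sure the backup target folder exists and is
# writable before neo4j-admin backup is run. The name is illustrative only.
ensure_backup_dir() {
  dir="$1"
  mkdir -p "$dir" || return 1   # create the folder (and parents) if missing
  if [ -w "$dir" ]; then
    echo "backup folder ready: $dir"
  else
    echo "backup folder not writable: $dir" >&2
    return 1
  fi
}

# Example with the folder used in this tutorial:
# ensure_backup_dir /tmp/3.5.24
```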
Copy your 3.5 database backup to 4.x Neo4j
You can use the neo4j-admin copy command to reclaim the unused space and create a defragmented copy of your database backup in your 4.x standalone instance.
To speed up the copy operation, you can use the |
1. In your 4.x Neo4j standalone instance, navigate to the /bin folder and run the following command to create a compacted store copy of your 3.5 database backup. Any inconsistent nodes, properties, and relationships will not be copied over to the newly created store.
   ./neo4j-admin copy --from-path=/private/tmp/3.5.24/graphdbbackup --to-database=compactdb
   Starting to copy store, output will be saved to: $neo4j_home/logs/neo4j-admin-copy-2020-11-26.16.07.19.log
   2020-11-26 16:07:19.939+0000 INFO [StoreCopy] ### Copy Data ###
   2020-11-26 16:07:19.940+0000 INFO [StoreCopy] Source: /private/tmp/3.5.24/graphdbbackup (page cache 8m)
   2020-11-26 16:07:19.940+0000 INFO [StoreCopy] Target: $neo4j_home/data/databases/compactdb (page cache 8m)
   2020-11-26 16:07:19.940+0000 INFO [StoreCopy] Empty database created, will start importing readable data from the source.
   2020-11-26 16:07:21.661+0000 INFO [o.n.i.b.ImportLogic] Import starting

   Import starting 2020-11-26 16:07:21.699+0000
     Estimated number of nodes: 50.00 k
     Estimated number of node properties: 50.00 k
     Estimated number of relationships: 0.00
     Estimated number of relationship properties: 50.00 k
     Estimated disk space usage: 2.680MiB
     Estimated required memory usage: 8.598MiB

   (1/4) Node import 2020-11-26 16:07:22.220+0000
     Estimated number of nodes: 50.00 k
     Estimated disk space usage: 1.698MiB
     Estimated required memory usage: 8.598MiB
   .......... .......... .......... .......... ..........   5% ∆239ms
   .......... .......... .......... .......... ..........  10% ∆1ms
   .......... .......... .......... .......... ..........  15% ∆1ms
   .......... .......... .......... .......... ..........  20% ∆0ms
   .......... .......... .......... .......... ..........  25% ∆1ms
   .......... .......... .......... .......... ..........  30% ∆0ms
   .......... .......... .......... .......... ..........  35% ∆0ms
   .......... .......... .......... .......... ..........  40% ∆1ms
   .......... .......... .......... .......... ..........  45% ∆0ms
   .......... .......... .......... .......... ..........  50% ∆1ms
   .......... .......... .......... .......... ..........  55% ∆0ms
   .......... .......... .......... .......... .........-  60% ∆51ms
   .......... .......... .......... .......... ..........  65% ∆0ms
   .......... .......... .......... .......... ..........  70% ∆0ms
   .......... .......... .......... .......... ..........  75% ∆1ms
   .......... .......... .......... .......... ..........  80% ∆0ms
   .......... .......... .......... .......... ..........  85% ∆0ms
   .......... .......... .......... .......... ..........  90% ∆1ms
   .......... .......... .......... .......... ..........  95% ∆0ms
   .......... .......... .......... .......... .......... 100% ∆0ms

   (2/4) Relationship import 2020-11-26 16:07:22.543+0000
     Estimated number of relationships: 0.00
     Estimated disk space usage: 1006KiB
     Estimated required memory usage: 15.60MiB
   (3/4) Relationship linking 2020-11-26 16:07:22.879+0000
     Estimated required memory usage: 7.969MiB
   (4/4) Post processing 2020-11-26 16:07:23.272+0000
     Estimated required memory usage: 7.969MiB
   -......... .......... .......... .......... ..........   5% ∆356ms
   .......... .......... .......... .......... ..........  10% ∆0ms
   .......... .......... .......... .......... ..........  15% ∆1ms
   .......... .......... .......... .......... ..........  20% ∆0ms
   .......... .......... .......... .......... ..........  25% ∆0ms
   .......... .......... .......... .......... ..........  30% ∆1ms
   .......... .......... .......... .......... ..........  35% ∆0ms
   .......... .......... .......... .......... ..........  40% ∆0ms
   .......... .......... .......... .......... ..........  45% ∆1ms
   .......... .......... .......... .......... ..........  50% ∆0ms
   .......... .......... .......... .......... ..........  55% ∆0ms
   .......... .......... .......... .......... ..........  60% ∆0ms
   .......... .......... .......... .......... ..........  65% ∆1ms
   .......... .......... .......... .......... ..........  70% ∆0ms
   .......... .......... .......... .......... ..........  75% ∆0ms
   .......... .......... .......... .......... ..........  80% ∆0ms
   .......... .......... .......... .......... ..........  85% ∆0ms
   .......... .......... .......... .......... ..........  90% ∆0ms
   .......... .......... .......... .......... ..........  95% ∆1ms
   .......... .......... .......... .......... .......... 100% ∆0ms

   IMPORT DONE in 2s 473ms.
   Imported:
     1 nodes
     0 relationships
     1 properties
   Peak memory usage: 15.60MiB
   2020-11-26 16:07:24.140+0000 INFO [o.n.i.b.ImportLogic] Import completed successfully, took 2s 473ms. Imported:
     1 nodes
     0 relationships
     1 properties
   2020-11-26 16:07:24.668+0000 INFO [StoreCopy] Import summary: Copying of 100704 records took 4 seconds (25176 rec/s). Unused Records 100703 (99%) Removed Records 0 (0%)
   2020-11-26 16:07:24.669+0000 INFO [StoreCopy] ### Extracting schema ###
   2020-11-26 16:07:24.669+0000 INFO [StoreCopy] Trying to extract schema...
   2020-11-26 16:07:24.920+0000 INFO [StoreCopy] ... found 1 schema definitions. The following can be used to recreate the schema:
   2020-11-26 16:07:24.922+0000 INFO [StoreCopy]
   CALL db.createIndex('index_5c0607ad', ['Person'], ['name'], 'native-btree-1.0', {`spatial.cartesian-3d.min`: [-1000000.0, -1000000.0, -1000000.0],`spatial.cartesian.min`: [-1000000.0, -1000000.0],`spatial.wgs-84.min`: [-180.0, -90.0],`spatial.cartesian-3d.max`: [1000000.0, 1000000.0, 1000000.0],`spatial.cartesian.max`: [1000000.0, 1000000.0],`spatial.wgs-84-3d.min`: [-180.0, -90.0, -1000000.0],`spatial.wgs-84-3d.max`: [180.0, 90.0, 1000000.0],`spatial.wgs-84.max`: [180.0, 90.0]})
   2020-11-26 16:07:24.923+0000 INFO [StoreCopy] You have to manually apply the above commands to the database when it is stared to recreate the indexes and constraints. The commands are saved to $neo4j_home/logs/neo4j-admin-copy-2020-11-26.16.07.19.log as well for reference.
2. Run the following command to verify that the database has been successfully copied.
   ls -al ../data/databases
   total 0
   drwxr-xr-x@  5 username  staff   160 26 Nov 18:00 .
   drwxr-xr-x@  5 username  staff   160 26 Nov 18:00 ..
   drwxr-xr-x  35 username  staff  1120 26 Nov 17:58 compactdb
   -rw-r--r--   1 username  staff     0 26 Nov 18:00 store_lock
   drwxr-xr-x  33 username  staff  1056 26 Nov 18:00 system
Copying a database does not automatically create it. Therefore, it will not be visible if you run SHOW DATABASES in Cypher® Shell or Neo4j Browser.
Create your compacted backup
You can now create the copied database and compare its store size with the size of the backed up database.
1. Log in to the Cypher Shell command-line console, change the active database to system (:USE system;), and create the compactdb database. For more information about the Cypher Shell command-line interface (CLI) and how to use it, see Operations Manual → Cypher Shell.
   CREATE DATABASE compactdb;
   0 rows available after 145 ms, consumed after another 0 ms
2. Verify that the compactdb database is online.
   SHOW DATABASES;
   +-------------------------------------------------------------------------------------------------------+
   | name        | address          | role         | requestedStatus | currentStatus | error | default |
   +-------------------------------------------------------------------------------------------------------+
   | "compactdb" | "localhost:7687" | "standalone" | "online"        | "online"      | ""    | FALSE   |
   | "neo4j"     | "localhost:7687" | "standalone" | "online"        | "online"      | ""    | TRUE    |
   | "system"    | "localhost:7687" | "standalone" | "online"        | "online"      | ""    | FALSE   |
   +-------------------------------------------------------------------------------------------------------+
   3 rows available after 10 ms, consumed after another 3 ms
3. Change your active database to compactdb and recreate the schema using the output from the neo4j-admin copy command.
   CALL db.createIndex('index_5c0607ad', ['Person'], ['name'], 'native-btree-1.0', {`spatial.cartesian-3d.min`: [-1000000.0, -1000000.0, -1000000.0],`spatial.cartesian.min`: [-1000000.0, -1000000.0],`spatial.wgs-84.min`: [-180.0, -90.0],`spatial.cartesian-3d.max`: [1000000.0, 1000000.0, 1000000.0],`spatial.cartesian.max`: [1000000.0, 1000000.0],`spatial.wgs-84-3d.min`: [-180.0, -90.0, -1000000.0],`spatial.wgs-84-3d.max`: [180.0, 90.0, 1000000.0],`spatial.wgs-84.max`: [180.0, 90.0]});
   +-----------------------------------------------------------------------------------+
   | name             | labels     | properties | providerName       | status          |
   +-----------------------------------------------------------------------------------+
   | "index_5c0607ad" | ["Person"] | ["name"]   | "native-btree-1.0" | "index created" |
   +-----------------------------------------------------------------------------------+
   1 row available after 50 ms, consumed after another 5 ms
4. Verify that all the data has been successfully copied. In this example, there should be one node.
   MATCH (n) RETURN n.name;
   +--------+
   | n.name |
   +--------+
   | "John" |
   +--------+
   1 row available after 106 ms, consumed after another 2 ms
5. Exit the Cypher Shell command-line console.
   :exit;
   Bye!
6. Navigate to the compactdb database ($neo4j_home/data/databases/compactdb) and check the store size of the copied nodes and properties.
   ls -alh
   ...
   -rw-r--r--  1 username  staff  8.0K 26 Nov 17:58 neostore.nodestore.db
   -rw-r--r--  1 username  staff  8.0K 26 Nov 17:58 neostore.propertystore.db
   ...
   The output reports that the node store and the property store now occupy only 8K each, compared to the previous 1.4M and 3.9M.
MB/s = (IOPS * B) ÷ 10^6, where B is the block size in bytes; in the case of Neo4j, this is 8000. GB/hour can then be calculated from (MB/s * 3600) ÷ 1000.
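The formula above can be turned into a quick calculation. The sketch below is only an illustration: estimate is a hypothetical helper name, the 5000 IOPS and 144 GB inputs are the example values from this tutorial, and the multiplier of 3 is an assumption that reflects the read, write, and internal passes described in the 3-hour estimate. Real copy times can be considerably longer.

```shell
# Hypothetical helper: rough lower bound for the copy time, using
#   MB/s    = (IOPS * B) / 10^6
#   GB/hour = (MB/s * 3600) / 1000
# Arguments: IOPS, block size in bytes, database size in GB, and an assumed
# multiplier for how many times the data volume is processed.
estimate() {
  awk -v iops="$1" -v b="$2" -v size="$3" -v passes="$4" 'BEGIN {
    mb_per_s    = (iops * b) / 1000000
    gb_per_hour = (mb_per_s * 3600) / 1000
    hours       = (size * passes) / gb_per_hour
    printf "%.0f MB/s, %.0f GB/hour, at least %.1f hours\n", mb_per_s, gb_per_hour, hours
  }'
}

# Example values from this tutorial: 5000 IOPS, 8000-byte blocks, 144 GB store.
estimate 5000 8000 144 3
# prints: 40 MB/s, 144 GB/hour, at least 3.0 hours
```

Changing the first argument to the IOPS value that your disc actually sustains gives a more realistic lower bound than the published maximum.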