Monitoring system
GDS supports multiple users concurrently working on the same system. Typically, GDS procedures are resource heavy in the sense that they may use a lot of memory and/or many CPU cores to do their computation. To know whether it is a reasonable time for a user to run a GDS procedure it is useful to know the current capacity of the system hosting Neo4j and GDS, as well as the current GDS workload on the system. Graphs and models are not shared between non-admin users by default, however GDS users on the same system will share its capacity.
System monitor procedure
To be able to get an overview of the system’s current capacity and its analytics workload one can use the procedure gds.systemMonitor
.
It will give you information on the capacity of the DBMS’s JVM instance in terms of memory and CPU cores, and an overview of the resources consumed by the GDS procedures currently being run on the system.
Syntax
CALL gds.systemMonitor()
YIELD
freeHeap,
totalHeap,
maxHeap,
jvmAvailableCpuCores,
availableCpuCoresNotRequested,
jvmHeapStatus,
ongoingGdsProcedures
Name | Type | Description |
---|---|---|
freeHeap |
Integer |
The amount of currently free memory in bytes in the Java Virtual Machine hosting the Neo4j instance. |
totalHeap |
Integer |
The total amount of memory in bytes in the Java virtual machine hosting the Neo4j instance. This value may vary over time, depending on the host environment. |
maxHeap |
Integer |
The maximum amount of memory in bytes that the Java virtual machine hosting the Neo4j instance will attempt to use. |
jvmAvailableCpuCores |
Integer |
The number of logical CPU cores currently available to the Java virtual machine. This value may change vary over the lifetime of the DBMS. |
availableCpuCoresNotRequested |
Integer |
The number of logical CPU cores currently available to the Java virtual machine that are not requested for use by currently running GDS procedures. Note that this number may be negative in case there are fewer available cores to the JVM than there are cores being requested by ongoing GDS procedures. |
jvmHeapStatus |
Map |
The above-mentioned heap metrics in human-readable form. |
ongoingGdsProcedures |
List of Map |
A list of maps containing resource usage and progress information for all GDS procedures (of all users) currently running on the Neo4j instance. Each map contains the name of the procedure, how far it has progressed, its estimated memory usage as well as how many CPU cores it will try to use at most. |
|
Example
First let us assume that we just started gds.node2vec.stream
procedure with some arbitrary parameters.
We can have a look at the status of the JVM heap.
CALL gds.systemMonitor()
YIELD
freeHeap,
totalHeap,
maxHeap
freeHeap | totalHeap | maxHeap |
---|---|---|
1234567 |
2345678 |
3456789 |
We can see that there currently is around 1.23 MB
free heap memory in the JVM instance running our Neo4j DBMS.
This may increase independently of any procedures finishing their execution as totalHeap
is currently smaller than maxHeap
.
We can also inspect CPU core usage as well as the status of currently running GDS procedures on the system.
CALL gds.systemMonitor()
YIELD
availableCpuCoresNotRequested,
jvmAvailableCpuCores,
ongoingGdsProcedures
jvmAvailableCpuCores | availableCpuCoresNotRequested | ongoingGdsProcedures |
---|---|---|
100 |
84 |
[{ username: "bob", jobId: "42", procedure: "Node2Vec", progress: "33.33%", estimatedMemoryRange: "[123 kB … 234 kB]", requestedNumberOfCpuCores: "16" }] |
Here we can note that there is only one GDS procedure currently running, namely the Node2Vec
procedure we just started. It has finished around 33.33%
of its execution already.
We also see that it may use up to an estimated 234 kB
of memory.
Note that it may not currently be using that much memory and so it may require more memory later in its execution, thus possible lowering our current freeHeap
.
Apparently it wants to use up to 16
CPU cores, leaving us with a total of 84
currently available cores in the system not requested by any GDS procedures.