How to monitor if a follower is in sync with Leader (Causal Cluster)
To monitor whether a Follower is in sync with its Leader, or to find out how far it is lagging behind, you can compare the Last Committed Transaction Id of the Leader and the Follower.
The Last Committed Transaction Id can be obtained in one of the following ways:
- From the Neo4j Browser
- Via the Neo4j Metrics
- Via the JMX MBeans
1. Checking Last Transaction Id from the Neo4j Web Interface
From the Neo4j Browser:
- type :sysinfo and hit enter
- from the "Transactions" frame, identify the parameter "Last Tx Id"
You can also call the dbms.queryJmx procedure in the following way:
call dbms.queryJmx("org.neo4j:instance=kernel#0,name=Transactions") yield attributes
return attributes["LastCommittedTxId"]
2. Checking Last Committed Transaction Id via the Neo4j Metrics
Assuming Neo4j’s CSV metrics are already enabled, you can analyse the following CSV file: neo4j.transaction.last_committed_tx_id.csv.
From 3.4 onwards metrics are enabled by default. If you are running any version prior to 3.4, you need to enable the metrics in your |
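As a minimal sketch of reading that metrics file, the Python snippet below returns the most recent value recorded in it. It assumes the file starts with a `t,value` header row followed by one `timestamp,value` row per sample; check this against your own file, since the layout is an assumption here. The helper name `last_committed_tx_id` and the sample data are hypothetical.

```python
import csv
import io


def last_committed_tx_id(csv_text):
    """Return the most recent value recorded in a CSV metrics file.

    Assumes a header row of "t,value" followed by one sample per row.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    last_row = None
    for row in reader:
        last_row = row  # keep overwriting so we end up with the final sample
    if last_row is None:
        raise ValueError("metrics file contains no samples")
    return int(last_row["value"])


# Hypothetical file content, for illustration only.
sample = "t,value\n1528454400,1041\n1528454403,1057\n"
print(last_committed_tx_id(sample))  # prints 1057
```

In practice you would pass in the content of neo4j.transaction.last_committed_tx_id.csv read from each cluster member.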
3. Checking Last Committed Transaction Id via the JMX MBeans
Please check:
- LastCommittedTxId
You can do this using curl if you prefer:
$ curl -v http://localhost:7474/db/manage/server/jmx/domain/org.neo4j/instance%3Dkernel%230%2Cname%3DTransactions
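If you want to extract the value from the response of that request in a script, a sketch along the following lines may help. It assumes the endpoint returns a JSON array of beans, each carrying an `attributes` list whose entries have `name` and `value` fields; verify this against the actual response of your server, as the exact shape is an assumption. The helper name `find_attribute` and the sample response are hypothetical.

```python
import json


def find_attribute(jmx_json_text, attribute_name):
    """Return the value of the named attribute from a JMX JSON response.

    Assumes a JSON array of beans, each with an "attributes" list of
    objects that carry "name" and "value" keys.
    """
    beans = json.loads(jmx_json_text)
    for bean in beans:
        for attribute in bean.get("attributes", []):
            if attribute.get("name") == attribute_name:
                return attribute.get("value")
    raise KeyError(attribute_name)


# Hypothetical, heavily trimmed response body, for illustration only.
sample_response = """
[{"name": "org.neo4j:instance=kernel#0,name=Transactions",
  "attributes": [{"name": "LastCommittedTxId", "value": 1057}]}]
"""
print(find_attribute(sample_response, "LastCommittedTxId"))  # prints 1057
```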
For more information on the supported Neo4j JMX MBeans and how to connect to JMX monitoring programmatically or via JConsole, please refer to https://neo4j.com/docs/java-reference/current/jmx-metrics/
Determining how much a Follower is lagging behind its Leader
To determine how much a Follower is lagging behind its Leader, compare the Last Committed Transaction Id (obtained with any of the methods described above) of the Leader and the Follower:
(Last Committed Transaction Id)_leader - (Last Committed Transaction Id)_follower
The higher the difference, the further the Follower is lagging behind (in terms of committed transactions). Because data propagation depends on a combination of factors such as transaction size, concurrency, hardware and network latency, it’s virtually impossible to translate this difference into a unit of time.
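The subtraction above can be sketched in a few lines of Python. The function name `follower_lag`, the member names and the transaction ids are all hypothetical; the ids would come from whichever of the three methods you used.

```python
def follower_lag(leader_tx_id, follower_tx_ids):
    """Return the lag, in committed transactions, of each Follower.

    follower_tx_ids maps a Follower's name to its Last Committed Transaction Id.
    """
    return {
        name: leader_tx_id - tx_id
        for name, tx_id in follower_tx_ids.items()
    }


# Hypothetical readings, for illustration only.
lag = follower_lag(1057, {"core-2": 1057, "core-3": 1049})
print(lag)  # prints {'core-2': 0, 'core-3': 8}
```

A lag of 0 means the Follower had committed the same transactions as the Leader at the time of the reading. Since the ids are read from different servers at slightly different moments, a small non-zero (or even negative) value most likely reflects the timing of the readings rather than a real problem.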
Important Note: One of the requirements of Neo4j’s Causal Cluster is to safeguard data. The Core Servers do so by replicating all transactions using the Raft protocol. This ensures that the data is safely durable before the transaction commit is confirmed to the end user application. In practice this means once a majority of Core Servers in a cluster (
You can read more about Neo4j’s Causal Clustering here: https://neo4j.com/docs/operations-manual/current/clustering/introduction/