How to diagnose locking issues
Since Neo4j 3.4 it’s possible to better understand locking issues caused by concurrent query. This KB article will not detail the basics of locking in Neo4j. We assume a situation where you concurrently run lots of queries - might be the same parameterized query or different queries. The typical execution times of these are much higher than you expect - you basically think that Neo4j is "slow". In fact the source for this kind of observation might be that your queries try to grab locks on a common node, therefore they need to wait until the lock holder releases the lock. The queries waiting for the common lock are effectively serialized.
In a previous version of neo4j the query logging
feature allowed for the adding of information related to how long that query was in waiting state, whether waiting on locks or
waiting on IO. This can be switched on through neo4j.conf parameter of
dbms.logs.query.time_logging_enabled=true
.
However this gives us only information on the total waiting time and no insight in which contended node or relationship might
have caused it.
Commencing with Neo4j 3.4, and along with new stored procedure dbms.listTransactions()
one can better understand the cause of the delay.
To demonstrate the procedures with 2 cypher-shell connections run
the following cypher
session1: merge (n:Lock {id:1}) set n.age=20 with n call apoc.util.sleep(200000) return n; session2: Match (n:Lock {id:1}) set n.age=85 return n;
The first session will set the age
property to 20 on the :Lock labelled node with id=1 and then sleep for 200 seconds (thus keeping
the lock on said node).
The 2nd session will attempt to update the same node and set its age
property to 85 but will be blocked for these 200 seconds.
During these 200 seconds and via the Neo4j Browser, running
call dbms.listTransactions() yield transactionId, startTime, currentQueryId, currentQuery, status
will return output similar to
To which from this output we can see that transaction-1381/query-1378 reports its status of Blocked by: [transaction-1380] and to which transaction-1380 is also listed in the output and both transactions describe the currentQuery.
Also, running call dbms.listActiveLocks('<currentQueryId>');
for example call dbms.listActiveLocks('query-1377');
will return output similar to
From the output above we see that query-1377 has a EXCLUSIVE NODE lock on a node with id(n)=8432.
With these 2 procedures you have powerful tooling at hand to understand what nodes/relationships might be a source for lock contention.
Is this page helpful?