Monitor cluster endpoints for status information

A cluster exposes HTTP endpoints that can be used to monitor its health. This section describes these endpoints and explains their semantics.

Adjusting security settings for clustering endpoints

If authentication and authorization are enabled in Neo4j, the clustering status endpoints also require authentication credentials. The setting dbms.security.auth_enabled controls whether the native auth provider is enabled. Some load balancers and proxy servers cannot provide authentication credentials with a request. In those situations, consider disabling authentication for the clustering status endpoints by setting dbms.security.cluster_status_auth_enabled=false in neo4j.conf.
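When authentication stays enabled, a monitoring client has to send credentials with each request. The following is a minimal Python sketch of building such a request. It assumes the clustering endpoints accept HTTP Basic authentication in the same way as the rest of the Neo4j HTTP interface. The function name build_status_request and the credentials are hypothetical.

```python
import base64
import urllib.request


def build_status_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request that carries HTTP Basic credentials.

    Assumption: the clustering status endpoints accept HTTP Basic auth.
    """
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Example usage (hypothetical credentials, requires a running server):
#   req = build_status_request(
#       "http://localhost:7474/db/neo4j/cluster/available", "neo4j", "secret")
#   with urllib.request.urlopen(req, timeout=5) as response:
#       print(response.status, response.read().decode("utf-8"))
```

If the load balancer cannot attach such a header, disabling authentication for these endpoints, as described above, is the alternative.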

Unified endpoints

A unified set of endpoints exists on both primary and secondary servers, with the following behavior:

  • /db/<databasename>/cluster/writable — Used to direct write traffic to specific instances.

  • /db/<databasename>/cluster/read-only — Used to direct read traffic to specific instances.

  • /db/<databasename>/cluster/available — Available for the general case of directing arbitrary request types to instances that are available for processing read transactions.

  • /db/<databasename>/cluster/status — Gives a detailed description of this instance’s view of its status within the cluster, for the given database.

  • /dbms/cluster/status — Gives a detailed description of this instance’s view of its status within the cluster, for all databases. Useful for monitoring and coordinating rolling upgrades. See Status endpoints for further details.

Every /db/<databasename>/* endpoint targets a specific database. The <databasename> path parameter is the name of the database. By default, a fresh Neo4j installation with two databases, system and neo4j, has the following cluster endpoints:

http://localhost:7474/dbms/cluster/status

http://localhost:7474/db/system/cluster/writable
http://localhost:7474/db/system/cluster/read-only
http://localhost:7474/db/system/cluster/available
http://localhost:7474/db/system/cluster/status

http://localhost:7474/db/neo4j/cluster/writable
http://localhost:7474/db/neo4j/cluster/read-only
http://localhost:7474/db/neo4j/cluster/available
http://localhost:7474/db/neo4j/cluster/status

Attempting to access endpoints for a database not hosted on a server produces a 404 response.
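The URL pattern above can be generated programmatically, for example when configuring probes for several databases. This Python sketch is illustrative only. The helper name cluster_endpoint_urls is hypothetical, and the base URL is the default address used in the listing above.

```python
from typing import List

# Per-database endpoint names taken from the list of unified endpoints.
DATABASE_ENDPOINTS = ("writable", "read-only", "available", "status")


def cluster_endpoint_urls(base_url: str, databases: List[str]) -> List[str]:
    """Return the DBMS-wide status URL followed by all per-database URLs."""
    base = base_url.rstrip("/")
    urls = [f"{base}/dbms/cluster/status"]
    for database in databases:
        for endpoint in DATABASE_ENDPOINTS:
            urls.append(f"{base}/db/{database}/cluster/{endpoint}")
    return urls


# Example usage:
#   for url in cluster_endpoint_urls("http://localhost:7474", ["system", "neo4j"]):
#       print(url)
```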

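Table 1. Unified HTTP endpoint responses

| Endpoint | Instance state | Returned code | Body text |
|---|---|---|---|
| /db/<databasename>/cluster/writable | Leader | 200 OK | true |
| /db/<databasename>/cluster/writable | Follower | 404 Not Found | false |
| /db/<databasename>/cluster/writable | Secondary | 404 Not Found | false |
| /db/<databasename>/cluster/read-only | Leader | 404 Not Found | false |
| /db/<databasename>/cluster/read-only | Follower | 200 OK | true |
| /db/<databasename>/cluster/read-only | Secondary | 200 OK | true |
| /db/<databasename>/cluster/available | Leader | 200 OK | true |
| /db/<databasename>/cluster/available | Follower | 200 OK | true |
| /db/<databasename>/cluster/available | Secondary | 200 OK | true |
| /db/<databasename>/cluster/status | Leader | 200 OK | JSON. See Status endpoints for details. |
| /db/<databasename>/cluster/status | Follower | 200 OK | JSON. See Status endpoints for details. |
| /db/<databasename>/cluster/status | Secondary | 200 OK | JSON. See Status endpoints for details. |
| /dbms/cluster/status | Leader | 200 OK | JSON. See Status endpoints for details. |
| /dbms/cluster/status | Follower | 200 OK | JSON. See Status endpoints for details. |
| /dbms/cluster/status | Secondary | 200 OK | JSON. See Status endpoints for details. |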
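The response codes in Table 1 can be combined to infer an instance's state for a database. The Python sketch below is one possible reading of the table, not an official algorithm. The function name and the returned labels are hypothetical. Followers and secondaries return the same codes on these two endpoints, so they cannot be told apart this way.

```python
def classify_instance(writable_code: int, read_only_code: int) -> str:
    """Infer the instance state from the writable and read-only response codes.

    Returns "leader", "follower-or-secondary", or "unknown". Two 404 responses
    are reported as "unknown" because a 404 is also returned when the database
    is not hosted on the server.
    """
    if writable_code == 200 and read_only_code == 404:
        return "leader"
    if writable_code == 404 and read_only_code == 200:
        return "follower-or-secondary"
    return "unknown"
```

To distinguish a follower from a secondary, the core field of the status endpoint can be used instead.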

Example 1. Use a clustering monitoring endpoint

From the command line, a common way to query these endpoints is to use curl. With no arguments, curl does an HTTP GET on the URI provided and outputs the body text, if any. To see the response code as well, add the -v flag for verbose output. For example:

  • Requesting writable endpoint on a primary server that is currently elected leader with verbose output:

#> curl -v localhost:7474/db/neo4j/cluster/writable
* About to connect() to localhost port 7474 (#0)
*   Trying ::1...
* connected
* Connected to localhost (::1) port 7474 (#0)
> GET /db/neo4j/cluster/writable HTTP/1.1
> User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8r zlib/1.2.5
> Host: localhost:7474
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: text/plain
< Access-Control-Allow-Origin: *
< Transfer-Encoding: chunked
< Server: Jetty(9.4.17)
<
* Connection #0 to host localhost left intact
true* Closing connection #0
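The same check can be scripted. The following Python sketch mirrors the curl request above using only the standard library. The names http_get and is_writable are hypothetical, and the get parameter exists only so that the transport can be replaced, for example in tests.

```python
import urllib.error
import urllib.request
from typing import Callable, Tuple


def http_get(url: str, timeout: float = 5.0) -> Tuple[int, str]:
    """Perform a GET and return (status code, body text), including for 4xx/5xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # urllib raises on 404, which is an expected answer for these endpoints.
        return error.code, error.read().decode("utf-8")


def is_writable(
    base_url: str,
    database: str,
    get: Callable[[str], Tuple[int, str]] = http_get,
) -> bool:
    """Return True if the instance reports the database as writable."""
    status, body = get(f"{base_url}/db/{database}/cluster/writable")
    return status == 200 and body.strip() == "true"


# Example usage (requires a running server):
#   print(is_writable("http://localhost:7474", "neo4j"))
```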

Status endpoints

The status endpoint, available at /db/<databasename>/cluster/status, is intended to assist with rolling upgrades. For more information, see Upgrade and Migration Guide → Clusters.

Typically, you want some guarantee that a primary is safe to shut down for each database before removing it from a cluster. Counter-intuitively, a primary being safe to shut down means that a majority of the other primaries are healthy, caught up, and have recently heard from that database’s leader. The status endpoints provide the following information to help assess this.

Several of the fields in status endpoint responses refer to details of Raft, the algorithm used in Neo4j clusters to provide highly available transactions. In a Neo4j cluster each database has its own independent Raft group. Therefore, details such as leader and raftCommandsPerSecond are database-specific.

Example 2. Example status response
{
  "lastAppliedRaftIndex":0,
  "votingMembers":["30edc1c4-519c-4030-8348-7cb7af44f591","80a7fb7b-c966-4ee7-88a9-35db8b4d68fe","f9301218-1fd4-4938-b9bb-a03453e1f779"],
  "memberId":"80a7fb7b-c966-4ee7-88a9-35db8b4d68fe",
  "leader":"30edc1c4-519c-4030-8348-7cb7af44f591",
  "millisSinceLastLeaderMessage":84545,
  "participatingInRaftGroup":true,
  "core":true,
  "isHealthy":true,
  "raftCommandsPerSecond":124
}
Table 2. Status endpoint descriptions

| Field | Type | Optional | Example | Description |
|---|---|---|---|---|
| core | boolean | no | true | Distinguishes whether the server is hosting the database in primary (core) or secondary mode. |
| lastAppliedRaftIndex | number | no | 4321 | Every transaction in a cluster is associated with a Raft index. This field indicates the latest applied Raft log index. |
| participatingInRaftGroup | boolean | no | false | A participating member is able to vote. A primary is considered participating when it is part of the voter membership and has kept track of the leader. |
| votingMembers | string[] | no | [] | List of the memberId values that this primary considers part of the voting set. A member is considered a voting member when the leader has been receiving communication from it. |
| isHealthy | boolean | no | true | Reflects that the local database of this member has not encountered a critical error preventing it from writing locally. |
| memberId | string | no | 30edc1c4-519c-4030-8348-7cb7af44f591 | Every member in a cluster has its own unique member ID to identify it. Use memberId to distinguish between primary and secondary servers. |
| leader | string | yes | 80a7fb7b-c966-4ee7-88a9-35db8b4d68fe | Follows the same format as memberId. If it is null or missing, the leader is unknown. |
| millisSinceLastLeaderMessage | number | yes | 1234 | The number of milliseconds since the last heartbeat-like leader message. Not relevant to secondaries, so it is omitted from their responses. |
| raftCommandsPerSecond (deprecated) | number | yes | 124 | An estimate of the average Raft state machine throughput over a sampling window configurable via the clustering.status_throughput_window setting. raftCommandsPerSecond is not an effective way to monitor whether servers are falling behind in updates, so it is deprecated and will be removed in the next major release of Neo4j. Instead, use the metric <prefix>.clustering.core.commit_index on each server and look for divergence. |
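A client that consumes the status response needs to treat the optional fields as possibly absent. The Python sketch below parses a single-database response into a small record, following the field names and optionality in Table 2. The DatabaseStatus class and parse_status function are hypothetical names, and the deprecated raftCommandsPerSecond field is deliberately left out.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatabaseStatus:
    core: bool
    last_applied_raft_index: int
    participating_in_raft_group: bool
    voting_members: List[str]
    is_healthy: bool
    member_id: str
    # Optional fields: None when null or missing in the response.
    leader: Optional[str] = None
    millis_since_last_leader_message: Optional[int] = None


def parse_status(text: str) -> DatabaseStatus:
    """Parse the JSON body of /db/<databasename>/cluster/status."""
    data = json.loads(text)
    return DatabaseStatus(
        core=data["core"],
        last_applied_raft_index=data["lastAppliedRaftIndex"],
        participating_in_raft_group=data["participatingInRaftGroup"],
        voting_members=list(data["votingMembers"]),
        is_healthy=data["isHealthy"],
        member_id=data["memberId"],
        leader=data.get("leader"),
        millis_since_last_leader_message=data.get("millisSinceLastLeaderMessage"),
    )
```

Mapping a missing leader to None keeps the "leader is unknown" case explicit for later checks.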

After an instance has been switched on, the status endpoint can be accessed to make sure that all the guarantees listed in the table below are met.

To get the most accurate view of a cluster, it is strongly recommended to access the status endpoint on all primary members and compare the results. The following table explains how the results can be compared.

Table 3. Measured values, accessed via the status endpoint

| Name of check | Method of calculation | Description |
|---|---|---|
| allServersAreHealthy | Every primary’s status endpoint indicates isHealthy == true. | Ensures that the data across the entire cluster is healthy. If any primary reports false, that indicates a larger problem. |
| allVotingSetsAreEqual | For any 2 primaries (A and B), status endpoint A’s votingMembers == status endpoint B’s votingMembers. | When the voting sets of all primaries are equal, all members agree on membership. |
| allVotingSetsContainAtLeastTargetCluster | Let S be the set of all primaries excluding primary Z (the one to be switched off). The voting set of every member in S contains all of S. Membership is determined by using the memberId and votingMembers from the status endpoint. | Sometimes network conditions are not perfect, and it may make sense to switch off a different primary than the one originally intended. If this check is run for all primaries, the ones that match this condition can be switched off (provided the other conditions are also met). |
| hasOneLeader | For any 2 primaries (A and B), A.leader == B.leader && leader != null. | If the leader is different, there may be a partition (alternatively, this could also occur due to bad timing). If the leader is unknown, the leader messages have timed out. |
| noMembersLagging | For primary A with lastAppliedRaftIndex = min, and primary B with lastAppliedRaftIndex = max, B.lastAppliedRaftIndex - A.lastAppliedRaftIndex < raftIndexLagThreshold. | If there is a large difference in the applied indexes between primaries, it could be dangerous to switch off a primary. |

raftIndexLagThreshold helps you monitor the lag in applying Raft log entries across a cluster and set appropriate thresholds. Pick a raftIndexLagThreshold appropriate to your particular cluster and workload. A good approach is to measure the reported lag under normal circumstances and select a threshold slightly above it. For example, suppose you observe the metric (the difference between the maximum and minimum lastAppliedRaftIndex) during all phases of a specific workload and see that it stays around 100 or less during working hours, but spikes to 5,000 for a few hours on Saturdays. Depending on your monitoring needs or capabilities, you can then set either a weekday threshold of 120 and a weekend threshold of 6,000, or a single overall threshold of 6,000. These thresholds can help in identifying performance issues.
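The checks in Table 3 can be expressed as small functions over the status responses collected from every primary. The following Python sketch is one possible interpretation, not a reference implementation. It assumes each status is the parsed JSON of one primary's /db/<databasename>/cluster/status response for the same database, and that the list is non-empty. All function and parameter names are hypothetical.

```python
from typing import Any, Dict, List

Status = Dict[str, Any]  # parsed JSON from one primary's status endpoint


def all_servers_are_healthy(statuses: List[Status]) -> bool:
    return all(s["isHealthy"] is True for s in statuses)


def all_voting_sets_are_equal(statuses: List[Status]) -> bool:
    # Order of members is ignored; only the set contents are compared.
    return len({frozenset(s["votingMembers"]) for s in statuses}) <= 1


def all_voting_sets_contain_at_least_target_cluster(
    statuses: List[Status], member_to_switch_off: str
) -> bool:
    # S = all primaries except the one to be switched off.
    remaining = [s for s in statuses if s["memberId"] != member_to_switch_off]
    target = {s["memberId"] for s in remaining}
    return all(target <= set(s["votingMembers"]) for s in remaining)


def has_one_leader(statuses: List[Status]) -> bool:
    # A null or missing leader means the leader is unknown.
    leaders = {s.get("leader") for s in statuses}
    return len(leaders) == 1 and None not in leaders


def no_members_lagging(statuses: List[Status], raft_index_lag_threshold: int) -> bool:
    indexes = [s["lastAppliedRaftIndex"] for s in statuses]
    return max(indexes) - min(indexes) < raft_index_lag_threshold
```

Here raft_index_lag_threshold is the value you choose for your own cluster, as described above. It is not a Neo4j configuration setting.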

Combined status endpoints

When using the status endpoints to support a rolling upgrade, you need to assess whether a primary is safe to shut down for all databases. To avoid issuing a separate request to each /db/<databasename>/cluster/status endpoint, use the /dbms/cluster/status endpoint instead.

This endpoint returns a JSON array, the elements of which contain the same fields as the single-database version, along with fields for databaseName and databaseUuid.

Example 3. Example combined status response
[
  {
    "databaseName": "neo4j",
    "databaseUuid": "f4dacc01-f88a-4512-b3bf-68f7539c941e",
    "databaseStatus": {
      "lastAppliedRaftIndex": -1,
      "votingMembers": [
        "0cff51ad-7cee-44cc-9102-538fc4544b95",
        "90ff5df1-f5f8-4b4c-8289-a0e3deb2235c",
        "99ca7cd0-6072-4387-bd41-7566a98c6afc"
      ],
      "memberId": "90ff5df1-f5f8-4b4c-8289-a0e3deb2235c",
      "leader": "90ff5df1-f5f8-4b4c-8289-a0e3deb2235c",
      "millisSinceLastLeaderMessage": 0,
      "raftCommandsPerSecond": 0.0,
      "core": true,
      "participatingInRaftGroup": true,
      "healthy": true
    }
  },
  {
    "databaseName": "system",
    "databaseUuid": "00000000-0000-0000-0000-000000000001",
    "databaseStatus": {
      "lastAppliedRaftIndex": 7,
      "votingMembers": [
        "0cff51ad-7cee-44cc-9102-538fc4544b95",
        "90ff5df1-f5f8-4b4c-8289-a0e3deb2235c",
        "99ca7cd0-6072-4387-bd41-7566a98c6afc"
      ],
      "memberId": "90ff5df1-f5f8-4b4c-8289-a0e3deb2235c",
      "leader": "90ff5df1-f5f8-4b4c-8289-a0e3deb2235c",
      "millisSinceLastLeaderMessage": 0,
      "raftCommandsPerSecond": 0.0,
      "core": true,
      "participatingInRaftGroup": true,
      "healthy": true
    }
  }
]
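As an illustration of consuming the combined response, the Python sketch below lists the databases for which the responding member is currently the leader, which may be useful to know before shutting that member down. It assumes the structure shown in Example 3, where the per-database fields are nested under databaseStatus. The function name is hypothetical.

```python
import json
from typing import List


def databases_led_by_this_member(combined_text: str) -> List[str]:
    """Return the names of databases whose leader is the responding member."""
    led = []
    for entry in json.loads(combined_text):
        status = entry["databaseStatus"]
        leader = status.get("leader")  # null or missing means unknown
        if leader is not None and leader == status["memberId"]:
            led.append(entry["databaseName"])
    return led
```

For the response in Example 3, the member is the leader for both listed databases, so both names are returned.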