User-defined procedures

A user-defined procedure is a mechanism that enables you to extend Neo4j by writing customized code, which can be invoked directly from Cypher. Procedures can take arguments, perform operations on the database, and return results. For a comparison between user-defined procedures, functions, and aggregation functions see Neo4j customized code.

User-defined procedures requiring execution on the system database need to include the annotation @SystemProcedure or they will be classed as a user database procedure.

Call a procedure

To call a user-defined procedure, use a Cypher CALL clause. The procedure name must be fully qualified, so a procedure named findDenseNodes defined in the package org.neo4j.examples could be called using:

CALL org.neo4j.examples.findDenseNodes(1000)

CALL may be the only clause within a Cypher statement or may be combined with other clauses. Arguments can be supplied directly within the query or taken from the associated parameter set. For full details, see the documentation in Cypher Manual → CALL procedure.

Create a procedure

Make sure you have read and followed the preparatory setup instructions in Setting up a plugin project.

The example discussed below is available as a repository on GitHub. To get started quickly you can fork the repository and work with the code as you follow along in the guide below.

First, decide what the procedure should do, then write a test that proves that it does it right. Finally, write a procedure that passes the test.

Integration tests

The test dependencies include Neo4j Harness and JUnit. These can be used to write integration tests for procedures. The tests should start a Neo4j instance, load the procedure, and execute queries against it.

An example using JUnit 5 for testing a procedure that returns relationship types found in the graph.

package example;

import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.TestInstance;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;
import org.neo4j.driver.Value;
import org.neo4j.harness.Neo4j;
import org.neo4j.harness.Neo4jBuilders;

import static org.assertj.core.api.Assertions.assertThat;

@TestInstance(TestInstance.Lifecycle.PER_CLASS)
public class GetRelationshipTypesTests {

    private Driver driver;
    private Neo4j embeddedDatabaseServer;

    @BeforeAll
    void initializeNeo4j() {
        this.embeddedDatabaseServer = Neo4jBuilders.newInProcessBuilder()
                .withDisabledServer()
                .withProcedure(GetRelationshipTypes.class)
                .build();

        this.driver = GraphDatabase.driver(embeddedDatabaseServer.boltURI());
    }

    @AfterAll
    void closeDriver(){
        this.driver.close();
        this.embeddedDatabaseServer.close();
    }

    @AfterEach
    void cleanDb(){
        try(Session session = driver.session()) {
            session.run("MATCH (n) DETACH DELETE n");
        }
    }

    /**
     * We should be getting the correct values when there is only one type in each direction
     */
    @Test
    public void shouldReturnTheTypesWhenThereIsOneEachWay() {
        final String expectedIncoming = "INCOMING";
        final String expectedOutgoing = "OUTGOING";

        // In a try-block, to make sure we close the session after the test
        try(Session session = driver.session()) {

            //Create our data in the database.
            session.run(String.format("CREATE (:Person)-[:%s]->(:Movie {id:1})-[:%s]->(:Person)", expectedIncoming, expectedOutgoing));

            //Execute our procedure against it.
            Record record = session.run("MATCH (u:Movie {id:1}) CALL example.getRelationshipTypes(u) YIELD outgoing, incoming RETURN outgoing, incoming").single();

            //Get the incoming / outgoing relationships from the result
            assertThat(record.get("incoming").asList(Value::asString)).containsOnly(expectedIncoming);
            assertThat(record.get("outgoing").asList(Value::asString)).containsOnly(expectedOutgoing);
        }
    }
}

The previous example uses JUnit 5, which requires the use of org.neo4j.harness.junit.extension.Neo4jExtension. If you want to use JUnit 4 with Neo4j 4.x or 5, use org.neo4j.harness.junit.rule.Neo4jRule instead.

Define a procedure

With the test in place, write a procedure that fulfills the expectations of the test. The full example is available in the Neo4j Procedure Template repository.

Particular things to note:

All procedures are annotated @Procedure.
The procedure annotation can take three optional arguments: name, mode, and eager.
- name is used to specify a different name for the procedure than the default generated, which is class.path.nameOfMethod. If mode is specified, name must be specified as well.
- mode is used to declare the types of interactions that the procedure performs. A procedure fails if it attempts to execute database operations that violate its mode. The default mode is READ. The following modes are available:
  - READ — This procedure only performs read operations against the graph.
  - WRITE — This procedure performs read and write operations against the graph.
  - SCHEMA — This procedure performs operations against the schema, i.e. create and drop indexes and constraints. A procedure with this mode can read graph data, but not write.
  - DBMS — This procedure performs system operations such as user management and query management. A procedure with this mode is not able to read or write graph data.
- eager is a boolean setting defaulting to false. If it is set to true, the Cypher planner plans an extra eager operation before and after calling the procedure. This is useful in cases where the procedure makes changes to the database in a way that could interact with the operations preceding or following the procedure. For example:
  MATCH (n) WHERE n.key = 'value' WITH n CALL deleteNeighbours(n, 'FOLLOWS')
  This query can delete some of the nodes that are matched by the Cypher query, and the n.key lookup will fail. Marking this procedure as eager prevents this from causing an error in Cypher code. However, it is still possible for the procedure to interfere with itself by trying to read entities it has previously deleted. It is the responsibility of the procedure author to handle that case.
The context of the procedure, which is the same as each resource that the procedure wants to use, is annotated @Context.

The correct way to signal an error from within a procedure is to throw RuntimeException.

Injectable resources

When writing procedures, some resources can be injected into the procedure from the database. To inject these, use the @Context annotation. The classes that can be injected are:

Log
TerminationGuard
GraphDatabaseService
Transaction

All of the above classes are considered safe and future-proof and do not compromise the security of the database. Several unsupported (restricted) classes can also be injected and can be changed with little or no notice. Procedures written to use these restricted APIs are not loaded by default, and you need to use the dbms.security.procedures.unrestricted to load unsafe procedures. Read more about this config setting in Operations Manual → Securing extensions.

Memory Resource Tracking

The memory resource tracking API for the procedure framework is available for preview. Future versions of Neo4j might contain breaking changes to this API.

If your procedure or function allocates significant amounts of heap memory, you can register allocations to count towards the configured transaction limits, see Operations Manual → Limit transaction memory usage for more information. This allows you to avoid OutOfMemory errors that cause database restarts. Memory allocations also show up in query profiles.

To do this you need to inject org.neo4j.procedure.memory.ProcedureMemory as a field in your procedure/function class. ProcedureMemory has various methods to allow you to register allocations. For example (see javadocs for a full reference):

ProcedureMemoryTracker newTracker() creates a new memory resource tracker that is bound to the current transaction.
HeapEstimator heapEstimator() estimates the heap size of classes and instances.
HeapTrackingCollectionFactory collections() lets you create collections that have built-in memory tracking of their internal structure.

It’s usually difficult and time-consuming to implement memory resource tracking. These are a few considerations and caveats that are worth keeping in mind:

Limit the scope of the memory management. Focus only on parts that can grow significantly in memory and ignore minor underestimation.
Beware of overestimation by registering allocations of the same instance multiple times. You can add reference counting or other mechanisms to avoid overestimation if that is a concern.
It’s common not to know the size of an instance before it has been allocated, which may lead you to register allocations after they have already been made. The memory tracker implementation tries to prevent this by always pre-registering a certain amount of memory in the internal memory pools.
It’s cumbersome in Java to know when an instance has been garbage-collected. Typically, you register the release of memory at the point when it’s possible for that memory to be garbage-collected. To account for this, memory trackers may internally choose not to register the release of memory instantaneously.
Testing memory resource tracking can be difficult. One approach is to use a third-party library, like JAMM (Java Agent for Memory Measurements), and assert that the estimates are close enough for some given input.

A basic example of memory resource tracking in user defined procedures.

package org.example;

import org.neo4j.procedure.Context;
import org.neo4j.procedure.Name;
import org.neo4j.procedure.Procedure;
import org.neo4j.procedure.memory.ProcedureMemory;

import java.util.Arrays;
import java.util.stream.Stream;

public class MyProcedures {

    @Context
    public ProcedureMemory memory;

    record Output(Long value) {}

    @Procedure("org.example.memoryHungryRange")
    public Stream<Output> memoryHungryRange(@Name("size") int size) {
        final var tracker = memory.newTracker();

        // Register the allocation of the long array below
        tracker.allocateHeap(memory.heapEstimator().sizeOfLongArray(size));
        // The actual allocation
        final var result = new long[size];

        for (int i = 0; i < size; i++) result[i] = i;

        return Arrays.stream(result)
                .mapToObj(Output::new)
                // Release all registered allocations when the stream is closed
                .onClose(tracker::close);
    }
}