Streams Transformations

The Kafka Connect Neo4j Connector is the recommended method to integrate Kafka with Neo4j, as Neo4j Streams is no longer under active development and will not be supported after version 4.4 of Neo4j.

The most recent version of the Kafka Connect Neo4j Connector can be found here.

Transforming a few streams with large, complex payloads into a larger number of simpler streams is a core Kafka technique for managing the complexity of what comes across the wire. There are two main ways to do this; in Neo4j terms, the choice is somewhat akin to using Cypher versus using the traversal API.

KSQL

KSQL lets you write SQL-like queries that transform streams on the fly. A KSQL push query does not return a finite result set the way a query against a traditional database does. Instead, it runs continuously, reformatting a stream (a potentially infinite sequence of messages) into a new stream.

KSQL is a strong choice for transforming streams from one format to another. Consider it whenever you have complicated topic payloads that would otherwise force you to write complicated Cypher.

The downside to KSQL is that it may not be available everywhere. Because it is a Confluent product rather than part of Apache Kafka, you won't find it in Amazon MSK or in other plain open source Kafka installations.

KStreams

KStreams is a Java API that allows for rich transformation of streams in any way you can design. It is more akin to the Neo4j Traversal API: you can do whatever you can imagine, but you have to write custom code to do it. KStreams programs are typically small applications that read from one topic, transform the messages, and write to another topic. For our purposes with graphs, KStreams serves the same architectural purpose as KSQL; it is more powerful, but it requires custom code.

In contrast to KSQL, which comes from Confluent, the KStreams API is part of Apache Kafka and should be available with any open source Kafka installation.
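
As a rough illustration of the "read, transform, write" step described above, the following plain-Java sketch splits one complex message into several simpler ones. It deliberately uses no Kafka classes so that it runs on its own; `Order`, `LineItem`, and `flatten` are hypothetical names invented for this example, and the payload shape is an assumption, not something the connector prescribes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of the "one complex message in, many simple messages out" step.
// Order, LineItem, and flatten are hypothetical names, not part of any Kafka or Neo4j API.
final class OrderFlattener {

    /** Hypothetical complex payload: one order carrying several SKUs. */
    static final class Order {
        final String orderId;
        final List<String> skus;

        Order(String orderId, List<String> skus) {
            this.orderId = orderId;
            this.skus = skus;
        }
    }

    /** Hypothetical simple payload: a single (order, SKU) pair. */
    static final class LineItem {
        final String orderId;
        final String sku;

        LineItem(String orderId, String sku) {
            this.orderId = orderId;
            this.sku = sku;
        }
    }

    /** Turns one order into one LineItem per SKU, preserving the SKU order. */
    static List<LineItem> flatten(Order order) {
        List<LineItem> items = new ArrayList<>();
        for (String sku : order.skus) {
            items.add(new LineItem(order.orderId, sku));
        }
        return items;
    }

    public static void main(String[] args) {
        Order order = new Order("o-1", Arrays.asList("sku-a", "sku-b"));
        for (LineItem item : flatten(order)) {
            System.out.println(item.orderId + " -> " + item.sku);
        }
    }
}
```

In a real KStreams application, a function like `flatten` would typically be passed to `KStream#flatMapValues`, which emits one output record for each element of the returned collection. Each resulting message is then simple enough to map to a single node or relationship with a short Cypher statement.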