Installation
Choose the correct version
Make sure to match both the Spark version and the Scala version of your setup. Here is a compatibility table to help you choose the correct version of the connector.
Spark Version | Artifact (Scala 2.12) | Artifact (Scala 2.13) |
---|---|---|
3.4+ | | |
3.3 | | |
3.2 | | |
3.0 and 3.1 | | |
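Artifact names follow a fixed pattern: the Scala binary version (major.minor) is appended to the artifact ID, and the connector version comes last. The sketch below assembles a coordinate from those two inputs. The function name `connector_coordinate` is hypothetical and only serves as an illustration; the versions passed in are the ones used elsewhere on this page. Running `$SPARK_HOME/bin/spark-submit --version` should report both the Spark version and the Scala version of your installation.

```shell
# Sketch: assemble the connector's Maven coordinate from a Scala version and a
# connector version. The function name is hypothetical, not part of any tool.
connector_coordinate() {
  scala_version="$1"      # e.g. 2.12.18 or 2.12
  connector_version="$2"  # e.g. 5.3.2_for_spark_3
  # Keep only major.minor, because artifact names use the Scala binary version.
  scala_binary="$(printf '%s' "$scala_version" | cut -d. -f1,2)"
  printf 'org.neo4j:neo4j-connector-apache-spark_%s:%s\n' "$scala_binary" "$connector_version"
}

# Prints org.neo4j:neo4j-connector-apache-spark_2.12:5.3.2_for_spark_3
connector_coordinate 2.12.18 5.3.2_for_spark_3
```

The resulting string is the value expected by the `--packages` option.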
Usage with the Spark shell
The connector is available via Spark Packages:
$SPARK_HOME/bin/spark-shell --packages org.neo4j:neo4j-connector-apache-spark_2.12:5.3.2_for_spark_3
$SPARK_HOME/bin/pyspark --packages org.neo4j:neo4j-connector-apache-spark_2.12:5.3.2_for_spark_3
Alternatively, you can download the connector JAR file from the Neo4j Connector Page or from the GitHub releases page and run the following command to launch a Spark interactive shell with the connector included:
$SPARK_HOME/bin/spark-shell --jars neo4j-connector-apache-spark_2.12-5.3.2_for_spark_3.jar
$SPARK_HOME/bin/pyspark --jars neo4j-connector-apache-spark_2.12-5.3.2_for_spark_3.jar
Self-contained applications
For non-Python applications:
- Include the connector in your application using the application’s build tool.
- Package the application.
- Use spark-submit to run the application.
For Python applications, run spark-submit directly.
As with spark-shell, you can run spark-submit via Spark Packages or with a local JAR file.
See the Quickstart for code examples.
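As a rough illustration of the two options, the following sketch prints the spark-submit command for each case instead of running it, so the command can be reviewed first. The helper `submit_command`, the class name `com.example.SparkApp`, the application JAR path, the file name `app.py`, and the `/opt/spark` fallback are all hypothetical placeholders; only the `--packages`, `--jars`, and `--class` options and the connector names come from Spark and from this page.

```shell
# Sketch only: class name, application JAR path, and Python file are hypothetical.
CONNECTOR="org.neo4j:neo4j-connector-apache-spark_2.12:5.3.2_for_spark_3"
CONNECTOR_JAR="neo4j-connector-apache-spark_2.12-5.3.2_for_spark_3.jar"
# /opt/spark is an assumed fallback used when SPARK_HOME is not set.
SPARK_SUBMIT="${SPARK_HOME:-/opt/spark}/bin/spark-submit"

# Print (rather than run) the spark-submit command.
# Usage: submit_command packages|jars <application arguments...>
submit_command() {
  mode="$1"; shift
  case "$mode" in
    packages) printf '%s --packages %s %s\n' "$SPARK_SUBMIT" "$CONNECTOR" "$*" ;;
    jars)     printf '%s --jars %s %s\n' "$SPARK_SUBMIT" "$CONNECTOR_JAR" "$*" ;;
    *)        echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

# Scala or Java application packaged as a JAR, connector resolved via Spark Packages
submit_command packages --class com.example.SparkApp target/scala-2.12/spark-app_2.12-1.0.jar

# Python application, connector supplied as a local JAR file
submit_command jars app.py
```

Copy the printed line into a terminal to launch the application.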
build.sbt
name := "Spark App"
version := "1.0"
scalaVersion := "2.12.18"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.5.1"
libraryDependencies += "org.neo4j" %% "neo4j-connector-apache-spark" % "5.3.2_for_spark_3"
If you use the sbt-spark-package plugin, add the following to your build.sbt instead:
spDependencies += "org.neo4j/neo4j-connector-apache-spark_2.12:5.3.2_for_spark_3"
pom.xml
<project>
  <groupId>org.neo4j</groupId>
  <artifactId>spark-app</artifactId>
  <modelVersion>4.0.0</modelVersion>
  <name>Spark App</name>
  <packaging>jar</packaging>
  <version>1.0</version>
  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.12</artifactId>
      <version>3.5.1</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.neo4j</groupId>
      <artifactId>neo4j-connector-apache-spark_2.12</artifactId>
      <version>5.3.2_for_spark_3</version>
    </dependency>
  </dependencies>
</project>