Google BigQuery
BigQuery is a fully-managed, serverless data warehouse that enables scalable analysis over petabytes of data.
Prerequisites
You need a running Google BigQuery instance. If you don’t have one, create it in Google Cloud before continuing.
From BigQuery to Neo4j
import org.apache.spark.sql.{DataFrame, SaveMode}

// Step (1)
// Load a BigQuery table into a Spark DataFrame
val bigqueryDF: DataFrame = spark.read
  .format("bigquery")
  .option("table", "google.com:bigquery-public-data.stackoverflow.post_answers")
  .load()

// Step (2)
// Save `bigqueryDF` as nodes with the label `Answer` into Neo4j
bigqueryDF.write
  .format("org.neo4j.spark.DataSource")
  .mode(SaveMode.ErrorIfExists)
  .option("url", "neo4j://<host>:<port>")
  .option("labels", ":Answer")
  .save()
# Step (1)
# Load a BigQuery table into a Spark DataFrame
bigqueryDF = (spark.read
    .format("bigquery")
    .option("table", "google.com:bigquery-public-data.stackoverflow.post_answers")
    .load())

# Step (2)
# Save `bigqueryDF` as nodes with the label `Answer` into Neo4j
(bigqueryDF.write
    .format("org.neo4j.spark.DataSource")
    .mode("ErrorIfExists")
    .option("url", "neo4j://<host>:<port>")
    .option("labels", ":Answer")
    .save())
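The two steps above can be wrapped in a small reusable function. The following is a minimal sketch, not part of either connector: `neo4j_options` and `copy_table_to_neo4j` are hypothetical helper names, and it uses only the `url` and `labels` options shown above. It assumes an existing `spark` session with both the BigQuery and Neo4j connectors on the classpath.

```python
NEO4J_FORMAT = "org.neo4j.spark.DataSource"


def neo4j_options(url, labels):
    """Build the option map for the Neo4j connector (hypothetical helper).

    The label is normalized to the colon-prefixed form used in the
    examples above, e.g. "Answer" becomes ":Answer".
    """
    if not labels.startswith(":"):
        labels = ":" + labels
    return {"url": url, "labels": labels}


def copy_table_to_neo4j(spark, table, url, labels, mode="ErrorIfExists"):
    """Read a BigQuery table and write its rows as Neo4j nodes."""
    # Step (1): load the BigQuery table into a DataFrame
    df = spark.read.format("bigquery").option("table", table).load()

    # Step (2): write the DataFrame as nodes; note `.save()`, not `.load()`
    (df.write
        .format(NEO4J_FORMAT)
        .mode(mode)
        .options(**neo4j_options(url, labels))
        .save())
    return df
```

With such a helper, `copy_table_to_neo4j(spark, "<dataset>.<table>", "neo4j://<host>:<port>", "Answer")` would perform the same load and save as the two steps above.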
From Neo4j to BigQuery
import org.apache.spark.sql.DataFrame

// Step (1)
// Load `:Answer` nodes as a DataFrame
val neo4jDF: DataFrame = spark.read
  .format("org.neo4j.spark.DataSource")
  .option("url", "neo4j://<host>:<port>")
  .option("labels", ":Answer")
  .load()

// Step (2)
// Save `neo4jDF` as the `answers` table in BigQuery
neo4jDF.write
  .format("bigquery")
  .mode("overwrite")
  .option("temporaryGcsBucket", "<my-bigquery-temp>")
  .option("table", "<my-project-id>:<my-private-database>.stackoverflow.answers")
  .save()
# Step (1)
# Load `:Answer` nodes as a DataFrame
neo4jDF = (spark.read
    .format("org.neo4j.spark.DataSource")
    .option("url", "neo4j://<host>:<port>")
    .option("labels", ":Answer")
    .load())

# Step (2)
# Save `neo4jDF` as the `answers` table in BigQuery
(neo4jDF.write
    .format("bigquery")
    .mode("overwrite")
    .option("temporaryGcsBucket", "<my-bigquery-temp>")
    .option("table", "<my-private-database>.stackoverflow.answers")
    .save())
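The write step above depends on two values that are easy to get wrong: the table reference and the temporary GCS bucket. The sketch below shows one way to keep them explicit. `bigquery_table` and `write_to_bigquery` are hypothetical helper names, and the sketch assumes the table reference takes the `[[project:]dataset.]table` form; adjust it if your setup uses a different form.

```python
def bigquery_table(dataset, table, project=None):
    """Build a table reference of the form [project:]dataset.table.

    Hypothetical helper: the project prefix is only added when given.
    """
    ref = f"{dataset}.{table}"
    return f"{project}:{ref}" if project else ref


def write_to_bigquery(df, table_ref, temp_bucket, mode="overwrite"):
    """Write a DataFrame to BigQuery, staging data in `temp_bucket`."""
    if not temp_bucket:
        # The examples above always set a temporary GCS bucket for the write
        raise ValueError("temp_bucket must be set")
    (df.write
        .format("bigquery")
        .mode(mode)
        .option("temporaryGcsBucket", temp_bucket)
        .option("table", table_ref)
        .save())
```

For example, `write_to_bigquery(neo4jDF, bigquery_table("stackoverflow", "answers"), "<my-bigquery-temp>")` would issue the same write as Step (2) above against a `stackoverflow.answers` table.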