Neo4j on AWS
Neo4j can be easily deployed on EC2 instances in Amazon Web Services (AWS) by using the official listings for Neo4j on the AWS Marketplace.
The AWS Marketplace listing uses a CloudFormation template maintained by Neo4j. The template’s code is available on GitHub and can be customized to meet more complex or bespoke use cases.
Neo4j does not provide Amazon Machine Images (AMIs) with a pre-installed version of the product. The Neo4j AWS Marketplace listings (and listings on GitHub) use CloudFormation templates that deploy and configure Neo4j dynamically with a shell script. |
Supported Neo4j versions
The Neo4j AWS marketplace listing can be configured to deploy either Neo4j Enterprise Edition 5 or 4.4. The CloudFormation template always installs the latest available version.
Neo4j CloudFormation template
AWS CloudFormation is a declarative Infrastructure as Code (IaC) language that is based on YAML and instructs AWS to deploy a set of cloud resources.
The Neo4j CloudFormation template repository contains code for Neo4j 5 on the main
branch and code for Neo4j 4.4 on the Neo4j-4.4
branch:
The Neo4j CloudFormation template takes several parameters as inputs, deploys a set of cloud resources, and provides outputs that can be used to connect to a Neo4j DBMS.
Important considerations
-
The deployment of cloud resources will incur costs.
-
Refer to the AWS pricing calculatorfor more information.
-
-
The Neo4j CloudFormation template deploys a new VPC.
-
AWS accounts are limited to an initial quota of 5 VPCs (you can view your current quota by viewing the Limits page of the Amazon EC2 console).
-
Your VPC quota can be increased if needed by contacting AWS support.
-
-
The Neo4j CloudFormation template uses an Auto Scaling group (ASG) to deploy EC2 instances.
-
This means that to stop or terminate EC2 instances, you must first remove them from the ASG, otherwise, the ASG will automatically replace them.
-
-
SSH Keys are not generated as part of the CloudFormation template.
-
Use EC2 Instance Connect (via the EC2 console) to connect to deployed EC2 instances if needed.
-
Input parameters
Parameter Name | Description |
---|---|
Stack Name |
A name for the CloudFormation stack to be deployed, e.g., |
Install Graph Data Science |
An option to install Graph Data Science (GDS). Accepted values are |
Graph Data Science License Key |
A valid GDS license key can be pasted into this field. License keys will be sent to and stored by Neo4j. This information is used only for product activation purposes. |
Install Bloom |
Optionally install Neo4j Bloom. Accepted values are |
Bloom License Key |
A valid Bloom license key can be pasted into this field. License keys will be sent to and stored by Neo4j. This information is used only for product activation purposes. |
Password |
A password for the |
Number of Servers |
Specify the number of desired EC2 instances to be used to form a Neo4j cluster (a minimum of 3 instances is required to form a cluster). |
Instance type |
The class of EC2 Instances to use. |
Disk Size |
Size (in GB) of the EBS volume on each EC2 instance. Larger EBS volumes are typically faster than smaller ones, therefore 100GB is the recommended minimum size. |
SSH CIDR |
Specify an address range from which EC2 instances are accessible on port |
Deployed cloud resources
The environment created by the CloudFormation template consists of the following AWS resources:
-
1 VPC, with a CIDR range (address space) of
10.0.0.0/16
.-
3 Subnets (if a cluster has been selected), distributed evenly across 3 Availability zones, with the following CIDR ranges:
-
10.0.1.0/24
-
10.0.2.0/24
-
10.0.3.0/24
-
-
A single subnet (if a single instance has been selected) with the following CIDR range:
-
10.0.1.0/24
-
-
A security group.
-
An internet gateway.
-
Routing tables (and associations) for all subnets.
-
-
An auto-scaling group and launch configuration, which creates:
-
1, or between 3 and 10 EC2 instances (Depending on whether a single instance or an autonomous cluster is selected).
-
-
1 Network (Layer 4) Load Balancer.
-
A target group for the EC2 instances.
-
Template outputs
After the installation finishes successfully, the CloudFormation template provides the following outputs, which can be found in the Outputs tab of the CloudFormation page on the AWS console.
Output Name | Description |
---|---|
Neo4jBrowserURL |
The http URL of the Neo4j Browser. |
Neo4jURI |
The Bolt URL of the Neo4j Browser. |
Neo4jUsername |
The username |
Neo4j cluster on AWS
Cluster version consistency
When the CloudFormation template creates a new Neo4j cluster, an Auto Scaling group (ASG) is created and tagged with the monthly version of the installed Neo4j database. If you add more EC2 instances to your ASG, they will be installed with the same monthly version, ensuring that all Neo4j cluster servers are installed with the same version, regardless of when the EC2 instances were created.
Remove a server from the Neo4j cluster
Rolling updates on Amazon Machine Images (AMIs) often involve rotating the images. However, simply removing Neo4j servers from the target Network Load Balancer (NLB) one by one does not prevent requests from being routed to them. This occurs because the NLB and Neo4j server-side routing operate independently and do not share awareness of a server availability.
To correctly remove a server from the cluster and reintroduce it after the update, follow the steps outlined below:
-
Remove the server from the AWS NLB. This prevents external clients from sending requests to the server.
-
Since Neo4j’s cluster routing (server-side routing) does not use the NLB, you need to ensure that queries are not routed to the server. To do this, you have to cleanly shut down the server.
-
Run the following query to check servers are hosting all their assigned databases. The query should return no results:
SHOW SERVERS YIELD name, hosting, requestedHosting, serverId WHERE requestedHosting <> hosting
-
Use the following query to check all databases are in their expected state. The query should return no results:
SHOW DATABASES YIELD name, address, currentStatus, requestedStatus, statusMessage WHERE currentStatus <> requestedStatus RETURN name, address, currentStatus, requestedStatus, statusMessage
-
To stop the Neo4j service, run the following command:
sudo systemctl stop neo4j
To configure the timeout period for waiting on active transactions to either complete or be terminated before the shutdown, modify the setting
db.shutdown_transaction_end_timeout
in the neo4j.conf file.db.shutdown_transaction_end_timeout
defaults to 10 seconds.The environment variable
NEO4J_SHUTDOWN_TIMEOUT
determines how long the system will wait for Neo4j to stop before forcefully terminating the process. You can change this usingsystemctl edit neo4j.service
. By default,NEO4J_SHUTDOWN_TIMEOUT
is set to 120 seconds. If the shutdown process exceeds this limit, it is considered failed. You may need to increase the value if the system serves long-running transactions. -
Verify that the shutdown process has finished successfully by checking the neo4j.log for relevant log messages confirming the shutdown.
-
-
When everything is updated or fixed, start the servers one by one again.
-
Run
systemctl start neo4j
. -
Once the server has been restarted, confirm it is running successfully.
Run the following command and check the server has state
Enabled
and healthAvailable
.SHOW SERVERS WHERE name = [server-id];
-
Confirm that the server has started all the databases that it should.
This command shows any databases that are not in their expected state:
SHOW DATABASES YIELD name, address, currentStatus, requestedStatus, serverID WHERE currentStatus <> requestedStatus AND serverID = [server-id] RETURN name, address, currentStatus, requestedStatus
-
-
Reattach the server to the NLB. Once the server is stable and caught up, add it back to the AWS NLB target group.
Licensing
Installing and starting Neo4j from the AWS marketplace constitutes an acceptance of the Neo4j license agreement. When deploying Neo4j, users are required to confirm that they either have an enterprise license or accept the terms of the Neo4j evaluation license.
If you require the Enterprise version of either Graph Data Science or Bloom, you need to provide a key issued by Neo4j as this will be required during the installation.
To obtain a valid license for either Neo4j, Bloom, or GDS, reach out to your Neo4j account representative or get in touch using the contact form.