Random generation
In certain use cases it is useful to generate random graphs, for example, for testing or benchmarking purposes. For that reason, the Neo4j Graph Data Science library comes with a set of built-in graph generators. A generator stores the resulting graph in the graph catalog, where it can be used as input for any algorithm in the library.
This feature is in the beta tier. For more information on feature tiers, see API Tiers.
It is currently not possible to persist these graphs in Neo4j. Running an algorithm in write mode on a generated graph will lead to unexpected results.
The graph generation is parameterized by three dimensions:
- node count - the number of nodes in the generated graph
- average degree - describes the average out-degree of the generated nodes
- relationship distribution function - the probability distribution method used to connect generated nodes
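As a minimal sketch of the workflow described above, the following generates a small graph and then runs an algorithm on it in stream mode. The graph name `demoGraph` and the sizes are illustrative assumptions, and PageRank is used only as an example of "any algorithm in the library". Stream mode is chosen because write mode on generated graphs leads to unexpected results.

```cypher
// Generate a graph with 100 nodes and an average out-degree of 3.
// 'demoGraph' is a hypothetical name chosen for this sketch.
CALL gds.graph.generate('demoGraph', 100, 3, {})
YIELD name, nodes, relationships;

// Use the generated in-memory graph as algorithm input (stream mode only).
CALL gds.pageRank.stream('demoGraph')
YIELD nodeId, score
RETURN nodeId, score
ORDER BY score DESC
LIMIT 5;
```

The two statements are meant to be run one after the other; the node IDs returned refer to the generated in-memory graph, not to nodes stored in the database.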
Syntax
CALL gds.graph.generate(
  graphName: String,
  nodeCount: Integer,
  averageDegree: Integer,
  configuration: Map
)
YIELD name, nodes, relationships, generateMillis, relationshipSeed, averageDegree, relationshipDistribution, relationshipProperty
Name | Type | Default | Optional | Description |
---|---|---|---|---|
graphName | String | | no | The name under which the generated graph is stored. |
nodeCount | Integer | | no | The number of generated nodes. |
averageDegree | Integer | | no | The average out-degree of generated nodes. |
configuration | Map | | yes | Additional configuration, see below. |
Name | Type | Default | Optional | Description |
---|---|---|---|---|
relationshipDistribution | String | UNIFORM | yes | The probability distribution method used to connect generated nodes. For more information see Relationship Distribution. |
relationshipSeed | Integer | | yes | The seed used for generating relationships. |
relationshipProperty | Map | | yes | Describes the method used to generate a relationship property. By default no relationship property is generated. For more information see Relationship Property. |
aggregation | String | | yes | The relationship aggregation method, cf. Relationship Projection. |
orientation | String | | yes | The method of orienting edges. Allowed values are NATURAL, REVERSE and UNDIRECTED. |
allowSelfLoops | Boolean | | yes | Whether to allow relationships with identical source and target node. |
Name | Type | Description |
---|---|---|
name | String | The name under which the generated graph was stored. |
nodes | Integer | The number of nodes in the graph. |
relationships | Integer | The number of relationships in the graph. |
generateMillis | Integer | Milliseconds for generating the graph. |
relationshipSeed | Integer | The seed used for generating relationships. |
averageDegree | Float | The average out-degree of the generated nodes. |
relationshipDistribution | String | The probability distribution method used to connect generated nodes. |
relationshipProperty | String | The configuration of the generated relationship property. |
Relationship Distribution
The relationshipDistribution parameter controls the statistical method used for the generation of new relationships.
Currently there are three supported methods:
- UNIFORM - Distributes the outgoing relationships evenly, i.e., every node has exactly the same out-degree (equal to the average degree). The target nodes are selected randomly.
- RANDOM - Distributes the outgoing relationships using a normal distribution with an average of averageDegree and a standard deviation of 2 * averageDegree. The target nodes are selected randomly.
- POWER_LAW - Distributes the incoming relationships using a power law distribution. The out-degree is based on a normal distribution.
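To see how the distribution method shapes a graph, one option is to generate a graph with POWER_LAW and look at the in-degrees. The sketch below assumes a hypothetical graph name `powerLawGraph` and arbitrary sizes and seed, and it relies on the Degree Centrality procedure `gds.degree.stream` with `orientation: 'REVERSE'` to count incoming relationships.

```cypher
// Generate 1000 nodes with an average out-degree of 5 using POWER_LAW.
// The name, sizes and seed are assumptions made for this sketch.
CALL gds.graph.generate('powerLawGraph', 1000, 5, {
  relationshipDistribution: 'POWER_LAW',
  relationshipSeed: 42
})
YIELD name, nodes, relationships, relationshipDistribution;

// Histogram of in-degrees: REVERSE orientation counts incoming relationships.
CALL gds.degree.stream('powerLawGraph', {orientation: 'REVERSE'})
YIELD nodeId, score
RETURN toInteger(score) AS inDegree, count(*) AS nodeCount
ORDER BY inDegree DESC;
```

With a power law distribution, a few nodes would be expected to have a much higher in-degree than the rest; rerunning the same sketch with UNIFORM or RANDOM gives a point of comparison.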
Relationship Property
The graph generator is capable of generating a relationship property.
This can be controlled using the relationshipProperty parameter, which accepts the following parameters:
Name | Type | Default | Optional | Description |
---|---|---|---|---|
name | String | null | no | The name under which the property values are stored. |
type | String | null | no | The method used to generate property values. |
min | Float | 0.0 | yes | Minimal value of the generated property (only supported by RANDOM). |
max | Float | 1.0 | yes | Maximum value of the generated property (only supported by RANDOM). |
value | Float | null | yes | Fixed value assigned to every relationship (only supported by FIXED). |
Currently, there are two supported methods to generate relationship properties:
- FIXED - Assigns a fixed value to every relationship. The value parameter must be set.
- RANDOM - Assigns a random value between the lower (min) and upper (max) bound.
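The FIXED method can be sketched as follows. The graph name `fixedWeightGraph`, the property name `weight` and the value 1.5 are illustrative assumptions; only the `type`, `value` and `name` keys come from the parameter description above.

```cypher
// Every generated relationship receives the same property value.
// 'fixedWeightGraph', 'weight' and 1.5 are hypothetical choices for this sketch.
CALL gds.graph.generate('fixedWeightGraph', 5, 2, {
  relationshipSeed: 19,
  relationshipProperty: {type: 'FIXED', value: 1.5, name: 'weight'}
})
YIELD name, nodes, relationships, relationshipProperty
```

A fixed weight can be convenient when an algorithm requires a weight property but the values themselves should not influence the result.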
Relationship Seed
The relationshipSeed parameter allows the user to manually specify the seed used to generate the random graph.
When specified, the procedure produces the same relationships between nodes regardless of whether the generated graph is created as weighted or unweighted.
This is helpful when comparing the behavior or performance of an algorithm with and without relationship weights on the same topology.
Examples
All the examples below should be run in an empty database.
In the following we will demonstrate the usage of the random graph generation procedure.
Generating unweighted graphs
CALL gds.graph.generate('graph', 5, 2, {relationshipSeed: 19})
YIELD name, nodes, relationships, relationshipDistribution
name | nodes | relationships | relationshipDistribution |
---|---|---|---|
"graph" | 5 | 10 | "UNIFORM" |
A new in-memory graph called graph with 5 nodes and 10 relationships has been created and added to the graph catalog.
We can examine its topology with the gds.graph.relationships.stream procedure.
CALL gds.graph.relationships.stream('graph')
YIELD sourceNodeId, targetNodeId
RETURN sourceNodeId AS source, targetNodeId AS target
ORDER BY source ASC, target ASC
source | target |
---|---|
0 | 1 |
0 | 2 |
1 | 0 |
1 | 4 |
2 | 1 |
2 | 4 |
3 | 0 |
3 | 1 |
4 | 0 |
4 | 3 |
Generating weighted graphs
To generate graphs with weighted relationships, we must specify the relationshipProperty parameter as discussed above.
CALL gds.graph.generate('weightedGraph', 5, 2, {
  relationshipSeed: 19,
  relationshipProperty: {type: 'RANDOM', min: 5.0, max: 10.0, name: 'score'}
})
YIELD name, nodes, relationships, relationshipDistribution
name | nodes | relationships | relationshipDistribution |
---|---|---|---|
"weightedGraph" | 5 | 10 | "UNIFORM" |
The produced graph, weightedGraph, has a property named score containing a random value between 5.0 and 10.0 for each relationship.
We can use gds.graph.relationshipProperty.stream to stream the relationships of the graph along with their score values.
CALL gds.graph.relationshipProperty.stream('weightedGraph', 'score')
YIELD sourceNodeId, targetNodeId, propertyValue
RETURN sourceNodeId AS source, targetNodeId AS target, propertyValue AS score
ORDER BY source ASC, target ASC, score
source | target | score |
---|---|---|
0 | 1 | 6.791408433596591 |
0 | 2 | 8.662453313014902 |
1 | 0 | 6.258381821615686 |
1 | 4 | 9.711806397654765 |
2 | 1 | 9.469695236791349 |
2 | 4 | 6.519823445755963 |
3 | 0 | 8.747179900968224 |
3 | 1 | 7.752117836610726 |
4 | 0 | 8.614858979680758 |
4 | 3 | 5.060444167785128 |
Notice that although weightedGraph carries a score property and graph does not, both were generated with the same relationshipSeed, so their relationship topology is identical.
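One way to check this claim programmatically, rather than by comparing the two result tables by eye, is sketched below. It assumes both graphs from the examples above still exist in the graph catalog, and it simply collects the source and target pairs of each graph and compares them.

```cypher
// Collect the (source, target) pairs of the unweighted graph.
CALL gds.graph.relationships.stream('graph')
YIELD sourceNodeId, targetNodeId
WITH collect([sourceNodeId, targetNodeId]) AS unweighted
// Collect the pairs of the weighted graph and compare both lists.
CALL gds.graph.relationships.stream('weightedGraph')
YIELD sourceNodeId, targetNodeId
WITH unweighted, collect([sourceNodeId, targetNodeId]) AS weighted
RETURN size(unweighted) = size(weighted)
   AND all(pair IN unweighted WHERE pair IN weighted) AS sameTopology
```

The comparison treats the relationship lists as sets of pairs, which is sufficient here because the UNIFORM examples above contain no duplicate pairs.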