K-1 Coloring
Glossary
- Directed
-
Directed trait. The algorithm is well-defined on a directed graph.
- Directed
-
Directed trait. The algorithm ignores the direction of the graph.
- Directed
-
Directed trait. The algorithm does not run on a directed graph.
- Undirected
-
Undirected trait. The algorithm is well-defined on an undirected graph.
- Undirected
-
Undirected trait. The algorithm ignores the undirectedness of the graph.
- Heterogeneous nodes
-
Heterogeneous nodes fully supported. The algorithm has the ability to distinguish between nodes of different types.
- Heterogeneous nodes
-
Heterogeneous nodes allowed. The algorithm treats all selected nodes similarly regardless of their label.
- Heterogeneous relationships
-
Heterogeneous relationships fully supported. The algorithm has the ability to distinguish between relationships of different types.
- Heterogeneous relationships
-
Heterogeneous relationships allowed. The algorithm treats all selected relationships similarly regardless of their type.
- Weighted relationships
-
Weighted trait. The algorithm supports a relationship property to be used as weight, specified via the relationshipWeightProperty configuration parameter.
- Weighted relationships
-
Weighted trait. The algorithm treats each relationship as equally important, discarding the value of any relationship weight.
Introduction
The K-1 Coloring algorithm assigns a color to every node in the graph, trying to optimize for two objectives:
-
To make sure that every neighbor of a given node has a different color than the node itself.
-
To use as few colors as possible.
Note that the graph coloring problem is proven to be NP-complete, which makes it intractable on anything but trivial graph sizes. For that reason the implemented algorithm is a greedy algorithm. Thus it is neither guaranteed that the result is an optimal solution, using as few colors as theoretically possible, nor does it always produce a correct result where no two neighboring nodes have different colors. However the precision of the latter can be controlled by the number of iterations this algorithm runs.
For more information on this algorithm, see:
Running this algorithm requires sufficient memory availability. Before running this algorithm, we recommend that you read Memory Estimation. |
Syntax
CALL gds.k1coloring.stream(graphName: String, configuration: Map)
YIELD nodeId, color
Name | Type | Default | Optional | Description |
---|---|---|---|---|
graphName |
String |
|
yes |
The name of an existing graph on which to run the algorithm. If no graph name is provided, the configuration map must contain configuration for creating a graph. |
configuration |
Map |
|
yes |
Additional configuration, see below. |
Name | Type | Default | Optional | Description |
---|---|---|---|---|
List of String |
|
yes |
Filter the named graph using the given node labels. Nodes with any of the given labels will be included. |
|
List of String |
|
yes |
Filter the named graph using the given relationship types. Relationships with any of the given types will be included. |
|
Integer |
|
yes |
The number of concurrent threads used for running the algorithm. |
|
String |
|
yes |
An ID that can be provided to more easily track the algorithm’s progress. |
|
Boolean |
|
yes |
If disabled the progress percentage will not be logged. |
|
Integer |
|
yes |
The maximum number of iterations of K1 Coloring to run. |
|
minCommunitySize |
Integer |
|
yes |
Only nodes inside communities larger or equal the given value are returned. |
Name | Type | Description |
---|---|---|
nodeId |
Integer |
The ID of the Node |
color |
Integer |
The color of the Node |
CALL gds.k1coloring.stats(
graphName: String,
configuration: Map
)
YIELD
nodeCount,
colorCount,
ranIterations,
didConverge,
configuration,
preProcessingMillis,
computeMillis
Name | Type | Default | Optional | Description |
---|---|---|---|---|
graphName |
String |
|
no |
The name of a graph stored in the catalog. |
configuration |
Map |
|
yes |
Configuration for algorithm-specifics and/or graph filtering. |
Name | Type | Default | Optional | Description |
---|---|---|---|---|
concurrency |
Integer |
4 |
yes |
The number of concurrent threads used for running the algorithm. Also provides the default value for 'readConcurrency' and 'writeConcurrency'. This is dependent on the Neo4j edition; for more information, see CPU. |
Integer |
10 |
yes |
The maximum number of iterations of K1 Coloring to run. |
Name | Type | Description |
---|---|---|
nodeCount |
Integer |
The number of nodes considered. |
ranIterations |
Integer |
The actual number of iterations the algorithm ran. |
didConverge |
Boolean |
An indicator of whether the algorithm found a correct coloring. |
colorCount |
Integer |
The number of colors used. |
preProcessingMillis |
Integer |
Milliseconds for preprocessing the data. |
computeMillis |
Integer |
Milliseconds for running the algorithm. |
configuration |
Map |
The configuration used for running the algorithm. |
CALL gds.k1coloring.mutate(graphName: String, configuration: Map)
YIELD nodeCount, colorCount, ranIterations, didConverge, configuration, preProcessingMillis, computeMillis, mutateMillis
Name | Type | Default | Optional | Description |
---|---|---|---|---|
graphName |
String |
|
no |
The name of a graph stored in the catalog. |
configuration |
Map |
|
yes |
Configuration for algorithm-specifics and/or graph filtering. |
The configuration for the mutate
mode is similar to the write
mode.
Instead of specifying a writeProperty
, we need to specify a mutateProperty
.
Also, specifying writeConcurrency
is not possible in mutate
mode.
Name | Type | Description |
---|---|---|
nodeCount |
Integer |
The number of nodes considered. |
ranIterations |
Integer |
The actual number of iterations the algorithm ran. |
didConverge |
Boolean |
An indicator of whether the algorithm found a correct coloring. |
colorCount |
Integer |
The number of colors used. |
preProcessingMillis |
Integer |
Milliseconds for preprocessing the data. |
computeMillis |
Integer |
Milliseconds for running the algorithm. |
mutateMillis |
Integer |
Milliseconds for adding properties to the projected graph. |
configuration |
Map |
The configuration used for running the algorithm. |
CALL gds.k1coloring.write(graphName: String, configuration: Map)
YIELD nodeCount, colorCount, ranIterations, didConverge, configuration, preProcessingMillis, computeMillis, writeMillis
Name | Type | Default | Optional | Description |
---|---|---|---|---|
graphName |
String |
|
no |
The name of a graph stored in the catalog. |
configuration |
Map |
|
yes |
Configuration for algorithm-specifics and/or graph filtering. |
Name | Type | Default | Optional | Description |
---|---|---|---|---|
Integer |
4 |
yes |
The number of concurrent threads used for running the algorithm. Also provides the default value for 'readConcurrency' and 'writeConcurrency'. This is dependent on the Neo4j edition; for more information, see CPU. |
|
Integer |
value of 'concurrency' |
yes |
The number of concurrent threads used for writing the result. |
|
Integer |
10 |
yes |
The maximum number of iterations of K1 Coloring to run. |
|
String |
n/a |
no |
The node property this procedure writes the color to. |
|
minCommunitySize |
Integer |
0 |
yes |
Only community ids of communities with a size greater than or equal to the given value are written to Neo4j. |
Name | Type | Description |
---|---|---|
nodeCount |
Integer |
The number of nodes considered. |
ranIterations |
Integer |
The actual number of iterations the algorithm ran. |
didConverge |
Boolean |
An indicator of whether the algorithm found a correct coloring. |
colorCount |
Integer |
The number of colors used. |
preProcessingMillis |
Integer |
Milliseconds for preprocessing the data. |
computeMillis |
Integer |
Milliseconds for running the algorithm. |
writeMillis |
Integer |
Milliseconds for writing result data back to Neo4j. |
configuration |
Map |
The configuration used for running the algorithm. |
Examples
All the examples below should be run in an empty database. The examples use Cypher projections as the norm. Native projections will be deprecated in a future release. |
Consider the graph created by the following Cypher statement:
CREATE (alice:User {name: 'Alice'}),
(bridget:User {name: 'Bridget'}),
(charles:User {name: 'Charles'}),
(doug:User {name: 'Doug'}),
(alice)-[:LINK]->(bridget),
(alice)-[:LINK]->(charles),
(alice)-[:LINK]->(doug),
(bridget)-[:LINK]->(charles)
This graph has a super node with name "Alice" that connects to all other nodes. It should therefore not be possible for any other node to be assigned the same color as the Alice node.
MATCH (source:User)-[r:LINK]->(target:User)
RETURN gds.graph.project(
'myGraph',
source,
target,
{},
{ undirectedRelationshipTypes: ['*'] }
)
We can now go ahead and project a graph with all the User
nodes and the LINK
relationships with UNDIRECTED
orientation.
MATCH (source:Person)-[r:LIKES]->(target:Person)
RETURN gds.graph.project(
'myGraph',
source,
target
)
In the following examples we will demonstrate using the K-1 Coloring algorithm on this graph.
CALL gds.k1coloring.stream('myGraph')
YIELD nodeId, color
RETURN gds.util.asNode(nodeId).name AS name, color
ORDER BY name
name | color |
---|---|
|
|
|
|
|
|
|
|
It is also possible to write the assigned colors back to the database using the write
mode.
CALL gds.k1coloring.write('myGraph', {writeProperty: 'color'})
YIELD nodeCount, colorCount, ranIterations, didConverge
nodeCount | colorCount | ranIterations | didConverge |
---|---|---|---|
|
|
|
|
When using write
mode the procedure will return information about the algorithm execution.
In this example we return the number of processed nodes, the number of colors used to color the graph, the number of iterations and information whether the algorithm converged.
To instead mutate the in-memory graph with the assigned colors, the mutate
mode can be used as follows.
CALL gds.k1coloring.mutate('myGraph', {mutateProperty: 'color'})
YIELD nodeCount, colorCount, ranIterations, didConverge
nodeCount | colorCount | ranIterations | didConverge |
---|---|---|---|
|
|
|
|
Similar to the write
mode, stats
mode can run the algorithm and return only the execution statistics without persisting the results.
CALL gds.k1coloring.stats('myGraph')
YIELD nodeCount, colorCount, ranIterations, didConverge
nodeCount | colorCount | ranIterations | didConverge |
---|---|---|---|
|
|
|
|