Collapse Path

This feature is in the beta tier. For more information on feature tiers, see API Tiers.

Introduction

The Collapse Path algorithm is a traversal algorithm capable of creating relationships between the start and end nodes of a traversal. In other words, the path between a start node and an end node is collapsed into a single relationship (a direct path). The algorithm is intended to support the creation of monopartite graphs required by many graph algorithms.

The main input for the algorithm is a list of path templates. Starting from every node in the specified graph, the relationships of each template are traversed one after the other using the order specified in the configuration. Only nodes reached after traversing entire paths are used as end nodes. Exactly one directed relationship is created for every pair of nodes for which at least one path from start to end node exists.

Syntax

Collapse Path syntax per mode
Run Collapse Path in mutate mode on a named graph.
CALL gds.collapsePath.mutate(
  graphName: String,
  configuration: Map
)
YIELD
  preProcessingMillis: Integer,
  computeMillis: Integer,
  mutateMillis: Integer,
  relationshipsWritten: Integer,
  configuration: Map
Table 1. Parameters
Name Type Default Optional Description

graphName

String

n/a

no

The name of a graph stored in the catalog.

configuration

Map

{}

yes

Configuration for algorithm-specifics and/or graph filtering.

Table 2. General configuration for algorithm execution on a named graph.
Name Type Default Optional Description

nodeLabels

List of String

['*']

yes

Filter the named graph using the given node labels.

concurrency

Integer

4

yes

The number of concurrent threads used for running the algorithm.

Table 3. Algorithm specific configuration
Name Type Default Optional Description

pathTemplates

List of List of String

n/a

no

A path template is an ordered list of relationship types used for the traversal. The same relationship type can be added multiple times, in order to traverse them as indicated. And, you may specify several path templates to process in one go.

mutateRelationshipType

String

n/a

no

Relationship type of the newly created relationships.

allowSelfLoops

Boolean

false

yes

Indicates whether it is possible to create self referencing relationships, i.e. relationships where the start and end node are identical.

Table 4. Results
Name Type Description

preProcessingMillis

Integer

Milliseconds for preprocessing the data.

computeMillis

Integer

Milliseconds for running the algorithm.

mutateMillis

Integer

Milliseconds for adding properties to the projected graph.

relationshipsWritten

Integer

The number of relationships created by the algorithm.

configuration

Map

The configuration used for running the algorithm.

Examples

All the examples below should be run in an empty database.

The examples use Cypher projections as the norm. Native projections will be deprecated in a future release.

Consider the graph created by the following Cypher statement:

CREATE
  (Dan:Person),
  (Annie:Person),
  (Matt:Person),
  (Jeff:Person),

  (Guitar:Instrument),
  (Flute:Instrument),

  (Dan)-[:PLAYS]->(Guitar),
  (Annie)-[:PLAYS]->(Guitar),

  (Matt)-[:PLAYS]->(Flute),
  (Jeff)-[:PLAYS]->(Flute)

In this example we want to create a relationship, called PLAYS_SAME_INSTRUMENT, between Person nodes that play the same instrument. To achieve that we have to traverse a path specified by the following Cypher pattern:

(p1:Person)-[:PLAYS]->(:Instrument)-[:PLAYED_BY]->(p2:Person)

In our source graph only the PLAYS relationship type exists. The PLAYED_BY relationship type can be created by loading the PLAYS relationship type in REVERSE direction. The following query will project such a graph:

MATCH (p:Person)-[:PLAYS]->(i:Instrument)
CALL {
  WITH p, i
  RETURN id(p) AS sourceId, id(i) AS targetId, 'PLAYS' AS rType
  UNION
  WITH p, i
  RETURN id(i) AS sourceId, id(p) AS targetId, 'PLAYED_BY' AS rType
}
RETURN gds.graph.project('persons', sourceId, targetId, { relationshipType: rType })

Now we can run the algorithm by specifying the traversal PLAYS, PLAYED_BY in the pathTemplates option.

CALL gds.collapsePath.mutate(
  'persons',
  {
    pathTemplates: [['PLAYS', 'PLAYED_BY']],
    allowSelfLoops: false,
    mutateRelationshipType: 'PLAYS_SAME_INSTRUMENT'
  }
) YIELD relationshipsWritten
Table 5. Results
relationshipsWritten

4

The mutated graph will look like the following graph when filtered by the PLAYS_SAME_INSTRUMENT relationship
CREATE
  (Dan:Person),
  (Annie:Person),
  (Matt:Person),
  (Jeff:Person),

  (Guitar:Instrument),
  (Flute:Instrument),

  (Dan)-[:PLAYS_SAME_INSTRUMENT]->(Annie),
  (Annie)-[:PLAYS_SAME_INSTRUMENT]->(Dan),

  (Matt)-[:PLAYS_SAME_INSTRUMENT]->(Jeff),
  (Jeff)-[:PLAYS_SAME_INSTRUMENT]->(Matt)