Merge nodes
The APOC library contains a procedure that can be used to merge nodes.
This procedure merges a list of nodes onto the first node in the list; all relationships are merged onto that node as well. The merge behaviour can be specified for properties globally and/or individually.
Procedure for merging nodes
Qualified Name | Type |
---|---|
apoc.refactor.mergeNodes | Procedure |
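The following query merges Person nodes that share an email address, using a separate merge strategy per property:
MATCH (p:Person)
WITH p ORDER BY p.created DESC // newest one first
WITH p.email AS email, collect(p) AS nodes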
CALL apoc.refactor.mergeNodes(nodes, {properties: {
name:'discard',
age:'overwrite',
kids:'combine',
`addr.*`: 'overwrite',
`.*`: 'discard'
}})
YIELD node
RETURN node
Config options
The config options for this procedure are listed below. They also work for apoc.refactor.mergeRelationships([rels],{config}).
type | operations |
---|---|
discard | the property from the first node remains if already set; otherwise the first property in the list is written |
overwrite / override | the last property in the list wins |
combine | if there is only one property in the list, it is set or kept as a single property; otherwise an array is created and the procedure tries to coerce the values |
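The properties key can take either a map of per-property strategies, as in the query above, or a single strategy name that applies to every property. The following is a minimal sketch of the global form. The Person data and the example.com address are made up for illustration, and no particular result is claimed, because collect() does not guarantee an order without an explicit ORDER BY.

```cypher
// Hypothetical data: two Person nodes that share an email address.
CREATE (:Person {email:'tom@example.com', name:'Tom', age:40}),
       (:Person {email:'tom@example.com', name:'Thomas', age:41});

// Merge the nodes per email; every property uses the 'overwrite' strategy.
MATCH (p:Person)
WITH p.email AS email, collect(p) AS nodes
WHERE size(nodes) > 1 // skip emails with nothing to merge
CALL apoc.refactor.mergeNodes(nodes, {properties: 'overwrite'})
YIELD node
RETURN node.email AS email, node.name AS name, node.age AS age;
```

Replacing 'overwrite' with 'discard' or 'combine' switches all properties to that strategy.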
In addition, mergeNodes supports the following config properties:
type | operations |
---|---|
mergeRels | true/false: whether to merge relationships with the same type and direction. |
produceSelfRel | true/false: if |
preserveExistingSelfRels | true/false: is valid only with |
singleElementAsArray | false/true: if is |
Properties with a name not included in the config map are overridden.
Relationship properties are managed with the same method, except when the config map is null, in which case the entity properties are combined.
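Because the same strategies apply to apoc.refactor.mergeRelationships, a sketch like the following could merge parallel relationships between one pair of nodes. The data is hypothetical, and the YIELD rel column is assumed from the APOC procedure signature rather than taken from this page.

```cypher
// Hypothetical data: two parallel WORKS_FOR relationships between the same nodes.
CREATE (p:Person {name:'Tom'}), (c:Company {name:'Company1'}),
       (p)-[:WORKS_FOR {since:2015}]->(c),
       (p)-[:WORKS_FOR {since:2018}]->(c);

// Collect the parallel relationships and merge them, combining their properties.
MATCH (:Person {name:'Tom'})-[r:WORKS_FOR]->(:Company {name:'Company1'})
WITH collect(r) AS rels
CALL apoc.refactor.mergeRelationships(rels, {properties: 'combine'})
YIELD rel
RETURN type(rel) AS type, rel.since AS since;
```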
Examples
The examples below illustrate this procedure.
Same start and end nodes
CREATE (n1:Person {name:'Tom'}),
(n2:Person {name:'John'}),
(n3:Company {name:'Company1'}),
(n5:Car {brand:'Ferrari'}),
(n6:Animal:Cat {name:'Derby'}),
(n7:City {name:'London'}),
(n1)-[:WORKS_FOR {since:2015}]->(n3),
(n2)-[:WORKS_FOR {since:2018}]->(n3),
(n3)-[:HAS_HQ {since:2004}]->(n7),
(n1)-[:DRIVE {since:2017}]->(n5),
(n2)-[:HAS {since:2013}]->(n6)
RETURN *;
// a1 (John) comes first in the list, so Tom is merged onto John
MATCH (a1:Person {name:'John'}), (a2:Person {name:'Tom'})
WITH head(collect([a1,a2])) AS nodes
CALL apoc.refactor.mergeNodes(nodes,{properties:"combine", mergeRels:true})
YIELD node
RETURN count(*)
Running the above query merges the two Person nodes into one.
Since the WORKS_FOR relationships then have the same start and end nodes, they are merged into a single relationship and their properties are combined.
Different start and end nodes
CREATE (n1:Person {name:'Tom'}),
(n2:Person {name:'John'}),
(n3:Company {name:'Company1'}),
(n4:Company {name:'Company2'}),
(n5:Car {brand:'Ferrari'}),
(n6:Animal:Cat {name:'Derby'}),
(n7:City {name:'London'}),
(n8:City {name:'Liverpool'}),
(n1)-[:WORKS_FOR{since:2015}]->(n3),
(n2)-[:WORKS_FOR{since:2018}]->(n4),
(n3)-[:HAS_HQ{since:2004}]->(n7),
(n4)-[:HAS_HQ{since:2007}]->(n8),
(n1)-[:DRIVE{since:2017}]->(n5),
(n2)-[:HAS{since:2013}]->(n6)
RETURN *;
// a1 (John) comes first in the list, so Tom is merged onto John
MATCH (a1:Person {name:'John'}), (a2:Person {name:'Tom'})
WITH head(collect([a1,a2])) AS nodes
CALL apoc.refactor.mergeNodes(nodes,{
properties:"combine",
mergeRels:true
})
YIELD node
RETURN count(*)
Running the above query merges the two Person nodes into one.
Since the relationships have different end nodes, all relationships and their properties are maintained.
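To check the outcome of either example, the relationships of the remaining Person node can be listed with a query along these lines. This is only an inspection sketch; the column aliases are arbitrary.

```cypher
// List every outgoing relationship of the remaining Person node(s).
MATCH (p:Person)-[r]->(target)
RETURN p.name AS name,
       type(r) AS relType,
       properties(r) AS relProps,
       labels(target) AS targetLabels
ORDER BY relType;
```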