apoc.meta.schema
This procedure is not considered safe to run from multiple threads. It is therefore not supported by the parallel runtime (introduced in Neo4j 5.13). For more information, see the Cypher Manual → Parallel runtime.
Syntax |
|
||
Description |
Examines the given sub-graph and returns metadata as a map. |
||
Input arguments |
Name |
Type |
Description |
|
|
Number of nodes to sample; setting sample to -1 disables sampling. |
|
Return arguments |
Name |
Type |
Description |
|
|
Meta information represented as a map. |
Config parameters
This procedure supports the following config parameters:
Name | Type | Default | Description |
---|---|---|---|
sample | | 1000 | Number of nodes to sample. Setting sample to -1 disables sampling. |
Sampling
Specify the sample parameter (1000 by default) to analyze a subset of the data.
The sample, along with the count of nodes for each label, is used to calculate a skip value. Since this value is generated using a random number generator, results obtained through the sampling method may vary between subsequent runs.
If a database contains 500 nodes with the label Foo, the skip count for that label is calculated as follows:
The skip count per node label is a random number between 0.9 and 1.1 times (totalNodesForLabel / sample), that is, within ±10% of that ratio.
Sample 10: skipCount = 500 / 10 = 50
The resulting skip count will be between 45 and 55.
Sample 50: skipCount = 500 / 50 = 10
The resulting skip count will be between 9 and 11.
Sample 100: skipCount = 500 / 100 = 5
The resulting skip count will be 5.
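The ranges above can be reproduced with a few lines of arithmetic. The following Python sketch is only an illustration, not APOC's implementation: the function name `skip_count_range` is hypothetical, and the ±10% spread is an assumption read off the three worked examples.

```python
def skip_count_range(total_nodes_for_label: int, sample: int) -> tuple[float, float]:
    """Return the (lower, upper) bounds of the skip count.

    Assumption: the random skip count lies within ±10% of
    total_nodes_for_label / sample, as the worked examples suggest.
    """
    base = total_nodes_for_label / sample
    return (base * 0.9, base * 1.1)


# Reproduce the three cases for a label with 500 nodes.
for sample in (10, 50, 100):
    low, high = skip_count_range(500, sample)
    print(f"sample={sample}: skip count between {low:g} and {high:g}")
```

For sample 100 the bounds are 4.5 and 5.5; if the skip count is taken as a whole number, 5 is the only value in that range, which agrees with the last case above.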
The skip count represents the number of nodes skipped before one is examined. For instance, with a skip count of 5, every 5th node is examined. Consequently, a higher sample number results in more nodes being sampled.
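To see why a higher sample value leads to more nodes being examined, consider this simplified Python model. It assumes a fixed skip count and that exactly every `skip_count`-th node is examined; the helper `examined_positions` is hypothetical and ignores the random variation described above.

```python
def examined_positions(total_nodes: int, skip_count: int) -> list[int]:
    """1-based positions of the nodes examined when every
    `skip_count`-th node is read (simplified, no randomness)."""
    return list(range(skip_count, total_nodes + 1, skip_count))


# 500 nodes: a larger sample gives a smaller skip count, so more nodes are read.
for sample in (10, 50, 100):
    skip_count = 500 // sample
    examined = examined_positions(500, skip_count)
    print(f"sample={sample}, skip count={skip_count}, nodes examined={len(examined)}")
```

Under these assumptions the number of nodes examined equals the sample value, because the skip count is the node total divided by the sample.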
To disable sampling, set sample: -1.
Usage Examples
The examples in this section are based on the following sample graph:
CREATE (Keanu:Person {name:'Keanu Reeves', born:1964})
CREATE (TomH:Person {name:'Tom Hanks', born:1956})
CREATE (TheMatrix:Movie {title:'The Matrix', released:1999, tagline:'Welcome to the Real World'})
CREATE (TheMatrixReloaded:Movie {title:'The Matrix Reloaded', released:2003, tagline:'Free your mind'})
CREATE (TheMatrixRevolutions:Movie {title:'The Matrix Revolutions', released:2003, tagline:'Everything that has a beginning has an end'})
CREATE (SomethingsGottaGive:Movie {title:"Something's Gotta Give", released:2003})
CREATE (TheDevilsAdvocate:Movie {title:"The Devil's Advocate", released:1997, tagline:'Evil has its winning ways'})
CREATE (YouveGotMail:Movie {title:"You've Got Mail", released:1998, tagline:'At odds in life... in love on-line.'})
CREATE (SleeplessInSeattle:Movie {title:'Sleepless in Seattle', released:1993, tagline:'What if someone you never met, someone you never saw, someone you never knew was the only someone for you?'})
CREATE (ThatThingYouDo:Movie {title:'That Thing You Do', released:1996, tagline:'In every life there comes a time when that thing you dream becomes that thing you do'})
CREATE (CloudAtlas:Movie {title:'Cloud Atlas', released:2012, tagline:'Everything is connected'})
CREATE (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrix)
CREATE (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrixReloaded)
CREATE (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrixRevolutions)
CREATE (Keanu)-[:ACTED_IN {roles:['Julian Mercer']}]->(SomethingsGottaGive)
CREATE (Keanu)-[:ACTED_IN {roles:['Kevin Lomax']}]->(TheDevilsAdvocate)
CREATE (TomH)-[:ACTED_IN {roles:['Joe Fox']}]->(YouveGotMail)
CREATE (TomH)-[:ACTED_IN {roles:['Sam Baldwin']}]->(SleeplessInSeattle)
CREATE (TomH)-[:ACTED_IN {roles:['Mr. White']}]->(ThatThingYouDo)
CREATE (TomH)-[:ACTED_IN {roles:['Zachry', 'Dr. Henry Goose', 'Isaac Sachs', 'Dermot Hoggins']}]->(CloudAtlas)
CREATE (s0:sameName{id:1}) -[r0:sameName {alfa: 'beta'}] -> (t0:sameName{id:2});
CALL apoc.meta.schema()
YIELD value
UNWIND keys(value) AS key
RETURN key, value[key] AS value;
Note that when a relationship type and a node label share the same name, the relationship entry is distinguished by the suffix " (RELATIONSHIP)".
key | value |
---|---|
"Movie" |
{count: 9, relationships: {ACTED_IN: {count: 41, properties: {roles: {existence: FALSE, type: "LIST", array: TRUE}}, direction: "in", labels: ["Person"]}}, type: "node", properties: {tagline: {existence: FALSE, type: "STRING", indexed: FALSE, unique: FALSE}, title: {existence: FALSE, type: "STRING", indexed: FALSE, unique: FALSE}, released: {existence: FALSE, type: "INTEGER", indexed: FALSE, unique: FALSE}}, labels: []} |
"ACTED_IN" |
{count: 9, type: "relationship", properties: {roles: {existence: FALSE, type: "LIST", array: TRUE}}} |
"Person" |
{count: 2, relationships: {ACTED_IN: {count: 9, properties: {roles: {existence: FALSE, type: "LIST", array: TRUE}}, direction: "out", labels: ["Movie"]}}, type: "node", properties: {name: {existence: FALSE, type: "STRING", indexed: FALSE, unique: FALSE}, born: {existence: FALSE, type: "INTEGER", indexed: FALSE, unique: FALSE}}, labels: []} |
"sameName (RELATIONSHIP)" |
{count: 1, type: "relationship", properties: {alfa: {existence: FALSE, type: "STRING", array: FALSE}}} |
"sameName" |
{count: 2, relationships: {sameName: {count: 1, properties: {alfa: {existence: FALSE, type: "STRING", array: FALSE}}, direction: "out", labels: ["sameName"]}}, type: "node", properties: {id: {existence: FALSE, type: "INTEGER", indexed: FALSE, unique: FALSE}}, labels: []} |
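Client code that consumes this map may want to separate node entries from relationship entries and strip the " (RELATIONSHIP)" suffix. The Python sketch below shows one possible way to do so, assuming the result has already been fetched into a plain dictionary; the helper `split_schema` and the trimmed-down sample data are hypothetical and are not part of APOC or any driver.

```python
REL_SUFFIX = " (RELATIONSHIP)"


def split_schema(schema: dict) -> tuple[dict, dict]:
    """Split a schema map into (nodes, relationships), keyed by plain names."""
    nodes, relationships = {}, {}
    for key, entry in schema.items():
        if entry.get("type") == "relationship":
            # Drop the disambiguating suffix, if present.
            name = key[: -len(REL_SUFFIX)] if key.endswith(REL_SUFFIX) else key
            relationships[name] = entry
        else:
            nodes[key] = entry
    return nodes, relationships


# Trimmed-down entries modelled on the output above.
schema = {
    "ACTED_IN": {"type": "relationship", "count": 9},
    "Person": {"type": "node", "count": 2},
    "sameName (RELATIONSHIP)": {"type": "relationship", "count": 1},
    "sameName": {"type": "node", "count": 2},
}

nodes, relationships = split_schema(schema)
print(sorted(nodes))          # ['Person', 'sameName']
print(sorted(relationships))  # ['ACTED_IN', 'sameName']
```

The sketch relies on the type field of each entry rather than on the key alone, so it also handles relationship types that carry no suffix.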
Sample config usage example
Given the following graph:
CREATE (:Foo), (:Other)-[:REL_0]->(:Other), (:Other)-[:REL_1]->(:Other)<-[:REL_2 {baz: 'baa'}]-(:Other), (:Other {alpha: 'beta'}), (:Other {foo:'bar'})-[:REL_3]->(:Other)
Without the sample parameter, we receive:
CALL apoc.meta.schema()
YIELD value RETURN value["Other"] as value;
value |
---|
|
With sample: 2, we might instead obtain the following (the result can vary between runs):
CALL apoc.meta.schema({sample: 2})
YIELD value RETURN value["Other"] as value
value |
---|
|