apoc.meta.nodeTypeProperties

This procedure is not considered safe to run from multiple threads. It is therefore not supported by the parallel runtime (introduced in Neo4j 5.13). For more information, see the Cypher Manual → Parallel runtime.

Details

Syntax

apoc.meta.nodeTypeProperties([ config ]) :: (nodeType, nodeLabels, propertyName, propertyTypes, mandatory, propertyObservations, totalObservations)

Description

Examines the full graph and returns a table of metadata with information about the NODE values therein.

Input arguments

Name | Type | Description
config | MAP | { includeLabels = [] :: LIST<STRING>, includeRels = [] :: LIST<STRING>, excludeLabels = [] :: LIST<STRING>, excludeRels = [] :: LIST<STRING>, sample = 1000 :: INTEGER, maxRels = 100 :: INTEGER }. The default is: {}.

Return arguments

Name | Type | Description
nodeType | STRING | The type of the node.
nodeLabels | LIST<STRING> | The labels on the node.
propertyName | STRING | The name of the property.
propertyTypes | LIST<STRING> | The types this property has.
mandatory | BOOLEAN | Whether or not this property exists on all nodes of the given type.
propertyObservations | INTEGER | The number of times this property was observed.
totalObservations | INTEGER | The number of times the label was seen.

Config parameters

This procedure supports the following config parameters:

Config parameters
Name | Type | Default | Description
includeLabels | LIST<STRING> | [] | Node labels to include. Default is to include all node labels.
includeRels | LIST<STRING> | [] | Relationship types to include. Default is to include all relationship types.
excludeLabels | LIST<STRING> | [] | Node labels to exclude. Default is to include all node labels.
excludeRels | LIST<STRING> | [] | Relationship types to exclude. Default is to include all relationship types.
sample | INTEGER | 1000 | Number of nodes to sample. Setting sample to -1 will remove sampling.
maxRels | INTEGER | 100 | Number of relationships to read per sampled node.
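Several config parameters can be passed together in one map. The following is a minimal sketch that uses only the parameters listed above; the label name Movie is an assumption taken from the sample graph in the Usage Examples section, and no output is shown because it depends on the database contents.

```cypher
// Exclude one label and disable sampling in a single config map.
// "Movie" is an illustrative label; replace it with a label from your graph.
CALL apoc.meta.nodeTypeProperties({
  excludeLabels: ["Movie"],
  sample: -1
});
```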

Sampling

Specify the sample parameter (1000 by default) to analyze a subset of the data.

The sample, along with the count of nodes for each label, is used to calculate a skip value. Since this value is generated using a random number generator, results obtained through the sampling method may vary between subsequent runs.

Example 1. Calculating skip count for data sampling

If a database contains 500 nodes with the label Foo, the skip count for that label is calculated as follows:

The skip count per node label is determined by generating a random number in the range (totalNodesForLabel / sample) ± 10%.

Sample 10: skipCount = 500 / 10 = 50
The resulting skip count will be between 45 and 55.

Sample 50: skipCount = 500 / 50 = 10
The resulting skip count will be between 9 and 11.

Sample 100: skipCount = 500 / 100 = 5
The resulting skip count will be 5.

The skip count represents the number of nodes skipped before one is examined. For instance, with a skip count of 5, every 5th node is examined. Consequently, a higher sample number results in more nodes being sampled.

To disable sampling, set sample: -1.
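The arithmetic in Example 1 can be reproduced in plain Cypher. This is only a sketch of the calculation, assuming the bounds are 10% either side of totalNodesForLabel / sample as the worked figures suggest; it does not call APOC or reproduce its random number generation, and the column names lowerBound and upperBound are illustrative.

```cypher
// Reproduce the skip-count ranges from Example 1 for a label with 500 nodes.
// Assumption: the range is (totalNodesForLabel / sample) plus or minus 10%.
WITH 500 AS totalNodesForLabel
UNWIND [10, 50, 100] AS sample
WITH sample, toFloat(totalNodesForLabel) / sample AS skipCount
RETURN sample,
       skipCount,
       skipCount * 0.9 AS lowerBound,
       skipCount * 1.1 AS upperBound;
```

For sample 100 the bounds fall within half a unit of 5, which is consistent with the example reporting a skip count of exactly 5, presumably after conversion to a whole number.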

Deprecated parameters
Name | Type | Default | Description
labels | LIST<STRING> | [] | Deprecated, use includeLabels.
rels | LIST<STRING> | [] | Deprecated, use includeRels.
excludes | LIST<STRING> | [] | Deprecated, use excludeLabels.
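When updating older queries, each deprecated key maps to a replacement as listed above. The following sketch shows the change for labels; the label name Person is an assumption taken from the sample graph below, and the deprecated call is left as a comment for comparison only.

```cypher
// Deprecated form, shown for comparison:
// CALL apoc.meta.nodeTypeProperties({labels: ["Person"]});

// Preferred form using the replacement parameter:
CALL apoc.meta.nodeTypeProperties({includeLabels: ["Person"]});
```

The same substitution applies to rels (use includeRels) and excludes (use excludeLabels).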

Usage Examples

The examples in this section are based on the following sample graph:

CREATE (Keanu:Person {name:'Keanu Reeves', born:1964})
CREATE (TomH:Person {name:'Tom Hanks', born:1956})

CREATE (TheMatrix:Movie {title:'The Matrix', released:1999, tagline:'Welcome to the Real World'})
CREATE (TheMatrixReloaded:Movie {title:'The Matrix Reloaded', released:2003, tagline:'Free your mind'})
CREATE (TheMatrixRevolutions:Movie {title:'The Matrix Revolutions', released:2003, tagline:'Everything that has a beginning has an end'})
CREATE (SomethingsGottaGive:Movie {title:"Something's Gotta Give", released:2003})
CREATE (TheDevilsAdvocate:Movie {title:"The Devil's Advocate", released:1997, tagline:'Evil has its winning ways'})

CREATE (YouveGotMail:Movie {title:"You've Got Mail", released:1998, tagline:'At odds in life... in love on-line.'})
CREATE (SleeplessInSeattle:Movie {title:'Sleepless in Seattle', released:1993, tagline:'What if someone you never met, someone you never saw, someone you never knew was the only someone for you?'})
CREATE (ThatThingYouDo:Movie {title:'That Thing You Do', released:1996, tagline:'In every life there comes a time when that thing you dream becomes that thing you do'})
CREATE (CloudAtlas:Movie {title:'Cloud Atlas', released:2012, tagline:'Everything is connected'})

CREATE (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrix)
CREATE (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrixReloaded)
CREATE (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrixRevolutions)
CREATE (Keanu)-[:ACTED_IN {roles:['Julian Mercer']}]->(SomethingsGottaGive)
CREATE (Keanu)-[:ACTED_IN {roles:['Kevin Lomax']}]->(TheDevilsAdvocate)

CREATE (TomH)-[:ACTED_IN {roles:['Joe Fox']}]->(YouveGotMail)
CREATE (TomH)-[:ACTED_IN {roles:['Sam Baldwin']}]->(SleeplessInSeattle)
CREATE (TomH)-[:ACTED_IN {roles:['Mr. White']}]->(ThatThingYouDo)
CREATE (TomH)-[:ACTED_IN {roles:['Zachry', 'Dr. Henry Goose', 'Isaac Sachs', 'Dermot Hoggins']}]->(CloudAtlas);

We can return metadata for all node labels, based on a sample of the database contents, by running the following query:

CALL apoc.meta.nodeTypeProperties();
Results
nodeType | nodeLabels | propertyName | propertyTypes | mandatory | propertyObservations | totalObservations
":`Person`" | ["Person"] | "name" | ["String"] | FALSE | 2 | 2
":`Person`" | ["Person"] | "born" | ["Long"] | FALSE | 2 | 2
":`Movie`" | ["Movie"] | "title" | ["String"] | FALSE | 9 | 9
":`Movie`" | ["Movie"] | "tagline" | ["String"] | FALSE | 8 | 9
":`Movie`" | ["Movie"] | "released" | ["Long"] | FALSE | 9 | 9
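The returned columns can be filtered with YIELD and WHERE. As a sketch, the query below keeps only properties that were observed on fewer nodes than the label itself; in the results above, tagline is the only such row (8 of 9 observations), so it is the row this query should return on the sample graph. Because sampling is randomized, results on larger graphs may vary between runs.

```cypher
// List properties that are missing from at least one sampled node of their type.
CALL apoc.meta.nodeTypeProperties()
YIELD nodeType, propertyName, propertyObservations, totalObservations
WHERE propertyObservations < totalObservations
RETURN nodeType, propertyName, propertyObservations, totalObservations;
```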

To return metadata for a subset of labels, specify the includeLabels config parameter. The following returns the metadata for the Person label:

CALL apoc.meta.nodeTypeProperties({includeLabels: ["Person"]});
Results
nodeType | nodeLabels | propertyName | propertyTypes | mandatory | propertyObservations | totalObservations
":`Person`" | ["Person"] | "name" | ["String"] | FALSE | 2 | 2
":`Person`" | ["Person"] | "born" | ["Long"] | FALSE | 2 | 2
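The per-property rows can also be collapsed into one row per label set. The following is a minimal sketch using standard Cypher aggregation on the procedure output; the alias propertyNames is illustrative, and the order of names inside the collected list is not guaranteed.

```cypher
// Summarize the observed property names for each label set.
CALL apoc.meta.nodeTypeProperties({includeLabels: ["Person"]})
YIELD nodeLabels, propertyName
RETURN nodeLabels, collect(propertyName) AS propertyNames;
```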

We can control the sampling rate by specifying the sample parameter. The following returns metadata based on sampling up to 3 nodes per label:

CALL apoc.meta.nodeTypeProperties({sample: 3});
Results
nodeType | nodeLabels | propertyName | propertyTypes | mandatory | propertyObservations | totalObservations
":`Person`" | ["Person"] | "name" | ["String"] | FALSE | 2 | 2
":`Person`" | ["Person"] | "born" | ["Long"] | FALSE | 2 | 2
":`Movie`" | ["Movie"] | "title" | ["String"] | FALSE | 3 | 3
":`Movie`" | ["Movie"] | "tagline" | ["String"] | FALSE | 3 | 3
":`Movie`" | ["Movie"] | "released" | ["Long"] | FALSE | 3 | 3