The impact of indexes on query performance
Search-performance indexes enable quicker and more efficient pattern matching by solving a particular combination of node label/relationship type and property predicate.
They are used automatically by the Cypher® planner in MATCH
clauses, usually at the start of a query, to scan the graph for the most appropriate place to start the pattern-matching process.
By examining query execution plans, this page will explain the scenarios in which the various search-performance indexes are used to improve the performance of Cypher queries. It will also provide some general heuristics for when to use indexes, and advice about how to avoid over-indexing.
Example graph
The examples on this page center around finding routes and points of interest in Central Park, New York, based on data provided by OpenStreetMap. The data model contains two node labels:
-
OSMNode
(Open Street Map Node) — a junction node with geo-spatial properties linking together routes from specific points. -
PointOfInterest
— a subcategory ofOSMNode
. In addition to geospatial properties, these nodes also contain information about specific points of interest, such as statues, baseball courts, etc. in Central Park.
The data model also contains one relationship type: ROUTE
, which specifies the distance in meters between the nodes in the graph.
In total, the graph contains 69165 nodes (of which 188 have the label PointOfInterest
) and 152077 ROUTE
relationships.
To recreate the graph, download and import the 5.0 dump file to an empty Neo4j database. Dump files can be imported for both Aura and on-prem instances.
Token lookup indexes
Two token lookup indexes are present by default when creating a Neo4j database. They store copies of all node labels and relationship types in the database and only solve node label and relationship type predicates.
The following query [1], which counts the number of PointOfInterest
nodes that have value baseball
for the type
property, will access the node label lookup index:
PROFILE
MATCH (n:PointOfInterest)
WHERE n.type = 'baseball'
RETURN count(n)
count(n) |
---|
|
Rows: 1 |
+-------------------+----+------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline | +-------------------+----+------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +ProduceResults | 0 | `count(n)` | 1 | 1 | 0 | 0 | 0/0 | 0.075 | In Pipeline 1 | | | +----+------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +EagerAggregation | 1 | count(n) AS `count(n)` | 1 | 1 | 0 | 32 | | | | | | +----+------------------------+----------------+------+---------+----------------+ | | | | +Filter | 2 | n.type = $autostring_0 | 9 | 26 | 376 | | | | | | | +----+------------------------+----------------+------+---------+----------------+ | | | | +NodeByLabelScan | 3 | n:PointOfInterest | 188 | 188 | 189 | 376 | 116/0 | 8.228 | Fused in Pipeline 0 | +-------------------+----+------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ Total database accesses: 565, total allocated memory: 472
The following details are worth highlighting in the execution plan:
-
The
NodeByLabelScan
operator accesses the node label lookup index and produces 188 rows, representing the 188 nodes with thePointOfInterest
label in the database. -
The query required 565 DB hits (each DB hit represents an instance when the query required access to the database).
-
The query completed in just over 8 milliseconds.
Token lookup indexes are very important because they improve the performance of Cypher queries and the population of other indexes, and deleting them will lead to severe performance degradation.
In the above example, had a node label lookup index not existed, the NodeByLabelScan
operator would have been replaced with AllNodesScan
, which would have had to read all 69165 nodes from the database before returning a result.
While useful, token lookup indexes will rarely be sufficient for applications querying databases of a non-trivial size because they cannot solve any property-related predicates.
For more information about the predicates supported by token lookup indexes, see Create, show, and delete indexes → Token lookup indexes: supported predicates.
Range indexes
Range indexes solve most types of predicates, and they are used for efficiently retrieving data based on a range of values. They are particularly useful for dealing with properties that have ordered, comparable values.
The following example first creates a relevant index on the type property for PointOfInterest
nodes, and then runs the above query again, counting the number of PointOfInterest
nodes that have a baseball
type
value:
CREATE INDEX range_index_type FOR (n:PointOfInterest) ON (n.type)
If no index type is specified when creating an index, Neo4j will default to create a range index. For more information about creating indexes, see Create, show, and delete indexes → CREATE INDEX. |
PROFILE
MATCH (n:PointOfInterest)
WHERE n.type = 'baseball'
RETURN count(n)
+-------------------+----+----------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline | +-------------------+----+----------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +ProduceResults | 0 | `count(n)` | 1 | 1 | 0 | 0 | 0/0 | 0.057 | In Pipeline 1 | | | +----+----------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +EagerAggregation | 1 | count(n) AS `count(n)` | 1 | 1 | 0 | 32 | | | | | | +----+----------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +NodeIndexSeek | 2 | RANGE INDEX n:PointOfInterest(type) WHERE type = $autostring_0 | 5 | 26 | 27 | 376 | 0/1 | 0.945 | Fused in Pipeline 0 | +-------------------+----+----------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ Total database accesses: 27, total allocated memory: 472
Comparing this query plan with the plan generated before the creation of a relevant range index, the following has changed:
-
NodeByLabelScan has been replaced by NodeIndexSeek. This only produces 26 rows (representing the 26
PointOfInterest
nodes in the database with atype
value set tobaseball
). -
The query now only requires 27 DB hits.
-
The query completed in less than 1 millisecond - almost 8 times faster than it took the query to complete without a range index.
These points all illustrate the fundamental point that search-performance indexes can significantly improve the performance of Cypher queries.
For more information about the predicates supported by range indexes, see Create, show, and delete indexes → Range indexes: supported predicates.
Text indexes
Text indexes are used for queries filtering on STRING
properties.
If there exists both a range and a text index on a given STRING
property, the text index will only be used by the Cypher planner for queries filtering with the CONTAINS
or ENDS WITH
operators.
In all other cases, the range index will be used.
To show this behavior, it is necessary to create a text index and a range index on the same property:
CREATE TEXT INDEX text_index_name FOR (n:PointOfInterest) ON (n.name)
CREATE INDEX range_index_name FOR (n:PointOfInterest) ON (n.name)
The following query filters all PointOfInterest
nodes with a name
property that CONTAINS
'William'
:
STRING
property CONTAINS
PROFILE
MATCH (n:PointOfInterest)
WHERE n.name CONTAINS 'William'
RETURN n.name AS name, n.type AS type
name | type |
---|---|
|
|
|
|
Rows: 2 |
+------------------------+----+----------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline | +------------------------+----+----------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +ProduceResults | 0 | name, type | 1 | 2 | 0 | 0 | | | | | | +----+----------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +Projection | 1 | cache[n.name] AS name, cache[n.type] AS type | 1 | 2 | 0 | | | | | | | +----+----------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +CacheProperties | 2 | cache[n.type], cache[n.name] | 1 | 2 | 6 | | | | | | | +----+----------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +NodeIndexContainsScan | 3 | TEXT INDEX n:PointOfInterest(name) WHERE name CONTAINS $autostring_0 | 1 | 2 | 3 | 248 | 4/0 | 53.297 | Fused in Pipeline 0 | +------------------------+----+----------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ Total database accesses: 9, total allocated memory: 312
The plan shows that the query uses the text index to find all relevant nodes.
If, however, the query is changed to use the STARTS WITH
operator instead of CONTAINS
, the query will use the range index instead:
STRING
property STARTS WITH
PROFILE
MATCH (n:PointOfInterest)
WHERE n.name STARTS WITH 'William'
RETURN n.name, n.type
+-----------------------+----+-----------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline | +-----------------------+----+-----------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +ProduceResults | 0 | `n.name`, `n.type` | 1 | 2 | 0 | 0 | | | | | | +----+-----------------------------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +Projection | 1 | cache[n.name] AS `n.name`, n.type AS `n.type` | 1 | 2 | 4 | | | | | | | +----+-----------------------------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +NodeIndexSeekByRange | 2 | RANGE INDEX n:PointOfInterest(name) WHERE name STARTS WITH $autostring_0, cache[n.name] | 1 | 2 | 3 | 248 | 4/1 | 1.276 | Fused in Pipeline 0 | +-----------------------+----+-----------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ Total database accesses: 7, total allocated memory: 312
This is because range indexes store STRING
values alphabetically.
This means that, while they are very efficient for retrieving exact matches of a STRING
, or for prefix matching, they are less efficient for suffix and contains searches, where they have to scan all relevant properties to filter any matches.
Text indexes do not store STRING
properties alphabetically, and are instead optimized for suffix and contains searches.
That said, if no range index had been present on the name property, the previous query would still have been able to utilize the text index.
It would have done so less efficiently than a range index, but it still would have been useful.
For more information about range index ordering, see the section on Range index-backed ORDER BY.
Text indexes are only used for exact query matches. To perform approximate matches (including, for example, variations and typos), and to compute a similarity score between STRING values, use semantic full-text indexes instead.
|
For more information about the predicates supported by text indexes, see Create, show, and delete indexes → Text indexes: supported predicates.
Ensuring text index use
In order for the planner to use text indexes, it must be able to confirm that the properties included in the predicate are STRING
values.
This is not possible when accessing property values within nodes or relationships, or values within a MAP
, since Cypher does not store the type information of these values.
To ensure text indexes are used in these cases, the toString
function should be used.
WITH {name: 'William Shakespeare'} AS varName
MERGE (:PointOfInterest {name:varName.name})
WITH {name: 'William Shakespeare'} AS varName
MERGE (:PointOfInterest {name: toString(varName.name)})
For information about how to ensure the use of text indexes when predicates may contain null
values, see Indexes and null
values.
Text indexes and STRING
sizes
The size of the indexed STRING
properties is also relevant to the planner’s selection between range and text indexes.
Range indexes have a maximum key size limit of around 8 kb.
This means that range indexes cannot be used to index STRING
values larger than 8 kb.
Text indexes, on the other hand, have a maximum key size limit of around 32 kb.
As a result, they can be used to index STRING
values up to that size.
For information about calculating the size of indexes, see Neo4j Knowledge Base → A method to calculate the size of an index in Neo4j.
Point indexes
Point indexes solve predicates operating on spatial POINT
values.
Point indexes are optimized for queries filtering for the distance between property values, or for property values within a bounding box.
The following example creates a point index which is then used in a query returning the name
and type
of all PointOfInterest
nodes within a set bounding box:
CREATE POINT INDEX point_index_location FOR (n:PointOfInterest) ON (n.location)
point.withinBBox()
functionPROFILE
MATCH (n:PointOfInterest)
WHERE point.withinBBox(
n.location,
point({srid: 4326, x: -73.9723702, y: 40.7697989}),
point({srid: 4326, x: -73.9725659, y: 40.770193}))
RETURN n.name AS name, n.type AS type
name | type |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rows: 8 |
+-----------------------+----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline | +-----------------------+----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +ProduceResults | 0 | `n.name`, `n.type` | 4 | 8 | 0 | 0 | | | | | | +----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +Projection | 1 | cache[n.name] AS `n.name`, cache[n.type] AS `n.type` | 4 | 8 | 0 | | | | | | | +----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +CacheProperties | 2 | cache[n.type], cache[n.name] | 4 | 8 | 24 | | | | | | | +----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +NodeIndexSeekByRange | 3 | POINT INDEX n:PointOfInterest(location) WHERE point.withinBBox(location, point($autoint_0, $autodoub | 4 | 8 | 10 | 248 | 302/0 | 2.619 | Fused in Pipeline 0 | | | | le_1, $autodouble_2), point($autoint_3, $autodouble_4, $autodouble_5)) | | | | | | | | +-----------------------+----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ Total database accesses: 34, total allocated memory: 312
For more information about the predicates supported by point indexes, see Create, show, and delete indexes → Point indexes: supported predicates.
Point index configuration settings
It is possible to configure point indexes to only index properties within a specific geographical area.
This is done by specifying either of the following settings in the indexConfig
part of the OPTIONS
clause when creating a point index:
-
spatial.cartesian.min
andspatial.cartesian.max
: used for Cartesian 2D coordinate systems. -
spatial.cartesian-3d.min
andspatial.cartesian-3d.max
: used for Cartesian 3D coordinate systems. -
spatial.wgs-84.min
andspatial.wgs-84.max
: used for WGS-84 2D coordinate systems. -
spatial.wgs-84-3d.min
andspatial.wgs-84-3d.max
: used for WGS-84 3D coordinate systems.
The min
and max
of each setting define the minimum and maximum bounds for the spatial data in each coordinate system.
For example, the following index would only store OSMNodes
in the northern half of Central Park:
CREATE POINT INDEX central_park_north
FOR (o:OSMNode) ON (o.location)
OPTIONS {
indexConfig: {
`spatial.wgs-84.min`:[40.7714, -73.9743],
`spatial.wgs-84.max`:[40.7855, -73.9583]
}
}
Restricting the geographic area of a point index improves the performance of spatial queries. This is especially beneficial when dealing with complex, large geo-spatial data, and when spatial queries are a significant part of an application’s functionality.
Composite indexes
It is possible to create a range index on a single property or multiple properties (text and point indexes are single-property only). The latter are called composite indexes and can be useful if queries against a database frequently filter on all the properties indexed by the composite index.
The following example first creates a composite index on PointOfInterest
nodes for the properties name
and type
, and then queries the graph using the shortestPath function to determine both the path length (in terms of traversed relationships in the graph) and geographical distance between the Zoo School
and its nearest tennis pitch
(note that there are 32 unique PointOfInterest
tennis pitch
nodes in the graph):
CREATE INDEX composite_index FOR (n:PointOfInterest) ON (n.name, n.type)
PROFILE
MATCH (tennisPitch: PointOfInterest {name: 'pitch', type: 'tennis'})
WITH tennisPitch
MATCH path = shortestPath((tennisPitch)-[:ROUTE*]-(:PointOfInterest {name: 'Zoo School'}))
WITH path, relationships(path) AS relationships
ORDER BY length(path) ASC
LIMIT 1
UNWIND relationships AS rel
RETURN length(path) AS pathLength, sum(rel.distance) AS geographicalDistance
pathLength | geographicalDistance |
---|---|
|
|
Rows: 1 |
+---------------------+----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+------------------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by | Pipeline | +---------------------+----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+------------------+---------------------+ | +ProduceResults | 0 | pathLength, geographicalDistance | 1 | 1 | 0 | 0 | 0/0 | 0.065 | | | | | +----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+ | | | +OrderedAggregation | 1 | length(path) AS pathLength, sum(rel.distance) AS geographicalDistance | 1 | 1 | 50 | 5140 | 31/0 | 4.097 | pathLength ASC | In Pipeline 3 | | | +----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+------------------+---------------------+ | +Unwind | 2 | relationships AS rel | 1 | 25 | 0 | 3112 | 0/0 | 0.180 | | In Pipeline 2 | | | +----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+ +---------------------+ | +Projection | 3 | relationships(path) AS relationships | 0 | 1 | 0 | | 0/0 | 0.050 | | | | | +----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+ | | | +Top | 4 | `length(path)` ASC LIMIT 1 | 0 | 1 | 0 | 57472 | 0/0 | 1.763 | length(path) ASC | In Pipeline 1 | | | +----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+------------------+---------------------+ | +Projection | 5 | length(path) AS `length(path)` | 0 | 32 | 0 | | | | | | | | +----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+ | +------------------+ | | +ShortestPath | 6 | path = (tennisPitch)-[anon_0:ROUTE*]-(anon_1) | 0 | 32 | 181451 | 70080 | | | | | | | +----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+ | +------------------+ | | +MultiNodeIndexSeek | 7 | RANGE INDEX tennisPitch:PointOfInterest(name, type) WHERE name = $autostring_0 AND type = $autostrin | 0 | 31 | 0 | 376 | 131215/1 | 188.723 | | Fused in Pipeline 0 | | | | g_1, RANGE INDEX anon_1:PointOfInterest(name) WHERE name = $autostring_2 | | | | | | | | | +---------------------+----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+------------------+---------------------+ Total database accesses: 181501, total allocated memory: 116040
The query plan shows the composite index being used, and not the previously created range index on the type
property.
This is because the composite index solves the queried predicate simultaneously, while the single propertied index would only be able to solve part of the predicate.
Property order and query planning
Like single-property range indexes, composite indexes support all predicates:
-
Equality check:
n.prop = value
-
List membership check:
n.prop IN [value, …]
-
Existence check:
n.prop IS NOT NULL
-
Range search:
n.prop > value
-
Prefix search:
n.prop STARTS WITH value
However, the order in which properties are defined when creating a composite index impacts how the planner will use the index to solve predicates.
For example, a composite index on (n.prop1, n.prop2, n.prop3)
will generate a different query plan than a composite index created on (n.prop3, n.prop2, n.prop1)
.
The following example shows how composite indexes on the same properties defined in a different order will generate different execution plans:
CREATE INDEX composite_2 FOR (n:PointOfInterest) ON (n.lat, n.name, n.type)
Note the order in which the properties are defined when creating the index, with lat
first, name
second, and type
last.
PROFILE
MATCH (n:PointOfInterest)
WHERE n.lat = 40.7697989 AND n.name STARTS WITH 'William' AND n.type IS NOT NULL
RETURN n.name AS name
name |
---|
|
Rows: 1 |
+-----------------+----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline | +-----------------+----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +ProduceResults | 0 | name | 0 | 0 | 0 | 0 | | | | | | +----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +Projection | 1 | cache[n.name] AS name | 0 | 0 | 0 | | | | | | | +----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +NodeIndexSeek | 2 | RANGE INDEX n:PointOfInterest(lat, name, type) WHERE lat = $autodouble_0 AND name STARTS WITH $autos | 0 | 0 | 1 | 248 | 0/2 | 1.276 | Fused in Pipeline 0 | | | | tring_1 AND type IS NOT NULL, cache[n.name] | | | | | | | | +-----------------+----+------------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ Total database accesses: 1, total allocated memory: 312
The plan shows the recently created composite index is used.
It also shows that the predicates are filtered as specified in the query (i.e. an equality check on the lat
property, a prefix search on the name
property, and an existence check on the type
property).
However, if the order of the properties is altered when creating the index, a different query plan will be generated.
To demonstrate this behavior, it is first necessary to drop the recently created composite_2
index and create a new composite index on the same properties defined in a different order:
DROP INDEX composite_2
CREATE INDEX composite_3 FOR (n:PointOfInterest) ON (n.name, n.type, n.lat)
Note that the order of the properties has changed: the name
property is now the first property defined in the composite index, and the lat
property is indexed last.
PROFILE
MATCH (n:PointOfInterest)
WHERE n.lat = 40.769798 AND n.name STARTS WITH 'William' AND n.type IS NOT NULL
RETURN n.name AS name
+-----------------+----+-----------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline | +-----------------+----+-----------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +ProduceResults | 0 | name | 0 | 0 | 0 | 0 | | | | | | +----+-----------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +Projection | 1 | cache[n.name] AS name | 0 | 0 | 0 | | | | | | | +----+-----------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +Filter | 2 | cache[n.lat] = $autodouble_0 | 0 | 0 | 0 | | | | | | | +----+-----------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +NodeIndexSeek | 3 | RANGE INDEX n:PointOfInterest(name, type, lat) WHERE name STARTS WITH $autostring_1 AND type IS NOT | 0 | 2 | 3 | 248 | 2/0 | 0.807 | Fused in Pipeline 0 | | | | NULL AND lat IS NOT NULL, cache[n.name], cache[n.lat] | | | | | | | | +-----------------+----+-----------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ Total database accesses: 3, total allocated memory: 312
This plan now shows that, while a prefix search has been used to solve the name
property predicate, the lat
property predicate is no longer solved with an equality check, but rather with an existence check and an explicit filter operation afterward.
Note that if the composite_2
index had not been dropped before the query was rerun, the planner would have used it instead of the composite_3
index.
This is because, when using composite indexes, any predicate after a prefix search will automatically be planned as an existence check predicate.
Composite index rules
-
If a query contains an equality check or a list membership check predicates, they need to be for the first properties defined when creating the composite index.
-
Queries utilizing a composite index can contain up to one range search or prefix search predicate.
-
There can be any number of existence check predicates.
-
Any predicates following a prefix search or an existence check will be planned as existence checks.
-
Suffix and substring search predicates can utilize composite indexes. However, they are always planned as an existence check and any subsequent query predicates will accordingly also be planned as such. Note that if these predicates are used, and a text index also exists on any of the indexed (
STRING
) properties, the planner will use the text index instead of a composite index.
These rules can be important when creating composite indexes, as some checks are more efficient than others. For instance, it is generally more efficient for the planner to perform an equality check on a property than an existence check. Depending on the queries and the application, it may, therefore, be cost-effective to consider the order in which properties are defined when creating a composite index.
Additionally, it bears repeating that composite indexes can only be used if a predicate filters on all the properties indexed by the composite index, and that composite indexes can only be created for range indexes.
Range index-backed ORDER BY
Range indexes store properties in ascending order (alphabetically for STRING
values, and numerically for FLOAT
and INTEGER
values).
This can have important implications for query performance, because the planner may be able to take advantage of a pre-existing index order and therefore not have to perform an expensive Sort
operation later in the query.
To demonstrate this behavior, the following query will filter out any ROUTE
relationships with a distance
property less than 30
, and return the distance
property of the matched relationships in ascending numerical order using the ORDER BY clause.
PROFILE
MATCH ()-[r:ROUTE]-()
WHERE r.distance < 30
RETURN r.distance AS distance
ORDER BY distance
+---------------------------------+----+--------------------------------+----------------+-------+---------+----------------+------------------------+-----------+--------------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by | Pipeline | +---------------------------------+----+--------------------------------+----------------+-------+---------+----------------+------------------------+-----------+--------------+---------------------+ | +ProduceResults | 0 | distance | 3013 | 6744 | 0 | 0 | 0/0 | 14.397 | | | | | +----+--------------------------------+----------------+-------+---------+----------------+------------------------+-----------+ | | | +Sort | 1 | distance ASC | 3013 | 6744 | 0 | 540472 | 0/0 | 16.844 | distance ASC | In Pipeline 1 | | | +----+--------------------------------+----------------+-------+---------+----------------+------------------------+-----------+--------------+---------------------+ | +Projection | 2 | cache[r.distance] AS distance | 3013 | 6744 | 0 | | | | | | | | +----+--------------------------------+----------------+-------+---------+----------------+ | +--------------+ | | +Filter | 3 | cache[r.distance] < $autoint_0 | 3013 | 6744 | 10041 | | | | | | | | +----+--------------------------------+----------------+-------+---------+----------------+ | +--------------+ | | +UndirectedRelationshipTypeScan | 4 | (anon_0)-[r:ROUTE]-(anon_1) | 10044 | 10041 | 5023 | 376 | 84/0 | 22.397 | | Fused in Pipeline 0 | +---------------------------------+----+--------------------------------+----------------+-------+---------+----------------+------------------------+-----------+--------------+---------------------+ Total database accesses: 15064, total allocated memory: 540808
This plan shows two important points about indexes and the ordering of results:
-
Only the relationship type lookup index was used in this query (accessed by the
UndirectedRelationshipTypeScan
operator, which fetches all relationships and their start and end nodes from the relationship type index). -
As a result, the planner has to perform a
Sort
operation to order the results by the distance property (in this case, it required 540472 bytes of memory).
To see how an index could impact the query plan, it is first necessary to create a range index on the distance
property:
CREATE INDEX range_index_relationships FOR ()-[r:ROUTE]-() ON (r.distance)
Re-running the query, it now generates a different plan:
PROFILE
MATCH ()-[r:ROUTE]-()
WHERE r.distance < 30
RETURN r.distance AS distance
ORDER BY distance
+-----------------------------------------+----+--------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+----------------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Ordered by | Pipeline | +-----------------------------------------+----+--------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+----------------+---------------------+ | +ProduceResults | 0 | distance | 301 | 6744 | 0 | 0 | | | | | | | +----+--------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | | +Projection | 1 | cache[r.distance] AS distance | 301 | 6744 | 0 | | | | distance ASC | | | | +----+--------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+ | +----------------+ | | +UndirectedRelationshipIndexSeekByRange | 2 | RANGE INDEX (anon_0)-[r:ROUTE(distance)]-(anon_1) WHERE distance < $autoint_0, cache[r.distance] | 301 | 6744 | 3373 | 248 | 2361/10 | 76.542 | r.distance ASC | Fused in Pipeline 0 | +-----------------------------------------+----+--------------------------------------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+----------------+---------------------+ Total database accesses: 3373, total allocated memory: 312
Focusing on the same two points in the plan, the following has changed:
-
The recently created range index on the relationship type property
distance
is now used. -
As a result, the plan no longer needs to perform a
Sort
operation to order the results (because thedistance
property is already ordered by the index), and this substantially reduces the cost of the query (the total memory cost of the query is now 312 bytes).
Multiple index use
Indexes are principally used to find the starting points of patterns.
If a query contains one MATCH
clause, then, as a general rule, only the index that best suits the predicates in that clause will be selected by the planner.
If, however, a query contains two or more MATCH
clauses, it is possible to use several indexes.
To show multiple indexes used in one query, the following example will first create a new index on the lon
(longitude) property for PointOfInterest
nodes.
It then uses a query that finds all PointOfInterest
nodes north of the William Shakespeare
statue in Central Park.
CREATE INDEX range_index_lon FOR (n:PointOfInterest) ON (n.lon)
PointOfInterest
nodes north of the William Shakespeare
statuePROFILE
MATCH (ws:PointOfInterest {name:'William Shakespeare'})
WITH ws
MATCH (poi:PointOfInterest)
WHERE poi.lon > ws.lon
RETURN poi.name AS name
+-------------------------+----+-----------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline | +-------------------------+----+-----------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +ProduceResults | 0 | name | 9 | 143 | 0 | 0 | | | | | | +----+-----------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +Projection | 1 | poi.name AS name | 9 | 143 | 283 | | | | | | | +----+-----------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +Apply | 2 | | 9 | 143 | 0 | | | | | | |\ +----+-----------------------------------------------------------------+----------------+------+---------+----------------+ | | | | | +NodeIndexSeekByRange | 3 | RANGE INDEX poi:PointOfInterest(lon) WHERE lon > ws.lon | 9 | 143 | 146 | 2280 | 233/1 | 1.460 | Fused in Pipeline 1 | | | +----+-----------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +NodeIndexSeek | 4 | RANGE INDEX ws:PointOfInterest(name) WHERE name = $autostring_0 | 2 | 1 | 2 | 376 | 1/0 | 0.635 | In Pipeline 0 | +-------------------------+----+-----------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ Total database accesses: 431, total allocated memory: 2616
This plan shows that a separate index is used to improve the performance of each MATCH
clause (first by utilizing the index on the name
property to find the William Shakespeare
node, and then by using the index on the lon
property to find all nodes with a greater longitudinal value).
Indexes and null
values
Neo4j indexes do not store null
values.
This means that the planner must be able to rule out the possibility of null
values in order for queries to use an index.
The following query demonstrates the incompatibility between null
values and indexes by counting all PointOfInterest
nodes with an unset name
property:
null
name
valuePROFILE
MATCH (n:PointOfInterest)
WHERE n.name IS NULL
RETURN count(n) AS nodes
nodes |
---|
|
Rows:1 |
+-------------------+----+-------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline | +-------------------+----+-------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +ProduceResults | 0 | nodes | 1 | 1 | 0 | 0 | 0/0 | 0.012 | In Pipeline 1 | | | +----+-------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +EagerAggregation | 1 | count(n) AS nodes | 1 | 1 | 0 | 32 | | | | | | +----+-------------------+----------------+------+---------+----------------+ | | | | +Filter | 2 | n.name IS NULL | 141 | 3 | 373 | | | | | | | +----+-------------------+----------------+------+---------+----------------+ | | | | +NodeByLabelScan | 3 | n:PointOfInterest | 188 | 188 | 189 | 376 | 115/0 | 0.769 | Fused in Pipeline 0 | +-------------------+----+-------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ Total database accesses: 562, total allocated memory: 472
The plan shows that neither of the two available indexes (range and text) on the name
property is used to solve the predicate.
However, if a query predicate is added which is able to exclude the presence of any null
values, then an index can be used.
The following query shows this by adding a substring predicate to the above query:
null
name
value or nodes with a name
property containing 'William'
PROFILE
MATCH (n:PointOfInterest)
WHERE n.name IS NULL OR n.name CONTAINS 'William'
RETURN count(n) AS nodes
nodes |
---|
|
Rows: 1 |
The query result now includes both the three nodes with an unset name
value found in the previous query and the two nodes with a name
value containing William
(William Shakespeare
and William Tecumseh Sherman
).
+--------------------------+----+----------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline | +--------------------------+----+----------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +ProduceResults | 0 | nodes | 1 | 1 | 0 | 0 | 0/0 | 0.010 | In Pipeline 3 | | | +----+----------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +EagerAggregation | 1 | count(n) AS nodes | 1 | 1 | 0 | 32 | | | | | | +----+----------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +Distinct | 2 | n | 141 | 5 | 0 | 352 | | | | | | +----+----------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +Union | 3 | | 142 | 5 | 0 | 352 | 0/0 | 0.220 | Fused in Pipeline 2 | | |\ +----+----------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | | +NodeIndexContainsScan | 4 | TEXT INDEX n:PointOfInterest(name) WHERE name CONTAINS $autostring_0 | 1 | 2 | 3 | 376 | 4/0 | 0.456 | In Pipeline 1 | | | +----+----------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +Filter | 5 | n.name IS NULL | 141 | 3 | 373 | | | | | | | +----+----------------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +NodeByLabelScan | 6 | n:PointOfInterest | 188 | 188 | 189 | 376 | 115/0 | 0.673 | Fused in Pipeline 0 | +--------------------------+----+----------------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ Total database accesses: 565, total allocated memory: 1352
This plan shows that an index is only used to solve the second part of the WHERE
clause, which excludes the presence of null
values.
The presence of null
values within an indexed property therefore does not negate the use of an index.
Index use is only negated if the planner is unable to rule out the inclusion of any unset properties in the matching process.
The presence of null
values may not be known in advance, and this can cause unexpected instances of indexes not being used. There are, however, a few strategies to ensure that an index will be used.
Property existence checks
One method to ensure an index is used is to explicitly filter out any null
values by appending IS NOT NULL
to the queried property.
The following example uses the same query as above but exchanges IS NULL
with IS NOT NULL
in the WHERE
clause:
PointOfInterest
nodes without a null
name
valuePROFILE
MATCH (n:PointOfInterest)
WHERE n.name IS NOT NULL
RETURN count(n) AS nodes
nodes |
---|
|
Rows: 1 |
+-------------------+----+------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline | +-------------------+----+------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +ProduceResults | 0 | nodes | 1 | 1 | 0 | 0 | 0/0 | 0.013 | In Pipeline 1 | | | +----+------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +EagerAggregation | 1 | count(n) AS nodes | 1 | 1 | 0 | 32 | | | | | | +----+------------------------------------------------------------+----------------+------+---------+----------------+ | | | | +NodeIndexScan | 2 | RANGE INDEX n:PointOfInterest(name) WHERE name IS NOT NULL | 185 | 185 | 186 | 376 | 0/1 | 0.691 | Fused in Pipeline 0 | +-------------------+----+------------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ Total database accesses: 186, total allocated memory: 472
This plan shows that the previously created range index on the name
property is now used to solve the predicate.
Text indexes and type predicate expressions
Text indexes require that predicates only include STRING
properties.
To use text indexes in situations where any of the queried properties may be either of an incompatible type or null
rather than a STRING
value, add the type predicate expression IS :: STRING NOT NULL
(or its alias, introduced in Neo4j 5.14, IS :: STRING!
) to the query.
This will enforce both the existence of a property and its STRING
type, discarding any rows where the property is missing or not of type STRING
, and thereby enable the use of text indexes.
For example, if the WHERE
predicate in the previous query is altered to instead append IS :: STRING NOT NULL
, then the text index rather than the range index is used (range indexes do not support type predicate expressions):
PROFILE
MATCH (n:PointOfInterest)
WHERE n.name IS :: STRING NOT NULL
RETURN count(n) AS nodes
+-------------------+----+-----------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline | +-------------------+----+-----------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +ProduceResults | 0 | nodes | 1 | 1 | 0 | 0 | 0/0 | 0.009 | In Pipeline 1 | | | +----+-----------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +EagerAggregation | 1 | count(n) AS nodes | 1 | 1 | 0 | 32 | | | | | | +----+-----------------------------------------------------------+----------------+------+---------+----------------+ | | | | +NodeIndexScan | 2 | TEXT INDEX n:PointOfInterest(name) WHERE name IS NOT NULL | 185 | 185 | 186 | 376 | 0/0 | 0.343 | Fused in Pipeline 0 | +-------------------+----+-----------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ Total database accesses: 186, total allocated memory: 472
While type predicate expressions were introduced in Neo4j 5.9, the IS :: STRING NOT NULL syntax only became an index-compatible predicate in Neo4j 5.15.
For more information, see the page about type predicate expressions.
|
The toString
function can also be used to convert an expression to STRING
values, and thereby help the planner to select a text index.
Property type constraints
For indexes that are compatible only with specific types (i.e. text and point indexes), the Cypher planner needs to deduce that a predicate will evaluate to null
for non-compatible values in order to use the index.
If a predicate is not explicitly defined as the required type (STRING
or POINT
), this can lead to situations where a text or point index is not used.
Since property type constraints guarantee that a property is always of the same type, they can be used to extend the scenarios in which text and point indexes are compatible with a predicate.
To show this, the following example will first drop the existing range index on the name
property (this is necessary because property type constraints only extend the compatibility of type-specific indexes - range indexes are not limited by a value type).
It will then run the same query with a WHERE
predicate on the name
property (for which there exists a previously created text index) before and after creating a property type constraint, and compare the resulting execution plans.
DROP INDEX range_index_name
PointOfInterest
nodes without a null
name
valuePROFILE
MATCH (n:PointOfInterest)
WHERE n.name IS NOT NULL
RETURN count(n) AS nodes
+-------------------+----+--------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline | +-------------------+----+--------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +ProduceResults | 0 | nodes | 1 | 1 | 0 | 0 | 0/0 | 0.012 | In Pipeline 1 | | | +----+--------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +EagerAggregation | 1 | count(n) AS nodes | 1 | 1 | 0 | 32 | | | | | | +----+--------------------+----------------+------+---------+----------------+ | | | | +Filter | 2 | n.name IS NOT NULL | 187 | 185 | 373 | | | | | | | +----+--------------------+----------------+------+---------+----------------+ | | | | +NodeByLabelScan | 3 | n:PointOfInterest | 188 | 188 | 189 | 376 | 259/0 | 0.363 | Fused in Pipeline 0 | +-------------------+----+--------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ Total database accesses: 562, total allocated memory: 472
This plan shows that the available text index on the name
property was not used to solve the predicate.
This is because the planner was not able to deduce that all name
values are of type STRING
.
However, if a property type constraint is created to ensure that all name
properties have a STRING
value, a different query plan is generated.
STRING
type constraint on the name
propertyCREATE CONSTRAINT type_constraint
FOR (n:PointOfInterest) REQUIRE n.name IS :: STRING
PROFILE
MATCH (n:PointOfInterest)
WHERE n.name IS NOT NULL
RETURN count(n) AS nodes
+-------------------+----+-----------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | Operator | Id | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline | +-------------------+----+-----------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +ProduceResults | 0 | nodes | 1 | 1 | 0 | 0 | 0/0 | 0.013 | In Pipeline 1 | | | +----+-----------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ | +EagerAggregation | 1 | count(n) AS nodes | 1 | 1 | 0 | 32 | | | | | | +----+-----------------------------------------------------------+----------------+------+---------+----------------+ | | | | +NodeIndexScan | 2 | TEXT INDEX n:PointOfInterest(name) WHERE name IS NOT NULL | 187 | 185 | 186 | 376 | 0/0 | 0.328 | Fused in Pipeline 0 | +-------------------+----+-----------------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+ Total database accesses: 186, total allocated memory: 472
Because of the property type constraint on the name
property, the planner is now able to deduce that all name
properties are of type STRING
, and therefore use the available text index.
Point indexes can be extended in the same way if a property type constraint is created to ensure that all properties are POINT
values.
Note that property existence constraints do not currently leverage index use in the same way.
Heuristics: deciding what to index
While it is impossible to give exact directions on when a search-performance index might be beneficial for a particular use-case, the following points provide some useful heuristics for when creating an index might improve query performance:
-
Frequent property-based queries: if some properties are used frequently for filtering or matching, consider creating an index on them.
-
Performance optimization: If certain queries are too slow, re-examine the properties that are filtered on, and consider creating indexes for those properties that may cause bottlenecking.
-
High cardinality properties: high cardinality properties have many distinct values (e.g., unique identifiers, timestamps, or user names). Queries that seek to retrieve such properties will likely benefit from indexing.
-
Complex queries: if queries traverse complex paths in a graph (for example, by involving multiple hops and several layers of filtering), adding indexes to the properties used in those queries can improve query performance.
-
Experiment and test: It is good practice to experiment with different indexes and query patterns, and to measure the performance of critical queries with and without different indexes to evaluate their effectiveness.
Over-indexing: considerations and solutions
Search-performance indexes can significantly improve query performance. They should, however, be used judiciously for the following reasons:
-
Storage space: because each index is a secondary copy of the data in the primary database, each index essentially doubles the amount of storage space occupied by the indexed data.
-
Slower write queries: adding indexes impacts the performance of write queries. This is because indexes are updated with each write query. If a system needs to perform a lot of writes quickly, it may be counterproductive to have an index on the affected data entities. In other words, if write performance is crucial for a particular use case, it may be beneficial to only add indexes where they are necessary for read purposes.
As a result of these two points, deciding what to index (and what not to index) is an important and non-trivial task.
Keeping track of index-use: lastRead
, readCount
, and trackedSince
Unused indexes take up unnecessary storage space and it may be beneficial to remove them.
Knowing which indexes are most frequently used by the queries against a database can, however, be difficult.
As of Neo4j 5.8, there are three relevant columns returned by the SHOW INDEX
command which can help identify redundant indexes:
-
lastRead
: returns the last time the index was used for reading. -
readCount
: returns the number of read queries issued to the index. -
trackedSince
returns the time when usage statistics tracking started for an index.[2]
To return these values (along with other relevant information) for the indexes in a database, run the following query:
SHOW INDEX YIELD name, type, entityType, labelsOrTypes, properties, lastRead, readCount, trackedSince
If any unused indexes are identified, it may be beneficial to delete them using the DROP INDEX
command.
Summary
-
Range indexes can be used to solve most predicates.
-
Text indexes are used over range indexes for
CONTAINS
andENDS WITH
predicates onSTRING
properties, and if the queriedSTRING
properties exceed 8 kb. -
Point indexes are used when queries filter on distances and bounding boxes.
-
Token lookup indexes only solve node label and relationship type predicates. They do not solve any property predicates. Deleting token lookup indexes will negatively impact query performance.
-
Composite indexes are only used if the query filters on all properties indexed by the composite index. The order in which the properties are defined when creating a composite index impacts how the planner solves query predicates.
-
Queries ordering results using
ORDER BY
can leverage the pre-existing order in range indexes and thereby improve query performance. -
A Cypher query can use several indexes if the planner deems it beneficial to the performance of a query.
-
Neo4j indexes do not store
null
values, and the planner must be able to rule out any entities with properties containingnull
values in order to use an index. There are several strategies to ensure the use of indexes. -
The columns
lastRead
,readCount
, andtrackedSince
returned by theSHOW INDEX
command can be used to identify redundant indexes that take up unnecessary space.
PROFILE
. This both runs the query and generates its execution plan. For more information, see Execution plans and query tuning → Note on PROFILE and EXPLAIN.
trackedSince
column is not part of the default return columns for the SHOW INDEXES
command. To return this and all other non-default columns, use SHOW INDEXES YIELD *
. For more information, see Create, show, and delete indexes → Result columns for listing indexes.