Vector functions

Vector functions allow you to compute the similarity scores of vector pairs. These vector similarity functions are identical to those used by Neo4j’s vector search indexes.

vector.similarity.cosine()

Details

Syntax: vector.similarity.cosine(a, b)

Description: Returns a FLOAT representing the similarity between the argument vectors based on their cosine.

Arguments:

Name    Type                     Description
a       LIST<INTEGER | FLOAT>    A list representing the first vector.
b       LIST<INTEGER | FLOAT>    A list representing the second vector.

Returns: FLOAT

For more details, see the vector index documentation.

Considerations

vector.similarity.cosine(NULL, NULL) returns NULL.

vector.similarity.cosine(NULL, b) returns NULL.

vector.similarity.cosine(a, NULL) returns NULL.

Both vectors must be of the same dimension.

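Both vectors must be valid with respect to cosine similarity. For example, neither vector may be a zero vector, because the cosine of the angle to a zero vector is undefined. See the vector index documentation for the full validity rules.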

The implementation is identical to that of the latest available vector index provider (vector-2.0).
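
As a minimal sketch of how the function can be called directly, the query below passes literal lists rather than stored properties. The vectors and the column aliases are illustrative assumptions, not values from the Neo4j documentation. The last column exercises the NULL behavior listed above.

```cypher
// Illustrative literal vectors; both arguments of each call have the same dimension.
RETURN
  // Two vectors pointing in the same direction.
  vector.similarity.cosine([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]) AS sameDirection,
  // Two perpendicular vectors.
  vector.similarity.cosine([1.0, 0.0], [0.0, 1.0]) AS orthogonal,
  // A NULL argument yields NULL, as stated in the considerations.
  vector.similarity.cosine(null, [1.0, 2.0, 3.0]) AS nullArgument;
```

The vector index documentation describes how the raw cosine is mapped to the returned score, so sameDirection should be the largest non-NULL value in this row.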

vector.similarity.euclidean()

Details

Syntax: vector.similarity.euclidean(a, b)

Description: Returns a FLOAT representing the similarity between the argument vectors based on their Euclidean distance.

Arguments:

Name    Type                     Description
a       LIST<INTEGER | FLOAT>    A list representing the first vector.
b       LIST<INTEGER | FLOAT>    A list representing the second vector.

Returns: FLOAT

For more details, see the vector index documentation.

Considerations

vector.similarity.euclidean(NULL, NULL) returns NULL.

vector.similarity.euclidean(NULL, b) returns NULL.

vector.similarity.euclidean(a, NULL) returns NULL.

Both vectors must be of the same dimension.

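Both vectors must be valid with respect to Euclidean similarity. For example, every component must be a finite number. See the vector index documentation for the full validity rules.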

The implementation is identical to that of the latest available vector index provider (vector-2.0).
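
The returned value is a similarity score, not the raw distance. The sketch below compares the function's output with a hand-computed value, assuming the score is defined as 1 / (1 + d²), where d is the Euclidean distance. That formula is an assumption taken from the vector index documentation and should be confirmed there. The vectors and the names squaredDistance and manualScore are illustrative.

```cypher
// Illustrative vectors; the squared distance is (4-2)^2 + (5-8)^2 + (6-3)^2 = 22.
WITH [4.0, 5.0, 6.0] AS a, [2.0, 8.0, 3.0] AS b
WITH a, b,
     // Sum the squared component differences over all positions.
     reduce(acc = 0.0, i IN range(0, size(a) - 1) | acc + (a[i] - b[i]) ^ 2) AS squaredDistance
RETURN
  vector.similarity.euclidean(a, b) AS score,
  // Assumed formula: 1 / (1 + d^2), here 1 / 23.
  1.0 / (1.0 + squaredDistance) AS manualScore;
```

If the assumed formula holds, the two columns should agree up to floating-point precision.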

Example 1. k-Nearest Neighbors

k-nearest neighbor queries return the k entities with the highest similarity scores based on comparing their associated vectors with a query vector. Such queries can be run against vector indexes in the form of approximate k-nearest neighbor (k-ANN) queries, whose returned entities have a high probability of being among the true k nearest neighbors. However, they can also be expressed as an exhaustive search using vector similarity functions directly. While this is typically significantly slower than using an index, it is exact rather than approximate and does not require an existing index. This can be useful for one-off queries on small sets of data.

To create the graph used in this example, run the following query in an empty Neo4j database:

CREATE
  (:Node { id: 1, vector: [1.0, 4.0, 2.0]}),
  (:Node { id: 2, vector: [3.0, -2.0, 1.0]}),
  (:Node { id: 3, vector: [2.0, 8.0, 3.0]});

Given a parameter query (here set to [4.0, 5.0, 6.0]), you can find the two nearest neighbors of that query vector by Euclidean distance. To do so, match all candidate nodes, compute the similarity score between the query vector and each node's vector, and order by that score in descending order:

Query
MATCH (node:Node)
WITH node, vector.similarity.euclidean($query, node.vector) AS score
RETURN node, score
ORDER BY score DESCENDING
LIMIT 2;

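This returns the two nearest neighbors, most similar first. The node with id 2 is the farthest from the query vector and is cut off by LIMIT 2.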

Result
node                                        score
(:Node {vector: [2.0, 8.0, 3.0], id: 3})    0.043478261679410934
(:Node {vector: [1.0, 4.0, 2.0], id: 1})    0.03703703731298447

Rows: 2
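
For comparison, the approximate (k-ANN) route mentioned at the start of this example could look roughly like the sketch below. The index name nodeVectors is hypothetical. The statements assume a Neo4j version that supports CREATE VECTOR INDEX and the db.index.vector.queryNodes procedure, so check the vector index documentation for the syntax that applies to your version. Run each statement separately.

```cypher
// Hypothetical index name; dimension 3 matches the example vectors.
CREATE VECTOR INDEX nodeVectors IF NOT EXISTS
FOR (n:Node) ON n.vector
OPTIONS { indexConfig: {
  `vector.dimensions`: 3,
  `vector.similarity_function`: 'euclidean'
}};

// Wait for the index to finish populating before querying it.
CALL db.awaitIndexes();

// Ask the index for the 2 approximate nearest neighbors of $query.
CALL db.index.vector.queryNodes('nodeVectors', 2, $query)
YIELD node, score
RETURN node, score;
```

On a graph this small, the index would be expected to return the same two nodes as the exhaustive query. In general, an approximate search does not guarantee this.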