String functions
String functions operate on string expressions only, and will return an error if used on any other values.
The exception to this rule is toString(), which also accepts numbers, booleans and temporal values (i.e. DATE, ZONED TIME, LOCAL TIME, ZONED DATETIME, LOCAL DATETIME or DURATION values).
Functions taking a STRING as input all operate on Unicode characters rather than on a standard char[].
For example, the size() function applied to any Unicode character will return 1, even if the character does not fit in the 16 bits of one char.
When |
See also String operators.
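As a minimal sketch of the Unicode behavior described above, the following query applies size() to an ASCII letter and to an emoji that does not fit in 16 bits. The emoji literal and the column aliases are illustrative choices and not part of the manual. Based on the statement above, both columns would be expected to contain 1.

```cypher
// size() counts Unicode characters, not 16-bit char units.
// U+1F600 (😀) needs two UTF-16 units but is still a single character.
RETURN size('a') AS asciiSize, size('😀') AS emojiSize
```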
btrim()
Syntax |
|
||
Description |
Returns the given |
||
Arguments |
Name |
Type |
Description |
|
|
A value from which the leading and trailing trim character will be removed. |
|
|
|
A character to be removed from the start and end of the given string. |
|
Returns |
|
|
|
|
|
If |
RETURN btrim(' hello '), btrim('xxyyhelloxyxy', 'xy')
btrim(' hello ') | btrim('xxyyhelloxyxy', 'xy') |
---|---|
"hello" | "hello" |
Rows: 1 |
left()
Syntax |
|
||
Description |
Returns a |
||
Arguments |
Name |
Type |
Description |
|
|
A string value whose rightmost characters will be trimmed. |
|
|
|
The length of the leftmost characters to be returned. |
|
Returns |
|
|
|
|
If |
If |
RETURN left('hello', 3)
left('hello', 3) |
---|
"hel" |
Rows: 1 |
lower()
Syntax |
|
||
Description |
Returns the given |
||
Arguments |
Name |
Type |
Description |
|
|
A string to be converted into lowercase. |
|
Returns |
|
This function is an alias of the toLower() function, and it was introduced as part of Cypher®'s GQL conformance.
|
RETURN lower('HELLO')
lower('HELLO') |
---|
"hello" |
Rows: 1 |
ltrim()
Syntax |
|
||
Description |
Returns the given |
||
Arguments |
Name |
Type |
Description |
|
|
A value from which the leading trim character will be removed. |
|
|
|
A character to be removed from the start of the given string. |
|
Returns |
|
|
|
|
|
As of Neo4j 5.20, a |
RETURN ltrim(' hello'), ltrim('xxyyhelloxyxy', 'xy')
ltrim(' hello') | ltrim('xxyyhelloxyxy', 'xy') |
---|---|
"hello" | "helloxyxy" |
Rows: 1 |
normalize()
Syntax |
|
||
Description |
Normalize a |
||
Arguments |
Name |
Type |
Description |
|
|
A value to be normalized. |
|
|
|
A keyword specifying any of the normal forms; NFC, NFD, NFKC or NFKD. |
|
Returns |
|
Unicode normalization is a process that transforms different representations of the same string into a standardized form. For more information, see the documentation for Unicode normalization forms. |
The normalize() function is useful for converting STRING values into comparable forms.
When comparing two STRING values, it is their Unicode codepoints that are compared.
In Unicode, characters that look the same may be represented by two or more different codepoints.
For example, the character < can be represented as \uFE64 (﹤) or \u003C (<).
To the human eye, the characters may appear identical.
However, if compared, Cypher will return false as \uFE64 does not equal \u003C.
Using the normalize() function with a compatibility normal form such as NFKC, it is possible to normalize the codepoint \uFE64 to \u003C, creating a single codepoint representation, allowing them to be successfully compared.
|
RETURN normalize('\u212B') = '\u00C5' AS result
result |
---|
true |
Rows: 1 |
To check if a STRING is normalized, use the IS NORMALIZED operator.
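The following sketch shows how the operator and the function can be combined. It reuses the Angstrom sign (\u212B) from the example above; the column aliases are illustrative. Assuming the default NFC form is used for both the function and the operator, the first column would be expected to be false and the second true.

```cypher
// '\u212B' (Angstrom sign) is not in NFC; its NFC form is '\u00C5'.
RETURN
  '\u212B' IS NORMALIZED AS beforeNormalize,
  normalize('\u212B') IS NORMALIZED AS afterNormalize
```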
normalize() with specified normal form
There are two main types of normalization forms:
- Canonical equivalence: NFC (default) and NFD are forms of canonical equivalence. This means that codepoints that represent the same abstract character will be normalized to the same codepoint (and have the same appearance and behavior). The NFC form will always give the composed canonical form (in which the combined codes are replaced with a single representation, if possible). The NFD form gives the decomposed form (the opposite of the composed form, which converts the combined codepoints into a split form if possible).
- Compatibility normalization: NFKC and NFKD are forms of compatibility normalization. All canonically equivalent sequences are compatible, but not all compatible sequences are canonical. This means that a character normalized in NFC or NFD should also be normalized in NFKC and NFKD. Other characters with only slight differences in appearance should be compatibly equivalent.
For example, the Greek Upsilon with Acute and Hook Symbol ϓ can be represented by the Unicode codepoint: \u03D3.
- Normalized in NFC: \u03D3 Greek Upsilon with Acute and Hook Symbol (ϓ)
- Normalized in NFD: \u03D2\u0301 Greek Upsilon with Hook Symbol + Combining Acute Accent (ϓ)
- Normalized in NFKC: \u038E Greek Capital Letter Upsilon with Tonos (Ύ)
- Normalized in NFKD: \u03A5\u0301 Greek Capital Letter Upsilon + Combining Acute Accent (Ύ)
In the compatibility normalization forms (NFKC and NFKD) the character is visibly different as it no longer contains the hook symbol.
RETURN normalize('\uFE64', NFKC) = '\u003C' AS result
result |
---|
true |
Rows: 1 |
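The four representations listed above for \u03D3 can be checked in a single query. This is a sketch only: the column aliases are illustrative, and each comparison simply restates one of the listed forms, so every column would be expected to be true.

```cypher
// Each comparison checks one of the normal forms listed above for '\u03D3'.
RETURN
  normalize('\u03D3', NFC) = '\u03D3' AS nfc,
  normalize('\u03D3', NFD) = '\u03D2\u0301' AS nfd,
  normalize('\u03D3', NFKC) = '\u038E' AS nfkc,
  normalize('\u03D3', NFKD) = '\u03A5\u0301' AS nfkd
```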
replace()
Syntax |
|
||
Description |
Returns a |
||
Arguments |
Name |
Type |
Description |
|
|
The string to be modified. |
|
|
|
The value to replace in the original string. |
|
|
|
The value to be inserted in the original string. |
|
Returns |
|
If any argument is |
If |
RETURN replace("hello", "l", "w")
replace("hello", "l", "w") |
---|
"hewwo" |
Rows: 1 |
reverse()
Syntax |
|
||
Description |
Returns a |
||
Arguments |
Name |
Type |
Description |
|
|
The string or list to be reversed. |
|
Returns |
|
|
See also List functions → reverse. |
RETURN reverse('palindrome')
reverse('palindrome') |
---|
"emordnilap" |
Rows: 1 |
right()
Syntax |
|
||
Description |
Returns a |
||
Arguments |
Name |
Type |
Description |
|
|
A string value whose leftmost characters will be trimmed. |
|
|
|
The length of the rightmost characters to be returned. |
|
Returns |
|
|
|
|
If |
If |
RETURN right('hello', 3)
right('hello', 3) |
---|
"llo" |
Rows: 1 |
rtrim()
Syntax |
|
||
Description |
Returns the given |
||
Arguments |
Name |
Type |
Description |
|
|
A value from which the trailing trim character will be removed. |
|
|
|
A character to be removed from the end of the given string. |
|
Returns |
|
|
|
|
|
As of Neo4j 5.20, a |
RETURN rtrim('hello '), rtrim('xxyyhelloxyxy', 'xy')
rtrim('hello ') | rtrim('xxyyhelloxyxy', 'xy') |
---|---|
"hello" | "xxyyhello" |
Rows: 1 |
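To compare the three trimming functions side by side, the sketch below applies btrim(), ltrim() and rtrim() to the same input used in the examples above. The column aliases are illustrative, and the two-argument forms of ltrim() and rtrim() are assumed to be available in the Neo4j version in use. The last column is expected to be true, since trimming both ends is the same as trimming each end in turn.

```cypher
RETURN
  btrim('xxyyhelloxyxy', 'xy') AS bothEnds,   // strips 'x' and 'y' from both ends
  ltrim('xxyyhelloxyxy', 'xy') AS startOnly,  // strips 'x' and 'y' from the start only
  rtrim('xxyyhelloxyxy', 'xy') AS endOnly,    // strips 'x' and 'y' from the end only
  btrim('xxyyhelloxyxy', 'xy') = ltrim(rtrim('xxyyhelloxyxy', 'xy'), 'xy') AS sameAsBoth
```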
split()
Syntax |
|
||
Description |
Returns a |
||
Arguments |
Name |
Type |
Description |
|
|
The string to be split. |
|
|
|
The string with which to split the original string. |
|
Returns |
|
|
|
RETURN split('one,two', ',')
split('one,two', ',') |
---|
["one","two"] |
Rows: 1 |
substring()
Syntax |
|
||
Description |
Returns a substring of a given |
||
Arguments |
Name |
Type |
Description |
|
|
The string to be shortened. |
|
|
|
The start position of the new string. |
|
|
|
The length of the new string. |
|
Returns |
|
|
If |
If |
If either |
If |
If |
RETURN substring('hello', 1, 3), substring('hello', 2)
substring('hello', 1, 3) | substring('hello', 2) |
---|---|
"ell" | "llo" |
Rows: 1 |
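The start position is zero-based, as the example above implies (position 1 is the second character). The following sketch relates substring() to left() and right() on the same input. The column aliases are illustrative. The first two columns should hold the same value, as should the last two.

```cypher
// Positions are zero-based: index 0 is 'h', index 1 is 'e'.
RETURN
  substring('hello', 0, 3) AS fromStart,    // first three characters
  left('hello', 3) AS viaLeft,              // same characters via left()
  substring('hello', 2) AS fromIndexTwo,    // everything from index 2 onwards
  right('hello', 3) AS viaRight             // same characters via right()
```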
toLower()
Syntax |
|
||
Description |
Returns the given |
||
Arguments |
Name |
Type |
Description |
|
|
A string to be converted into lowercase. |
|
Returns |
|
|
RETURN toLower('HELLO')
toLower('HELLO') |
---|
"hello" |
Rows: 1 |
toString()
Syntax |
|
||
Description |
Converts an |
||
Arguments |
Name |
Type |
Description |
|
|
A value to be converted into a string. |
|
Returns |
|
|
If |
This function will return an error if provided with an expression that is not an |
RETURN
toString(11.5),
toString('already a string'),
toString(true),
toString(date({year: 1984, month: 10, day: 11})) AS dateString,
toString(datetime({year: 1984, month: 10, day: 11, hour: 12, minute: 31, second: 14, millisecond: 341, timezone: 'Europe/Stockholm'})) AS datetimeString,
toString(duration({minutes: 12, seconds: -60})) AS durationString
toString(11.5) | toString('already a string') | toString(true) | dateString | datetimeString | durationString |
---|---|---|---|---|---|
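"11.5" | "already a string" | "true" | "1984-10-11" | "1984-10-11T12:31:14.341+01:00[Europe/Stockholm]" | "PT11M" |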
Rows: 1 |
toStringOrNull()
Syntax |
|
||
Description |
Converts an |
||
Arguments |
Name |
Type |
Description |
|
|
A value to be converted into a string or null. |
|
Returns |
|
|
If the |
RETURN toStringOrNull(11.5),
toStringOrNull('already a string'),
toStringOrNull(true),
toStringOrNull(date({year: 1984, month: 10, day: 11})) AS dateString,
toStringOrNull(datetime({year: 1984, month: 10, day: 11, hour: 12, minute: 31, second: 14, millisecond: 341, timezone: 'Europe/Stockholm'})) AS datetimeString,
toStringOrNull(duration({minutes: 12, seconds: -60})) AS durationString,
toStringOrNull(['A', 'B', 'C']) AS list
toStringOrNull(11.5) | toStringOrNull('already a string') | toStringOrNull(true) | dateString | datetimeString | durationString | list |
---|---|---|---|---|---|---|
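"11.5" | "already a string" | "true" | "1984-10-11" | "1984-10-11T12:31:14.341+01:00[Europe/Stockholm]" | "PT11M" | null |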
Rows: 1 |
toUpper()
Syntax |
|
||
Description |
Returns the given |
||
Arguments |
Name |
Type |
Description |
|
|
A string to be converted into uppercase. |
|
Returns |
|
|
RETURN toUpper('hello')
toUpper('hello') |
---|
"HELLO" |
Rows: 1 |
trim()
Syntax |
|
||
Description |
Returns the given |
||
Arguments |
Name |
Type |
Description |
|
|
The parts of the string to trim; LEADING, TRAILING, BOTH |
|
|
|
The characters to be removed from the start and/or end of the given string. |
|
|
|
A value from which all leading and/or trailing trim characters will be removed. |
|
Returns |
|
|
|
|
|
As of Neo4j 5.20, a |
RETURN trim(' hello '), trim(BOTH 'x' FROM 'xxxhelloxxx')
trim(' hello ') | trim(BOTH 'x' FROM 'xxxhelloxxx') |
---|---|
"hello" | "hello" |
Rows: 1 |
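The example above only uses BOTH. As a sketch of the other two trim specifications named in the arguments table, the query below applies LEADING, TRAILING and BOTH to the same input. The column aliases are illustrative, and the query assumes a Neo4j version that supports this extended trim() syntax.

```cypher
RETURN
  trim(LEADING 'x' FROM 'xxxhelloxxx') AS leadingOnly,    // removes 'x' from the start only
  trim(TRAILING 'x' FROM 'xxxhelloxxx') AS trailingOnly,  // removes 'x' from the end only
  trim(BOTH 'x' FROM 'xxxhelloxxx') AS bothEnds           // removes 'x' from both ends
```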
upper()
Syntax |
|
||
Description |
Returns the given |
||
Arguments |
Name |
Type |
Description |
|
|
A string to be converted into uppercase. |
|
Returns |
|
This function is an alias of the toUpper() function, and it was introduced as part of Cypher's GQL conformance.
|
RETURN upper('hello')
upper('hello') |
---|
"HELLO" |
Rows: 1 |