One Hot Encoding
The One Hot Encoding function is used to convert categorical data into a numerical format that can be used by Machine Learning libraries.
This feature is in the alpha tier. For more information on feature tiers, see API Tiers.
One Hot Encoding sample
One hot encoding returns a list whose length equals the number of available values.
In the list, selected values are represented by 1, and unselected values are represented by 0.
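The encoding rule described above can be sketched in plain Python. This is only an illustration of the semantics, not the library's implementation; the function name `one_hot_encode` is hypothetical and chosen for this example.

```python
def one_hot_encode(available, selected):
    """Return one entry per available value: 1 if selected, otherwise 0.

    Hypothetical helper for illustration only; it is not part of the library.
    """
    chosen = set(selected)
    return [1 if value in chosen else 0 for value in available]


value = one_hot_encode(["Chinese", "Indian", "Italian"], ["Italian"])
print(value)  # [0, 0, 1]
```

The position of each 1 or 0 follows the order of the available values, so the same available list must be used for every row that is to be compared.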
The following will run the algorithm on hardcoded lists:
RETURN gds.alpha.ml.oneHotEncoding(['Chinese', 'Indian', 'Italian'], ['Italian']) AS value
value |
---|
[0,0,1] |
The following will create a sample graph:
CREATE (french:Cuisine {name:'French'}),
(italian:Cuisine {name:'Italian'}),
(indian:Cuisine {name:'Indian'}),
(zhen:Person {name: "Zhen"}),
(praveena:Person {name: "Praveena"}),
(michael:Person {name: "Michael"}),
(arya:Person {name: "Arya"}),
(praveena)-[:LIKES]->(indian),
(zhen)-[:LIKES]->(french),
(michael)-[:LIKES]->(french),
(michael)-[:LIKES]->(italian)
The following will return, for each person, a one hot encoding of the types of cuisine that they like:
MATCH (cuisine:Cuisine)
WITH cuisine
ORDER BY cuisine.name
WITH collect(cuisine) AS cuisines
MATCH (p:Person)
RETURN p.name AS name, gds.alpha.ml.oneHotEncoding(cuisines, [(p)-[:LIKES]->(cuisine) | cuisine]) AS value
ORDER BY name
name | value |
---|---|
Arya | [0,0,0] |
Michael | [1,0,1] |
Praveena | [0,1,0] |
Zhen | [1,0,0] |
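To see why the cuisines are sorted before they are collected, the same result can be reproduced in a short Python sketch. The data is copied from the sample graph; the helper `one_hot_encode` and the `likes` dictionary are assumptions made for this illustration and are not part of the library.

```python
def one_hot_encode(available, selected):
    # Hypothetical helper: 1 for each available value that is selected, else 0.
    chosen = set(selected)
    return [1 if value in chosen else 0 for value in available]


# Sorting fixes the position of each cuisine: French, Indian, Italian.
cuisines = sorted(["French", "Italian", "Indian"])

# LIKES relationships from the sample graph; Arya likes nothing.
likes = {
    "Zhen": ["French"],
    "Praveena": ["Indian"],
    "Michael": ["French", "Italian"],
    "Arya": [],
}

encodings = {name: one_hot_encode(cuisines, likes[name]) for name in sorted(likes)}
for name, value in encodings.items():
    print(name, value)
```

Because every person is encoded against the same sorted list, index 0 always means French, index 1 Indian, and index 2 Italian.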
Name | Type | Default | Optional | Description |
---|---|---|---|---|
| | list | null | yes | The available values. If null, the function will return an empty list. |
| | list | null | yes | The selected values. If null, the function will return a list of all 0’s. |
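The null handling described in the table can be sketched as follows. This is a minimal Python illustration under the stated rules, with `None` standing in for null; the function and parameter names are hypothetical and do not come from the library.

```python
def one_hot_encode(available=None, selected=None):
    """Illustrative sketch of the documented null behaviour (not the library code)."""
    if available is None:
        # No available values: the result is an empty list.
        return []
    # No selected values: nothing matches, so every entry is 0.
    chosen = set(selected) if selected is not None else set()
    return [1 if value in chosen else 0 for value in available]


print(one_hot_encode(None, ["Italian"]))                    # []
print(one_hot_encode(["Chinese", "Indian", "Italian"], None))  # [0, 0, 0]
```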
Type | Description |
---|---|
| | One hot encoding of the selected values. |