apoc.nlp.gcp.entities.stream
Procedure Apoc Extended
Returns a stream of entities for provided text
Signature
apoc.nlp.gcp.entities.stream(source :: ANY?, config = {} :: MAP?) :: (node :: NODE?, value :: MAP?, error :: MAP?)
Config parameters
The procedure supports the following config parameters:
name | type | default | description |
---|---|---|---|
key | String | null | API Key for Google Natural Language API |
nodeProperty | String | text | The property on the provided node that contains the unstructured text to be analyzed |
Install Dependencies
The NLP procedures have dependencies on Kotlin and client libraries that are not included in the APOC Extended library.
These dependencies are included in apoc-nlp-dependencies-5.21.0-all.jar, which can be downloaded from the releases page.
Once that file is downloaded, it should be placed in the plugins directory and the Neo4j Server restarted.
Setting up API Key
We can generate an API Key that has access to the Cloud Natural Language API by going to console.cloud.google.com/apis/credentials. Once we’ve created a key, we can populate and execute the following command to create a parameter that contains these details.
The following defines the apiKey parameter:
:param apiKey => ("<api-key-here>")
Alternatively, we can add these credentials to apoc.conf and load them using the static value storage functions.
apoc.static.gcp.apiKey=<api-key-here>
The following retrieves the GCP credentials from apoc.conf:
RETURN apoc.static.getAll("gcp") AS gcp;
gcp |
---|
{apiKey: "<api-key-here>"} |
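The stored credentials can then be passed to the procedure instead of a query parameter. The following is a minimal sketch, assuming the apoc.static.gcp.apiKey entry shown above is present in apoc.conf and that the Article nodes from the sample graph in Usage Examples exist:

```cypher
// Load every static value stored under the "gcp" prefix
WITH apoc.static.getAll("gcp") AS gcp
MATCH (a:Article {uri: "https://en.wikipedia.org/wiki/Nintendo_Switch"})
// Use the stored key in place of the $apiKey parameter
CALL apoc.nlp.gcp.entities.stream(a, {
  key: gcp.apiKey,
  nodeProperty: "body"
})
YIELD value
RETURN value;
```

This keeps the API key out of the client session, at the cost of storing it in the server configuration.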
Usage Examples
The examples in this section are based on the following sample graph:
CREATE (:Article {
uri: "https://neo4j.com/blog/pokegraph-gotta-graph-em-all/",
body: "These days I’m rarely more than a few feet away from my Nintendo Switch and I play board games, card games and role playing games with friends at least once or twice a week. I’ve even organised lunch-time Mario Kart 8 tournaments between the Neo4j European offices!"
});
CREATE (:Article {
uri: "https://en.wikipedia.org/wiki/Nintendo_Switch",
body: "The Nintendo Switch is a video game console developed by Nintendo, released worldwide in most regions on March 3, 2017. It is a hybrid console that can be used as a home console and portable device. The Nintendo Switch was unveiled on October 20, 2016. Nintendo offers a Joy-Con Wheel, a small steering wheel-like unit that a Joy-Con can slot into, allowing it to be used for racing games such as Mario Kart 8."
});
We can use this procedure to extract the entities from the Article node.
The text that we want to analyze is stored in the body property of the node, so we’ll need to specify that via the nodeProperty configuration parameter.
The following streams the entities for the Pokemon article:
MATCH (a:Article {uri: "https://neo4j.com/blog/pokegraph-gotta-graph-em-all/"})
CALL apoc.nlp.gcp.entities.stream(a, {
key: $apiKey,
nodeProperty: "body"
})
YIELD value
UNWIND value.entities AS entity
RETURN entity;
entity |
---|
{name: "card games", salience: 0.17967656, metadata: {}, type: "CONSUMER_GOOD", mentions: [{type: "COMMON", text: {content: "card games", beginOffset: -1}}]} |
{name: "role playing games", salience: 0.16441391, metadata: {}, type: "OTHER", mentions: [{type: "COMMON", text: {content: "role playing games", beginOffset: -1}}]} |
{name: "Switch", salience: 0.143287, metadata: {}, type: "OTHER", mentions: [{type: "COMMON", text: {content: "Switch", beginOffset: -1}}]} |
{name: "friends", salience: 0.13336793, metadata: {}, type: "PERSON", mentions: [{type: "COMMON", text: {content: "friends", beginOffset: -1}}]} |
{name: "Nintendo", salience: 0.12601112, metadata: {mid: "/g/1ymzszlpz"}, type: "ORGANIZATION", mentions: [{type: "PROPER", text: {content: "Nintendo", beginOffset: -1}}]} |
{name: "board games", salience: 0.08861496, metadata: {}, type: "CONSUMER_GOOD", mentions: [{type: "COMMON", text: {content: "board games", beginOffset: -1}}]} |
{name: "tournaments", salience: 0.0603245, metadata: {}, type: "EVENT", mentions: [{type: "COMMON", text: {content: "tournaments", beginOffset: -1}}]} |
{name: "offices", salience: 0.034420907, metadata: {}, type: "LOCATION", mentions: [{type: "COMMON", text: {content: "offices", beginOffset: -1}}]} |
{name: "Mario Kart 8", salience: 0.029095741, metadata: {wikipedia_url: "https://en.wikipedia.org/wiki/Mario_Kart_8", mid: "/m/0119mf7q"}, type: "PERSON", mentions: [{type: "PROPER", text: {content: "Mario Kart 8", beginOffset: -1}}]} |
{name: "European", salience: 0.020393685, metadata: {mid: "/m/02j9z", wikipedia_url: "https://en.wikipedia.org/wiki/Europe"}, type: "LOCATION", mentions: [{type: "PROPER", text: {content: "European", beginOffset: -1}}]} |
{name: "Neo4j", salience: 0.020393685, metadata: {mid: "/m/0b76t3s", wikipedia_url: "https://en.wikipedia.org/wiki/Neo4j"}, type: "ORGANIZATION", mentions: [{type: "PROPER", text: {content: "Neo4j", beginOffset: -1}}]} |
{name: "8", salience: 0, metadata: {value: "8"}, type: "NUMBER", mentions: [{type: "TYPE_UNKNOWN", text: {content: "8", beginOffset: -1}}]} |
We get back 12 different entities.
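Each entity map carries a salience score, so the stream can be narrowed to the most prominent entities before doing anything else with it. The following is a sketch only: the 0.1 threshold is an arbitrary value chosen for illustration, not a recommended setting.

```cypher
MATCH (a:Article {uri: "https://neo4j.com/blog/pokegraph-gotta-graph-em-all/"})
CALL apoc.nlp.gcp.entities.stream(a, {
  key: $apiKey,
  nodeProperty: "body"
})
YIELD value
UNWIND value.entities AS entity
// Keep only entities at or above the illustrative salience threshold
WITH entity
WHERE entity.salience >= 0.1
RETURN entity.name AS name, entity.type AS type, entity.salience AS salience
ORDER BY salience DESC;
```

Against the results listed above, this threshold would keep the five entities from "card games" down to "Nintendo", although the scores returned by the API may differ between runs or API versions.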
We could then apply a Cypher statement that creates one node per entity and an ENTITY relationship from the Article node to each of those nodes.
The following streams the entities for the Pokemon article and then creates nodes for each entity:
MATCH (a:Article {uri: "https://neo4j.com/blog/pokegraph-gotta-graph-em-all/"})
CALL apoc.nlp.gcp.entities.stream(a, {
key: $apiKey,
nodeProperty: "body"
})
YIELD value
UNWIND value.entities AS entity
MERGE (e:Entity {name: entity.name})
SET e.type = entity.type
MERGE (a)-[:ENTITY]->(e);
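To check what the statement above created, we can read the ENTITY relationships back. This is a small sketch that relies only on the Article and Entity labels and the name and type properties used in that statement:

```cypher
// List the entity names attached to each article, grouped by entity type
MATCH (a:Article)-[:ENTITY]->(e:Entity)
RETURN a.uri AS article, e.type AS type, collect(e.name) AS names
ORDER BY article, type;
```

Because the entities are created with MERGE on name, an entity mentioned in several articles should appear as a single Entity node linked to each of them.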
If we want to automatically create an entity graph, see apoc.nlp.gcp.entities.graph.
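The signature types source as ANY and also yields node and error columns. The following sketch assumes that a list of nodes is accepted as the source (an assumption based on the ANY type, so worth verifying against your APOC Extended version) and returns the error map next to each node so that failed calls can be spotted:

```cypher
// Collect all articles and send them to the procedure in one call
MATCH (a:Article)
WITH collect(a) AS articles
CALL apoc.nlp.gcp.entities.stream(articles, {
  key: $apiKey,
  nodeProperty: "body"
})
YIELD node, value, error
// One row per article: how many entities came back, plus any error details
RETURN node.uri AS uri, size(value.entities) AS entityCount, error;
```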