Multiple (Unique) Graph Databases On Same Server (Neo4j)
Prerequisites for this article
- General understanding of graph databases
- General understanding of Neo4j
While working on a product at It Key Nexus, we ran into an interesting situation. It was an NLP-related project where we were mapping entities onto nodes in a graph database, and we were using Neo4j to do that.
Problem
We had one instance of the Neo4j server running, and we wanted to create multiple independent databases on it. Seems simple enough, right?
The problem was this: if two nodes held the same data (which is entirely possible, for example when two different users enter the same value), Neo4j created two nodes and assigned them two different ids on its own. For example, if we run the statement below twice:
CREATE (p:Person {title: 'Jon Doe'})
Neo4j created two separate nodes, each with its own id.
This obviously did not meet our business requirements: even though the nodes had the same data, they would not necessarily have the same relations with other nodes. And how would we retrieve a certain node and its related nodes if we don't know the id assigned to it by Neo4j?
Solution
To tackle this problem, we added another property to each node, like this:
CREATE (p:Person {title: 'Jon Doe', assigned_id: '1'})
The assigned_id property identifies a database: its value is unique per database and is carried by every node belonging to that database. When we wanted to retrieve a certain database/graph, we simply added a condition to the WHERE clause so that only the nodes belonging to that database were returned, i.e. the nodes whose assigned_id equals the value we assigned to that database. Something like this:
MATCH (e:Person)-[r]->(l:Location)
WHERE e.assigned_id = $id AND l.assigned_id = $id
RETURN e, r, l;
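The tagging and filtering steps can be wrapped in small helpers so that the database id always travels as a query parameter instead of being pasted into the query string. Below is a minimal sketch in Python, not our production code: the function names tagged_create and scoped_match are hypothetical, and the helpers only build the Cypher text and its parameter map. Actually running them would need a driver session (with the official Neo4j Python driver that would be something like session.run(query, params)).

```python
import re

# Labels cannot be sent as Cypher parameters, so they are interpolated
# into the query text. Validate them first to avoid injection.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _check_label(label: str) -> str:
    if not _IDENT.fullmatch(label):
        raise ValueError(f"unsafe label: {label!r}")
    return label


def tagged_create(label: str, title: str, graph_id: str):
    """Build a CREATE statement that stamps the node with its graph's id."""
    query = f"CREATE (n:{_check_label(label)} {{title: $title, assigned_id: $id}})"
    return query, {"title": title, "id": graph_id}


def scoped_match(src_label: str, dst_label: str, graph_id: str):
    """Build a MATCH that only returns relations inside one logical graph."""
    query = (
        f"MATCH (e:{_check_label(src_label)})-[r]->(l:{_check_label(dst_label)}) "
        "WHERE e.assigned_id = $id AND l.assigned_id = $id "
        "RETURN e, r, l"
    )
    return query, {"id": graph_id}


create_query, create_params = tagged_create("Person", "Jon Doe", "1")
match_query, match_params = scoped_match("Person", "Location", "1")
```

Keeping the id in the parameter map means the same query text can be reused for every database, and only the value of id changes from call to call.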
Now even if nodes in different databases held the same data, they would never share the same assigned_id. That property alone decides whether a node is part of a given database, which lets us keep multiple independent copies of the same data in different databases on the same server. Cool, isn't it?
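To see why one shared property is enough to keep the graphs apart, here is a toy in-memory model in Python. It is only an illustration under stated assumptions: ToyStore is a hypothetical stand-in for the single server (plain lists and dicts, not Neo4j itself), and the location names are made-up sample data.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyStore:
    """Stand-in for one server holding every logical graph in a single store."""
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (source internal id, target internal id)
    _ids: count = field(default_factory=count)

    def create(self, label, title, assigned_id):
        # Like the real server, always create a new node with a fresh internal id.
        node = {"id": next(self._ids), "label": label,
                "title": title, "assigned_id": assigned_id}
        self.nodes.append(node)
        return node

    def relate(self, src, dst):
        self.edges.append((src["id"], dst["id"]))

    def subgraph(self, assigned_id):
        """Return (source title, target title) pairs where both ends carry assigned_id."""
        by_id = {n["id"]: n for n in self.nodes}
        return [
            (by_id[s]["title"], by_id[d]["title"])
            for s, d in self.edges
            if by_id[s]["assigned_id"] == assigned_id
            and by_id[d]["assigned_id"] == assigned_id
        ]


store = ToyStore()
# Two users enter the same person, each into their own logical graph.
jon_a = store.create("Person", "Jon Doe", "1")
jon_b = store.create("Person", "Jon Doe", "2")
store.relate(jon_a, store.create("Location", "City A", "1"))
store.relate(jon_b, store.create("Location", "City B", "2"))

graph_1 = store.subgraph("1")
graph_2 = store.subgraph("2")
```

Both Jon Doe nodes live in the same store with different internal ids, yet each subgraph call only sees the relations of its own graph, which mirrors the WHERE filter on assigned_id.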