Should you choose a graph database over a relational database?
This was the question I was trying to answer for work. I need to store network topology in a database, which means the data is certainly one or more graphs.
The key thing to consider is that in a graph database, relations are stored and treated separately from the records they connect. In a graph, the relations are translated to edges.
Now, let us think of the operations one might want to do on this type of data, network topology in our case. The following are some examples, along with what we might have to do to fetch the data:
|Operation||Relational DB||Graph DB|
|Get config info of a switch||Query based on property search||Query based on property search|
|Get immediate neighbors of a switch||Query the switch, then query the switches that are related to it||Query nodes connected to the given switch|
|Get full topology given a switch||Recursively query switches||Query all switches related to the given switch|
|Get full topology||Get all records||Get all nodes and edges|
|Update information of a switch||Find the switch, update the property||Find the switch, update the property|
|Remove a switch and connected devices||Recursively find the devices, remove them, remove the switch||Remove all the nodes related to the given switch|
As you can see, the difference is in the recursive calls. All other operations are fairly similar in nature. But when querying nested records, or records related over multiple relations, SQL will have you write recursive queries. Graph query languages handle this for you.
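To make the "recursive query" side concrete, here is a minimal sketch of the "get full topology given a switch" operation on a relational store. It uses SQLite from the Python standard library and a recursive common table expression. The `switches` and `links` tables, their columns, and the sample rows are assumptions made up for illustration, and links are treated as undirected.

```python
import sqlite3

# Hypothetical schema: one row per switch, one row per link between two switches.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE switches (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE links (src INTEGER NOT NULL, dst INTEGER NOT NULL);
    INSERT INTO switches VALUES (1, 'core'), (2, 'agg-a'), (3, 'agg-b'),
                                (4, 'edge-a1'), (5, 'isolated');
    INSERT INTO links VALUES (1, 2), (1, 3), (2, 4);
""")


def reachable_switches(conn, start_id):
    """Return the names of all switches reachable from start_id (itself included)."""
    rows = conn.execute(
        """
        WITH RECURSIVE reachable(id) AS (
            SELECT ?                      -- seed: the starting switch
            UNION                         -- UNION (not UNION ALL) drops repeats, so cycles terminate
            SELECT CASE WHEN l.src = r.id THEN l.dst ELSE l.src END
            FROM links AS l
            JOIN reachable AS r ON r.id IN (l.src, l.dst)
        )
        SELECT s.name
        FROM switches AS s
        JOIN reachable AS r ON r.id = s.id
        ORDER BY s.name
        """,
        (start_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(reachable_switches(conn, 1))
```

The recursion has to be spelled out by hand: a seed row, a join back onto the intermediate result, and a way to stop on cycles. A graph query language such as Cypher typically expresses the same traversal as a single variable-length path pattern.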
One might argue - “So what? A graph database will be only slightly more efficient given my small amount of data, and it makes multiple queries anyway. With a relational database, I have to write them; with a graph database, the system writes them for me.” This might be true, but let us look at the time complexity. With a graph database, such a recursive operation takes roughly O(log(n)) to find the starting node plus O(1) for each hop to a neighbor, while a relational database needs an O(log(n)) index lookup for every hop, and then more time to join the results. Of course, the point about a “small amount of data” is valid. If there is not much data, the better time complexity won’t help much.
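The cost argument can be sketched with a toy model. This is not a benchmark, and real database engines are far more sophisticated. The assumption here is that the relational side finds neighbors through a sorted index (a binary search per hop), while the graph side keeps direct references from each node to its neighbors. The `LINKS` data and all function names are hypothetical.

```python
from bisect import bisect_left
from collections import deque

# Hypothetical topology: undirected links between switch ids.
LINKS = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]

# "Relational" view: a sorted edge index that is searched again on every hop.
EDGE_INDEX = sorted(LINKS + [(b, a) for a, b in LINKS])


def neighbors_via_index(node):
    """Binary search (O(log n)) for the first matching row, then scan the matches."""
    i = bisect_left(EDGE_INDEX, (node,))
    result = []
    while i < len(EDGE_INDEX) and EDGE_INDEX[i][0] == node:
        result.append(EDGE_INDEX[i][1])
        i += 1
    return result


# "Graph" view: each node holds direct references to its neighbors.
ADJACENCY = {}
for a, b in LINKS:
    ADJACENCY.setdefault(a, []).append(b)
    ADJACENCY.setdefault(b, []).append(a)


def neighbors_via_adjacency(node):
    """Constant-time (on average) jump from a node to its neighbor list."""
    return ADJACENCY.get(node, [])


def traverse(start, neighbors):
    """Breadth-first traversal that uses the given neighbor-lookup strategy."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(traverse(1, neighbors_via_index) == traverse(1, neighbors_via_adjacency))
```

Both strategies visit the same switches. The difference is only in what each hop costs, and that difference grows with the number of rows behind the index.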
But wait! This is not all! We live in a time when “cloud” is preferred. That means that if your application drives the recursion, you will end up sending each query over a network connection! And this alone is a good enough argument for me to choose a graph system over a relational system in this use case.
I hope this analysis gives you an idea of how to go about evaluating this for yourself.
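A rough way to see the round-trip cost is to count the queries an application issues when it walks the topology itself. The sketch below uses an in-memory SQLite database as a stand-in for a remote one, so nothing actually crosses a network here. The `links` table and its rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical link table; against a cloud database each execute() below
# would be a separate network round trip.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (src INTEGER NOT NULL, dst INTEGER NOT NULL);
    INSERT INTO links VALUES (1, 2), (1, 3), (2, 4), (4, 5);
""")


def topology_with_round_trips(conn, start_id):
    """Walk the topology from application code, one query per visited switch."""
    seen = {start_id}
    pending = [start_id]
    queries = 0
    while pending:
        current = pending.pop()
        rows = conn.execute(
            "SELECT dst FROM links WHERE src = ? "
            "UNION SELECT src FROM links WHERE dst = ?",
            (current, current),
        ).fetchall()
        queries += 1  # one round trip per visited switch
        for (neighbor,) in rows:
            if neighbor not in seen:
                seen.add(neighbor)
                pending.append(neighbor)
    return seen, queries


switches, round_trips = topology_with_round_trips(conn, 1)
print(len(switches), "switches found using", round_trips, "queries")
```

The number of queries grows with the number of switches visited, so network latency is paid once per switch. A traversal that runs entirely on the database server pays it once.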