Reference to earlier discussion - Implement multiple store backends by pdf · Pull Request #127 · dgraph-io/dgraph · GitHub
I suppose we are deliberating two related but different questions.
Should Dgraph have a pluggable architecture for its storage backend?
This question has more to do with the philosophy behind the software, and what we want it to be, than with the actual design.
There are many popular and successful open source projects both with and without pluggable architectures. I will continue with the example of two of the most successful open source databases: MySQL and Postgres.
- MySQL does have pluggable storage engines, which means that anyone picking up MySQL also has to make a further choice based on their requirements. MySQL’s philosophy is “the user knows best: give them options and let them choose”. It works well, but this flexibility has some costs.
  - Usability is a feature. The user must be knowledgeable enough to make the right decision for their needs. As the number of pluggable options grows, so does the number of combinations: even 2 or 3 pluggable components, each with just two alternatives, give 4 to 8 possible combinations. A non-expert user would find that overwhelming.
  - The performance of MySQL as a database, and hence its reputation, is limited by the quality of its pluggable components. That may not be the best place to be if MySQL does not have inherent control over those components.
  - The product is bound to the common subset of its dependencies' functionality. If pluggable component A has 3 features and pluggable component B has 5, the product, in order to stay extensible and flexible across all pluggable components, can only choose the features that overlap, which is at most 3.
- Postgres, on the other hand, chose not to make its components configurable, which means that anyone picking up Postgres has no flexibility regarding the internals. Postgres’ philosophy is “we know best: let us model and worry about the internal architecture as best we can”. While this has the downside of not being flexible enough, this lack of flexibility has some advantages.
  - The user can just pick up Postgres, know SQL, and run with it.
  - It also allows Postgres to have complete control over the storage engine and tweak it as they see fit. In Dgraph’s context, tomorrow we can get rid of the third-party engine and write our own if there is any value in doing so (e.g. performance, additional features, etc.).
  - It arguably makes understanding and contributing to Postgres comparatively easier, as there are fewer moving pieces.
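To make the "subset of functionality" cost concrete, here is a minimal Go sketch. It is purely illustrative: the `Backend` type, the `CommonFeatures` function, and the feature names are hypothetical and are not taken from Dgraph, MySQL, or any real storage engine. It only shows that a product that wants to work uniformly across all pluggable backends is limited to the features every backend supports.

```go
package main

import (
	"fmt"
	"sort"
)

// Backend is a hypothetical description of a pluggable storage engine
// and the features it offers. Names are illustrative assumptions only.
type Backend struct {
	Name     string
	Features []string
}

// CommonFeatures returns, in sorted order, the features supported by
// every backend in the slice: the only ones a backend-agnostic product
// can rely on.
func CommonFeatures(backends []Backend) []string {
	if len(backends) == 0 {
		return nil
	}
	counts := make(map[string]int)
	for _, b := range backends {
		seen := make(map[string]bool) // ignore duplicates within one backend
		for _, f := range b.Features {
			if !seen[f] {
				seen[f] = true
				counts[f]++
			}
		}
	}
	var common []string
	for f, n := range counts {
		if n == len(backends) {
			common = append(common, f)
		}
	}
	sort.Strings(common)
	return common
}

func main() {
	a := Backend{Name: "A", Features: []string{"get", "put", "delete"}}
	b := Backend{Name: "B", Features: []string{"get", "put", "delete", "snapshot", "range-scan"}}
	// B's two extra features are unusable if the product must also run on A.
	fmt.Println(CommonFeatures([]Backend{a, b}))
}
```

With a single fixed backend, the same product could use all five of B's features directly, which is the trade-off described above.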
I am not trying to convince you that a graph database with a pluggable storage engine is not a valid approach; it absolutely is. My point is only that Dgraph is choosing a different approach that is also valid: it has worked historically for other successful projects, and hopefully it will for us as well.
Which storage backend to use – RocksDB vs. Bolt vs Custom vs xyz … ?
I do not know enough about Rocks, Bolt, Go, Cgo, or even graphs for that matter to have a qualified opinion, so I’ll refrain from passing judgement on it one way or the other. I do, however, trust that @mrjn and the rest of the Dgraph team @core-devs have spent enough brain cycles to arrive at a robust decision. I’ll let them do the talking on it and convince you, if they choose to do so. I would mention, however, that in my experience, at the early stage of a startup or project, a few good-faith decisions have to be made based on hunch, past experience, theory, and common sense.