Groupby on multiple fields using DQL

A question (Big O notation related): who is your client that needs 100k nodes per request, each node with N predicates, and each predicate with N characters? I think no DB can handle requests like that at 100k per second, 24 hours per day, 7 days per week, with no cool-down. Such a query must be occasional, not constant. Right? Obviously I’m exaggerating to stretch our interpretation of things a bit. I know you’re not at that level of resource consumption, but your question does, in part, imply a question about resource usage.

If it’s a big query run occasionally, Dgraph handles it perfectly. If it’s a series of concurrent queries (in this case, 10k or 100k nodes divided into many smaller requests spread over the hours), Dgraph handles them perfectly too. And it can go further if you have a well-planned cluster along with well-planned clients and application; a big read can be split into pages, as in the sketch below.
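To make the “many smaller requests” idea concrete, here is a minimal sketch of paginating a large read with DQL’s `first`/`after` cursor arguments, using the dgo (v210) Go client. The `Product` type, the `name` predicate, and the `localhost:9080` address are assumptions I made up for the example, not anything from your schema:

```go
// Sketch: splitting one huge read into many small paginated requests.
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"log"

	"github.com/dgraph-io/dgo/v210"
	"github.com/dgraph-io/dgo/v210/protos/api"
	"google.golang.org/grpc"
)

type product struct {
	UID  string `json:"uid"`
	Name string `json:"name"`
}

func main() {
	conn, err := grpc.Dial("localhost:9080", grpc.WithInsecure())
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	dg := dgo.NewDgraphClient(api.NewDgraphClient(conn))

	ctx := context.Background()
	pager := "" // becomes ", after: <uid>" once we have a cursor
	for {
		// first + after keeps each request small instead of one 100k-node query.
		q := fmt.Sprintf(`{
  page(func: type(Product), first: 1000%s) {
    uid
    name
  }
}`, pager)

		resp, err := dg.NewReadOnlyTxn().Query(ctx, q)
		if err != nil {
			log.Fatal(err)
		}
		var out struct {
			Page []product `json:"page"`
		}
		if err := json.Unmarshal(resp.Json, &out); err != nil {
			log.Fatal(err)
		}
		if len(out.Page) == 0 {
			break // no more pages
		}
		// ... process this page (1,000 nodes) instead of 100k at once ...
		pager = fmt.Sprintf(", after: %s", out.Page[len(out.Page)-1].UID)
	}
}
```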

But 10k nodes per request, per client, is very complicated. Say you have 1 million customers, each making a single request per second. Realistically, only about 1% to 10% of them are active at any given moment, which still means between 10 thousand and 100 thousand requests per second. In that context, you will already have a successful project (with a lot of money), your cluster will certainly be very robust, and Dgraph will support that many requests.

Application logic gives you freedom: you don’t rely only on the DB to grow, and you can change fast when you need to. Also, let’s say you get bored with Dgraph someday. What’s your plan to move out if you are completely dependent on us? That change would be painful for you. Never depend on a single thing. Always have a plan B, C, D, E… for everything in life. I love Dgraph, but I would be a fool to tell myself I can rely on a single thing instead of creating a mix of things and building my own. I see no growth in thinking like that. IMHO.
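And since this thread is about grouping on multiple fields: DQL does have a `@groupby` directive, but when you need a composite grouping (or your version doesn’t support the combination you want), this is exactly the kind of thing application logic handles trivially. A minimal sketch in Go, with invented field names (`City`, `Category`, `Price`) standing in for JSON decoded from a DQL response:

```go
// Sketch: grouping query results by multiple fields in application code.
package main

import "fmt"

type row struct {
	City     string
	Category string
	Price    float64
}

// groupKey combines several fields into one composite map key, which is
// one way to get "groupby on multiple fields" outside the database.
type groupKey struct {
	City     string
	Category string
}

func main() {
	// Stand-in for rows decoded from a DQL query response.
	rows := []row{
		{"Lisbon", "books", 10},
		{"Lisbon", "books", 5},
		{"Lisbon", "games", 30},
		{"Porto", "books", 8},
	}

	counts := map[groupKey]int{}
	sums := map[groupKey]float64{}
	for _, r := range rows {
		k := groupKey{r.City, r.Category}
		counts[k]++
		sums[k] += r.Price
	}

	for k, n := range counts {
		fmt.Printf("%s/%s: count=%d sum=%.2f\n", k.City, k.Category, n, sums[k])
	}
}
```

A comparable struct as a map key is all it takes, and the same pattern extends to any aggregation the DB doesn’t give you directly.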

By now I think I’m just stating the obvious, right? I imagine everyone thinks like that. Right?
