If Google were to use Dgraph, should it use one single Dgraph DB for everything, or a separate Dgraph DB for each service (Maps, YouTube, Gmail, ...)?

Thanks a lot, now I understand that setup! The photo shows a 3x4-node HA (high availability) Dgraph cluster. Every Alpha node is its own machine. A2, A5 and A8 are replicas (Z2 manages them), A1, A4 and A7 are replicas (Z1 manages them), and A3, A6 and A9 are replicas (Z3 manages them). The groups are, as you said, logical containers for these machines: the machines within a group are replicas of each other, so they all hold the same data.
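To check my own understanding, here is a rough sketch of that layout as plain Python data. This only models the picture as I described it above; the group names (`group-1` etc.) and the helper `replicas_of` are made up for illustration and are not part of any Dgraph API.

```python
# Toy model of the cluster in the photo (names are illustrative assumptions).
# Each group lists the Zero I associated with it and its three Alpha replicas.
GROUPS = {
    "group-1": {"zero": "Z1", "alphas": ["A1", "A4", "A7"]},
    "group-2": {"zero": "Z2", "alphas": ["A2", "A5", "A8"]},
    "group-3": {"zero": "Z3", "alphas": ["A3", "A6", "A9"]},
}


def replicas_of(alpha):
    """Return the other Alphas that hold the same data as `alpha`."""
    for group in GROUPS.values():
        if alpha in group["alphas"]:
            return [a for a in group["alphas"] if a != alpha]
    raise KeyError(f"unknown alpha: {alpha}")


print(replicas_of("A5"))
```

So if A5 goes down, A2 and A8 still have the same data, which is the HA part.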

But that means, if we set aside multi-tenancy and HA for now (the concern is the same with or without them) and look at a basic sharding setup, Dgraph will still perform badly when it holds many different kinds of data (from different services). When it shards to scale horizontally, Dgraph shards/balances predicates based on disk usage. This creates a big mix, because Dgraph doesn't care whether the data belongs together or not, so data from different services ends up interleaved across groups. A single query then needs many hops between machines, and that causes latency. Is that true or not? I remember that Dgraph has some kind of 3-predicate architecture that is supposed to solve this issue (I read about it in the Dgraph introduction on the website), but it wasn't explained in much detail there. It is explained in the whitepaper, but I honestly can't follow the whitepaper.
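To make my worry concrete, here is a toy simulation. It is NOT Dgraph's actual balancing algorithm, just a naive "put each predicate on the group with the least data" rule that I am assuming for illustration. The predicate names and sizes (in GB) are invented.

```python
# Hypothetical predicates from three services, with made-up sizes in GB.
PREDICATE_SIZES = {
    "maps.location": 50,
    "youtube.title": 40,
    "gmail.subject": 30,
    "maps.name": 20,
    "youtube.views": 10,
    "gmail.sender": 10,
}


def assign_by_size(sizes, num_groups):
    """Greedy placement: largest predicate first, onto the emptiest group."""
    usage = [0] * num_groups
    placement = {}
    for pred, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
        group = usage.index(min(usage))  # group with the least data so far
        placement[pred] = group
        usage[group] += size
    return placement, usage


def groups_touched(placement, predicates):
    """Set of groups a query has to visit to read the given predicates."""
    return {placement[p] for p in predicates}


placement, usage = assign_by_size(PREDICATE_SIZES, num_groups=3)
print(usage)
print(groups_touched(placement, ["maps.location", "maps.name"]))
print(groups_touched(placement, ["youtube.title", "youtube.views"]))
```

In this toy run the disk usage ends up nicely balanced, but the two Maps predicates land on different groups, so a Maps query that needs both has to cross machines. That cross-group hop is exactly what I am asking about.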

Can you guys maybe shed some light on this? (._.)

Also, you said "Dgraph will balance the predicates", NOT "the data". That suggests there is some more logic behind it for better performance, so that one Dgraph database is able to handle multiple services. Can you maybe explain that 3-predicate thing? .-.