We see this too, in a cluster that doesn’t appear to be under stress. I suspect it’s due to the way Dgraph handles a mixed read/write workload. Our dev team leaves the cluster over-provisioned in an attempt to compensate, but I’m not convinced we understand the core issue.
So in addition to the above questions, I would add:
- What is the recommended way to diagnose which queries are triggering these 429s?
- If Dgraph is struggling in a mixed read/write scenario, what best practices can we put in place to compensate?
- Is there a reference implementation (in any language, though we use TypeScript) that shows how we should handle these 429s, which as far as I’m aware are not based on any fixed rate limits?
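
For context on the last question: absent a published rate limit, the generic fallback is retrying with exponential backoff and jitter. Below is a minimal sketch of that idea, not a Dgraph-recommended implementation. It assumes the client surfaces the HTTP status on the thrown error as `status` or `response.status` (that error shape is an assumption, not a documented client API), and `RetryOptions`, `isThrottled`, `backoffDelay`, and `withRetryOn429` are hypothetical names.

```typescript
// Generic retry-on-429 helper with exponential backoff and full jitter.
// Nothing here is Dgraph-specific; the error shape is an assumption.
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first one
  baseDelayMs: number; // backoff cap for the first retry
  maxDelayMs: number;  // upper bound on any single wait
  sleep?: (ms: number) => Promise<void>; // injectable for tests
  random?: () => number;                 // injectable for tests
}

// Assumption: a throttled request throws an error carrying the HTTP status
// either directly (`status`) or on a nested `response` object.
function isThrottled(err: unknown): boolean {
  const e = err as { status?: number; response?: { status?: number } } | null;
  return e?.status === 429 || e?.response?.status === 429;
}

// Full jitter: pick a random delay in [0, min(maxDelayMs, base * 2^attempt)).
function backoffDelay(
  attempt: number,
  opts: RetryOptions,
  random: () => number = Math.random
): number {
  const cap = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** attempt);
  return Math.floor(random() * cap);
}

// Runs `op`, retrying only when it fails with a 429 and attempts remain.
// Any other error, or the final 429, is rethrown to the caller.
async function withRetryOn429<T>(
  op: () => Promise<T>,
  opts: RetryOptions
): Promise<T> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((r) => setTimeout(r, ms)));
  const random = opts.random ?? Math.random;
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      const lastAttempt = attempt + 1 >= opts.maxAttempts;
      if (!isThrottled(err) || lastAttempt) throw err;
      await sleep(backoffDelay(attempt, opts, random));
    }
  }
}
```

Usage would be to wrap whatever call issues the query or mutation, e.g. `withRetryOn429(() => runQuery(q), { maxAttempts: 5, baseDelayMs: 100, maxDelayMs: 5000 })`, where `runQuery` is a placeholder for our own client call. Retrying mutations this way is only safe if they are idempotent, and it treats the symptom rather than explaining why the 429s occur.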