Interesting performance issue with high cardinality indexing

Hey @dmai!

Yep we do have the @count index in our schema.

To clarify, here’s what our actual schema looks like (note that I’ve also updated the posted GitHub gist of our queries to match):

item_canonicalhash: string @index(exact) .
tagkey_name: string @index(exact) .
tagkey_value: uid @reverse .
tagvalue_name: string @index(exact, trigram) .
tagvalue_item: uid @reverse @count .

A typical N-Quad mutation for the above would be:

_:item <item_canonicalhash> "90126705c8a72b7978b61be654da1fa4" .

_:tagserver <tagkey_name> "server" .
_:tagserver <tagkey_value> _:valueserver-0001 .
_:valueserver-0001 <tagvalue_name> "server-0001" .
_:valueserver-0001 <tagvalue_item> _:item .

_:tagdc <tagkey_name> "dc" .
_:tagdc <tagkey_value> _:valuedc1 .
_:valuedc1 <tagvalue_name> "dc1" .
_:valuedc1 <tagvalue_item> _:item .

_:tagrack <tagkey_name> "rack" .
_:tagrack <tagkey_value> _:valuerack-0001 .
_:valuerack-0001 <tagvalue_name> "rack-0001" .
_:valuerack-0001 <tagvalue_item> _:item .

We have three “types” of node:

  • tagkeys
  • tagvalues
  • items

tagkeys to tagvalues are 1..N
tagvalues to items are also 1..N

We search for items by intersecting the sets of items found by each tagkey/tagvalue pattern. However, even when a tagkey/tagvalue pair was not used for filtering, we still return all of the pairs associated with the items in the result set.
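
As a minimal sketch (hypothetical tag keys/values; the real queries are in the gist), intersecting two such patterns looks roughly like:

{
  # items tagged server=server-0001
  var(func: eq(tagkey_name, "server")) {
    tagkey_value @filter(eq(tagvalue_name, "server-0001")) {
      serverItems as tagvalue_item
    }
  }

  # items tagged dc=dc1
  var(func: eq(tagkey_name, "dc")) {
    tagkey_value @filter(eq(tagvalue_name, "dc1")) {
      dcItems as tagvalue_item
    }
  }

  # intersect the two item sets, then follow the reverse edges
  # to return every tag pair attached to each matching item
  items(func: uid(serverItems)) @filter(uid(dcItems)) {
    item_canonicalhash
    ~tagvalue_item {
      tagvalue_name
      ~tagkey_value {
        tagkey_name
      }
    }
  }
}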

When using multiple filters to intersect the result sets, the ordering matters for performance. We’re looking for a way to ensure that the ordering of our tagvalue filters produces the most efficient query, since choosing the wrong “root” tagvalue filter (e.g. the @filter(regexp(tagvalue_name, /web1000.*/)) from our query example) can result in extremely slow response times.
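
Roughly the kind of ordering difference we mean, simplified with hypothetical tag values:

{
  # fast: root on a narrow exact match, apply the broad
  # regexp as a filter while traversing
  fast(func: eq(tagvalue_name, "rack-0001")) @cascade {
    tagvalue_item {
      item_canonicalhash
      ~tagvalue_item @filter(regexp(tagvalue_name, /web1000.*/)) {
        tagvalue_name
      }
    }
  }

  # slow: root on the regexp, which can match a huge set of
  # tagvalues before any intersection happens
  slow(func: regexp(tagvalue_name, /web1000.*/)) @cascade {
    tagvalue_item {
      item_canonicalhash
      ~tagvalue_item @filter(eq(tagvalue_name, "rack-0001")) {
        tagvalue_name
      }
    }
  }
}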

We are currently using the “indexed count query” as a way of figuring out beforehand which filter matches the fewest items, so that we can build the optimal query, but even that step is quite slow. We’re pretty sure we’re doing something wrong :wink:
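
For reference, that count step is essentially this (hypothetical tag values), leaning on the @count index on tagvalue_item:

{
  # fan-out of each candidate filter; we root the real
  # query on whichever set comes back smallest
  rackCount(func: eq(tagvalue_name, "rack-0001")) {
    count(tagvalue_item)
  }
  webCount(func: regexp(tagvalue_name, /web1000.*/)) {
    count(tagvalue_item)
  }
}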

If there is anything you can suggest we do to optimise this, that would be so helpful! We’re currently evaluating Dgraph and it looks very promising, apart from this one stumbling block.