Sorting and indexing

Yeah, that’s what I had in mind. Unless the query specifies the minimum year, we’d just iterate over $dob, find the min year from there, do the intersections, etc. Once we find the number of results we’re looking for, we’ll stop. But remember to always pick up all the results from each bucket in full, because the uids within the same bucket aren’t sorted by dob; they’re sorted by their uint64 values.
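Roughly, the loop I have in mind looks like the sketch below. It is only an illustration: `Bucket`, `collectByYear`, and `intersectSorted` are made-up names, the buckets are assumed to be already sorted by year, and `candidates` stands in for whatever uid list the rest of the query produced.

```go
package main

import (
	"fmt"
	"sort"
)

// Bucket is a hypothetical stand-in for one $dob index entry:
// a year plus the uids that fall in that year.
type Bucket struct {
	Year int
	UIDs []uint64 // sorted by uid value, NOT by dob
}

// intersectSorted returns the uids present in both sorted lists.
func intersectSorted(a, b []uint64) []uint64 {
	var out []uint64
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

// collectByYear walks buckets in ascending year order starting at minYear,
// intersects each bucket with candidates, and stops once at least n uids
// have been gathered. A visited bucket is always taken in full, because
// uid order inside a bucket says nothing about dob order.
func collectByYear(buckets []Bucket, candidates []uint64, minYear, n int) []uint64 {
	// buckets must be sorted by Year for the binary search to be valid.
	start := sort.Search(len(buckets), func(i int) bool {
		return buckets[i].Year >= minYear
	})
	var out []uint64
	for _, b := range buckets[start:] {
		if len(out) >= n {
			break
		}
		out = append(out, intersectSorted(b.UIDs, candidates)...)
	}
	return out
}

func main() {
	buckets := []Bucket{
		{Year: 1980, UIDs: []uint64{5, 9}},
		{Year: 1985, UIDs: []uint64{2, 7, 11, 20}},
		{Year: 1990, UIDs: []uint64{1, 5}},
	}
	candidates := []uint64{2, 5, 7, 9, 11}
	fmt.Println(collectByYear(buckets, candidates, 0, 3))
}
```

Note that the result can hold more than n uids, since the last bucket is never cut short; trimming to exactly n would need the actual dob values of that last bucket.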

Not sure why the hash map has to do anything here. When we iterate over RocksDB, we can get the key, then ask for that PL. The optimization that we could make here is to retrieve only keys from RocksDB and not values. That way, our existing logic would work just fine.
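To make the key-only idea concrete, here is a minimal sketch. None of this is the RocksDB API: `KeyIterator`, `memKeys`, and `PLGetter` are hypothetical names, and the in-memory iterator only models "walk keys in sorted order without loading values" so the example can run on its own.

```go
package main

import (
	"fmt"
	"sort"
)

// KeyIterator is a hypothetical, minimal view of a key-only iterator.
// It is not the RocksDB API.
type KeyIterator interface {
	Valid() bool
	Key() string
	Next()
}

// memKeys is an in-memory KeyIterator that yields keys in sorted order.
type memKeys struct {
	keys []string
	pos  int
}

func newMemKeys(keys []string) *memKeys {
	sorted := append([]string(nil), keys...)
	sort.Strings(sorted)
	return &memKeys{keys: sorted}
}

func (m *memKeys) Valid() bool { return m.pos < len(m.keys) }
func (m *memKeys) Key() string { return m.keys[m.pos] }
func (m *memKeys) Next()       { m.pos++ }

// PLGetter is a hypothetical stand-in for the existing
// "give me the posting list for this key" lookup.
type PLGetter func(key string) []uint64

// collect iterates over keys only and asks for a posting list just for
// the keys it actually visits, stopping once at least n uids are gathered.
func collect(it KeyIterator, getPL PLGetter, n int) []uint64 {
	var out []uint64
	for ; it.Valid() && len(out) < n; it.Next() {
		out = append(out, getPL(it.Key())...)
	}
	return out
}

func main() {
	pls := map[string][]uint64{
		"dob|1980": {5, 9},
		"dob|1985": {2, 7, 11},
		"dob|1990": {1},
	}
	it := newMemKeys([]string{"dob|1990", "dob|1980", "dob|1985"})
	get := func(key string) []uint64 { return pls[key] }
	fmt.Println(collect(it, get, 2))
}
```

The point of splitting the iterator from the getter is that the PL lookup stays exactly what it is today; only the scan changes.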

That is something that our clients should decide. If they have a small data set that fits entirely in RAM, or have deep enough pockets to fit all their data in RAM, they could pass a flag to Dgraph, which would instruct us to open the tables using that particular setting. This isn’t something we should use by default.

Iteration seems like the simpler logic here. If you have other ideas that are similarly simple, we could also discuss them. But try to avoid complexity early on. Iteration might just give us all we need, and/or we might be able to cache these keys in local memory to help with further lookups, etc.

Note that we’ll stop iterating once we have N results, and we’ll use a default max value for N if the client doesn’t specify one. So it still won’t be prohibitively expensive.
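The default could be as simple as the sketch below. The name `effectiveLimit` and the value 1000 are placeholders I picked for illustration, and treating a non-positive request as "not specified" is also an assumption.

```go
package main

import "fmt"

// defaultMaxResults is an assumed placeholder value, not a decided number.
const defaultMaxResults = 1000

// effectiveLimit returns the N to stop at: the client's value when one is
// given, otherwise the default. A non-positive value means "not specified".
func effectiveLimit(requested int) int {
	if requested <= 0 {
		return defaultMaxResults
	}
	return requested
}

func main() {
	fmt.Println(effectiveLimit(0), effectiveLimit(25))
}
```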