Arrays, dense, sparse and queries

Hello guys,

I’m starting to play with Dgraph, and I have some questions about my use case: dense and sparse arrays. I would like to know more about how, or whether, Dgraph/Badger supports them.

As always, an example is the shortest path:

Here we have plenty of hierarchical files (HDF5), which consist mostly of 1D to 3D arrays: series of images and time series. As always, every file is accompanied by a plethora of metadata (variable names, times, parameters, geolocation, etc.). Sometimes what we call metadata is an array, sometimes it is a simple number. Our idea is to be able to slice and dice this dataset through a single layer of requests: Dgraph/Badger.

Dgraph sounds perfect for our metadata exploration and type of queries, but I’m wondering about some operations, like time slicing. Example: pick the sensors of type A, within region B, which failed between 00:00 and 12:00. Failed here means X[X<0]. The queries can get more complex when X has more dimensions and is sparse. Are these kinds of queries easy and speedy?
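To make the question concrete, here is a minimal sketch in plain Python of the filter described above: type A, inside region B, with a negative value in the midnight-to-noon window. It says nothing about how Dgraph would evaluate it. The `Reading` record, the `failed_sensors` helper, and the bounding-box form of "region B" are all hypothetical names and assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass
class Reading:
    # Hypothetical record layout: one sensor sample plus its metadata.
    sensor_id: str
    sensor_type: str
    lat: float
    lon: float
    timestamp: datetime
    value: float


def failed_sensors(readings, sensor_type, bbox, start, end):
    """Return sorted IDs of sensors of `sensor_type` inside `bbox` that have
    a negative reading whose time of day falls in [start, end)."""
    lat_min, lat_max, lon_min, lon_max = bbox  # assumption: region is a box
    hits = set()
    for r in readings:
        if r.sensor_type != sensor_type:
            continue
        if not (lat_min <= r.lat <= lat_max and lon_min <= r.lon <= lon_max):
            continue
        # Compare only the time of day, ignoring the calendar date.
        if not (start <= r.timestamp.time() < end):
            continue
        if r.value < 0:  # "failed" means X < 0
            hits.add(r.sensor_id)
    return sorted(hits)


readings = [
    Reading("s1", "A", 10.0, 20.0, datetime(2020, 1, 1, 9, 30), -1.5),
    Reading("s2", "A", 10.5, 20.5, datetime(2020, 1, 1, 15, 0), -2.0),  # afternoon
    Reading("s3", "B", 10.0, 20.0, datetime(2020, 1, 1, 8, 0), -3.0),   # wrong type
    Reading("s4", "A", 50.0, 60.0, datetime(2020, 1, 1, 7, 0), -0.1),   # outside region
    Reading("s5", "A", 10.2, 20.2, datetime(2020, 1, 2, 11, 59), 4.2),  # did not fail
]

region_b = (9.0, 11.0, 19.0, 21.0)  # lat_min, lat_max, lon_min, lon_max
result = failed_sensors(readings, "A", region_b, time(0, 0), time(12, 0))
print(result)
```

The part I am unsure how to express in a graph query is the time-of-day comparison, since it ignores the date and looks only at the hour.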

Also, there are some specialized functions/libraries we use for deeper analysis. Is this kind of thing easy to hook into Dgraph for several

Simpler questions: is there any way of compressing the data in memory or on disk? I’m asking because some of our time series or images are sparse (lots of missing values) and HDF5 compresses them beautifully. Is it easy to hook in a file format/library so that Dgraph writes/reads data to/from it?

Hello!

It strikes me that Dgraph is great at dealing with varied and inconsistent data, and a lot of the queries you talk about seem perfectly doable. Sadly, if you want to search by hour rather than by date, there appears to be no built-in way to do that, as I found out when I asked a question specifically about this:

I now just have two datetime triples on my nodes: one for the date and one for the time. If you find a more elegant solution please let me know!
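A minimal sketch of that two-value split, in Python, in case it helps. The function name `split_timestamp` and the choice of 1970-01-01 as the fixed date carrying the time of day are assumptions made for illustration, not anything Dgraph prescribes. Both values are written as RFC 3339 strings so they can be stored as ordinary datetime values.

```python
from datetime import datetime, timezone

# Assumption: every time-of-day value is pinned to the same arbitrary date,
# so that comparing two of them compares only the hour/minute/second.
REFERENCE_YEAR, REFERENCE_MONTH, REFERENCE_DAY = 1970, 1, 1


def split_timestamp(ts):
    """Split one timestamp into (date-only value, time-only value)."""
    # Date part: same day, clock reset to midnight.
    date_part = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    # Time part: same clock time, calendar date replaced by the reference date.
    time_part = ts.replace(
        year=REFERENCE_YEAR, month=REFERENCE_MONTH, day=REFERENCE_DAY
    )
    return date_part.isoformat(), time_part.isoformat()


ts = datetime(2019, 3, 14, 9, 30, tzinfo=timezone.utc)
date_value, time_value = split_timestamp(ts)
print(date_value)  # 2019-03-14T00:00:00+00:00
print(time_value)  # 1970-01-01T09:30:00+00:00
```

With this layout, "between 00:00 and 12:00" becomes an ordinary range comparison on the time value (from 1970-01-01T00:00:00 up to 1970-01-01T12:00:00), while the date value still supports the usual by-day searches.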