Hey folks,
I’m affected by the same issue as the original poster. I see this was ported over from GitHub back in 2020. @jarifibrahim mentioned they’d add it to the backlog, but I imagine that, as with most nontrivial projects, new tickets and feature requests come in all the time and older items like this one are hard to get back to. Is there any chance you could please revisit this?
I found BadgerDB recently, and was very impressed by it. I’d like to use it to replace SQLite as the catalog layer for a storage system. I’ve gotten to a point where the SQL queries are a significant portion of the system’s complexity, and Badger seems to be a great solution for that.
I need to store metadata about files, some of which are streamed in and stored in fragments. I need to reference individual fragments for deduplication purposes. There can easily be 100k fragments in a file, potentially more. Of course, I need atomicity when storing a given file. However, with the limits described here, there’s a good chance I won’t be able to store some files in a single transaction. That’s a non-starter: if the process dies at the wrong time, the DB is left in an inconsistent state with dangling references.
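To make that concrete, here’s a minimal sketch of the failure mode I’m worried about. The `file/<id>/frag/<n>` key layout and the `storeFile` helper are made up for illustration (I’m using the v3 import path, but the API is the same across recent versions):

```go
import (
	"fmt"

	badger "github.com/dgraph-io/badger/v3"
)

// storeFile tries to write all fragment metadata for one file atomically.
func storeFile(db *badger.DB, fileID string, frags [][]byte) error {
	return db.Update(func(txn *badger.Txn) error {
		for i, meta := range frags {
			key := []byte(fmt.Sprintf("file/%s/frag/%08d", fileID, i))
			if err := txn.Set(key, meta); err != nil {
				// With ~100k fragments this returns badger.ErrTxnTooBig long
				// before the loop finishes, so the file can't be committed
				// in one transaction.
				return err
			}
		}
		return nil
	})
}
```

The usual advice is to fall back to a WriteBatch, but that gives up exactly the atomicity I need here.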
I also have other operations which need atomicity and can grow quite large, e.g. deleting a “partition” of sorts. A partition would typically have 100k–1M files, possibly more. This would be much easier to implement correctly with atomicity; I could probably get away with something like db.DropPrefix for part of it if I design the schema just right, but there will be dangling references (e.g. indexes) which need updating. Without atomicity, that again means that if the process dies at the wrong time, the DB is left inconsistent.
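For reference, this is roughly what that deletion looks like with the current API (same import as above; the `part/` and `idx/` prefixes are made up):

```go
// deletePartition drops the partition's data, then cleans up its index
// entries. The two steps are not atomic.
func deletePartition(db *badger.DB, partID string) error {
	if err := db.DropPrefix([]byte("part/" + partID + "/")); err != nil {
		return err
	}
	// Crash window here: the data is gone, but the index entries still
	// reference it.
	return db.Update(func(txn *badger.Txn) error {
		opts := badger.DefaultIteratorOptions
		opts.PrefetchValues = false
		it := txn.NewIterator(opts)
		defer it.Close()
		prefix := []byte("idx/part/" + partID + "/")
		for it.Seek(prefix); it.ValidForPrefix(prefix); it.Next() {
			if err := txn.Delete(it.Item().KeyCopy(nil)); err != nil {
				// For a 100k-1M file partition this cleanup can itself fail
				// with badger.ErrTxnTooBig.
				return err
			}
		}
		return nil
	})
}
```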
I could try to implement a journaling layer on top of Badger, but that really defeats the whole purpose. I’m bound to get it subtly wrong in hard-to-detect ways, and I’d basically just be implementing a DBMS on top of another DB…
Some of this might be doable if we could write a tree of keys under a separate namespace (different prefix) and then atomically move it (i.e. rename all keys with that prefix). That way, we could do things RCU-style, as long as the atomic rename didn’t have size limitations. The ideal solution, however, would be arbitrarily large transactions.
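The closest approximation I can see with the existing API is to avoid the rename entirely: build the new tree under a fresh, generation-numbered prefix (across as many transactions as needed), then flip a small pointer key in one tiny transaction and have readers resolve the current generation through it. A sketch, with made-up `gen/<n>/` and `cat/current` keys:

```go
// publishGeneration assumes the new tree has already been written under
// "gen/<newGen>/..." across however many transactions it took. The only
// atomic step is flipping the pointer, which always fits in one transaction.
func publishGeneration(db *badger.DB, newGen string) error {
	if err := db.Update(func(txn *badger.Txn) error {
		return txn.Set([]byte("cat/current"), []byte(newGen))
	}); err != nil {
		return err
	}
	// The old generation's keys can then be dropped lazily, e.g. via
	// db.DropPrefix on its "gen/<oldGen>/" prefix.
	return nil
}
```

That covers whole-tree swaps, but it doesn’t help with the dangling-reference cases above, which is why arbitrarily large transactions would still be the real fix.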
Could you please revive this request?