Rollup - Error in readListPart()

@Tom_Hollingworth This was an issue before v22. If you are on the latest version, it shouldn’t happen to you anymore. Let me explain the reason. When a posting list grows in size, it gets split. We write these split keys individually in Badger, and the original main key just stores the list of split keys. Whenever we write new split keys, we delete the old ones. Here was the bug: if there’s a cluster restart just after the old split keys are deleted, the main key gets overwritten by WAL replay. It forgets that it made new splits and tries to go back to the old splits, which have already been deleted.
There was a fix for WAL replay in v22, but it caused other inconsistency issues. The essence of the fix is still the same: we prevent this overwrite during restart.
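
To make the failure sequence concrete, here is a toy model in Go. It is only a sketch: `toyStore`, `rollup`, `readList`, `simulateBug`, and the `k/split@N` key names are all made up for illustration and are not Dgraph or Badger APIs. The stale main key is restored by hand to stand in for WAL replay.

```go
package main

import "fmt"

// toyStore is a hypothetical in-memory stand-in for Badger, not the real API.
type toyStore struct {
	main   map[string][]string // main key -> names of its split keys
	splits map[string]string   // split key -> payload
}

// rollup writes the new split keys, points the main key at them,
// and then deletes the old split keys.
func (s *toyStore) rollup(key string, newSplits []string) {
	old := s.main[key]
	for _, name := range newSplits {
		s.splits[name] = "data"
	}
	s.main[key] = newSplits
	for _, name := range old {
		delete(s.splits, name)
	}
}

// readList mimics reading every split part that the main key references.
func (s *toyStore) readList(key string) error {
	for _, name := range s.main[key] {
		if _, ok := s.splits[name]; !ok {
			return fmt.Errorf("missing split key %q", name)
		}
	}
	return nil
}

// simulateBug returns the read result before and after a simulated restart.
func simulateBug() (before, after error) {
	s := &toyStore{main: map[string][]string{}, splits: map[string]string{}}
	s.rollup("k", []string{"k/split@5"})

	// Copy of the main key as the WAL still remembers it (assumption of this model).
	stale := append([]string(nil), s.main["k"]...)

	s.rollup("k", []string{"k/split@9"}) // new splits written, old ones deleted
	before = s.readList("k")

	s.main["k"] = stale // restart: replay overwrites the main key
	after = s.readList("k")
	return before, after
}
```

Calling `simulateBug()` gives a nil error before the simulated restart and a "missing split key" error after it, because the main key points at `k/split@5` again while only `k/split@9` still exists.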
There’s a way to fix this corruption. I wrote a script to fix the p directory about 3 years ago. Let me see if I can find it, but it’s most likely lost. Basically, what you need to do is find the new timestamp for the lost split key, then update the main key to point to that new timestamp.
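
The repair idea can be sketched roughly like this in Go. This is not the lost script and it does not touch a real p directory. `splitRef` and `repairMainKey` are hypothetical names, and the model assumes each split part is identified by an ID plus the timestamp it was written at, and that the surviving version has the same ID as the lost one.

```go
package main

import "fmt"

// splitRef is a hypothetical (split ID, timestamp) pair naming one stored split part.
type splitRef struct {
	ID uint64
	Ts uint64
}

// repairMainKey returns a copy of refs (what the main key points at) in which
// every reference to a missing split part is moved to the newest timestamp
// that still exists on disk for the same split ID.
func repairMainKey(refs, onDisk []splitRef) ([]splitRef, error) {
	exists := make(map[splitRef]bool, len(onDisk))
	newest := make(map[uint64]uint64)
	for _, ref := range onDisk {
		exists[ref] = true
		if cur, ok := newest[ref.ID]; !ok || ref.Ts > cur {
			newest[ref.ID] = ref.Ts
		}
	}

	fixed := make([]splitRef, 0, len(refs))
	for _, ref := range refs {
		if !exists[ref] {
			ts, ok := newest[ref.ID]
			if !ok {
				return nil, fmt.Errorf("no surviving version of split %d", ref.ID)
			}
			ref.Ts = ts // point at the split part that was actually kept
		}
		fixed = append(fixed, ref)
	}
	return fixed, nil
}
```

For example, if the main key references splits `{1, 5}` and `{2, 5}` but only `{1, 9}` and `{2, 5}` are on disk, the function returns `{1, 9}` and `{2, 5}`. A real fix would also have to write the updated main key back to Badger, which this sketch leaves out.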