Understanding Commit Log Package

Hey @mrjn

I was reading up on the commit log package and the posting package to understand how they interact before I start implementing write-ahead logs. I think I have some idea now, and I have a couple of questions.

  1. What exactly does the cache do?
    Are mutations written to the cache and then to the logs?

  2. Could you give me a general picture of what happens here? How do the cache, the logs, and the posting lists interact?

I think I’ll have a very good understanding of this package by the weekend and can then start modifying it.

Let’s do a whiteboarding session? I’ll go pick up my Wacom tablet, which is gathering dust in the office, and put it to some use.

But just to quickly answer the questions:

  1. The problem I had with commit logs was that posting lists (PLs) were being lazily initialized. So every time a PL ran init(), we had to replay the commit logs and check whether they held anything for the PL at hand. That’s where the cache came in handy. We’d lazily load the entire log into the cache, so PL inits would be faster, and all the PLs could reuse this cache. In retrospect, lazy init is complicated. We shouldn’t be doing it anymore.

  2. Mutations to a PL get written to its 2 mutation layers, then to the latest commit log and the cache corresponding to that commit log. This allows any new PL init() to automatically get the latest mutations from the cache, without hitting disk for the commit log.
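
To make the flow in the two answers above a bit more concrete, here is a minimal Go sketch. It is only an illustration under stated assumptions: every name in it (Mutation, CommitLog, PostingList, InitPL, AddMutation) is hypothetical and not taken from the actual commit or posting packages, a slice stands in for the on-disk log file, and the two mutation layers are collapsed into one.

```go
package main

// Mutation is a hypothetical stand-in for one edit to a posting list (PL).
type Mutation struct {
	Key   string // identifies the PL the mutation belongs to
	Value string // payload; the real entry format is not shown in this thread
}

// CommitLog models the latest commit log together with its cache.
// Assumption: the disk slice stands in for the on-disk file.
type CommitLog struct {
	disk    []Mutation            // append-only record, in write order
	cache   map[string][]Mutation // same entries grouped by PL key; nil until first use
	replays int                   // how many times the "file" was replayed
}

// loadCache lazily replays the whole log into the cache, once.
func (c *CommitLog) loadCache() {
	if c.cache != nil {
		return
	}
	c.cache = make(map[string][]Mutation)
	for _, m := range c.disk {
		c.cache[m.Key] = append(c.cache[m.Key], m)
	}
	c.replays++
}

// Append writes a mutation to the log and mirrors it into the cache
// if the cache has already been loaded.
func (c *CommitLog) Append(m Mutation) {
	c.disk = append(c.disk, m)
	if c.cache != nil {
		c.cache[m.Key] = append(c.cache[m.Key], m)
	}
}

// PostingList holds the mutations applied to one key.
type PostingList struct {
	key   string
	layer []Mutation // simplified: the thread mentions two mutation layers
	log   *CommitLog
}

// InitPL lazily builds a PL and picks up earlier mutations for its key
// from the shared cache instead of replaying the log itself.
func InitPL(key string, cl *CommitLog) *PostingList {
	cl.loadCache()
	pl := &PostingList{key: key, log: cl}
	pl.layer = append(pl.layer, cl.cache[key]...)
	return pl
}

// AddMutation applies the mutation to the PL first, then records it in
// the commit log (which also updates the cache).
func (pl *PostingList) AddMutation(value string) {
	m := Mutation{Key: pl.key, Value: value}
	pl.layer = append(pl.layer, m)
	pl.log.Append(m)
}
```

In this sketch, a second InitPL for the same key sees every mutation added through the first one, and the log is replayed only once no matter how many PLs are initialized, which is the benefit the cache was meant to provide.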

That would be awesome.

Might be later at night Sydney time. Don’t want to get stuck in traffic :-).

Yeah, we could do the whiteboarding session tomorrow too. Let’s get the release out today.
