In my case, the snapshot creation process started and consumed a considerable amount of RAM, causing dgraph alpha to die due to OOM. I solved it by increasing the available RAM from 8GB to 16GB, and now I would like to know more about snapshots in dgraph. Unfortunately I could not find much in the documentation, so here are my questions:
How often is it run?
According to my findings it happens when a certain number of queries have been committed, am I right?
Is it possible to run it manually? (hence running it more often and reducing the quantity of RAM used)
What does the process do internally? Is it for log compaction?
Is it safe to start alpha again after OOM or could there be data corruption?
dgraph alpha -h | grep snapshot
--snapshot_after int
Create a new Raft snapshot after this many number of Raft entries.
The lower this number, the more frequent snapshot creation would be.
Also determines how often Rollups would happen. (default 10000)
Yes, and I think it is a combination of things.
From docs:
When an Alpha pod restarts in a replicated cluster, it will join as a new member of the cluster, be assigned a group and an unused index from Zero, and receive the latest snapshot from the Alpha leader of the group.
Dgraph supports distributed ACID transactions through snapshot isolation.
One of the ways to do this is snapshotting. As soon as the state machine is synced to disk, the logs can be discarded.
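To make the log compaction idea concrete, here is a minimal toy sketch in Python. It is not Dgraph's code: the ToyNode class and all of its fields are made up for illustration, and the "disk" is just a copied dict. It only shows the general pattern quoted above: once the log has grown by snapshot_after entries, the current state is saved and the entries it covers are dropped.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Toy replicated-log node (hypothetical, NOT Dgraph's implementation)."""
    snapshot_after: int = 10000                    # same default as the flag's help text
    state: dict = field(default_factory=dict)      # in-memory state machine
    log: list = field(default_factory=list)        # entries since the last snapshot
    snapshot: dict = field(default_factory=dict)   # stand-in for state synced to disk
    snapshot_index: int = 0                        # number of entries covered by the snapshot

    def apply(self, key, value):
        """Append an entry, apply it, and snapshot once the log is long enough."""
        self.log.append((key, value))
        self.state[key] = value
        if len(self.log) >= self.snapshot_after:
            self.take_snapshot()

    def take_snapshot(self):
        """Save the current state, then discard the log entries it already covers."""
        self.snapshot = dict(self.state)
        self.snapshot_index += len(self.log)
        self.log.clear()


node = ToyNode(snapshot_after=3)
for i in range(7):
    node.apply(f"k{i}", i)

# Two snapshots were taken (after entries 3 and 6); one entry is still in the log.
print(node.snapshot_index, len(node.log))
```

In this toy model a lower snapshot_after means the log is truncated more often, so fewer entries pile up between snapshots. Whether that translates into lower peak RAM in a real alpha depends on what Dgraph actually does during a snapshot, which this sketch does not model.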