I use Dgraph v22. After I deleted all nodes and relations, the p folder grew by 100 GB. Most of that is vlog files, and the sst files did not shrink either.
I used the following N-Quad to delete all nodes and relations:
Without knowing the specifics of your graph, I ran the following test:
I loaded the 1 million movie dataset via the live loader. Here is ls -l of the p directory:
/Users/matthew/test-dgraph-data/p:
total 152224
-rw-r--r-- 1 matthew staff 22424252 Dec 28 13:34 000001.sst
-rw-r--r-- 1 matthew staff 2147483646 Dec 28 13:27 000001.vlog
-rw-r--r-- 1 matthew staff 21006320 Dec 28 13:34 000002.sst
-rw-r--r-- 1 matthew staff 134217728 Dec 28 13:34 00003.mem
-rw-r--r-- 1 matthew staff 1048576 Dec 28 13:27 DISCARD
-rw------- 1 matthew staff 28 Dec 28 13:26 KEYREGISTRY
-rw-r--r-- 1 matthew staff 2 Dec 28 13:26 LOCK
-rw------- 1 matthew staff 44 Dec 28 13:34 MANIFEST
After deleting all nodes and relations, the same directory:
-rw-r--r-- 1 matthew staff 22424252 Dec 28 13:34 000001.sst
-rw-r--r-- 1 matthew staff 2147483646 Dec 28 13:59 000001.vlog
-rw-r--r-- 1 matthew staff 21006320 Dec 28 13:34 000002.sst
-rw-r--r-- 1 matthew staff 18763120 Dec 28 13:59 000003.sst
-rw-r--r-- 1 matthew staff 134217728 Dec 28 13:59 00004.mem
-rw-r--r-- 1 matthew staff 1048576 Dec 28 13:27 DISCARD
-rw------- 1 matthew staff 28 Dec 28 13:26 KEYREGISTRY
-rw-r--r-- 1 matthew staff 2 Dec 28 13:26 LOCK
-rw------- 1 matthew staff 58 Dec 28 13:59 MANIFEST
Note that the write-ahead/value log (.vlog) didn’t grow beyond its initial 2 GB size. This was a fresh graph, though, so I’m guessing things are different for you if your size increase was >100 GB.
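To compare the p directory before and after a delete without eyeballing ls -l output, a small script can total the bytes per file type. This is only a sketch: the function name summarize_p_dir is made up for illustration, and it assumes all files of interest sit directly in the p directory, as in the listings above.

```python
import os
from collections import defaultdict


def summarize_p_dir(p_dir):
    """Return total bytes per file extension for files directly inside p_dir."""
    totals = defaultdict(int)
    for entry in os.scandir(p_dir):
        if not entry.is_file():
            continue
        # Files such as MANIFEST or DISCARD have no extension,
        # so they are grouped under their own file name.
        ext = os.path.splitext(entry.name)[1] or entry.name
        totals[ext] += entry.stat().st_size
    return dict(totals)


# Example call (path is hypothetical):
# for ext, size in sorted(summarize_p_dir("/path/to/p").items()):
#     print(f"{ext:12} {size / 2**20:10.1f} MiB")
```

Running it once after the load and once after the delete makes it easy to see whether the growth is in .vlog or .sst files.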
One thing I’ve done in the past is to back up the db using badger, which seems to “clear out” (my words) the vlog. Stop the cluster and then run badger backup --dir p. After this, the vlogs are smaller:
total 136264
-rw-r--r-- 1 matthew staff 22424252 Dec 28 13:34 000001.sst
-rw-r--r-- 1 matthew staff 1144245 Dec 28 14:15 000001.vlog
-rw-r--r-- 1 matthew staff 21006320 Dec 28 13:34 000002.sst
-rw-r--r-- 1 matthew staff 20 Dec 28 14:17 000002.vlog
-rw-r--r-- 1 matthew staff 18763120 Dec 28 13:59 000003.sst
-rw-r--r-- 1 matthew staff 5359340 Dec 28 14:15 000004.sst
-rw-r--r-- 1 matthew staff 1048576 Dec 28 13:27 DISCARD
-rw------- 1 matthew staff 28 Dec 28 13:26 KEYREGISTRY
-rw------- 1 matthew staff 72 Dec 28 14:17 MANIFEST
One final note: if your goal is to remove ALL nodes (as you mentioned in the title), a more efficient way is available through the alpha’s /alter endpoint: curl -X POST localhost:8080/alter -d '{"drop_op": "DATA"}'
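If you would rather issue the same request from a script, here is a rough Python equivalent of that curl call. The helper names are made up for illustration, and it assumes the alpha listens on localhost:8080 with no ACL or auth token required; adjust for your setup.

```python
import json
import urllib.request


def build_drop_data_request(alpha="http://localhost:8080"):
    """Build the POST request that the curl command above sends to /alter."""
    body = json.dumps({"drop_op": "DATA"}).encode("utf-8")
    return urllib.request.Request(f"{alpha}/alter", data=body, method="POST")


def drop_all_data(alpha="http://localhost:8080"):
    """Send the request and return the alpha's JSON reply. This deletes data."""
    with urllib.request.urlopen(build_drop_data_request(alpha), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Building the request is kept separate from sending it so you can inspect exactly what will be posted before running something this destructive.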
I won’t pretend to understand when/why new .sst files (kv-stores) are created, but at some point RunValueLogGC (a badger function) will be called by the alpha to clean up vlogs. Also, running badger info on the p directory actually deleted one of the stale vlogs.
Check out this thread for more details on how to flatten sst files and other insights on how Dgraph manages the files in the p folder: Database becomes much smaller when reimported
Yes, at some point the alpha seems to get around to calling RunValueLogGC and things get cleaned up.
One other thing I’ve discovered is that if I invoke /admin/shutdown to stop an alpha, the p folder is in a better state than if I simply stop the alpha’s container (SIGTERM).