UPDATE: We have upgraded the machine to have 96 GB RAM.
After figuring out that the w directory is the reason for most of the RAM consumption (it is the only item in the heap profile that accounts for most of the RAM), I ran flatten on the w directory. With the existing w and p directories, RAM (RES) consumption was going up to 60 GB. Before running flatten on the w directory, the sizes of the p and w directories were as below:
510G ./p
28G ./w
358M ./zw
After running flatten on the w directory, its size was:
1.8GB ./w
After restarting the cluster, I continued running the live loader. Currently the directories have the following sizes:
701G ./p
27G ./w
545M ./zw
We can see that the w directory has grown back to almost its previous size.
I also captured the time taken to open the p (~700 GB) and w (~27 GB) directories:
W directory
I0728 14:38:45.641366 18411 log.go:34] 99 tables out of 1525 opened in 3.012s
I0728 14:38:48.634343 18411 log.go:34] 206 tables out of 1525 opened in 6.005s
I0728 14:38:51.643964 18411 log.go:34] 325 tables out of 1525 opened in 9.015s
I0728 14:38:54.653474 18411 log.go:34] 424 tables out of 1525 opened in 12.025s
I0728 14:38:57.637190 18411 log.go:34] 528 tables out of 1525 opened in 15.008s
I0728 14:39:00.639431 18411 log.go:34] 599 tables out of 1525 opened in 18.01s
I0728 14:39:03.676010 18411 log.go:34] 683 tables out of 1525 opened in 21.047s
I0728 14:39:06.649594 18411 log.go:34] 748 tables out of 1525 opened in 24.021s
I0728 14:39:09.642489 18411 log.go:34] 850 tables out of 1525 opened in 27.014s
I0728 14:39:12.632476 18411 log.go:34] 961 tables out of 1525 opened in 30.004s
I0728 14:39:15.632171 18411 log.go:34] 1065 tables out of 1525 opened in 33.003s
I0728 14:39:18.632756 18411 log.go:34] 1186 tables out of 1525 opened in 36.004s
I0728 14:39:21.668542 18411 log.go:34] 1278 tables out of 1525 opened in 39.04s
I0728 14:39:24.646270 18411 log.go:34] 1401 tables out of 1525 opened in 42.017s
I0728 14:39:27.928709 18411 log.go:34] 1480 tables out of 1525 opened in 45.3s
I0728 14:39:29.747162 18411 log.go:34] All 1525 tables opened in 47.118s
P directory
I0728 14:39:33.312447 18411 log.go:34] 539 tables out of 6798 opened in 3.006s
I0728 14:39:36.329768 18411 log.go:34] 907 tables out of 6798 opened in 6.023s
I0728 14:39:39.321231 18411 log.go:34] 1136 tables out of 6798 opened in 9.015s
I0728 14:39:42.307248 18411 log.go:34] 1317 tables out of 6798 opened in 12.001s
I0728 14:39:45.308684 18411 log.go:34] 2051 tables out of 6798 opened in 15.002s
I0728 14:39:48.321764 18411 log.go:34] 2343 tables out of 6798 opened in 18.015s
I0728 14:39:51.312440 18411 log.go:34] 2436 tables out of 6798 opened in 21.006s
I0728 14:39:54.327446 18411 log.go:34] 2550 tables out of 6798 opened in 24.021s
I0728 14:39:57.312190 18411 log.go:34] 2671 tables out of 6798 opened in 27.006s
I0728 14:40:00.314967 18411 log.go:34] 2790 tables out of 6798 opened in 30.008s
I0728 14:40:03.314505 18411 log.go:34] 2896 tables out of 6798 opened in 33.008s
I0728 14:40:06.325143 18411 log.go:34] 3016 tables out of 6798 opened in 36.018s
I0728 14:40:09.492198 18411 log.go:34] 3138 tables out of 6798 opened in 39.186s
I0728 14:40:12.310874 18411 log.go:34] 3271 tables out of 6798 opened in 42.004s
I0728 14:40:15.412023 18411 log.go:34] 3409 tables out of 6798 opened in 45.105s
I0728 14:40:18.323595 18411 log.go:34] 3531 tables out of 6798 opened in 48.017s
I0728 14:40:21.395090 18411 log.go:34] 3647 tables out of 6798 opened in 51.088s
I0728 14:40:24.308413 18411 log.go:34] 3771 tables out of 6798 opened in 54.002s
I0728 14:40:27.383428 18411 log.go:34] 3906 tables out of 6798 opened in 57.077s
I0728 14:40:30.341263 18411 log.go:34] 4011 tables out of 6798 opened in 1m0.035s
I0728 14:40:33.350558 18411 log.go:34] 4147 tables out of 6798 opened in 1m3.044s
I0728 14:40:36.316099 18411 log.go:34] 4267 tables out of 6798 opened in 1m6.009s
I0728 14:40:39.309938 18411 log.go:34] 4395 tables out of 6798 opened in 1m9.003s
I0728 14:40:42.310931 18411 log.go:34] 4540 tables out of 6798 opened in 1m12.004s
I0728 14:40:45.321744 18411 log.go:34] 4654 tables out of 6798 opened in 1m15.015s
I0728 14:40:48.349800 18411 log.go:34] 4802 tables out of 6798 opened in 1m18.043s
I0728 14:40:51.316373 18411 log.go:34] 4924 tables out of 6798 opened in 1m21.01s
I0728 14:40:54.353456 18411 log.go:34] 5050 tables out of 6798 opened in 1m24.047s
I0728 14:40:57.336612 18411 log.go:34] 5190 tables out of 6798 opened in 1m27.03s
I0728 14:41:00.308123 18411 log.go:34] 5291 tables out of 6798 opened in 1m30.001s
I0728 14:41:03.309762 18411 log.go:34] 5380 tables out of 6798 opened in 1m33.003s
I0728 14:41:06.337339 18411 log.go:34] 5522 tables out of 6798 opened in 1m36.031s
I0728 14:41:09.310620 18411 log.go:34] 5639 tables out of 6798 opened in 1m39.004s
I0728 14:41:12.324492 18411 log.go:34] 5752 tables out of 6798 opened in 1m42.018s
I0728 14:41:15.311486 18411 log.go:34] 5875 tables out of 6798 opened in 1m45.005s
I0728 14:41:18.363185 18411 log.go:34] 6006 tables out of 6798 opened in 1m48.057s
I0728 14:41:21.354727 18411 log.go:34] 6108 tables out of 6798 opened in 1m51.048s
I0728 14:41:24.359366 18411 log.go:34] 6203 tables out of 6798 opened in 1m54.053s
I0728 14:41:27.307654 18411 log.go:34] 6294 tables out of 6798 opened in 1m57.001s
I0728 14:41:30.310361 18411 log.go:34] 6374 tables out of 6798 opened in 2m0.004s
I0728 14:41:33.331970 18411 log.go:34] 6454 tables out of 6798 opened in 2m3.025s
I0728 14:41:36.310228 18411 log.go:34] 6531 tables out of 6798 opened in 2m6.004s
I0728 14:41:39.423715 18411 log.go:34] 6625 tables out of 6798 opened in 2m9.117s
I0728 14:41:42.334863 18411 log.go:34] 6713 tables out of 6798 opened in 2m12.028s
I0728 14:41:45.307700 18411 log.go:34] 6781 tables out of 6798 opened in 2m15.001s
I0728 14:41:46.277541 18411 log.go:34] All 6798 tables opened in 2m15.971s
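To compare the two directories on equal terms, the final "All N tables opened" line of each log can be turned into a tables-per-second figure. The following is only a rough sketch: the function names are made up for illustration, and it assumes the durations only ever use the minutes and seconds forms that appear in the logs above (e.g. `47.118s`, `2m15.971s`).

```python
import re

# Matches the final summary line, e.g. "All 1525 tables opened in 47.118s".
DONE = re.compile(r"All (\d+) tables opened in (\S+)")
# Assumption: durations look like "47.118s" or "2m15.971s" (no hours, no ms).
DURATION = re.compile(r"(?:(\d+)m)?([\d.]+)s$")


def seconds(text):
    """Convert a duration such as '47.118s' or '2m15.971s' to seconds."""
    m = DURATION.match(text)
    if m is None:
        raise ValueError(f"unsupported duration format: {text!r}")
    minutes = int(m.group(1) or 0)
    return minutes * 60 + float(m.group(2))


def summarize(lines):
    """Return (tables, elapsed_seconds, tables_per_second) from the first
    'All N tables opened' line found, or None if there is no such line."""
    for line in lines:
        m = DONE.search(line)
        if m:
            tables = int(m.group(1))
            elapsed = seconds(m.group(2))
            return tables, elapsed, tables / elapsed
    return None


w_log = ["I0728 14:39:29.747162 18411 log.go:34] All 1525 tables opened in 47.118s"]
p_log = ["I0728 14:41:46.277541 18411 log.go:34] All 6798 tables opened in 2m15.971s"]

for name, log in (("w", w_log), ("p", p_log)):
    tables, elapsed, rate = summarize(log)
    print(f"{name}: {tables} tables in {elapsed:.3f}s ({rate:.1f} tables/s)")
```

With the numbers above this works out to roughly 32 tables/s for w and roughly 50 tables/s for p, so per table the w directory is opening more slowly than p even though it is far smaller on disk.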
Next thing to figure out
Until now I have mostly focused on loading data into Dgraph. I will now start experimenting with retrieving data.
One thing I have observed in the existing data directory is that many SSTs are smaller than the usual 64 MB. I am seeing many SSTs of around 20 MB and am trying to figure out why.
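To put a number on how many SSTs are undersized, a small script can bucket the file sizes. This is a minimal sketch, assuming the SSTs are the `*.sst` files sitting directly inside the data directory; the helper names and the 10 MB bucket width are arbitrary choices for illustration.

```python
import os


def sst_sizes_mb(directory):
    """Return the sizes (in MB) of all .sst files directly inside `directory`."""
    sizes = []
    for entry in os.scandir(directory):
        if entry.is_file() and entry.name.endswith(".sst"):
            sizes.append(entry.stat().st_size / (1024 * 1024))
    return sizes


def bucket_counts(sizes_mb, width_mb=10):
    """Count files per size bucket; key 20 means 20 MB <= size < 30 MB."""
    counts = {}
    for size in sizes_mb:
        bucket = int(size // width_mb) * width_mb
        counts[bucket] = counts.get(bucket, 0) + 1
    return dict(sorted(counts.items()))
```

Running `bucket_counts(sst_sizes_mb("./p"))` against the p directory would show whether the ~20 MB files are a handful of outliers or a large share of the 6798 tables.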