[WIP] Performance bottleneck on 1 TB data

UPDATE: We have upgraded the machine to have 96 GB RAM.

After figuring out that the w directory is the reason for most of the RAM consumption (it is the only thing in the heap profile that consumes a large share of RAM), I ran flatten on the w directory. With the existing w and p directories, RAM (RES) consumption was going up to 60 GB. Before running flatten on the w directory, the directory sizes were as follows:

510G	./p
28G	./w
358M	./zw

After running flatten on the w directory, its size was:

1.8GB   ./w

After restarting the cluster, I continued running the live loader. Currently the directory sizes are:

701G	./p
27G	./w
545M	./zw

We can see that the w directory has grown back to almost its earlier size (27 GB now versus 28 GB before the flatten).

I also captured the time taken to open the p (~700 GB) and w (~27 GB) directories on startup:

W directory

I0728 14:38:45.641366   18411 log.go:34] 99 tables out of 1525 opened in 3.012s
I0728 14:38:48.634343   18411 log.go:34] 206 tables out of 1525 opened in 6.005s
I0728 14:38:51.643964   18411 log.go:34] 325 tables out of 1525 opened in 9.015s
I0728 14:38:54.653474   18411 log.go:34] 424 tables out of 1525 opened in 12.025s
I0728 14:38:57.637190   18411 log.go:34] 528 tables out of 1525 opened in 15.008s
I0728 14:39:00.639431   18411 log.go:34] 599 tables out of 1525 opened in 18.01s
I0728 14:39:03.676010   18411 log.go:34] 683 tables out of 1525 opened in 21.047s
I0728 14:39:06.649594   18411 log.go:34] 748 tables out of 1525 opened in 24.021s
I0728 14:39:09.642489   18411 log.go:34] 850 tables out of 1525 opened in 27.014s
I0728 14:39:12.632476   18411 log.go:34] 961 tables out of 1525 opened in 30.004s
I0728 14:39:15.632171   18411 log.go:34] 1065 tables out of 1525 opened in 33.003s
I0728 14:39:18.632756   18411 log.go:34] 1186 tables out of 1525 opened in 36.004s
I0728 14:39:21.668542   18411 log.go:34] 1278 tables out of 1525 opened in 39.04s
I0728 14:39:24.646270   18411 log.go:34] 1401 tables out of 1525 opened in 42.017s
I0728 14:39:27.928709   18411 log.go:34] 1480 tables out of 1525 opened in 45.3s
I0728 14:39:29.747162   18411 log.go:34] All 1525 tables opened in 47.118s

P directory

I0728 14:39:33.312447   18411 log.go:34] 539 tables out of 6798 opened in 3.006s
I0728 14:39:36.329768   18411 log.go:34] 907 tables out of 6798 opened in 6.023s
I0728 14:39:39.321231   18411 log.go:34] 1136 tables out of 6798 opened in 9.015s
I0728 14:39:42.307248   18411 log.go:34] 1317 tables out of 6798 opened in 12.001s
I0728 14:39:45.308684   18411 log.go:34] 2051 tables out of 6798 opened in 15.002s
I0728 14:39:48.321764   18411 log.go:34] 2343 tables out of 6798 opened in 18.015s
I0728 14:39:51.312440   18411 log.go:34] 2436 tables out of 6798 opened in 21.006s
I0728 14:39:54.327446   18411 log.go:34] 2550 tables out of 6798 opened in 24.021s
I0728 14:39:57.312190   18411 log.go:34] 2671 tables out of 6798 opened in 27.006s
I0728 14:40:00.314967   18411 log.go:34] 2790 tables out of 6798 opened in 30.008s
I0728 14:40:03.314505   18411 log.go:34] 2896 tables out of 6798 opened in 33.008s
I0728 14:40:06.325143   18411 log.go:34] 3016 tables out of 6798 opened in 36.018s
I0728 14:40:09.492198   18411 log.go:34] 3138 tables out of 6798 opened in 39.186s
I0728 14:40:12.310874   18411 log.go:34] 3271 tables out of 6798 opened in 42.004s
I0728 14:40:15.412023   18411 log.go:34] 3409 tables out of 6798 opened in 45.105s
I0728 14:40:18.323595   18411 log.go:34] 3531 tables out of 6798 opened in 48.017s
I0728 14:40:21.395090   18411 log.go:34] 3647 tables out of 6798 opened in 51.088s
I0728 14:40:24.308413   18411 log.go:34] 3771 tables out of 6798 opened in 54.002s
I0728 14:40:27.383428   18411 log.go:34] 3906 tables out of 6798 opened in 57.077s
I0728 14:40:30.341263   18411 log.go:34] 4011 tables out of 6798 opened in 1m0.035s
I0728 14:40:33.350558   18411 log.go:34] 4147 tables out of 6798 opened in 1m3.044s
I0728 14:40:36.316099   18411 log.go:34] 4267 tables out of 6798 opened in 1m6.009s
I0728 14:40:39.309938   18411 log.go:34] 4395 tables out of 6798 opened in 1m9.003s
I0728 14:40:42.310931   18411 log.go:34] 4540 tables out of 6798 opened in 1m12.004s
I0728 14:40:45.321744   18411 log.go:34] 4654 tables out of 6798 opened in 1m15.015s
I0728 14:40:48.349800   18411 log.go:34] 4802 tables out of 6798 opened in 1m18.043s
I0728 14:40:51.316373   18411 log.go:34] 4924 tables out of 6798 opened in 1m21.01s
I0728 14:40:54.353456   18411 log.go:34] 5050 tables out of 6798 opened in 1m24.047s
I0728 14:40:57.336612   18411 log.go:34] 5190 tables out of 6798 opened in 1m27.03s
I0728 14:41:00.308123   18411 log.go:34] 5291 tables out of 6798 opened in 1m30.001s
I0728 14:41:03.309762   18411 log.go:34] 5380 tables out of 6798 opened in 1m33.003s
I0728 14:41:06.337339   18411 log.go:34] 5522 tables out of 6798 opened in 1m36.031s
I0728 14:41:09.310620   18411 log.go:34] 5639 tables out of 6798 opened in 1m39.004s
I0728 14:41:12.324492   18411 log.go:34] 5752 tables out of 6798 opened in 1m42.018s
I0728 14:41:15.311486   18411 log.go:34] 5875 tables out of 6798 opened in 1m45.005s
I0728 14:41:18.363185   18411 log.go:34] 6006 tables out of 6798 opened in 1m48.057s
I0728 14:41:21.354727   18411 log.go:34] 6108 tables out of 6798 opened in 1m51.048s
I0728 14:41:24.359366   18411 log.go:34] 6203 tables out of 6798 opened in 1m54.053s
I0728 14:41:27.307654   18411 log.go:34] 6294 tables out of 6798 opened in 1m57.001s
I0728 14:41:30.310361   18411 log.go:34] 6374 tables out of 6798 opened in 2m0.004s
I0728 14:41:33.331970   18411 log.go:34] 6454 tables out of 6798 opened in 2m3.025s
I0728 14:41:36.310228   18411 log.go:34] 6531 tables out of 6798 opened in 2m6.004s
I0728 14:41:39.423715   18411 log.go:34] 6625 tables out of 6798 opened in 2m9.117s
I0728 14:41:42.334863   18411 log.go:34] 6713 tables out of 6798 opened in 2m12.028s
I0728 14:41:45.307700   18411 log.go:34] 6781 tables out of 6798 opened in 2m15.001s
I0728 14:41:46.277541   18411 log.go:34] All 6798 tables opened in 2m15.971s

Next thing to figure out

Until now I have mostly focused on loading data into Dgraph. I will start experimenting with retrieving data now.
One thing I have observed in the existing data directory is that many SSTs are smaller than the usual 64 MB. I am seeing many SSTs of ~20 MB and am trying to figure out why.
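
To put numbers on the SST size observation, a small histogram of file sizes helps. This is a rough sketch under stated assumptions: it assumes the table files sit directly inside the directory with a `.sst` extension, and `sstSizes` and `bucketSizes` are hypothetical helper names, not part of Dgraph or any library.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

const mib = 1 << 20

// bucketSizes (hypothetical helper) groups file sizes in bytes into buckets
// widthMiB wide and returns a count keyed by each bucket's start in MiB.
func bucketSizes(sizes []int64, widthMiB int64) map[int64]int {
	counts := make(map[int64]int)
	for _, s := range sizes {
		start := (s / mib / widthMiB) * widthMiB
		counts[start]++
	}
	return counts
}

// sstSizes (hypothetical helper) returns the size of every *.sst file
// directly inside dir. The ".sst" extension is an assumption.
func sstSizes(dir string) ([]int64, error) {
	paths, err := filepath.Glob(filepath.Join(dir, "*.sst"))
	if err != nil {
		return nil, err
	}
	sizes := make([]int64, 0, len(paths))
	for _, p := range paths {
		info, err := os.Stat(p)
		if err != nil {
			return nil, err
		}
		sizes = append(sizes, info.Size())
	}
	return sizes, nil
}

func main() {
	dir := "./p" // directory name as used in this post; override via argv
	if len(os.Args) > 1 {
		dir = os.Args[1]
	}
	sizes, err := sstSizes(dir)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	const width = 16 // bucket width in MiB, chosen arbitrarily
	counts := bucketSizes(sizes, width)
	keys := make([]int64, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool { return keys[i] < keys[j] })
	for _, k := range keys {
		fmt.Printf("%3d-%3d MiB: %d files\n", k, k+width, counts[k])
	}
}
```

Running it against p and w separately should show whether the ~20 MB files are concentrated in one directory or spread across both.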
