[WIP] Performance bottleneck on 1 TB data

@mrjn I am running Dgraph in normal mode.

@Paras found the reason for its slowness, thanks to @harshil_goel. The live loader works well if the schema is applied to the cluster before starting it. I was passing the schema via the schema file and running the live loader on a fresh cluster, so the cluster had no schema yet and I was seeing a lot of transaction aborts. This could be an enhancement in the live loader.
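For reference, the workaround is to push the schema to the cluster first and only then start the load. A sketch of that order of operations (the endpoint and flags are from the standard Dgraph CLI/HTTP API; file names, ports, and addresses are placeholders):

```shell
# Apply the schema to the cluster up front via Alpha's HTTP /alter endpoint,
# so predicates already have types when mutations start arriving.
curl -X POST localhost:8080/alter --data-binary '@schema.txt'

# Then run the live loader as usual against the already-initialized cluster.
dgraph live -f data.rdf.gz -s schema.txt -a localhost:9080 -z localhost:5080
```

Without the first step, concurrent mutations on a fresh cluster race to define predicate types, which shows up as the transaction aborts described above.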

With the new run (schema set before starting the live loader), it ran for around ~4 hours before OOMing.
To confirm the cause I ran it again. The current live loader logs are as follows:

[15:32:54Z] Elapsed: 01h47m15s Txns: 448319 N-Quads: 448319000 N-Quads/s [last 5s]: 70200 Aborts: 98
[15:32:59Z] Elapsed: 01h47m20s Txns: 448640 N-Quads: 448640000 N-Quads/s [last 5s]: 64200 Aborts: 98
[15:33:04Z] Elapsed: 01h47m25s Txns: 449003 N-Quads: 449003000 N-Quads/s [last 5s]: 72600 Aborts: 98
[15:33:09Z] Elapsed: 01h47m30s Txns: 449334 N-Quads: 449334000 N-Quads/s [last 5s]: 66200 Aborts: 98
[15:33:14Z] Elapsed: 01h47m35s Txns: 449712 N-Quads: 449712000 N-Quads/s [last 5s]: 75600 Aborts: 98
[15:33:19Z] Elapsed: 01h47m41s Txns: 450064 N-Quads: 450064000 N-Quads/s [last 5s]: 70400 Aborts: 98

RES as shown by htop is ~23GB for Alpha and ~16GB for the live loader.

The heap profile is as follows:

File: dgraph
Build ID: 7048c9a5b650a3d32ace7cfaec4253a766ddb1ab
Type: inuse_space
Time: Jul 21, 2020 at 3:29pm (UTC)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 7720.75MB, 97.55% of 7914.99MB total
Dropped 250 nodes (cum <= 39.57MB)
Showing top 10 nodes out of 80
      flat  flat%   sum%        cum   cum%
 5306.12MB 67.04% 67.04%  5347.73MB 67.56%  github.com/dgraph-io/badger/v2/table.OpenTable
  489.11MB  6.18% 73.22%   716.61MB  9.05%  github.com/dgraph-io/badger/v2/pb.(*TableIndex).Unmarshal
  448.04MB  5.66% 78.88%   448.04MB  5.66%  github.com/dgraph-io/ristretto/z.(*Bloom).Size (inline)
     414MB  5.23% 84.11%      414MB  5.23%  github.com/dgraph-io/badger/v2/table.NewTableBuilder
     384MB  4.85% 88.96%      384MB  4.85%  github.com/dgraph-io/ristretto.newCmRow (inline)
  279.93MB  3.54% 92.50%   279.93MB  3.54%  github.com/dgraph-io/badger/v2/table.glob..func1
  227.51MB  2.87% 95.37%   227.51MB  2.87%  github.com/dgraph-io/badger/v2/pb.(*BlockOffset).Unmarshal
  166.41MB  2.10% 97.47%   166.41MB  2.10%  github.com/dgraph-io/badger/v2/skl.newArena
       5MB 0.063% 97.54%   693.04MB  8.76%  github.com/dgraph-io/badger/v2/table.(*Table).block
    0.64MB 0.008% 97.55%   576.64MB  7.29%  github.com/dgraph-io/ristretto.NewCache
(pprof) list OpenTable
Total: 7.73GB
ROUTINE ======================== github.com/dgraph-io/badger/v2/table.OpenTable in /home/ashish/projects/pkg/mod/github.com/dgraph-io/badger/v2@v2.0.1-rc1.0.20200615081930-c45d966681d4/table/table.go
    5.18GB     5.22GB (flat, cum) 67.56% of Total
         .          .    273:	id, ok := ParseFileID(filename)
         .          .    274:	if !ok {
         .          .    275:		_ = fd.Close()
         .          .    276:		return nil, errors.Errorf("Invalid filename: %s", filename)
         .          .    277:	}
  512.11kB   512.11kB    278:	t := &Table{
         .          .    279:		fd:         fd,
         .          .    280:		ref:        1, // Caller is given one reference.
         .          .    281:		id:         id,
         .          .    282:		opt:        &opts,
         .          .    283:		IsInmemory: false,
         .          .    284:	}
         .          .    285:
         .          .    286:	t.tableSize = int(fileInfo.Size())
         .          .    287:
         .          .    288:	switch opts.LoadingMode {
         .          .    289:	case options.LoadToRAM:
         .          .    290:		if _, err := t.fd.Seek(0, io.SeekStart); err != nil {
         .          .    291:			return nil, err
         .          .    292:		}
    5.18GB     5.18GB    293:		t.mmap = make([]byte, t.tableSize)
         .          .    294:		n, err := t.fd.Read(t.mmap)
         .          .    295:		if err != nil {
         .          .    296:			// It's OK to ignore fd.Close() error because we have only read from the file.
         .          .    297:			_ = t.fd.Close()
         .          .    298:			return nil, y.Wrapf(err, "Failed to load file into RAM")
         .          .    299:		}
         .          .    300:		if n != t.tableSize {
         .          .    301:			return nil, errors.Errorf("Failed to read all bytes from the file."+
         .          .    302:				"Bytes in file: %d Bytes actually Read: %d", t.tableSize, n)
         .          .    303:		}

The profile points towards Badger: this is where we keep SSTs in RAM (options.LoadToRAM).
I am looking into this.