If there’s a significant slowdown, it could be:
1. You were running some other program that interfered with the bulk loader. This could be a download, which would compete for disk throughput and hence slow the loader.
2. Something about having 2000 files doesn't fit well with the loader. We haven't tested with a directory containing that many RDF files; I don't see why it would be a problem, but who knows.
I'd say check for 1 first: retry and see if you still hit the issue. If you do, you could try CPU profiling over HTTP, if you know how to do that: Profiling Go Programs - The Go Programming Language
Otherwise, you could try merging these files to reduce their number. Merging can be done on Linux with bash, via zcat and gzip.
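A minimal sketch of that merge, assuming the files are gzipped RDF parts sitting in one directory (the directory and file names here are made up for illustration):

```shell
# Create two small sample gzipped RDF files as stand-ins for the real data.
mkdir -p rdfdir
printf '<a> <p> "1" .\n' | gzip > rdfdir/part1.rdf.gz
printf '<b> <p> "2" .\n' | gzip > rdfdir/part2.rdf.gz

# Decompress every part in order and recompress into a single file.
zcat rdfdir/*.rdf.gz | gzip > merged.rdf.gz

# Inspect the result: both triples should now be in one file.
zcat merged.rdf.gz
```

Since N-Triples/N-Quads are line-oriented, plain concatenation of the decompressed streams produces a valid combined file.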
The bulk loader is really meant to be used in one go.
Update
I'm looking at the code, and I don't see anything that would work poorly with an increased number of files. In fact, we ensure we don't create as many goroutines as there are files by using a throttle, so CPU performance should be the same as with a single file.