The issue was that we assumed the second value would overwrite the first, but it appears to be the other way around.
Live load the second RDF file and it will overwrite the duplicates. I am guessing your data already has uids. If you bulk load _:a <foo> "12" . and then live load _:a <foo> "13" . , the two _:a blank nodes would be treated as different nodes.
You have clarified that order is not guaranteed, but is it completely random, or is it random only within the chunks of data loaded by the bulk loader?
It is not completely random. We read the input in chunks, and these chunks are processed in parallel.
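To make the "chunks processed in parallel" point concrete, here is a minimal sketch in Go. It is not the bulk loader's actual code: `splitChunks`, `processParallel`, and the `result` type are hypothetical names made up for illustration, and the chunk size of 2 is arbitrary. In this sketch each chunk is handled by its own goroutine, so lines from different chunks can interleave in any order, while lines inside one chunk keep their relative order.

```go
package main

import (
	"fmt"
	"sync"
)

// result records a processed line and the index of the chunk it came from.
// (Hypothetical type, for illustration only.)
type result struct {
	chunk int
	line  string
}

// splitChunks cuts lines into consecutive chunks of at most size lines.
func splitChunks(lines []string, size int) [][]string {
	if size < 1 {
		size = 1 // guard against a non-positive chunk size
	}
	var chunks [][]string
	for start := 0; start < len(lines); start += size {
		end := start + size
		if end > len(lines) {
			end = len(lines)
		}
		chunks = append(chunks, lines[start:end])
	}
	return chunks
}

// processParallel handles every chunk in its own goroutine and returns the
// lines in the order in which they were actually processed.
func processParallel(lines []string, size int) []result {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out []result
	)
	for i, c := range splitChunks(lines, size) {
		wg.Add(1)
		go func(idx int, chunk []string) {
			defer wg.Done()
			// Within one chunk the lines are handled sequentially.
			for _, l := range chunk {
				mu.Lock()
				out = append(out, result{chunk: idx, line: l})
				mu.Unlock()
			}
		}(i, c)
	}
	wg.Wait()
	return out
}

func main() {
	lines := []string{
		`_:a <foo> "12" .`,
		`_:b <foo> "1" .`,
		`_:c <foo> "2" .`,
		`_:d <foo> "3" .`,
		`_:a <foo> "13" .`,
	}
	// The chunk order in the output may differ from run to run.
	for _, r := range processParallel(lines, 2) {
		fmt.Printf("chunk %d: %s\n", r.chunk, r.line)
	}
}
```

Here the two _:a lines land in different chunks, so which one is processed last depends on goroutine scheduling. That is the sense in which the order is not guaranteed, yet not completely random either.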
If you look at the following code, you’ll notice that the files are being processed in parallel.