I need some help with the usage of sambamba markdup. I have read the documentation, but I don't quite understand two of the options.
What is meant by insert size here?
--hash-table-size=HASH_TABLE_SIZE size of hash table for finding read pairs (default is 262144 reads); will be rounded down to the nearest power of two; should be > (average coverage) * (insert size) for good performance
To get 100 GB here, should I just write --sort-buffer-size 102400? The reason I ask is that in sambamba sort you have to specify a unit, e.g. Mb or Gb, after the integer.
--sort-buffer-size=SORT_BUFFER_SIZE total amount of memory (in *megabytes*) used for sorting purposes; the default is 2048, increasing it will reduce the number of created temporary files and the time spent in the main thread
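For reference, this is a small sketch of how I would work out the two values from the help text above. It assumes that "insert size" means the fragment length in bases (which is exactly the part I am unsure about). The coverage of 30 and insert size of 400 are made-up example numbers, and the helper function names are my own, not anything from sambamba.

```python
def gib_to_mib(gib):
    """Convert gigabytes to megabytes (1 GB = 1024 MB), the unit --sort-buffer-size expects."""
    return gib * 1024


def round_down_pow2(n):
    """Largest power of two <= n, mirroring the 'rounded down' note in the help text."""
    return 1 << (n.bit_length() - 1)


def smallest_pow2_above(n):
    """Smallest power of two strictly greater than n."""
    return 1 << n.bit_length()


# Hypothetical example values, not taken from real data.
average_coverage = 30
insert_size = 400  # assumed to be the fragment length in bases

target = average_coverage * insert_size          # 12000
hash_table_size = smallest_pow2_above(target)    # 16384, stays unchanged after rounding down
sort_buffer_mib = gib_to_mib(100)                # 102400

print(f"--hash-table-size={hash_table_size} --sort-buffer-size={sort_buffer_mib}")
```

With these example numbers the default hash table size of 262144 would already be larger than coverage * insert size, so only much deeper data would need a bigger value, if my reading of "insert size" is right.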
Thanks / Jonas