As part of my pipeline I'm using the Picard tool SortSam to sort the reads in my BAM file by position (SORT_ORDER=coordinate). However, when I run the command below, the output file is considerably smaller than the input.
java -Djava.io.tmpdir=[tmp-directory] -jar picard.jar SortSam \
I=before-sort.bam \
O=after-sort.bam \
SORT_ORDER=coordinate
du before-sort.bam
= 44131980 KB
du after-sort.bam
= 28874760 KB
Have I lost data, or does SortSam have a filtering step I don't know of?
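One sanity check that might help narrow this down is to compare the number of records in the two files. The sketch below assumes samtools is installed and on the PATH (it is not part of the pipeline shown above). The function names `count_reads` and `same_read_count` are hypothetical helpers introduced only for illustration. Matching counts would suggest that no reads were dropped.

```shell
# Hypothetical helper: count alignment records in a BAM file.
# Assumes samtools is installed; `samtools view -c` prints the record count.
count_reads() {
  samtools view -c "$1"
}

# Hypothetical helper: print both counts and succeed only if they are equal.
# Returns 0 if the counts match, 1 if they differ, 2 if counting failed.
same_read_count() {
  before_count=$(count_reads "$1") || return 2
  after_count=$(count_reads "$2") || return 2
  echo "before=$before_count after=$after_count"
  [ "$before_count" -eq "$after_count" ]
}
```

It could be called as `same_read_count before-sort.bam after-sort.bam`; an exit status of 0 means both files hold the same number of records.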