[SOLVED] changing the order of input changes samtools merge output
2.3 years ago
zz105 ▴ 20

I realized that this was a stupid mistake on my part. Since samtools does not overwrite existing files by default, the output that I got from samtools merge output.bam f2.bam f1.bam wasn't what I thought it was.

Below is my original post.


I'm using samtools/1.9.0 to merge 2 files, and I observed that giving the inputs in a different order produces very different output.

I have f1.bam (3G) and f2.bam (200M). When I do samtools merge output.bam f1.bam f2.bam, I get a file that's slightly larger than 3G. When I do samtools merge output.bam f2.bam f1.bam, I get a file that's slightly larger than 200M.

I'm sure the outputs are not just copies of f1.bam or f2.bam, but I wonder what could have gone wrong. Is it an issue with samtools/1.9.0?

I also observed that the total read count in the output of samtools merge output.bam f1.bam f2.bam is more than 10 times larger than that of samtools merge output.bam f2.bam f1.bam.

This is the flagstat output from samtools merge output.bam f1.bam f2.bam

164862366 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
164862366 + 0 mapped (100.00% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

This is the flagstat output from samtools merge output.bam f2.bam f1.bam

11009638 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
11009638 + 0 mapped (100.00% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

File sizes should not be used as a metric for anything but a qualitative assessment (e.g. something worked and there is a file with stuff in it). Have you tried sorting the final files after merging? The sorted results should be identical, or at least very similar, in size.
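
For a check that does not depend on file size, one option is to compare record counts: as far as I know, samtools merge without a region filter keeps every record from its inputs, so the merged file should hold as many records as the inputs combined. A minimal sketch, assuming samtools is on the PATH; check_merge_counts is a hypothetical helper name, not part of samtools:

```shell
# Hypothetical helper: compare the record count of a merged BAM
# against the summed record counts of its inputs.
# Usage: check_merge_counts merged.bam in1.bam in2.bam ...
check_merge_counts() {
    merged="$1"
    shift
    expected=0
    for f in "$@"; do
        # samtools view -c prints the number of records instead of the records
        n=$(samtools view -c "$f") || return 2
        expected=$((expected + n))
    done
    actual=$(samtools view -c "$merged") || return 2
    echo "expected=$expected actual=$actual"
    # Exit status 0 only when the totals match
    [ "$expected" -eq "$actual" ]
}
```

Calling check_merge_counts output.bam f1.bam f2.bam prints both totals and returns non-zero when they differ, which would flag an output file that does not come from the merge you think it does.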


Agreed. I also observed that the total read count in the output of samtools merge output.bam f1.bam f2.bam is more than 10 times larger than that of samtools merge output.bam f2.bam f1.bam.


Something odd is going on here, since the total number of reads differs between the two files after the merge. You are running samtools v.1.9 (current is samtools v.1.14), so one suggestion would be to try the latest version to see if that fixes the problem.

2.3 years ago
zz105 ▴ 20

Sorry, that was my stupid mistake. samtools won't overwrite existing files by default, so the output of one of the runs wasn't what I thought it was. I really appreciate your help!
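
For anyone hitting the same thing: samtools merge refuses to write to an output path that already exists unless -f is given, so a leftover output.bam from an earlier run stays in place. A minimal sketch of one way to guard against that, assuming samtools is on the PATH; merge_fresh is a hypothetical wrapper name, not a samtools command:

```shell
# Hypothetical wrapper: always overwrite the output so a stale file
# from an earlier run cannot be mistaken for the new merge result.
# Usage: merge_fresh out.bam in1.bam in2.bam ...
merge_fresh() {
    out="$1"
    shift
    if [ -e "$out" ]; then
        echo "note: $out already exists and will be overwritten" >&2
    fi
    # -f forces samtools merge to overwrite an existing output file
    samtools merge -f "$out" "$@"
}
```

For example, merge_fresh output.bam f1.bam f2.bam. Writing each run to its own output name and checking the exit status of samtools merge would avoid the confusion as well.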
