Comparison between .bam files
7.8 years ago
mew225 ▴ 20

Hi,

I need to take the complement of two BAM files. I tried the diff function in bamUtil to find the differences between the two, but when I run further analyses on its output with samtools, it reports that the file is truncated. I haven't been able to find out whether bamUtil writes a different end-of-file marker than the one samtools looks for; that is the only reason I can think of for samtools reporting a truncated file and not finding the EOF marker. Is there a way to take the complement of two BAM files with samtools, or possibly Picard? I haven't found anything online that says whether either tool can do this.

Thanks!

NGS samtools • 6.0k views

I don't understand what you mean by complement. Are you referring to coverage?


I mean complement as in the set operator. I need to see what is in one file that is not in the other.


What do you mean by 'in one file but not in the other'? That a read named R123456 is found in file 1 but not in file 2? Or that a read named R123456 is found at chr1:98798 in file 1 but the same R123456 is mapped to chrX:9879 in file 2?


The first. I have to see what reads are in one and not in the other.

7.8 years ago

I wrote the following tool, https://github.com/lindenb/jvarkit/wiki/CmpBams, but it doesn't produce a BAM. To get a BAM with standard tools:

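• Save the header with samtools view -H file1.bam > out.sam.
• Write each BAM out as headerless SAM (samtools view without -h) and sort it on read name. join expects the ordering that unix sort produces, which can differ from the ordering of samtools sort -n, so sorting the SAM text with LC_ALL=C sort -k1,1 is the safer choice.
• Use unix join -t $'\t' -v 1 -1 1 -2 1 sorted1.sam sorted2.sam >> out.sam to keep the reads found in file 1 but not in file 2. Note that -v 1 prints only the unpaired lines of file 1, whereas -a 1 would print the matched lines as well, and that in bash the tab has to be written $'\t' because join rejects the two-character string '\t'.
• Convert the SAM file back to BAM with samtools view -b.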
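
The join step can be tried without any BAM files. Below is a minimal sketch, assuming headerless, tab-delimited SAM text as input; the function name reads_only_in_first and the demo records are made up for illustration, and only sort, cut, join and mktemp are used. It reduces the second file to its read names so that join compares names only.

```shell
# Literal tab, written portably (works in bash and in POSIX sh).
TAB=$(printf '\t')

# reads_only_in_first A.sam B.sam   (hypothetical helper name)
# Prints the lines of A.sam whose read name (column 1) never occurs in B.sam.
reads_only_in_first() {
    a_sorted=$(mktemp)
    b_names=$(mktemp)
    # join needs both inputs sorted on the key in the same byte order.
    LC_ALL=C sort -t "$TAB" -k1,1 "$1" > "$a_sorted"
    # Only the read names are needed from the second file.
    cut -f1 "$2" | LC_ALL=C sort -u > "$b_names"
    # -v 1: print only the lines of file 1 that have no match in file 2.
    LC_ALL=C join -t "$TAB" -v 1 "$a_sorted" "$b_names"
    rm -f "$a_sorted" "$b_names"
}

# Made-up records standing in for the output of `samtools view file.bam`.
demo_dir=$(mktemp -d)
printf 'R1\t0\tchr1\t100\nR2\t0\tchr1\t200\nR3\t16\tchrX\t300\n' > "$demo_dir/a.sam"
printf 'R2\t0\tchr1\t200\nR4\t0\tchr2\t50\n' > "$demo_dir/b.sam"

# Prints the R1 and R3 records, which are the reads present only in a.sam.
reads_only_in_first "$demo_dir/a.sam" "$demo_dir/b.sam"
```

With real data, the two inputs would be the headerless samtools view outputs, and the printed lines would be appended to the saved header before converting back to BAM.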

Thanks.