Comparing two bai files
8 weeks ago
irfanwustl ▴ 20

Suppose we have a BAM file and make a copy of it. I sorted both files and generated a BAI file from each, so the two BAM files have the same content but different names. Now I want to check whether the two BAI files are identical. I ran md5 on them and the checksums differ. Could that be caused by the different file names? Is there a way to view a BAI file in human-readable form, the way a BAM file can be viewed as SAM?

BAM BAI • 266 views

Why don't you simply diff them?
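
Since BAI files are binary, a byte-wise comparison with `cmp` is probably more practical than a line-based `diff`. A minimal sketch, assuming standard POSIX tools; the helper name `same_bytes` and the file names in the usage comments are placeholders, not anything from the original post:

```shell
# Hypothetical helper: succeeds (exit status 0) only when the two files
# are byte-for-byte identical. The -s flag suppresses cmp's own output.
same_bytes() {
    cmp -s "$1" "$2"
}

# Usage (placeholder file names):
#   same_bytes A.bam.bai foo.bam.bai && echo identical || echo different
# Without -s, cmp reports the position of the first differing byte:
#   cmp A.bam.bai foo.bam.bai
```

This only says whether the files differ and where the first difference is, not why.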


Is there a way to see the bai file like sam files?

What does that mean?

Anyway, file names are not taken into account when creating md5 checksums afaik.

$ ls A.bam && md5sum A.bam && cp A.bam foo.bam && md5sum foo.bam
A.bam
6a08277341919ee0bd0272b9f08afc32  A.bam
6a08277341919ee0bd0272b9f08afc32  foo.bam


Maybe something like CompareSAMs from Picard tools:

java -jar picard.jar CompareSAMs A.bam foo.bam O=out.txt


The last column of out.txt then tells you (Y/N) whether the files are identical, and the other columns give some extra stats on what matches and what differs at the alignment level.

Output in this case, with two identical dummy BAM files named A.bam and foo.bam:

LEFT_FILE   RIGHT_FILE  MAPPINGS_MATCH  MAPPINGS_DIFFER UNMAPPED_BOTH   UNMAPPED_LEFT   UNMAPPED_RIGHT  MISSING_LEFT    MISSING_RIGHT   DUPLICATE_MARKINGS_DIFFER   ARE_EQUAL
/Users/atpoint/A.bam    /Users/atpoint/foo.bam  498 0   24502   0   0   0   0   0   Y


Note the Y (meaning yes, the files are identical) in the rightmost column.
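
If you want to check that column in a script, here is a rough sketch. It assumes out.txt is tab-separated, that any comment lines start with `#`, and that the header row contains the column name ARE_EQUAL as in the output above; the function name `picard_are_equal` is made up for illustration.

```shell
# Hypothetical helper: print the ARE_EQUAL value (Y or N) from a
# CompareSAMs output file such as out.txt.
picard_are_equal() {
    awk -F '\t' '
        /^#/ || NF == 0 { next }            # skip comment and blank lines
        col == 0 {                          # still looking for the header row
            for (i = 1; i <= NF; i++) if ($i == "ARE_EQUAL") col = i
            next
        }
        { print $col; exit }                # first data row after the header
    ' "$1"
}

# Usage:
#   [ "$(picard_are_equal out.txt)" = "Y" ] && echo "BAMs are equal"
```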


Actually, I am trying to compare the BAI files, not the BAM files. Can I do this with CompareSAMs?


If the headers of the BAM files differ, the md5 checksums will differ too. Here the headers do differ, because the header records the command line used for sorting, and that command line contains the two different file names.
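
One way to test this explanation is to compare the two headers while ignoring the program records. The sketch below assumes the sort command is stored in `@PG` header lines and that the headers were first dumped to text, for example with `samtools view -H A.bam > A.header.sam`. The helper name and the `.header.sam` file names are placeholders.

```shell
# Hypothetical helper: succeeds when two SAM header dumps are identical
# once all @PG (program record) lines have been dropped.
headers_match_ignoring_pg() {
    a=$(sed '/^@PG/d' "$1")
    b=$(sed '/^@PG/d' "$2")
    [ "$a" = "$b" ]
}

# Usage (placeholder file names):
#   headers_match_ignoring_pg A.header.sam foo.header.sam \
#       && echo "headers differ only in @PG lines, if at all"
```

If the headers match apart from `@PG`, a different header length could plausibly shift the file offsets that the index stores, which would explain different BAI checksums even though the alignments are the same.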


Is there a way to see the bai file like sam files

See if the Q&A "Make bam index human readable" helps, but I would guess the problem is upstream of the BAI files. Maybe the two BAM files are not the same...
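
As a very small first step towards a readable view, the fixed-size start of a BAI file can be decoded with `od`. This is only a sketch: it relies on the BAI layout in the SAM/BAM specification (4 magic bytes `BAI\1`, then the reference count `n_ref` as a little-endian 32-bit integer) and does not decode the bins or chunks. The function name `bai_n_ref` is made up.

```shell
# Hypothetical helper: print the number of reference sequences (n_ref)
# recorded at the start of a BAI file.
bai_n_ref() {
    f=$1
    # Bytes 0-3 must be the magic string "BAI\1" (hex 42 41 49 01).
    magic=$(od -An -t x1 -N 4 "$f" | tr -d ' \n')
    if [ "$magic" != "42414901" ]; then
        echo "not a BAI file: $f" >&2
        return 1
    fi
    # Bytes 4-7 hold n_ref, least significant byte first.
    set -- $(od -An -t u1 -j 4 -N 4 "$f")
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# Usage (placeholder file name):
#   bai_n_ref A.bam.bai
```

If both indexes report the same n_ref, the difference lies further in, in the per-reference bins and offsets.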