Is it legit for BAMs to have @RG headers that are not used in the alignments?
11.5 years ago
nnutter ▴ 210

Is it legit for BAMs to have @RG headers that are not used in the alignments? I am trying to decide whether I can depend on the headers, or whether I have to check the alignments to see if each read group is actually used.

11.5 years ago

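If by "legit" you mean "passes the spec", then yes. I don't see anything in the SAM/BAM spec that requires every read group declared in an @RG header line to actually appear in the body of the BAM. The requirement runs the other way: when @RG headers are present, an RG tag on an alignment has to match the ID of one of them.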

I can't think of a case where this would be intended behaviour, but I can think up a scenario where it might happen. Say a sequencing center's pipeline does a bunch of per-lane alignments, and one of the lanes fails spectacularly. It gets pushed into alignment anyway, but none of the reads align. The pipeline adds the @RG line to the header automatically. Then some grad student gets the bright idea to save space by using a Perl script to filter out all of the unmapped reads in the BAM. Boom: you've got a BAM with a read group in the header, but no reads for it.

Is this farfetched? Probably a little, but people do ridiculous and stupid things all the time in bioinformatics.

Perhaps more plausibly: someone splits a BAM into a separate file for each read group, but just copies the existing header over, so every output file declares all of the original read groups.
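For illustration, here is a rough sketch of a split that does not leave stray @RG lines behind. It works on SAM text lines (for example the output of `samtools view -h`), and the function name `split_by_read_group` is made up for this example. It assumes each read carries at most one `RG:Z:` tag, and it silently drops reads that have no RG tag or whose RG is not declared in the header, which may or may not be what you want.

```python
from collections import defaultdict


def split_by_read_group(sam_lines):
    """Map each read-group ID to the SAM lines of a file holding only that group.

    Each output keeps the shared header lines plus the single @RG line
    that matches its reads, instead of a copy of the full header.
    """
    shared_header = []           # every header line that is not @RG
    rg_header = {}               # read-group ID -> its @RG header line
    records = defaultdict(list)  # read-group ID -> alignment lines

    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):  # header lines start with '@'
            if fields[0] == "@RG":
                rg_id = next((f[3:] for f in fields[1:] if f.startswith("ID:")), None)
                if rg_id is not None:
                    rg_header[rg_id] = line
            else:
                shared_header.append(line)
            continue
        # Optional tags start at the 12th column of an alignment line.
        rg_id = next((f[5:] for f in fields[11:] if f.startswith("RG:Z:")), None)
        if rg_id is not None:
            records[rg_id].append(line)

    return {
        rg_id: shared_header + [rg_header[rg_id]] + reads
        for rg_id, reads in records.items()
        if rg_id in rg_header  # skip reads whose RG is not declared
    }
```

A read group that is declared in the header but has no reads simply produces no output here, so nothing downstream ever sees an empty group.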

Oh, and I've definitely seen old BAMs (more than 2 years old) that don't have any @RG headers at all, so there's that to consider too.

Despite all that, I'd code things up initially expecting sane header behavior. If you are proved wrong at some later date (or are extremely bored one afternoon), then add in the sanity checking.
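If you do get to the sanity checking, it could look something like the sketch below. It compares the read groups declared in the header with the ones actually used by the alignments. It reads SAM text lines (for example piped from `samtools view -h`), the function name `read_group_usage` is made up for this example, and it assumes at most one `RG:Z:` tag per read.

```python
def read_group_usage(sam_lines):
    """Return (declared, used) sets of read-group IDs from SAM text lines."""
    declared, used = set(), set()
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):  # header lines start with '@'
            if fields[0] == "@RG":
                for field in fields[1:]:
                    if field.startswith("ID:"):
                        declared.add(field[3:])
            continue
        # Optional tags start at the 12th column of an alignment line.
        for field in fields[11:]:
            if field.startswith("RG:Z:"):
                used.add(field[5:])
                break
    return declared, used
```

Then `declared - used` gives the read groups that appear only in the header, and `used - declared` gives RG tags with no matching @RG line, which is the case the spec actually forbids.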


I have a BAM that has no reads for one of its read groups, so I was trying to figure out whether to fix the code or the BAM. I'm still not entirely sure what to do, but since it seems to be valid SAM, I should probably fix the code. Thanks.
