Question

Do the lowercase bases seen in BAM files need to be reverse complemented in consensus?

5.3 years ago

DNAngel ▴ 250

I'm creating consensus sequences with a custom script, but I'm unsure whether lowercase bases (the bases on reverse-strand reads) have already been reverse complemented or whether I need to reverse complement them myself.

In IGV, when I look at single-end reads aligned to a position and see a mix of 'A' (forward reads) and 'a' (reverse reads), can I assume that 'a' has already been reverse complemented, meaning the reverse read originally had a 'T' at that position? Or do I have to account for this myself, so that what I really have at that position is a mix of 'A' and 'T'?

Thank you!

BAM • 883 views

updated 5.3 years ago by Devon Ryan 104k • written 5.3 years ago by DNAngel ▴ 250

score 2 · Answer 1 · 2019-01-08

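All alignments in a BAM file are written relative to the + strand of the reference, so reads that aligned to the - strand have already been reverse complemented (they carry the 0x10 bit in the FLAG field). IGV is just indicating their orientation with upper/lower case letters, so the 'a' you see is an A on the + strand and you don't need to do anything else yourself in that regard.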
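In case it helps, here is a minimal sketch of what that means for a consensus script. It assumes you already have the bases covering one position as a string in which case marks strand (as in the display described above); the function names `is_reverse`, `original_read_sequence` and `consensus_base` are made up for illustration and don't come from any library. The only thing taken from the SAM format itself is the 0x10 FLAG bit.

```python
from collections import Counter

REVERSE_FLAG = 0x10  # SAM FLAG bit: read aligned to the reverse strand

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def is_reverse(flag: int) -> bool:
    """True if the alignment's FLAG has the reverse-strand bit set."""
    return bool(flag & REVERSE_FLAG)


def original_read_sequence(seq: str, flag: int) -> str:
    """Recover the read as it came off the sequencer.

    Only needed if you want the raw read back; for a consensus you
    should use the stored sequence as-is.
    """
    if is_reverse(flag):
        return seq.translate(_COMPLEMENT)[::-1]
    return seq


def consensus_base(column_bases: str) -> str:
    """Majority base at one position; case is treated as strand only."""
    counts = Counter(
        base.upper() for base in column_bases if base.upper() in "ACGT"
    )
    if not counts:
        return "N"  # hypothetical choice for an empty/ambiguous column
    return counts.most_common(1)[0][0]


# A column with three forward 'A' and three reverse 'a' is simply six A's.
print(consensus_base("AAaAaa"))
# A stored reverse-strand sequence "AACG" was "CGTT" on the original read.
print(original_read_sequence("AACG", 16))
```

So in the consensus step you just upper-case and count; the complementing in `original_read_sequence` is only there to show what the aligner already did for you.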

Whether a reverse complemented read actually arose from the - strand depends on the library type, in case you're wondering about that as well.