pileup format: what's the difference between upper and lower case?
7.0 years ago
blur ▴ 280

Hi, I read the samtools documentation for the pileup format, but I am not clear about the difference between lower and upper case letters.

For example: chr2 777777 T 31 ..........,,,,,,,,,,CCcccCCCCCC

So, the reference is T, and the c/C difference indicates which strand the read is on, right? Does it mean that the F strand had a C and the R strand had a G that is written here as c to account for direction? Is it correct to sum up all the Cs together?

Thanks for all the help!

samtools pileup • 3.3k views

Hi, yes, you understood correctly. A read from the reverse strand is reverse-complemented in order to align it to the reference, so the c corresponds to a G in the read as it was sequenced :) And yes, you can sum all the c/C together, because of the complementarity of DNA. However, if your gene is on the reverse strand and you want to work out the protein consequence of your mutation, you will need to use the G. Best
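To illustrate the reverse-strand case, here is a minimal Python sketch. The helper name `on_reverse_strand` is made up for this example and is not part of samtools. It simply reverse-complements the reference and alternate alleles, so a T->C change reported on the forward strand reads as A->G for a gene on the reverse strand.

```python
# Hypothetical helper, not a samtools function.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def on_reverse_strand(ref, alt):
    """Express a forward-strand variant (ref, alt) on the reverse strand."""
    # Complement each base, then reverse so multi-base alleles stay 5'->3'.
    return ref.translate(COMPLEMENT)[::-1], alt.translate(COMPLEMENT)[::-1]

print(on_reverse_strand("T", "C"))  # ('A', 'G')
```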

7.0 years ago

Lower case means that base was seen on the reverse strand. Please see the online man page.


Thank you for your answer - I read that part, what I fail to understand is this: if I have a T as reference and there is c in the reverse strand, does it mean that I should treat this as a T->c, or actually as a T->G (as it is on the reverse strand)? Thanks!


T->C

SAM/BAM and samtools always take care of reverse complementing for you, since otherwise it would just get confusing.
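To make the counting concrete, here is a rough Python sketch that tallies the read-bases column per strand, so you can see the forward and reverse counts and their sum. `count_pileup_bases` is a hypothetical helper written for this post, not something samtools provides. It assumes the usual pileup conventions: `.`/`,` are reference matches on the forward/reverse strand, upper/lower case letters are mismatches on the forward/reverse strand, `^` is followed by one mapping-quality character, and `+N`/`-N` is followed by N indel bases.

```python
import re
from collections import Counter

def count_pileup_bases(ref_base, read_bases):
    """Tally (base, strand) pairs from the read-bases column of one pileup line."""
    counts = Counter()
    ref = ref_base.upper()
    i = 0
    while i < len(read_bases):
        ch = read_bases[i]
        if ch == "^":
            # Start of a read: the next character is its mapping quality.
            i += 2
            continue
        if ch in "+-":
            # Indel: sign, length digits, then that many inserted/deleted bases.
            m = re.match(r"\d+", read_bases[i + 1:])
            if m:
                i += 1 + len(m.group()) + int(m.group())
                continue
        elif ch == ".":
            counts[(ref, "+")] += 1
        elif ch == ",":
            counts[(ref, "-")] += 1
        elif ch in "ACGTN":
            counts[(ch, "+")] += 1
        elif ch in "acgtn":
            counts[(ch.upper(), "-")] += 1
        # Other symbols ('$', '*', '<', '>') are not base calls and are skipped.
        i += 1
    return counts

counts = count_pileup_bases("T", "..........,,,,,,,,,,CCcccCCCCCC")
total_c = counts[("C", "+")] + counts[("C", "-")]
print(counts[("C", "+")], counts[("C", "-")], total_c)  # 8 3 11
```

For the example line in the question this gives 8 C on the forward strand and 3 on the reverse strand, so 11 reads in total support T->C, against 20 reads matching the reference.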
