I am having a hard time figuring out how to correctly calculate the alternative allele frequency (AAF) from samtools mpileup output.
I was given code that uses samtools 1.8 mpileup, with the following command, to generate the data:
.... samtools mpileup \
-l ... capture_targets.bed \
-t DP,AD,ADF,ADR,SP,INFO/AD,INFO/ADF,INFO/ADR \
-d100000000 \
--output-BP \
--output-MQ \
--output-QNAME \ ...
Later in the code, the samtools mpileup data are converted to long format, cleaned up, and an AAF column is added.
As can be seen in the following code, the "DP4" field, which is "Number of high-quality ref-forward, ref-reverse, alt-forward and alt-reverse bases, FORMAT", is split into four parts, "AD1", "AD2", "AD3" and "AD4", and AD2/DP is then used to calculate the AAF:
... %>% separate(AD, c("AD1", "AD2", "AD3", "AD4")) %>% mutate(AAF = as.numeric(AD2)/as.numeric(DP))
However, to me "AD2" is the "ref-reverse" count. Shouldn't "AD3", the "alt-forward" count, be used to calculate the AAF instead?
... %>% mutate(AAF = as.numeric(AD3)/as.numeric(DP))
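To make the difference concrete, here is a minimal sketch that runs both calculations on a single made-up record. The DP and AD values are hypothetical, and I am assuming AD holds four comma-separated counts, as the separate() call implies; the column names AAF_AD2 and AAF_AD3 are only for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical record: the DP and AD values are made up for illustration only.
toy <- tibble(DP = "20", AD = "12,5,2,1")

toy_aaf <- toy %>%
  # Split the comma-separated AD string into four character columns.
  separate(AD, c("AD1", "AD2", "AD3", "AD4")) %>%
  mutate(
    AAF_AD2 = as.numeric(AD2) / as.numeric(DP),  # version in the code I received
    AAF_AD3 = as.numeric(AD3) / as.numeric(DP)   # version I think might be right
  )

print(toy_aaf)
```

With these made-up numbers the two versions give 0.25 and 0.10, so the choice of column clearly changes the result.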
I think the AD3 version should be the correct one. Is that right?
Can someone please help me?
Thanks a lot!