Question: bcftools merge heterozygous calls
0
gravatar for evelyn
29 days ago by
evelyn30
evelyn30 wrote:

Hello,

I have merged three vcf files and translated the SNP calls using bcftools query and awk

$ bcftools query -f '%CHROM\t%POS\t%ID\t%REF\t%ALT[\t%TGT]\n' merge.vcf | awk -v FS="\t" -v OFS="\t" '{for(i=6;i<=NF;i++) {split($i, gt, "/"); if(gt[1]==".") $i="-"; else if(gt[1]==gt[2]) $i=gt[1]; else $i="N";} print }'
1   687 .   G   A   -   A   -
1   689 .   G   A   -   A   -
1   701 .   T   A   -   A   -
1   704 .   T   G   -   G   -
1   708 .   C   T,A T   -   A
1   6440    .   G   T   T   N   -

Here the heterozygous calls are called as N. I want to change the following heterozygous calls as IUPAC codes:

Code Base
R   A or G
Y   C or T
S   G or C
W   A or T
K   G or T
M   A or C
N   any base

I tried using:

bcftools query -f '%CHROM\t%POS\t%ID\t%REF\t%ALT[\t%TGT]\n' merge.vcf | awk -v FS="\t" -v OFS="\t" '{for(i=6;i<=NF;i++) {split($i, gt, "/"); if(gt[1]==".") $i="NA"; else if(gt[1]==gt[2]) $i=gt[1]; else if(gt[2]=="AG") $i="R";  else if(gt[2]=="CT") $i="Y"; else if(gt[2]=="GC") $i="S"; else if(gt[2]=="AT") $i="W"; else if(gt[2]=="GT") $i="K"; else if(gt[2]=="AC") $i="M"; else $i="N";} print }' > out.vcf

But it does not give IUPAC codes for the above heterozygous calls. Any suggestions/help is appreciated.

Thank you so much for your time!

snp bcftools • 177 views
ADD COMMENTlink modified 28 days ago by zx87548.2k • written 29 days ago by evelyn30
else if(gt[2]=="AG")

You are checking if the second value of the genotype after splitting by / is AG. This is not what you want. You would like to check if gt[1] is A and gt[2] is G or vice versa.

ADD REPLYlink written 27 days ago by finswimmer12k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 750 users visited in the last hour