extract DS field from a genotype file
20 months ago
rheab1230 ▴ 140

Hello,

I am trying to extract the DS field from my VCF files, but it's giving me two values. This is a line from my original VCF file:

22      20012563        22:20000086:T:C T       C       .       PASS    AF=0.00289;MAF=0.00289;R2=1;ER2=0.83285;TYPED   GT:DS:HDS:GP    0|0:0:0,0:1,0,0 0|0:0:0,0:1,0,0 0|0:0:0,0:1,0,0 0|0:0:0,0:1,0,0 0|0:0:0,0:1,0,0 0|0:0:0,0:1,0,0 

This is the command that I used:

bcftools +dosage chr22_dosage.filtered.vcf.gz > output2.tsv

This is the final output file being generated.

22      20012563        T       C       0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0

Generally, the final output file should only have 0, 1, or 2 as values, not 0.0, 1.0, or 2.0. Can anyone help me with this and tell me if I am doing something wrong?

Thank you

vcf bcftools genotype • 768 views
20 months ago
LChart 3.9k

The genotype dosage is a floating-point value between 0 and 2, so the output should not be restricted to the integers 0, 1, and 2. Imputation can produce values such as 1.8 or 0.9, so the entire field is formatted as a float. If you want an integer representation where possible, you can post-process the file with sed.
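
As a rough sketch of that sed post-processing step (not something bcftools provides itself): the helper name `dosage_to_int` and the sample line are made up for illustration, and the pattern assumes whitespace-separated columns in which only the dosage values contain a decimal point.

```shell
# Hypothetical helper: strip a trailing ".0" (or ".00", ...) from whole-number
# values, leaving fractional dosages such as 1.8 untouched.
dosage_to_int() {
  sed -E 's/([0-9]+)\.0+([[:space:]]|$)/\1\2/g'
}

# Demo on a made-up line in the same layout as the dosage output.
printf '22\t20012563\tT\tC\t0.0\t1.0\t1.8\t2.0\n' | dosage_to_int
```

The demo line comes out with 0, 1, and 2 as integers and 1.8 unchanged. On real data the same function could sit at the end of the original command, e.g. `bcftools +dosage chr22_dosage.filtered.vcf.gz | dosage_to_int > output2.tsv`. Keeping fractional values intact is deliberate, since rounding them would discard the imputation uncertainty.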
