VCF to fasta incorporating heterozygous sites
3 days ago
Sarah ▴ 20

Hello, I am trying to generate a consensus fasta file for one sample from an unphased VCF. I have been using bcftools consensus, which works well, but I am running into problems with the heterozygous sites. I am not able to adequately phase the data, so I would like to randomly select one allele at each heterozygous site for the consensus. bcftools has options to use ambiguity codes, or to always select the reference allele or always the alternate allele, but each of these would bias my downstream analyses.

Is there a program that can either phase a VCF randomly, or that can generate a consensus fasta while randomly selecting one allele per heterozygous site?

Thank you!

(PS this is my bcftools consensus command):

bcftools consensus --fasta-ref reference.fasta --sample SampleName -M N -a N -H 1 MyVCF.vcf.gz

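I would explore writing a simple text-transformation tool that rewrites each heterozygous genotype in the VCF file as a homozygous one, chosen at random. Basically, it would replace 0/1 with either 0/0 or 1/1, and you would then run bcftools consensus on the modified VCF as before.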
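A minimal sketch of such a tool in Python, in case it helps. The function names (randomize_het_gt, randomize_vcf_line, randomize_vcf) and the fixed seed are my own choices for illustration, not part of bcftools or any existing package. It assumes an uncompressed, tab-delimited VCF with diploid calls and GT as the first FORMAT key, and it writes the chosen allele as an unphased homozygous genotype. Missing, haploid, and already homozygous calls are left untouched.

```python
import io
import random
import re


def randomize_het_gt(sample_field, rng):
    """Return the sample field with a heterozygous diploid GT made homozygous.

    One of the two called alleles is picked at random and used for both
    copies, e.g. "0/1:30" becomes "0/0:30" or "1/1:30".
    """
    parts = sample_field.split(":")
    alleles = re.split(r"[/|]", parts[0])
    # Leave haploid, missing, and homozygous calls as they are.
    if len(alleles) != 2 or "." in alleles or alleles[0] == alleles[1]:
        return sample_field
    pick = rng.choice(alleles)
    parts[0] = f"{pick}/{pick}"
    return ":".join(parts)


def randomize_vcf_line(line, rng):
    """Apply randomize_het_gt to every sample column of one VCF line."""
    if line.startswith("#"):
        return line  # header lines pass through unchanged
    fields = line.rstrip("\n").split("\t")
    # Assumption: GT, when present, is the first key in the FORMAT column.
    if len(fields) < 10 or fields[8].split(":")[0] != "GT":
        return line
    fields[9:] = [randomize_het_gt(f, rng) for f in fields[9:]]
    return "\t".join(fields) + "\n"


def randomize_vcf(in_handle, out_handle, seed=None):
    """Stream a VCF from in_handle to out_handle, randomizing het calls."""
    rng = random.Random(seed)  # pass a seed to make the choice reproducible
    for line in in_handle:
        out_handle.write(randomize_vcf_line(line, rng))


# Small in-memory demonstration with made-up records.
demo = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSampleName\n"
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:12\n"
    "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT:DP\t1/1:9\n"
)
result = io.StringIO()
randomize_vcf(demo, result, seed=1)
print(result.getvalue(), end="")
```

To use it on a real file you could call randomize_vcf(sys.stdin, sys.stdout) and pipe the decompressed VCF through the script, then compress and index the output before passing it to bcftools consensus. Since every former heterozygous call is now homozygous, the haplotype choice in the consensus command should no longer affect those sites.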

