Question: What happens to heterozygous sites when you go from reference sequence to sequence modified by variants?
jxiang15 • 10 wrote:
In a previous post, "New Fasta Sequence From Reference Fasta And Variant Calls File?", it was recommended to use either vcftools or FastaAlternateReferenceMaker (https://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_fasta_FastaAlternateReferenceMaker.php) if you have a reference sequence and a variant file and you want to get a new FASTA file.
However, the 1000 Genomes data is phased. So at heterozygous sites, should the ALT allele be substituted into the new sequence, or should the REF allele be left in place?
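To make the question concrete, here is a minimal Python sketch of what I understand phasing to imply. In a phased VCF genotype such as `0|1`, the first haplotype carries REF and the second carries ALT, so one sample could in principle give two sequences rather than one. The function name `apply_phased_snvs`, the tuple layout, and the toy reference are my own assumptions for illustration, not how vcftools or FastaAlternateReferenceMaker work, and the sketch handles biallelic SNVs only (no indels).

```python
def apply_phased_snvs(reference, variants, haplotype):
    """Build one haplotype sequence from a reference and phased SNVs.

    reference: reference sequence as a string
    variants:  list of (pos, ref, alt, gt) with 1-based pos and a
               phased genotype string such as "0|1"
    haplotype: 0 for the allele left of "|", 1 for the allele right of it
    """
    seq = list(reference)
    for pos, ref, alt, gt in variants:
        if "|" not in gt:
            raise ValueError(f"genotype {gt!r} at position {pos} is not phased")
        if seq[pos - 1] != ref:
            raise ValueError(f"REF mismatch at position {pos}")
        alleles = gt.split("|")
        # Substitute ALT only if this haplotype carries it.
        if alleles[haplotype] == "1":
            seq[pos - 1] = alt
    return "".join(seq)


# Toy data (hypothetical): two heterozygous sites and one homozygous ALT site.
reference = "ACGTACGT"
variants = [
    (2, "C", "T", "0|1"),  # ALT on the second haplotype only
    (5, "A", "G", "1|0"),  # ALT on the first haplotype only
    (7, "G", "C", "1|1"),  # ALT on both haplotypes
]

hap_a = apply_phased_snvs(reference, variants, 0)
hap_b = apply_phased_snvs(reference, variants, 1)
print(hap_a)  # ACGTGCCT
print(hap_b)  # ATGTACCT
```

In this toy case the heterozygous sites end up as REF on one haplotype and ALT on the other, which is why I am unsure what a single output FASTA should contain.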
Thanks in advance