Hi, I have a reference fasta file and a vcf file of SNP variant calls. For each individual in the vcf file, I want to create a new, "pseudo-haploid" fasta file where every base is randomly sampled from one of the individual's alleles.
This seems to be a different problem from the one solved by GATK's FastaAlternateReferenceMaker, which inserts the alternate allele at every variant site (ignoring individual genotypes and never retaining the reference allele).
Can anybody offer a tool or some advice? Thanks!
Short example:

Sequence: ATAAATTCCC (10 bp long)

VCF:

| POS | REF | ALT | Ind1 | Ind2 |
|-----|-----|-----|------|------|
| 2   | T   | C   | 0/1  | 1/1  |
| 5   | A   | G   | 0/0  | 0/1  |

Output:

Ind1: A**T**AAATTCCC or A**C**AAATTCCC

Ind2: A*C*AA**A**TTCCC or A*C*AA**G**TTCCC
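To make the intended behaviour concrete, here is a minimal Python sketch of the sampling described in the example. It is only an illustration under several assumptions, not an existing tool: the function name `pseudo_haploid` is hypothetical, the VCF is tab-separated with a standard `#CHROM` header line, all records refer to a single reference sequence, only single-base (SNP) alleles are applied, and a missing genotype (`./.`) leaves the reference base in place.

```python
import random

NUCLEOTIDES = set("ACGTN")


def pseudo_haploid(ref_seq, vcf_lines, rng=None):
    """Return {sample: sequence}, picking one allele at random per SNP site.

    Hypothetical helper: assumes a single reference sequence and a
    tab-separated VCF whose GT field comes first in each sample column.
    """
    rng = rng or random.Random()
    samples, seqs = [], {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample names start at the 10th column of the header line.
            samples = fields[9:]
            seqs = {s: list(ref_seq) for s in samples}
            continue
        pos, ref, alt = int(fields[1]), fields[3].upper(), fields[4].upper()
        alleles = [ref] + alt.split(",")
        # Assumption: only simple SNPs are handled; indels and symbolic
        # alleles are skipped so that coordinates never shift.
        if any(len(a) != 1 or a not in NUCLEOTIDES for a in alleles):
            continue
        if ref_seq[pos - 1].upper() != ref:
            raise ValueError(f"REF mismatch at position {pos}")
        for sample, call in zip(samples, fields[9:]):
            gt = call.split(":")[0].replace("|", "/").split("/")
            called = [int(g) for g in gt if g != "."]
            if not called:
                continue  # missing genotype: keep the reference base
            # Draw one of the called alleles uniformly at random.
            seqs[sample][pos - 1] = alleles[rng.choice(called)]
    return {s: "".join(bases) for s, bases in seqs.items()}


# Usage with the short example above (CHROM name "seq1" is made up).
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tInd1\tInd2",
    "seq1\t2\t.\tT\tC\t.\t.\t.\tGT\t0/1\t1/1",
    "seq1\t5\t.\tA\tG\t.\t.\t.\tGT\t0/0\t0/1",
]
for name, seq in pseudo_haploid("ATAAATTCCC", example_vcf).items():
    print(f">{name}\n{seq}")
```

Because the draw is random, repeated runs can give different sequences for heterozygous sites; passing a seeded `random.Random` as `rng` makes a run reproducible. A real data set with several chromosomes, indels, or a compressed VCF would need a proper parser rather than this line splitting.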