Is there a utility that takes a reference sequence in FASTA format and a file of variant calls containing both SNPs and short indels (e.g. VCF or similar), and outputs a new sequence that is identical to the input reference except that the variants have been introduced?
It wouldn't be a super tricky thing to script, but since this seems like a relatively common thing to want to do, I'm wondering whether there are already tools available for it. So far I haven't found any.
For simplicity, let's assume a single haploid chromosome (the problem is trickier for diploid individuals with heterozygous calls).
Example:
reference sequence:
AAATTTAGAA
variant calls:
POS REF ALT
2 A C
5 TT T
8 G GCC
10 A T
new sequence:
ACATTAGCCAT
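To illustrate the kind of script meant here, below is a minimal Python sketch for the haploid case. It is only a sketch under some assumptions: POS is 1-based (as in VCF), variants do not overlap, and the calls have already been parsed into (pos, ref, alt) tuples. The function name apply_variants is made up for this example, and FASTA/VCF parsing is left out.

```python
def apply_variants(reference, variants):
    """Return `reference` with all (pos, ref, alt) variants applied.

    Assumptions (not from any existing tool): `pos` is 1-based,
    variants do not overlap, and the sample is haploid.
    """
    pieces = []   # unchanged reference chunks and ALT alleles, in order
    cursor = 0    # 0-based index of the first reference base not yet copied

    # Walk the variants left to right, always indexing into the ORIGINAL
    # reference, so indels never shift the coordinates of later variants.
    for pos, ref, alt in sorted(variants, key=lambda v: v[0]):
        start = pos - 1
        end = start + len(ref)
        if start < cursor:
            raise ValueError(f"variant at position {pos} overlaps the previous one")
        if reference[start:end] != ref:
            raise ValueError(
                f"REF allele {ref!r} at position {pos} does not match "
                f"the reference ({reference[start:end]!r})"
            )
        pieces.append(reference[cursor:start])  # untouched stretch before the variant
        pieces.append(alt)                      # ALT allele replaces the REF allele
        cursor = end

    pieces.append(reference[cursor:])           # tail after the last variant
    return "".join(pieces)


if __name__ == "__main__":
    calls = [(2, "A", "C"), (5, "TT", "T"), (8, "G", "GCC"), (10, "A", "T")]
    print(apply_variants("AAATTTAGAA", calls))  # prints ACATTAGCCAT
```

Checking each REF allele against the reference catches off-by-one coordinate mistakes early, and collecting pieces in a list keeps the run time linear in the sequence length even for a whole chromosome.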
Hi gaffa. Would you mind adding a few example sequences, variants, and the resulting sequences? For a given input sequence, do you want only one output sequence, or many, covering all possible combinations of the variants for that sequence? Cheers
If I want all the possible combinations, is there a tool for doing that?
@Eric: I have added an example. For a given input sequence I want only a single output sequence. That is, I have sequenced an individual, called variants relative to a reference genome, and now I want to construct the full sequence of that individual (for simplicity, assume a single haploid chromosome).