I want to ask about consensus sequences generated from variant data. Let's say I have a region like the one below:
From that region, I found 2 SNPs, at the 3rd and 10th nucleotides, as shown below:
POS -- REF -- ALT
3 -- A -- C
10 -- T -- A
If I want to apply the consensus function, there are 2 possible sets of sequences:
Heterozygous sequences where one sequence carries only the mutation at the 3rd nucleotide and the other carries only the mutation at the 10th nucleotide.
Heterozygous sequences where one sequence is identical to the reference and the other carries both mutations, at the 3rd and 10th nucleotides.
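To make the two cases concrete, here is a minimal Python sketch. The reference in it is a made-up 12-nt sequence (not my actual region), chosen only so that position 3 is A and position 10 is T. `apply_snps` and `haplotype_pairs` are hypothetical helper names for illustration, not functions from any existing tool.

```python
# Hypothetical 12-nt reference, NOT the real region from this post.
# Chosen so that position 3 is A and position 10 is T (1-based),
# matching the REF column of the SNP table.
REF = "CGATCGGCATGC"

# Each SNP is (1-based position, REF allele, ALT allele).
SNPS = [(3, "A", "C"), (10, "T", "A")]


def apply_snps(ref, snps):
    """Return a copy of ref with the given SNPs substituted in."""
    seq = list(ref)
    for pos, ref_allele, alt_allele in snps:
        # Guard against a table that does not match the reference.
        if seq[pos - 1] != ref_allele:
            raise ValueError(f"position {pos}: expected {ref_allele}, found {seq[pos - 1]}")
        seq[pos - 1] = alt_allele
    return "".join(seq)


def haplotype_pairs(ref, snps):
    """Build both possible sequence pairs for two heterozygous SNPs."""
    first, second = snps
    return {
        # Case 1: each sequence carries exactly one of the two mutations.
        "one_snp_each": (apply_snps(ref, [first]), apply_snps(ref, [second])),
        # Case 2: one sequence equals the reference, the other carries both.
        "ref_plus_both": (ref, apply_snps(ref, snps)),
    }


pairs = haplotype_pairs(REF, SNPS)
for name, (seq1, seq2) in pairs.items():
    print(name, seq1, seq2)
```

Both pairs are consistent with the same POS/REF/ALT table, so as far as I can tell the table alone does not say which pair is the real one.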
My question is: how do I decide which of these best represents the consensus sequence?