19 months ago
caredevil
Dear all,
I have a phased VCF file and I would like to encode the SNP genotypes as a numeric sequence.
For unphased data, this is easily done by encoding 0/0 as 0, 0/1 as 1 and 1/1 as 2,
but for phased data this encoding discards the phasing information, because 0|1 and 1|0 both become 1.
Do you have any suggestions?
Thank you in advance.
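One possible approach, sketched here under assumptions rather than as a definitive answer: keep the two sides of each phased genotype as two separate 0/1 sequences, one per haplotype, instead of collapsing them into a single count. The function names `split_phased_gt` and `encode_sample` are made up for this illustration, and the sketch assumes diploid, biallelic calls with no missing genotypes.

```python
def split_phased_gt(gt):
    """Split a phased GT string such as '0|1' into two integer alleles."""
    if "|" not in gt:
        # Assumption: every call is phased; '0/1' style calls are rejected.
        raise ValueError(f"genotype {gt!r} is not phased")
    left, right = gt.split("|")
    return int(left), int(right)


def encode_sample(gts):
    """Turn a list of phased GT strings into two haplotype sequences."""
    hap1, hap2 = zip(*(split_phased_gt(g) for g in gts))
    return list(hap1), list(hap2)


# Toy genotypes for one individual (illustrative values only).
gts = ["0|0", "0|1", "1|0", "1|1"]
hap1, hap2 = encode_sample(gts)

# The unphased 0/1/2 encoding can still be recovered by summing the two.
dosage = [a + b for a, b in zip(hap1, hap2)]

print(hap1)    # [0, 0, 1, 1]
print(hap2)    # [0, 1, 0, 1]
print(dosage)  # [0, 1, 1, 2]
```

With this layout, 0|1 and 1|0 stay distinguishable, and the usual 0/1/2 coding is just the element-wise sum of the two sequences.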
Nobody can answer unless you describe your downstream analysis and the tools you will use.
Thank you for your reply.
I think my question is pretty self-explanatory and there is no need for further information.
But as requested: my goal is to find similarities between the chromosome sequences of different individuals. My only tool is a simple Python script that extracts the sequences from the VCF and performs the analysis.
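For the goal described here, a minimal plain-Python sketch could read each individual's two haplotypes from the VCF and compare them position by position. Everything in it is an assumption for illustration: the names `read_haplotypes` and `similarity`, the sample names `ind1` and `ind2`, the toy records, and the choice of "fraction of matching alleles" as the similarity measure. It also assumes every call is phased, diploid and non-missing.

```python
import io


def read_haplotypes(handle):
    """Return {'<sample>_1': [...], '<sample>_2': [...]} from a phased VCF."""
    haps = {}
    samples = []
    for line in handle:
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns start after FORMAT
            for s in samples:
                haps[f"{s}_1"] = []
                haps[f"{s}_2"] = []
            continue
        gt_index = fields[8].split(":").index("GT")
        for s, call in zip(samples, fields[9:]):
            # Assumption: every GT is phased, e.g. '0|1', and not missing.
            a, b = call.split(":")[gt_index].split("|")
            haps[f"{s}_1"].append(int(a))
            haps[f"{s}_2"].append(int(b))
    return haps


def similarity(x, y):
    """Fraction of positions at which two haplotypes carry the same allele."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


# Toy VCF built in memory; a real file would be opened with open(path).
header = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
          "FORMAT", "ind1", "ind2"]
records = [
    ["1", "100", ".", "A", "G", ".", "PASS", ".", "GT", "0|1", "1|1"],
    ["1", "200", ".", "C", "T", ".", "PASS", ".", "GT", "1|0", "1|0"],
    ["1", "300", ".", "G", "A", ".", "PASS", ".", "GT", "0|0", "0|1"],
]
lines = ["##fileformat=VCFv4.2", "\t".join(header)]
lines += ["\t".join(r) for r in records]
haps = read_haplotypes(io.StringIO("\n".join(lines) + "\n"))

print(haps["ind1_1"])  # [0, 1, 0]
print(haps["ind2_2"])  # [1, 0, 1]
print(similarity(haps["ind1_1"], haps["ind2_1"]))
```

Note that the labels `_1` and `_2` are arbitrary within each individual, so a comparison between two individuals would normally consider all four haplotype pairings rather than only first-with-first.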