How to build a phylogenetic tree using FASTA obtained from a VCF file?
3.1 years ago
rimgubaev ▴ 280

The problem is as follows: I have a VCF file produced by GATK for 288 individuals, which I converted to nucleotide FASTA (using PGDSpider), so the length of each individual's FASTA sequence equals the number of SNPs in the VCF file. To build a tree in MEGA I would first have to make an alignment, which adds gaps that are not wanted in this case. So the question is: how do I make an alignment file (*.meg) without gaps in order to build a tree in MEGA?

Population genetics VCF MEGA • 2.7k views
Not a direct answer, but I have used this software (SNPhylo) with good results: http://chibba.pgml.uga.edu/snphylo/

That's pretty close to what I want to do. I think I'll try it. Thanks!

You could use the VCF2PopTree program to build a tree directly, or obtain all pairwise distances in MEGA format and then use MEGA to construct an NJ or UPGMA tree.
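
To illustrate what "pairwise distances" means for this kind of SNP data, here is a minimal Python sketch that computes a simple p-distance (the proportion of differing sites) between SNP strings, skipping positions where either individual has `N`. This is not VCF2PopTree's method; the function names `p_distance` and `distance_matrix` are made up for illustration, and the three toy sequences are the ones posted further down in this thread.

```python
from itertools import combinations


def p_distance(a, b, missing="N"):
    """Proportion of differing sites among positions called in both sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == missing or y == missing:
            continue  # skip sites with a missing call in either individual
        compared += 1
        if x != y:
            differing += 1
    # No jointly called site: the distance is undefined
    return differing / compared if compared else float("nan")


def distance_matrix(seqs):
    """Symmetric dict of pairwise p-distances keyed by (name1, name2)."""
    names = list(seqs)
    dist = {(n, n): 0.0 for n in names}
    for n1, n2 in combinations(names, 2):
        d = p_distance(seqs[n1], seqs[n2])
        dist[(n1, n2)] = d
        dist[(n2, n1)] = d
    return dist


# Toy sequences from this thread
example = {
    "individual1": "ACGTNGCT",
    "individual2": "AGGTAGNT",
    "individual3": "GCGTAGCT",
}
matrix = distance_matrix(example)
```

Here `matrix[("individual1", "individual2")]` is 1/6: six sites are called in both individuals and one of them differs. A matrix like this can feed an NJ or UPGMA tree builder.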

Hi, I'm not sure I understand "so the length of individual's fasta sequence corresponds to the number of SNPs in VCF file". Do you have an example?

>individual1
ACGTNGCT
>individual2
AGGTAGNT
>individual3
GCGTAGCT
etc.


Each letter is the allele at a certain position, so the sequences are already aligned, if one can say that in this situation.
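
Since the sequences already have equal length, one way to skip MEGA's alignment step might be to write them straight into a .meg file. Below is a minimal Python sketch, assuming the simple sequential MEGA layout (a `#mega` line, a `!Title ...;` line, then a `#name` line followed by each sequence). The function names `read_fasta` and `to_mega` and the title string are made up for illustration, and it is worth checking the MEGA manual for how `N` is treated as missing data.

```python
def read_fasta(text):
    """Parse FASTA text into a dict of name -> sequence (insertion order kept)."""
    parts = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}


def to_mega(seqs, title="SNP matrix from VCF"):
    """Render equal-length sequences as MEGA-format text without realigning."""
    if not seqs:
        raise ValueError("no sequences given")
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        # Unequal lengths would mean the columns no longer line up by SNP
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    lines = ["#mega", f"!Title {title};", ""]
    for name, seq in seqs.items():
        lines.append(f"#{name}")
        lines.append(seq)
    return "\n".join(lines) + "\n"


example_fasta = """>individual1
ACGTNGCT
>individual2
AGGTAGNT
>individual3
GCGTAGCT
"""
mega_text = to_mega(read_fasta(example_fasta))
# To save it: open("snps.meg", "w").write(mega_text)
```

The length check is the important part: it refuses any input in which the columns no longer line up by SNP, so no gaps are ever introduced.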

OK, so I don't understand: why don't you want gaps?

Because each letter represents a certain (fixed) position in the genome. For instance, the second letter in the above example is on chromosome 13 (pos 156) and the third is on chromosome 3 (pos 19909). They cannot be moved. Sorry for the unclear explanation.
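
To make the column-to-position idea concrete, here is a small Python sketch in which every sequence is built by walking the same ordered list of SNP sites, so column i always means site i and a missing call becomes `N` instead of shifting the later letters. The data are hypothetical: only the chromosome 13 and chromosome 3 positions come from this thread, the first site and all allele calls are invented, and it assumes one allele per individual per site.

```python
# Ordered SNP sites; the first entry is invented, the other two are from the thread
sites = [("chr1", 1042), ("chr13", 156), ("chr3", 19909)]

# Hypothetical allele calls per individual; a site absent from the dict is missing
calls = {
    "individual1": {("chr1", 1042): "A", ("chr13", 156): "C", ("chr3", 19909): "G"},
    "individual2": {("chr1", 1042): "A", ("chr13", 156): "G"},
}


def build_sequence(site_order, alleles, missing="N"):
    """One letter per site, in a fixed order shared by all individuals."""
    return "".join(alleles.get(site, missing) for site in site_order)


sequences = {name: build_sequence(sites, a) for name, a in calls.items()}
```

Because every individual is read against the same `sites` list, the resulting strings are the same length by construction, which is why a further alignment step has nothing to add.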