Haplotype network/phylogeny of 1000 genomes project
7 months ago
Morgan • 0

I am trying to create a phylogeny and a haplotype network for a specific gene using data from the 1000 Genomes Project (1kgp). I want to answer the following questions:

  1. What major haplotypes are seen in this specific region of interest?
  2. How does each haplotype relate to the others?
  3. Which populations are most similar at this region?

I believe I need to do the following steps but am unsure if this is the right track:

  1. Subset the 1kgp VCF to the region and populations of interest
  2. Remove private SNPs and SNPs with poor coverage across samples using VCFtools
  3. Convert VCF to PHYLIP using PGDSpider
  4. Create phylogenetic tree with RAxML
  5. Visualize phylogenetic tree with FigTree; and finally,
  6. Make a haplotype network, which I am completely lost on how to do.

I already know how to subset the 1kgp VCFs to the region of interest and how to use VCFtools and PGDSpider.

How do I choose an outgroup in RAxML, and how do I obtain non-human primate sequences that match the 1kgp region? Additionally, I am interested in the individual alleles, not the individuals: how do I show a "phylogeny" of each haplotype? I know the 1kgp data is phased, so this should be possible?
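
Because the genotypes are phased, each diploid sample can in principle be split into two haplotype sequences before tree building. A minimal sketch of that idea follows, assuming biallelic SNPs only and fully phased diploid calls (`0|1` style). The toy VCF text, the sample names `S1`/`S2`, and both function names are made up for illustration and are not real 1kgp data or an existing tool's API. Whether a given tree program accepts the relaxed PHYLIP-style output is also an assumption to check.

```python
# Toy phased VCF (made-up positions and sample names, NOT real 1kgp data).
TOY_VCF = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2
1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0|1\t1|1
1\t150\t.\tC\tT\t.\tPASS\t.\tGT\t0|0\t0|1
1\t200\t.\tG\tA\t.\tPASS\t.\tGT\t1|0\t0|0
"""


def phased_vcf_to_haplotypes(vcf_text):
    """Return {haplotype_name: string of alleles at the variant sites}."""
    samples = []
    haps = {}
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]
            for s in samples:
                haps[s + "_1"] = []   # first phased copy
                haps[s + "_2"] = []   # second phased copy
            continue
        ref, alt = fields[3], fields[4]
        if len(ref) != 1 or len(alt) != 1:
            continue  # assumption: skip indels and multiallelic sites
        alleles = [ref, alt]
        gt_index = fields[8].split(":").index("GT")
        for s, call in zip(samples, fields[9:]):
            gt = call.split(":")[gt_index]
            if "|" not in gt:
                raise ValueError(f"unphased or haploid genotype {gt!r} for {s}")
            a, b = gt.split("|")
            # Missing alleles (".") become N.
            haps[s + "_1"].append("N" if a == "." else alleles[int(a)])
            haps[s + "_2"].append("N" if b == "." else alleles[int(b)])
    return {name: "".join(seq) for name, seq in haps.items()}


def to_relaxed_phylip(haps):
    """Format haplotypes as relaxed PHYLIP-style text (one taxon per line)."""
    n_sites = len(next(iter(haps.values())))
    lines = [f"{len(haps)} {n_sites}"]
    lines += [f"{name} {seq}" for name, seq in haps.items()]
    return "\n".join(lines) + "\n"


haps = phased_vcf_to_haplotypes(TOY_VCF)
print(to_relaxed_phylip(haps))
```

Each tip in the resulting alignment is one chromosome copy (`S1_1`, `S1_2`, ...) rather than one person. The sequences contain variant sites only.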

What is the best method to make a haplotype network from the 1kgp datasets?
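
One simple way to frame the network problem, offered as a sketch rather than a recommended method: collapse identical haplotype sequences into unique haplotypes with counts, compute pairwise Hamming distances, and connect them with a minimum spanning tree. This is a simplification and is not the same as a TCS or median-joining network. The function names and the toy sequences are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of sites at which two equal-length haplotypes differ."""
    if len(a) != len(b):
        raise ValueError("haplotypes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def haplotype_mst(sequences):
    """Collapse identical haplotypes and link them with a minimum spanning tree.

    Returns (counts, edges): counts maps each unique haplotype to its
    frequency, edges is a list of (hap_a, hap_b, n_differences).
    """
    counts = Counter(sequences)
    unique = sorted(counts)

    # All pairwise distances, shortest first (Kruskal's algorithm).
    candidates = sorted(
        (hamming(a, b), a, b) for a, b in combinations(unique, 2)
    )

    # Union-find structure to avoid creating cycles.
    parent = {h: h for h in unique}

    def find(h):
        while parent[h] != h:
            parent[h] = parent[parent[h]]
            h = parent[h]
        return h

    edges = []
    for dist, a, b in candidates:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            edges.append((a, b, dist))
    return counts, edges


# Toy haplotype sequences (made up), one per chromosome copy.
toy = ["ACA", "GCG", "GCG", "GTG"]
counts, edges = haplotype_mst(toy)
for a, b, dist in edges:
    print(f"{a} (n={counts[a]}) -- {dist} -- {b} (n={counts[b]})")
```

In a drawn network, the counts would typically set node sizes and the distances would be the mutational steps along each edge. Ties between equally short edges are broken arbitrarily here, which is one reason dedicated network methods can give different results.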

Thanks so much!

haplotype 1000Genomes phylogeny • 271 views
