Question

Forum: How to calculate Tajima's D and Fay & Wu's H for unphased data?

8.2 years ago

RoseString ▴ 10

Hi,

I have a small number of samples (~10) for my species of interest (a non-model organism), so phasing the data is almost impossible. I would like to use site-frequency-spectrum methods to detect positive selection in the genome, but they require calculating nucleotide diversity (pi). Is it possible to do this without phasing the data?

Thanks in advance!

Evolution Genomics Nucleotide-diversity Genetics • 4.2k views

updated 12 months ago by Ram 43k • written 8.2 years ago by RoseString ▴ 10

score 0 · Answer 1 · 2016-03-01

8.2 years ago

jsgounot ▴ 170

Maybe you could use VariScan. However, I don't know whether it's the best approach for unphased data, since you would have to produce two sequences for each individual and therefore randomly assign each heterozygous variant to one of the two sequences.
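
To make the random assignment concrete, here is a minimal Python sketch. It is not VariScan's input format or any published pipeline; the genotype encoding (one `(allele_a, allele_b)` tuple per site) and the function names `haploidize` and `pairwise_pi` are made up for illustration, and it assumes biallelic sites with no missing data.

```python
import random
from itertools import combinations


def haploidize(genotypes, seed=0):
    """Split unphased diploid genotypes into two pseudo-haplotypes per individual.

    genotypes: one list per individual, holding an (allele_a, allele_b)
    tuple for every site, e.g. (0, 1) for a heterozygous site.
    (Hypothetical encoding, chosen only for this sketch.)
    """
    rng = random.Random(seed)
    haplotypes = []
    for individual in genotypes:
        hap1, hap2 = [], []
        for a, b in individual:
            # Flip a coin to decide which sequence receives which allele.
            # For homozygous sites the swap changes nothing.
            if rng.random() < 0.5:
                a, b = b, a
            hap1.append(a)
            hap2.append(b)
        haplotypes.extend([hap1, hap2])
    return haplotypes


def pairwise_pi(haplotypes):
    """Mean number of differences over all pairs of sequences."""
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(x != y for x, y in zip(h1, h2)) for h1, h2 in pairs)
    return total / len(pairs)


# Three individuals, three sites (toy data, not real genotypes).
genotypes = [
    [(0, 1), (0, 0), (1, 1)],
    [(0, 0), (0, 1), (1, 1)],
    [(0, 1), (0, 0), (0, 1)],
]
haps = haploidize(genotypes, seed=1)
pi_seed1 = pairwise_pi(haps)
pi_seed7 = pairwise_pi(haploidize(genotypes, seed=7))
```

In this toy example `pi_seed1` and `pi_seed7` come out identical, because the random assignment changes which pseudo-sequence carries an allele but not how many copies of each allele exist at a site. That property holds for statistics built only from per-site allele counts; anything that depends on linkage between sites would be affected by the shuffling.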

Thanks!

Do you know of any literature that randomly assigns variants in this way when the data are unphased?

8.2 years ago by RoseString ▴ 10

0

Entering edit mode

Just an update. I found a study using your method. They call this process 'haploidize data'.
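
For anyone landing here later, a rough sketch of Tajima's D computed directly from per-site allele counts, which can be tallied from unphased genotypes without building pseudo-haplotypes at all. This is an illustration under stated assumptions, not the method of the study mentioned above: genotypes are coded as alternate-allele dosages (0, 1 or 2), sites are biallelic, there is no missing data, and the names `alt_allele_counts` and `tajimas_d` are hypothetical.

```python
import math


def alt_allele_counts(dosages):
    """Per-site alternate-allele counts from unphased dosages (0/1/2).

    dosages: one list per diploid individual, one entry per site.
    """
    return [sum(site) for site in zip(*dosages)]


def tajimas_d(alt_counts, n):
    """Tajima's D from alternate-allele counts among n sampled chromosomes."""
    # Keep only segregating sites (neither allele fixed in the sample).
    seg = [k for k in alt_counts if 0 < k < n]
    S = len(seg)
    if S == 0:
        return float("nan")

    # Nucleotide diversity: expected pairwise differences summed over sites.
    pi = sum(2 * k * (n - k) / (n * (n - 1)) for k in seg)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Difference between pi and Watterson's estimator, scaled by its
    # estimated standard deviation.
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))


# Three diploid individuals (n = 6 chromosomes), three sites (toy data).
dosages = [
    [1, 0, 2],
    [0, 1, 2],
    [1, 0, 1],
]
counts = alt_allele_counts(dosages)
d_value = tajimas_d(counts, n=2 * len(dosages))
```

With about 10 diploid samples, `n` would be 20. Fay & Wu's H can be built from allele counts in the same way, but it additionally needs to know which allele is derived, which this sketch does not handle.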

8.2 years ago by RoseString ▴ 10