Question: Tajima's D Using SNP data ONLY
2.9 years ago by
aberry81460
United States
aberry81460 wrote:

Hi all,

I am new to selection studies, so I have poor intuition for interpreting Tajima's D. I have a dataset of 20,000 SNPs distributed across a haploid genome for ~150 samples. The whole genome is ~50 Mb. I have been using the SNP alignment to calculate summary statistics like Tajima's D, rather than the whole-genome alignment, because the stats are far faster to calculate on 3 Mb of sequence than on 7.5 Gb. Does this affect my Tajima's D calculation? My understanding is that pi and theta only use segregating sites anyway, so as long as a window includes the same SNPs, the total window length shouldn't matter. That would mean a graph of D along the genome looks the same, assuming the window sizes are proportional.

I appreciate your input.

Alex

snp tajima's d • 2.3k views

Hi,

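As you said, Tajima's D is computed from segregating sites only (the number of segregating sites and the average number of pairwise differences between samples), so it's OK to use only the SNP information to obtain it.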
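
To make that concrete, here is a minimal sketch of the standard Tajima (1989) formula in plain Python. The function name `tajimas_d` and its input format (one allele count per biallelic site, no missing data) are assumptions made for illustration, not something taken from a particular package.

```python
from math import sqrt

def tajimas_d(allele_counts, n):
    """Tajima's D from per-site counts of one allele among n haploid samples.

    allele_counts: one integer per site (hypothetical input format), the
    number of samples carrying, say, the minor allele. Assumes biallelic
    sites and no missing data.
    """
    # Invariant sites (count 0 or n) are dropped: they add nothing to S or pi.
    seg = [k for k in allele_counts if 0 < k < n]
    S = len(seg)
    if S == 0 or n < 4:  # the variance term is zero for n = 2 or 3
        return float("nan")

    # pi: mean number of pairwise differences, summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in seg)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    var = e1 * S + e2 * S * (S - 1)
    if var <= 0:
        return float("nan")
    # Watterson's estimate is S / a1; D is the scaled difference from pi.
    return (pi - S / a1) / sqrt(var)
```

Because pi and S are totals over the window rather than per-site values, adding invariant positions to the input leaves D unchanged. Per-site pi and theta would need the true window length, but that length cancels out of D.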

written 2.9 years ago by guillaume.rbt770
2.9 years ago by
Pierre0
France
Pierre0 wrote:

Hello,

I have the same question as you, except that I only have SNP data (and their relative positions on the genome). Looking closer, Tajima's D seems to require only segregating sites, so I guess that using SNPs should be adequate. Nevertheless, I am still unsure about the accuracy of the resulting Tajima's D values.


I also have only SNP data. Can you please suggest how to compute Tajima's D without VCF files?

Thank you!
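
One possible route without VCF files, sketched here under assumptions: if the SNPs are held as an alignment of equal-length strings (one string per sample, for example read from a FASTA file), the per-site allele counts that a Tajima's D calculation needs can be tallied directly. The function names and the set of missing-data characters below are hypothetical, and multi-allelic columns are simply skipped to keep the example short.

```python
from collections import Counter

def column_allele_counts(sequences, missing="N-?"):
    """Tally alleles in each column of an equal-length SNP alignment."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all sequences must have the same length")
    counts = []
    for column in zip(*sequences):
        # Ignore missing calls so they are not treated as an extra allele.
        counts.append(Counter(b.upper() for b in column
                              if b.upper() not in missing))
    return counts

def minor_allele_counts(sequences):
    """Return (minor allele count, called samples) per biallelic column."""
    result = []
    for alleles in column_allele_counts(sequences):
        if len(alleles) == 2:  # keep only biallelic segregating sites
            result.append((min(alleles.values()), sum(alleles.values())))
    return result

# Usage: four samples, four SNP columns (toy data).
snp_alignment = ["ACGT", "ACGA", "ATGA", "ANGT"]
per_site = minor_allele_counts(snp_alignment)
```

Note that columns with missing calls have fewer called samples than the rest, and the textbook Tajima's D formula assumes the same sample size at every site, so such columns need either filtering or a method that accounts for varying sample size.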

written 23 months ago by amitgourav.ghosh1260