I am using the PoPoolation tool to calculate Tajima's D, Pi, and Watterson's Theta for my pooled population, running the "Variance-sliding.pl" script on an mpileup file. But the output shows 0 in the 3rd and 4th columns and "na" in the 5th column:
4 125000 0 0.000 na
4 127000 0 0.000 na
4 129000 0 0.000 na
4 131000 0 0.000 na
4 133000 0 0.000 na
I followed all the steps as given in the manual.
Please guide me on how to solve this error; it's urgent.
Hello diva, I hope this helps. I found this comment on the PoPoolation wiki:
"Values of 0 and na in my case came from using a bam file generated from reads using phred33 scores when the default of the Variance-sliding.pl script is to use phred64 encoding. Specify "--fastq-type sanger" to use the phred33 scoring scheme."
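If you are not sure which encoding your data uses, you can look at the quality characters themselves. Below is a minimal Python sketch, not part of PoPoolation. The function names and the file name `pool.mpileup` are placeholders I made up, and it assumes a single-sample mpileup with the read depth in the 4th column and the base qualities in the 6th. The thresholds follow the usual rule of thumb (characters below ASCII 59 only occur with offset 33), so treat the result as a hint rather than proof.

```python
def mpileup_quality_column(lines, column=5):
    """Yield the base-quality string of each mpileup line (0-based column index).

    Assumes a single-sample mpileup: chrom, pos, ref, depth, bases, qualities.
    Positions with depth 0 are skipped because their quality field is only a
    '*' placeholder and would distort the guess.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > column and fields[3] != "0":
            yield fields[column]


def guess_phred_offset(quality_strings):
    """Return 33, 64, or None (ambiguous) based on the ASCII range seen."""
    codes = [ord(ch) for q in quality_strings for ch in q]
    if not codes:
        return None
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        # Characters this low cannot occur with an offset of 64.
        return 33
    if lowest >= 64 and highest > 74:
        # Characters this high are not expected with an offset of 33.
        return 64
    return None


# Hypothetical usage on your own file:
# with open("pool.mpileup") as handle:
#     print(guess_phred_offset(mpileup_quality_column(handle)))
```

If this reports 33, run Variance-sliding.pl with "--fastq-type sanger" as described in the quoted comment. If it returns None, the sample of qualities was not enough to decide.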
I am getting a very similar error for the D output. It gives the number of SNPs found but no Tajima's D value... I tried using both "--fastq-type sanger" and "--fastq-type illumina". Any help, please?
Example line: dDocent_Contig_13 500 0 0.000 na
It looks like PoPoolation finds no SNPs within the sliding windows, which would explain why it is not calculating the measures. Either there really are no SNPs in those regions, or there are SNPs but PoPoolation discards them because they fail a filtering step, e.g. minimum quality. Keep in mind that by default PoPoolation assumes Phred quality encoding with offset 64 (Illumina). If your qualities use offset 33 (Sanger), you have to set the parameter --fastq-type=sanger. Also, make sure your coverage is above the threshold set with the parameter --min-coverage (default is 4).
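To check the coverage point, you could count how many positions in your mpileup reach the --min-coverage threshold. Here is a rough Python sketch under a few assumptions: a single-sample mpileup with the read depth in the 4th column, and a function name and file name (`coverage_summary`, `pool.mpileup`) that are my own placeholders, not part of PoPoolation. It looks at raw depth only. As far as I know, PoPoolation counts coverage after its quality filtering, so the coverage it actually uses can be lower than what this reports.

```python
def coverage_summary(mpileup_lines, min_coverage=4):
    """Return (positions with depth >= min_coverage, total positions read).

    Assumes the read depth is the 4th tab-separated column of each line.
    Lines with fewer than four columns are ignored.
    """
    total = 0
    passing = 0
    for line in mpileup_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue
        total += 1
        if int(fields[3]) >= min_coverage:
            passing += 1
    return passing, total


# Hypothetical usage on your own file:
# with open("pool.mpileup") as handle:
#     passing, total = coverage_summary(handle, min_coverage=4)
#     print(f"{passing} of {total} positions have depth >= 4")
```

If almost no positions pass, the problem is coverage. If most positions pass on raw depth but the output still shows 0 and na, that points more towards the quality filter, for example a wrong --fastq-type setting.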
Can anyone please guide me on how to solve this error?