Question: Question about VCFtools --window-pi --window-pi-step
8 weeks ago, 60343011s wrote:

Hi all,

I'm using VCFtools (v0.1.17) to estimate the nucleotide diversity of my study species.

I already have a VCF file, produced by mapping to a draft genome, and I used it to calculate pi.

The output reports the bin coordinates and the number of variants per window (I used --window-pi 60000 --window-pi-step 24000). In the rows shown here, the pi value equals the number of variants divided by the bin size (for example, 11 / 60000 = 0.000183333):

CHROM   BIN_START   BIN_END N_VARIANTS  PI    
scaffold22988   1   60000   11  0.000183333

The problem is that scaffold22988 is only 1015 bp long, but the full bin size was used to estimate pi rather than the length of the scaffold. As a result, the genome-wide average pi is underestimated when a large bin size is used.

The same thing happens at the end of large scaffolds, for example:

CHROM   BIN_START   BIN_END N_VARIANTS  PI    
scaffold14  18960001    19020000    14  0.000233333

scaffold14 is in fact only 18,967,204 bp long, so the pi value of its last window is again underestimated (the bin size here should be 18967204 - 18960001 + 1 = 7204).
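
The correction described here can be sketched in Python. This is only an illustration, not a VCFtools feature: the function names `effective_bin_length` and `rescale_pi` are hypothetical, and the rescaling assumes that the reported PI was divided by the full nominal bin size, as the two rows above suggest. The scaffold lengths are the ones quoted in this post; in practice they would have to come from somewhere else, such as a FASTA index or the contig lines of the VCF header (an assumption). The sketch only clips the window at the scaffold end and does not account for uncalled or filtered sites inside the scaffold.

```python
def effective_bin_length(bin_start, bin_end, scaffold_length):
    """Bases of a 1-based, inclusive window that actually lie on the scaffold."""
    clipped_end = min(bin_end, scaffold_length)
    return max(clipped_end - bin_start + 1, 0)


def rescale_pi(pi, bin_start, bin_end, scaffold_length):
    """Rescale a windowed pi, assuming it was divided by the nominal bin size."""
    nominal = bin_end - bin_start + 1
    effective = effective_bin_length(bin_start, bin_end, scaffold_length)
    if effective == 0:
        raise ValueError("window lies entirely beyond the scaffold end")
    return pi * nominal / effective


# Scaffold lengths as quoted in the post (hypothetical lookup table).
scaffold_lengths = {"scaffold22988": 1015, "scaffold14": 18967204}

# Rows copied from the --window-pi output shown in the post:
# (CHROM, BIN_START, BIN_END, N_VARIANTS, PI)
windows = [
    ("scaffold22988", 1, 60000, 11, 0.000183333),
    ("scaffold14", 18960001, 19020000, 14, 0.000233333),
]

for chrom, start, end, n_variants, pi in windows:
    length = effective_bin_length(start, end, scaffold_lengths[chrom])
    corrected = rescale_pi(pi, start, end, scaffold_lengths[chrom])
    print(chrom, length, f"{corrected:.6f}")
```

For the two rows above, the effective bin lengths come out as 1015 and 7204 bp, so the rescaled values are roughly 11/1015 and 14/7204 rather than 11/60000 and 14/60000.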

Is there any way to tell the program not to overestimate the bin size? I have read the VCFtools manual but did not find an option for this.

I would be grateful for any suggestions.

Tags: alignment, assembly, genome