Question: VCFtools --minDP flags not working
2.5 years ago, beausoleilmo (McGill University) wrote:

I have a VCF where I'm applying filters. I want to filter by minDP and maxDP.

vcftools --vcf output.vcf --minDP 4 --maxDP 100 --recode --out output.filtered4

I've tried with version v0.1.13 and v0.1.15.

Parameters as interpreted:
    --vcf output.vcf
    --maxDP 100
    --minDP 4
    --out output.filtered4
    --recode
After filtering, kept 56 out of 56 Individuals
Outputting VCF file...
After filtering, kept 626116 out of a possible 626116 Sites

OK, maybe I have nothing to filter in that range. I also tried this:

vcftools --vcf output.vcf --minDP 10000 --maxDP 10000000 --recode --out output.filtered4

Parameters as interpreted:
    --vcf output.vcf
    --maxDP 1e+07
    --minDP 1e+04
    --out output.filtered4
    --recode

After filtering, kept 56 out of 56 Individuals
Outputting VCF file...
After filtering, kept 626116 out of a possible 626116 Sites

But it's telling me that it kept everything! I don't understand.

How do I filter with minDP and maxDP?

Their website says that, for these options, all sites should have the "DP" FORMAT tag. I checked with this:

grep GT:PL:DP:SP:GQ output.vcf | wc -l
626116

This is the same as the number of sites reported by the filtering step, so every site has the tag, but the filter still doesn't seem to work. Am I the only one with this problem?
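The grep above only matches one exact FORMAT string. A slightly more general check, sketched here in Python, splits the FORMAT column and looks for DP among its keys. The function name and the example lines are illustrative (the second site is made up to show a missing DP key), and real VCF columns are assumed to be tab-separated.

```python
def dp_format_summary(vcf_lines):
    """Return (number of data lines, number of data lines whose FORMAT lists DP)."""
    total = 0
    with_dp = 0
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # no FORMAT column, so nothing to check
        total += 1
        if "DP" in fields[8].split(":"):
            with_dp += 1
    return total, with_dp


# Illustrative input: two samples only; the second site lacks DP on purpose.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "scaffold440\t420\t.\tT\tG\t999\t.\tDP=72\tGT:PL:DP:SP:GQ\t0/0:0,6,74:2:0:12\t0/1:46,0,104:6:0:40",
    "scaffold440\t451\t.\tC\tA\t11.216\t.\tDP=71\tGT:PL\t0/0:0,3,35\t0/0:0,0,0",
]
total_sites, sites_with_dp = dp_format_summary(example)
print(total_sites, sites_with_dp)
```

On a real file, the same function can be given an open file handle instead of a list; if both numbers match, every site carries the DP tag.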


What's the diff?

diff output.vcf output.filtered4.recode.vcf
Comment written 2.5 years ago by Pierre Lindenbaum

I don't really see where the difference is. The data starts at line 84680. diff tells me that the files are different, but when I look at them side by side (with the -y flag, which shows the two files in two columns), they don't really look different.

84682,710797c84682,710797
< scaffold440   420 .   T   G   999 .   DP=72;VDB=7.55993e-06;SGB=11.7312;RPB=0.0732065;MQB=0.00151772;MQSB=0.0740741;BQB=0.72433;MQ0F=0;AF1=0.120295;AC1=13;DP4=45,2,9,0;MQ=43;FQ=999;PV4=1,0.101958,0.00623343,0.268587   GT:PL:DP:SP:GQ  0/0:0,6,74:2:0:12   0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,3,44:1:0:9    0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,9,102:3:0:15  0/0:0,27,186:9:0:33 0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/1:46,0,104:6:0:40 0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,15,112:5:0:21 0/1:66,6,0:2:0:7    0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,3,44:1:0:9    0/0:0,3,44:1:0:9    0/0:0,3,50:1:0:9    0/0:0,0,0:0:0:7 0/0:0,3,44:1:0:9    0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,3,44:1:0:9    0/0:0,0,0:0:0:7 0/0:0,3,44:1:0:9    0/0:0,3,41:1:0:9    0/0:0,6,75:2:0:12   0/0:0,9,100:3:0:15  0/0:0,3,44:1:0:9    0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,6,49:2:0:12   0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,6,85:2:0:12   0/0:0,3,32:1:0:9    0/0:0,0,0:0:0:7 0/0:0,9,110:3:0:15  0/0:0,6,70:2:0:12   0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 0/0:0,0,0:0:0:7 1/1:126,15,0:5:0:5  0/0:0,0,0:0:0:7
< scaffold440   451 .   C   A   11.216  .   DP=71;VDB=0.0249187;SGB=-0.0412595;RPB=0.910489;MQB=0.70327;BQB=0.638446;MQ0F=0;AF1=0.0946915;AC1=10;DP4=35,0,3,0;MQ=43;FQ=11.6156;PV4=1,0.0826989,0.117893,1   GT:PL:DP:SP:GQ  0/0:0,3,35:1:0:11   0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,24,141:8:0:31 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,1,93:5:0:9    0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,6,56:2:0:14   0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,3,44:1:0:11   0/0:0,3,44:1:0:11   0/0:0,3,47:1:0:11   0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,3,44:1:0:11   0/0:0,0,0:0:0:8 0/0:0,3,44:1:0:11   0/0:0,3,35:1:0:11   0/0:0,6,53:2:0:14   0/0:0,6,58:2:0:14   0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,6,85:2:0:14   0/1:32,3,0:1:0:10   0/0:0,0,0:0:0:8 0/0:0,9,110:3:0:16  0/1:35,3,0:1:0:11   0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,15,104:5:0:22 0/0:0,0,0:0:0:8
< scaffold440   452 .   C   A   10.3236 .   DP=71;VDB=0.02;SGB=-0.018177;RPB=1;MQB=0.162162;BQB=0.986486;MQ0F=0;AF1=0.0848011;AC1=9;DP4=37,0,2,0;MQ=42;FQ=10.6733;PV4=1,0.272596,0.0406405,1    GT:PL:DP:SP:GQ  0/0:0,3,35:1:0:11   0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,24,145:8:0:32 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,15,114:5:0:23 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,3,13:1:0:11   0/0:0,6,59:2:0:14   0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,3,44:1:0:11   0/0:0,3,44:1:0:11   0/0:0,3,38:1:0:11   0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,3,44:1:0:11   0/0:0,0,0:0:0:8 0/0:0,3,44:1:0:11   0/0:0,3,35:1:0:11   0/0:0,6,58:2:0:14   0/0:0,6,55:2:0:14   0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,6,85:2:0:14   0/1:32,3,0:1:0:11   0/0:0,0,0:0:0:8 0/0:0,9,110:3:0:17  0/1:35,3,0:1:0:11   0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,0,0:0:0:8 0/0:0,15,107:5:0:23 0/0:0,0,0:0:0:8
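With lines this long, a raw diff is hard to read. One way to see what changed is to compare only the GT sub-field, sample by sample, for the same site in both files. The sketch below assumes tab-separated VCF lines with identical FORMAT columns; the function name is hypothetical, the "before" line is trimmed to three samples from the site at position 420, and the "after" line is an assumed example of what a depth-filtered line might look like, not output copied from VCFtools.

```python
def changed_genotypes(before_line, after_line):
    """Return (sample_index, old_GT, new_GT) for every sample whose GT differs."""
    before = before_line.rstrip("\n").split("\t")
    after = after_line.rstrip("\n").split("\t")
    if before[:2] != after[:2]:
        raise ValueError("lines describe different sites")
    gt_idx = before[8].split(":").index("GT")
    changes = []
    for i, (b, a) in enumerate(zip(before[9:], after[9:])):
        old_gt = b.split(":")[gt_idx]
        new_gt = a.split(":")[gt_idx]
        if old_gt != new_gt:
            changes.append((i, old_gt, new_gt))
    return changes


prefix = "scaffold440\t420\t.\tT\tG\t999\t.\tDP=72\tGT:PL:DP:SP:GQ\t"
before = prefix + "0/0:0,6,74:2:0:12\t0/0:0,27,186:9:0:33\t0/1:46,0,104:6:0:40"
# Hypothetical filtered line: the first sample (DP=2) has its GT set to missing.
after = prefix + "./.:0,6,74:2:0:12\t0/0:0,27,186:9:0:33\t0/1:46,0,104:6:0:40"

print(changed_genotypes(before, after))
```

An empty list means the genotypes at that site are identical in both files; any entries show which samples were altered.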

The odd thing is that when I use vcftools with --min-meanDP and --max-meanDP, it works:

vcftools --vcf output.vcf --min-meanDP 4 --max-meanDP 100 --recode --out output.filtered4
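For context, --min-meanDP and --max-meanDP are documented as site filters: they act on the mean depth across the included individuals and drop whole sites outside the range. A rough Python sketch of that idea follows; it is not VCFtools' code, the function names are hypothetical, every sample is assumed to carry a numeric DP, and the example sites are trimmed to three samples each.

```python
def site_mean_depth(line):
    """Mean of the per-sample DP values on one tab-separated VCF data line."""
    fields = line.rstrip("\n").split("\t")
    dp_idx = fields[8].split(":").index("DP")
    depths = [int(sample.split(":")[dp_idx]) for sample in fields[9:]]
    return sum(depths) / len(depths)


def filter_sites_by_mean_depth(lines, min_mean, max_mean):
    """Keep header lines and the sites whose mean DP lies in [min_mean, max_mean]."""
    kept = []
    for line in lines:
        if line.startswith("#") or min_mean <= site_mean_depth(line) <= max_mean:
            kept.append(line)
    return kept


header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3"
site_a = ("scaffold440\t420\t.\tT\tG\t999\t.\tDP=72\tGT:PL:DP:SP:GQ\t"
          "0/0:0,6,74:2:0:12\t0/0:0,27,186:9:0:33\t0/1:46,0,104:6:0:40")
site_b = ("scaffold440\t451\t.\tC\tA\t11.216\t.\tDP=71\tGT:PL:DP:SP:GQ\t"
          "0/0:0,3,35:1:0:11\t0/0:0,0,0:0:0:8\t0/0:0,0,0:0:0:8")

kept = filter_sites_by_mean_depth([header, site_a, site_b], 4, 100)
data_lines = [line for line in kept if not line.startswith("#")]
print(len(data_lines))  # site_a has mean DP 17/3, site_b has mean DP 1/3
```

Because whole lines are removed here, the "kept N out of a possible M Sites" count goes down, which is the visible difference from a per-genotype filter.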
Comment written 2.5 years ago by beausoleilmo

Basically, there is no difference... Why isn't it working in that case?

Comment written 2.5 years ago by beausoleilmo
Answer: 2.2 years ago, beausoleilmo (McGill University) wrote:

Basically, I think that it's working. It's just that VCFtools changes a genotype that fails the depth filter to an unknown (missing) state. In other words, look at the genotype field (GT) before and after applying the filtering option: it should have changed.

I thought that the filter would "delete" some lines (loci or SNPs), but that's not the case. It only changes the genotype field, which is why the number of sites kept stays the same.
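The behaviour described above can be sketched in Python as follows. This is a minimal illustration, not VCFtools' implementation: the function name is hypothetical, every sample is assumed to have a numeric DP, and the masked genotype is assumed to be written as ./. with the other sub-fields left untouched. The example line is the site at position 420, trimmed to three samples.

```python
def mask_genotypes_by_depth(line, min_dp, max_dp):
    """Set GT to ./. for every sample whose DP is outside [min_dp, max_dp]."""
    fields = line.rstrip("\n").split("\t")
    keys = fields[8].split(":")
    gt_idx, dp_idx = keys.index("GT"), keys.index("DP")
    for col in range(9, len(fields)):
        values = fields[col].split(":")
        if not (min_dp <= int(values[dp_idx]) <= max_dp):
            values[gt_idx] = "./."  # genotype becomes missing; the site stays
            fields[col] = ":".join(values)
    return "\t".join(fields)


site = ("scaffold440\t420\t.\tT\tG\t999\t.\tDP=72\tGT:PL:DP:SP:GQ\t"
        "0/0:0,6,74:2:0:12\t0/0:0,27,186:9:0:33\t0/1:46,0,104:6:0:40")

some_masked = mask_genotypes_by_depth(site, 4, 100)
all_masked = mask_genotypes_by_depth(site, 10000, 10000000)

print(some_masked.split("\t")[9:])
print(all_masked.split("\t")[9:])
```

Even with the extreme thresholds, the function still returns one line per site, which matches the unchanged "kept 626116 out of a possible 626116 Sites" message. If the goal is to remove sites afterwards, VCFtools also has a --max-missing option that excludes sites by their proportion of missing genotypes; whether that fits this dataset is a separate judgment call.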
