Question: How to filter VCF where at least x% of the individuals have DP>=10 and GQ>=20
2.7 years ago by
hellbio420
hellbio420 wrote:

Hi,

I would like to filter a VCF file using DP and GQ thresholds, keeping sites where at least 80% of the individuals meet the thresholds. More precisely, I have the two scenarios below:

  1. Retain sites where at least 80% of the individuals have DP >= 10 and GQ >= 20, irrespective of whether the genotype carries the reference or a non-reference allele.

  2. Retain sites where at least one sample has a non-reference allele with DP >= 10 and GQ >= 20.

I checked the vcftools documentation but could not find where I could specify the minimum number of individuals. I believe there could be an existing thread or solution to achieve this. Could someone point me to it?

gatk filter vcf • 1.4k views
modified 2.7 years ago by Pierre Lindenbaum131k • written 2.7 years ago by hellbio420
2.7 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum131k wrote:

Using vcffilterjdk: http://lindenb.github.io/jvarkit/VcfFilterJdk.html

Retain sites where at least 80% of the individuals have DP >= 10 and GQ >= 20, irrespective of the reference or non-reference allele:

$ java -jar dist/vcffilterjdk.jar -e 'return variant.getGenotypes().stream().filter(G->G.getDP()>=10 && G.getGQ()>=20).count()/(double)variant.getNSamples() >= 0.8;' input.vcf
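If running the jar is not an option, the same criterion can be sketched in plain Python. This is only a rough illustration and not part of vcffilterjdk: the function names (parse_int, passing_fraction, keep_site) are made up here, and it assumes an uncompressed, tab-delimited VCF data line whose FORMAT column lists DP and GQ as integers. Samples with missing DP or GQ count as failing.

```python
def parse_int(value):
    """Return an int, or None for missing or non-integer values such as '.'."""
    try:
        return int(value)
    except ValueError:
        return None


def passing_fraction(vcf_line, min_dp=10, min_gq=20):
    """Fraction of samples on one VCF data line with DP >= min_dp and GQ >= min_gq."""
    fields = vcf_line.rstrip("\n").split("\t")
    keys = fields[8].split(":")      # FORMAT column, e.g. GT:DP:GQ
    samples = fields[9:]             # one column per individual
    if not samples:
        return 0.0
    passed = 0
    for sample in samples:
        # Trailing sample fields may be dropped in VCF, so default to '.'.
        values = dict(zip(keys, sample.split(":")))
        dp = parse_int(values.get("DP", "."))
        gq = parse_int(values.get("GQ", "."))
        if dp is not None and gq is not None and dp >= min_dp and gq >= min_gq:
            passed += 1
    return passed / len(samples)


def keep_site(vcf_line, min_fraction=0.8):
    """True if at least min_fraction of the samples pass both thresholds."""
    return passing_fraction(vcf_line) >= min_fraction


# Hypothetical record: 4 of 5 samples pass, so the site sits exactly at 80%.
example = "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP:GQ\t0/0:12:30\t0/1:15:99\t1/1:10:20\t0/0:25:40\t./.:.:."
print(keep_site(example))  # True
```

Note that "at least 80%" means the comparison has to be >= rather than >, otherwise a site where exactly 4 of 5 samples pass would be dropped.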

Retain sites where at least one sample has a non-reference allele with DP >= 10 and GQ >= 20:

$ java -jar dist/vcffilterjdk.jar -e 'return variant.getGenotypes().stream().anyMatch(G->G.getDP()>=10 && G.getGQ()>=20 && G.getAlleles().stream().anyMatch(A->A.isCalled() && !A.isReference()));' input.vcf
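The second criterion can be sketched the same way in plain Python, without jvarkit. Again this is only an illustration under assumptions: the function name has_confident_alt_carrier is made up, the input is taken to be an uncompressed, tab-delimited VCF data line, and GT, DP and GQ are read from the FORMAT column. A sample counts only if it carries at least one called non-reference allele (allele index > 0) and also meets both thresholds itself.

```python
import re


def has_confident_alt_carrier(vcf_line, min_dp=10, min_gq=20):
    """True if any sample has a non-reference allele with DP >= min_dp and GQ >= min_gq."""
    fields = vcf_line.rstrip("\n").split("\t")
    keys = fields[8].split(":")      # FORMAT column, e.g. GT:DP:GQ
    for sample in fields[9:]:
        values = dict(zip(keys, sample.split(":")))
        # GT alleles are separated by '/' (unphased) or '|' (phased); '.' is a no-call.
        alleles = re.split(r"[/|]", values.get("GT", "."))
        carries_alt = any(a.isdigit() and int(a) > 0 for a in alleles)
        dp = values.get("DP", ".")
        gq = values.get("GQ", ".")
        # Missing or non-integer DP/GQ values fail the isdigit() check.
        confident = (dp.isdigit() and gq.isdigit()
                     and int(dp) >= min_dp and int(gq) >= min_gq)
        if carries_alt and confident:
            return True
    return False


# Hypothetical record: the second sample is 0/1 with DP=15 and GQ=99.
example = "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP:GQ\t0/0:30:99\t0/1:15:99"
print(has_confident_alt_carrier(example))  # True
```

The thresholds are tested on the same sample that carries the non-reference allele, so a well-covered 0/0 sample next to a poorly covered 0/1 sample does not rescue the site.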
modified 2.7 years ago • written 2.7 years ago by Pierre Lindenbaum131k