Question: Questions on tagging low qual variants and DP filtering on a joint VCF (generated by GATK GenotypeGVCFs)
4.1 years ago by
caok0
United States
caok0 wrote:

Hi Folks,
I have two questions about a joint (multi-sample) VCF and would appreciate your help.

  1. I need to flag SNPs and indels in the "FILTER" column (PASS or Low_confidence) of a joint VCF generated by GenotypeGVCFs. We call each family (trio) together, so a typical joint VCF contains calls from the child and both parents. I followed the rules proposed by GATK: http://gatkforums.broadinstitute.org/discussion/2806/howto-apply-hard-filters-to-a-call-set
    I then noticed that the INFO field in the joint VCF is computed from all samples in the VCF. However, I want to set the "FILTER" column based on the child only (the Child column). How can I apply GATK SelectVariants to this joint VCF using only the child's information? Or would another tool help?

  2. I also want to filter the same joint VCF by DP in the Child column. How can I do this with GATK or another tool? SelectVariants seems to use the DP in the "INFO" field, which is the sum of DP across all jointly called samples. Any suggestions?

Thank you very much!

-Linda

4.1 years ago by
Len Trigg1.2k
New Zealand
Len Trigg1.2k wrote:

I'm not sure exactly what you mean by your first question (do you mean annotate with the variant type based on the genotype of the Child?)

The second one is simple enough if you use Real Time Genomics tools:

rtg vcffilter -i input-variants.vcf.gz -o output-variants.vcf.gz --min-read-depth=NN --sample=Child --fail=CHILD-LOW-DP
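
For readers without RTG tools, the same idea can be sketched in plain Python. This is only an illustration under assumptions: the function name, the default threshold of 10, and the decision to leave records with a missing DP untouched are all hypothetical choices, and the input is assumed to be an uncompressed VCF with a FORMAT column. It reads the DP from the Child sample column, not the summed DP in INFO.

```python
def flag_low_child_dp(vcf_lines, sample="Child", min_dp=10, label="CHILD-LOW-DP"):
    """Yield VCF lines, adding `label` to FILTER when the sample's FORMAT/DP < min_dp.

    Hypothetical helper: a complete tool would also add a matching ##FILTER header line.
    """
    sample_col = None
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##"):          # meta-information lines pass through
            yield line
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):      # column header: locate the sample column
            sample_col = fields.index(sample)  # raises ValueError if the sample is absent
            yield line
            continue
        # Pair FORMAT keys with this sample's values, e.g. GT:DP with 0/1:5
        keys = fields[8].split(":")
        values = fields[sample_col].split(":")
        dp = dict(zip(keys, values)).get("DP", ".")
        if dp != "." and int(dp) < min_dp:  # assumption: missing DP leaves FILTER unchanged
            if fields[6] in (".", "PASS"):
                fields[6] = label
            else:
                fields[6] += ";" + label    # keep any filters that were already set
        yield "\t".join(fields)
```

Records that fail are kept and labelled, not dropped, which mirrors what the --fail option does in the command above.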


Thank you for your answer. The first question is about putting "PASS" or "Low_confidence" in the "FILTER" column based on filtering thresholds on QD and FS for SNPs and indels. For a single-sample VCF it's easy, but for a joint VCF I want to set the flag based on one sample in the joint VCF (for example, the child). GATK VariantFiltration, however, uses the "INFO" column, which in a joint VCF is a summary of all samples in the VCF.

I hope that is clear. Any suggestions?
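
One possible reading of this request, sketched in Python purely as an illustration: set FILTER to PASS or Low_confidence only at sites where the child carries a non-reference allele, using the QD and FS values in INFO. Note the limitation: QD and FS here are still the cohort-level values from the joint VCF, and this sketch does not recompute child-only annotations. The function name and the child_col argument are hypothetical; the default thresholds (QD < 2.0, FS > 60.0) are the SNP values from the GATK hard-filtering guide linked in the question, and indels would need FS > 200.0 instead.

```python
def tag_filter_for_child(record, child_col, qd_min=2.0, fs_max=60.0):
    """Return one VCF data line with FILTER set to PASS or Low_confidence.

    record: a tab-separated VCF data line; child_col: 0-based column index of the child.
    Sites where the child has no alternate allele are returned unchanged.
    """
    fields = record.rstrip("\n").split("\t")
    # Parse key=value INFO entries; flag-only entries (no "=") are skipped
    info = dict(kv.split("=", 1) for kv in fields[7].split(";") if "=" in kv)
    # GT is the first FORMAT key when present (per the VCF specification)
    gt = fields[child_col].split(":")[0]
    alleles = gt.replace("|", "/").split("/")
    child_has_alt = any(a not in ("0", ".") for a in alleles)
    if not child_has_alt:
        return "\t".join(fields)
    qd = float(info["QD"]) if "QD" in info else None
    fs = float(info["FS"]) if "FS" in info else None
    # Assumption: a missing annotation does not by itself fail the site
    low = (qd is not None and qd < qd_min) or (fs is not None and fs > fs_max)
    fields[6] = "Low_confidence" if low else "PASS"
    return "\t".join(fields)
```

With the usual nine fixed VCF columns and Child as the first sample, child_col would be 9.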

ADD REPLYlink written 4.1 years ago by caok0