Question: Applying hard filters for variants
bioinforesearchquestions (United States) wrote, 3.9 years ago:

I am currently working on influenza virus and ebola virus. I have 45 virus samples, so I have 45 BAM files aligned to the influenza reference genome (.fa). I called variants with UnifiedGenotyper:

java -Xmx16g -Djava.io.tmpdir=$out_folder/tmp -jar GenomeAnalysisTK.jar \
-T UnifiedGenotyper \
-nt 12 \
-dcov 10000 \
-glm BOTH \
-R influenza.fa \
-l INFO \
-o A_California_Influenza_Virus.raw.vcf \
--sample_ploidy 1 \
$INPUT_BAM_FILES

This gave me a single raw VCF file (A_California_Influenza_Virus.raw.vcf) covering all 45 samples. It contains 1400 VCF records.
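The record and sample counts can be double-checked with a few lines of Python. This is only a minimal sketch: `count_vcf` is a hypothetical helper name, and it assumes an uncompressed, tab-delimited VCF passed in as an iterable of lines.

```python
def count_vcf(lines):
    """Return (number of variant records, list of sample names) for a VCF."""
    records = 0
    samples = []
    for line in lines:
        if line.startswith("##"):
            continue  # meta-information lines
        if line.startswith("#CHROM"):
            # Sample names start at the 10th column of the header line.
            samples = line.rstrip("\n").split("\t")[9:]
        elif line.strip():
            records += 1  # every other non-empty line is one variant record
    return records, samples


# Tiny in-memory example (made-up data, two samples, one record).
example = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n",
    "seg1\t10\t.\tA\tG\t50\t.\tDP=10\tGT\t1\t0\n",
]
n_records, sample_names = count_vcf(example)
print(n_records, len(sample_names))
```

For a real file, the same function can be fed an open file handle, for example `count_vcf(open("A_California_Influenza_Virus.raw.vcf"))`.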

Following the GATK Best Practices paper, I chose the hard filtering option, which is recommended for small datasets.

_Is my set of VCF records small enough that hard filtering is the right choice?_

Then I extracted only the SNPs into a separate VCF file.

java -jar /data1/software/gatk/current/GenomeAnalysisTK.jar \
-T SelectVariants \
-R A_California_Influenza_Virus_H1N1.fa \
-V A_California_Influenza_Virus.raw.vcf \
-selectType SNP \
-o VariantFiltering/A_California_Influenza_Virus.raw.snps.vcf
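As a rough illustration of what "SNP only" means at the record level, the sketch below treats a record as a SNP when REF is a single base and every ALT allele is a single base. This is an approximation for illustration, not GATK's own type-detection code, and `is_snp` is a hypothetical helper name.

```python
BASES = {"A", "C", "G", "T"}


def is_snp(ref, alt):
    """Approximate SNP test on the REF and ALT columns of one VCF record.

    ALT may hold several comma-separated alleles; all of them must be
    single bases for the record to count as a SNP here.
    """
    alt_alleles = alt.split(",")
    return ref.upper() in BASES and all(a.upper() in BASES for a in alt_alleles)


print(is_snp("A", "G"))    # single-base substitution
print(is_snp("AT", "A"))   # deletion, so not a SNP
print(is_snp("A", "G,AT")) # mixed record, so not a SNP under this rule
```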

Then I applied hard filtering to the SNPs.

java -jar GenomeAnalysisTK.jar \
-T VariantFiltration \
-R A_California_Influenza_Virus_H1N1.fa \
-V VariantFiltering/A_California_Influenza_Virus.raw.snps.vcf \
--filterExpression "QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0" \
--filterName "myfilter1" \
-o VariantFiltering/A_California_Influenza_Virus.filtered.snps.vcf

I understand that variants matching any of the above conditions are considered bad variants.
What does QD < 2.0 mean?
What does FS > 60.0 mean?
What does MQ < 40.0 mean?
What does MQRankSum < -12.5 mean?
What does ReadPosRankSum < -8.0 mean?
What are the threshold values for high-confidence variants for QD, FS, MQ, MQRankSum, ReadPosRankSum, and DP?

Brice Sarver (United States) wrote, 3.9 years ago:

A quick search brings up this: http://gatkforums.broadinstitute.org/discussion/2806/howto-apply-hard-filters-to-a-call-set
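To make the filter expression in the question concrete: the conditions are joined with `||`, so a SNP is flagged as soon as any one of them is true. The sketch below is a minimal Python illustration of that logic, not GATK's implementation. The function names (`parse_info`, `failed_conditions`, `filter_status`) are made up for this example, and it assumes that an annotation missing from a record simply does not trigger its condition, which may differ from how GATK treats missing values.

```python
import operator

# Thresholds copied from the --filterExpression in the question.
# Comments give the usual meaning of each GATK annotation.
CONDITIONS = [
    ("QD", operator.lt, 2.0),               # variant quality normalized by depth
    ("FS", operator.gt, 60.0),              # Phred-scaled strand bias (Fisher's exact test)
    ("MQ", operator.lt, 40.0),              # root mean square mapping quality
    ("MQRankSum", operator.lt, -12.5),      # rank sum test: mapping quality, alt vs ref reads
    ("ReadPosRankSum", operator.lt, -8.0),  # rank sum test: position in read, alt vs ref reads
]


def parse_info(info_field):
    """Turn a VCF INFO string such as 'QD=1.5;FS=3.2' into a dict of floats."""
    values = {}
    for item in info_field.split(";"):
        if "=" not in item:
            continue  # flag-style entries carry no value
        key, raw = item.split("=", 1)
        try:
            values[key] = float(raw)
        except ValueError:
            pass  # non-numeric or multi-valued entries are not needed here
    return values


def failed_conditions(info_field):
    """List the annotations whose condition is true for this record."""
    values = parse_info(info_field)
    failed = []
    for name, compare, threshold in CONDITIONS:
        # Assumption: a missing annotation does not trigger its condition.
        if name in values and compare(values[name], threshold):
            failed.append(name)
    return failed


def filter_status(info_field, filter_name="myfilter1"):
    """Return the filter name if any condition is true, otherwise 'PASS'."""
    return filter_name if failed_conditions(info_field) else "PASS"


# Made-up INFO strings for illustration only.
print(failed_conditions("QD=1.2;FS=3.0;MQ=58.1"))  # -> ['QD']
print(filter_status("QD=25.4;FS=1.1;MQ=60.0;MQRankSum=0.3;ReadPosRankSum=-0.5"))  # -> PASS
```

Read this way, each threshold marks the point past which a call is treated as suspect on that annotation alone. A record that avoids all five conditions keeps PASS in the FILTER column.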
