7.4 years ago
fbsja
hiya,
I wanted to ask about filtering on the 'PASS' tag in the FILTER column of a multi-sample VCF file.
I used the GATK VariantFiltration command below to assign filter tags, including PASS.
########################################################################################
java -XX:-DoEscapeAnalysis -jar /home/software/GATK-3.5-0/GenomeAnalysisTK.jar \
-R hg19_withouM.fa \
-T VariantFiltration \
--variant /home/john/CFTR/multi_sample.vcf \
-o /home/john/CFTR/multi_sample.filtered.vcf \
--clusterWindowSize 10 \
--filterExpression "MQ0 >= 4 && ((MQ0 / (1.0 * DP)) > 0.1)" \
--filterName "HARD_TO_VALIDATE" \
--filterExpression "DP < 5 " \
--filterName "LowCoverage" \
--filterExpression "QUAL < 30.0 " \
--filterName "VeryLowQual" \
--filterExpression "QUAL >= 30.0 && QUAL < 50.0 " \
--filterName "LowQual" \
--filterExpression "QD < 1.5 " \
--filterName "LowQD" \
--filterExpression "SB > -10.0 " \
--filterName "StrandBias"
########################################################################################
Does the PASS filter correspond only to the first sample or to all samples? I was planning to keep only PASS sites as a general rule of thumb, but this drastically reduces my total number of SNPs.
I am wondering whether I would be better off keeping all variants and applying general QC filters instead, such as genotyping rate, MAF, and individual missingness?
thank you very much.
cheers, J
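All samples. The FILTER column is a site-level field, so PASS (or any filter name) applies to the variant record as a whole, not to an individual sample.
To filter individual sample genotypes, use
--genotypeFilterExpression and --genotypeFilterName, which record the filter name in each genotype's FT field: https://software.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_filters_VariantFiltration.php#--genotypeFilterExpression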
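To make the site-level versus genotype-level distinction concrete, here is a minimal Python sketch. It assumes a tab-delimited VCF data line with GT in the FORMAT column and, optionally, an FT entry per genotype. The sample names, the two toy records, the "LowDP" genotype filter name and the helper functions are all made up for illustration; none of them come from your file.

```python
# Toy illustration only: sample names, records and the "LowDP" filter name are hypothetical.
SAMPLES = ["S1", "S2", "S3"]
LINES = [
    "chr7\t117199644\t.\tA\tG\t812.5\tPASS\tDP=90\tGT:DP:FT\t0/1:30:PASS\t0/0:4:LowDP\t1/1:56:PASS",
    "chr7\t117227832\t.\tC\tT\t41.2\tLowQual\tDP=12\tGT:DP\t0/1:6\t./.:0\t0/0:6",
]


def parse_record(line, sample_names):
    """Split one VCF data line into its site-level FILTER and per-sample FORMAT fields."""
    cols = line.rstrip("\n").split("\t")
    site_filter = cols[6]  # column 7: one value for the whole record
    site_filters = [] if site_filter in ("PASS", ".") else site_filter.split(";")
    fmt_keys = cols[8].split(":")
    samples = {
        name: dict(zip(fmt_keys, raw.split(":")))
        for name, raw in zip(sample_names, cols[9:])
    }
    return {
        "passed": site_filter == "PASS",
        "site_filters": site_filters,
        "samples": samples,
    }


def genotype_is_usable(fields):
    """A genotype is usable if it is fully called and its FT (if present) is PASS or missing."""
    gt = fields.get("GT", ".")
    called = "." not in gt
    ft = fields.get("FT", "PASS")
    return called and ft in ("PASS", ".")


def call_rate(record):
    """Fraction of samples with a usable genotype at this site."""
    usable = sum(genotype_is_usable(f) for f in record["samples"].values())
    return usable / len(record["samples"])


if __name__ == "__main__":
    for line in LINES:
        rec = parse_record(line, SAMPLES)
        print(rec["passed"], rec["site_filters"], round(call_rate(rec), 2))
```

The point of the sketch is that `passed` is decided once per line, whatever the individual samples look like, while the per-sample FT values and the call rate vary within the same record. That is why a site-level PASS filter and a genotyping-rate filter answer different questions and can reasonably be combined.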