A long-standing question about SNP calling and filtering with GATK: does GATK use read depth as a metric for SNP filtering? For example, keeping only SNPs covered by at least 10 reads. GATK does report a DP metric; an example is as follows:
1 53139 53140 AA - 1 53138 rs199543075 TAA T 238.33 PASS AC=1;AF=0.250;AN=4;BaseQRankSum=-1.954;DB;DP=30;FS=0.000;HaplotypeScore=0.5834;MLEAC=1;MLEAF=0.250;MQ=13.35;MQ0=0;MQRankSum=-0.312;QD=14.02;RPA=3,1;RU=A;ReadPosRankSum=1.093;STR;VQSLOD=2.04;culprit=QD;set=variant GT:AD:DP:GQ:PL 0/0:5,0:5:15:0,15,255 0/1:2,6:8:78:277,0,78
From the VCF header we know:
#FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
So does this DP metric represent the read depth? Here DP=30, while the total read depth for my two samples is 5+8=13, so why do they differ?
Also, I keep coming across SNP calls with very low coverage, like 2 or 3 reads, in my filtered SNP list, so I find it hard to believe that GATK actually uses read depth as a filtering metric.
Thanks
I think 5 and 8 are the numbers of reads that were actually used for genotyping in each sample (reads with quality above some threshold, etc.).
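
If you want to check the two numbers yourself, or apply your own minimum-depth cutoff, here is a minimal Python sketch. It assumes the first five fields of the example line are annotation columns prepended to the VCF record, so only the VCF part is used. The function names and the `min_dp=10` default are hypothetical and chosen for illustration; none of this is GATK code. It also assumes every sample has a numeric DP (no `.` placeholders).

```python
# VCF portion of the example record (leading annotation columns dropped; assumption).
RECORD = (
    "1 53138 rs199543075 TAA T 238.33 PASS "
    "AC=1;AF=0.250;AN=4;BaseQRankSum=-1.954;DB;DP=30;FS=0.000;"
    "HaplotypeScore=0.5834;MLEAC=1;MLEAF=0.250;MQ=13.35;MQ0=0;"
    "MQRankSum=-0.312;QD=14.02;RPA=3,1;RU=A;ReadPosRankSum=1.093;STR;"
    "VQSLOD=2.04;culprit=QD;set=variant "
    "GT:AD:DP:GQ:PL 0/0:5,0:5:15:0,15,255 0/1:2,6:8:78:277,0,78"
)


def parse_depths(record):
    """Return (site-level INFO DP, list of per-sample FORMAT DP values)."""
    fields = record.split()
    info, fmt, samples = fields[7], fields[8], fields[9:]

    # Site-level DP lives in the semicolon-separated INFO column.
    info_dp = None
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "DP":
            info_dp = int(value)

    # Per-sample DP is located by its position in the FORMAT column.
    dp_index = fmt.split(":").index("DP")
    sample_dps = [int(sample.split(":")[dp_index]) for sample in samples]
    return info_dp, sample_dps


def passes_min_depth(sample_dps, min_dp=10):
    """True only if every sample has at least min_dp reads (hypothetical cutoff)."""
    return all(dp >= min_dp for dp in sample_dps)


info_dp, sample_dps = parse_depths(RECORD)
print(info_dp, sample_dps, sum(sample_dps))  # 30 [5, 8] 13
print(passes_min_depth(sample_dps))          # False
```

For this record the site-level DP is 30 while the per-sample DPs are 5 and 8, so a cutoff of 10 reads per sample would drop it even though it carries the PASS flag.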