SNP Filtering Based on Read Depth in GATK
11.3 years ago
michealsmith ▴ 790

A long-standing question about SNP calling and filtering with GATK: does GATK use read depth as a metric for SNP filtering? For example, keeping only SNPs covered by at least 10 reads. GATK does report a DP metric; here is an example record:

1    53139    53140    AA    -    1    53138    rs199543075    TAA    T    238.33    PASS    AC=1;AF=0.250;AN=4;BaseQRankSum=-1.954;DB;DP=30;FS=0.000;HaplotypeScore=0.5834;MLEAC=1;MLEAF=0.250;MQ=13.35;MQ0=0;MQRankSum=-0.312;QD=14.02;RPA=3,1;RU=A;ReadPosRankSum=1.093;STR;VQSLOD=2.04;culprit=QD;set=variant    GT:AD:DP:GQ:PL    0/0:5,0:5:15:0,15,255    0/1:2,6:8:78:277,0,78

From the VCF header we know:

##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">

So does this DP metric represent the read depth? Here DP=30, while the total read depth across my two samples is 5+8=13. Why do they differ?

Also, I keep coming across SNP calls with very low coverage, such as 2 or 3 reads, in my filtered list of SNPs, so I doubt that GATK applies any actual read-depth threshold when filtering.

Thanks

gatk snp • 7.6k views

I think 5 and 8 are the numbers of reads that were actually used for genotyping (i.e. reads passing quality cutoffs such as QUAL > some value, etc.).

11.3 years ago
Jorjial ▴ 300

You can find the explanation in the GATK guide, which says:

"While the sample-level (FORMAT) DP field describes the total depth of reads that passed the Unified Genotyper's internal quality control metrics (like MAPQ > 17, for example), the INFO field DP represents the unfiltered depth over all samples..." I think this answers your first question: in your record, the INFO DP of 30 counts all reads at the site, whereas the per-sample DP values of 5 and 8 count only the reads that passed those filters.
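To see the two depth values side by side, here is a minimal Python sketch that pulls the INFO-level DP and the per-sample FORMAT DP out of a single record. It assumes a standard tab-separated VCF data line (8 fixed columns, then FORMAT, then one column per sample), so the record below is the example from the question with its leading annotation columns dropped and the INFO field shortened. The function names `info_depth` and `sample_depths` are made up for illustration and are not part of GATK.

```python
# Example record from the question, reduced to standard VCF columns
# (assumption: the five leading annotation columns are not part of the VCF line).
record = "\t".join([
    "1", "53138", "rs199543075", "TAA", "T", "238.33", "PASS",
    "AC=1;AF=0.250;AN=4;DP=30;MQ=13.35;QD=14.02",
    "GT:AD:DP:GQ:PL",
    "0/0:5,0:5:15:0,15,255",
    "0/1:2,6:8:78:277,0,78",
])


def info_depth(line):
    """Return the INFO-level DP (depth over all samples), or None if absent."""
    info = line.rstrip("\n").split("\t")[7]
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "DP":
            return int(value)
    return None


def sample_depths(line):
    """Return the FORMAT-level DP of each sample (None where it is missing)."""
    fields = line.rstrip("\n").split("\t")
    keys = fields[8].split(":")
    if "DP" not in keys:
        return [None] * len(fields[9:])
    dp_index = keys.index("DP")
    depths = []
    for sample in fields[9:]:
        values = sample.split(":")
        # Trailing FORMAT fields may be dropped, and "." marks a missing value.
        raw = values[dp_index] if dp_index < len(values) else "."
        depths.append(None if raw == "." else int(raw))
    return depths


print(info_depth(record))     # 30
print(sample_depths(record))  # [5, 8]
```

The gap between 30 and 5+8=13 is then the number of reads that were counted at the site but not used for genotyping.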

About the second question, I think GATK requires at least 2 reads that passed quality control before it reports a position as covered (GT different from "./."). If the position may be a variant, I think it will be reported with an alternative allele. In any case, you can always ask the GATK community. I hope this helps.
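If you want a hard read-depth cutoff such as the 10 reads mentioned in the question, you can apply it yourself after calling. GATK's VariantFiltration tool accepts depth-based filter expressions, and the same idea can be sketched in plain Python as below. This is only an illustration under assumptions: the input is a standard tab-separated VCF data line, the function name `passes_min_depth` and the `min_dp` parameter are hypothetical, and uncalled genotypes are skipped rather than counted as failures.

```python
def passes_min_depth(line, min_dp=10):
    """Return True if every called sample has FORMAT DP >= min_dp."""
    fields = line.rstrip("\n").split("\t")
    keys = fields[8].split(":")
    if "DP" not in keys:
        return False  # no per-sample depth to check
    dp_index = keys.index("DP")
    gt_index = keys.index("GT") if "GT" in keys else None
    for sample in fields[9:]:
        values = sample.split(":")
        if gt_index is not None and values[gt_index] in ("./.", ".|.", "."):
            continue  # uncalled genotype: nothing to filter
        raw = values[dp_index] if dp_index < len(values) else "."
        if raw == "." or int(raw) < min_dp:
            return False
    return True


# Example record from the question, reduced to standard VCF columns
# (assumption: the leading annotation columns are not part of the VCF line).
record = "\t".join([
    "1", "53138", "rs199543075", "TAA", "T", "238.33", "PASS",
    "AC=1;AF=0.250;AN=4;DP=30",
    "GT:AD:DP:GQ:PL",
    "0/0:5,0:5:15:0,15,255",
    "0/1:2,6:8:78:277,0,78",
])

print(passes_min_depth(record, min_dp=10))  # False: samples have DP 5 and 8
print(passes_min_depth(record, min_dp=5))   # True
```

Whether to require the threshold in every sample or only in the samples carrying the variant is a design choice; the sketch takes the stricter option.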
