Is this an appropriate command for estimating contamination in a germline WGS sample using GATK?
8 weeks ago
curious ▴ 900

I have 30x human WGS samples and want to estimate contamination with the GATK tools, to cross-check another program that I am using for the same purpose. This is what I have done:

gatk GetPileupSummaries \
  -I my_sample.cram \
  -V small_exac_common_3.hg38.vcf.gz \
  -L small_exac_common_3.hg38.vcf.gz \
  -O my_sample.getpileups.table \
  --reference hs38DH.fa

gatk CalculateContamination \
  -I my_sample.getpileups.table \
  -O my_sample.contamination.table

where small_exac_common_3.hg38.vcf.gz was downloaded from gs://gatk-best-practices/somatic-hg38/small_exac_common_3.hg38.vcf.gz

My main questions are:

 1. Is this an appropriate contamination estimation workflow for germline human WGS?
 2. If so, is small_exac_common_3.hg38.vcf.gz a reasonable file to pass for the -V and -L arguments?
 3. In my_sample.contamination.table I get the columns "sample", "contamination", and "error". If the value of "contamination" is 0.01, does this mean the sample is 1% contaminated?
19 days ago
Kevin Blighe ★ 90k

Yes, this is an appropriate workflow for estimating contamination in germline human WGS. It's the standard GATK method; although it was originally somatic-focused, it works well for diploid samples because it models allelic fraction deviations at common SNPs.

Yes, small_exac_common_3.hg38.vcf.gz is the exact resource recommended by GATK for -V and -L (it's a subset of ~10k high-confidence ExAC common variants optimized for this).

And yes, a contamination value of 0.01 indicates ~1% contamination: it's the estimated fraction of reads that come from cross-sample admixture. For cross-checking, I'd also suggest running VerifyBamID (a non-GATK alternative) if you want an orthogonal method.
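
To make the fraction-to-percentage reading concrete, here is a minimal sketch that prints the estimate and its error as percentages. It assumes the table is tab-separated with one header line and the three columns named in the question ("sample", "contamination", "error"); the function name contamination_percent is just a hypothetical helper, not part of GATK.

```shell
# Hypothetical helper: print each sample's contamination estimate as a percentage.
# Assumes a tab-separated table with one header line and the columns
#   sample <TAB> contamination <TAB> error
# where contamination and error are fractions (0.01 means 1%).
contamination_percent() {
  awk -F'\t' 'NR > 1 {
    # Skip the header (NR == 1); convert both fractions to percentages.
    printf "%s\t%.2f%%\t+/- %.2f%%\n", $1, $2 * 100, $3 * 100
  }' "$1"
}
```

It could be called as `contamination_percent my_sample.contamination.table`; a row with a contamination of 0.01 would be reported as 1.00%.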

Kind regards,

Kevin
