Is this an appropriate command for estimating contamination in a germline WGS sample using GATK?
8 hours ago
curious ▴ 900

I have 30x human WGS samples and want to estimate contamination with the GATK tools, as a cross-check on another program I am using for the same purpose. This is what I have done:

gatk GetPileupSummaries \
  -I my_sample.cram \
  -V small_exac_common_3.hg38.vcf.gz \
  -L small_exac_common_3.hg38.vcf.gz \
  -O my_sample.getpileups.table \
  --reference hs38DH.fa

gatk CalculateContamination \
  -I my_sample.getpileups.table \
  -O my_sample.contamination.table

where small_exac_common_3.hg38.vcf.gz was downloaded from gs://gatk-best-practices/somatic-hg38/small_exac_common_3.hg38.vcf.gz

My main questions are:

 1. Is this an appropriate contamination estimation workflow for germline human WGS?
 2. If so, is small_exac_common_3.hg38.vcf.gz a reasonable file to pass for the -V and -L arguments?
 3. In my_sample.contamination.table I get the columns "sample", "contamination", and "error". If the value of "contamination" is 0.01, does this mean the sample is 1% contaminated?
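
For question 3, this is how I am currently reading the table. It is only a sketch: it assumes the file is tab-separated with a single header row, and that "contamination" and "error" are fractions rather than percentages, which is the part I would like confirmed. `contamination_percent` is my own helper name, not a GATK command.

```shell
# Hypothetical helper (not part of GATK): print each sample's contamination
# and error as percentages.
# Assumes a tab-separated table with one header row and the columns
# sample, contamination, error, where the values are fractions.
contamination_percent() {
  awk -F'\t' 'NR > 1 {
    printf "%s\t%.2f%%\t+/- %.2f%%\n", $1, $2 * 100, $3 * 100
  }' "$1"
}

# Usage:
#   contamination_percent my_sample.contamination.table
```

Under that assumption, a row with contamination 0.01 would be reported as 1.00%.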