How to speed up GenotypeGVCFs?
3.4 years ago
kumar.vinod81 ▴ 330

Hi, I am running joint genotyping with GenotypeGVCFs on around 850 samples. My interval.list contains 8 chromosomes and several contigs. The job has been running for many days. Is there a way to speed this up? I asked the GATK team and they suggested the script below, but I don't know whether I can use it for GenotypeGVCFs alone, because GenomicsDBImport has already finished. Also, if I use this script, is interval.list no longer needed, or does it have to be specified in some other way? The script from the GATK team is here:

for n in {1..19}; do
  gatk GenomicsDBImport -L $n <rest of command same as before>
  gatk GenotypeGVCFs -L $n <rest of command same as before>
done

I tried it this way:

for n in {1..212}; do qsub genonotypeGVCF.sh -L $n; done

I have 8 chromosomes and the rest are contigs (212). But it didn't work; the output is still produced very slowly.

GenomicsDBImport has already finished on my samples and I want to run only GenotypeGVCFs. Is that possible?

Or do I need to run both of these steps together to make it faster?

How are people getting faster results with GenotypeGVCFs? Can somebody post their scripts here?

Thanks,

GenotypeGVCF GATK • 2.1k views
3.4 years ago
for n in {1..212};do qsub

Use a workflow manager like Snakemake or Nextflow.

How people are getting faster results with genotypeGVCF

Split your reference genome into intervals using ScatterIntervalsByNs, then genotype each interval as a separate job: https://gatk.broadinstitute.org/hc/en-us/articles/360041416072-ScatterIntervalsByNs-Picard-
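
As a rough illustration of the scatter idea, here is a minimal sketch that turns a Picard-style interval list into one GenotypeGVCFs command per interval. It only prints the commands (a dry run) so you can inspect them before submitting. The names ref.fasta, my_database, scattered.interval_list and the genotyped.* output names are placeholders I made up, not taken from the question. It assumes the intervals lie inside the regions already imported into the existing GenomicsDB workspace, in which case GenomicsDBImport should not need to be rerun.

```shell
#!/bin/sh
# Sketch only: file and workspace names below are hypothetical placeholders.
# The interval list could come from something like:
#   gatk ScatterIntervalsByNs --REFERENCE ref.fasta --OUTPUT_TYPE ACGT \
#        --OUTPUT scattered.interval_list
set -eu

# Print one GenotypeGVCFs command per interval in a Picard-style interval list.
#   $1 = interval_list file, $2 = reference FASTA, $3 = GenomicsDB workspace
emit_commands() {
    awk -F'\t' -v ref="$2" -v ws="$3" '
        /^@/ { next }                       # skip the SAM-style header lines
        NF >= 3 {
            interval = $1 ":" $2 "-" $3     # contig:start-end, as accepted by -L
            tag = interval
            gsub(/[-:]/, "_", tag)          # make a file-name-friendly suffix
            printf "gatk GenotypeGVCFs -R %s -V gendb://%s -L %s -O genotyped.%s.vcf.gz\n", ref, ws, interval, tag
        }
    ' "$1"
}

# Dry run when called as: script.sh scattered.interval_list ref.fasta my_database
if [ "$#" -eq 3 ]; then
    emit_commands "$@"
fi
```

Each printed line can then be submitted as its own cluster job (or handed to a workflow manager), and the per-interval VCFs combined afterwards, for example with GatherVcfs. On a fragmented assembly the list may contain many tiny intervals, so grouping several intervals per job is probably worth considering.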
