Question

How can I speed up GATK UnifiedGenotyper?


12.6 years ago

Alex Coventry ▴ 10

I'm running the GATK on 500 samples to call variants in a few megabases of hg18, and it's going surprisingly slowly. For instance, I have UnifiedGenotyper running on some 1kb regions at the moment, and many have been running for over 12 hours without completing. This could be because parts of the regions I'm targeting for calling were capture-targeted, and the pileup of Illumina reads aligned to those regions can be very deep. So my next experiment is to try to mitigate the effect of these deeply covered regions by running GATK with a relatively low -dcov value, say around 50. If this could be expected to substantially affect its accuracy, I would be grateful to learn about it.
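To make the -dcov idea concrete, here is a rough sketch of capping a deep pileup by random sampling. It only illustrates the general technique and is not GATK's actual downsampling implementation; the function name downsample_pileup, the fixed seed, and the read counts are all made up for the example.

```python
import random


def downsample_pileup(reads, max_depth, seed=0):
    """Return at most max_depth reads, chosen uniformly at random.

    Hypothetical helper for illustration only; not part of GATK.
    """
    reads = list(reads)
    if len(reads) <= max_depth:
        # Shallow sites are left untouched.
        return reads
    rng = random.Random(seed)
    # Sample without replacement so no read is counted twice.
    return rng.sample(reads, max_depth)


# Made-up pileup at one capture-targeted site: 5000 reads, 30% carrying ALT.
pileup = ["ALT"] * 1500 + ["REF"] * 3500
capped = downsample_pileup(pileup, max_depth=50)
alt_fraction = capped.count("ALT") / len(capped)
print(len(capped), alt_fraction)
```

With only 50 reads kept, the sampling standard error on a 30% allele fraction is roughly sqrt(0.3 * 0.7 / 50), or about 0.065, which is the sort of accuracy cost I am asking about.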

Here are the options I'm running GATK with, in case I'm doing something silly:

-T UnifiedGenotyper -glm BOTH -L $region \
  -R .../human_b36_both.chr.fasta -o $outpath -I <bamfile> -I <bamfile> ...
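With about 500 BAM files, the repeated -I arguments are easier to generate than to type. Below is a minimal sketch that assembles the same options as a list, with the proposed -dcov value as an optional extra. The function name build_ug_args and every file name and region in the example are placeholders, not the real paths used here.

```python
def build_ug_args(region, reference, out_path, bam_files, dcov=None):
    """Assemble UnifiedGenotyper arguments as a list of strings.

    Hypothetical helper; it only builds the argument list and runs nothing.
    """
    args = [
        "-T", "UnifiedGenotyper",
        "-glm", "BOTH",
        "-L", region,
        "-R", reference,
        "-o", out_path,
    ]
    if dcov is not None:
        # Optional coverage cap, e.g. the value of 50 proposed above.
        args += ["-dcov", str(dcov)]
    for bam in bam_files:
        # One -I per input BAM, as in the command line above.
        args += ["-I", bam]
    return args


# Placeholder inputs for illustration only.
bams = ["sample%03d.bam" % i for i in range(1, 4)]
args = build_ug_args("chr1:1000-2000", "ref.fasta", "calls.vcf", bams, dcov=50)
print(" ".join(args))
```

Building the arguments as a list rather than one long string makes it easy to hand them to a subprocess call without worrying about shell quoting.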

Also, I understand there's a Markov chain underlying UG's calls, and I suspect slow convergence might be the main factor. Is there an option to tell UG to give up on a site once the Markov chain reaches a certain length?

gatk • 4.6k views

updated 7.7 years ago by Biostar 20 • written 12.6 years ago by Alex Coventry ▴ 10


Are you putting all 500 BAMs through UG at the same time?

12.6 years ago by Russh ★ 1.2k


Yes. Is that too much?

12.6 years ago by Alex Coventry ▴ 10


I was going to recommend posting at getsatisfaction.com, but it seems you already have. I thought your command line might be too long, or that you'd maxed out the memory, but having seen your GSA post, neither seems likely.

12.6 years ago by Russh ★ 1.2k


Cross-ref: http://getsatisfaction.com/gsa/topics/speeding_up_the_unifiedgenotyper_on_493_samples

12.6 years ago by Brad Chapman 9.7k